How to trace memory leaks using the eBPF/BCC memleak script?


Environment

  • Red Hat Enterprise Linux 8+
  • bcc-tools
  • memleak

Issue

  • How to trace outstanding memory allocations that weren't freed?

Resolution

  • The eBPF/BCC memleak script prints a summary of outstanding allocations and their call stacks to detect memory leaks.

RHEL 8:

  • Install the bcc, bcc-tools, and kernel-devel packages to use the eBPF memleak script on RHEL 8. Note that in RHEL 8 the bcc and bcc-tools packages are in the rhel-8-for-x86_64-appstream-rpms repository.
# yum install bcc bcc-tools kernel-devel-$(uname -r) -y

RHEL 9 & 10:

  • Install the bcc and bcc-tools packages to use the eBPF memleak script on RHEL 9 and RHEL 10.
# dnf install bcc bcc-tools -y
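  • On any of these releases, run the memleak script as root. The -o option reports only allocations older than the given number of milliseconds (1 ms in this example), so very short-lived allocations are left out of the report.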
# /usr/share/bcc/tools/memleak -o 1
Attaching to kernel allocators, Ctrl+C to quit.
[13:41:33] Top 10 stacks with outstanding allocations:
	1280 bytes in 20 allocations from stack
		b'kmem_cache_alloc_node_trace+0x153 [kernel]\n\t\t__get_vm_area_node+0x7a [kernel]'
	2048 bytes in 1 allocations from stack
		b'__kmalloc_node_track_caller+0x1ba [kernel]'
	2448 bytes in 49 allocations from stack
		b'__kmalloc_track_caller+0x15a [kernel]'
	4800 bytes in 19 allocations from stack
		b'__kmalloc+0x161 [kernel]'
	6432 bytes in 67 allocations from stack
		b'kmem_cache_alloc_node_trace+0x153 [kernel]\n\t\talloc_vmap_area+0x75 [kernel]\n\t\talloc_vmap_area+0x75 [kernel]\n\t\t__get_vm_area_node+0xb0 [kernel]\n\t\t__vmalloc_node_range+0x6a [kernel]\n\t\tcopy_process.part.31+0x5db [kernel]\n\t\t_do_fork+0xe2 [kernel]\n\t\tdo_syscall_64+0x5b [kernel]\n\t\tentry_SYSCALL_64_after_hwframe+0x65 [kernel]'
	13264 bytes in 32 allocations from stack
		b'__kmalloc_node+0x1cb [kernel]'
	43776 bytes in 12 allocations from stack
		b'kmem_cache_alloc_node+0x157 [kernel]'
	54736 bytes in 132 allocations from stack
		b'kmem_cache_alloc_trace+0x143 [kernel]'
	164056 bytes in 848 allocations from stack
		b'kmem_cache_alloc+0x141 [kernel]'
	4780032 bytes in 1067 allocations from stack
		b'__alloc_pages_nodemask+0x1bd [kernel]'
  • Reference:
# man 8 memleak
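
A saved report can be reduced to a single total, which makes two reports taken some time apart easier to compare. The following is a minimal sketch and not part of bcc-tools: the helper name memleak_totals is hypothetical, and it assumes the summary lines keep the "N bytes in M allocations from stack" form shown in the output above. If the input contains several reports, only the last one is counted.

```shell
# Hypothetical helper (not shipped with bcc-tools).
# Reads memleak output on stdin and totals the summary lines of the last report.
memleak_totals() {
    awk '
        # A new report header starts a fresh count, so the last report wins.
        /stacks with outstanding allocations:/ { bytes = allocs = stacks = 0 }

        # Summary line: "<bytes> bytes in <count> allocations from stack"
        $2 == "bytes" && $3 == "in" && $5 == "allocations" {
            bytes  += $1
            allocs += $4
            stacks += 1
        }

        END {
            printf "%d stacks, %d allocations, %d bytes outstanding\n", stacks, allocs, bytes
        }
    '
}
```

With the sample output above saved to a file and piped through memleak_totals, the result is "10 stacks, 2247 allocations, 5072872 bytes outstanding". A total that keeps growing between reports taken under a steady workload is a hint, not proof, of a leak.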

Root Cause

  • When tracing a specific process, memleak instruments the following libc allocation functions: malloc, calloc, realloc, posix_memalign, valloc, memalign, pvalloc, aligned_alloc, and free.

  • When tracing all processes, memleak instruments kmalloc/kfree, kmem_cache_alloc/kmem_cache_free, and also page allocations made by get_free_pages/free_pages.
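
The per-process mode described above is selected with the -p option. The wrapper below is a sketch under stated assumptions: memleak_pid and the MEMLEAK variable are hypothetical names introduced here, the default path is the one used in the Resolution section, and the two trailing arguments are memleak's optional interval (seconds between reports) and count (number of reports). It must be run as root.

```shell
# Path to the script; override MEMLEAK if bcc-tools is installed elsewhere.
MEMLEAK=${MEMLEAK:-/usr/share/bcc/tools/memleak}

# Hypothetical wrapper: trace the libc allocations of one process.
# $1 = PID, $2 = seconds between reports (default 10), $3 = number of reports (default 3)
memleak_pid() {
    pid=$1
    interval=${2:-10}
    count=${3:-3}

    # Refuse to start if the PID is missing or the process has already exited.
    if [ -z "$pid" ] || [ ! -d "/proc/$pid" ]; then
        echo "memleak_pid: no such process: $pid" >&2
        return 1
    fi

    "$MEMLEAK" -p "$pid" "$interval" "$count"
}
```

For example, "memleak_pid 1234 10 3" prints three reports, ten seconds apart, for the process with PID 1234 (a placeholder PID). Running memleak without -p, as in the Resolution section, traces the kernel allocators instead.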

Note: The memleak script may introduce significant overhead when tracing processes that allocate and free many blocks very quickly.
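
One way to limit that overhead is to record only a sample of the allocations. The sketch below assumes the -s option (record roughly every Nth allocation) and the -z option (ignore allocations smaller than the given size in bytes) as documented in the memleak man page. The names memleak_sampled and MEMLEAK are hypothetical, and the default sample rate of 10 is an arbitrary illustration, not a recommendation.

```shell
# Path to the script; override MEMLEAK if bcc-tools is installed elsewhere.
MEMLEAK=${MEMLEAK:-/usr/share/bcc/tools/memleak}

# Hypothetical wrapper: run memleak with sampling to reduce tracing overhead.
# $1 = sample rate N (default 10)
# $2 = optional minimum allocation size in bytes
# $3 = optional PID; without it the kernel allocators are traced
memleak_sampled() {
    rate=${1:-10}
    min_size=${2:-}
    pid=${3:-}

    # Build the argument list step by step so optional flags are only added when set.
    set -- -s "$rate"
    if [ -n "$min_size" ]; then set -- "$@" -z "$min_size"; fi
    if [ -n "$pid" ]; then set -- "$@" -p "$pid"; fi

    "$MEMLEAK" "$@"
}
```

With sampling enabled, the reported byte and allocation counts cover only the sampled allocations, so they are best read as relative indicators of which stacks allocate the most rather than as exact totals.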

