Hi,
Recently I've repeatedly had a memory leak which I suspect is in kernel space, since I can not see any processes using the RAM. Earlyoom starts to kill my actual software, but 20 GiB+ of RAM (on a system with 32 GiB total) keeps being occupied by something (the kernel?) and is not available to processes (<10 GiB is not enough to run my software). I'm using amdgpu, in case that makes a difference.
After stopping X.org (sddm), so that the system has no services up which should be using a lot of RAM, around 20 GiB+ is still in use (I'm expecting ~1 GiB or less). I can not see any processes using a lot of RAM in (h)top! Only the available amount has diminished, as reported by 'top', 'free' etc.
Has anyone else noticed a similar problem recently?
Any other ideas, save compiling a kernel with CONFIG_DEBUG_KMEMLEAK, to find out what is actually eating the RAM?
Or, how to conclusively confirm whether this is the kernel, a kernel module or, after all, some process eating the RAM? (I've used top/htop sorted by RAM usage.)
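A rough cross-check along these lines (a sketch; note that RSS double-counts shared pages, so if anything it overestimates userspace usage):

```shell
# Sum the resident set size of every process and compare the total with what
# free/meminfo report as "used". If the kernel says far more memory is in use
# than this total, the remainder is kernel-side (slab, vmalloc, or driver
# pages that never show up in top/htop).
ps -eo rss= | awk '{sum += $1} END {printf "userspace RSS total: %.2f GiB\n", sum / 1048576}'
```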
Some inxi info:
System: Host: ArkkiVille Kernel: 5.10.16-zen1-1-zen x86_64 bits: 64 compiler: N/A Desktop: KDE Plasma 5.21.0
Distro: Arch Linux
Machine: Type: Desktop System: ASUS product: All Series v: N/A serial: <filter>
Mobo: ASUSTeK model: MAXIMUS VII GENE v: Rev 1.xx serial: <filter> UEFI: American Megatrends v: 3503
date: 04/18/2018
Battery: ID-1: hidpp_battery_0 charge: N/A condition: N/A model: Logitech G703 Wired/Wireless Gaming Mouse
status: Discharging
CPU: Topology: Quad Core model: Intel Core i7-4790K bits: 64 type: MT MCP arch: Haswell rev: 3 L2 cache: 8192 KiB
flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx bogomips: 63994
Speed: 4000 MHz min/max: 800/4800 MHz Core speeds (MHz): 1: 4000 2: 4000 3: 4000 4: 4000 5: 4000 6: 4001 7: 4000
8: 4000
Graphics: Device-1: Advanced Micro Devices [AMD/ATI] Vega 10 XL/XT [Radeon RX Vega 56/64] driver: amdgpu v: kernel
bus ID: 03:00.0
Display: x11 server: X.Org 1.20.10 driver: amdgpu resolution: 3440x1440~100Hz
OpenGL: renderer: Radeon RX Vega (VEGA10 DRM 3.40.0 5.10.16-zen1-1-zen LLVM 11.1.0) v: 4.6 Mesa 20.3.4
direct render: Yes
Audio: Device-1: Intel 9 Series Family HD Audio vendor: ASUSTeK driver: snd_hda_intel v: kernel bus ID: 00:1b.0
Device-2: Advanced Micro Devices [AMD/ATI] Vega 10 HDMI Audio [Radeon Vega 56/64] driver: snd_hda_intel v: kernel
bus ID: 03:00.1
Device-3: Digital Devices Octopus DVB Adapter driver: ddbridge v: 0.9.33-integrated bus ID: 04:00.0
Device-4: Micronas BLUE USB Audio 2.0 type: USB driver: snd-usb-audio bus ID: 5-2.4:5
Device-5: Creative Sound BlasterX G6 type: USB driver: hid-generic,snd-usb-audio,usbhid bus ID: 5-2.3:4
Sound Server: ALSA v: k5.10.16-zen1-1-zen
Network: Device-1: Intel Ethernet I218-V vendor: ASUSTeK driver: e1000e v: kernel port: f040 bus ID: 00:19.0
IF: eno1 state: up speed: 1000 Mbps duplex: full mac: <filter>
/proc/meminfo during a leak:
MemTotal: 32820080 kB
MemFree: 6456052 kB
MemAvailable: 9874904 kB
Buffers: 197772 kB
Cached: 3749128 kB
SwapCached: 0 kB
Active: 1087708 kB
Inactive: 5288832 kB
Active(anon): 25724 kB
Inactive(anon): 2742628 kB
Active(file): 1061984 kB
Inactive(file): 2546204 kB
Unevictable: 1760 kB
Mlocked: 1760 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 2388068 kB
Mapped: 887796 kB
Shmem: 337236 kB
KReclaimable: 274584 kB
Slab: 1978800 kB
SReclaimable: 274584 kB
SUnreclaim: 1704216 kB
KernelStack: 20016 kB
PageTables: 40252 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 16410040 kB
Committed_AS: 10225220 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 78300 kB
VmallocChunk: 0 kB
Percpu: 6528 kB
HardwareCorrupted: 0 kB
AnonHugePages: 440320 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
FileHugePages: 14336 kB
FilePmdMapped: 0 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB
DirectMap4k: 10412600 kB
DirectMap2M: 23080960 kB
DirectMap1G: 1048576 kB
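Putting that snapshot together with awk (a sketch using the figures from this post; on a live system you would feed it /proc/meminfo instead of the heredoc): roughly 21 GiB is in use once page cache is excluded, yet AnonPages and Slab together account for only about 4 GiB of that, so most of the missing memory is not itemized in meminfo at all.

```shell
# Summarize a /proc/meminfo snapshot: "used" memory after excluding page
# cache, versus the two biggest itemized consumers. Values are from this post.
awk '{a[$1]=$2}
END {
  used = a["MemTotal:"] - a["MemFree:"] - a["Buffers:"] - a["Cached:"]
  printf "used excl. page cache: %.1f GiB\n", used / 1048576
  printf "AnonPages:             %.1f GiB\n", a["AnonPages:"] / 1048576
  printf "Slab:                  %.1f GiB\n", a["Slab:"] / 1048576
}' <<'EOF'
MemTotal:       32820080 kB
MemFree:         6456052 kB
Buffers:          197772 kB
Cached:          3749128 kB
AnonPages:       2388068 kB
Slab:            1978800 kB
EOF
```

Memory a driver allocates straight from the page allocator (as amdgpu's TTM buffers are, as far as I understand) shows up only as a drop in MemFree/MemAvailable, which would fit these symptoms.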
/proc/slabinfo during a leak:
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
fat_inode_cache 357 441 752 21 4 : tunables 0 0 0 : slabdata 21 21 0
fat_cache 714 714 40 102 1 : tunables 0 0 0 : slabdata 7 7 0
kvm_async_pf 0 0 136 30 1 : tunables 0 0 0 : slabdata 0 0 0
kvm_vcpu 0 0 11328 1 4 : tunables 0 0 0 : slabdata 0 0 0
kvm_mmu_page_header 0 0 168 24 1 : tunables 0 0 0 : slabdata 0 0 0
x86_emulator 0 0 2672 12 8 : tunables 0 0 0 : slabdata 0 0 0
x86_fpu 0 0 4160 7 8 : tunables 0 0 0 : slabdata 0 0 0
nfs4_layout_stateid 0 0 296 27 2 : tunables 0 0 0 : slabdata 0 0 0
nfsd4_delegations 0 0 248 33 2 : tunables 0 0 0 : slabdata 0 0 0
nfsd4_files 0 0 288 28 2 : tunables 0 0 0 : slabdata 0 0 0
nfsd4_lockowners 0 0 392 20 2 : tunables 0 0 0 : slabdata 0 0 0
nfsd4_openowners 0 0 432 37 4 : tunables 0 0 0 : slabdata 0 0 0
nfsd4_clients 0 0 1296 25 8 : tunables 0 0 0 : slabdata 0 0 0
rpc_inode_cache 46 46 704 23 4 : tunables 0 0 0 : slabdata 2 2 0
ovl_inode 0 0 696 23 4 : tunables 0 0 0 : slabdata 0 0 0
fuse_request 0 0 152 26 1 : tunables 0 0 0 : slabdata 0 0 0
fuse_inode 0 0 896 36 8 : tunables 0 0 0 : slabdata 0 0 0
ext4_groupinfo_4k 141596 141596 144 28 1 : tunables 0 0 0 : slabdata 5057 5057 0
ext4_fc_dentry_update 0 0 80 51 1 : tunables 0 0 0 : slabdata 0 0 0
ext4_inode_cache 93288 93288 1192 27 8 : tunables 0 0 0 : slabdata 3456 3456 0
ext4_allocation_context 256 256 128 32 1 : tunables 0 0 0 : slabdata 8 8 0
ext4_io_end 5056 5056 64 64 1 : tunables 0 0 0 : slabdata 79 79 0
ext4_extent_status 74766 74766 40 102 1 : tunables 0 0 0 : slabdata 733 733 0
jbd2_journal_handle 5110 5110 56 73 1 : tunables 0 0 0 : slabdata 70 70 0
jbd2_journal_head 1939 2074 120 34 1 : tunables 0 0 0 : slabdata 61 61 0
jbd2_revoke_table_s 1024 1024 16 256 1 : tunables 0 0 0 : slabdata 4 4 0
jbd2_revoke_record_s 1024 1024 32 128 1 : tunables 0 0 0 : slabdata 8 8 0
bio-1 294 378 384 21 2 : tunables 0 0 0 : slabdata 18 18 0
dm_bufio_buffer-72 576 576 224 36 2 : tunables 0 0 0 : slabdata 16 16 0
dm_bio_prison_cell 1806 1806 96 42 1 : tunables 0 0 0 : slabdata 43 43 0
kcopyd_job 108 108 3312 9 8 : tunables 0 0 0 : slabdata 12 12 0
dm_uevent 0 0 2888 11 8 : tunables 0 0 0 : slabdata 0 0 0
fsverity_info 0 0 256 32 2 : tunables 0 0 0 : slabdata 0 0 0
fscrypt_info 0 0 136 30 1 : tunables 0 0 0 : slabdata 0 0 0
MPTCPv6 0 0 2048 16 8 : tunables 0 0 0 : slabdata 0 0 0
ip6-frags 0 0 184 22 1 : tunables 0 0 0 : slabdata 0 0 0
PINGv6 0 0 1216 26 8 : tunables 0 0 0 : slabdata 0 0 0
RAWv6 663 676 1216 26 8 : tunables 0 0 0 : slabdata 26 26 0
UDPv6 240 240 1344 24 8 : tunables 0 0 0 : slabdata 10 10 0
tw_sock_TCPv6 66 66 248 33 2 : tunables 0 0 0 : slabdata 2 2 0
request_sock_TCPv6 0 0 304 26 2 : tunables 0 0 0 : slabdata 0 0 0
TCPv6 195 195 2496 13 8 : tunables 0 0 0 : slabdata 15 15 0
scsi_sense_cache 640 640 128 32 1 : tunables 0 0 0 : slabdata 20 20 0
bfq_io_cq 875 875 160 25 1 : tunables 0 0 0 : slabdata 35 35 0
mqueue_inode_cache 272 272 960 34 8 : tunables 0 0 0 : slabdata 8 8 0
userfaultfd_ctx_cache 0 0 192 21 1 : tunables 0 0 0 : slabdata 0 0 0
dnotify_struct 0 0 32 128 1 : tunables 0 0 0 : slabdata 0 0 0
dio 0 0 640 25 4 : tunables 0 0 0 : slabdata 0 0 0
pid_namespace 224 224 144 28 1 : tunables 0 0 0 : slabdata 8 8 0
ip4-frags 20 20 200 20 1 : tunables 0 0 0 : slabdata 1 1 0
MPTCP 0 0 1856 17 8 : tunables 0 0 0 : slabdata 0 0 0
request_sock_subflow 0 0 376 21 2 : tunables 0 0 0 : slabdata 0 0 0
xfrm_state 0 0 768 21 4 : tunables 0 0 0 : slabdata 0 0 0
PING 1257 1408 1024 32 8 : tunables 0 0 0 : slabdata 44 44 0
RAW 896 896 1024 32 8 : tunables 0 0 0 : slabdata 28 28 0
tw_sock_TCP 297 297 248 33 2 : tunables 0 0 0 : slabdata 9 9 0
request_sock_TCP 208 208 304 26 2 : tunables 0 0 0 : slabdata 8 8 0
TCP 135 182 2368 13 8 : tunables 0 0 0 : slabdata 14 14 0
hugetlbfs_inode_cache 50 50 640 25 4 : tunables 0 0 0 : slabdata 2 2 0
dquot 256 256 256 32 2 : tunables 0 0 0 : slabdata 8 8 0
eventpoll_pwq 1792 1792 72 56 1 : tunables 0 0 0 : slabdata 32 32 0
dax_cache 126 126 768 21 4 : tunables 0 0 0 : slabdata 6 6 0
bio_crypt_ctx 9180 9180 40 102 1 : tunables 0 0 0 : slabdata 90 90 0
request_queue 96 96 2032 16 8 : tunables 0 0 0 : slabdata 6 6 0
biovec-max 202 232 4096 8 8 : tunables 0 0 0 : slabdata 29 29 0
biovec-128 240 272 2048 16 8 : tunables 0 0 0 : slabdata 17 17 0
biovec-64 384 384 1024 32 8 : tunables 0 0 0 : slabdata 12 12 0
khugepaged_mm_slot 288 288 112 36 1 : tunables 0 0 0 : slabdata 8 8 0
user_namespace 240 240 536 30 4 : tunables 0 0 0 : slabdata 8 8 0
dmaengine-unmap-256 15 15 2112 15 8 : tunables 0 0 0 : slabdata 1 1 0
dmaengine-unmap-128 30 30 1088 30 8 : tunables 0 0 0 : slabdata 1 1 0
sock_inode_cache 3432 3432 832 39 8 : tunables 0 0 0 : slabdata 88 88 0
skbuff_ext_cache 2065 2247 192 21 1 : tunables 0 0 0 : slabdata 107 107 0
skbuff_fclone_cache 1236 1504 512 32 4 : tunables 0 0 0 : slabdata 47 47 0
skbuff_head_cache 2368 2368 256 32 2 : tunables 0 0 0 : slabdata 74 74 0
file_lock_cache 296 296 216 37 2 : tunables 0 0 0 : slabdata 8 8 0
file_lock_ctx 12987 13651 56 73 1 : tunables 0 0 0 : slabdata 187 187 0
fsnotify_mark_connector 15616 15616 32 128 1 : tunables 0 0 0 : slabdata 122 122 0
net_namespace 78 78 5056 6 8 : tunables 0 0 0 : slabdata 13 13 0
x86_lbr 0 0 800 20 4 : tunables 0 0 0 : slabdata 0 0 0
task_delay_info 5154 5202 80 51 1 : tunables 0 0 0 : slabdata 102 102 0
taskstats 184 184 352 23 2 : tunables 0 0 0 : slabdata 8 8 0
proc_dir_entry 2735 2814 192 21 1 : tunables 0 0 0 : slabdata 134 134 0
pde_opener 816 816 40 102 1 : tunables 0 0 0 : slabdata 8 8 0
proc_inode_cache 15429 15939 688 23 4 : tunables 0 0 0 : slabdata 693 693 0
seq_file 272 272 120 34 1 : tunables 0 0 0 : slabdata 8 8 0
bdev_cache 312 312 832 39 8 : tunables 0 0 0 : slabdata 8 8 0
shmem_inode_cache 3067 3696 728 22 4 : tunables 0 0 0 : slabdata 168 168 0
kernfs_node_cache 49445 50528 128 32 1 : tunables 0 0 0 : slabdata 1579 1579 0
mnt_cache 2030 2100 320 25 2 : tunables 0 0 0 : slabdata 84 84 0
filp 19181 20768 256 32 2 : tunables 0 0 0 : slabdata 649 649 0
inode_cache 22035 22230 616 26 4 : tunables 0 0 0 : slabdata 855 855 0
dentry 166057 167013 192 21 1 : tunables 0 0 0 : slabdata 7953 7953 0
names_cache 64 64 4096 8 8 : tunables 0 0 0 : slabdata 8 8 0
buffer_head 260247 260247 104 39 1 : tunables 0 0 0 : slabdata 6673 6673 0
uts_namespace 148 148 440 37 4 : tunables 0 0 0 : slabdata 4 4 0
vm_area_struct 80323 81620 200 20 1 : tunables 0 0 0 : slabdata 4081 4081 0
mm_struct 390 390 1088 30 8 : tunables 0 0 0 : slabdata 13 13 0
files_cache 368 368 704 23 4 : tunables 0 0 0 : slabdata 16 16 0
signal_cache 896 896 1152 28 8 : tunables 0 0 0 : slabdata 32 32 0
sighand_cache 541 555 2112 15 8 : tunables 0 0 0 : slabdata 37 37 0
task_struct 1276 1368 7872 4 8 : tunables 0 0 0 : slabdata 342 342 0
cred_jar 1979 2205 192 21 1 : tunables 0 0 0 : slabdata 105 105 0
anon_vma_chain 35568 39040 64 64 1 : tunables 0 0 0 : slabdata 610 610 0
anon_vma 20573 22632 88 46 1 : tunables 0 0 0 : slabdata 492 492 0
pid 3712 3712 128 32 1 : tunables 0 0 0 : slabdata 116 116 0
Acpi-Operand 4872 4872 72 56 1 : tunables 0 0 0 : slabdata 87 87 0
Acpi-ParseExt 312 312 104 39 1 : tunables 0 0 0 : slabdata 8 8 0
Acpi-State 408 408 80 51 1 : tunables 0 0 0 : slabdata 8 8 0
numa_policy 136895 137360 24 170 1 : tunables 0 0 0 : slabdata 808 808 0
trace_event_file 3956 3956 88 46 1 : tunables 0 0 0 : slabdata 86 86 0
ftrace_event_field 8415 8415 48 85 1 : tunables 0 0 0 : slabdata 99 99 0
pool_workqueue 749 1152 256 32 2 : tunables 0 0 0 : slabdata 36 36 0
radix_tree_node 82516 82516 584 28 4 : tunables 0 0 0 : slabdata 2947 2947 0
task_group 224 224 576 28 4 : tunables 0 0 0 : slabdata 8 8 0
vmap_area 18937 19072 64 64 1 : tunables 0 0 0 : slabdata 298 298 0
dma-kmalloc-8k 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-4k 0 0 4096 8 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-2k 0 0 2048 16 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-1k 0 0 1024 32 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-512 0 0 512 32 4 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-256 0 0 256 32 2 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-128 0 0 128 32 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-64 0 0 64 64 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-32 0 0 32 128 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-16 0 0 16 256 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-8 0 0 8 512 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-192 0 0 192 21 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-96 0 0 96 42 1 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-8k 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-4k 0 0 4096 8 8 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-2k 0 0 2048 16 8 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-1k 0 0 1024 32 8 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-512 2199 3744 512 32 4 : tunables 0 0 0 : slabdata 117 117 0
kmalloc-rcl-256 608 608 256 32 2 : tunables 0 0 0 : slabdata 19 19 0
kmalloc-rcl-192 671 3171 192 21 1 : tunables 0 0 0 : slabdata 151 151 0
kmalloc-rcl-128 3552 3680 128 32 1 : tunables 0 0 0 : slabdata 115 115 0
kmalloc-rcl-96 5334 5334 96 42 1 : tunables 0 0 0 : slabdata 127 127 0
kmalloc-rcl-64 34752 34752 64 64 1 : tunables 0 0 0 : slabdata 543 543 0
kmalloc-rcl-32 0 0 32 128 1 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-16 0 0 16 256 1 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-8 0 0 8 512 1 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-8k 194 200 8192 4 8 : tunables 0 0 0 : slabdata 50 50 0
kmalloc-4k 199245 199392 4096 8 8 : tunables 0 0 0 : slabdata 24924 24924 0
kmalloc-2k 257858 257984 2048 16 8 : tunables 0 0 0 : slabdata 16124 16124 0
kmalloc-1k 209829 213056 1024 32 8 : tunables 0 0 0 : slabdata 6658 6658 0
kmalloc-512 41273 62880 512 32 4 : tunables 0 0 0 : slabdata 1965 1965 0
kmalloc-256 20173 47200 256 32 2 : tunables 0 0 0 : slabdata 1475 1475 0
kmalloc-192 37779 56070 192 21 1 : tunables 0 0 0 : slabdata 2670 2670 0
kmalloc-128 47482 49760 128 32 1 : tunables 0 0 0 : slabdata 1555 1555 0
kmalloc-96 108187 110082 96 42 1 : tunables 0 0 0 : slabdata 2621 2621 0
kmalloc-64 350083 356672 64 64 1 : tunables 0 0 0 : slabdata 5573 5573 0
kmalloc-32 82316 89600 32 128 1 : tunables 0 0 0 : slabdata 700 700 0
kmalloc-16 50432 50432 16 256 1 : tunables 0 0 0 : slabdata 197 197 0
kmalloc-8 12288 12288 8 512 1 : tunables 0 0 0 : slabdata 24 24 0
kmem_cache_node 512 512 64 64 1 : tunables 0 0 0 : slabdata 8 8 0
kmem_cache 288 288 256 32 2 : tunables 0 0 0 : slabdata 9 9 0
Last edited by Wild Penguin (2021-06-04 08:18:59)
Don't worry, these are just buffers. Just use your system RAM more, and when you run low on free memory the kernel will give those buffers and other reclaimable memory back to your new workload.
BTW, do you use some kind of swap, zswap, zram, etc.? Even if you have plenty of RAM, you may need some kind of swap or zram if your normal memory load is high and reaching the limit.
See this:
https://chrisdown.name/2018/01/02/in-de … -swap.html
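If you want to try zram, a minimal configuration sketch (assuming the zram-generator package; the size and compression values here are only illustrative):

```ini
# /etc/systemd/zram-generator.conf
[zram0]
zram-size = ram / 2
compression-algorithm = zstd
```

After a daemon-reload, starting systemd-zram-setup@zram0.service should bring the device up.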
Hi xerxes_,
Thanks for your reply, but this really is a case of RAM getting eaten up.
For comparison, here is a normal situation, where no leak (at least no large / easily detectable one) has occurred, after an uptime of a little shy of 3 hours:
MemTotal: 32820080 kB
MemFree: 13566588 kB
MemAvailable: 26347332 kB
Buffers: 156964 kB
Cached: 13977724 kB
For comparison, see the same figures in the leak situation in my earlier post. I didn't mention it there, but I had an uptime of nearly two days. Buffers and cache take up only around 4 GiB of RAM.
Any (more relevant) input is welcome!
Last edited by Wild Penguin (2021-02-22 18:01:21)
Has anyone else noticed a similar problem recently?
Hello, I'm actually not running Arch (but Manjaro), but I have experienced the behavior you describe with kernel 5.10(.30) too, and just registered to confirm this.
The system (running for 63 hours now) had 2 GiB of RAM used this morning (which is normal) with no applications open (except for the GUI, XFCE), and 16 hours later it is up to 4.5 GiB (still with no applications open). The used swap also went up, from 2.61 GiB this morning to 2.69 GiB now. Just like you, I can't see any applications eating the RAM or using the swap.
I have kernel 5.4 installed in parallel, and when I boot it I definitely don't experience this.
% cat /tmp/slabs | awk '{print $1, $3*$4}' | sort -rnk2 | awk '{print $1, $2/1000000}'
…
kmalloc-4k 816.71
kmalloc-2k 528.351
kmalloc-1k 218.169
ext4_inode_cache 111.199
radix_tree_node 48.1893
kmalloc-512 32.1946
dentry 32.0665
buffer_head 27.0657
kmalloc-64 22.827
ext4_groupinfo_4k 20.3898
vm_area_struct 16.324
inode_cache 13.6937
kmalloc-256 12.0832
proc_inode_cache 10.966
task_struct 10.7689
kmalloc-192 10.7654
kmalloc-96 10.5679
kernfs_node_cache 6.46758
kmalloc-128 6.36928
filp 5.31661
numa_policy 3.29664
ext4_extent_status 2.99064
kmalloc-32 2.8672
sock_inode_cache 2.85542
shmem_inode_cache 2.69069
anon_vma_chain 2.49856
kmalloc-rcl-64 2.22413
anon_vma 1.99162
kmalloc-rcl-512 1.91693
kmalloc-8k 1.6384
PING 1.44179
vmap_area 1.22061
sighand_cache 1.17216
signal_cache 1.03219
biovec-max 0.950272
RAW 0.917504
RAWv6 0.822016
kmalloc-16 0.806912
skbuff_fclone_cache 0.770048
file_lock_ctx 0.764456
mnt_cache 0.672
kmalloc-rcl-192 0.608832
skbuff_head_cache 0.606208
biovec-128 0.557056
proc_dir_entry 0.540288
kmalloc-rcl-96 0.512064
fsnotify_mark_connector 0.499712
TCPv6 0.48672
pid 0.475136
kmalloc-rcl-128 0.47104
skbuff_ext_cache 0.431424
TCP 0.430976
mm_struct 0.42432
cred_jar 0.42336
task_delay_info 0.41616
ftrace_event_field 0.40392
net_namespace 0.394368
biovec-64 0.393216
bio_crypt_ctx 0.3672
kcopyd_job 0.357696
Acpi-Operand 0.350784
trace_event_file 0.348128
fat_inode_cache 0.331632
ext4_io_end 0.323584
UDPv6 0.32256
pool_workqueue 0.294912
jbd2_journal_handle 0.28616
names_cache 0.262144
mqueue_inode_cache 0.26112
bdev_cache 0.259584
files_cache 0.259072
jbd2_journal_head 0.24888
request_queue 0.195072
dm_bio_prison_cell 0.173376
kmalloc-rcl-256 0.155648
bio-1 0.145152
bfq_io_cq 0.14
task_group 0.129024
eventpoll_pwq 0.129024
dm_bufio_buffer-72 0.129024
user_namespace 0.12864
kmalloc-8 0.098304
dax_cache 0.096768
scsi_sense_cache 0.08192
kmem_cache 0.073728
tw_sock_TCP 0.073656
dquot 0.065536
uts_namespace 0.06512
taskstats 0.064768
file_lock_cache 0.063936
request_sock_TCP 0.063232
kmem_cache_node 0.032768
jbd2_revoke_record_s 0.032768
ext4_allocation_context 0.032768
seq_file 0.03264
pde_opener 0.03264
dmaengine-unmap-128 0.03264
Acpi-State 0.03264
Acpi-ParseExt 0.032448
rpc_inode_cache 0.032384
pid_namespace 0.032256
khugepaged_mm_slot 0.032256
hugetlbfs_inode_cache 0.032
dmaengine-unmap-256 0.03168
fat_cache 0.02856
jbd2_revoke_table_s 0.016384
tw_sock_TCPv6 0.016368
ip4-frags 0.004
Do you have a similar pattern in kmalloc-{1,2,4}k? (sudo cat /proc/slabinfo …)
Since this is likely in some device driver, I'd compare lsmod. The more exotic the module, the more likely it is the cause.
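The comparison can be scripted; a sketch with two hypothetical saved module lists standing in for `lsmod | awk 'NR>1 {print $1}' | sort` output from an affected and an unaffected boot:

```shell
# Hypothetical sample data standing in for the two saved, sorted module lists.
printf 'amdgpu\nddbridge\ne1000e\n' > /tmp/mods-leaking.txt
printf 'e1000e\ni915\n'             > /tmp/mods-clean.txt
# Lines only in the second file = modules loaded only on the leaking system.
comm -13 /tmp/mods-clean.txt /tmp/mods-leaking.txt
```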
I experienced your described behavior with Kernel 5.10(.30)
Great: after a short intermezzo with the 5.4 kernel, I can no longer reproduce this with the 5.10 kernel. No updates in between, just 2 reboots. 60 h of uptime: no RAM "missing" so far.
(╮°-°)┳┳ ( ╯°□°)╯┻┻
Hi all,
Here is my lsmod:
$ lsmod
Module Size Used by
udp_diag 16384 0
tcp_diag 16384 0
inet_diag 24576 2 tcp_diag,udp_diag
ip6table_filter 16384 0
ip6_tables 32768 1 ip6table_filter
iptable_filter 16384 0
rfkill 32768 3
uinput 24576 2
nct6775 77824 0
hwmon_vid 16384 1 nct6775
tda18271c2dd 32768 2
nls_iso8859_1 16384 1
vfat 24576 1
fat 90112 1 vfat
intel_rapl_msr 20480 0
intel_rapl_common 32768 1 intel_rapl_msr
x86_pkg_temp_thermal 20480 0
intel_powerclamp 20480 0
coretemp 20480 0
kvm_intel 348160 0
kvm 1085440 1 kvm_intel
drxk 90112 2
ddbridge 102400 7
irqbypass 16384 1 kvm
iTCO_wdt 16384 0
intel_pmc_bxt 16384 1 iTCO_wdt
mei_hdcp 24576 0
at24 24576 0
iTCO_vendor_support 16384 1 iTCO_wdt
wmi_bmof 16384 0
mxm_wmi 16384 0
snd_hda_codec_realtek 163840 1
gpio_ich 16384 0
snd_hda_codec_generic 110592 1 snd_hda_codec_realtek
mousedev 24576 0
ledtrig_audio 16384 1 snd_hda_codec_generic
crct10dif_pclmul 16384 1
snd_hda_codec_hdmi 86016 1
joydev 28672 0
crc32_pclmul 16384 0
dvb_core 172032 2 drxk,ddbridge
ghash_clmulni_intel 16384 0
snd_hda_intel 57344 8
videobuf2_vmalloc 20480 1 dvb_core
snd_intel_dspcfg 28672 1 snd_hda_intel
videobuf2_memops 20480 1 videobuf2_vmalloc
snd_intel_sdw_acpi 20480 1 snd_intel_dspcfg
aesni_intel 376832 0
snd_hda_codec 184320 4 snd_hda_codec_generic,snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec_realtek
crypto_simd 16384 1 aesni_intel
cryptd 28672 2 crypto_simd,ghash_clmulni_intel
videobuf2_common 69632 3 videobuf2_vmalloc,dvb_core,videobuf2_memops
rapl 16384 0
i2c_i801 36864 0
snd_hda_core 114688 5 snd_hda_codec_generic,snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec,snd_hda_codec_realtek
intel_cstate 20480 0
videodev 294912 1 videobuf2_common
snd_hwdep 16384 1 snd_hda_codec
i2c_smbus 20480 1 i2c_i801
snd_pcm 163840 6 snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec,snd_hda_core
intel_uncore 184320 0
snd_timer 45056 1 snd_pcm
snd 118784 22 snd_hda_codec_generic,snd_hda_codec_hdmi,snd_hwdep,snd_hda_intel,snd_hda_codec,snd_hda_codec_realtek,snd_timer,snd_pcm
mc 77824 3 videodev,dvb_core,videobuf2_common
e1000e 319488 0
soundcore 16384 1 snd
mei_me 45056 1
mei 155648 3 mei_hdcp,mei_me
lpc_ich 28672 0
wmi 36864 2 wmi_bmof,mxm_wmi
mac_hid 16384 0
video 53248 0
acpi_pad 24576 0
nfsd 634880 13
auth_rpcgss 159744 1 nfsd
vboxnetflt 32768 0
vboxnetadp 28672 0
nfs_acl 16384 1 nfsd
lockd 139264 1 nfsd
vboxdrv 552960 2 vboxnetadp,vboxnetflt
grace 16384 2 nfsd,lockd
sunrpc 655360 18 nfsd,auth_rpcgss,lockd,nfs_acl
overlay 147456 0
sg 45056 0
vhba 36864 0
fuse 167936 1
nfs_ssc 16384 1 nfsd
crypto_user 20480 0
bpf_preload 16384 0
ip_tables 32768 1 iptable_filter
x_tables 57344 4 ip6table_filter,iptable_filter,ip6_tables,ip_tables
ext4 966656 4
crc16 16384 1 ext4
mbcache 16384 1 ext4
jbd2 151552 1 ext4
hid_logitech_hidpp 53248 0
hid_logitech_dj 28672 0
usbhid 69632 1 hid_logitech_dj
dm_cache_smq 32768 1
dm_cache 73728 2 dm_cache_smq
dm_persistent_data 98304 1 dm_cache
libcrc32c 16384 1 dm_persistent_data
crc32c_generic 16384 0
dm_bio_prison 20480 1 dm_cache
dm_bufio 40960 1 dm_persistent_data
dm_mod 159744 22 dm_cache,dm_bufio
crc32c_intel 24576 9
xhci_pci 24576 0
xhci_pci_renesas 20480 1 xhci_pci
amdgpu 7331840 65
drm_ttm_helper 16384 1 amdgpu
ttm 90112 2 amdgpu,drm_ttm_helper
gpu_sched 45056 1 amdgpu
i2c_algo_bit 16384 1 amdgpu
drm_kms_helper 315392 1 amdgpu
syscopyarea 16384 1 drm_kms_helper
sysfillrect 16384 1 drm_kms_helper
sysimgblt 16384 1 drm_kms_helper
fb_sys_fops 16384 1 drm_kms_helper
cec 81920 1 drm_kms_helper
drm 626688 29 gpu_sched,drm_kms_helper,amdgpu,drm_ttm_helper,ttm
agpgart 40960 2 ttm,drm
The only thing which could be considered "exotic" is the ddbridge and related module(s) (a DVB card). However, I'm veering towards amdgpu. The only unusual(?) usage pattern I have is that I'm running Folding@home in the background (>90% of the time). Maybe that causes a kernel memory leak which most users don't notice (as they don't run folding)... however, IIRC I've tried disabling the foldingathome service and still faced a leak, but maybe I should retest.
The only other thing even remotely unusual/exotic is LVM (dm_cache), but I'm really not sure it should be considered exotic. I'm currently using it as an SSD cache for mechanical HDDs (lvm-cache), and I believe this kind of setup is very common on servers.
In any case: I have no out-of-tree modules (if there were, my first troubleshooting step would have been to remove them).
For what it's worth, I can see from my Monitorix logs that this problem began around the middle of December. Before that point my memory usage was normal (and I recall no cases of RAM just "disappearing" before then). Because of my systemd journal configuration, I no longer have any logs from before December 2020 (*).
Also, I have compiled linux-zen (and will be compiling the regular kernel, too) with CONFIG_DEBUG_KMEMLEAK. Now I'm trying to get something useful out of /sys/kernel/debug/kmemleak. I'm trying to confirm this is an amdgpu issue (in my case) so I can make a useful bug report.
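For reference, the kmemleak debugfs interface I'm poking at, per the kernel documentation (the guard just makes the sketch safe to run on a kernel built without it):

```shell
KMEMLEAK=/sys/kernel/debug/kmemleak
if [ -w "$KMEMLEAK" ]; then
    echo scan > "$KMEMLEAK"    # trigger an immediate memory scan
    cat "$KMEMLEAK"            # print suspected leaks with backtraces
    echo clear > "$KMEMLEAK"   # drop current suspects before re-testing
else
    echo "kmemleak interface not available on this kernel"
fi
```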
It would be helpful if other people here reported what GPU they are using!
Some graphs from Monitorix (from February, but I still see exactly the same pattern), at 15 GiB+ of memory "lost":
Daily graph: https://flic.kr/p/2kY2J1Q (at the "drop" I've shut down the GUI and services; graph taken after a reboot)
Monthly graph: https://flic.kr/p/2kXSDaN
Yearly graph: https://flic.kr/p/2kXSDax (we can see abnormal memory usage since and including December)
Stay tuned!
p.s. *) Actually, the issue with the logs is that all kinds of software spam the system log these days, causing the default systemd configuration to drop logs (including the actually useful ones) as the limits are reached, which happens way too soon to be useful in situations like these! I've since tweaked the systemd configuration so that logs are retained, and started backing them up in case systemd still insists on deleting them (I have plenty of HDD space). But that is another issue...
Last edited by Wild Penguin (2021-05-11 16:33:01)
I've been observing this issue for months, too. I spent some time narrowing it down and, like you, came to the conclusion that amdgpu is at fault. Sadly, I haven't had time to dig deeper than that, so I've just resorted to rebooting every couple of days. I also can't run F@h anymore, because summer is coming and I don't need the additional heating.
However, I can add a new data point: I just started playing Kerbal Space Program after a long hiatus and, unlike F@h, which takes days to eat up my 32 GB of memory, going EVA with a Kerbal causes the OOM killer to jump in within a minute. The symptoms are the same: no userspace process is consuming the memory; it's the kernel. Unlike with F@h, however, killing KSP does reduce memory usage, although it still doesn't recover all the memory lost. As a workaround, I'll start disabling mods with graphical effects to see which one tickles amdgpu the wrong way. I hope someone has time to dig into the kernel; I've wanted to try my hand at tracing and such for a long time, but I just can't find the energy these days.
https://www.kernel.org/doc/html/latest/ … mleak.html but it's not enabled in the default kernel, so you'd have to compile a kernel yourself.
https://bugzilla.redhat.com/show_bug.cgi?id=1880833#c19 seemed interesting for f@h but you probably *play* w/ an active display…
https://www.kernel.org/doc/html/latest/ … mleak.html but it's not enabled in the default kernel, so you'd have to compile a kernel yourself.
I already know this (as stated in my OP).
I was asking at the beginning whether there is any other way, since recompiling the kernel and trying to reproduce this bug is a hassle. Especially since the leak seems to have come and gone since last December: some weeks it is there, then suddenly it is gone again.
FWIW, since posting the OP I've recompiled with the option and played around with kmemleak. I just have no idea whether the output (which I've saved) is useful, and partially because of other duties I haven't had the time to get around to posting a bug report.
https://bugzilla.redhat.com/show_bug.cgi?id=1880833#c19 seemed interesting for f@h but you probably *play* w/ an active display…
Thanks for this link, but it is probably unrelated, as I didn't have this problem before December (I can deduce this from my monitoring data). Most reporters there seem to state that their problems began before the 5.9 series, while mine started around 5.9 and have been ongoing since.
It is possible I haven't had a leak for two weeks. We'll see... (it is possible a leak has started to happen just as I'm posting this!).
Since upgrading to 5.12.2-zen2-1-zen (and newer) I've not experienced any memory leaks (at least none large enough for me to detect). The leak went away just as I recompiled a kernel with kmemleak ;-).
Marking this as [SOLVED]; however, feel free to comment here in case you think you still have the same memory leak. Chances are high it could be another, unrelated leak...
Last edited by Wild Penguin (2021-06-04 08:20:12)