Hi,
For some time now I've repeatedly had a memory leak which I suspect is in kernel space, since I cannot see any processes using the RAM. Earlyoom starts to kill my actual software, but 20GiB+ of RAM (on a system with 32GiB total) stays occupied by something (the kernel?) and is not available to processes (<10GiB is not enough to run my software). I'm using amdgpu, in case that makes a difference.
After stopping X.org (sddm), so that no services which should use a lot of RAM are running, 20GiB+ is still in use (I'd expect ~1GiB or less). I cannot see any processes using a lot of RAM in (h)top! Only the available amount has diminished, as reported by 'top', 'free' etc.
Has anyone else noticed a similar problem recently?
Any ideas, other than compiling the kernel with CONFIG_DEBUG_KMEMLEAK, for finding out what is actually eating the RAM?
Or, how can I conclusively confirm whether it is the kernel, a kernel module or - after all - some process eating the RAM? (I've used top/htop sorted by RAM usage.)
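One rough way to tell the two apart, as a sketch rather than a definitive test: add up the resident memory of every process and compare it with what the system reports as used. RSS counts shared pages once per process, so the sum is an upper bound; if even that upper bound is far below the used figure, the rest is probably not held by processes. The helper names `sum_rss_kib` and `used_kib` are made up for illustration.

```shell
#!/bin/sh
# Sum resident set sizes (KiB), one value per line on stdin.
# Intended input: ps -eo rss=
sum_rss_kib() {
    awk '{ total += $1 } END { printf "%d\n", total }'
}

# "Used" memory as MemTotal - MemAvailable (KiB).
# Argument: a meminfo-format file, defaulting to /proc/meminfo.
used_kib() {
    awk '$1 == "MemTotal:"     { t = $2 }
         $1 == "MemAvailable:" { a = $2 }
         END { printf "%d\n", t - a }' "${1:-/proc/meminfo}"
}

# Usage:
#   ps -eo rss= | sum_rss_kib
#   used_kib
```

With the leak snapshot posted below (MemTotal 32820080 kB, MemAvailable 9874904 kB), `used_kib` gives 22945176 kB.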
Some inxi info:
System:    Host: ArkkiVille Kernel: 5.10.16-zen1-1-zen x86_64 bits: 64 compiler: N/A Desktop: KDE Plasma 5.21.0 
           Distro: Arch Linux 
Machine:   Type: Desktop System: ASUS product: All Series v: N/A serial: <filter> 
           Mobo: ASUSTeK model: MAXIMUS VII GENE v: Rev 1.xx serial: <filter> UEFI: American Megatrends v: 3503 
           date: 04/18/2018 
Battery:   ID-1: hidpp_battery_0 charge: N/A condition: N/A model: Logitech G703 Wired/Wireless Gaming Mouse 
           status: Discharging 
CPU:       Topology: Quad Core model: Intel Core i7-4790K bits: 64 type: MT MCP arch: Haswell rev: 3 L2 cache: 8192 KiB 
           flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx bogomips: 63994 
           Speed: 4000 MHz min/max: 800/4800 MHz Core speeds (MHz): 1: 4000 2: 4000 3: 4000 4: 4000 5: 4000 6: 4001 7: 4000 
           8: 4000 
Graphics:  Device-1: Advanced Micro Devices [AMD/ATI] Vega 10 XL/XT [Radeon RX Vega 56/64] driver: amdgpu v: kernel 
           bus ID: 03:00.0 
           Display: x11 server: X.Org 1.20.10 driver: amdgpu resolution: 3440x1440~100Hz 
           OpenGL: renderer: Radeon RX Vega (VEGA10 DRM 3.40.0 5.10.16-zen1-1-zen LLVM 11.1.0) v: 4.6 Mesa 20.3.4 
           direct render: Yes 
Audio:     Device-1: Intel 9 Series Family HD Audio vendor: ASUSTeK driver: snd_hda_intel v: kernel bus ID: 00:1b.0 
           Device-2: Advanced Micro Devices [AMD/ATI] Vega 10 HDMI Audio [Radeon Vega 56/64] driver: snd_hda_intel v: kernel 
           bus ID: 03:00.1 
           Device-3: Digital Devices Octopus DVB Adapter driver: ddbridge v: 0.9.33-integrated bus ID: 04:00.0 
           Device-4: Micronas BLUE USB Audio 2.0 type: USB driver: snd-usb-audio bus ID: 5-2.4:5 
           Device-5: Creative Sound BlasterX G6 type: USB driver: hid-generic,snd-usb-audio,usbhid bus ID: 5-2.3:4 
           Sound Server: ALSA v: k5.10.16-zen1-1-zen 
Network:   Device-1: Intel Ethernet I218-V vendor: ASUSTeK driver: e1000e v: kernel port: f040 bus ID: 00:19.0 
           IF: eno1 state: up speed: 1000 Mbps duplex: full mac: <filter>
/proc/meminfo during a leak:
MemTotal:       32820080 kB
MemFree:         6456052 kB
MemAvailable:    9874904 kB
Buffers:          197772 kB
Cached:          3749128 kB
SwapCached:            0 kB
Active:          1087708 kB
Inactive:        5288832 kB
Active(anon):      25724 kB
Inactive(anon):  2742628 kB
Active(file):    1061984 kB
Inactive(file):  2546204 kB
Unevictable:        1760 kB
Mlocked:            1760 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:       2388068 kB
Mapped:           887796 kB
Shmem:            337236 kB
KReclaimable:     274584 kB
Slab:            1978800 kB
SReclaimable:     274584 kB
SUnreclaim:      1704216 kB
KernelStack:       20016 kB
PageTables:        40252 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    16410040 kB
Committed_AS:   10225220 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       78300 kB
VmallocChunk:          0 kB
Percpu:             6528 kB
HardwareCorrupted:     0 kB
AnonHugePages:    440320 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
FileHugePages:     14336 kB
FilePmdMapped:         0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB
DirectMap4k:    10412600 kB
DirectMap2M:    23080960 kB
DirectMap1G:     1048576 kB
/proc/slabinfo during a leak:
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
fat_inode_cache      357    441    752   21    4 : tunables    0    0    0 : slabdata     21     21      0
fat_cache            714    714     40  102    1 : tunables    0    0    0 : slabdata      7      7      0
kvm_async_pf           0      0    136   30    1 : tunables    0    0    0 : slabdata      0      0      0
kvm_vcpu               0      0  11328    1    4 : tunables    0    0    0 : slabdata      0      0      0
kvm_mmu_page_header      0      0    168   24    1 : tunables    0    0    0 : slabdata      0      0      0
x86_emulator           0      0   2672   12    8 : tunables    0    0    0 : slabdata      0      0      0
x86_fpu                0      0   4160    7    8 : tunables    0    0    0 : slabdata      0      0      0
nfs4_layout_stateid      0      0    296   27    2 : tunables    0    0    0 : slabdata      0      0      0
nfsd4_delegations      0      0    248   33    2 : tunables    0    0    0 : slabdata      0      0      0
nfsd4_files            0      0    288   28    2 : tunables    0    0    0 : slabdata      0      0      0
nfsd4_lockowners       0      0    392   20    2 : tunables    0    0    0 : slabdata      0      0      0
nfsd4_openowners       0      0    432   37    4 : tunables    0    0    0 : slabdata      0      0      0
nfsd4_clients          0      0   1296   25    8 : tunables    0    0    0 : slabdata      0      0      0
rpc_inode_cache       46     46    704   23    4 : tunables    0    0    0 : slabdata      2      2      0
ovl_inode              0      0    696   23    4 : tunables    0    0    0 : slabdata      0      0      0
fuse_request           0      0    152   26    1 : tunables    0    0    0 : slabdata      0      0      0
fuse_inode             0      0    896   36    8 : tunables    0    0    0 : slabdata      0      0      0
ext4_groupinfo_4k 141596 141596    144   28    1 : tunables    0    0    0 : slabdata   5057   5057      0
ext4_fc_dentry_update      0      0     80   51    1 : tunables    0    0    0 : slabdata      0      0      0
ext4_inode_cache   93288  93288   1192   27    8 : tunables    0    0    0 : slabdata   3456   3456      0
ext4_allocation_context    256    256    128   32    1 : tunables    0    0    0 : slabdata      8      8      0
ext4_io_end         5056   5056     64   64    1 : tunables    0    0    0 : slabdata     79     79      0
ext4_extent_status  74766  74766     40  102    1 : tunables    0    0    0 : slabdata    733    733      0
jbd2_journal_handle   5110   5110     56   73    1 : tunables    0    0    0 : slabdata     70     70      0
jbd2_journal_head   1939   2074    120   34    1 : tunables    0    0    0 : slabdata     61     61      0
jbd2_revoke_table_s   1024   1024     16  256    1 : tunables    0    0    0 : slabdata      4      4      0
jbd2_revoke_record_s   1024   1024     32  128    1 : tunables    0    0    0 : slabdata      8      8      0
bio-1                294    378    384   21    2 : tunables    0    0    0 : slabdata     18     18      0
dm_bufio_buffer-72    576    576    224   36    2 : tunables    0    0    0 : slabdata     16     16      0
dm_bio_prison_cell   1806   1806     96   42    1 : tunables    0    0    0 : slabdata     43     43      0
kcopyd_job           108    108   3312    9    8 : tunables    0    0    0 : slabdata     12     12      0
dm_uevent              0      0   2888   11    8 : tunables    0    0    0 : slabdata      0      0      0
fsverity_info          0      0    256   32    2 : tunables    0    0    0 : slabdata      0      0      0
fscrypt_info           0      0    136   30    1 : tunables    0    0    0 : slabdata      0      0      0
MPTCPv6                0      0   2048   16    8 : tunables    0    0    0 : slabdata      0      0      0
ip6-frags              0      0    184   22    1 : tunables    0    0    0 : slabdata      0      0      0
PINGv6                 0      0   1216   26    8 : tunables    0    0    0 : slabdata      0      0      0
RAWv6                663    676   1216   26    8 : tunables    0    0    0 : slabdata     26     26      0
UDPv6                240    240   1344   24    8 : tunables    0    0    0 : slabdata     10     10      0
tw_sock_TCPv6         66     66    248   33    2 : tunables    0    0    0 : slabdata      2      2      0
request_sock_TCPv6      0      0    304   26    2 : tunables    0    0    0 : slabdata      0      0      0
TCPv6                195    195   2496   13    8 : tunables    0    0    0 : slabdata     15     15      0
scsi_sense_cache     640    640    128   32    1 : tunables    0    0    0 : slabdata     20     20      0
bfq_io_cq            875    875    160   25    1 : tunables    0    0    0 : slabdata     35     35      0
mqueue_inode_cache    272    272    960   34    8 : tunables    0    0    0 : slabdata      8      8      0
userfaultfd_ctx_cache      0      0    192   21    1 : tunables    0    0    0 : slabdata      0      0      0
dnotify_struct         0      0     32  128    1 : tunables    0    0    0 : slabdata      0      0      0
dio                    0      0    640   25    4 : tunables    0    0    0 : slabdata      0      0      0
pid_namespace        224    224    144   28    1 : tunables    0    0    0 : slabdata      8      8      0
ip4-frags             20     20    200   20    1 : tunables    0    0    0 : slabdata      1      1      0
MPTCP                  0      0   1856   17    8 : tunables    0    0    0 : slabdata      0      0      0
request_sock_subflow      0      0    376   21    2 : tunables    0    0    0 : slabdata      0      0      0
xfrm_state             0      0    768   21    4 : tunables    0    0    0 : slabdata      0      0      0
PING                1257   1408   1024   32    8 : tunables    0    0    0 : slabdata     44     44      0
RAW                  896    896   1024   32    8 : tunables    0    0    0 : slabdata     28     28      0
tw_sock_TCP          297    297    248   33    2 : tunables    0    0    0 : slabdata      9      9      0
request_sock_TCP     208    208    304   26    2 : tunables    0    0    0 : slabdata      8      8      0
TCP                  135    182   2368   13    8 : tunables    0    0    0 : slabdata     14     14      0
hugetlbfs_inode_cache     50     50    640   25    4 : tunables    0    0    0 : slabdata      2      2      0
dquot                256    256    256   32    2 : tunables    0    0    0 : slabdata      8      8      0
eventpoll_pwq       1792   1792     72   56    1 : tunables    0    0    0 : slabdata     32     32      0
dax_cache            126    126    768   21    4 : tunables    0    0    0 : slabdata      6      6      0
bio_crypt_ctx       9180   9180     40  102    1 : tunables    0    0    0 : slabdata     90     90      0
request_queue         96     96   2032   16    8 : tunables    0    0    0 : slabdata      6      6      0
biovec-max           202    232   4096    8    8 : tunables    0    0    0 : slabdata     29     29      0
biovec-128           240    272   2048   16    8 : tunables    0    0    0 : slabdata     17     17      0
biovec-64            384    384   1024   32    8 : tunables    0    0    0 : slabdata     12     12      0
khugepaged_mm_slot    288    288    112   36    1 : tunables    0    0    0 : slabdata      8      8      0
user_namespace       240    240    536   30    4 : tunables    0    0    0 : slabdata      8      8      0
dmaengine-unmap-256     15     15   2112   15    8 : tunables    0    0    0 : slabdata      1      1      0
dmaengine-unmap-128     30     30   1088   30    8 : tunables    0    0    0 : slabdata      1      1      0
sock_inode_cache    3432   3432    832   39    8 : tunables    0    0    0 : slabdata     88     88      0
skbuff_ext_cache    2065   2247    192   21    1 : tunables    0    0    0 : slabdata    107    107      0
skbuff_fclone_cache   1236   1504    512   32    4 : tunables    0    0    0 : slabdata     47     47      0
skbuff_head_cache   2368   2368    256   32    2 : tunables    0    0    0 : slabdata     74     74      0
file_lock_cache      296    296    216   37    2 : tunables    0    0    0 : slabdata      8      8      0
file_lock_ctx      12987  13651     56   73    1 : tunables    0    0    0 : slabdata    187    187      0
fsnotify_mark_connector  15616  15616     32  128    1 : tunables    0    0    0 : slabdata    122    122      0
net_namespace         78     78   5056    6    8 : tunables    0    0    0 : slabdata     13     13      0
x86_lbr                0      0    800   20    4 : tunables    0    0    0 : slabdata      0      0      0
task_delay_info     5154   5202     80   51    1 : tunables    0    0    0 : slabdata    102    102      0
taskstats            184    184    352   23    2 : tunables    0    0    0 : slabdata      8      8      0
proc_dir_entry      2735   2814    192   21    1 : tunables    0    0    0 : slabdata    134    134      0
pde_opener           816    816     40  102    1 : tunables    0    0    0 : slabdata      8      8      0
proc_inode_cache   15429  15939    688   23    4 : tunables    0    0    0 : slabdata    693    693      0
seq_file             272    272    120   34    1 : tunables    0    0    0 : slabdata      8      8      0
bdev_cache           312    312    832   39    8 : tunables    0    0    0 : slabdata      8      8      0
shmem_inode_cache   3067   3696    728   22    4 : tunables    0    0    0 : slabdata    168    168      0
kernfs_node_cache  49445  50528    128   32    1 : tunables    0    0    0 : slabdata   1579   1579      0
mnt_cache           2030   2100    320   25    2 : tunables    0    0    0 : slabdata     84     84      0
filp               19181  20768    256   32    2 : tunables    0    0    0 : slabdata    649    649      0
inode_cache        22035  22230    616   26    4 : tunables    0    0    0 : slabdata    855    855      0
dentry            166057 167013    192   21    1 : tunables    0    0    0 : slabdata   7953   7953      0
names_cache           64     64   4096    8    8 : tunables    0    0    0 : slabdata      8      8      0
buffer_head       260247 260247    104   39    1 : tunables    0    0    0 : slabdata   6673   6673      0
uts_namespace        148    148    440   37    4 : tunables    0    0    0 : slabdata      4      4      0
vm_area_struct     80323  81620    200   20    1 : tunables    0    0    0 : slabdata   4081   4081      0
mm_struct            390    390   1088   30    8 : tunables    0    0    0 : slabdata     13     13      0
files_cache          368    368    704   23    4 : tunables    0    0    0 : slabdata     16     16      0
signal_cache         896    896   1152   28    8 : tunables    0    0    0 : slabdata     32     32      0
sighand_cache        541    555   2112   15    8 : tunables    0    0    0 : slabdata     37     37      0
task_struct         1276   1368   7872    4    8 : tunables    0    0    0 : slabdata    342    342      0
cred_jar            1979   2205    192   21    1 : tunables    0    0    0 : slabdata    105    105      0
anon_vma_chain     35568  39040     64   64    1 : tunables    0    0    0 : slabdata    610    610      0
anon_vma           20573  22632     88   46    1 : tunables    0    0    0 : slabdata    492    492      0
pid                 3712   3712    128   32    1 : tunables    0    0    0 : slabdata    116    116      0
Acpi-Operand        4872   4872     72   56    1 : tunables    0    0    0 : slabdata     87     87      0
Acpi-ParseExt        312    312    104   39    1 : tunables    0    0    0 : slabdata      8      8      0
Acpi-State           408    408     80   51    1 : tunables    0    0    0 : slabdata      8      8      0
numa_policy       136895 137360     24  170    1 : tunables    0    0    0 : slabdata    808    808      0
trace_event_file    3956   3956     88   46    1 : tunables    0    0    0 : slabdata     86     86      0
ftrace_event_field   8415   8415     48   85    1 : tunables    0    0    0 : slabdata     99     99      0
pool_workqueue       749   1152    256   32    2 : tunables    0    0    0 : slabdata     36     36      0
radix_tree_node    82516  82516    584   28    4 : tunables    0    0    0 : slabdata   2947   2947      0
task_group           224    224    576   28    4 : tunables    0    0    0 : slabdata      8      8      0
vmap_area          18937  19072     64   64    1 : tunables    0    0    0 : slabdata    298    298      0
dma-kmalloc-8k         0      0   8192    4    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-4k         0      0   4096    8    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-2k         0      0   2048   16    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-1k         0      0   1024   32    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-512        0      0    512   32    4 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-256        0      0    256   32    2 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-128        0      0    128   32    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-64         0      0     64   64    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-32         0      0     32  128    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-16         0      0     16  256    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-8          0      0      8  512    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-192        0      0    192   21    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-96         0      0     96   42    1 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-8k         0      0   8192    4    8 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-4k         0      0   4096    8    8 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-2k         0      0   2048   16    8 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-1k         0      0   1024   32    8 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-512     2199   3744    512   32    4 : tunables    0    0    0 : slabdata    117    117      0
kmalloc-rcl-256      608    608    256   32    2 : tunables    0    0    0 : slabdata     19     19      0
kmalloc-rcl-192      671   3171    192   21    1 : tunables    0    0    0 : slabdata    151    151      0
kmalloc-rcl-128     3552   3680    128   32    1 : tunables    0    0    0 : slabdata    115    115      0
kmalloc-rcl-96      5334   5334     96   42    1 : tunables    0    0    0 : slabdata    127    127      0
kmalloc-rcl-64     34752  34752     64   64    1 : tunables    0    0    0 : slabdata    543    543      0
kmalloc-rcl-32         0      0     32  128    1 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-16         0      0     16  256    1 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-8          0      0      8  512    1 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-8k           194    200   8192    4    8 : tunables    0    0    0 : slabdata     50     50      0
kmalloc-4k        199245 199392   4096    8    8 : tunables    0    0    0 : slabdata  24924  24924      0
kmalloc-2k        257858 257984   2048   16    8 : tunables    0    0    0 : slabdata  16124  16124      0
kmalloc-1k        209829 213056   1024   32    8 : tunables    0    0    0 : slabdata   6658   6658      0
kmalloc-512        41273  62880    512   32    4 : tunables    0    0    0 : slabdata   1965   1965      0
kmalloc-256        20173  47200    256   32    2 : tunables    0    0    0 : slabdata   1475   1475      0
kmalloc-192        37779  56070    192   21    1 : tunables    0    0    0 : slabdata   2670   2670      0
kmalloc-128        47482  49760    128   32    1 : tunables    0    0    0 : slabdata   1555   1555      0
kmalloc-96        108187 110082     96   42    1 : tunables    0    0    0 : slabdata   2621   2621      0
kmalloc-64        350083 356672     64   64    1 : tunables    0    0    0 : slabdata   5573   5573      0
kmalloc-32         82316  89600     32  128    1 : tunables    0    0    0 : slabdata    700    700      0
kmalloc-16         50432  50432     16  256    1 : tunables    0    0    0 : slabdata    197    197      0
kmalloc-8          12288  12288      8  512    1 : tunables    0    0    0 : slabdata     24     24      0
kmem_cache_node      512    512     64   64    1 : tunables    0    0    0 : slabdata      8      8      0
kmem_cache           288    288    256   32    2 : tunables    0    0    0 : slabdata      9      9      0
Last edited by Wild Penguin (2021-06-04 08:18:59)
Don't worry, these are just buffers. Just keep using your system RAM; when free memory runs low, the kernel will release those buffers and any other reclaimable memory for your new load.
BTW, do you use some kind of swap (a swap partition, zswap, zram, etc.)? Even if you have plenty of RAM, you may need some kind of swap or zram if your normal memory load is high and close to the limit.
See this:
https://chrisdown.name/2018/01/02/in-de … -swap.html

Hi xerxes_,
Thanks for your reply, but this really is a case of RAM getting eaten up.
For comparison, here is a normal situation where no leak (or at least no large, easily detectable one) has occurred, after an uptime of a little under 3 hours:
MemTotal:       32820080 kB
MemFree:        13566588 kB
MemAvailable:   26347332 kB
Buffers:          156964 kB
Cached:         13977724 kB
For comparison, see the same figures for the leak situation in my earlier post. I didn't mention it, but there I had an uptime of nearly two days. In the leak situation, buffers and cache take up only around 4GiB of RAM.
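To put a number on that, here is a sketch that adds up the main /proc/meminfo counters and prints what is left over. Which counters to include is my own assumption (the fields overlap slightly and do not cover everything the kernel allocates), and `meminfo_unaccounted` is a hypothetical name, so treat the result as an estimate.

```shell
#!/bin/sh
# Estimate how much of MemTotal is not explained by the usual counters.
# Argument: a meminfo-format file, defaulting to /proc/meminfo.
meminfo_unaccounted() {
    awk '{ sub(/:$/, "", $1); kb[$1] = $2 }
         END {
             # Free memory, page cache, anonymous process memory
             known  = kb["MemFree"] + kb["Buffers"] + kb["Cached"] + kb["AnonPages"]
             # Kernel-side counters that meminfo does report
             known += kb["Slab"] + kb["KernelStack"] + kb["PageTables"]
             known += kb["Percpu"] + kb["VmallocUsed"]
             printf "accounted %d kB\n", known
             printf "unaccounted %d kB\n", kb["MemTotal"] - known
         }' "${1:-/proc/meminfo}"
}

# Usage:
#   meminfo_unaccounted               # live system
#   meminfo_unaccounted saved.txt     # a saved copy of /proc/meminfo
```

Fed the leak snapshot from my first post, this reports 17905164 kB unaccounted, roughly 17GiB that is neither free, cache, process memory nor slab.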
Any (more relevant) input is welcome!
Last edited by Wild Penguin (2021-02-22 18:01:21)
"Has anyone else noticed a similar problem recently?"
Hello, I'm actually not running Arch (but Manjaro), but I experienced the behavior you describe with kernel 5.10(.30) too and just registered to confirm this.
The system (running for 63 hours now) had 2 GiB of RAM used this morning (which is normal) with no applications open (except for the GUI: XFCE), and 16 hours later it is up to 4.5 GiB (still with no applications open). The used swap also went up from 2.61 GiB this morning to 2.69 GiB now. Just like you, I can't see any applications eating the RAM or using the swap.
I have kernel 5.4 installed in parallel, and when I boot it I definitely don't experience this.

% cat /tmp/slabs | awk '{print $1, $3*$4}' | sort -rnk2 | awk '{print $1, $2/1000000}' 
…
kmalloc-4k 816.71
kmalloc-2k 528.351
kmalloc-1k 218.169
ext4_inode_cache 111.199
radix_tree_node 48.1893
kmalloc-512 32.1946
dentry 32.0665
buffer_head 27.0657
kmalloc-64 22.827
ext4_groupinfo_4k 20.3898
vm_area_struct 16.324
inode_cache 13.6937
kmalloc-256 12.0832
proc_inode_cache 10.966
task_struct 10.7689
kmalloc-192 10.7654
kmalloc-96 10.5679
kernfs_node_cache 6.46758
kmalloc-128 6.36928
filp 5.31661
numa_policy 3.29664
ext4_extent_status 2.99064
kmalloc-32 2.8672
sock_inode_cache 2.85542
shmem_inode_cache 2.69069
anon_vma_chain 2.49856
kmalloc-rcl-64 2.22413
anon_vma 1.99162
kmalloc-rcl-512 1.91693
kmalloc-8k 1.6384
PING 1.44179
vmap_area 1.22061
sighand_cache 1.17216
signal_cache 1.03219
biovec-max 0.950272
RAW 0.917504
RAWv6 0.822016
kmalloc-16 0.806912
skbuff_fclone_cache 0.770048
file_lock_ctx 0.764456
mnt_cache 0.672
kmalloc-rcl-192 0.608832
skbuff_head_cache 0.606208
biovec-128 0.557056
proc_dir_entry 0.540288
kmalloc-rcl-96 0.512064
fsnotify_mark_connector 0.499712
TCPv6 0.48672
pid 0.475136
kmalloc-rcl-128 0.47104
skbuff_ext_cache 0.431424
TCP 0.430976
mm_struct 0.42432
cred_jar 0.42336
task_delay_info 0.41616
ftrace_event_field 0.40392
net_namespace 0.394368
biovec-64 0.393216
bio_crypt_ctx 0.3672
kcopyd_job 0.357696
Acpi-Operand 0.350784
trace_event_file 0.348128
fat_inode_cache 0.331632
ext4_io_end 0.323584
UDPv6 0.32256
pool_workqueue 0.294912
jbd2_journal_handle 0.28616
names_cache 0.262144
mqueue_inode_cache 0.26112
bdev_cache 0.259584
files_cache 0.259072
jbd2_journal_head 0.24888
request_queue 0.195072
dm_bio_prison_cell 0.173376
kmalloc-rcl-256 0.155648
bio-1 0.145152
bfq_io_cq 0.14
task_group 0.129024
eventpoll_pwq 0.129024
dm_bufio_buffer-72 0.129024
user_namespace 0.12864
kmalloc-8 0.098304
dax_cache 0.096768
scsi_sense_cache 0.08192
kmem_cache 0.073728
tw_sock_TCP 0.073656
dquot 0.065536
uts_namespace 0.06512
taskstats 0.064768
file_lock_cache 0.063936
request_sock_TCP 0.063232
kmem_cache_node 0.032768
jbd2_revoke_record_s 0.032768
ext4_allocation_context 0.032768
seq_file 0.03264
pde_opener 0.03264
dmaengine-unmap-128 0.03264
Acpi-State 0.03264
Acpi-ParseExt 0.032448
rpc_inode_cache 0.032384
pid_namespace 0.032256
khugepaged_mm_slot 0.032256
hugetlbfs_inode_cache 0.032
dmaengine-unmap-256 0.03168
fat_cache 0.02856
jbd2_revoke_table_s 0.016384
tw_sock_TCPv6 0.016368
ip4-frags 0.004
Do you have a similar pattern in kmalloc-{1,2,4}k? (sudo cat /proc/slabinfo …)
Since this is likely in some device driver, I'd compare lsmod. The more exotic the module, the more likely the cause.
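A sketch of that comparison, assuming the lsmod output was saved to a file on each system (or under each kernel); `modules_only_in_first` is a made-up helper name.

```shell
#!/bin/sh
# Print modules that appear in the first saved lsmod listing but not in the second.
# Arguments: two files holding `lsmod` output (the header line is skipped).
modules_only_in_first() {
    a=$(mktemp)
    b=$(mktemp)
    awk 'NR > 1 { print $1 }' "$1" | sort > "$a"
    awk 'NR > 1 { print $1 }' "$2" | sort > "$b"
    comm -23 "$a" "$b"      # lines unique to the first file
    rm -f "$a" "$b"
}

# Usage:
#   lsmod > /tmp/lsmod.leaky     # on the affected system
#   lsmod > /tmp/lsmod.ok        # on a system (or kernel) without the leak
#   modules_only_in_first /tmp/lsmod.leaky /tmp/lsmod.ok
```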
"I experienced your described behavior with Kernel 5.10(.30)"
Great, after a short intermezzo with the 5.4 kernel I can no longer reproduce this with the 5.10 kernel. No updates in between, just 2 reboots. 60h runtime: no RAM "missing" so far.
(╮°-°)┳┳ ( ╯°□°)╯┻┻

Hi all,
Here is my lsmod:
$ lsmod
Module                  Size  Used by
udp_diag               16384  0
tcp_diag               16384  0
inet_diag              24576  2 tcp_diag,udp_diag
ip6table_filter        16384  0
ip6_tables             32768  1 ip6table_filter
iptable_filter         16384  0
rfkill                 32768  3
uinput                 24576  2
nct6775                77824  0
hwmon_vid              16384  1 nct6775
tda18271c2dd           32768  2
nls_iso8859_1          16384  1
vfat                   24576  1
fat                    90112  1 vfat
intel_rapl_msr         20480  0
intel_rapl_common      32768  1 intel_rapl_msr
x86_pkg_temp_thermal    20480  0
intel_powerclamp       20480  0
coretemp               20480  0
kvm_intel             348160  0
kvm                  1085440  1 kvm_intel
drxk                   90112  2
ddbridge              102400  7
irqbypass              16384  1 kvm
iTCO_wdt               16384  0
intel_pmc_bxt          16384  1 iTCO_wdt
mei_hdcp               24576  0
at24                   24576  0
iTCO_vendor_support    16384  1 iTCO_wdt
wmi_bmof               16384  0
mxm_wmi                16384  0
snd_hda_codec_realtek   163840  1
gpio_ich               16384  0
snd_hda_codec_generic   110592  1 snd_hda_codec_realtek
mousedev               24576  0
ledtrig_audio          16384  1 snd_hda_codec_generic
crct10dif_pclmul       16384  1
snd_hda_codec_hdmi     86016  1
joydev                 28672  0
crc32_pclmul           16384  0
dvb_core              172032  2 drxk,ddbridge
ghash_clmulni_intel    16384  0
snd_hda_intel          57344  8
videobuf2_vmalloc      20480  1 dvb_core
snd_intel_dspcfg       28672  1 snd_hda_intel
videobuf2_memops       20480  1 videobuf2_vmalloc
snd_intel_sdw_acpi     20480  1 snd_intel_dspcfg
aesni_intel           376832  0
snd_hda_codec         184320  4 snd_hda_codec_generic,snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec_realtek
crypto_simd            16384  1 aesni_intel
cryptd                 28672  2 crypto_simd,ghash_clmulni_intel
videobuf2_common       69632  3 videobuf2_vmalloc,dvb_core,videobuf2_memops
rapl                   16384  0
i2c_i801               36864  0
snd_hda_core          114688  5 snd_hda_codec_generic,snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec,snd_hda_codec_realtek
intel_cstate           20480  0
videodev              294912  1 videobuf2_common
snd_hwdep              16384  1 snd_hda_codec
i2c_smbus              20480  1 i2c_i801
snd_pcm               163840  6 snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec,snd_hda_core
intel_uncore          184320  0
snd_timer              45056  1 snd_pcm
snd                   118784  22 snd_hda_codec_generic,snd_hda_codec_hdmi,snd_hwdep,snd_hda_intel,snd_hda_codec,snd_hda_codec_realtek,snd_timer,snd_pcm
mc                     77824  3 videodev,dvb_core,videobuf2_common
e1000e                319488  0
soundcore              16384  1 snd
mei_me                 45056  1
mei                   155648  3 mei_hdcp,mei_me
lpc_ich                28672  0
wmi                    36864  2 wmi_bmof,mxm_wmi
mac_hid                16384  0
video                  53248  0
acpi_pad               24576  0
nfsd                  634880  13
auth_rpcgss           159744  1 nfsd
vboxnetflt             32768  0
vboxnetadp             28672  0
nfs_acl                16384  1 nfsd
lockd                 139264  1 nfsd
vboxdrv               552960  2 vboxnetadp,vboxnetflt
grace                  16384  2 nfsd,lockd
sunrpc                655360  18 nfsd,auth_rpcgss,lockd,nfs_acl
overlay               147456  0
sg                     45056  0
vhba                   36864  0
fuse                  167936  1
nfs_ssc                16384  1 nfsd
crypto_user            20480  0
bpf_preload            16384  0
ip_tables              32768  1 iptable_filter
x_tables               57344  4 ip6table_filter,iptable_filter,ip6_tables,ip_tables
ext4                  966656  4
crc16                  16384  1 ext4
mbcache                16384  1 ext4
jbd2                  151552  1 ext4
hid_logitech_hidpp     53248  0
hid_logitech_dj        28672  0
usbhid                 69632  1 hid_logitech_dj
dm_cache_smq           32768  1
dm_cache               73728  2 dm_cache_smq
dm_persistent_data     98304  1 dm_cache
libcrc32c              16384  1 dm_persistent_data
crc32c_generic         16384  0
dm_bio_prison          20480  1 dm_cache
dm_bufio               40960  1 dm_persistent_data
dm_mod                159744  22 dm_cache,dm_bufio
crc32c_intel           24576  9
xhci_pci               24576  0
xhci_pci_renesas       20480  1 xhci_pci
amdgpu               7331840  65
drm_ttm_helper         16384  1 amdgpu
ttm                    90112  2 amdgpu,drm_ttm_helper
gpu_sched              45056  1 amdgpu
i2c_algo_bit           16384  1 amdgpu
drm_kms_helper        315392  1 amdgpu
syscopyarea            16384  1 drm_kms_helper
sysfillrect            16384  1 drm_kms_helper
sysimgblt              16384  1 drm_kms_helper
fb_sys_fops            16384  1 drm_kms_helper
cec                    81920  1 drm_kms_helper
drm                   626688  29 gpu_sched,drm_kms_helper,amdgpu,drm_ttm_helper,ttm
agpgart                40960  2 ttm,drm
The only thing which could be considered "exotic" is ddbridge and its related modules (DVB card). However, I'm veering towards amdgpu. The only unusual(?) usage pattern I have is that I'm running foldingathome in the background (>90% of the time). Maybe that causes a kernel memory leak which most users don't notice (as they don't run folding)... However, IIRC I've tried disabling the foldingathome service and still faced a leak, but maybe I should retest.
The only other thing even remotely unusual/exotic is lvm (dm_cache), but I'm really not sure it should be considered exotic. I'm currently using it as an SSD cache for mechanical HDDs (lvm-cache), and I believe this kind of setup is very common on servers.
In any case: I have no out-of-tree modules (if there were, my first troubleshooting step would have been to remove them).
For what it's worth, I can see from the Monitorix logs that this problem began around the middle of December. Before that point my memory usage was normal (and I recall no cases of RAM just "disappearing" before then). Because of the systemd log configuration, I no longer have any logs from before December 2020 (*).
Also, I have compiled kernel-zen (and will be compiling regular kernel, too) with CONFIG_DEBUG_KMEMLEAK. Now I'm trying to get something useful from /sys/kernel/debug/kmemleak. I'm trying to confirm this is an amdgpu issue (in my case) to make a useful bug report.
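In case it helps anyone else wading through /sys/kernel/debug/kmemleak output: below is a minimal Python sketch for summarizing a saved report by the process name recorded for each suspected leak. It assumes the report layout described in the kernel's kmemleak documentation (an "unreferenced object 0x... (size N):" line followed by a 'comm "...", pid ...' line); the function names are my own and purely illustrative. As far as I understand it, kmemleak only tracks allocations it knows about (kmalloc, vmalloc and the like), so a small total here does not rule out a leak elsewhere.

```python
import re
from collections import Counter

# Header and owner lines as laid out in the kernel's kmemleak documentation.
HEADER = re.compile(r"^unreferenced object 0x[0-9a-f]+ \(size (\d+)\):")
COMM = re.compile(r'^\s+comm "([^"]*)", pid (\d+)')


def summarize_kmemleak(text):
    """Return (bytes_by_comm, count_by_comm) for a kmemleak report string."""
    bytes_by_comm = Counter()
    count_by_comm = Counter()
    pending_size = None  # size from the most recent header line
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            pending_size = int(header.group(1))
            continue
        owner = COMM.match(line)
        if owner and pending_size is not None:
            comm = owner.group(1)
            bytes_by_comm[comm] += pending_size
            count_by_comm[comm] += 1
            pending_size = None
    return bytes_by_comm, count_by_comm


def print_summary(text):
    """Print suspected leaks grouped by process name, largest first."""
    bytes_by_comm, count_by_comm = summarize_kmemleak(text)
    for comm, total in bytes_by_comm.most_common():
        print(f"{comm}: {count_by_comm[comm]} objects, {total} bytes")
```

To use it, save the report as root (for example by copying /sys/kernel/debug/kmemleak to a file), read that file into a string and pass it to print_summary().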
It could be helpful if other people here report what GPU you are using!
Some graphs from Monitorix (from February, but I still see exactly the same pattern). At 15G+ of memory "lost":
Daily graph: https://flic.kr/p/2kY2J1Q (at the "drop" I've shut down the GUI and services; graph taken after a reboot)
Monthly graph: https://flic.kr/p/2kXSDaN
Yearly graph: https://flic.kr/p/2kXSDax (we can see abnormal memory usage from December onwards)
Stay tuned!
p.s. *) Actually, the issue with the logs is that all kinds of software spam the system log these days, so with the default systemd configuration the size limits are reached far too soon and older logs (including the actually useful ones) are gone by the time they would be needed in situations like this! I've since tweaked the systemd configuration so that logs are retained, and started to back them up in case systemd still insists on deleting them (I've got plenty of HDD space). But this is another issue...
Last edited by Wild Penguin (2021-05-11 16:33:01)
Offline
I've been observing this issue for months, too. I spent some time narrowing it down, and I, too, came to the conclusion that amdgpu is at fault. Sadly, I didn't have time to dig deeper than that, so I just resorted to rebooting every couple of days. I also can't run f@h anymore because summer is coming and I don't need the additional heating.
However, I can add a new datapoint: I just started playing Kerbal Space Program after a long hiatus and, unlike f@h, which takes days to eat up my 32GB of memory, going EVA with a Kerbal will cause the OOM killer to jump in within a minute. The symptoms are the same: no userspace process is consuming the memory, it's the kernel. Unlike with f@h, however, killing KSP does reduce the memory usage, although it still doesn't recover all the memory lost. As a workaround, I'll start disabling mods with graphical effects to see which one tickles amdgpu the wrong way. I hope someone has time to dig into the kernel; I've wanted to try my hand at tracing and such for a long time, but I just can't find the energy these days.
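For anyone wanting a rough cross-check of the "it's the kernel, not a process" symptom without rebuilding the kernel, here is a small Python sketch that reads /proc/meminfo and estimates how much RAM is not covered by the usual counters. The list of fields subtracted and the function names are my own assumptions, and the result is only a heuristic: pages a driver takes straight from the page allocator are generally not itemized in /proc/meminfo, so a large and growing remainder would be consistent with (but does not prove) a leak of that kind.

```python
# Fields (in KiB) treated as "accounted for". This selection is an
# assumption made for this sketch, not an official kernel definition.
ACCOUNTED = (
    "MemFree", "Buffers", "Cached", "SwapCached",
    "AnonPages", "Slab", "KernelStack", "PageTables",
)


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts:
            info[name.strip()] = int(parts[0])  # KiB for most fields
    return info


def unaccounted_kib(info):
    """MemTotal minus the ACCOUNTED counters: a rough 'missing RAM' figure."""
    accounted = sum(info.get(key, 0) for key in ACCOUNTED)
    return info["MemTotal"] - accounted


def report(path="/proc/meminfo"):
    """Print unreclaimable slab and the unaccounted remainder in MiB."""
    with open(path) as handle:
        info = parse_meminfo(handle.read())
    print(f"SUnreclaim:  {info.get('SUnreclaim', 0) / 1024:.0f} MiB")
    print(f"Unaccounted: {unaccounted_kib(info) / 1024:.0f} MiB")
```

Calling report() once right after boot and again when the RAM has "disappeared" gives two numbers to compare; it is the growth between them that matters, not the absolute value.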
Offline

https://www.kernel.org/doc/html/latest/ … mleak.html but it's not enabled in the default kernel, so you'd have to compile a kernel yourself.
https://bugzilla.redhat.com/show_bug.cgi?id=1880833#c19 seemed interesting for f@h but you probably *play* w/ an active display…
Offline

https://www.kernel.org/doc/html/latest/ … mleak.html but it's not enabled in the default kernel, so you'd have to compile a kernel yourself.
I already know this (as stated in my OP).
I was asking in the beginning if there is any other way, since recompiling the kernel and trying to reproduce this bug is a hassle, especially since the leak seems to have come and gone since last December. On some weeks the leak is not there, then suddenly it is back.
FWIW, since posting the OP I've recompiled with the options and played around with kmemleak. I just have no idea whether the output (which I've saved) is useful, and partly because of other duties I haven't gotten around to posting a bug report.
https://bugzilla.redhat.com/show_bug.cgi?id=1880833#c19 seemed interesting for f@h but you probably *play* w/ an active display…
Thanks for this link, but it is probably unrelated, as I didn't have this problem before December (I can deduce this from my monitoring data). Most reporters there seem to state that their problems began before the 5.9 series, whereas mine started at around 5.9 and have been ongoing since.
It is possible I haven't had a leak for two weeks. We'll see... (it is possible a leak has started to happen just as I'm posting this!).
Offline

Since upgrading to 5.12.2-zen2-1-zen (and newer) I've not experienced any memory leaks (at least none large enough for me to detect). The leak went away just as I recompiled a kernel with kmemleak ;-).
Marking as [SOLVED]; however, feel free to comment here in case you think you still have the same memory leak. Chances are high, though, that it would be another, unrelated leak...
Last edited by Wild Penguin (2021-06-04 08:20:12)
Offline