#1 2021-06-10 08:27:52

Salkay
Member
Registered: 2014-05-22
Posts: 527

Available memory gradually decreases

According to free, my "available" memory gradually disappears over the course of a few days, eventually triggering the OOM killer. Very little memory is reported as "used". My understanding is that "buffers" and "cache" should be made available when necessary, but this doesn't seem to happen. Even after quitting all applications, the "available" memory is quite low.

$ free -wm
               total        used        free      shared     buffers       cache   available
Mem:           31989        6148        1596       16693        1213       23031        8606
Swap:              0           0           0

Here, free + buffers + cache = 25840, much more than the available memory of 8606.

I tried filling the cache by writing and reading a large file, to see whether the cache as a whole was somehow stuck.

$ free -wm
               total        used        free      shared     buffers       cache   available
Mem:           31989        6346         204       16938         546       24891        8249
Swap:              0           0           0

"cache" increased by almost 2G, but there was not much difference in "available" (as expected).

I then attempted to free up the caches.

$ echo 1 | sudo tee /proc/sys/vm/drop_caches
$ free -wm
               total        used        free      shared     buffers       cache   available
Mem:           31989        6111        7853       16867           2       18021        8250
Swap:              0           0           0
$ echo 3 | sudo tee /proc/sys/vm/drop_caches
$ free -wm
               total        used        free      shared     buffers       cache   available
Mem:           31989        6134        7864       16946          20       17970        8157
Swap:              0           0           0

As expected, this also had no effect on "available". I expected "cache" to drop to almost zero, but it was still ~18G. However, "free" was now very close to "available", so I wonder whether a large amount of memory in "cache" is inaccessible, and therefore neither contributes to "available" nor can be dropped.

#2 2021-06-10 09:01:47

Ropid
Member
Registered: 2015-03-09
Posts: 1,053

Re: Available memory gradually decreases

Maybe it's files inside a tmpfs filesystem? Check what's going on there with "df". This shows just the tmpfs entries:

df -h -t tmpfs

I just tried creating a file here in /tmp with "fallocate -l 10G testfile" and those 10G show up in the "cache" column in free's output.

Last edited by Ropid (2021-06-10 09:04:38)

#3 2021-06-10 09:20:49

Salkay
Member
Registered: 2014-05-22
Posts: 527

Re: Available memory gradually decreases

Thanks @Ropid. Good idea. It looks like there is ~0.75G there, which goes some way to explaining the shortfall, but it looks like there's still another ~16.5 G missing.

$ df -h -t tmpfs
Filesystem      Size  Used Avail Use% Mounted on
run              16G  1.7M   16G   1% /run
tmpfs            16G   45M   16G   1% /dev/shm
tmpfs            16G  699M   15G   5% /tmp
tmpfs           3.2G  176K  3.2G   1% /run/user/1000
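
To total the tmpfs usage instead of adding up the rows by eye, something like the following could be used. It is only a sketch: "sum_used_kb" is a made-up helper name, and it assumes the usual "df -k" layout, where "Used" is the third column and no filesystem name contains spaces.

```shell
# sum_used_kb: hypothetical helper, not part of any package.
# Reads "df -k" style output on stdin and prints the total of the
# third column ("Used", in 1K blocks), skipping the header row.
sum_used_kb() {
    awk 'NR > 1 { total += $3 } END { print total + 0 }'
}

# Usage (on the affected machine):
#   df -k -t tmpfs | sum_used_kb
```

The result is in kB, so it can be compared directly with the Shmem line in /proc/meminfo. Shmem can also include shared memory that does not live in any mounted tmpfs, so a gap between the two numbers is not by itself surprising.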

#4 2021-06-10 10:58:40

sabroad
Member
Registered: 2015-05-24
Posts: 180

Re: Available memory gradually decreases

Salkay wrote:

looks like there's still another ~16.5 G missing.

Looks about the same as "shared" memory:

$ free -wm
               total        used        free      shared     buffers       cache   available
Mem:           31989        6148        1596       16693        1213       23031        8606
Swap:              0           0           0

What does

# cat /proc/meminfo

have to say about this?


--
saint_abroad

#5 2021-06-10 11:03:18

sabroad
Member
Registered: 2015-05-24
Posts: 180

Re: Available memory gradually decreases

Salkay wrote:

looks like there's still another ~16.5 G missing.

$ df -h -t tmpfs
Filesystem      Size  Used Avail Use% Mounted on
run              16G  1.7M   16G   1% /run
tmpfs            16G   45M   16G   1% /dev/shm
tmpfs            16G  699M   15G   5% /tmp
tmpfs           3.2G  176K  3.2G   1% /run/user/1000

Also, there are other filesystems backed by tmpfs that consume shmem, such as udev.


--
saint_abroad

#6 2021-06-10 12:53:47

seth
Member
Registered: 2012-09-03
Posts: 21,101

Re: Available memory gradually decreases

man free wrote:

shared Memory used (mostly) by tmpfs (Shmem in /proc/meminfo)

In addition, files mapped by processes that have been changed (or removed) on disk cannot be dropped.
I'm not sure, but the same may apply to file caches that are simply active.

#7 2021-06-11 02:34:37

Salkay
Member
Registered: 2014-05-22
Posts: 527

Re: Available memory gradually decreases

sabroad wrote:

Looks about the same as "shared" memory:

What does

# cat /proc/meminfo

have to say about this?

Also, there are other filesystems backed by tmpfs consuming shmem such as udev.

Thanks sabroad. The sizes have changed a bit overnight, so here are the new numbers. I'm not exactly sure what to look at here, but by my calculation, the shortfall is approximately free + buffers + cache - available, i.e. 440+3480+21457-6127 = 19250 MiB. Shmem indeed looks close at about 18426 MiB (18868708 kB). What does this mean, and how can I fix it?

$ free -wm
               total        used        free      shared     buffers       cache   available
Mem:           31989        6610         440       18406        3480       21457        6127
Swap:              0           0           0
$ cat /proc/meminfo
MemTotal:       32757108 kB
MemFree:          402584 kB
MemAvailable:    6168884 kB
Buffers:         3589912 kB
Cached:         20130860 kB
SwapCached:            0 kB
Active:          4171764 kB
Inactive:        7950616 kB
Active(anon):      14196 kB
Inactive(anon):  7257676 kB
Active(file):    4157568 kB
Inactive(file):   692940 kB
Unevictable:    17943260 kB
Mlocked:            2008 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:             46852 kB
Writeback:            56 kB
AnonPages:       6344956 kB
Mapped:           528088 kB
Shmem:          18868708 kB
KReclaimable:    1779544 kB
Slab:            2009948 kB
SReclaimable:    1779544 kB
SUnreclaim:       230404 kB
KernelStack:       29232 kB
PageTables:        85416 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    16378552 kB
Committed_AS:   38599792 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       50136 kB
VmallocChunk:          0 kB
Percpu:             8544 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
FileHugePages:         0 kB
FilePmdMapped:         0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB
DirectMap4k:     1650280 kB
DirectMap2M:    31780864 kB
DirectMap1G:           0 kB
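
The shortfall can also be computed straight from /proc/meminfo. The sketch below assumes that free's "cache" column is Cached plus SReclaimable, which is how the procps man page describes it. "meminfo_shortfall_kb" is a hypothetical helper name, and the result is in kB as the kernel reports it, so divide by 1024 to compare with "free -m".

```shell
# meminfo_shortfall_kb: hypothetical helper, not an existing tool.
# Prints MemFree + Buffers + Cached + SReclaimable - MemAvailable in kB.
# Reads /proc/meminfo by default, or a saved copy given as the first argument.
meminfo_shortfall_kb() {
    awk '
        $1 == "MemFree:"      { free = $2 }
        $1 == "Buffers:"      { buffers = $2 }
        $1 == "Cached:"       { cached = $2 }
        $1 == "SReclaimable:" { slab = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        END { printf "%d\n", free + buffers + cached + slab - avail }
    ' "${1:-/proc/meminfo}"
}
```

With the meminfo output above this gives 19734016 kB, roughly 19271 MiB, which is in the same range as Shmem (18868708 kB).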

As per the link, I also looked at /dev, but neither df nor du reported any usage there.

$ df -h
Filesystem      Size  Used Avail Use% Mounted on
dev              16G     0   16G   0% /dev
run              16G  1.7M   16G   1% /run
/dev/nvme0n1p2   49G   38G  8.2G  83% /
tmpfs            16G   70M   16G   1% /dev/shm
tmpfs            16G  699M   15G   5% /tmp
/dev/nvme0n1p3  185G  142G   34G  81% /home
/dev/nvme0n1p1  356M  202M  155M  57% /boot
/dev/sdb1       1.8T  1.8T   60G  97% /externalHDD
/dev/sda6       1.7T  1.2T  443G  73% /HDD
tmpfs           3.2G  204K  3.2G   1% /run/user/1000
encfs           185G  142G   34G  81% /home/salkay/.decrypt
$ sudo du -hs /dev
0	/dev

seth wrote:
man free wrote:

shared Memory used (mostly) by tmpfs (Shmem in /proc/meminfo)

In addition, files mapped by processes that have been changed (or removed) on disk cannot be dropped.
I'm not sure, but the same may apply to file caches that are simply active.

Thanks seth. Is there any way I can test this?

#8 2021-06-11 05:13:03

Ropid
Member
Registered: 2015-03-09
Posts: 1,053

Re: Available memory gradually decreases

For that "Shmem" entry from /proc/meminfo, I tried to find out how to investigate what's going on there. I found a tool, "ipcs", and a location, "/dev/shm".

You can do "ipcs -m --human" to get an overview of those shmem objects and their size. Then you can do "ipcs -p" to get the process IDs that are involved with those.

About those deleted files being kept open by programs, you can find those programs with "lsof +L1" and "lsof -dDEL". The two lsof commands find different stuff.

Last edited by Ropid (2021-06-11 05:19:35)

#9 2021-06-11 05:44:47

Salkay
Member
Registered: 2014-05-22
Posts: 527

Re: Available memory gradually decreases

Thanks Ropid. Unfortunately I was forced to restart because the oom killer was going crazy. Previously the shortfall in memory was over 19G, but even after a restart it's still reasonably large at 20824+3+4206-21825 = 3208 M. I also checked again, and /proc/meminfo reports Shmem at ~3G (3074312 kB), so it's still consistent at least.

Ropid wrote:

You can do "ipcs -m --human" to get an overview of those shmem objects and their size. Then you can do "ipcs -p" to get the process IDs that are involved with those.

Unfortunately this didn't seem to reveal any processes with a large memory footprint.

$ ipcs -m --human
------ Shared Memory Segments --------
key        shmid      owner      perms      size       nattch     status      
0x00000000 32771      salkay     600          128K     2          dest         
0x00000000 32772      salkay     600          128K     2          dest         
0x00000000 32773      salkay     600          1.2M     2          dest         
0x00000000 32774      salkay     600          1.2M     2          dest         
0x00000000 65547      salkay     600           48K     2          dest         
0x00000000 65548      salkay     600           48K     2          dest         
0x00000000 32787      salkay     600          512K     2          dest         
0x00000000 32788      salkay     600            4M     2          dest         
0x51210046 21         salkay     600            1K     1                       
0x00000000 65560      salkay     600           24K     2          dest         
0x00000000 65561      salkay     600           24K     2          dest         
0x00000000 65564      salkay     600          840K     2          dest         
0x00000000 29         salkay     600          384K     2          dest         
0x00000000 65566      salkay     600          840K     2          dest         
0x00000000 32         salkay     600          384K     2          dest         
0x00000000 33         salkay     600          512K     2          dest         
0x00000000 36         salkay     600          512K     2          dest         
0x00000000 32805      salkay     600          512K     2          dest         

The summary didn't seem helpful either.

$ ipcs -u
------ Messages Status --------
allocated queues = 0
used headers = 0
used space = 0 bytes

------ Shared Memory Status --------
segments allocated 18
pages allocated 2839
pages resident  1402
pages swapped   0
Swap performance: 0 attempts	 0 successes

------ Semaphore Status --------
used arrays = 1
allocated semaphores = 1

Ropid wrote:

About those deleted files being kept open by programs, you can find those programs with "lsof +L1"

Am I just looking for the rows with a large SIZE/OFF? The largest only had 67 M.

$ lsof +L1
COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF NLINK    NODE NAME
...
pulseaudi   874  sal    6u   REG    0,1 67108864     0    3100 /memfd:pulseaudio (deleted)
...

Ropid wrote:

and "lsof -dDEL".

Here the column SIZE/OFF was empty for each row, so I wasn't sure what to look for. I tried adding the size flag -s but the column remained empty.
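
For the "lsof +L1" output, one way to bring the largest entries to the top is sketched below. "largest_deleted_open" is an invented name, and the function assumes the column layout shown above, with SIZE/OFF as the seventh whitespace-separated field. Rows where that field is an offset (such as 0t0) or where a column is blank are skipped or may be misread.

```shell
# largest_deleted_open: hypothetical helper, not part of lsof.
# Reads "lsof +L1" output on stdin, keeps rows whose 7th field is a plain
# number (a size rather than an offset), and prints the N largest
# (default 10), biggest first.
largest_deleted_open() {
    awk '$7 ~ /^[0-9]+$/' | sort -k7,7nr | head -n "${1:-10}"
}

# Usage (on the affected machine):
#   lsof +L1 | largest_deleted_open 5
```

The same file can appear once per process or file descriptor that holds it open, so the rows should not simply be added up.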

#10 2021-06-11 06:02:22

seth
Member
Registered: 2012-09-03
Posts: 21,101

Re: Available memory gradually decreases

Active(anon):      14196 kB
Inactive(anon):  7257676 kB

There's about 7GB in anon pages (allocated by programs), mostly inactive

Active(file):    4157568 kB
Inactive(file):   692940 kB

~4.5 GB in file caches (mostly active)

Unevictable:    17943260 kB
…
Shmem:          18868708 kB

These here are the bad numbers

Mlocked:            2008 kB

Suggests it's NOT some userspace process mlock'ing them and then just forgetting to unlock them.
The more likely cause is a "leaking" kernel module (it's technically not a leak, just dumb)

=> Which kernel do you use and does this happen w/ the lts kernel as well?

#11 2021-06-11 06:06:31

Salkay
Member
Registered: 2014-05-22
Posts: 527

Re: Available memory gradually decreases

Thanks @seth. Very interesting.

I use the vanilla linux kernel, but I will restart and try the lts.

#12 2021-06-11 16:06:24

sabroad
Member
Registered: 2015-05-24
Posts: 180

Re: Available memory gradually decreases

Salkay wrote:
Unevictable:    17943260 kB

Does a KDE session restart (logout + login) free the Unevictable memory?


--
saint_abroad

#13 2021-06-12 00:51:53

Salkay
Member
Registered: 2014-05-22
Posts: 527

Re: Available memory gradually decreases

Thanks @sabroad. That looks interesting too! These problems are actually on my work computer, so I'll report back in a few days after the weekend (long weekend here in Australia). BTW I haven't experienced similar levels of OOM killing on my home computer, but in both there is a similar shortfall in free + buffers + cache - available, with both systems currently ~2GB. Is that an "expected" amount? What is reasonable?

EDIT: Regarding the "expected" shortfall, I had a look on my (non-leaking) home computer. "shared" was ~2GB and df -BM | grep tmpfs showed /dev/shm with 1GB in use. ls -l did not show it, possibly suggesting deleted but still open files. lsof /dev/shm revealed that these deleted files were open in Signal and Steam. Quitting both programs freed up this directory, and the shortfall was now "only" 1GB.

Last edited by Salkay (2021-06-12 01:21:11)

#14 2021-06-12 06:19:35

seth
Member
Registered: 2012-09-03
Posts: 21,101

Re: Available memory gradually decreases

The numbers you want to look at here are mlock and unevictable - if they're similar (unevictable maybe slightly bigger) you're "good"
shmem/shared is typically your combined tmpfs usage (so /tmp and /run/user/* next to /dev/shm)
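
A quick way to put those two numbers side by side is sketched below. "unevictable_gap_kb" is an invented name. The function only subtracts Mlocked from Unevictable as they appear in /proc/meminfo. How large a gap counts as a problem is a judgment call that the script does not make.

```shell
# unevictable_gap_kb: hypothetical helper, not an existing tool.
# Prints Unevictable - Mlocked in kB. Reads /proc/meminfo by default,
# or a saved copy given as the first argument.
unevictable_gap_kb() {
    awk '
        $1 == "Unevictable:" { unevictable = $2 }
        $1 == "Mlocked:"     { mlocked = $2 }
        END { printf "%d\n", unevictable - mlocked }
    ' "${1:-/proc/meminfo}"
}
```

For the /proc/meminfo output in post #7 (Unevictable 17943260 kB, Mlocked 2008 kB) this prints 17941252.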

#15 Yesterday 20:28:36

Wild Penguin
Member
Registered: 2015-03-19
Posts: 217

Re: Available memory gradually decreases

Hi Salkay,

The behavior you've described sounds very similar (in its timespan) to what I encountered in a thread I started; however, in my case the problems just went away after a kernel upgrade. I had the leak on both the -zen and the regular kernel. You might want to take a look at the kmemleak documentation and try recompiling the kernel with it enabled. I wish I had known the tips in this thread (about lsof and ipcs)!

Just a few questions: what GPU are you using? Do you have any potential heavily GPU-using (such as folding@home) or other heavy (server-like) software running in the background besides the GUI?

In my case, my working theory is that the leak resided in amdgpu, possibly exacerbated by running F@H. However, that is just a working theory, and the problems went away before I could allocate time to investigate. In any case my leak is probably unrelated; this is just my 2 cents of general tips!

EDIT: As another workaround in addition to using the -lts kernel, you might like to take a look at earlyOOM or nohang. I prefer the latter, having tried them both. Nohang has saved my butt a few times while working with the leaky kernel (after >20GiB had been consumed by "something", before processes started to be killed). The in-kernel OOM killer is very "stupid", especially from a desktop-oriented user's point of view.

Last edited by Wild Penguin (Yesterday 20:41:56)
