I often run into the issue where some workflow I have takes up all of my system resources, either CPU or RAM, which makes the computer freeze/lag and become unusable. I can not even change to tty in those situations. The only thing that helps is magic sysrq or hardware reboot.
I am looking for a way to set it up so that this never happens.
First I found this post, which mentions some possible solutions and other interesting asides: https://forums.opensuse.org/t/how-to-ma … /168752/21
But it doesn't seem to mention a real solution for the underlying issue.
My search-fu is probably crap, because I couldn't find much else online on this topic. If you have any additional links I can read up on, please share them with me.
My naive idea was to reserve some system resources for a list of apps somehow, so that they always have what they need.
An example workflow in this scenario would look like this:
1. Some app freezes my system due to using up all the RAM
2. I tab out to desktop
3. I start Ksysguard from the start menu
4. I identify what is filling up the memory
5. I decide which app to kill/shutdown
This way I could continue working with minimal impact and without playing OOM Russian roulette, which could kill my current work in progress, potentially losing data.
As far as I understand, cgroups can be used to achieve this, but it seems very hacky so far. I've read that apps are added to a cgroup through their PID, which means that they have to be running already. This approach doesn't seem to cover my example workflow, since Ksysguard wouldn't be in the cgroup and would have no resources left to start with. I could of course have Ksysguard running all the time, but that is not foolproof, since I could close it by accident at some point and later run into the same problem again.
It also seems hacky to wait for KDE Plasma to start in a systemd service, just to find its PID.
The opposite approach would be to restrict all the apps that could potentially hog resources, as shown in https://wiki.archlinux.org/title/cgroup … _a_command
This approach seems to require lots of thinking and babysitting.
I would really prefer a set-and-forget solution that has no/minimal edge cases and just works all the time.
So I am asking for your input and looking for ideas and suggestions. Thank you in advance for your time and effort. I greatly appreciate it.
__________________________________________________
Additional relevant system information:
OS: Arch Linux x86_64
Kernel: 6.6.2-zen1-1-zen
DE: Plasma 5.27.9
WM: KWin
CPU: AMD Ryzen 7 4700U with Radeon Graphics (8) @ 2.000GHz
Memory: 5849MiB / 15398MiB
$ cat /proc/swaps
Filename      Type        Size      Used      Priority
/dev/zram0    partition   8388604   2472432   100
$ sudo -s sysctl -a | grep oom
vm.oom_dump_tasks = 1
vm.oom_kill_allocating_task = 0
vm.panic_on_oom = 0
Last edited by Deckweiss (2023-11-30 17:05:54)
Offline
I often run into the issue where some workflow I have takes up all of my system resources, either CPU or RAM
* Get more RAM
* quota the hogs, https://wiki.archlinux.org/title/Cgroup … _a_command
* add some real swap that doesn't eat away from your RAM, https://chrisdown.name/2018/01/02/in-de … -swap.html
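For the quota route, here is a minimal sketch of starting a hog inside its own transient cgroup, so the limit applies from the moment it launches instead of being attached to an already running PID. `systemd-run --user --scope` and the `MemoryMax=` property are standard systemd; the `run_capped` wrapper, the `DRY_RUN` switch and the 10G figure are made up for illustration, and this assumes the memory controller is delegated to your user session.

```shell
# Hypothetical wrapper: start a command inside a fresh transient scope
# (its own cgroup) with a hard memory cap applied before it runs.
run_capped() {
    limit="$1"
    shift
    if [ -n "${DRY_RUN:-}" ]; then
        # Only print what would be executed.
        echo "systemd-run --user --scope -p MemoryMax=$limit -- $*"
    else
        systemd-run --user --scope -p "MemoryMax=$limit" -- "$@"
    fi
}

# Example (illustrative values): cap a build at 10G.
# run_capped 10G make -j8
```

If the job hits the cap and nothing can be reclaimed, the kill should be confined to that scope rather than picking a victim from the whole session.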
Once the system starts thrashing, it's too late.
Instead of trying a fat tool like ksysguard while you're already short on resources, keep an instance of top running - as root and with "sudo nice -n -20 top" - and even then you might not have the resources to switch or interact w/ that window anymore.
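A cheaper variant of the same idea, as a rough sketch: a shell function that reads only /proc/meminfo and prints how much memory the kernel still considers available. `MemTotal` and `MemAvailable` are real fields of that file; the function name and the watch loop are hypothetical, and the same caveat applies: once the system is thrashing, even this may not get scheduled in time.

```shell
# Print MemAvailable as a percentage of MemTotal.
# The optional argument (a meminfo-style file) exists only to make testing easy.
mem_available_pct() {
    awk '/^MemTotal:/ { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", avail * 100 / total }' "${1:-/proc/meminfo}"
}

# Hypothetical watch loop; start it in a terminal before the heavy job.
# while sleep 2; do printf '\r%3s%% available' "$(mem_available_pct)"; done
```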
I would really prefer a set and forget solution that has no/minimal edgecases and just works all the time.
Get more RAM.
You want this to "just work" but you don't want to rely on the OOM killer and the only way this works is by getting more RAM.
Userspace OOM daemons like earlyoom or systemd-oomd allow you some control over what gets killed (and esp. what does not), but the idea that you run into OOM and then start acting by launching a process that probably has a gazillion file accesses before rendering and showing a window (all of which requires heaps of RAM) is DOA.
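As a sketch of what that control can look like with earlyoom, assuming the packaged earlyoom.service reads its arguments from /etc/default/earlyoom: `--prefer` and `--avoid` are earlyoom options that take a regular expression. The process names are guesses (Slack and a browser as preferred victims, the Plasma session as protected) and need adjusting to whatever actually runs on the machine.

```shell
# /etc/default/earlyoom (assumed to be read by earlyoom.service)
# --prefer: kill these first; --avoid: kill these last.
# Both take a regex matched against the process name.
# The names below are guesses for this setup, not a recommendation.
EARLYOOM_ARGS="--prefer '(^|/)(slack|firefox)$' --avoid '(^|/)(plasmashell|kwin_x11|kwin_wayland|Xorg)$'"
```

After editing the file, `systemctl enable --now earlyoom` (as root) starts the daemon with these arguments.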
Also, in case it's actually not an OOM situation: https://wiki.archlinux.org/title/Ryzen# … k_freezing
Offline
I can't get more RAM, because it's a laptop where the RAM is soldered in. (Now for the love of all, don't say to just get a laptop with more RAM)
I know exactly what is using the RAM, it is intended. I need to run some heavy work stuff.
To be more precise, it is not deterministic how much RAM will be used.
Sometimes it completes no problem, sometimes not. I am usually just 1 or 2 GB short and I'd rather kill slack and a browser when the case occurs and let the process complete, instead of a full shutdown because it starts freezing.
- Keeping things open just in case is the worst possible UX.
- Using OOM is the second worst, because it might equate to the same loss of data/progress as a sudden reboot. Configuring userspace OOM is not failsafe (if you didn't foresee a future case) and requires constant maintenance and thinking ahead.
- Using swap is in my case also not productive, the system becomes unusable because it lags as hell and the work process that would usually need 20 minutes now needs 20 hours because of endless pageswapping. Might as well reboot and start over.
None of those are solutions, but bandaids for the symptoms of the problem.
(Rant: No wonder people don't switch to Linux, when the community tells you that you just have to use your brainjuice to tiptoe around its shortcomings instead of focusing just on whatever you need to get done. I do not want to think about the OS and its quirks when I work.)
Why can there be no proper solution where the system keeps a configured list of programs always safe from full RAM? (through cgroups or other means)
Last edited by Deckweiss (2023-11-30 19:58:40)
Offline
I can't get more RAM, because it's a laptop where the RAM is soldered in.
Add proper swap - that's still the next best thing.
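A minimal sketch of what "proper swap" means in practice: first check whether any active swap is backed by something other than zram, then add a swap file. The /proc/swaps layout and the `dd`, `mkswap` and `swapon` commands are standard; the helper names, the path and the size are illustrative, and the swap file steps assume a filesystem such as ext4 (Btrfs needs extra steps).

```shell
# Sum the size (KiB) of active swap areas that are not zram devices.
# The optional argument exists only for testing; the default is /proc/swaps.
disk_swap_kib() {
    awk 'NR > 1 && $1 !~ /^\/dev\/zram/ { sum += $3 }
         END { print sum + 0 }' "${1:-/proc/swaps}"
}

# Hypothetical helper: create and enable a swap file (run as root).
make_swapfile() {
    path="$1"
    size_mib="$2"
    dd if=/dev/zero of="$path" bs=1M count="$size_mib" &&
        chmod 600 "$path" &&
        mkswap "$path" &&
        swapon "$path"
}

# Example (illustrative size): make_swapfile /swapfile 16384
```

With only the zram device from the first post active, `disk_swap_kib` prints 0.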
Using OOM is the second worst, because it might equate to the same loss of data/progress as a sudden reboot.
Which is why I told you to look at userspace OOM daemons that allow you to impact what processes will and won't get taken out.
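To see who would currently be first in line, the kernel exposes a per-process score in /proc/<pid>/oom_score. A rough sketch that ranks processes by it follows; the function name is made up, and the optional directory argument exists only so it can be tried against a fake tree.

```shell
# List processes by the kernel's current OOM score, highest first.
# Output format: "<score> <pid> <name>".
oom_candidates() {
    root="${1:-/proc}"
    for dir in "$root"/[0-9]*; do
        [ -r "$dir/oom_score" ] || continue
        # Processes can exit mid-scan, hence the error suppression.
        score=$(cat "$dir/oom_score" 2>/dev/null) || continue
        name=$(cat "$dir/comm" 2>/dev/null) || continue
        echo "$score ${dir##*/} $name"
    done | sort -rn
}

# Example: oom_candidates | head -n 5
# To make a running process a less likely victim (negative values need root):
# echo -500 > /proc/<pid>/oom_score_adj
```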
Why can there be no proper solution where the system keeps a configured list of programs always safe from full RAM?
Because we're not living in a fantasy land? If you're OOM, you can't pull some RAM out of your ass for some elevated process. And that has absolutely nothing to do w/ the OS.
Using swap is in my case also not productive, the system becomes unusable because it lags as hell and the work process that would usually need 20 minutes now needs 20 hours because of endless pageswapping.
I.e. no RAM is inactive, all of it is currently being accessed - you're *actively* requiring more RAM than you have. That's not solvable.
That being said: if you're not having this problem at all with not-linux, why are we having this debate?
I mean, you could just happily stay in your fantasy land where RAM magically multiplies?
Offline
I don't need more RAM.
I need a RAM guarantee for essential software like the DE and the task manager.
I feel like you are purposely ignoring the parts of what I keep writing that are important to me, which isn't productive for figuring out a proper solution for my case. If there is nothing else you can contribute, I thank you for your effort so far. Cheers.
Last edited by Deckweiss (2023-11-30 20:36:41)
Offline
Using swap is in my case also not productive, the system becomes unusable because it lags as hell and the work process that would usually need 20 minutes now needs 20 hours because of endless pageswapping. Might as well reboot and start over.
This sounds very similar to an issue I had in the past when I had to fall back to old HW due to a HW failure of my modern desktop.
That system had little RAM, but the cause of the endless page swapping was that the drive controller couldn't cope with the IO demand.
Adding a 2nd drive connected through a different controller and putting swap on that 2nd drive reduced the IO load enough that the system could handle the task.
It was slow and sluggish to respond, but it became usable.
I suggest you use TMPDIR to move temporary files off /tmp (which by default is in RAM) onto a 2nd drive that does not use the same controller, and test.
If that reduces the problems, your bottleneck is not CPU or RAM but IO throughput.
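A small sketch of that test, assuming the heavy job honours the TMPDIR environment variable (not every program does). The helper name and the mount point are hypothetical; `findmnt` is part of util-linux.

```shell
# Hypothetical helper: run a command with its temporary files redirected
# to a directory on another drive.
run_with_tmpdir() {
    dir="$1"
    shift
    mkdir -p "$dir" || return 1
    TMPDIR="$dir" "$@"
}

# Is /tmp RAM-backed? This prints "tmpfs" if so:
# findmnt -n -o FSTYPE --target /tmp
# Example (illustrative path): run_with_tmpdir /mnt/second/tmp make -j8
```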
Last edited by Lone_Wolf (2023-12-01 11:04:52)
Online
OP has a notebook that apparently doesn't even allow to extend RAM…
On the swap/thrashing case, another reason might simply be (still) underdimensioned swap, forcing the kernel to (still) constantly drop caches.
If you're systematically overdemanding RAM, no swap will help you, though (in terms of performance) - you simply cannot run a task on an underdimensioned system and ask for the magic sysfs switch to make it work nevertheless. That wish isn't based in reality.
"I want to compile chromium on my Atom netbook with 2GB RAM, but I don't want to use swap. Also the fans should not spin and ideally the process take no longer than 15 minutes. I'm sure there must be some config option to allow this. WHY DON'T YOU JUST TELL ME!!!"
I need a RAM guarantee for essential software like the DE and the task manager.
This idea is based on a gross misunderstanding on how reality works.
There's no such thing as a "special process".
If you start ksysguard, it'll want to read some 100MBs of libraries and probably some config/graphic files from the disk. Into RAM.
It'll also want to allocate memory for its internal business and rendering the output.
It'll also require the display server/compositor to allocate memory to get that nice window in front of you.
You can mlockall a process to prevent it from being swapped out - except that you don't have any swap to begin with (we don't count the thing that eats away your RAM in turn) and this only works for *running* processes. The idea to face a thrashing system¹ and then be able to start a special, elevated, essential process™ is a complete fantasy.
You cannot constrain "future" processes as that's simply not a thing.
¹What you're gonna witness with your current setup is that the kernel, trying to get by with as little disk cache as possible, constantly drops file caches and then has to re-read the files from disk. The BY FAR strongest defence against that (aside from more RAM) is a decently sized swap file/partition, as it allows the kernel to swap out inactive anon pages instead.
If your task starts thrashing despite a reasonably sized *REAL* swap (not fucking zram!), the *only* explanation for that is that the immediate and active memory demand exceeds your physical RAM, and the *only* way to fix that is with more RAM.
Or apparently use windows…
Offline
FWIW, something which might help you: Plasma preloads a stripped-down version of ksysguard to help in situations like these, if you invoke it via its shortcut (Ctrl+Esc by default). Maybe this already helps for your perceived use cases.
Offline