#1 2023-07-02 11:51:22

gen2arch
Member
Registered: 2013-05-16
Posts: 182

Install on system with very large amount of RAM (ramdisks, tmpfs?)

Hi,

we have a 32-core machine with 512GB of ECC RAM at our uni department. This machine is also meant to be used as a "normal" desktop workstation. Typical use cases involve lots of massively parallel I/O: running tesseract OCR, then ripgrep and similar programs, on very large amounts of (mainly textual) data.

I was wondering: what would be the best use of this amount of RAM in an Arch install with regard to optimizing the desktop experience / speed?

I was thinking of ramdisks/tmpfs: what could typically be put on a ramdisk to increase speed? Is there a standardized install that would put the entire $HOME directory on a ramdisk? Is there a protocol for data safety in this case?

What suffers typically the most from hard drive speed bottlenecks?

Thank you for any ideas

gen2arch

#2 2023-07-02 12:48:13

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 29,564

Re: Install on system with very large amount of RAM (ramdisks, tmpfs?)

I doubt any such efforts are worthwhile.  Everything relevant is loaded into memory when it is used, and with that much free RAM remaining the kernel itself will keep those pages in its cache, so they should never be removed from memory.

So preloading *everything* into memory when you are not actually going to use *everything* every time you use the system will actually result in an overall slow-down.  You could argue that it would shift all this loading into RAM to boot time rather than to the first use of a bit of code or data, but I highly doubt this would have any significant effect.

gen2arch wrote:

What suffers typically the most from hard drive speed bottlenecks?

You have atypical hardware and an atypical usage pattern - so why would any "typical" bottleneck be relevant?  Test under your own usage, then you'll know.

But again, even without any strategic approach, only the first read of any code or data from the disk would be "slow" - from then on it would remain cached indefinitely.  This has the added benefit that you don't have to re-invent the OS mechanisms that maintain what you refer to as "data safety" (which I'm assuming refers to keeping the in-RAM data synchronized with what's stored on disk).
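To watch the cache fill up as data is read, the "Cached:" line of /proc/meminfo can be checked before and after a job. The following is only a rough sketch; the function name `page_cache_kib` is made up for illustration, and the optional path argument exists only so the function can be pointed at a sample file.

```shell
# Print the size of the page cache in KiB, taken from the "Cached:" line
# of a meminfo-style file (defaults to /proc/meminfo).
# The function name is hypothetical, not an existing tool.
page_cache_kib() {
    awk '$1 == "Cached:" { print $2 }' "${1:-/proc/meminfo}"
}
```

Running `page_cache_kib` before and after a large ripgrep run should show the number grow by roughly the amount of data that was read, provided nothing else on the machine is doing heavy I/O at the same time.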

If you really do want to preload some data, a simple `cat /path/to/data/* > /dev/null` should do.  Of course you can test this by benchmarking read speed of data files on a fresh boot without using this cat command versus one that does use it.  However, a negative result wouldn't necessarily mean the "preloading" didn't take - it could mean it worked as intended, but as I initially suspected, the benefit could be too small to detect.
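A minimal sketch of such a benchmark, assuming GNU `date` (for `%N`) and `find`/`awk` as shipped on Arch. The function names `preload_dir` and `time_two_passes` are invented here, and `/path/to/data` remains a placeholder. The sketch times two passes over the same directory: on a fresh boot the first pass hits the disk, and the second should be served from the page cache.

```shell
# Read every regular file under a directory once and discard the contents.
# Prints the number of bytes read, as a sanity check that something was read.
preload_dir() {
    find "$1" -type f -exec cat {} + | wc -c
}

# Time two consecutive passes over the same directory.
# Both function names are hypothetical helpers for this example only.
time_two_passes() {
    for pass in first second; do
        start=$(date +%s.%N)   # %N (nanoseconds) is a GNU date extension
        preload_dir "$1" > /dev/null
        end=$(date +%s.%N)
        awk -v p="$pass" -v s="$start" -v e="$end" \
            'BEGIN { printf "%s pass: %.3f s\n", p, e - s }'
    done
}
```

For example, `time_two_passes /path/to/data` right after booting. To repeat a cold run without rebooting, root can drop the caches first with `sync; echo 3 > /proc/sys/vm/drop_caches`.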

Note that you might benefit from tuning some sysctl parameters for RAM use to ensure the kernel does retain cached code / data "forever".  However, I doubt such settings would have any effect if the system is truly never facing any "ram pressure".  And if RAM is starting to fill up, the kernel's default choices on which pages to drop will probably be as good - if not better - than your intuitive predictions.  So in other words, I'd wager your best performance results would come from doing nothing and staying out of the kernel's way.
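For reference, the two knobs usually mentioned in this context are `vm.vfs_cache_pressure` (how eagerly directory and inode caches are reclaimed) and `vm.swappiness` (how readily anonymous memory is swapped out rather than page cache being dropped). Picking these two is my assumption about what would be relevant here, not a recommendation. The sketch below only prints their current values; `show_vm_knobs` is a made-up name, and the directory argument is there so it can be tried against sample files.

```shell
# Print the current values of two VM sysctls by reading them from procfs.
# Optional argument: directory holding the files (defaults to /proc/sys/vm).
# The function name is hypothetical.
show_vm_knobs() {
    for knob in vfs_cache_pressure swappiness; do
        printf 'vm.%s = %s\n' "$knob" "$(cat "${1:-/proc/sys/vm}/$knob")"
    done
}
```

Changing a value for the running system is done as root with, for example, `sysctl -w vm.vfs_cache_pressure=50`; the value 50 is an arbitrary illustration, and as said above I would not expect a measurable difference on a machine that never comes under memory pressure.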

EDIT: one thing you may want to configure if you haven't yet is the size of the tmpfs for /tmp.  I believe the default is 1/2 of RAM, and I don't know that there is a built-in cap.  A 256G /tmp is probably of no use.  So specifying a more moderate size might be worthwhile.  Though this too may not really matter as the tmpfs "size" really just serves as a maximum; that amount of RAM isn't really claimed and remains available for other processes until /tmp content actually needs it.
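A small sketch for checking and changing that limit. The helper name `fs_limit_kib` is invented for this example and 16G is an arbitrary figure, not a suggestion for this particular workload.

```shell
# Print the total size, in KiB, of the filesystem that holds the given path.
# For a tmpfs this is the configured maximum, not memory actually in use.
# The function name is hypothetical.
fs_limit_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $2 }'
}

# Lower the limit on the running system (as root); 16G is only an example:
#   mount -o remount,size=16G /tmp
#
# To make it permanent, an /etc/fstab entry along these lines should work:
#   tmpfs  /tmp  tmpfs  rw,nosuid,nodev,size=16G,mode=1777  0  0
```

`fs_limit_kib /tmp` before and after the remount should show the new maximum.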

Last edited by Trilby (2023-07-02 13:37:07)


"UNIX is simple and coherent..." - Dennis Ritchie, "GNU's Not UNIX" -  Richard Stallman

#3 2023-07-15 16:21:32

gen2arch
Member
Registered: 2013-05-16
Posts: 182

Re: Install on system with very large amount of RAM (ramdisks, tmpfs?)

Although somewhat depressing, this is a very informative roundup! Thanks a lot.
I have to say I wasn't fully aware of the kernel's workings with regard to preloading/caching. It would indeed seem that doing nothing special is best here.

Thanks

gen2arch
