
#1 2020-05-22 09:51:39

kokoko3k
Member
Registered: 2008-11-14
Posts: 2,390

Keep pagecache (file buffers) across reboots.

There are several preload daemons that more or less do the same thing:
* see what files are being opened, then read them on the next boot to warm the cache.

Since the Linux page cache keeps blocks (not files) in cache, I think a more direct, efficient and effective approach could be:
1) boot the system
1.1) Use it for a while doing common things
2) read what blocks are cached.
3) read those blocks on subsequent reboots.
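
A rough sketch of step 3, assuming step 2 has somehow produced a list of (offset, length) byte ranges for a device or file. The function name `warm_ranges` and the range format are made up for illustration; nothing like this exists as a ready-made tool as far as this thread knows.

```python
import os

def warm_ranges(path, ranges, chunk=1 << 20):
    """Read each (offset, length) byte range of `path` and discard the data.

    Reading through the page cache is what warms it; the bytes themselves
    are thrown away. Returns the number of bytes actually read.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # Sorting by offset keeps the reads moving in one direction,
        # which should limit seeking on a spinning disk.
        for offset, length in sorted(ranges):
            end = offset + length
            while offset < end:
                data = os.pread(fd, min(chunk, end - offset), offset)
                if not data:  # reached the end of the file or device
                    break
                offset += len(data)
                total += len(data)
    finally:
        os.close(fd)
    return total
```

Called as `warm_ranges("/dev/sda", saved_ranges)` it would need read permission on the device; on a regular file it works unprivileged.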

Point #2 is my issue... is there a tool that would tell me that (example):
cached bytes on /dev/sda:
bytes from 1 to 10.000
[..]
bytes from 50.000 to 300.000
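
To make the wanted output concrete, here is a small sketch that turns a per-page residency vector into a report like the one above. It assumes the flags come from something mincore(2)-like (one entry per page, truthy meaning "resident") and a 4096-byte page size; getting such a vector for a whole block device is exactly the missing piece. `resident_ranges` and `format_report` are hypothetical names.

```python
def resident_ranges(flags, page_size=4096):
    """Coalesce per-page residency flags into (start, end) byte ranges.

    `end` is exclusive. `flags` is any sequence with one truthy/falsy
    entry per page, in page order.
    """
    ranges = []
    start = None
    for i, resident in enumerate(flags):
        if resident and start is None:
            start = i * page_size           # a cached run begins here
        elif not resident and start is not None:
            ranges.append((start, i * page_size))  # the run just ended
            start = None
    if start is not None:                   # run extends to the last page
        ranges.append((start, len(flags) * page_size))
    return ranges

def format_report(device, ranges):
    """Render the ranges in the style of the example above (inclusive ends)."""
    lines = [f"cached bytes on {device}:"]
    lines += [f"bytes from {a} to {b - 1}" for a, b in ranges]
    return "\n".join(lines)
```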

I searched around but found nothing.

Last edited by kokoko3k (2020-05-22 21:07:24)


Help me to improve ssh-rdp !
Retroarch User? Try my koko-aio shader !


#2 2020-05-22 14:27:37

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 29,449
Website

Re: Keep pagecache (file buffers) across reboots.

I can't help with identifying what blocks are cached - but do you see any actual benefit in preload tools?  Do you have an old or slow harddrive?

The only preload tool I ever experimented with did more than what you describe.  It relocated the files read at boot to a more optimal place on the (spinning platter) disk to make the boot-time read faster.  While this could theoretically have provided a benefit, I'm not sure if any non-trivial effect was achieved.  But that was on an HDD - this relocation is irrelevant on SSDs.

Without relocation, I can't imagine even a theoretical benefit of preloading.  If files X, Y, and Z get read at boot time, and it takes A, B, and C milliseconds to read each one respectively, it will take A+B+C ms of disk read time during boot.  If you use a preloader, it may make those files get read before they are needed, but it can't possibly speed up the read time (without relocating data on a spinning disk), so the read time is still A+B+C ms.  So even theoretically, I see no possible benefit, but I do see a possible cost: as those files are read before they are actually needed, the processing time to read them may be delaying something else that could have started earlier.

All this logic applies just the same to preloading blocks as well as it does to files.  With relocation of blocks on a slow platter disk, there could be theoretical benefit, but still questionable on whether it'd be a real benefit.  Without relocation or on a SSD, I don't see how there'd be any possible use of preloading file data.


"UNIX is simple and coherent..." - Dennis Ritchie, "GNU's Not UNIX" -  Richard Stallman


#3 2020-05-22 15:31:19

Ropid
Member
Registered: 2015-03-09
Posts: 1,069

Re: Keep pagecache (file buffers) across reboots.

I got the impression that there's no way to do point #2 on a normal kernel.

There's a tool "ureadahead" for Ubuntu that traces what's being read. A short while after boot is finished, it writes a file that it then uses on the next boot for preloading stuff. I mention this "ureadahead" thingy because it needs a kernel patch to work. It can't work on a normal kernel. If something simpler like that tool already needs a patch to work, then I bet point #2 can't be done with a normal kernel.


#4 2020-05-22 15:34:39

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 29,449
Website

Re: Keep pagecache (file buffers) across reboots.

Ropid, there are ample similar tools that do not need a kernel patch.


"UNIX is simple and coherent..." - Dennis Ritchie, "GNU's Not UNIX" -  Richard Stallman


#5 2020-05-22 16:00:01

kokoko3k
Member
Registered: 2008-11-14
Posts: 2,390

Re: Keep pagecache (file buffers) across reboots.

Trilby wrote:

I can't help with identifying what blocks are cached - but do you see any actual benefit in preload tools?  Do you have an old or slow harddrive?

Yep, it is an old laptop with a 5400rpm disk.

Trilby wrote:

Without relocation, I can't imagine even a theoretical benefit of preloading.  If files X, Y, and Z get read at boot time, and it takes A, B, and C milliseconds to read each one respectively, it will take A+B+C ms of disk read time during boot.  If you use a preloader, it may make those files get read before they are needed, but it can't possibly speed up the read time (without relocating data on a spinning disk), so the read time is still A+B+C ms.  So even theoretically, I see no possible benefit, but a possible cost as those files are read before they are actually needed, the processing time to read them may be delaying something else that could have started earlier.

After a cold boot, it can take as long as 30 seconds to start the browser.
I use a preload script I wrote some time ago, and it can cut that time in half.
I start it while the system is booting *with the lowest I/O and CPU priority* (ionice+nice). The idea is that it does its dirty job while I'm doing other things, without slowing down other processes. It works well and does not slow down the boot process.
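
For anyone wanting to try the ionice+nice idea, a minimal sketch follows. It is not the script mentioned above, just an illustration: it wraps an arbitrary command so that it runs in the idle I/O class and at the lowest CPU priority. The helper name and the file paths are placeholders.

```python
import shlex

def low_priority_command(command):
    """Prefix `command` (a list of arguments) with ionice and nice.

    ionice -c 3 selects the "idle" I/O scheduling class and
    nice -n 19 the lowest CPU priority.
    """
    return ["ionice", "-c", "3", "nice", "-n", "19", *command]

# Placeholder paths; pass the result to subprocess.Popen to actually run it.
cmd = low_priority_command(["cat", "/path/to/file1", "/path/to/file2"])
print(shlex.join(cmd))
```

Whether the idle I/O class has any effect depends on the I/O scheduler in use, so treat the result as best effort.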

Ropid:
There are tools that (maybe) prove it is possible, like vmtouch, but for some reason they only work on "regular" files.
vmtouch can tell you which parts of a given file are paged into RAM, but it is not verbose enough for this purpose, and even if it were, it fails if you pass a block device to it.

Last edited by kokoko3k (2020-05-22 16:20:42)



#6 2020-05-22 16:07:13

Ropid
Member
Registered: 2015-03-09
Posts: 1,069

Re: Keep pagecache (file buffers) across reboots.

My thinking was, if they felt the need to add a patch to the kernel just for tracing file open, then there's probably no way to do point #2. Because if getting a list of the blocks in the page cache would be possible, why wouldn't they use that? Then again, maybe they still want to have their own tracing because they want to know the order in which files are read?


#7 2020-05-22 16:15:34

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 29,449
Website

Re: Keep pagecache (file buffers) across reboots.

kokoko3k wrote:

After a cold boot, it can take as long as 30 seconds to start the browser.
I use a preload script I wrote some time ago, and it can cut that time in half.

Ah.  I was considering more basic system processes.  It's not *actually* faster with the browser: the first load of the browser would always be slow, and all subsequent loads would be faster.  It'd still take ~30 seconds to read everything for the browser during boot - it's just done for you before you start the browser.

You could get the same effect by simply starting the browser at login (e.g., from xinitrc), particularly if your WM had a way for it to start hidden or off-screen.  But then this is quite tangential to your goals, and for those goals, a readahead would be just as sound as autostarting the browser.

(edit: in case anyone wants to tinker with this, it'd be much simpler.  Identify the files (e.g., profiles, etc) that would be needed by the browser, and simply `cat file1 file2 dir/* >/dev/null` in xinitrc).
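
The same `cat ... >/dev/null` idea as a small Python sketch, in case it needs to grow into a script. The function name `preload` and the glob patterns are illustrative; which files a given browser actually needs is not something this sketch knows.

```python
import glob

def preload(patterns, chunk=1 << 20):
    """Read every file matching the glob patterns and discard the data.

    Returns the number of bytes read, which is handy for checking that
    the patterns matched what was expected.
    """
    total = 0
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            try:
                with open(path, "rb") as f:
                    while True:
                        data = f.read(chunk)
                        if not data:
                            break
                        total += len(data)
            except OSError:
                # Directories and unreadable files are skipped silently.
                continue
    return total
```

For example, `preload(["file1", "file2", "dir/*"])` mirrors the shell line above.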

Last edited by Trilby (2020-05-22 16:21:03)


"UNIX is simple and coherent..." - Dennis Ritchie, "GNU's Not UNIX" -  Richard Stallman


#8 2020-05-22 16:17:53

kokoko3k
Member
Registered: 2008-11-14
Posts: 2,390

Re: Keep pagecache (file buffers) across reboots.

Ropid wrote:

My thinking was, if they felt the need to add a patch to the kernel just for tracing file open, then there's probably no way to do point #2. Because if getting a list of the blocks in the page cache would be possible, why wouldn't they use that? Then again, maybe they still want to have their own tracing because they want to know the order in which files are read?

I don't know why other solutions are file based. (Running at kernel level is probably just for better performance; you can trace opened files even with userspace tools such as strace, but that is really, really slow.)
btw, the order in which "files" are read matches the order in which "blocks" are read.
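
As an illustration of the strace route, this sketch extracts the successfully opened paths, in first-open order, from strace text output. It assumes lines in the usual form `openat(AT_FDCWD, "/path", FLAGS) = 3`, as produced by something like `strace -f -e trace=open,openat`; the function name is made up.

```python
import re

# Matches open()/openat() calls and captures the quoted path and the
# numeric return value. Paths are kept exactly as strace prints them
# (escape sequences are not decoded).
OPEN_RE = re.compile(
    r'\bopen(?:at)?\((?:[^,"]+, )?"((?:[^"\\]|\\.)*)".*\)\s+=\s+(-?\d+)'
)

def opened_files(strace_lines):
    """Return the paths that were opened successfully, without duplicates."""
    seen = set()
    ordered = []
    for line in strace_lines:
        m = OPEN_RE.search(line)
        if not m:
            continue
        path, ret = m.group(1), int(m.group(2))
        if ret >= 0 and path not in seen:  # negative return means the open failed
            seen.add(path)
            ordered.append(path)
    return ordered
```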

But the kernel knows what blocks are cached:
you can try reading /dev/sdc via dd multiple times, and you'll see that subsequent reads are almost instant.

One could write the lamest script by iterating dd over small blocks: if a read is faster than the drive speed, then those blocks are in the page cache :-D
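
A sketch of that timing trick, in Python rather than a dd loop. The 64 KiB chunk size and the 100 MB/s "drive speed" threshold are guesses, not measured values. Note the obvious flaw: every probe read pulls its chunk into the page cache, and kernel readahead may pull in the following ones too, so the probe disturbs what it measures.

```python
import os
import time

def looks_cached(nbytes, seconds, drive_bytes_per_s):
    """A read that beat the drive's throughput most likely came from RAM."""
    if seconds <= 0:
        return True
    return nbytes / seconds > drive_bytes_per_s

def probe(path, chunk=64 * 1024, drive_bytes_per_s=100e6):
    """Time sequential chunk reads; return (offset, length, cached?) tuples."""
    results = []
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while True:
            t0 = time.perf_counter()
            data = os.pread(fd, chunk, offset)
            dt = time.perf_counter() - t0
            if not data:  # end of file or device
                break
            cached = looks_cached(len(data), dt, drive_bytes_per_s)
            results.append((offset, len(data), cached))
            offset += len(data)
    finally:
        os.close(fd)
    return results
```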

Last edited by kokoko3k (2020-05-22 21:00:22)



#9 2020-05-22 16:31:47

kokoko3k
Member
Registered: 2008-11-14
Posts: 2,390

Re: Keep pagecache (file buffers) across reboots.

Trilby wrote:

You could get the same effect by simply starting the browser at login (e.g., from xinitrc), particularly if your WM had a way for it to start hidden or off-screen.  But then this is quite tangential to your goals, and for those goals, a readahead would be just as sound as autostarting the browser.

To be honest, I never considered starting the whole program! :-O
It would work for sure.

Last edited by kokoko3k (2020-05-22 16:32:53)



#10 2020-05-22 17:33:58

Ropid
Member
Registered: 2015-03-09
Posts: 1,069

Re: Keep pagecache (file buffers) across reboots.

The vmtouch man-page mentioned a "mincore" system call. I tried looking around on github for "mincore", and I think there's one person here that tried to do this cache content list plus later importing:

https://github.com/radii/mincore

The scripts there don't seem to work for me. I fixed up that "get-workingset.sh" one to make it use "./mincore" and "./blockno" names and then started it but it seems to be stuck.


#11 2020-05-22 18:02:05

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 29,449
Website

Re: Keep pagecache (file buffers) across reboots.

kokoko3k, while it may be blocks that are cached, not files, I'm not sure the distinction is useful.  The only way it'd be useful is if a process that was going to read a file had a predefined / static offset (and a notably large one at that) that it'd seek to before starting to read the file.  Not many processes do this - and I highly doubt a browser (or its major dependencies) does.  They read the file from the start.  So even if the really needed information were only at the end of the file, if the blocks for the start of the file were not already/still cached, they'd be read just the same.

Last edited by Trilby (2020-05-22 18:02:38)


"UNIX is simple and coherent..." - Dennis Ritchie, "GNU's Not UNIX" -  Richard Stallman


#12 2020-05-22 21:18:56

kokoko3k
Member
Registered: 2008-11-14
Posts: 2,390

Re: Keep pagecache (file buffers) across reboots.

Not just that,
tracing what files are being opened costs human time (you need to run strace or similar tools while the program is starting) or needs kernel patches, which almost certainly carry a performance hit anyway.
Since the kernel has the info I need at any given moment, having a way to dump that data would make it possible to virtually "snapshot" the page cache and restore it after a reboot with minimal seeks.

Yeah, I could spend 60 bucks and get a decent ssd, but then all the fun would be lost forever.

Ropid:
Thanks, I'll check it out!

Last edited by kokoko3k (2020-05-22 21:20:12)



#13 2020-05-23 06:16:57

kokoko3k
Member
Registered: 2008-11-14
Posts: 2,390

Re: Keep pagecache (file buffers) across reboots.

Ropid, thanks. Unfortunately, mincore is process oriented, and its sibling, fincore, is file oriented. Like vmtouch, it fails on block devices, and its output is not verbose enough anyway.

I found:
echo 1 > /proc/sys/vm/block_dump

While not exactly what I was looking for, it causes the kernel to log block reads and writes as they occur.
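
A sketch of how that log could be turned into per-device byte ranges. It rests on two assumptions worth verifying on your own kernel: that the logged lines look like `cat(1234): READ block 2048 on sda1 (8 sectors)`, and that both the block number and the count are in 512-byte sectors. `read_ranges` is a made-up name.

```python
import re

# Assumed line format: "<comm>(<pid>): READ block <n> on <dev> (<k> sectors)"
DUMP_RE = re.compile(r'(READ|WRITE) block (\d+) on (\S+) \((\d+) sectors\)')
SECTOR = 512  # assumed unit for both the block number and the count

def read_ranges(log_lines):
    """Collect (offset, length) byte ranges of logged READs, per device."""
    ranges = {}
    for line in log_lines:
        m = DUMP_RE.search(line)
        if m and m.group(1) == "READ":
            dev = m.group(3)
            offset = int(m.group(2)) * SECTOR
            length = int(m.group(4)) * SECTOR
            ranges.setdefault(dev, []).append((offset, length))
    return ranges
```

Feeding it the output of `dmesg` after a typical session would give a list of what was read from disk, in read order, which is close to (though not the same as) what is still cached.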

Last edited by kokoko3k (2020-05-23 06:17:34)


