[Patch] Reduce syscalls

tindzk · 2010-07-25 18:16:51

Looking at the strace output, pacman seems to invoke lots of unnecessary syscalls.

This patch has reduced the number of needed syscalls on my system by 70%:
http://pastebin.archlinux.fr/406557

Reading the "desc" files is still inefficient:

open("/var/lib/pacman/sync/extra/firefox-i18n-3.6.8-1/desc", O_RDONLY|O_LARGEFILE) = 4
fstat64(4, {st_mode=S_IFREG|0644, st_size=321, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb770e000
read(4, "%FILENAME%\nfirefox-i18n-3.6.8-1-"..., 4096) = 321
read(4, "", 4096)                       = 0
close(4)                                = 0
munmap(0xb770e000, 4096)                = 0

This could be reduced to three syscalls:

open("/var/lib/pacman/sync/extra/firefox-i18n-3.6.8-1/desc", O_RDONLY) = 4
read(4, "%FILENAME%\nfirefox-i18n-3.6.8-1-"..., 4096) = 321
close(4)                                = 0

However, this would mean moving away from the glibc functions (fopen(), etc.) and using the syscalls directly. What do you think?
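To make the idea concrete, here is a minimal sketch of the three-syscall path in plain C. The helper name slurp_once is hypothetical and not part of pacman; it also assumes that a single read() returns the whole file, which is not guaranteed in general.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not part of pacman): read at most cap bytes of a
 * small file with exactly one open(), one read() and one close().
 * Returns the number of bytes read, or -1 on error. */
static ssize_t slurp_once(const char *path, char *buf, size_t cap)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* A single read(); it may return fewer bytes than the file holds. */
    ssize_t n = read(fd, buf, cap);

    close(fd);
    return n;
}
```

Compared with fopen()/fread()/fclose(), this skips the fstat64() call and the mmap2()/munmap() pair that glibc issues to set up its stream buffer, at the cost of handling buffering and errors by hand.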

The 4096 bytes should be large enough for all "desc" files. A simple heuristic could be included to check whether some bytes are missing: 1) all sections are covered (NAME, VERSION, DESC, URL, etc.) and 2) the buffer ends with \n\n.
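The two checks could look roughly like the sketch below. The tag spelling (%NAME% and so on) is an assumption based on the %FILENAME% line visible in the strace output, the list of required sections is only the subset named above, and desc_looks_complete is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed section tags, following the %FILENAME% style seen in the strace
 * output; the real set of mandatory sections may differ. */
static const char *const required_tags[] = {
    "%NAME%\n", "%VERSION%\n", "%DESC%\n", "%URL%\n",
};

/* Returns 1 if buf (len bytes, with a NUL terminator at buf[len]) passes
 * both checks, 0 otherwise. */
static int desc_looks_complete(const char *buf, size_t len)
{
    assert(buf != NULL && buf[len] == '\0');

    /* Check 2: the data must end with an empty line ("\n\n"). */
    if (len < 2 || memcmp(buf + len - 2, "\n\n", 2) != 0)
        return 0;

    /* Check 1: every required section tag must appear somewhere. */
    for (size_t i = 0; i < sizeof required_tags / sizeof required_tags[0]; i++) {
        if (strstr(buf, required_tags[i]) == NULL)
            return 0;
    }
    return 1;
}
```

The substring search is deliberately naive: a tag that happens to appear inside a field value would also count, so this is a plausibility check rather than a parser.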

When I flush the disk cache, a single "pacman -Syu" run takes 1 min 18 s! With the patch applied it takes 1 min 7 s, which is still very slow. I guess it would be a lot faster to read the .gz compressed databases into memory rather than using the uncompressed package tree.

Last edited by tindzk (2010-07-26 13:30:24)

falconindy · 2010-07-26 01:11:41

Interesting stuff. You'll want to subscribe and post this to pacman-dev@archlinux.org if you want this to get any real attention.

flamelab · 2010-07-26 01:29:26

File a bug report (feature request) on the bug tracker and, as falconindy suggested, propose it on the pacman-dev mailing list.

Allan · 2010-07-26 01:56:59

tindzk wrote:

I guess it would be a lot faster to read the .gz compressed databases into memory rather than using the uncompressed package tree.

Which is currently being worked on...

Still, send your patch to the pacman-dev list and it will get reviewed by the people involved.

tindzk · 2010-07-26 13:24:27

Alright. Thanks for the responses.

Rip-Rip · 2010-07-26 22:38:36

tindzk wrote:

The 4096 bytes should be large enough for all "desc" files. A simple heuristic could be included to check whether some bytes are missing: 1) all sections are covered (NAME, VERSION, DESC, URL, etc.) and 2) the buffer ends with \n\n.

Or you could simply compare the value returned by read with 4096...

int readed;
while ((readed = read(fd, buf, 4096)) == 4096)
        continue;

This is the best "heuristic" you could have dreamed of

diegonc · 2010-07-27 00:02:00

Rip-Rip wrote:

int readed =0;
while ((readed += read(fd, buf, 4096 -  readed )) == 4096)
        continue;
This is the best "heuristic" you could have dreamed of

Did you mean that instead?

EDIT: Hmm, maybe not.

Last edited by diegonc (2010-07-27 00:08:10)

tindzk · 2010-07-27 00:16:05

Rip-Rip wrote:

Or you could simply compare the value returned by read with 4096...
int readed;
while ((readed = read(fd, buf, 4096)) == 4096)
        continue;
This is the best "heuristic" you could have dreamed of

Yes, but that would also imply that each file contains 4096 (or more) bytes, which does not seem to be the case.

Another potential issue is that read() may return fewer than 4096 bytes even though the file contains 4096 bytes or more. I've never seen this behaviour with normal disk files, but it's quite common with TCP sockets. Still, I wouldn't want to rely on read() always returning as many bytes as are available, because the directory could be mounted over the network. In that case our assumption is no longer guaranteed.
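A short-read-safe loop in plain C might look like the sketch below. read_full is a hypothetical helper name; the loop retries on EINTR and stops at end-of-file or when the buffer is full.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read() until cap bytes are stored or
 * end-of-file is reached. Returns the total bytes read, or -1 on error. */
static ssize_t read_full(int fd, char *buf, size_t cap)
{
    assert(buf != NULL);

    size_t total = 0;
    while (total < cap) {
        ssize_t n = read(fd, buf + total, cap - total);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: retry */
            return -1;      /* real error */
        }
        if (n == 0)
            break;          /* end of file */
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

If the return value equals cap, the file may be larger than the buffer, so the caller would have to treat that case as "possibly truncated". The price is one extra read() per file to observe end-of-file, i.e. four syscalls instead of three.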

What I thought of in my initial post was something like this:

    size_t len;
    String s = StackString(4096);

    do {
        len = File_Read(&file,
            s.buf  + s.len,
            s.size - s.len);

        s.len += len;
    } while (len > 0 && s.len < s.size);

    if (!String_EndsWith(s, String("\n\n"))) {
        /* The file is either a) invalid or b) larger than
         * 4096 bytes (unlikely) and thus the final \n\n
         * is missing here.
         */
    }

Is it even true that a complete "desc" file always has to end with \n\n?

Last edited by tindzk (2010-08-01 00:27:44)
