#1 2008-05-05 10:02:03

ora
Member
Registered: 2007-06-20
Posts: 26

speedup pacman's pkg fetch and other thoughts

There's an interesting feature in Debian lenny's aptitude: it fetches debs from different servers in parallel.
I think this could greatly speed up fetching, and it does no harm to the servers, since it is still one connection per server.

Changing the package compression algorithm to lzma could bring both size and speed benefits.
http://tukaani.org/lzma/benchmarks

A tool to convert LSB's "restricted RPM" to pkg.tar.gz would help make Arch LSB-compliant.

No offence, just some suggestions. Arch rocks.
I'm also interested in the devs' attitude towards GUIs. There are rumours that the Arch devs refuse to accept a GUI in order to keep things KISS.

Offline

#2 2008-05-05 13:37:10

bender02
Member
From: Germany
Registered: 2007-02-04
Posts: 1,328

Re: speedup pacman's pkg fetch and other thoughts

http://wiki.archlinux.org/index.php/Imp … erformance the section about aria2

lzma has been discussed on ML (EDIT: fixed link) http://www.archlinux.org/pipermail/pacm … 11497.html


that tool - why don't you code it yourself?

whether or not "devs refuse gui" has nothing to do with the fact that some people actually code a gui for pacman - which you can use if you want to - have a look at http://shaman.iskrembilen.com/site/

Last edited by bender02 (2008-05-05 13:41:23)

Offline

#3 2008-05-06 02:03:37

hacosta
Member
From: Mexico
Registered: 2006-10-22
Posts: 422

Re: speedup pacman's pkg fetch and other thoughts

it sounds good but there might be problems when the mirrors aren't synced (e.g. mirror 1 has kernel 2.6.24 while mirror 2 has a module compiled for 2.6.25). a workaround for this situation will add complexity imho, and downloading from just one mirror was, in my opinion, one of the nice side effects of the move from /etc/pacman.d/{testing, extra, foo} to /etc/pacman.d/mirrorlist.

Offline

#4 2008-05-13 13:32:15

ngaba
Pacman Developer
Registered: 2008-05-13
Posts: 16

Re: speedup pacman's pkg fetch and other thoughts

"devs refuse gui"

No, they don't. They split pacman2 into a back-end (library) and a front-end to ease GUI development. They don't develop an alternative front-end themselves, but this is not "forbidden". Package maintainers and the Arch leadership decide about packages, so they, not the pacman devs, decide about the official "front-ends" of ArchLinux.

A pacman "contributor"

Offline

#5 2008-05-13 13:52:50

skymt
Member
Registered: 2006-11-27
Posts: 443

Re: speedup pacman's pkg fetch and other thoughts

The syncing problem is easily resolved. As the version of a package is in the filename, simply check for the existence of the file on each mirror in turn until the connection limit is reached. I've looked at implementing native multi-source downloads in libdownload, but that's currently on the back burner.
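
A minimal shell sketch of that check, under stated assumptions: pick_mirrors and mirror_has_file are hypothetical names (not part of pacman or libdownload), and probing with an HTTP HEAD request through curl is just one possible way to test whether a mirror has the file.

```shell
#!/bin/sh
# Hypothetical helper (not a pacman API): succeeds if mirror $1 serves file $2.
# Assumption: an HTTP HEAD request is enough to tell whether the file exists.
mirror_has_file() {
    curl --silent --fail --head --output /dev/null "$1/$2"
}

# Print up to $1 mirrors that carry package file $2.
# The remaining arguments are mirror base URLs, tried in order.
pick_mirrors() (
    limit=$1
    file=$2
    shift 2
    found=0
    for mirror in "$@"; do
        # Stop probing once the connection limit is reached
        [ "$found" -ge "$limit" ] && break
        if mirror_has_file "$mirror" "$file"; then
            printf '%s\n' "$mirror"
            found=$((found + 1))
        fi
    done
)
```

Called as pick_mirrors 3 FILE MIRROR..., it would print at most three mirrors that already carry that exact file; mirrors that are out of sync are simply skipped.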

As for LZMA, it would ideally be implemented in libarchive. You can contact its maintainer about that. One significant roadblock would be the LGPL license of the LZMA reference implementation. libarchive is under the BSD license and is intended for eventual inclusion in FreeBSD, some of whose developers are notoriously anti-GPL.

Offline

#6 2008-05-13 14:01:25

ngaba
Pacman Developer
Registered: 2008-05-13
Posts: 16

Re: speedup pacman's pkg fetch and other thoughts

"As the version of a package is in the filename...".
Slight correction: the filename is determined from the %FILENAME% field of the syncdb. It could even be foo.arj (not common, however ;-).
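
To illustrate, a small sketch that reads the filename from a sync-db entry instead of deriving it from name and version. It assumes the entry's desc text stores each value on the line right after its %FIELD% header; desc_filename is a hypothetical helper name, not a pacman tool.

```shell
#!/bin/sh
# Print the value of the %FILENAME% field from a sync-db "desc" entry on stdin.
# Assumption: the value is the line immediately following the header line.
desc_filename() {
    awk '$0 == "%FILENAME%" {
        # Read the next line into "value"; print it only if one exists
        if ((getline value) > 0) print value
        exit
    }'
}
```

A downloader would then build the URL from the mirror base plus whatever desc_filename prints, whatever that name looks like.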

Offline

#7 2008-05-14 13:16:51

schivmeister
Developer/TU
From: Singapore
Registered: 2007-05-17
Posts: 960
Website

Re: speedup pacman's pkg fetch and other thoughts

Wait, don't you mean parallel package fetch rather than parallel mirror fetch when you say pkg fetch? Meaning, fetch multiple packages at once, i.e options=('parallel-fetch') lol


I need real, proper pen and paper for this.

Offline

#8 2008-05-15 02:41:00

ora
Member
Registered: 2007-06-20
Posts: 26

Re: speedup pacman's pkg fetch and other thoughts

I mean fetching multiple packages at once.
At first I thought it wouldn't be hard. Then I found it is complex.
(1). When doing pacman -Sy, it must sync against all mirrors and pick the latest database to stay consistent.
(2). Give each mirror a download queue. That raises a problem: how to dispatch the packages to the queues, and, when the user hits ctrl-c, what to do with the canceled package: move it to another mirror's queue or simply append it to the tail of the current queue.
(3). There are sync/mutex problems if I want to maximize download speed by moving packages between queues (e.g. when a queue finishes downloading, hand it packages from other unfinished queues).
It's hard to write a wrapper that implements (2); simply modifying libdownload is not enough, alpm would have to be made multi-threaded.
To me alpm is a monster. Is there any documentation on it?
At first glance it is asynchronous and transaction-based. The design is great but the interface looks a bit weird.
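
The simplest dispatch policy for (2) can be sketched in shell: assign packages to the mirror queues in turn, once, before any download starts. This is only an illustration; dispatch_round_robin is a made-up name, and a fixed up-front assignment deliberately avoids the rebalancing (and locking) described in (3).

```shell
#!/bin/sh
# Assign package file names (stdin, one per line) to $1 mirror queues in turn.
# Output: "<queue index> <file>" per line, with queue indices starting at 0.
dispatch_round_robin() {
    awk -v n="$1" '{ q = (NR - 1) % n; print q, $0 }'
}
```

Each queue can then be downloaded by one process per mirror, which keeps the "one connection per server" property; a canceled package could be fed through the same function again.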

I think the performance-critical part is searching packages by name/regex. alpm is not KISS.
Rewriting it in Python plus some C modules would benefit many people.
Gentoo's emerge and Fedora's anaconda and yum are really easy to understand, extend with new features, and hack into what you like.

Last edited by ora (2008-05-15 02:48:43)

Offline

#9 2008-05-15 06:01:24

shining
Pacman Developer
Registered: 2006-05-10
Posts: 2,043

Re: speedup pacman's pkg fetch and other thoughts

ora wrote:

I mean fetching multiple packages at once.
At first I thought it wouldn't be hard. Then I found it is complex.
(1). When doing pacman -Sy, it must sync against all mirrors and pick the latest database to stay consistent.
(2). Give each mirror a download queue. That raises a problem: how to dispatch the packages to the queues, and, when the user hits ctrl-c, what to do with the canceled package: move it to another mirror's queue or simply append it to the tail of the current queue.
(3). There are sync/mutex problems if I want to maximize download speed by moving packages between queues (e.g. when a queue finishes downloading, hand it packages from other unfinished queues).
It's hard to write a wrapper that implements (2); simply modifying libdownload is not enough, alpm would have to be made multi-threaded.
To me alpm is a monster. Is there any documentation on it?
At first glance it is asynchronous and transaction-based. The design is great but the interface looks a bit weird.

I think the performance-critical part is searching packages by name/regex. alpm is not KISS.

alpm is not KISS so you want to add multi-thread support in both libdownload and libalpm to make it better?

Rewriting it in Python plus some C modules would benefit many people.

I can't wait to see your code.

Gentoo's emerge and Fedora's anaconda and yum are really easy to understand, extend with new features, and hack into what you like.

Yet, they all suck. Why are people here using pacman / makepkg and not one of these?


pacman roulette : pacman -S $(pacman -Slq | LANG=C sort -R | head -n $((RANDOM % 10)))

Offline

#10 2008-05-15 11:11:05

ora
Member
Registered: 2007-06-20
Posts: 26

Re: speedup pacman's pkg fetch and other thoughts

shining wrote:

Gentoo's emerge and Fedora's anaconda and yum are really easy to understand, extend with new features, and hack into what you like.

Yet, they all suck. Why are people here using pacman / makepkg and not one of these?

For users, emerge etc. all suck, but for developers the code is well written.

I have written some initial Python code. I'm trying to integrate abs/aur into the Python interface.

Offline

#11 2008-05-15 17:23:03

phrakture
Arch Overlord
From: behind you
Registered: 2003-10-29
Posts: 7,879
Website

Re: speedup pacman's pkg fetch and other thoughts

Ah the "multi threaded" suggestions. Even right now, as we speak, the people in the know are saying that multi-threading was a mistake. We don't need it or want it. It doesn't scale, and it certainly is much slower when half your program is IO wait. Take a look at static file serving stats for thttpd - it is a single threaded app and outperforms most other web servers by leaps and bounds.

Offline

#12 2008-05-15 17:49:05

toofishes
Developer
From: Chicago, IL
Registered: 2006-06-06
Posts: 602
Website

Re: speedup pacman's pkg fetch and other thoughts

There are mirrors out there that max out any fat connection I've ever used. I'm not sure why this always seems to come up. Find a mirror that actually utilizes your bandwidth.

Offline

#13 2011-03-15 10:39:17

mallrat
Member
From: Russia, Voronezh
Registered: 2011-03-15
Posts: 3

Re: speedup pacman's pkg fetch and other thoughts

I'm using the following script to download packages before I start pacman -Syu. This gets up to 4 packages in parallel:

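#!/bin/bash
# Prefetch the packages for a full upgrade into pacman's cache, several at a time.
# How many packages in parallel
PP=4
# Bail out rather than download into the wrong directory
pushd /var/cache/pacman/pkg || exit 1
# "pacman -Syup" refreshes the databases and prints the download URLs
for i in $(pacman -Syup | grep "^http")
do
    # Wait for a free slot; -ge keeps at most $PP wget processes running.
    # Note: this counts every wget on the system, not only the ones started here.
    while [ "$(pgrep wget | wc -l)" -ge "$PP" ]
    do
        echo "Wget processes: $(pgrep wget | wc -l). Sleeping for 1 sec" >> /var/log/pac_get.log
        sleep 1
    done
    echo "Downloading: $i" >> /var/log/pac_get.log
    wget --passive-ftp -c "$i" >>/var/log/pac_get.wget 2>&1 &
done
# Let the remaining downloads finish before leaving the cache directory
wait
popd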

It's not very good because it doesn't use the %o parameter of pacman's download options, but it usually works.
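
For reference, the %o parameter belongs to the XferCommand option in /etc/pacman.conf. A sketch of such a setting, modelled on the wget example in the pacman.conf documentation (the wget path is an assumption about the local system):

```ini
[options]
# %u is replaced with the download URL,
# %o with the local file name plus a ".part" extension (so wget -c can resume)
XferCommand = /usr/bin/wget --passive-ftp -c -O %o %u
```

With this set, pacman itself hands each download to wget, one file at a time, so it does not replace the parallel prefetch above.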

Last edited by mallrat (2011-03-15 10:44:15)

Offline

#14 2011-03-15 12:35:12

ChoK
Member
From: France
Registered: 2008-10-01
Posts: 340

Re: speedup pacman's pkg fetch and other thoughts

https://wiki.archlinux.org/index.php/Powerpill

edit: please don't necro thread

Last edited by ChoK (2011-03-15 12:36:07)


Ah, good taste! What a dreadful thing! Taste is the enemy of creativeness.
Picasso
Perfection is reached, not when there is no longer anything to add, but when there is no longer anything to take away.
Saint Exupéry

Offline

#15 2011-03-15 12:53:04

mallrat
Member
From: Russia, Voronezh
Registered: 2011-03-15
Posts: 3

Re: speedup pacman's pkg fetch and other thoughts

Thank you, powerpill seems a great solution.

P.S. I'm sorry for necro-threading, but I found no reason to create a new thread.

Last edited by mallrat (2011-03-15 12:53:18)

Offline

#16 2011-03-16 01:12:59

ngoonee
Forum Fellow
From: Between Thailand and Singapore
Registered: 2009-03-17
Posts: 6,717

Re: speedup pacman's pkg fetch and other thoughts

mallrat wrote:

Thank you, powerpill seems a great solution.

P.S. I'm sorry for necro-threading, but I found no reason to create a new thread.

There's conversely no reason to necro-bump, since tools to do exactly that are now available (and widely used), which was not the case back in 2008.

Closing.


Allan-Volunteer on the (topic being discussed) mailing lists. You never get the people who matters attention on the forums.
jasonwryan-Installing Arch is a measure of your literacy. Maintaining Arch is a measure of your diligence. Contributing to Arch is a measure of your competence.
Griemak-Bleeding edge, not bleeding flat. Edge denotes falls will occur from time to time. Bring your own parachute.

Offline
