#1 2008-06-16 14:08:26

hdoria
Forum Fellow
From: Brazil
Registered: 2007-06-28
Posts: 52

Pacman Options (total download, show size and use delta)

Hi,

pacman has some great non-default options. I like these ones:

UseDelta, TotalDownload, ShowSize


ShowSize
     Display the size of individual packages for --sync and --query modes.

UseDelta
     Download delta files instead of complete packages if possible. Requires the xdelta
     program to be installed.

TotalDownload
     When downloading, display the amount downloaded, download rate, ETA, and completed
     percentage of the entire download list rather than the percent of each individual
     download target. The progress bar is still based solely on the current file download.
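For reference, a minimal sketch of how these three options might be enabled, assuming the pacman version discussed in this thread and the usual /etc/pacman.conf location. Each option is a bare keyword in the [options] section; the comments are illustrative, not copied from the stock file.

```ini
# /etc/pacman.conf (fragment)
[options]
# Show per-package sizes in --sync and --query output
ShowSize
# Prefer delta files when a repository provides them (needs xdelta)
UseDelta
# Report progress against the whole download list
TotalDownload
```

Removing or commenting out a line restores the default behaviour for that option.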

TotalDownload and ShowSize are working OK, but I have some questions about the UseDelta option:

How does this work? Where can I find the delta files?

#2 2008-06-16 14:15:35

Allan
Pacman
From: Brisbane, AU
Registered: 2007-06-09
Posts: 11,478

The idea is to use binary diffs to patch the package if this results in a significantly smaller download.  I saw someone posted numbers recently showing it would work great with most pkgrel bumps and minor pkgver bumps.
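To make the "significantly smaller download" condition concrete, here is a minimal shell sketch. The helper name `prefer_delta`, the example sizes, and the 70% default cutoff are all assumptions for illustration; nothing in this thread says what rule pacman actually applies.

```shell
# Hypothetical helper: succeed when the delta is small enough to be worth it.
# Usage: prefer_delta DELTA_BYTES FULL_BYTES [MAX_PERCENT]
# MAX_PERCENT is an illustrative cutoff, not a value taken from pacman.
prefer_delta() {
    delta_size=$1
    full_size=$2
    max_percent=${3:-70}
    # Integer comparison: delta/full <= max_percent/100
    [ $(( delta_size * 100 )) -le $(( full_size * max_percent )) ]
}

# Example with made-up sizes: a 5 MB delta against a 150 MB package.
if prefer_delta 5000000 150000000; then
    echo "fetch delta"
else
    echo "fetch full package"
fi
```

With a cutoff like this, a delta that is nearly as large as the package itself is skipped in favour of the plain download.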

However, you can't find delta files for official Arch packages anywhere because this is not used yet.  The Arch infrastructure is not set up for it and I don't think it is high on anyone's priority list...

#3 2008-06-16 16:44:54

shining
Pacman Developer
Registered: 2006-05-10
Posts: 2,043

Allan wrote:

The idea is to use binary diffs to patch the package if this results in a significantly smaller download.  I saw someone posted numbers recently showing it would work great with most pkgrel bumps and minor pkgver bumps.

However, you can't find delta files for official Arch packages anywhere because this is not used yet.  The Arch infrastructure is not set up for it and I don't think it is high on anyone's priority list...

To make matters worse, this feature is currently broken in the pacman 3.2 development version, and it will probably still be broken when 3.2 is released.
But who cares, it is not used anyway ;)


pacman roulette : pacman -S $(pacman -Slq | LANG=C sort -R | head -n $((RANDOM % 10)))

#4 2008-06-16 17:18:55

u_no_hu
Member
Registered: 2008-06-15
Posts: 453

@shining
But wouldn't it be very beneficial for a rolling-release distro like Arch to have delta packages instead of regular packages when we are updating the whole system? It would help those who have low-speed connections...


Don't be a HELP VAMPIRE. Please search before you ask.

Subscribe to The Arch Daily News.

#5 2008-06-16 19:21:45

shining
Pacman Developer
Registered: 2006-05-10
Posts: 2,043

u_no_hu wrote:

@shining
But wouldn't it be very beneficial for a rolling-release distro like Arch to have delta packages instead of regular packages when we are updating the whole system? It would help those who have low-speed connections...

I didn't say that it wasn't useful, just that no one is interested in fixing it, and no one is interested in putting it in place.


#6 2008-08-17 19:17:21

g2g591
Member
Registered: 2007-12-24
Posts: 54

Sorry for the bump, but I'd really love to see UseDelta implemented. It saves tons of download time (on Gentoo at least) and it would use less bandwidth. It would greatly benefit dial-up users and people with slow DSL (like me, I get 13Kb/s ...).

#7 2008-08-17 19:47:25

shining
Pacman Developer
Registered: 2006-05-10
Posts: 2,043

g2g591 wrote:

Sorry for the bump, but I'd really love to see UseDelta implemented. It saves tons of download time (on Gentoo at least) and it would use less bandwidth. It would greatly benefit dial-up users and people with slow DSL (like me, I get 13Kb/s ...).

Help then :)


#8 2008-08-18 17:13:33

Daenyth
Forum Fellow
From: Boston, MA
Registered: 2008-02-24
Posts: 1,244

I seem to remember a discussion about it (from the server side perspective). The reason (IIRC) that it isn't being worked on is that the bandwidth savings were not enough to outweigh the additional CPU load of creating the xdeltas. I'm not entirely sure, but I do remember it being mentioned.

#9 2008-08-18 18:08:12

Garns
Member
Registered: 2008-05-28
Posts: 239

Daenyth wrote:

I seem to remember a discussion about it (from the server side perspective). The reason (IIRC) that it isn't being worked on is that the bandwidth savings were not enough to outweigh the additional CPU load of creating the xdeltas. I'm not entirely sure, but I do remember it being mentioned.

Afaik the delta creation is/was handled completely by makepkg and in most cases the CPU load should be small compared to compiling. The reasons this wasn't taken any further are more along the lines of: lack of a pacman contributor with much interest in deltas, lack of "someone" maintaining a private repo using deltas, etc.

On a sidenote: The way I see it, most of the people interested in delta updates don't have a broadband connection, which makes things such as maintaining a repo relatively difficult.

#10 2008-08-18 18:37:20

shining
Pacman Developer
Registered: 2006-05-10
Posts: 2,043

Garns wrote:

Afaik the delta creation is/was handled completely by makepkg and in most cases the CPU load should be small compared to compiling. The reasons this wasn't taken any further are more along the lines of: lack of a pacman contributor with much interest in deltas, lack of "someone" maintaining a private repo using deltas, etc.

There was some discussion that it could be better to also put it in repo-add. Or even only in repo-add, which would be simpler than having it in both.
But I am still not sure what is best...

Garns wrote:

On a sidenote: The way I see it, most of the people interested in delta updates don't have a broadband connection, which makes things such as maintaining a repo relatively difficult.

Indeed, we need altruistic people with broadband :)


#11 2008-08-18 19:46:58

Garns
Member
Registered: 2008-05-28
Posts: 239

shining wrote:

There was some discussion that it could be better to also put it in repo-add. Or even only in repo-add, which would be simpler than having it in both.
But I am still not sure what is best...

I just read up a bit on this. I actually missed the whole delta 2.0 story at first :o

shining wrote:

Indeed, we need altruistic people with broadband :)

As always. But first we need new delta generation in makepkg or repo-add, or in both... (the last one could be a bad idea)

#12 2008-08-20 22:56:48

shining
Pacman Developer
Registered: 2006-05-10
Posts: 2,043


All the information about this stalled delta makepkg / repo-add rework is here:
http://www.nabble.com/Add-delta-creatio … #a15513733

Any help would be really appreciated, because I feel bad about this and even regret that Dan and I tried to improve the pacman side but did not complete the work by working on the makepkg / repo-add side as well.


#13 2008-08-21 00:38:03

Allan
Pacman
From: Brisbane, AU
Registered: 2007-06-09
Posts: 11,478

FYI, this is #2 (out of 2) on my "Big Things I Want to Get Working in Arch" list, behind getting a testing repo for [community] packages.  But given I haven't started #1 yet, it may take me a while to get this working, so others should do it for me :D

#14 2008-08-29 22:54:11

g2g591
Member
Registered: 2007-12-24
Posts: 54

Well, I'm glad this thread didn't just go into the dustbin. Man, I'd love to have this feature, as I said earlier. Like now: KDE 4.1.1 went in, and I'll have to redownload at least 230M, which will take about 8-12 hours, for about 50M or less of diffs. So please get started on #1 (and what is #1, btw?) so you can get this implemented.

Last edited by g2g591 (2008-08-29 22:54:46)

#15 2008-10-13 19:19:00

Wintervenom
Member
Registered: 2008-08-20
Posts: 1,011

I, too, would be really happy if deltas were used.  I get very erratic speeds, anywhere from four kilobytes to (when I'm lucky) two or more megabytes a second, until the throttling kicks in.  I am stuck in the middle of a five-hundred-megabyte update at around thirty kilobytes a second.  :|

Last edited by Wintervenom (2009-08-05 14:04:48)

#16 2009-02-09 11:14:18

zatricky
Member
From: Stockholm
Registered: 2008-09-03
Posts: 56

Deltas can make a *huge* difference when releasing package upgrades. When implemented, if we want to be picky about which packages are delta'd, we should probably do the largest packages at the very least.

A good example is OpenOffice.org. The latest openoffice-base-3.0.1-1-x86_64.pkg.tar.gz is 150-odd MB. If there were a package upgrade, say to 3.0.1-2, the delta would probably be a few KB. An actual application upgrade, say to 3.0.2-1, might not show such a dramatic difference, but I'd bet it would still be very good, probably less than 5 MB.

Also, I hope the devs who have been involved with the delta feature so far have taken note of gzip's --rsyncable option, which makes deltas much more effective.

I manage backups for an ISP and we happen to still be using xdelta on some of our servers that run legacy "home-brew" backups. They're running fine and the daily "deltas" don't take up much space. We're not going to fix what ain't broke. ;)


pacman russian roulette: yes | pacman -Rcs $(pacman -Q | LANG=C sort -R | head -n $((RANDOM % 10)))
(yes, I know its broken)

#17 2009-02-09 11:40:29

shining
Pacman Developer
Registered: 2006-05-10
Posts: 2,043

Delta support was originally a contribution of Nathan Jones, but unfortunately it had to be rewritten due to some design limitations :
http://projects.archlinux.org/?p=pacman … 8c3cf05797
And it was never finished due to lack of interest.

The only recent interest was from Garns who posted above in this thread, which led to this last discussion on the ML :
http://archive.netbsd.se/?ml=pacman-dev … &m=9005926

There are still very important unanswered questions, like where deltas should be created: makepkg, repo-add, or an external tool?
And then there are some issues like the "gzip -n" one.
There is no need to endlessly repeat how great delta is. What we need is people brainstorming on the implementation issues, and then some coding!


#18 2009-02-10 09:03:03

zatricky
Member
From: Stockholm
Registered: 2008-09-03
Posts: 56

According to http://code.google.com/p/xdelta/wiki/Ex … ompression :

Xdelta decompresses the input stream (target) using pipes to the external compression program; it decompresses the source file to a temporary file. There is a hard-coded maximum size of 256MB for external compression.

Recognition of externally-compressed inputs can be disabled by -D.

I was looking into the backup implementation that I mentioned earlier. In the oldest versions it used xdelta3 before compression; in others it's using gzip --rsyncable before running xdelta3. Most of the files or folders being backed up are a lot bigger than 256MB, so the -D option was probably never looked at or necessary to improve performance.

Would it be simpler to add --rsyncable to the gzip call in makepkg and then use the -D option when running xdelta3?

The downside to using --rsyncable is a slight increase in the size of the .gz files, of a few KB per MB. I'm not sure if bzip has a similar feature.
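A rough shell sketch of that proposal, assuming a gzip build that accepts --rsyncable and -k, and an installed xdelta3. The helper names (`run`, `make_rsyncable_delta`, `apply_delta`) and the DRY_RUN switch are made up for illustration and are not part of makepkg or pacman.

```shell
# run CMD...: print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# make_rsyncable_delta OLD.tar NEW.tar OUT  (hypothetical helper)
# Compresses both tarballs with --rsyncable, then diffs the .gz files
# directly; -D stops xdelta3 from decompressing them itself.
make_rsyncable_delta() {
    old_tar=$1
    new_tar=$2
    delta=$3
    run gzip --rsyncable -k -f "$old_tar"
    run gzip --rsyncable -k -f "$new_tar"
    run xdelta3 -D -e -f -s "$old_tar.gz" "$new_tar.gz" "$delta"
}

# apply_delta OLD.tar.gz DELTA NEW.tar.gz  (hypothetical helper)
# Rebuilds the new compressed file from the old one plus the delta.
apply_delta() {
    run xdelta3 -D -d -f -s "$1" "$2" "$3"
}

# Show the commands without running them:
DRY_RUN=1 make_rsyncable_delta old.tar new.tar old_to_new.xd3
```

Because the delta is taken between the compressed files, the client ends up with a byte-identical .gz and never has to recompress anything, which is the main attraction of this variant.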

Though I can follow the concepts in the mailing lists, I feel lost in the code. If I had the time to get into it, I'd gladly do it myself.


#19 2009-02-10 15:21:19

shining
Pacman Developer
Registered: 2006-05-10
Posts: 2,043


What is that --rsyncable option? It is not even an official flag, is it?
Could you do some tests showing the effect of using --rsyncable and xdelta3 -D on the size of the delta?
Isn't the resulting delta still much, much bigger?


#20 2009-02-10 16:31:03

zatricky
Member
From: Stockholm
Registered: 2008-09-03
Posts: 56

I don't think --rsyncable is strictly POSIX. The way I understand it, it adds a (tiny) bit of fragmentation to the resulting gzip by periodically resetting the compressor state, so a local change in the input only disturbs a local stretch of the output. The big upside is that the effect of small changes on the compressed output file is greatly lessened. For rsync and deltas, this is a big boost.

I'm putting together some comparative data.


#21 2009-02-10 18:22:56

zatricky
Member
From: Stockholm
Registered: 2008-09-03
Posts: 56

I took the backups of my own domain's web site, from the 9th to the 3rd of February. I unzipped each day's backups to at least tar format and then re-gzipped them, once normally and once with --rsyncable. I used no other gzip parameters.

last tar file: 134 656 000 bytes
last tgz file: 60 625 581 bytes
last rsyncable.tgz file: 63 664 000 bytes

I applied all the deltas to get each day's "full" tar and put them in separate subfolders. Then I ran xdelta3 to get a delta for each day and for each variant of the scheme. These are the total sizes for each type of delta over the 8-day period:

"gzip" "xdelta -D" : 222 525 619 bytes (all 7 deltas together, 31 789 374 bytes average per delta)
"gzip --rsyncable" "xdelta -D" : 27 676 472 bytes (3 953 781 bytes average)
"gzip" "xdelta" : 1 247 887 bytes (178 269 bytes average)

Obviously, allowing xdelta to decompress and recompress is the best way in terms of bandwidth. But at the same time, short of fixing the problem we have with xdelta, gzip --rsyncable and xdelta -D isn't too bad a stopgap, since it's also probably the easiest to implement.

The actual bash commands I used to do all the above are at http pastebin swiftspirit co za/9

The math was done by hand... silly me. ;)

edit...
Forgot to add up the actual total bandwidth used in this comparison of "updating" each day:

no deltas, just download the fresh gzip each day : 394 404 842 bytes
"gzip" "xdelta -D" : 357 181 619 bytes
"gzip --rsyncable" "xdelta -D" : 91 340 472 bytes
"gzip" "xdelta" : 61 873 468 bytes
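The hand arithmetic above can be re-checked with a few lines of shell. This is only a sketch: the byte counts are copied from this post, while the variable and function names are made up for illustration. It assumes each total is one full download plus all seven deltas. Under that assumption the posted 357 181 619 for "gzip" + "xdelta -D" matches the tar size plus the deltas; starting from the 60 625 581-byte tgz instead would give 283 151 200, which may or may not be what was intended.

```shell
#!/bin/sh
# Byte counts copied from the post above.
tar_size=134656000            # last tar file
tgz_size=60625581             # last tgz file
rsyncable_tgz_size=63664000   # last rsyncable.tgz file
deltas_plain_D=222525619      # "gzip" + "xdelta -D", all 7 deltas
deltas_rsyncable_D=27676472   # "gzip --rsyncable" + "xdelta -D"
deltas_recompress=1247887     # "gzip" + "xdelta"
no_delta_total=394404842      # fresh gzip downloaded every day

# Integer average over the 7 deltas (fractions are truncated).
avg() { echo $(( $1 / 7 )); }
# Share of the no-delta total, as a truncated integer percentage.
pct() { echo $(( $1 * 100 / no_delta_total )); }

# Total transfer = one full download + all deltas.
total_plain_D=$(( tar_size + deltas_plain_D ))
total_rsyncable_D=$(( rsyncable_tgz_size + deltas_rsyncable_D ))
total_recompress=$(( tgz_size + deltas_recompress ))

echo "average delta:  $(avg "$deltas_plain_D") $(avg "$deltas_rsyncable_D") $(avg "$deltas_recompress")"
echo "total transfer: $total_plain_D $total_rsyncable_D $total_recompress"
echo "percent of no-delta total: $(pct "$total_plain_D") $(pct "$total_rsyncable_D") $(pct "$total_recompress")"
```

Seen as a share of the no-delta total, the --rsyncable stopgap lands at roughly a quarter of the bandwidth, and letting xdelta handle the compression itself at roughly a sixth.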

Last edited by zatricky (2009-02-10 18:51:39)


#22 2009-02-10 18:44:12

shining
Pacman Developer
Registered: 2006-05-10
Posts: 2,043

Thanks for the numbers, they are very interesting :)
It would indeed be easier to implement, but I think the fact that --rsyncable is not in the official gzip is a showstopper. Arch tries to be vanilla when possible. Also, xdelta3 is only in AUR/unsupported, but that is not the problem. If it was really needed, I am sure someone could maintain it in an official repo.

Anyway, I sent a mail about what I think is the current status of the implementation : http://www.archlinux.org/pipermail/pacm … 08129.html
There is a lot to discuss, and a lot to implement, and no one motivated to do all the jobs, so we won't go far :)


#23 2009-02-10 18:56:31

zatricky
Member
From: Stockholm
Registered: 2008-09-03
Posts: 56

In all this time working on the server (where the backups are), I never realised that we don't already have --rsyncable in Arch's gzip. :(

CentOS seems to have --rsyncable built in by default. It's shown in gzip --help but not in the man page.


#24 2010-02-15 20:19:13

milomak
Member
Registered: 2009-11-04
Posts: 61

Forgive me for reviving an old thread, but what are the reasons for not enabling TotalDownload by default in /etc/pacman.conf?

Surely the progress report for the total update process is useful information?

#25 2010-02-15 20:41:02

shining
Pacman Developer
Registered: 2006-05-10
Posts: 2,043

We just kept the old behavior by default, and added a new option to change it, that's all.
If you are not happy, feel free to open a feature request.

