#1 2010-06-17 01:12:58

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,597
Website

tar gzip vs. xz compression

What are folks using these days?  I know the official package format got switched over from .tar.gz to .tar.xz, but unless I'm not using tar correctly, my system backups using xz are taking way longer.

$ tar zcvfp home.tar.gz /home

vs

$ tar cvpJf home.tar.xz /home

Am I omitting an option on the xz format or is it really just that much slower?


CPU-optimized Linux-ck packages @ Repo-ck  • AUR packages • Zsh and other configs

Offline

#2 2010-06-17 01:35:10

falconindy
Developer
From: New York, USA
Registered: 2009-10-22
Posts: 4,111
Website

Re: tar gzip vs. xz compression

Speed is the tradeoff for higher compression. Apparently, newer releases are making headway toward a more efficient algorithm.

Offline

#3 2010-06-17 01:41:04

sand_man
Member
From: Australia
Registered: 2008-06-10
Posts: 2,164

Re: tar gzip vs. xz compression

I prefer a smaller download rather than faster compression time.


Offline

#4 2010-06-17 01:43:48

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,597
Website

Re: tar gzip vs. xz compression

For smaller directory trees it's no biggie, but my backup script runs (in part) tar cvpJf arch-system.tar.xz /etc /boot /root /var --exclude "/var/cache/pacman/pkg", and it takes 10x longer than gz compression only to save about 9% in file size, so it doesn't pay off for me.  I don't want to start a debate about the two formats; I just wanted to make sure that I had the switches right for the xz compression.
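
A hedged sketch of the same kind of backup command, run against a throwaway directory tree so it is safe to try. The tree layout, file names, and archive name are made up for illustration. It puts --exclude ahead of the paths, on the assumption that some GNU tar releases treat the option as position-sensitive; that ordering should work either way.

```shell
set -e
# Build a small stand-in for /etc and /var in a temporary directory (hypothetical layout).
work=$(mktemp -d)
mkdir -p "$work/etc" "$work/var/cache/pacman/pkg" "$work/var/log"
echo "setting=1" > "$work/etc/example.conf"
echo "log line" > "$work/var/log/example.log"
echo "cached package" > "$work/var/cache/pacman/pkg/example.pkg.tar.xz"

# -C changes into the tree first, so member names and the exclude pattern are relative.
# c = create, p = preserve permissions, J = compress with xz, f = archive file name.
tar -C "$work" --exclude='var/cache/pacman/pkg' -cpJf "$work/backup.tar.xz" etc var

# List the archive to confirm the package cache was left out.
xz -dc "$work/backup.tar.xz" | tar -tf -
```

Swapping J for z in the option string gives the gzip equivalent, which makes it easy to compare the two on the same tree.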

Last edited by graysky (2010-06-17 01:44:03)


Offline

#5 2010-06-17 01:44:42

some-guy94
Member
Registered: 2009-08-15
Posts: 360

Re: tar gzip vs. xz compression

xz decompression is not supposed to be as 'extreme' as the compression.

Offline

#6 2010-06-17 01:49:00

IgnorantGuru
Member
Registered: 2009-11-09
Posts: 640
Website

Re: tar gzip vs. xz compression

Another option is --bzip2 in tar, which IIRC is somewhere between gz and xz in compression speed and size.

Also... xz uses multiple threads when run on a multi-core CPU (by default).  You can also specify the compression level, which will affect the speed.  To specify options to xz you'll probably need to use

tar -cf - files | xz -2 > archive.txz

Personally, I have my system do automated backups in the wee hours of the night, so I don't care if it grinds away for awhile with xz.
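
For reference, the pipe is not the only way to hand a compression level to xz. A minimal sketch of three variants, assuming GNU tar and an xz that reads the XZ_OPT environment variable (as documented in its man page); the directory and archive names are placeholders invented for the example.

```shell
set -e
# Throwaway input so the commands can be tried safely (names are placeholders).
work=$(mktemp -d)
mkdir -p "$work/files"
seq 1 20000 > "$work/files/numbers.txt"
cd "$work"

# 1. Pipe tar's output into xz, as in the command above.
tar -cf - files | xz -2 > piped.tar.xz

# 2. Let tar start xz itself (J) and pass the level through XZ_OPT.
XZ_OPT=-2 tar -cJf env.tar.xz files

# 3. Name the compressor and its options explicitly.
tar --use-compress-program='xz -2' -cf explicit.tar.xz files

ls -l piped.tar.xz env.tar.xz explicit.tar.xz
```

Lower levels such as -1 or -2 trade some compression ratio for speed; the default is -6.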

Last edited by IgnorantGuru (2010-06-17 01:58:33)

Offline

#7 2010-06-17 01:54:44

andresp
Member
Registered: 2010-05-29
Posts: 62

Re: tar gzip vs. xz compression

I use gz because just about every system has gzip installed, and bzip is just not useful.

I recently went through the pain of converting an ext4 1 TB drive someone else formatted to ext3 because the main NetBSD server box couldn't mount it...

Most of the time it makes sense to just stay portable.

Offline

#8 2010-06-17 01:56:16

wonder
Developer
From: Bucharest, Romania
Registered: 2006-07-05
Posts: 5,941
Website

Re: tar gzip vs. xz compression

You are using it the right way. It is known that xz compression takes a bit longer than gzip.


Give what you have. To someone, it may be better than you dare to think.

Offline

#9 2010-06-17 01:57:08

some-guy94
Member
Registered: 2009-08-15
Posts: 360

Re: tar gzip vs. xz compression

IgnorantGuru wrote:

Another option is --bzip2 in tar, which IIRC is somewhere between gz and xz in compression speed and size.

Personally, I have my system do automated backups in the wee hours of the night, so I don't care if it grinds away for awhile with xz.

AFAIK, bzip2 is between them in size, but xz actually beats bzip2 in speed.
http://tukaani.org/lzma/benchmarks.html

Offline

#10 2010-06-17 01:59:51

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,597
Website

Re: tar gzip vs. xz compression

@wonder - thanks for the feedback.  I noticed that neither tar nor xz is multithreaded.  By contrast, WinRAR for Windows is reported to be multithreaded.  I wonder if I/O is the bottleneck or if it's CPU cycles...

EDIT:  watching tar go with xz compression, I can speculate that I/O is not the limiting factor.  Sometimes the system (quad core here) just waits while the CPU compresses a large file using only one core.  Perhaps if xz or tar with xz could use all physical cores, we would see a speed increase?
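
For readers on newer systems: xz releases from 5.2 onward, which postdate this thread, accept a -T/--threads option. A minimal sketch, assuming such a version is installed; the input data and file names are invented for the example.

```shell
set -e
# Throwaway input (names are placeholders).
work=$(mktemp -d)
mkdir -p "$work/data"
seq 1 200000 > "$work/data/numbers.txt"
tar -C "$work" -cf "$work/data.tar" data

# -T1 forces a single worker thread; -T0 asks xz to use one thread per core.
# Requires xz 5.2 or newer (assumption about the installed version).
xz -T1 -c "$work/data.tar" > "$work/single.tar.xz"
xz -T0 -c "$work/data.tar" > "$work/multi.tar.xz"

ls -l "$work/single.tar.xz" "$work/multi.tar.xz"
```

Threading only pays off on inputs large enough to be split into several blocks, so a sample this small will not show a speedup; prefix each xz command with time on a real backup to compare. The same option can be passed through tar with XZ_OPT=-T0.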


Offline

#11 2010-06-17 02:01:18

wonder
Developer
From: Bucharest, Romania
Registered: 2006-07-05
Posts: 5,941
Website

Re: tar gzip vs. xz compression

xz is not multicore. If I remember well, multicore support is on the way.


Give what you have. To someone, it may be better than you dare to think.

Offline

#12 2010-06-17 02:10:29

IgnorantGuru
Member
Registered: 2009-11-09
Posts: 640
Website

Re: tar gzip vs. xz compression

some-guy94 wrote:

AFAIK, bzip2 is between them in size, but xz actually beats bzip2 in speed.
http://tukaani.org/lzma/benchmarks.html

Those benchmarks are comparing bzip2 to lzma, but xz is different (aka lzma2).  According to the benchmarks below, with default settings xz is quite a bit slower than bzip2.
http://stephane.lesimple.fr/wiki/blog/l … k_reloaded

But I can't say I've tested it.  bzip2 is handy because the compression is usually fairly close to xz (compared to gz), but more people have bzip2 installed.  XArchiver, for example, doesn't yet support tar.xz.

I recently changed a 1.6G tar.gz tarball to tar.bz2 with a savings of 600MB!  I didn't use xz because I'm not sure how widespread it is yet.

Offline

#13 2010-06-17 02:12:13

IgnorantGuru
Member
Registered: 2009-11-09
Posts: 640
Website

Re: tar gzip vs. xz compression

wonder wrote:

xz is not multicore. If I remember well, multicore support is on the way.

Oh, you're right - I read the man page but missed the comment at the end that says it's not yet implemented.

Offline

#14 2010-06-17 02:40:07

IgnorantGuru
Member
Registered: 2009-11-09
Posts: 640
Website

Re: tar gzip vs. xz compression

Well, now I had to try it.  :)  xz is way slower than bzip2, at least in my test.

$ time tar --bzip2 -cf xxx.tar.bz2 xxx

real    0m13.735s
user    0m13.542s
sys    0m0.237s
$ time tar --xz -cf xxx.tar.xz xxx

real    0m55.535s
user    0m55.106s
sys    0m0.387s

Where xxx is a 125MB folder containing a mixture of text files and executables.  Despite the speed difference, bzip2 comes pretty close to xz in size:

bzip2: 125MB -> 75MB
xz: 125MB -> 72MB
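
A rough sketch for repeating this kind of size comparison on your own data. The sample directory here is generated on the fly and is only a placeholder; point the tar command at a real folder for meaningful numbers. Tools that are not installed are skipped.

```shell
set -e
# Placeholder sample data; replace with a real directory for a useful comparison.
work=$(mktemp -d)
mkdir -p "$work/sample"
seq 1 100000 > "$work/sample/numbers.txt"

# Create one uncompressed tarball, then compress it with each tool at default settings.
tar -C "$work" -cf "$work/sample.tar" sample
printf '%-6s %s bytes\n' "tar" "$(wc -c < "$work/sample.tar")"

for tool in gzip bzip2 xz; do
    # Skip compressors that are not available on this system.
    command -v "$tool" >/dev/null 2>&1 || continue
    # The output suffix is just the tool name, to keep the loop simple.
    "$tool" -c "$work/sample.tar" > "$work/sample.tar.$tool"
    printf '%-6s %s bytes\n' "$tool" "$(wc -c < "$work/sample.tar.$tool")"
done
```

Compressing a single pre-built tarball keeps the file-reading cost out of the comparison, so differences come from the compressors alone.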

Offline

#15 2010-06-17 04:24:13

Anikom15
Banned
From: United States
Registered: 2009-04-30
Posts: 836
Website

Re: tar gzip vs. xz compression

I dislike xz; I feel the tradeoff isn't worth it. Faster compression is better than smaller size, in my opinion. I have lots of space and a healthy connection, so xz seems unnecessary.

But you really shouldn't worry too much about it. Just know that gz and bzip2 have been the de facto standards and xz hasn't; in fact, the repos switched to that format only recently.

Last edited by Anikom15 (2010-06-17 04:24:32)


Personally, I'd rather be back in Hobbiton.

Offline

#16 2010-06-17 19:23:59

fredre
Member
Registered: 2009-12-18
Posts: 45

Re: tar gzip vs. xz compression

Although xz compression is way slower than gzip compression, the decompression times are about the same. This makes xz ideal for things like package distribution, as the file is only compressed once but is downloaded and decompressed thousands of times.

However, for backups xz is probably not worth using, as the file is compressed once and usually never decompressed. gzip is imo still the best choice for backing up stuff.

As for bzip, it's slow at compression and decompression, and usually produces file sizes somewhere in-between what you would get from gzip and xz.

Offline

#17 2010-06-17 21:26:34

Cdh
Member
Registered: 2009-02-03
Posts: 1,098

Re: tar gzip vs. xz compression

Try building the sage-mathematics-bin package from AUR on an Intel Atom. It takes ages to compress the 3.4 GB into the .xz.

Last edited by Cdh (2010-06-17 21:26:53)


฿ 18PRsqbZCrwPUrVnJe1BZvza7bwSDbpxZz

Offline

#18 2010-06-19 05:33:13

fredre
Member
Registered: 2009-12-18
Posts: 45

Re: tar gzip vs. xz compression

Cdh wrote:

Try building the sage-mathematics-bin package from AUR on an Intel Atom. It takes ages to compress the 3.4 GB into the .xz.

makepkg can use gzip by editing PKGEXT in /etc/makepkg.conf, which might be useful for building large packages from AUR for example.
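
In /etc/makepkg.conf that change is a single assignment. A minimal sketch, assuming the stock file sets PKGEXT to '.pkg.tar.xz' as it would after the switch discussed in this thread:

```shell
# /etc/makepkg.conf (excerpt; makepkg reads this file as shell variable assignments)
# Assumed stock value after the switch to xz:
#PKGEXT='.pkg.tar.xz'
# Build gzip-compressed packages instead, trading size for build speed:
PKGEXT='.pkg.tar.gz'
```

The extension alone selects the compressor, so no other setting needs to change.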

Last edited by fredre (2010-06-19 05:34:56)

Offline

#19 2010-06-19 05:43:46

Mardoct
Member
Registered: 2009-08-17
Posts: 208

Re: tar gzip vs. xz compression

My favourite thing is not compressing it. When I get somewhere even resembling close to my maximum storage I might care more.


The human being created civilization not because of willingness but of a need to be assimilated into higher orders of structure and meaning.

Offline

#20 2010-06-19 10:36:56

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,597
Website

Re: tar gzip vs. xz compression

fredre wrote:

makepkg can use gzip by editing PKGEXT in /etc/makepkg.conf, which might be useful for building large packages from AUR for example.

Nice, thanks.


Offline
