
#1 2009-02-13 18:56:20

Xyne
Moderator/TU
Registered: 2008-08-03
Posts: 5,688

rebase: a pacman database synchronization tool

info page: http://xyne.archlinux.ca/info/rebase

[image: rebase_update.png]

Summary:
Rebase updates the pacman sync databases listed in a pacman configuration file. Instead of overwriting everything, it only writes files that are newer than, or missing from, the local database. It also minimizes IO operations by piping data around instead of writing unnecessary files to the disk. It should be ideal for anyone who wants to sync the database regularly without worrying about the disk (solid-state drives, flash sticks, etc). There's a minimal script on the info page that you can use to run it with cron.
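For illustration only, here is a minimal Python sketch of the idea described above: read a database archive as a stream and write only the entries that are missing or newer on disk. It is not rebase's actual code (rebase is written in Perl); the function name `sync_from_archive` is hypothetical, and the sketch assumes a trusted archive and compares modification times only.

```python
import io
import os
import tarfile


def sync_from_archive(stream, dest_dir):
    """Write only archive members that are missing or newer than the copy on disk.

    `stream` is any readable binary file object (a pipe, an HTTP response, ...),
    so the archive itself never has to be saved to disk.
    Returns the names of the members that were actually written.
    """
    written = []
    # "r|*" reads the archive as a forward-only stream and detects compression.
    with tarfile.open(fileobj=stream, mode="r|*") as archive:
        for member in archive:
            parts = member.name.split("/")
            # Skip non-files and anything that could escape dest_dir.
            if not member.isfile() or member.name.startswith("/") or ".." in parts:
                continue
            target = os.path.join(dest_dir, member.name)
            if os.path.exists(target) and os.path.getmtime(target) >= member.mtime:
                continue  # the on-disk copy is current, so no write is needed
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(target, "wb") as out:
                out.write(archive.extractfile(member).read())
            # Keep the archive's timestamp so the next run can compare against it.
            os.utime(target, (member.mtime, member.mtime))
            written.append(member.name)
    return written
```

Running it twice on the same archive should report no writes the second time, which is the disk-friendly behaviour the summary describes.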

You can also specify an output directory and a target architecture which should be useful for installing Arch remotely or in a 32-bit chroot environment. It should liven up sync operations a bit too with its formatted output.

Other features include handling multiple configuration files and logging output.

Last edited by Xyne (2009-08-09 12:42:30)


#2 2009-04-23 18:47:59

Shunpike
Member
From: France
Registered: 2009-01-28
Posts: 47

Re: rebase: a pacman database synchronization tool

Hi Xyne, it's me again :-)

I noticed this while testing powerpill and reflector :
[screenshot: 200904232015481280x800s.th.png]
(Forget about the powerpill error, it is 100% normal)

rebase failed to see any database updates on [testing] and [xyne-any], but pacman found them. Any idea why that would happen?


#3 2009-04-23 19:27:13

Zariel
Member
Registered: 2008-10-07
Posts: 446

Re: rebase: a pacman database synchronization tool

I've been thinking about a different output style, like this:

pkg ver -> newver

Coloured, and maybe some info about dep/desc changes too.


#4 2009-04-23 19:52:33

Xyne
Moderator/TU
Registered: 2008-08-03
Posts: 5,688

Re: rebase: a pacman database synchronization tool

@Shunpike
It's possible that I updated the repo just after you ran rebase.

@Zariel
I might implement the "pkgname oldver->newver" output (or some variation of it) but I don't think I'll include descriptions/dependencies because that would require opening and parsing files which just adds unnecessary overhead. The current messages just display the actual file creations and deletions.
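As a rough illustration of how a "pkgname oldver->newver" line could be produced without opening any files, the Python sketch below works purely from entry names, assuming the sync database uses the usual "pkgname-pkgver-pkgrel" directory naming. The function names are hypothetical and this is not rebase's code.

```python
def split_entry(entry):
    """Split 'pkgname-pkgver-pkgrel' into (pkgname, 'pkgver-pkgrel')."""
    # pkgver and pkgrel cannot contain hyphens, so split from the right.
    name, pkgver, pkgrel = entry.rsplit("-", 2)
    return name, f"{pkgver}-{pkgrel}"


def describe_changes(old_entries, new_entries):
    """Return one line per package that was added, removed or changed version."""
    old = dict(split_entry(e) for e in old_entries)
    new = dict(split_entry(e) for e in new_entries)
    lines = []
    for name in sorted(old.keys() | new.keys()):
        before, after = old.get(name), new.get(name)
        if before == after:
            continue  # unchanged package, nothing to report
        if before is None:
            lines.append(f"{name} (new) {after}")
        elif after is None:
            lines.append(f"{name} {before} (removed)")
        else:
            lines.append(f"{name} {before} -> {after}")
    return lines
```

Because only directory names are compared, the cost stays at one directory listing per repo, which fits the concern about parsing overhead.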

Last edited by Xyne (2009-04-23 19:57:40)


#5 2009-04-23 19:54:41

Shunpike
Member
From: France
Registered: 2009-01-28
Posts: 47

Re: rebase: a pacman database synchronization tool

Xyne wrote:

It's possible that I updated the repo just after you ran rebase.

Yeah sure, but it seems weird to me that the changes appeared on both testing and xyne-any within a 2-second interval :-)
I'll have a closer look over the next few days and I'll get back to you if that kind of thing happens too often.


#6 2009-04-23 20:00:22

Xyne
Moderator/TU
Registered: 2008-08-03
Posts: 5,688

Re: rebase: a pacman database synchronization tool

Shunpike wrote:

Yeah sure, but it seems weird to me that the changes appeared on both testing and xyne-any within a 2-second interval :-)
I'll have a closer look over the next few days and I'll get back to you if that kind of thing happens too often.

Experiment with the "--full-extract" option too, e.g. if you suspect that the database hasn't been updated after a rebase operation.

If you really want to test it, use the "--dir" option to create a parallel database as a control. ;)


#7 2009-04-23 21:32:05

Xyne
Moderator/TU
Registered: 2008-08-03
Posts: 5,688

Re: rebase: a pacman database synchronization tool

Ok, I've changed the display information a bit. Try the latest version and let me know what you think. I'm not sure about the colors (or even the layout) so feel free to make suggestions.


#8 2009-04-24 16:02:24

Daenyth
Forum Fellow
From: Boston, MA
Registered: 2008-02-24
Posts: 1,244

Re: rebase: a pacman database synchronization tool

Would you say this is faster than pacman for updating the sync database? I have -Sy in cron.daily, would this use less bandwidth or any other resource?


#9 2009-04-24 17:00:24

Shunpike
Member
From: France
Registered: 2009-01-28
Posts: 47

Re: rebase: a pacman database synchronization tool

Daenyth wrote:

Would you say this is faster than pacman for updating the sync database? I have -Sy in cron.daily, would this use less bandwidth or any other resource?

There is a description at http://xyne.archlinux.ca/info/rebase, which was mentioned in the first post ;-)
Doesn't make much difference most of the time, but it saves disk write operations.

edit: way too much "most of the time" in my last sentence ;-)

Last edited by Shunpike (2009-04-24 17:38:06)


#10 2009-04-24 17:34:20

Xyne
Moderator/TU
Registered: 2008-08-03
Posts: 5,688

Re: rebase: a pacman database synchronization tool

Exactly. The bandwidth usage is the same because the repo databases are only available as archives, so they still need to be downloaded (they're so small though that their size is nigh negligible anyway). The main features of rebase are that it minimizes disk IO by only writing newer files to the disk instead of removing and restoring the entire database and that it displays info about what it's doing as it does it. Other features include syncing individual repos, creating sync databases in other directories and creating sync databases for non-host architectures (the latter two should be useful for setting up a chroot, etc). Any speed difference is negligible.

I used to sync pacman with "-Sy" in cron.hourly but now I have rebase in there (which I use with paconky to display new repo and AUR packages).


#11 2009-04-27 17:13:15

Shunpike
Member
From: France
Registered: 2009-01-28
Posts: 47

Re: rebase: a pacman database synchronization tool

It happened again today: rebase did not show any updates, whereas pacman -Sy found some 10 seconds later.
[screenshot: 200904271904051280x800s.th.png]

By the way, what exactly does the --full-extract option do?


#12 2009-04-27 17:53:44

Xyne
Moderator/TU
Registered: 2008-08-03
Posts: 5,688

Re: rebase: a pacman database synchronization tool

I think I know why it might be doing that. Pacman just checks the time of the last database update and compares it to the time in .lastupdate in each database directory. Even if there are no new files in the database on the mirror, the timestamp will change when the mirror syncs to the main server so pacman will think it's new and download it. Rebase doesn't care about timestamps... it always downloads the latest database archive and checks it for new packages/database files.

The --full-extract option does the same thing that pacman does when it updates the database... it extracts everything from the database archive into the database directory, overwriting old files regardless of whether they've changed.


#13 2009-06-05 00:04:50

Daenyth
Forum Fellow
From: Boston, MA
Registered: 2008-02-24
Posts: 1,244

Re: rebase: a pacman database synchronization tool

You need to set the umask so that a "sudo rebase" creates the correct permissions in /var/lib/pacman/sync
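To illustrate the point, here is a small Python sketch of how a umask of 022 yields 755 directories and 644 files regardless of the caller's own umask. The entry name "example-1.0-1" and the function name are made up for the example, and this is not the actual fix applied to rebase.

```python
import os


def create_db_entry(base_dir, mask=0o022):
    """Create a sample entry directory and file under a fixed umask.

    Returns the resulting (directory mode, file mode) permission bits.
    """
    previous = os.umask(mask)  # os.umask returns the old mask so it can be restored
    try:
        entry_dir = os.path.join(base_dir, "example-1.0-1")
        os.mkdir(entry_dir)  # requested 0o777, masked to 0o755
        desc_path = os.path.join(entry_dir, "desc")
        with open(desc_path, "w") as handle:  # requested 0o666, masked to 0o644
            handle.write("%NAME%\nexample\n")
    finally:
        os.umask(previous)
    dir_mode = os.stat(entry_dir).st_mode & 0o777
    file_mode = os.stat(desc_path).st_mode & 0o777
    return dir_mode, file_mode
```

Setting the mask inside the tool, rather than relying on the invoking shell, is what makes a "sudo rebase" behave the same for every user.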


#14 2009-06-05 01:40:41

Xyne
Moderator/TU
Registered: 2008-08-03
Posts: 5,688

Re: rebase: a pacman database synchronization tool

Daenyth wrote:

You need to set the umask so that a "sudo rebase" creates the correct permissions in /var/lib/pacman/sync

Done.

As I no longer set my user's umask (due to the sudo devs using it for root's umask), rebase always sets the correct permissions for me.


#15 2009-06-05 13:14:56

Daenyth
Forum Fellow
From: Boston, MA
Registered: 2008-02-24
Posts: 1,244

Re: rebase: a pacman database synchronization tool

Thanks :)


#16 2009-06-06 00:23:46

Xyne
Moderator/TU
Registered: 2008-08-03
Posts: 5,688

Re: rebase: a pacman database synchronization tool

I made a mistake with the umask which I've now corrected. I'd recommend that anyone who used rebase to update the database in the interval between this post and my last post should run the following commands to make sure that all permissions are correct:

rm -fr /var/lib/pacman/sync
pacman -Syyu

This will reset the sync databases and make sure that you have the latest version of rebase with the correct umask line.

Sorry to anyone affected by this.


#17 2009-06-06 03:05:01

Daenyth
Forum Fellow
From: Boston, MA
Registered: 2008-02-24
Posts: 1,244

Re: rebase: a pacman database synchronization tool

You can also do a

find /var/lib/pacman/sync -type d -not -perm 755 -exec chmod 755 {} +
find /var/lib/pacman/sync -type f -not -perm 644 -exec chmod 644 {} +


#18 2009-08-05 08:35:26

Army
Member
Registered: 2007-12-07
Posts: 1,784

Re: rebase: a pacman database synchronization tool

Hey Xyne, a few days ago rebase stopped working for me. It syncs testing and core fine, but when it should extract extra it simply stops. A pacman -Syy works just fine, so it doesn't seem to be a mirror problem. This happens on 2 different Arch installations. I use rebase because I'm on xfs and using rebase saves me about a minute ;)


#19 2009-08-08 10:55:46

foutrelis
Developer/TU
From: Athens, Greece
Registered: 2008-07-28
Posts: 675

Re: rebase: a pacman database synchronization tool

Army wrote:

Hey Xyne, a few days ago rebase stopped working for me. It syncs testing and core fine, but when it should extract extra it simply stops. A pacman -Syy works just fine, so it doesn't seem to be a mirror problem. This happens on 2 different Arch installations. I use rebase because I'm on xfs and using rebase saves me about a minute ;)

I'm experiencing the same issue on my netbook. [core] and [community] synchronize fine, but rebase chokes on [extra]. An interesting observation of mine is that this problem doesn't exist on my desktop which runs Arch x86_64 (vs i686 on the netbook).

I suspect that extra reaches the limit of data a pipe can hold and it blocks, but I can't be sure. :)

Last few lines of `sudo strace rebase extra':

write(4, "o)\354\4\301\326\305-\237\360\306\317Y\23\341l*\313\245s\330[\331\r\25\330\251\363\307KfY\346"..., 4096) = 4096
write(4, "\356\37\377\365\372\235\345\1&\224@\313\1\214\2\22@\333P\347\203H\206\1\212\10\364}\16|*\2\242"..., 4096) = 4096
write(4, "\373\263\327\f\317G\7\373t\325\242\261\262\1779iR*\226@'\2\337'\\\327~\21\224\0072\212\244"..., 4096) = 4096
write(4, "O0\342n\377\367\241\26\377\265\226\302\16\3723\343\332\26@\326\211\376\253ge\2760\1\32;\237\34\335"..., 4096) = 4096
write(4, "Xa\"\2271\225\241<\224\376\32\317\357S\375\214\226\315\207w\365\10\301\0241\20|\26\265\237\265\256o"..., 4096) = 4096
write(4, "$\345\257\205V\325)\321\342\343\372n>\373b\207\212%>Fl\34ZG\374\177P\f\354\214\377\225\375"..., 4096) = 4096
write(4, "\310\177\257['\375?\24\377\301V\375\27Fb\374/\210\16\363_c)\364\341\277\355\224n\376\373\356\307"..., 4096) = 4096
write(4, "g\276\327\266\271/\337\210\316<#4\346c)\307\22\fG\201}\371\17\2\214)/\353\277\23\30\371/"..., 4096) = 4096
write(4, "\0\260\17\3\237\t\354\242\361S\276i\2460'D\342\334\276\372k\10\215\312\0017\206\21\231\263\34hx"..., 4096) = 4096
write(4, "\354\37\356\374\227\222f\377\317x\377'\220\366\374\267\276\333\314\324\362\355Wp\f\16\267\376\2154\370\31\352"..., 4096) = 4096
write(4, "\321\27\214\266\357\177\220D\376\v\241\203\366\37\30\3;\371\17{\366G\356\376\17e4\326\177\r\"?\376"..., 4096) = 4096
write(4, "\30\34\256\177\10\203\377\337\213\266\360\2373\25\16a\300\315= \330\200\3\357\235~Q\255~\233\27\265\312"..., 4096) = 4096
write(4, "R\310\30\20L\223\243\202R1\22o:\24fA\355RN\325\"\34\370\333Z4pW0p\256\242\277"..., 4096) = 4096
write(4, "\317\222i\200\277\347\246\341\372?\307\371/\6\360p\375\3\36\342\377^\324\343?g*\354@?3jp"..., 1835

EDIT: Patched /usr/share/perl5/vendor_perl/Xyne/Arch/Rebase.pm to fork a new process which does the writing to the pipe opened with bsdtar. This way the main process doesn't block. Apply with:

cd /usr/share/perl5/vendor_perl/Xyne/Arch
wget -O - -q http://omploader.org/vMjRhcg/Rebase.patch | sudo patch -p0

(My Perl-fu is a bit rusty, so Xyne should probably review the above patch and possibly rewrite it before including it in rebase :P )
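The blocking described above, and the fork-based way around it, can be sketched in Python as follows. This is not the Perl patch itself: `filter_through` and `ECHO_CMD` are hypothetical names, and a small streaming copy command stands in for bsdtar. If a single process wrote everything first and read afterwards, both pipes could fill up and neither side would make progress once the data exceeds the pipe capacity.

```python
import os
import subprocess
import sys

# Stand-in for bsdtar: copies stdin to stdout chunk by chunk (assumption for the demo).
ECHO_CMD = [
    sys.executable,
    "-c",
    "import shutil, sys; shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)",
]


def filter_through(cmd, data):
    """Feed `data` to cmd's stdin and return everything cmd writes to stdout."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    pid = os.fork()
    if pid == 0:
        # Child: only writes. It may block on a full pipe without stalling the parent.
        status = 0
        try:
            proc.stdout.close()
            proc.stdin.write(data)
            proc.stdin.close()
        except Exception:
            status = 1
        finally:
            os._exit(status)  # never fall back into the parent's code path
    # Parent: drop its copy of the write end so cmd sees EOF when the child finishes.
    proc.stdin.close()
    output = proc.stdout.read()  # drains the pipe while the child keeps writing
    proc.stdout.close()
    os.waitpid(pid, 0)
    proc.wait()
    return output
```

With a payload of 1 MiB, well above the typical 64 KiB pipe capacity on Linux, `filter_through(ECHO_CMD, payload)` should return the payload unchanged, because the reader and the writer run in separate processes.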

Last edited by foutrelis (2009-08-08 13:15:08)


#20 2009-08-09 12:25:01

Xyne
Moderator/TU
Registered: 2008-08-03
Posts: 5,688

Re: rebase: a pacman database synchronization tool

@foutrelis
Thanks for the patch. I really like the way that you used fork to get around the buffering issue. As you saw in the code, I had just kludged my way around it with cat pipes (which I've removed now). :P

I've applied the patch although I've reverted run_tar_cmd to a class method instead of an instance method to keep it portable. Was there any particular reason for that change? I've only done some minor testing but it doesn't seem to have any effect.

Anyway, thanks again. I'm sure it would have been frustrating to find a solution on my own and it would probably have been less elegant, especially considering all the other little bugs in other things that magically appeared while I was away.


#21 2009-08-09 12:51:06

foutrelis
Developer/TU
From: Athens, Greece
Registered: 2008-07-28
Posts: 675

Re: rebase: a pacman database synchronization tool

Xyne wrote:

Thanks for the patch. I really like the way that you used fork to get around the buffering issue. As you saw in the code, I had just kludged my way around it with cat pipes (which I've removed now). :P

I was confused about the use of `cat' there. It's now clear to me that it increased the available buffer size by stacking 3 pipes together. I believe these can be removed now, as you've done already.

Xyne wrote:

I've applied the patch although I've reverted run_tar_cmd to a class method instead of an instance method to keep it portable. Was there any particular reason for that change? I've only done some minor testing but it doesn't seem to have any effect.

Well, I wanted to use msg_error(), so I needed to access $self so I can then pass it to msg_error(). As I said, I'm not very experienced in Perl, so please excuse my novice mistakes.

Xyne wrote:

Anyway, thanks again. I'm sure it would have been frustrating to find a solution on my own and it would probably have been less elegant, especially considering all the other little bugs in other things that magically appeared while I was away.

I'm glad I could help improve such a useful application. My netbook's SSD is thankful too! :P


#22 2009-08-09 13:05:47

Xyne
Moderator/TU
Registered: 2008-08-03
Posts: 5,688

Re: rebase: a pacman database synchronization tool

foutrelis wrote:
Xyne wrote:

I've applied the patch although I've reverted run_tar_cmd to a class method instead of an instance method to keep it portable. Was there any particular reason for that change? I've only done some minor testing but it doesn't seem to have any effect.

Well, I wanted to use msg_error(), so I needed to access $self so I can then pass it to msg_error(). As I said, I'm not very experienced in Perl, so please excuse my novice mistakes.

I suspected that it might have been for the msg_error function but I wanted to be sure in case it was for something else that I hadn't understood. In any case I think your Perl-fu looks fine and I'm happy to have learned a new trick with fork to get around pipe buffers, so there really is nothing to excuse. :)


#23 2009-08-18 18:13:06

xduugu
Member
Registered: 2008-10-16
Posts: 290

Re: rebase: a pacman database synchronization tool

Hi Xyne, thanks for rebase. It's a huge time saving when updating large pacman databases.

The only drawback compared to a "pacman sync" is that the databases are always downloaded, even if they weren't updated. This can cause a lot of unnecessary traffic on client and server side. Pacman uses a file ".lastupdate" which contains a timestamp to recognize database changes.

The code would be quite simple but it seems to be not that easy to integrate it into the existing rebase code.

    my (undef, undef, $time, undef, undef) = head($url);
    my $last_update = $self->get_output_dir . "/$repo/.lastupdate";
    return undef if -r $last_update and $time <= `cat "$last_update"`;

Of course the timestamp should also be updated after a successful sync and you probably want to check for an empty .lastupdate file.
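A slightly fuller version of the same check, sketched in Python rather than Perl, could look like the following. It assumes ".lastupdate" holds a single integer timestamp (the exact on-disk format used by pacman is not confirmed here), and all function names are hypothetical. A missing, empty or malformed file is treated as "never synced".

```python
def read_last_update(path):
    """Return the recorded timestamp, or None if the file is missing, empty or malformed."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def needs_download(remote_mtime, path):
    """True when the remote database looks newer than the last recorded sync."""
    recorded = read_last_update(path)
    return recorded is None or remote_mtime > recorded


def record_update(remote_mtime, path):
    """Store the remote timestamp; call this only after a successful sync."""
    with open(path, "w") as handle:
        handle.write(f"{int(remote_mtime)}\n")
```

Writing the timestamp only after extraction succeeds means an interrupted sync is simply retried on the next run.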

edit: typo

Last edited by xduugu (2009-08-18 18:13:58)


#24 2009-08-18 21:07:08

Xyne
Moderator/TU
Registered: 2008-08-03
Posts: 5,688

Re: rebase: a pacman database synchronization tool

Hi xduugu.

I've added the option "--by-lastupdate" which will check the remote database timestamp against the one recorded in ".lastupdate" and only download the remote database if it appears to be newer (and then update ".lastupdate" after extraction).

As you pointed out, it wasn't entirely trivial to integrate into the existing code (which still shows signs of its origin as a hacked-together script) so I may have introduced bugs despite my testing. Let me know if you notice any unexpected behavior.


#25 2009-08-18 21:21:59

Arm-the-Homeless
Member
Registered: 2008-12-22
Posts: 273

Re: rebase: a pacman database synchronization tool

This is awesome.

I'll have to find a way to make `sudo pacman -Syu` do `sudo rebase ; sudo pacman -Su`.
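One possible way to do this is a small wrapper script, sketched here in Python. It assumes the wrapper runs as root in place of calling pacman directly and that "rebase" with no arguments syncs all configured repos; `plan_commands` and `run` are hypothetical names.

```python
import subprocess


def plan_commands(pacman_args):
    """Return the command lines to run for a given pacman invocation."""
    if pacman_args == ["-Syu"]:
        # Let rebase refresh the sync databases, then upgrade without another sync.
        return [["rebase"], ["pacman", "-Su"]]
    return [["pacman", *pacman_args]]


def run(pacman_args):
    """Run the planned commands in order, stopping at the first failure."""
    for command in plan_commands(pacman_args):
        subprocess.check_call(command)
```

Unlike the `;` in the one-liner above, this sketch stops if rebase fails, so pacman never upgrades against a half-updated database.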

