
#1 2009-12-11 06:07:23

Xyne
Moderator/TU
Registered: 2008-08-03
Posts: 5,688

A package detection script to help rebuild the local database.

At this point this is a quick "proof of principle" script inspired by this thread.

This will download the file lists for the specified repos and then check them against the files on your system. It will then print a percentage for each package that indicates how many of that package's files were found on your system. Note that it ignores directories, as counting them would lead to loads of partial false positives.

The idea is that it should provide a decent starting point for rebuilding the local package database. You can safely run it on your system to see the output as it doesn't change anything.

The script depends on Perl, curl and bsdtar. You should change the "$url" variable to a local mirror (I've used archlinux.org along with "`arch`" as a relatively failsafe default). You can also add or remove repos from the "@repos" array.

Again, this is just something that I threw together to see if it works and it's very much a "hands-on" script right now. I might flesh it out and try to add more features later. Note that it cannot determine which packages were installed as dependencies (although I posted a script somewhere on the forum that could explicitly install only top-level packages, which could probably be merged into this if it goes anywhere). It is also limited to repos that contain <repo>.files.tar.gz.

#!/usr/bin/perl
use strict;
use warnings;

use File::Temp qw/tempdir/;

my $url = 'ftp://ftp.archlinux.org/$repo/os/' . `arch`;
chomp $url;
my @repos = qw/core extra community/;

my $tmpdir = tempdir(CLEANUP=>1);

foreach my $repo (@repos)
{
  my $files_url = $url;
  $files_url =~ s/\/\$repo\//\/$repo\//;
  $files_url .= '/' . $repo .'.files.tar.gz';
  `cd "$tmpdir" && curl "$files_url" | bsdtar -xf-`;
}
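# Each extracted directory in $tmpdir corresponds to one package.
opendir(my $dh, $tmpdir) or die "error: failed to open $tmpdir: $!\n";
my @pkgs = readdir($dh);
closedir($dh);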

my $l = 0;
foreach my $pkg (@pkgs)
{
  my $i = length($pkg);
  $l = $i if $i > $l;
}

foreach my $pkg (sort @pkgs)
{
  next if ($pkg eq '.' or $pkg eq '..');
  my @files = ();
  if (open(my $fh, '<', $tmpdir .'/'. $pkg .'/files'))
  {
    while (defined(my $line = <$fh>))
    {
      chomp $line;
      next if $line eq '%FILES%' or substr($line,-1) eq '/';
      push @files, '/' . $line;
    }
    close($fh);
    my $n = scalar @files;
    next if $n == 0;
    my $i = 0;
    foreach my $file (@files)
    {
      $i++ if -f $file;
    }
    printf("%-${l}s %3d%%\n", $pkg, 100*$i/$n);
  }
  else
  {
    print "error: failed to open $tmpdir/$pkg/files\n";
  }
}

Example output:

perl-xml-xpath-1.13-4                                0%
perl-xmms-0.12-4                                     0%
perl-xyne-arch-0.95-1                              100%
perl-xyne-common-0.05-3                            100%
perl-yaml-0.70-1                                     0%

Last edited by Xyne (2010-07-24 16:13:30)


#2 2009-12-11 06:42:20

tavianator
Member
From: Waterloo, ON, Canada
Registered: 2007-08-21
Posts: 858

Re: A package detection script to help rebuild the local database.

Way to pick a language I don't know... I should probably learn Perl anyway though.  I'll test this and maybe hack on it a bit tomorrow.


#3 2009-12-11 21:03:01

tavianator
Member
From: Waterloo, ON, Canada
Registered: 2007-08-21
Posts: 858

Re: A package detection script to help rebuild the local database.

Works quite well.  I set it to only return packages with a >= 90% match, and there were only 7 false positives and 8 false negatives out of 1128 packages (excluding AUR packages).
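
For reference, a rough sketch of that kind of threshold filter, applied to the script's output lines rather than patched into the script itself. "filter_matches" is a hypothetical helper and the 90 is just the cutoff mentioned above; it assumes each line looks like "name percent%":

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the package names whose percentage is at or
# above the threshold. Lines that do not look like "name NN%" are skipped.
sub filter_matches
{
  my ($threshold, @lines) = @_;
  my @kept;
  foreach my $line (@lines)
  {
    next unless $line =~ /^(\S+)\s+(\d+)%\s*$/;
    push @kept, $1 if $2 >= $threshold;
  }
  return @kept;
}

my @sample = (
  'perl-xmms-0.12-4                                     0%',
  'perl-xyne-arch-0.95-1                              100%',
  'perl-yaml-0.70-1                                     0%',
);
print "$_\n" foreach filter_matches(90, @sample);
```

The same check could also go directly before the printf in the original script, which would avoid parsing the output again.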


#4 2009-12-11 23:43:34

Xyne
Moderator/TU
Registered: 2008-08-03
Posts: 5,688

Re: A package detection script to help rebuild the local database.

Thanks for the feedback, tavianator.

Which false positives did it detect? Were they variants of installed packages?

I think the false negatives are due to packages that modify or remove their own files during or after installation. They should still show up in the list though, albeit with a lower percentage. I don't think there's any way to work around that.


#5 2009-12-12 00:15:55

tavianator
Member
From: Waterloo, ON, Canada
Registered: 2007-08-21
Posts: 858

Re: A package detection script to help rebuild the local database.

The false positives were gimp-devel, id3lib-rcc, libxft-lcd, links, taglib-rcc, ttf-freefont, and vncviewer-jar.  So yeah, variants of installed packages.  It also does a good job of detecting the official versions of git packages I have installed, which is expected but still cool.
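
One way such variants might be flagged is to look for candidate packages that claim the same files. This is only a sketch of the idea, not something the script does: "find_overlaps" is a hypothetical helper, and the package names and paths in the sample data are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes package => [file paths] pairs and returns a hash
# of hashes in which $conflicts{$p}{$q} is set when $p and $q share a file.
sub find_overlaps
{
  my (%files_of) = @_;
  my %owners;
  foreach my $pkg (sort keys %files_of)
  {
    push @{ $owners{$_} }, $pkg foreach @{ $files_of{$pkg} };
  }
  my %conflicts;
  foreach my $file (keys %owners)
  {
    my @pkgs = @{ $owners{$file} };
    next if @pkgs < 2;
    foreach my $p (@pkgs)
    {
      foreach my $q (@pkgs)
      {
        $conflicts{$p}{$q} = 1 if $p ne $q;
      }
    }
  }
  return %conflicts;
}

# Made-up sample data: "foo" and "foo-variant" both ship /usr/bin/foo.
my %conflicts = find_overlaps(
  'foo-1.0-1'         => ['/usr/bin/foo', '/usr/share/foo/data'],
  'foo-variant-1.1-1' => ['/usr/bin/foo'],
  'bar-2.0-1'         => ['/usr/bin/bar'],
);
foreach my $pkg (sort keys %conflicts)
{
  print "$pkg overlaps with: ", join(', ', sort keys %{ $conflicts{$pkg} }), "\n";
}
```

Packages reported this way would still need a manual decision about which variant is actually installed.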
