
#1 2004-10-28 15:54:32

Dusty
Schwag Merchant
From: Medicine Hat, Alberta, Canada
Registered: 2004-01-18
Posts: 5,986
Website

Cleanup

Does anybody know of a pacman option (or more likely, a shell script) that would list all files on the system (or in specific directories, like /etc/ or /usr/...) that are not under pacman's control?

Before Arch, I would change distros every few months just so I could start with a fresh partition and get a brand new Linux without any cruft (ignoring the fact that most other distros are full of cruft anyway, of course).

Now, I'm into the rolling release and have no reason to reinstall Arch, and every reason (no bandwidth) not to. However, I've installed various packages that I don't use anymore that have created dynamic files in /etc/ or /usr/share and other places that were not automatically removed when the program was.  For example, when I pacman -R'd apache, /etc/httpd still had some files in it (in this case, they were files I had created manually, but they could have been created by the application too; it doesn't matter for the principle).

What I want to do is have a script that scans my system for all files that are not under package management and then print them to stdout.  Then I can scan this list manually at my leisure and remove any files I don't want around anymore. It wouldn't take as long to do this as it would to reinstall Arch, and it would leave me with a cleaner system.

On the other hand, you could say that since I can't see the useless files on the system anyway, they aren't really bothering me... but I'm the type of guy that cringes at the idea of the possible existence of a file that is doing nothing more than wasting a few bytes of space. I'm the guy that goes through his home directory every week to make sure there aren't any files that are not totally necessary (and in the process removes files that usually turn out to be a little bit necessary by the next day and ... damn. but that's another story).

I figure a combination of pacman -Q, pacman -Ql and find might do the trick, or perhaps query the database directly... but scanning every file on the system, that's going to be a huge job, isn't it? I suppose I could run it at night and check the output in the morning...

Dusty

Offline

#2 2004-10-28 16:06:47

skeeterbug
Member
From: Oklahoma, USA
Registered: 2004-10-24
Posts: 92
Website

Re: Cleanup

I was working on a script to do something similar not too long ago - called it rogues.  I haven't had much time to mess with it but it does seem to work - sorta.  It's very rough and takes a long time but here it is.

#!/bin/sh

# Shell script to find rogue files.  Designed for Arch Linux but may be
# modified for other distros who have an ownership function provided by
# the package manager.
#
# Author:  Gary Singleton <gsinglet1@yahoo.com>
# Version: 0.1 - 16 October 2004

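if [ -z "$1" ]; then echo 'dumbass' && exit 1; fi

# Read find's output line by line so paths containing spaces survive.
find "$1" | while IFS= read -r filename; do
  # Skip directories; report files that pacman says no package owns.
  if [ ! -d "$filename" ] &&
     nice pacman -Qo "$filename" 2>&1 | grep -qi 'no package owns'; then
    echo "$filename"
  fi
done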

Oh, it works like 'rogues /usr' or 'rogues /etc' or whatever so you don't really have to do everything at once.  If anyone can figure out how to make it run faster I'd be happy but I think the time is in the pacman -Qo part.

Offline

#3 2004-10-28 23:54:16

z4ziggy
Member
From: Israel
Registered: 2004-03-29
Posts: 573
Website

Re: Cleanup

i think a much faster method would be to dump all files installed by all pacman packages into a file, then produce a dir list of the current directory, and compare the two. it will be 10x faster (actually, much more ;) )
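A rough sketch of that idea, assuming the owned paths have already been written one per line into a plain list file. The function name `unowned` and the file names are made up for illustration. `comm` compares two sorted lists in a single pass, instead of running one lookup per file.

```shell
#!/bin/sh
# unowned (hypothetical helper): print paths found under a directory
# that do not appear in a list of package-owned paths.
#   $1 = file with one owned path per line
#   $2 = directory to scan
unowned() {
  owned_sorted=$(mktemp) || return 1
  found_sorted=$(mktemp) || return 1
  # comm needs both inputs sorted in the same collation order.
  LC_ALL=C sort -u "$1" > "$owned_sorted"
  # Directories are skipped: pacman lists them with a trailing slash,
  # so they would never match find's output exactly.
  find "$2" ! -type d | LC_ALL=C sort -u > "$found_sorted"
  # comm -13 keeps only the lines unique to the second file.
  LC_ALL=C comm -13 "$owned_sorted" "$found_sorted"
  rm -f "$owned_sorted" "$found_sorted"
}
```

On a system with pacman, something like `pacman -Ql | cut -d' ' -f2- > /tmp/owned.list` followed by `unowned /tmp/owned.list /etc` would feed it. Paths containing newlines are not handled.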

imho

Offline

#4 2004-10-29 00:23:55

skeeterbug
Member
From: Oklahoma, USA
Registered: 2004-10-24
Posts: 92
Website

Re: Cleanup

I do something kind of like that - I take 'snapshots' of my filesystem (excluding /home /tmp /var) with an ls -lR > filenamebasedondatetime and then every once in a while compare them or whatever.  I'm like Dusty, I hate cruft buildup in my system.
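For anyone who wants to try the snapshot approach, here is a minimal sketch. It swaps `ls -lR` for `find`, because one path per line is easier to compare mechanically; the function names `snapshot` and `new_since` are invented for this example.

```shell
#!/bin/sh
# snapshot (hypothetical helper): write a sorted list of every path
# under the given directories to an output file.
#   $1 = output file, remaining arguments = directories to record
snapshot() {
  out=$1
  shift
  find "$@" | LC_ALL=C sort > "$out"
}

# new_since (hypothetical helper): print paths that exist in the newer
# snapshot ($2) but not in the older one ($1).
new_since() {
  LC_ALL=C comm -13 "$1" "$2"
}
```

Usage would be along the lines of `snapshot "/root/snap-$(date +%Y%m%d-%H%M%S)" /etc /usr /opt`, naming only the directories you care about rather than excluding /home, /tmp and /var, and later `new_since old-snapshot new-snapshot`.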

Offline

#5 2004-12-21 23:44:32

skeeterbug
Member
From: Oklahoma, USA
Registered: 2004-10-24
Posts: 92
Website

Re: Cleanup

I've updated my little script because Phrak is busier than me :) and this might help in the meantime.  I took z4ziggy's advice (partially) and it is most definitely faster.  Feel free to modify to your liking.

#!/bin/sh

# Simple shell script to find rogue files. Written for use with Arch Linux
# and the pacman package manager but can be modified for use with any
# distribution/package manager that has or can generate a list of installed
# files.
#
# There is little or no 'fact checking' done by this script (at this point)
# which means you have to interpret the results yourself because there may
# be false positives.  For example, symbolic links created by the install
# script(s) will be reported as not existing in the db - which they don't.
# So, DON'T pipe the results of this command to some kind of automatic rm
# script or function - that would probably be very very bad.
#
# By using this script you are assuming responsibility for any and all damage
# done to your system.  Do not try to hold me responsible if you do not use
# this script wisely.
#
# Arch Linux:             http://www.archlinux.org/
# pacman Package Manager: http://www.archlinux.org/pacman/
# My Arch Linux stuff:    http://singleton.homeunix.org/arch/
#
# Author:  Gary Singleton <gsinglet1@yahoo.com> (skeeterbug)
# Version: 0.1 - 16 October 2004
# Version: 0.2 - 21 December 2004
#   Rewritten completely to first generate a file listing via the
#   pacman -Ql command then use this to find rogues.  This is much
#   faster than using pacman -Qo to query the db for each filename.

if [ -z "$1" ]; then
  echo
  echo 'Searches the given path for files not present in the pacman db.'
  echo
  echo "Usage  : $0 PATH"
  echo "Example: $0 /bin"
  exit 1
fi

rm -f /tmp/rogues.list
pacman -Ql > /tmp/rogues.list

# Read find's output line by line so paths containing spaces survive.
# grep -F does a substring match, so /etc/foo also matches /etc/foobar.
find "$1" | while IFS= read -r filename; do
  if ! grep -Fq -- "$filename" /tmp/rogues.list; then
    echo "$filename not found in /tmp/rogues.list"
  fi
done

Offline

#6 2004-12-22 00:06:49

iBertus
Member
From: Greenville, NC
Registered: 2004-11-04
Posts: 2,228

Re: Cleanup

I would think it even faster to grep out the files from only the search path into /tmp/rogues.list and thus eliminate most of the file entries.

EDIT: I tried this out by doing the following and testing it with my /etc directory. Without the grep the script runs in 13s but with it runs in just 4s. Someone else try it out and see if we can confirm that this leaves nothing out of the search.

pacman -Ql | grep -F -- "$1" > /tmp/rogues.list
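On the "leaves nothing out" question: for a plain path with no regex metacharacters, an unanchored grep matches the search path anywhere in the line, so it can keep too many entries but should not drop any that live under that path. A stricter filter that only keeps paths starting with the directory could look like the sketch below. It assumes the package-name column has already been stripped, and `under_dir` is a made-up name.

```shell
#!/bin/sh
# under_dir (hypothetical helper): read paths on stdin and print only
# those that start with directory $1.
under_dir() {
  # ${1%/}/ normalises the argument to exactly one trailing slash,
  # so /etc does not match /etcetera and / still matches everything.
  # Note: awk -v interprets backslash escapes in the value.
  awk -v dir="${1%/}/" 'index($0, dir) == 1'
}
```

For example, `pacman -Ql | cut -d' ' -f2- | under_dir /etc > /tmp/rogues.list` would drop an entry such as /usr/share/etc/x that the plain grep keeps.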

Offline

#7 2004-12-22 00:38:59

cactus
Taco Eater
From: t͈̫̹ͨa͖͕͎̱͈ͨ͆ć̥̖̝o̫̫̼s͈̭̱̞͍̃!̰
Registered: 2004-05-25
Posts: 4,622
Website

Re: Cleanup

iBertus wrote:

I would think it even faster to grep out the files from only the search path into /tmp/rogues.list and thus eliminate most of the file entries.

EDIT: I tried this out by doing the following and testing it with my /etc directory. Without the grep the script runs in 13s but with it runs in just 4s. Someone else try it out and see if we can confirm that this leaves nothing out of the search.

pacman -Ql | grep -F -- "$1" > /tmp/rogues.list

that line will only write to rogues.list if the initial directory IS in the pacman file list.
i.e., if you type rogues /etc, it will populate rogues.list with every entry that contains the string "/etc" somewhere in it. You want to output the files that are not listed.

For clarification, you are trying to remove things that ARE in the list, so that only the files that are not in pacman's list are left in rogues.list.


"Be conservative in what you send; be liberal in what you accept." -- Postel's Law
"tacos" -- Cactus' Law
"t̥͍͎̪̪͗a̴̻̩͈͚ͨc̠o̩̙͈ͫͅs͙͎̙͊ ͔͇̫̜t͎̳̀a̜̞̗ͩc̗͍͚o̲̯̿s̖̣̤̙͌ ̖̜̈ț̰̫͓ạ̪͖̳c̲͎͕̰̯̃̈o͉ͅs̪ͪ ̜̻̖̜͕" -- -̖͚̫̙̓-̺̠͇ͤ̃ ̜̪̜ͯZ͔̗̭̞ͪA̝͈̙͖̩L͉̠̺͓G̙̞̦͖O̳̗͍

Offline

#8 2004-12-22 01:37:43

skeeterbug
Member
From: Oklahoma, USA
Registered: 2004-10-24
Posts: 92
Website

Re: Cleanup

I see what you guys are saying and there's definitely room for improvement.  I'm not much of a bash scripter really, just small quicky things.

When grepping out stuff to not bother with, it would probably be a good idea to test $1 and, if it is equal to '/', skip the grep of the pacman -Ql output, since every path matches anyway?

Also, using cut to select only the second field makes the /tmp file a little smaller and faster to search.  A section something like this maybe?

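rm -f /tmp/rogues.list
# Quote $1 and put spaces around the operator; $1=='/' is a single
# non-empty word, so the original test never took the first branch.
if [ "$1" != '/' ]; then
  # -f2- keeps the whole path even when it contains spaces.
  pacman -Ql | cut -d' ' -f2- | grep -F -- "$1" > /tmp/rogues.list
else
  pacman -Ql | cut -d' ' -f2- > /tmp/rogues.list
fi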

Another improvement I can see opportunity for right off the bat is the option to not list symlinks [ ! -L ]?
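For the symlink idea, note that in a shell test -L (or -h) is the check for a symbolic link, while -s checks that a file is non-empty. The filter can also be pushed into find itself so the links never reach the loop. A small sketch, with `list_candidates` as an invented name:

```shell
#!/bin/sh
# list_candidates (hypothetical helper): print everything under $1
# except symbolic links and directories.
list_candidates() {
  find "$1" ! -type l ! -type d
}
```

The output could then replace the plain find in the script, e.g. `list_candidates /etc | while IFS= read -r filename; do ...; done`.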

If there are other suggestions / improvements please post them - I don't think this little script is worthy of a home base so to speak but it would be nice to incorporate all the suggestions and improvements in one place like this thread.

Offline

#9 2004-12-22 01:46:16

skeeterbug
Member
From: Oklahoma, USA
Registered: 2004-10-24
Posts: 92
Website

Re: Cleanup

Looking back over this thread perhaps the choice of /tmp/rogues.list was a bad one (naming wise).  It makes it look like this file is the list of rogue (cruft) files when in fact it's just a list of what pacman 'thinks' should be there...  The actual list of suspected rogue/cruft files is the stdout of the script.

Also, related to this last point, it might be better to output just the filename so the script could be used as a step in a longer command line without needing cut or something similar to extract the filename.  In other words output:
/usr/lib/mozilla/plugins/nphelix.so
instead of:
/usr/lib/mozilla/plugins/nphelix.so not found in /tmp/rogues.list
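A sketch of what bare-filename output might look like, assuming the list file holds one path per line with the package name already cut off. `report_missing` is a hypothetical name, and the exact-line match (-x) is a change from the substring match in the posted script.

```shell
#!/bin/sh
# report_missing (hypothetical helper): read candidate paths on stdin
# and print, one bare path per line, those not present in list file $1.
report_missing() {
  while IFS= read -r path; do
    # -F: fixed string, -x: whole line must match, -q: no output.
    grep -Fxq -- "$path" "$1" || printf '%s\n' "$path"
  done
}
```

That makes it easy to chain, for example `find /usr/lib ! -type d | report_missing /tmp/rogues.list | wc -l` to count the suspects.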

Just some thoughts...

Offline

#10 2004-12-22 02:30:13

iBertus
Member
From: Greenville, NC
Registered: 2004-11-04
Posts: 2,228

Re: Cleanup

it looks like we have a pretty good script going here. by cutting back on the size of the file to be searched i managed to get the overall runtimes down by as much as 30-50% in folders with lots of non-pacman managed files. i too am not much of a bash scripter but feel like we have something pretty good going! we have the experience hanging around this forum to create a very useful script if we keep working... good job!

Offline

#11 2004-12-22 09:10:28

vacant
Member
From: downstairs
Registered: 2004-11-05
Posts: 816

Re: Cleanup

A bit of system management keeps me on track. I have a separate "build" directory. Here I keep software not supplied as a package by the distro I happen to be using.

With /var/cache/pacman/pkg and "build" I know everything installed on my box.

Currently "build" contains stuff built from source:

baghira-0.6     mailfilter-0.6.2    smb4k-0.4.1a
athene_install   ksensors-0.7.3  smartmontools-5.33  webdec-0.46

and a text file reminding me of stuff built from binaries:

edonkeyclc
realplayer

Offline

#12 2004-12-22 21:06:20

iBertus
Member
From: Greenville, NC
Registered: 2004-11-04
Posts: 2,228

Re: Cleanup

I used to keep track of all my files by hand while using slackware and I would agree that nothing beats a good bit of system management but I've always thought we should have a tool like this to help with that task.

Offline

#13 2004-12-23 08:09:10

skeeterbug
Member
From: Oklahoma, USA
Registered: 2004-10-24
Posts: 92
Website

Re: Cleanup

I keep a pretty good eye on stuff too but there's always a chance something might be missed which is why I check it out sometimes.

But, as for the script I think it's a pretty decent beginning but I have been and I continue to be hesitant to work on it too much because Phrakture is writing a real program that does this - search the forums for 'pacscan'.

So, if Phrak decides not to finish pacscan or simply doesn't have time or whatever then I can see going forward with improving this script further.  As it is, it works OK - just a little slow and you can only run one instance because of the hardcoded /tmp file.
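The one-instance limit could probably be lifted by giving each run its own temporary file. A minimal sketch using mktemp, where `make_list` is an invented helper and the `rogues.XXXXXX` template is just an example name:

```shell
#!/bin/sh
# make_list (hypothetical helper): save stdin to a freshly created,
# uniquely named temp file and print that file's name.
make_list() {
  list=$(mktemp "${TMPDIR:-/tmp}/rogues.XXXXXX") || return 1
  cat > "$list"
  printf '%s\n' "$list"
}
```

In the script this might be used as `list=$(pacman -Ql | cut -d' ' -f2- | make_list) || exit 1`, followed by `trap 'rm -f "$list"' EXIT` so the file is cleaned up when the run finishes.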

Offline

#14 2004-12-27 04:05:00

iBertus
Member
From: Greenville, NC
Registered: 2004-11-04
Posts: 2,228

Re: Cleanup

I'm just happy this little script is around for my current needs.

Offline
