#1 2008-03-24 16:30:53

Spider.007
Member
Registered: 2004-06-20
Posts: 1,150
Website

Find files that are not under pacmans control

I have just cleaned up a lot of old files on my machine, which is why I wrote this script: to make the job easier on my other 3 machines ;) The script contains a list of paths to ignore. Besides unowned files, it also reports files that should be on the filesystem but are not, together with the packages that provide those files.

http://archlinux.spider007.net/pacman-f … unowned.sh
[29/3] Above link has been updated to include tips from sabooky

requests / comments welcome

Last edited by Spider.007 (2008-03-29 15:31:03)

Offline

#2 2008-03-28 20:33:18

sabooky
Member
Registered: 2006-11-02
Posts: 89

Re: Find files that are not under pacmans control

Looks good, I think you might have given me the incentive to finally clean up my comp.

some comments/suggestions:
1. It would be more efficient to use -prune in the find command instead of finding everything, then parsing out the greylist.

2.

GREYFILTER=$(sed "1,`grep --line-number '## GREYLISTED' $0| cut -d: -f1|tail -n1`d" $0 | tr '\n' '|')

can be written as

GREYFILTER=$(sed '1,/^## GREYLISTED/d' $0 | tr '\n' '|')

3.

diff --suppress-common-lines full owned | grep '>' | cut -d' ' -f2
diff --suppress-common-lines full owned | grep '<' | cut -d' ' -f2

can be written as

comm -13 full owned
comm -23 full owned
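
To see what the two comm calls select, here is a minimal sketch on made-up data (the paths and the scratch directory are assumptions for illustration, not taken from the script). Note that comm needs both inputs sorted the same way, and that, unlike the diff/cut pipeline, it does not split paths containing spaces.

```shell
# Scratch directory so nothing on the real system is touched
work=$(mktemp -d)

# Both inputs must be sorted identically; LC_ALL=C keeps the order predictable
printf '%s\n' /etc/a /etc/b /usr/c | LC_ALL=C sort > "$work/full"
printf '%s\n' /etc/b /usr/c /usr/d | LC_ALL=C sort > "$work/owned"

# -13 hides columns 1 and 3: lines only in the second file (owned, but not found)
only_owned=$(LC_ALL=C comm -13 "$work/full" "$work/owned")

# -23 hides columns 2 and 3: lines only in the first file (found, but not owned)
only_full=$(LC_ALL=C comm -23 "$work/full" "$work/owned")

echo "owned, but not found: $only_owned"
echo "found, but not owned: $only_full"

rm -r "$work"
```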

Offline

#3 2008-03-28 21:15:40

Allan
Developer
From: Brisbane, AU
Registered: 2007-06-09
Posts: 10,434
Website

Re: Find files that are not under pacmans control

Once you incorporate any suggestions you think are necessary, it would be nice to submit this to the pacman-dev mailing list for inclusion in the pacman-contrib package.

Offline

#4 2008-03-28 21:35:09

Shaika-Dzari
Member
From: Québec, Canada
Registered: 2006-04-14
Posts: 436
Website

Re: Find files that are not under pacmans control

Hello!

So this script finds packages that are not under pacman's control and erases them?
Is that it?

Could you describe which files will be erased? Only packages in /var/cache/pacman/pkg, or any file?

@+

Offline

#5 2008-03-28 21:44:49

Allan
Developer
From: Brisbane, AU
Registered: 2007-06-09
Posts: 10,434
Website

Re: Find files that are not under pacmans control

Looking at the script, no files will be erased, just listed.

Offline

#6 2008-03-28 23:51:33

ibendiben
Member
Registered: 2007-10-10
Posts: 519
Website

Re: Find files that are not under pacmans control

Heey
Some time ago I did the same thing on my machine, even started a topic on how to do this best...

In the end this was the code I found suited me best (well, except that I didn't know about comm back then ;) ):

sudo find / -type f -o -type l | sort -u >/tmp/all
pacman -Ql | cut -d' ' -f2- | sort -u >/tmp/owned

comm -23 /tmp/all /tmp/owned | grep -v '/boot\|/dev\|/home\|/media\|/mnt\|/proc\|/root\|/srv\|/sys\|/tmp\|/var/abs\|/var/cache\|/var/lib/pacman\|/var/log\|/var/run'
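
One thing to watch for with the grep -v filter above: the alternatives are not anchored, so "/sys" or "/dev" anywhere inside a path hides that path too. A small sketch of the difference, using made-up paths (assumptions for illustration only) and grep -E for brevity:

```shell
# Hypothetical sample of "found but not owned" paths
paths='/dev/null
/usr/include/sys/local.h
/etc/hosts.bak'

# Unanchored: "/sys" and "/dev" match anywhere in the path
loose=$(printf '%s\n' "$paths" | grep -Ev '/dev|/sys')

# Anchored: only paths that are /dev or /sys, or live below them, are dropped
strict=$(printf '%s\n' "$paths" | grep -Ev '^(/dev|/sys)(/|$)')

echo "unanchored keeps:"; echo "$loose"
echo "anchored keeps:";   echo "$strict"
```

With the unanchored pattern the file under /usr/include/sys disappears from the report, even though it is exactly the kind of file the report is meant to show.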

Last edited by ibendiben (2008-03-28 23:54:10)

Offline

#7 2008-03-29 00:05:30

ibendiben
Member
Registered: 2007-10-10
Posts: 519
Website

Re: Find files that are not under pacmans control

The results from the code above:

/etc/.pwd.lock
/etc/X11/xorg.conf
/etc/asound.state
/etc/fonts/conf.d/20-fix-globaladvance.conf
/etc/fonts/conf.d/20-lohit-gujarati.conf
/etc/fonts/conf.d/20-unhint-small-vera.conf
/etc/fonts/conf.d/30-amt-aliases.conf
/etc/fonts/conf.d/30-replace-bitmap-fonts.conf
/etc/fonts/conf.d/30-urw-aliases.conf
/etc/fonts/conf.d/40-generic.conf
/etc/fonts/conf.d/49-sansserif.conf
/etc/fonts/conf.d/50-user.conf
/etc/fonts/conf.d/51-local.conf
/etc/fonts/conf.d/60-latin.conf
/etc/fonts/conf.d/65-fonts-persian.conf
/etc/fonts/conf.d/65-nonlatin.conf
/etc/fonts/conf.d/69-unifont.conf
/etc/fonts/conf.d/80-delicious.conf
/etc/fonts/conf.d/90-synthetic.conf
/etc/group-
/etc/gshadow-
/etc/gtk-2.0/gdk-pixbuf.loaders
/etc/gtk-2.0/gtk.immodules
/etc/ld.so.cache
/etc/localtime
/etc/mailcap
/etc/mtab
/etc/pango/pango.modules
/etc/passwd-
/etc/profile.d/locale.sh
/etc/shadow-
/etc/xml/catalog
/opt/java/jre/man/whatis
/opt/kde/share/applications/mimeinfo.cache
/opt/qt/man/whatis
/usr/bin/rview
/usr/bin/view
/usr/lib/locale/locale-archive
/usr/lib/xorg/modules/libwfb.so
/usr/local/man/whatis
/usr/man/whatis
/usr/share/applications/mimeinfo.cache
/usr/share/fonts/TTF/andalemo.ttf
/usr/share/fonts/TTF/arial.ttf
/usr/share/fonts/TTF/arialbd.ttf
/usr/share/fonts/TTF/arialbi.ttf
/usr/share/fonts/TTF/ariali.ttf
/usr/share/fonts/TTF/ariblk.ttf
/usr/share/fonts/TTF/comic.ttf
/usr/share/fonts/TTF/comicbd.ttf
/usr/share/fonts/TTF/cour.ttf
/usr/share/fonts/TTF/courbd.ttf
/usr/share/fonts/TTF/courbi.ttf
/usr/share/fonts/TTF/couri.ttf
/usr/share/fonts/TTF/fonts.dir
/usr/share/fonts/TTF/fonts.scale
/usr/share/fonts/TTF/georgia.ttf
/usr/share/fonts/TTF/georgiab.ttf
/usr/share/fonts/TTF/georgiai.ttf
/usr/share/fonts/TTF/georgiaz.ttf
/usr/share/fonts/TTF/impact.ttf
/usr/share/fonts/TTF/msfonts.txt
/usr/share/fonts/TTF/tahoma.ttf
/usr/share/fonts/TTF/times.ttf
/usr/share/fonts/TTF/timesbd.ttf
/usr/share/fonts/TTF/timesbi.ttf
/usr/share/fonts/TTF/timesi.ttf
/usr/share/fonts/TTF/trebuc.ttf
/usr/share/fonts/TTF/trebucbd.ttf
/usr/share/fonts/TTF/trebucbi.ttf
/usr/share/fonts/TTF/trebucit.ttf
/usr/share/fonts/TTF/verdana.ttf
/usr/share/fonts/TTF/verdanab.ttf
/usr/share/fonts/TTF/verdanai.ttf
/usr/share/fonts/TTF/verdanaz.ttf
/usr/share/fonts/TTF/webdings.ttf
/usr/share/fonts/misc/fonts.dir
/usr/share/fonts/misc/fonts.scale
/usr/share/man/nl/whatis
/usr/share/man/whatis
/var/lib/dbus/machine-id
/var/lib/dhcpcd/dhcpcd-eth0.info
/var/lib/dhcpcd/dhcpcd-wlan0.info
/var/lib/hwclock/adjtime
/var/lib/kdm/kdmsts
/var/lib/logrotate.status
/var/lib/mlocate/mlocate.db

This might be a nice reference, since it comes from a freshly installed system (plus kde-base, firefox, jre, flashplugin and a few common drivers).
In my opinion it is not so nice that a fresh system produces such a long list. I think it would be better and cleaner if the fonts, java, and kde packages registered those files with pacman. There might be a good reason why they don't, though. I don't know.

Offline

#8 2008-03-29 04:31:43

bender02
Member
From: Germany
Registered: 2007-02-04
Posts: 1,328

Re: Find files that are not under pacmans control

ibendiben: Those files you "don't like" are mostly cache files and files created by actually *running* the programs (all of the /var/lib/* stuff is of that sort). It doesn't make sense for a package to own a cache file, since it is not installed; it's just created when a user runs the program.

The fonts are weird, and I believe they are leftovers from some badly created package or something. Actually, by the looks of them they seem to be the standard set of Microsoft fonts - so one of the programs you installed or ran must have downloaded them and copied them there. Or the Microsoft fonts package is broken in some way.

The /etc/* stuff is also created when you configure programs.
For instance, the /etc/fonts/conf.d/* files are symlinks to /etc/fonts/conf.avail/* - that's the way you configure fonts.

So all in all, the only weird stuff I see in that list is the fonts - and I'm pretty sure those are not part of any base package.

Offline

#9 2008-03-29 08:29:48

ibendiben
Member
Registered: 2007-10-10
Posts: 519
Website

Re: Find files that are not under pacmans control

Well, you just gave a good reason for those files to be on the list, thanks for that. About the fonts: I installed fonts, obviously... but I don't think they're broken or leftovers. Maybe try it yourself and see if you get the same results; that is why I put that list up there. The fonts I installed were:
ttf-ms-fonts
ttf-bitstream-vera
ttf-dejavu
There might well be an explanation for those files being on the list; if anyone knows, tell us!

Offline

#10 2008-03-29 08:37:28

ibendiben
Member
Registered: 2007-10-10
Posts: 519
Website

Re: Find files that are not under pacmans control

I'll try to run the script before most other programs get launched, at bootup, and see what results from that.

Offline

#11 2008-03-29 13:29:01

bender02
Member
From: Germany
Registered: 2007-02-04
Posts: 1,328

Re: Find files that are not under pacmans control

Look at the contents of the ttf-ms-fonts package (pacman -Ql ttf-ms-fonts), and compare them with the contents of the package .tar.gz (look in your pacman cache to see what's in the archive ttf-ms-fonts-<version>-<arch>.tar.gz), and finally compare this to what's on your filesystem. It might shed some light on why these fonts appear to be orphaned.

Running it before anything else (I take it that you mean even before daemons) would still list most of the above files (I think). What I meant in my previous post is that they are created by running programs, but they are meant to survive reboots - that's the way they "remember" things.

Take for instance /var/lib/mlocate/mlocate.db. The way locate works is that it goes through the filesystem, usually once a day (from cron), and creates an index (that's the file). Then when you type 'locate <something>', it just looks into that database and tells you what it found. The point is that in order to run efficiently and not reindex the whole system on every run, it keeps the database in /var/lib and only does incremental updates over time. So it's *good* that the file is there.

Another thing is: if you look at that script, there is a "greylist" - those are dirs that are automatically excluded from being listed. I think it should actually include all of /var/lib - but that's each user's choice.

There are 2 files which are "true orphans" in your list - namely the stuff in /usr/bin/*. Those should not be on the list, and I think it's a bug in the vim package (it should include those 2 in its file list).

Offline

#12 2008-03-29 14:39:11

Spider.007
Member
Registered: 2004-06-20
Posts: 1,150
Website

Re: Find files that are not under pacmans control

sabooky wrote:

Looks good, I think you might have given me the incentive to finally clean up my comp.

[...]

Thanks for the suggestions; my sed knowledge is not that good [maybe someone has a URL for a good place to start?] and I will definitely apply these to my script; as well as the prune suggestion

ibendiben wrote:

Heey
Some time ago I did the same thing on my machine, even started a topic on how to do this best...

In the end this was the code I found suited me best (well, except that I didn't know about comm back then ;) ):

[...]

Yep, that's almost the same as what I am doing, and I didn't know about comm either :D

ibendiben wrote:

The results from the code above:

[...]

This might be a nice reference, since it comes from a freshly installed system (plus kde-base, firefox, jre, flashplugin and a few common drivers).
In my opinion it is not so nice that a fresh system produces such a long list. I think it would be better and cleaner if the fonts, java, and kde packages registered those files with pacman. There might be a good reason why they don't, though. I don't know.

I noticed a lot of files which I think should be owned by some package; for example xorg.conf; and smb.conf [which, I think, should replace the smb.conf.default file]. Also, files like /etc/profile.d/locale.sh and /etc/localtime which are created by initscripts should be included as files of that package imo

Offline

#13 2008-03-29 15:08:26

bender02
Member
From: Germany
Registered: 2007-02-04
Posts: 1,328

Re: Find files that are not under pacmans control

Spider.007 wrote:

I noticed a lot of files which I think should be owned by some package; for example xorg.conf; and smb.conf [which, I think, should replace the smb.conf.default file]. Also, files like /etc/profile.d/locale.sh and /etc/localtime which are created by initscripts should be included as files of that package imo

I don't agree with you, and here is why:
- some packages *don't* come with a default conf file (xorg, samba), and I think that is for a reason: it forces the user to actually do the configuration rather than rely on defaults that don't make any sense (especially for xorg). Then of course the package should *not* own the conf file, since it doesn't ship it.
- the /etc/profile.d/locale.sh and /etc/localtime files are *recreated* on every boot by the script /etc/rc.sysinit, so again, it makes sense to me that they are not owned by initscripts, since there are no sensible default contents for them.

EDIT: I guess I now realize that it's a matter of packaging standards. My point of view is that the /etc/ dir is one of those I'm supposed to "manage" myself: understand what's there, edit if necessary, create/delete files depending on the situation I'm trying to configure for.

EDIT2: I forgot to say that I appreciate your work, and I really like your script (found some cruft on my system as well).

Last edited by bender02 (2008-03-29 15:12:05)

Offline

#14 2008-03-29 15:19:18

Spider.007
Member
Registered: 2004-06-20
Posts: 1,150
Website

Re: Find files that are not under pacmans control

bender02 wrote:
Spider.007 wrote:

I noticed a lot of files which I think should be owned by some package; for example xorg.conf; and smb.conf [which, I think, should replace the smb.conf.default file]. Also, files like /etc/profile.d/locale.sh and /etc/localtime which are created by initscripts should be included as files of that package imo

I don't agree with you, and here is why:
- some packages *don't* come with a default conf file (xorg, samba), and I think that is for a reason: it forces the user to actually do the configuration rather than rely on defaults that don't make any sense (especially for xorg). Then of course the package should *not* own the conf file, since it doesn't ship it.
- the /etc/profile.d/locale.sh and /etc/localtime files are *recreated* on every boot by the script /etc/rc.sysinit, so again, it makes sense to me that they are not owned by initscripts, since there are no sensible default contents for them.

EDIT: I guess I now realize that it's a matter of packaging standards. My point of view is that the /etc/ dir is one of those I'm supposed to "manage" myself: understand what's there, edit if necessary, create/delete files depending on the situation I'm trying to configure for.

EDIT2: I forgot to say that I appreciate your work, and I really like your script (found some cruft on my system as well).

Regarding your initial post: we probably have a different view there, which is why I agree with your first edit. What I think is not good, though, is that the packaging guidelines are not very clear about this, and there also seems to be some inconsistency between packages. /var/lib contains some good examples: most files there are nicely 'owned', and others aren't, without any clear reason.

Offline

#15 2008-03-29 23:39:25

ibendiben
Member
Registered: 2007-10-10
Posts: 519
Website

Re: Find files that are not under pacmans control

Spider.007 wrote:
sabooky wrote:

Looks good, I think you might have given me the incentive to finally clean up my comp.

[...]

Thanks for the suggestions; my sed knowledge is not that good [maybe someone has a URL for a good place to start?] and I will definitely apply these to my script; as well as the prune suggestion

ibendiben wrote:

Heey
Some time ago I did the same thing on my machine, even started a topic on how to do this best...

In the end this was the code I found suited me best (well, except that I didn't know about comm back then ;) ):

[...]

Yep, that's almost the same as what I am doing, and I didn't know about comm either :D

The difference is that you list all directories too, which, when I first thought about it, is not so practical, because it lists a lot of duplicates (each file, plus each parent directory as well). But since there might be some empty directories on your system that you want to get rid of, it is useful to list those. Only you don't need sed for that; you can just add -type d -empty to your find command. Wouldn't that be better?
Also, in my opinion the prune option is not so useful in this situation, because it doesn't improve the speed that much, and the list of owned-but-not-found files will only get bigger, and you don't want that, do you?
So I'd stick with:

sudo find / -type f -o -type l -o -type d -empty | sort -u >/tmp/all
pacman -Ql | cut -d' ' -f2- | sort -u >/tmp/owned
comm -23 /tmp/all /tmp/owned | grep -v '/boot\|/dev\|/home\|/media\|/mnt\|/proc\|/root\|/srv\|/sys\|/tmp\|/var/abs\|/var/cache\|/var/lib/pacman\|/var/log\|/var/run'

Note the sort -u, which is short for sort | uniq.
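
The find expression above relies on operator precedence: the implicit "and" binds tighter than -o, so -empty only applies to directories. A minimal sketch on a scratch tree (the directory and file names are made up for illustration; -empty is available in GNU and BSD find):

```shell
work=$(mktemp -d)
mkdir -p "$work/full_dir" "$work/empty_dir"
touch "$work/full_dir/file"
ln -s file "$work/full_dir/link"

# Reads as: -type f  OR  -type l  OR  ( -type d AND -empty )
found=$(cd "$work" && find . -type f -o -type l -o -type d -empty | LC_ALL=C sort -u)
echo "$found"

rm -r "$work"
```

The non-empty directory is skipped, while the regular file, the symlink and the empty directory are all listed.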

--Ben

Offline

#16 2008-03-30 08:12:33

sabooky
Member
Registered: 2006-11-02
Posts: 89

Re: Find files that are not under pacmans control

ibendiben wrote:

Also, in my opinion the prune option is not so useful in this situation, because it doesn't improve the speed that much, and the list of owned-but-not-found files will only get bigger, and you don't want that, do you?

I disagree with you on the speed difference, but good point on owned but not found.
speed comparison:

$ time find / -type f -o -type l -o -type d -empty | sort -u >/tmp/all

real    0m12.465s
user    0m1.570s
sys    0m9.456s
$ time find / -type d \( -wholename '/boot' -o -wholename '/dev' -o -wholename '/home' -o -wholename '/media' -o -wholename '/mnt' -o -wholename '/proc' -o -wholename '/root' -o -wholename '/srv' -o -wholename '/sys' -o -wholename '/tmp' -o -wholename '/var/abs' -o -wholename '/var/cache' -o -wholename '/var/lib/pacman' -o -wholename '/var/log' -o -wholename '/var/run' \) -prune -o \( -type d -empty -o -type f -o -type l \) -print| sort -u >/tmp/allpruned

real    0m0.990s
user    0m0.483s
sys    0m0.507s

# verified that the results were the same with
$ diff <(grep -v '^\(/boot\|/dev\|/home\|/media\|/mnt\|/proc\|/root\|/srv\|/sys\|/tmp\|/var/abs\|/var/cache\|/var/lib/pacman\|/var/log\|/var/run\)' /tmp/all) /tmp/allpruned

The speed difference can be big for some people. For example, my '/mnt' has my Windows NTFS partition mounted there, and searching that takes a long time. The times above are not representative since both runs were served from RAM (cached), but I'd imagine reading from disk would give a similar picture.
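
The pruning expression can be tried safely on a scratch tree first. This is only a sketch under assumed names (the mnt/usr layout below is invented); it uses -path, which GNU find treats as a synonym of the -wholename used above:

```shell
work=$(mktemp -d)
mkdir -p "$work/mnt/windows" "$work/usr/bin"
touch "$work/mnt/windows/big.iso" "$work/usr/bin/tool"

# A directory matching the left side is pruned, i.e. never descended into;
# everything else falls through to the right side of -o and is printed
# if it is a regular file or a symlink
pruned=$(cd "$work" && find . -type d -path './mnt' -prune -o \( -type f -o -type l \) -print | LC_ALL=C sort)
echo "$pruned"

rm -r "$work"
```

The explicit -print matters: without it, find would also print the pruned directory itself.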

One way to use the -prune option and still be able to get owned but not found is

find / -type d \( -wholename '/boot' -o -wholename '/dev' -o -wholename '/home' -o -wholename '/media' -o -wholename '/mnt' -o -wholename '/proc' -o -wholename '/root' -o -wholename '/srv' -o -wholename '/sys' -o -wholename '/tmp' -o -wholename '/var/abs' -o -wholename '/var/cache' -o -wholename '/var/lib/pacman' -o -wholename '/var/log' -o -wholename '/var/run' \) -prune -o \( -type d -o -type f -o -type l \) -print| sort -u >/tmp/allpruned

pacman -Ql | cut -d' ' -f2-|sed 's:/$::'|sort -u > /tmp/owned

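echo "owned but not found:"
# IFS= and -r keep whitespace and backslashes in paths intact
comm -23 /tmp/owned /tmp/allpruned | while IFS= read -r file; do
  # report only paths that are really gone (-L also catches broken symlinks)
  [[ ! (-e "$file" || -L "$file") ]] && echo "$file"
done

echo -e "\nfound but not owned:"
comm -23 /tmp/allpruned /tmp/owned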
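
The reason for testing both -e and -L is that -e follows symlinks, so a symlink whose target is missing would otherwise be reported as absent even though the link itself is on disk. A minimal sketch with a hypothetical helper, exists_on_disk, written with the plain [ ] form of the same check:

```shell
work=$(mktemp -d)
touch "$work/real"
ln -s real "$work/good_link"
ln -s missing "$work/dangling_link"

# Hypothetical helper: succeeds if the path exists in any form,
# including a symlink whose target is gone
exists_on_disk() {
    [ -e "$1" ] || [ -L "$1" ]
}

missing=""
for name in real good_link dangling_link gone; do
    exists_on_disk "$work/$name" || missing="$missing $name"
done
echo "not on disk:$missing"

rm -r "$work"
```

Only the path that was never created ends up in the list; the dangling symlink still counts as present.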

Offline

#17 2008-03-30 08:40:26

dwi
Member
From: McKinney,TX
Registered: 2008-01-27
Posts: 27
Website

Re: Find files that are not under pacmans control

bender02 wrote:

The fonts are weird, and I believe they are leftovers from some badly created package or something. Actually, by the looks of them they seem to be the standard set of Microsoft fonts - so one of the programs you installed or ran must have downloaded them and copied them there. Or the Microsoft fonts package is broken in some way.

Back in the 90s when Microsoft started the Core fonts for the web initiative they decided to release them for "free", but one of the EULA limitations was you could not repackage them from the original self-extracting archives.

Many distros simply download the archives at install time and extract them. I'm guessing that because the final resulting files are not actually 'in' the pkg, pacman is not able to detect them and thus cannot generate a file list for the ttf-ms-fonts package.

Just a random FYI. :)


edit: It is worth noting that the install script for the package *does* generate a text file, msfonts.txt, which is used to remove the files during the package's removal.

chad

Last edited by dwi (2008-03-30 08:44:20)

Offline

#18 2008-03-30 09:11:28

Spider.007
Member
Registered: 2004-06-20
Posts: 1,150
Website

Re: Find files that are not under pacmans control

sabooky wrote:
ibendiben wrote:

Also, in my opinion the prune option is not so useful in this situation, because it doesn't improve the speed that much, and the list of owned-but-not-found files will only get bigger, and you don't want that, do you?

I disagree with you on the speed difference, but good point on owned but not found.
speed comparison:

$ time find / -type f -o -type l -o -type d -empty | sort -u >/tmp/all

real    0m12.465s
user    0m1.570s
sys    0m9.456s
$ time find / -type d \( -wholename '/boot' -o -wholename '/dev' -o -wholename '/home' -o -wholename '/media' -o -wholename '/mnt' -o -wholename '/proc' -o -wholename '/root' -o -wholename '/srv' -o -wholename '/sys' -o -wholename '/tmp' -o -wholename '/var/abs' -o -wholename '/var/cache' -o -wholename '/var/lib/pacman' -o -wholename '/var/log' -o -wholename '/var/run' \) -prune -o \( -type d -empty -o -type f -o -type l \) -print| sort -u >/tmp/allpruned

real    0m0.990s
user    0m0.483s
sys    0m0.507s

# verified that the results were the same with
$ diff <(grep -v '^\(/boot\|/dev\|/home\|/media\|/mnt\|/proc\|/root\|/srv\|/sys\|/tmp\|/var/abs\|/var/cache\|/var/lib/pacman\|/var/log\|/var/run\)' /tmp/all) /tmp/allpruned

The speed difference can be big for some people. For example, my '/mnt' has my Windows NTFS partition mounted there, and searching that takes a long time. The times above are not representative since both runs were served from RAM (cached), but I'd imagine reading from disk would give a similar picture.

[...]

Well, I think the current script handles this just fine: it doesn't perform a find /, but starts from a selected list of directories which does not include /mnt. Therefore the speed difference between using prune and filtering with sed afterwards will be smaller. I do like the idea of performing an extra check using [[ -e || -L ]] though.

Offline

#19 2008-03-30 12:51:37

ibendiben
Member
Registered: 2007-10-10
Posts: 519
Website

Re: Find files that are not under pacmans control

sabooky wrote:
ibendiben wrote:

Also, in my opinion the prune option is not so useful in this situation, because it doesn't improve the speed that much, and the list of owned-but-not-found files will only get bigger, and you don't want that, do you?

I disagree with you on the speed difference, but good point on owned but not found.
speed comparison:

$ time find / -type f -o -type l -o -type d -empty | sort -u >/tmp/all

real    0m12.465s
user    0m1.570s
sys    0m9.456s
$ time find / -type d \( -wholename '/boot' -o -wholename '/dev' -o -wholename '/home' -o -wholename '/media' -o -wholename '/mnt' -o -wholename '/proc' -o -wholename '/root' -o -wholename '/srv' -o -wholename '/sys' -o -wholename '/tmp' -o -wholename '/var/abs' -o -wholename '/var/cache' -o -wholename '/var/lib/pacman' -o -wholename '/var/log' -o -wholename '/var/run' \) -prune -o \( -type d -empty -o -type f -o -type l \) -print| sort -u >/tmp/allpruned

real    0m0.990s
user    0m0.483s
sys    0m0.507s

:|
These are my results (measured after several successive runs of find /, so the "first-time" slowness is gone; did you do that too?):

time -p (sudo find / -type f -o -type l -o -type d -empty)

real 1.79
user 0.22
sys 0.53

time -p (sudo find / -type d \( -wholename '/boot' -o -wholename '/dev' -o -wholename '/home' -o -wholename '/media' -o -wholename '/mnt' -o -wholename '/proc' -o -wholename '/root' -o -wholename '/srv' -o -wholename '/sys' -o -wholename '/tmp' -o -wholename '/var/abs' -o -wholename '/var/cache' -o -wholename '/var/lib/pacman' -o -wholename '/var/log' -o -wholename '/var/run' \) -prune -o \( -type d -empty -o -type f -o -type l \) -print)

real 1.26
user 0.21
sys 0.32

And my /mnt has a Windows NTFS partition mounted too ;)

@Spider.007: but what would you want to list directories for? (As I said, only the empty directories might be useful to know about; otherwise it's just duplication.)

PS: don't take my comments as offense or criticism. I just want to discuss what would be best, as I'd like to run this "lost-packages" test as well as possible in the future :)

Offline

#20 2008-03-30 14:11:26

bender02
Member
From: Germany
Registered: 2007-02-04
Posts: 1,328

Re: Find files that are not under pacmans control

dwi wrote:

Many distros simply download the archives at install time and extract them. I'm guessing that because the final resulting files are not actually 'in' the pkg, pacman is not able to detect them and thus cannot generate a file list for the ttf-ms-fonts package.

edit: It is worth noting that the install script for the package *does* generate a text file, msfonts.txt, which is used to remove the files during the package's removal.

Yea, you're right. I just checked the ttf-ms-fonts PKGBUILD and install script myself...
I should've done it before I made a fool of myself :)

Offline

#21 2008-03-30 16:36:45

sabooky
Member
Registered: 2006-11-02
Posts: 89

Re: Find files that are not under pacmans control

Spider.007 wrote:

Well, I think the current script handles this just fine: it doesn't perform a find /, but starts from a selected list of directories which does not include /mnt. Therefore the speed difference between using prune and filtering with sed afterwards will be smaller. I do like the idea of performing an extra check using [[ -e || -L ]] though.

Good point.

ibendiben: Guess my hd is extra cluttered or my hd is slow :\. Mine was after a few runs too, first run takes a few minutes. Here's how many files I have.

# find /|wc -l; find /mnt|wc -l
575592
229301

Back to the point, excellent script Spider.007

about fonts...

$ pacman -Ql ttf-ms-fonts|grep /tmp
ttf-ms-fonts /tmp/
ttf-ms-fonts /tmp/ttf-ms-fonts/
ttf-ms-fonts /tmp/ttf-ms-fonts/andale32.exe
ttf-ms-fonts /tmp/ttf-ms-fonts/arial32.exe
ttf-ms-fonts /tmp/ttf-ms-fonts/arialb32.exe
ttf-ms-fonts /tmp/ttf-ms-fonts/comic32.exe
ttf-ms-fonts /tmp/ttf-ms-fonts/courie32.exe
ttf-ms-fonts /tmp/ttf-ms-fonts/georgi32.exe
ttf-ms-fonts /tmp/ttf-ms-fonts/impact32.exe
ttf-ms-fonts /tmp/ttf-ms-fonts/times32.exe
ttf-ms-fonts /tmp/ttf-ms-fonts/trebuc32.exe
ttf-ms-fonts /tmp/ttf-ms-fonts/verdan32.exe
ttf-ms-fonts /tmp/ttf-ms-fonts/wd97vwr32.exe
ttf-ms-fonts /tmp/ttf-ms-fonts/webdin32.exe

umm.. "/tmp"?! Couldn't they just extract the files in the PKGBUILD instead of the .install? wouldn't that fix the problem? (I'm assuming there's a reason I'm missing for why it's done the way it's done.)

Last edited by sabooky (2008-03-30 16:40:46)

Offline

#22 2008-03-30 17:24:26

dwi
Member
From: McKinney,TX
Registered: 2008-01-27
Posts: 27
Website

Re: Find files that are not under pacmans control

sabooky wrote:

Couldn't they just extract the files in the PKGBUILD instead of the .install? wouldn't that fix the problem? (I'm assuming there's a reason I'm missing for why it's done the way it's done.)

Ultimately it comes down to honoring the EULA. I don't remember the exact verbiage, but IIRC you could not redistribute the fontpack unless the files were redistributed in the original form. Since that was a self extracting cab file, it somewhat limits things.

Offline

#23 2008-04-13 20:09:19

.:B:.
Forum Fellow
Registered: 2006-11-26
Posts: 5,819

Re: Find files that are not under pacmans control

Any update on this? The link to the script is dead :)



Offline

#24 2008-04-13 20:16:23

bender02
Member
From: Germany
Registered: 2007-02-04
Posts: 1,328

Re: Find files that are not under pacmans control

I posted the original script to http://rafb.net/p/MmYfSS38.html (without the greyfilter simplifications; I don't have that version).

Offline

#25 2008-04-13 21:09:40

ibendiben
Member
Registered: 2007-10-10
Posts: 519
Website

Re: Find files that are not under pacmans control

Here is the updated version:

#!/bin/bash
# Utility to generate a list of all files that are not part of a package
# Author: Spider.007 / Sjon

TMPDIR=`mktemp -d`
FILTER=$(sed '1,/^## FILTERED/d' $0 | tr '\n' '|')
FILTER=${FILTER%|}

cd $TMPDIR
find /bin /boot /etc /lib /opt /sbin /usr /var | sort -u > full
pacman -Ql | tee owned_full | cut -d' ' -f2- | sed 's/\/$//' | sort -u > owned

grep -Ev "^($FILTER)" owned > owned- && mv owned- owned

echo -e '\033[1mOwned, but not found:\033[0m'
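comm -13 full owned | while IFS= read -r entry
do
    # -F treats the path as a fixed string instead of a regular expression;
    # the quotes keep paths with spaces or glob characters in one piece
    echo "[$(grep -F --max-count=1 -- "$entry" owned_full | cut -d' ' -f1)] $entry"
done | sort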

grep -Ev "^($FILTER)" full > full- && mv full- full

echo -e '\n\033[1mFound, but not owned:\033[0m'
comm -23 full owned

cd /tmp/ && rm -R $TMPDIR

exit 0

## FILTERED FILES / PATHS ##
/boot/grub
/dev
/etc/X11/xdm/authdir
/home
/media
/mnt
/proc
/root
/srv
/sys
/tmp
/var/abs
/var/cache
/var/games
/var/log
/var/lib/pacman
/var/lib/mysql
/var/run
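
The script keeps its filter list inside its own file: everything after the "## FILTERED" marker is data, and the sed call deletes all lines up to and including that marker. A miniature of the same idea, as a sketch only (mini.sh and the sample paths are hypothetical, not part of the script above):

```shell
work=$(mktemp -d)

# Hypothetical miniature of the script: code first, data after the marker line
cat > "$work/mini.sh" <<'EOF'
#!/bin/sh
# Delete everything from line 1 through the marker, then join the rest with '|'
FILTER=$(sed '1,/^## FILTERED/d' "$0" | tr '\n' '|')
FILTER=${FILTER%|}
printf '%s\n' /home/me/x /etc/fstab /tmp/y | grep -Ev "^($FILTER)"
exit 0

## FILTERED FILES / PATHS ##
/home
/tmp
EOF

kept=$(sh "$work/mini.sh")
echo "$kept"

rm -r "$work"
```

The exit 0 before the marker is what keeps the shell from trying to execute the path list as commands. Since the entries end up inside a regular expression, any regex metacharacters in a path would be interpreted as such.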

Offline
