This script has been working well for me, so I thought I'd share it. It is modeled on the rm interface and uses only standard Linux commands. It includes a simulation mode, reference-only folders, a trash mode, size limits, and the ability to use a custom rm command. You can use it to remove duplicates from a group of folders or just to search for them. It also does a full byte-for-byte comparison rather than relying on a checksum, to avoid false matches.
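To illustrate what a byte-for-byte comparison with standard commands can look like, here is a minimal sketch. It is not taken from rmdupe itself; the function name same_content is made up for this example, and the size pre-check is my own assumption about a sensible shortcut.

```shell
#!/bin/sh
# same_content FILE1 FILE2 (hypothetical helper, not part of rmdupe):
# succeeds only if both files contain exactly the same bytes.
same_content() {
    # Cheap pre-check: files of different sizes cannot be duplicates.
    size1=$(wc -c < "$1") || return 2
    size2=$(wc -c < "$2") || return 2
    [ "$size1" -eq "$size2" ] || return 1
    # cmp -s prints nothing and exits 0 only when every byte matches.
    cmp -s -- "$1" "$2"
}
```

Called as `same_content a.jpg b.jpg && echo duplicate`, it reports a match only when cmp finds no differing byte, so two different files can never be mistaken for each other the way they could with a checksum collision.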
I realize there are other solutions for this; I just wanted to write my own that was command-line only and had the features I wanted. You can read about it and download it here... comments welcome.
For programmers: you may notice that it unnecessarily compares each pair of files twice, so it's not as efficient as it could be. It just needs a little code to make it smarter in that department. The good news is that if files change while it's running, it works on the new files, not a cached version. At some point I may improve the code, but it seems stable enough that I'm sharing it. Use the simulation mode first if you're concerned.
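For anyone curious how the double comparison could be avoided, one possible approach is to compare each file only against the files that come after it in the list. The sketch below is an assumption about how that might be done, not the script's actual code; find_dupe_pairs is a hypothetical name.

```shell
#!/bin/sh
# find_dupe_pairs FILE... (hypothetical helper, not part of rmdupe):
# prints "A<TAB>B" once for every pair of byte-identical files.
find_dupe_pairs() {
    while [ "$#" -gt 1 ]; do
        first=$1
        shift    # the remaining arguments are the files listed after $first
        for other in "$@"; do
            # cmp reads both files from disk on every call, so nothing is
            # cached and files changed mid-run are compared as they are now.
            if cmp -s -- "$first" "$other"; then
                printf '%s\t%s\n' "$first" "$other"
            fi
        done
    done
}
```

Each unordered pair is visited exactly once, which halves the number of cmp calls compared with checking every file against every other file in both directions.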
Usage: rmdupe [OPTIONS] FOLDER [...]
Removes duplicate files in specified folders.  By default, newest duplicates
are removed.
Options:
  -R, -r            search specified folders recursively
  --ref FOLDER      also search FOLDER recursively for copies but don't
                    remove any files from here (multiple --ref allowed)
                    Note: files may be removed from a ref folder if that
                    folder is also a specified folder
  --trash FOLDER    copy duplicate files to FOLDER instead of removing
  --sim             simulate and report duplicates only - no removal
  --quiet           minimize output (disabled if used with --sim)
  --verbose         detailed output
  --old             remove oldest duplicates instead of newest
  --minsize SIZE    limit search to duplicate files SIZE MB and larger
  --maxsize SIZE    limit search to duplicate files SIZE MB and smaller
  --rmcmd "RMCMD"   execute RMCMD instead of rm to remove copies
                    (may contain arguments, eg: "srm -ll")
  --xdev            don't descend to other filesystems when recursing
                    specified or ref folders
Notes: do not use wildcards; symlinks are not followed except on the
       command line; zero-length files are ignored
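As a rough illustration of the "newest duplicates are removed" default and the --old switch, here is a small sketch of how the choice between two identical files could be made by modification time. This is a guess at the logic, not code from rmdupe; pick_removal is a hypothetical name, and treating a tie in timestamps by picking the second file is my own assumption.

```shell
#!/bin/sh
# pick_removal A B [old] (hypothetical helper, not part of rmdupe):
# prints which of two duplicate files would be removed.
# Default: the newer file. With "old" as third argument: the older file.
# Assumption: if the timestamps are equal, the second file is chosen.
pick_removal() {
    if [ "${3:-}" = "old" ]; then
        # -ot is true when the first file has an older modification time.
        if [ "$1" -ot "$2" ]; then printf '%s\n' "$1"; else printf '%s\n' "$2"; fi
    else
        # -nt is true when the first file has a newer modification time.
        if [ "$1" -nt "$2" ]; then printf '%s\n' "$1"; else printf '%s\n' "$2"; fi
    fi
}
```

The -nt and -ot tests are not in POSIX but are supported by the test builtin in bash and dash, which covers the usual Linux shells.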