Hi,
I have the following problem:
My music files exist primarily in .flac (for home), .ogg (for my mp3 player) and .mp3 (for other people) form; however, I do not have every file in all three formats. I'm trying to find a way to put all that music in cmus' library without songs turning up twice because they exist in multiple formats, and without having to manually add the files that don't exist in all formats.
My idea was to write a script that in a special directory would create a symlink to each file, specifically to the .flac version, if that doesn't exist the .ogg version and if that doesn't exist the .mp3 version, then use that directory as a database.
Luckily the directory structure for all three filetypes is identical, .../<format>/Artist/and_so_on, so I don't have to bother with different files that might have identical names etc. My first idea was to iterate through the mp3 tree, duplicate the directory structure using symlinks without extensions, then do the same for the .ogg tree, then do the same for .flac, so that the "higher" files cause the "lower" symlinks to be overwritten.
That should work with some sort of "find ... -exec ln -s ..." command, but it seems needlessly inefficient, with thousands of symlinks needlessly being created and deleted.
Any ideas for a more elegant solution? (Would be nice if we stick to bash. My other-languages-fu is very weak.)
Last edited by lastchancetosee (2012-02-14 22:32:35)
My ship don't crash! She crashes, you crashed her!
I could be wrong, but this seems to be an excellent task for 'sort' with the unique flag.
You can 'find' or 'ls' *all* your music files and cut off the extensions, 'sort' with the unique flag, then you have a list of the files you want to create links to. Loop through that list and check whether each format of that file exists from least preferred to most preferred format and link if it exists.
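The idea above can be sketched roughly as follows. This is only a sketch under assumptions: `build_library`, `src` and `trg` are hypothetical names, the layout is the `<src>/<format>/Artist/...` one from the first post, and `src` should be an absolute path so the links resolve. As a small variation, it checks the formats from most to least preferred and stops at the first hit, so no link is ever created and then overwritten. It also keeps the target's extension on the link name.

```shell
#!/bin/bash
# Sketch: link each track once, preferring flac over ogg over mp3.
# build_library, src and trg are hypothetical names, not from the thread.
build_library() {
    local src=$1 trg=$2 stem fmt
    # List every file relative to its format tree, strip the extension,
    # and collapse duplicates so each track appears exactly once.
    for fmt in flac ogg mp3; do
        [ -d "$src/$fmt" ] || continue
        (cd "$src/$fmt" && find . -type f -name "*.$fmt" | sed "s/\.$fmt\$//")
    done | sort -u |
    while IFS= read -r stem; do
        stem=${stem#./}
        # Try formats from most to least preferred; link the first one found.
        for fmt in flac ogg mp3; do
            if [ -f "$src/$fmt/$stem.$fmt" ]; then
                mkdir -p "$trg/$(dirname "$stem")"
                ln -sfn "$src/$fmt/$stem.$fmt" "$trg/$stem.$fmt"
                break
            fi
        done
    done
}
```

It would be called as, for example, `build_library ~/music ~/library`. Because the list is read one line at a time, names containing spaces or quotes should be fine, but names containing newlines are not handled.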
"UNIX is simple and coherent" - Dennis Ritchie; "GNU's Not Unix" - Richard Stallman
In order to use sort -u you have to remove what makes each line different from one another, the extension and first directory component. But what you are left with is not very helpful since you are back to where you started without an idea of which tree and file extension you want to link to. The solution is to add the base directory/extension as a separate field and skip that field when you start sorting.

pritrees()
{
    for x in "$@"
    do
        find "$x" -type f -a -name "*.$x" |\
        awk -v ext="$x" '{ sub("[.]" ext "$", ""); sub("^" ext "/", "")
                           print ext, $0 }'
    done | sort -u -k2
}

cd ~/music
pritrees flac ogg mp3 | while read ext path
do
    ...
done
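
One possible way to consume the "ext path" lines printed by pritrees, purely as an illustration: `link_from_list`, `base` and `trg` are hypothetical names, and appending the extension to the link name is an assumption rather than something the post above prescribes.

```shell
#!/bin/bash
# Sketch: read "ext path" pairs on stdin and create one symlink per pair.
# link_from_list, base and trg are hypothetical names.
link_from_list() {
    local base=$1 trg=$2 ext path
    # The first word is the format; the rest of the line (spaces included)
    # is the path relative to that format's tree, without extension.
    while read -r ext path; do
        mkdir -p "$trg/$(dirname "$path")"
        ln -sfn "$base/$ext/$path.$ext" "$trg/$path.$ext"
    done
}
```

With the function above it might be used as `pritrees flac ogg mp3 | link_from_list ~/music ~/library`, run from inside ~/music as in the original snippet.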
There is always a more fun! way to do it, piping everything to a multitude of glue tools, invoking, of course, sed

# test case setup
formats=(flac mp3 wma)
src=srcdir
for f in ${formats[@]}
do
    mkdir -p $src/$f/{a,b,c}/{d,e,f}
    touch $src/$f/{a,b,c}/{d,e,f}/file-{g,h,i}.$f
done
find $src -type f | sort -R | head -n 30 | xargs rm

formats=(flac mp3 wma)
src=srcdir
trg=syms
(for f in ${formats[@]}; do
    find $src -type f -name \*.$f|xargs -I% -n1 echo % %
done) | sed "s,$src/\w*,$trg,;s,\.[^/]\w*,," | sort -k 1,1 -u | \
xargs -l1 sh -c 'mkdir -p $(dirname $1); ln -s $2 $1' -

find syms -type l | xargs ls -l

I need a sorted list of all random numbers, so that I can retrieve a suitable one later with a binary search instead of having to iterate through the generation process every time.
Hey, sorry for not answering, and thanks for your help. I really should have thought of awk/sed myself ...
Anyway, after spending quite a bit of time figuring out what all your special characters are doing, it's working flawlessly now.
I did some more playing around with tvlb's beautiful wall of special characters:
For others trying to use this: tvlb's solution doesn't play nicely with paths containing quotes or spaces. To fix your filenames automatically, do

# replace whitespace with _ in paths and filenames
find . -depth -name '* *' | while IFS= read -r f ; do mv -i "$f" "$(dirname "$f")/$(basename "$f"|tr ' ' _)" ; done
# replace double quotes
find . -depth -name '*"*' | while IFS= read -r f ; do mv -i "$f" "$(dirname "$f")/$(basename "$f"|tr '"' _)" ; done
# replace single quotes (notice the change in quoting for the find and tr command)
find . -depth -name "*'*" | while IFS= read -r f ; do mv -i "$f" "$(dirname "$f")/$(basename "$f"|tr "'" _)" ; done

There is also a problem with files containing more than one '.': the sed command that strips the extension from the filename will match the first dot it encounters. If you accept that the nice configurability via formats=(...) is lost, you can fix that by changing

sed "s,$src/\w*,$trg,;s,\.[^/]\w*,,"

to

sed -r "s,$src/\w*,$trg,;s,\.(flac|ogg|mp3),," # '-r' is needed for the (a|b)-construct to work

Oh, and by the way: while the symlinking worked flawlessly, I had to use a different player. cmus derives all relevant info from the file extension, so files without the extension won't be played. And simply adding an arbitrary extension (e.g. .mp3) means that FLAC files get pushed into the mp3 decoder etc., with hilarious sound effects. I felt like a Windows user :-) .
Long story short: Here's a fix for that:
We change the linking syntax so that the extension is taken from the link target and appended to the link name:

ln -s "$2" "$1.${2##*.}"

So here's the final code:

#!/bin/bash
formats=(flac ogg mp3)
src=<source directory>
trg=<target directory>
(for f in ${formats[@]}; do
    find $src -type f -name \*.$f | xargs -I% -n1 echo "% %"
done) | sed -r "s,$src/\w*,$trg,;s,\.(flac|ogg|mp3),," | sort -k 1,1 -u | \
xargs -l1 sh -c 'mkdir -p "$(dirname "$1")"; ln -s "$2" "$1.${2##*.}"' -

I hope someone will find this useful ...
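
For reference, the two filename details discussed in this post (stripping only the last extension, and reusing the target's extension for the link name) both come down to bash parameter expansion. A minimal sketch, where `split_name`, `stem` and `ext` are hypothetical names used only for illustration:

```shell
#!/bin/bash
# Sketch: split a path at its LAST dot using parameter expansion.
# split_name, stem and ext are hypothetical names.
split_name() {
    local file=$1
    stem=${file%.*}    # shortest match of ".*" removed from the end
    ext=${file##*.}    # longest match of "*." removed from the start
}

split_name "Artist/01. Intro feat. Someone.flac"
echo "$stem"   # Artist/01. Intro feat. Someone
echo "$ext"    # flac
```

This only behaves sensibly when the file name really has an extension, which holds for anything found with -name "*.$f".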
Last edited by lastchancetosee (2012-03-04 18:44:16)