Dear Arch community,
I would like to rename PDFs that have cryptic names to their titles, which appear on the first line of each file's text. My attempt so far:
pdftotext sw-b-13-0094.pdf
head -n 1 sw-b-13-0094.txt > title
mv sw-b-13-0094.pdf $(title)
But I can't figure out how to do this exactly.
You're just about there. The problem is that your second line creates a file called title, but $(title) on the third line tries to run a command named title and substitute its output, rather than read that file. You could add "cat" to the third line as follows. This should work, but it would not be my recommended approach:
pdftotext sw-b-13-0094.pdf
head -n 1 sw-b-13-0094.txt > title
mv sw-b-13-0094.pdf "$(cat title).pdf"
Instead, it'd be much cleaner to just use a shell variable:
pdftotext sw-b-13-0094.pdf
title=$(head -n 1 sw-b-13-0094.txt)
mv sw-b-13-0094.pdf "$title.pdf"
This can be improved further by not littering text files everywhere. Use a pipeline instead of creating a .txt file:
mv sw-b-13-0094.pdf "$(pdftotext sw-b-13-0094.pdf - | head -n 1).pdf"
Hopefully it is now clear how you could replace the hard-coded PDF filename (sw-b-13...) with a parameter ($1) in a script or shell function, or loop over all PDF files in a directory. If you want help with that too, let us know.
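For anyone who wants the parameterised and looped version hinted at above, here is a minimal sketch in bash. The function names rename_pdf and rename_all_pdfs are made up for illustration, and the sketch assumes the first line of the pdftotext output really is the title. It also assumes an mv that supports -n (no overwrite), as GNU mv does, so that two PDFs with the same title do not clobber each other.

```shell
#!/usr/bin/env bash
# rename_pdf and rename_all_pdfs are hypothetical helper names (not from the thread).

# Rename one PDF to "<first line of its extracted text>.pdf", in the same directory.
rename_pdf() {
    local doc=$1 title dir
    # The lone "-" makes pdftotext write to standard output instead of a .txt file.
    title=$(pdftotext "$doc" - | head -n 1)
    # Leave the file alone if no title text was extracted.
    if [ -z "$title" ]; then
        printf 'no title found in %s\n' "$doc" >&2
        return 1
    fi
    dir=$(dirname -- "$doc")
    # -n: never overwrite an existing file; --: stop option parsing.
    mv -n -- "$doc" "$dir/$title.pdf"
}

# Apply rename_pdf to every PDF in the current directory.
rename_all_pdfs() {
    local doc
    for doc in ./*.pdf; do
        [ -e "$doc" ] || continue   # the glob matched nothing
        rename_pdf "$doc"
    done
}
```

After sourcing these definitions, run rename_pdf sw-b-13-0094.pdf for a single file, or rename_all_pdfs from the directory that holds the PDFs.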
EDIT: be careful to ensure that the first line of the pdftotext output actually contains something meaningful. If all the PDFs were created in the same way, this might be known. But it is common for the first line to contain only whitespace or a formatting character.
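One possible way to guard against an empty first line is to take the first line that contains any non-whitespace character, rather than blindly taking line 1. This is only a sketch: first_text_line is a hypothetical helper name, and it assumes a grep that supports -m (GNU and BSD grep both do). It does not help if the first real line of text is something other than the title.

```shell
# first_text_line is a hypothetical helper name (not from the thread).
# Read text on standard input and print the first line that contains a
# non-whitespace character, with leading and trailing whitespace trimmed.
first_text_line() {
    # [[:space:]] also covers the form feed that pdftotext emits at page breaks.
    grep -m 1 '[^[:space:]]' | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
}
```

With doc holding the path of a PDF, it would be used as title=$(pdftotext "$doc" - | first_text_line).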
"UNIX is simple and coherent" - Dennis Ritchie; "GNU's Not Unix" - Richard Stallman
Personally, I think it is clearer to split that into two lines:
title=$(pdftotext "$doc" - | head -n 1)
mv "$doc" "$title".pdf
BTW, you need that lone "-" argument to make pdftotext write to standard output instead of a .txt file.
Works great with a bash loop. Thanks a lot!
Sometimes the title is too long and spills onto a second line. Could you help me solve this too?
OP, you need to be a little clearer about your requirements. How would you decide when to read two lines instead of one? Anyhow, to grab the first two lines and join them with a single space as the title, one approach could be:
title=$(pdftotext "$doc" - | head -n 2 | paste -d' ' -s)
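If the number of title lines varies between batches of PDFs, the same head and paste pipeline can be wrapped so the line count is an argument. This is just a sketch: join_head_lines is a hypothetical helper name, and deciding how many lines a given PDF needs is still up to you.

```shell
# join_head_lines is a hypothetical helper name (not from the thread).
# Read text on standard input, keep the first N lines (default 1),
# and join them into a single line separated by single spaces.
join_head_lines() {
    local n=${1:-1}
    # -s joins all input lines into one; the explicit "-" means standard input.
    head -n "$n" | paste -s -d ' ' -
}
```

For a two-line title this would be used as title=$(pdftotext "$doc" - | join_head_lines 2).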
Last edited by bulletmark (2014-12-11 01:20:00)