You are not logged in.
Hi,
(i hope this is the right sub forum)
i have a lot of plain texts like this one (just longer and with dozens of footnotes:
Lorem ipsum dolor sit amet, consectetur adipiscing elit [01].
Footnotes:
[01] blah blah
Can anyone point me in the right direction, so i can change them to:
Lorem ipsum dolor sit amet, consectetur adipiscing elit \footnote{blah bla}.
I'd prefer a bash script, Perl or python, but anything else is appreciated as well. Just don't want to do it manually.
thank you,
rich_o
Offline
As a first approach i came up with this:
#!/usr/bin/env python
# TODO: allow different footnote numbering schemes, i.e. (01) or [ 001 ]
# TODO: allow different footnote markups, i.e <fn></fn>
# TODO: error-/exceptions-handling
import re
f = open('testfile.txt', 'r')
text = f.readlines()
f.close()
# first footnote entry at end of text
index = next((i for i in range(len(text) - 1, -1, -1) if text[i].startswith('[01]')), None)
# get all footnotes
footnotes = dict()
lastfn = ''
for i in text[index:]:
try:
results = re.split('\[(\d{1,3})]', i.strip())
footnotes[results[1]] = results[2].strip()
lastfn=results[1]
except: # catch footnotes covering more than one line
# FIXME: exception is very unreliable!
footnotes[lastfn] = ' '.join([footnotes[lastfn], i.strip()])
newtext = ''.join(text[:index])
# replace
for i in footnotes:
p = ''.join(['\[', i, '\]'])
pat = re.compile(p)
new = ''.join(['\{', footnotes[i], '}'])
newtext = re.sub(pat, new, newtext)
print(newtext)
It's working, but needs some work to be reliable under other circumstances
Last edited by rich_o (2010-12-09 16:33:58)
Offline