#1 2007-06-01 04:14:39

Phrodo_00
Member
From: Seattle, WA
Registered: 2006-04-09
Posts: 342

[kinda ot] help with sed or awk, whatever suits you better

Hi, I've been reading a lot and trying, but I still can't get it to work. What I want is to delete everything in a file that's formatted like

<td class="j">something</td>

and replace every

<td class="e">another thing</td>

with

<p>another thing</p>

The closest I got to the former was with

sed '/<td class="j">/,/<\/td>/d'

, but it looks like the match was too greedy, as it deleted far more than it should have.
Thank you.
(Of course, a Python or Ruby script, or anything else that does this, is just as welcome as sed or awk. I don't care which tool I use, I just want to get this done.)

#2 2007-06-01 06:37:07

samlt
Member
Registered: 2007-05-20
Posts: 18

Re: [kinda ot] help with sed or awk, whatever suits you better

Well, since we don't know whether 'something' or 'another thing' can contain other (similar) tags, or span multiple lines, the easiest way is to replace <td class="."> with <p> and </td> with </p> one tag at a time, rather than matching the whole <td class="...">blahblah</td> element at once:

sed -e 's#<td class=".">#<p>#g' -e 's#</td>#</p>#g'

tada!


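EDIT: btw, sed '/<td class="j">/,/<\/td>/d' doesn't work because it deletes a whole range of lines: the range starts at the first line containing <td class="j">, and sed only looks for </td> on the lines after that one. If the opening and closing tags sit on the same line, the range runs on to the next line containing </td>.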
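
To make the range behaviour concrete, here is a rough Python sketch that mimics what sed does with '/start/,/end/d'. The function name sed_range_delete and the three sample lines are made up for illustration, and it only covers this one form of address range, not sed in general.

```python
import re


def sed_range_delete(lines, start_pat, end_pat):
    """Mimic sed '/start_pat/,/end_pat/d' on a list of lines (illustrative only)."""
    kept = []
    in_range = False
    for line in lines:
        if in_range:
            # Inside a range every line is deleted; the range closes on
            # the first line that matches the end pattern.
            if re.search(end_pat, line):
                in_range = False
            continue
        if re.search(start_pat, line):
            # The start line is deleted too, and the end pattern is only
            # checked on the lines that come after it.
            in_range = True
            continue
        kept.append(line)
    return kept


# Hypothetical input with one cell per line.
sample = [
    '<td class="j">something</td>',
    '<td class="e">another thing</td>',
    '<td class="e">keep me</td>',
]

print(sed_range_delete(sample, r'<td class="j">', r'</td>'))
# → ['<td class="e">keep me</td>']
```

The "another thing" cell disappears along with the "j" cell, which matches the over-deletion described in the first post.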

Hope that's clear enough?

Last edited by samlt (2007-06-01 06:39:05)

#3 2007-06-01 06:51:45

gradgrind
Member
From: Germany
Registered: 2005-10-06
Posts: 921

Re: [kinda ot] help with sed or awk, whatever suits you better

Here's a Python version, but if these tags can be nested, you might need to use an XML parser!

#!/usr/bin/env python

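import re
import sys

# Non-greedy matches; re.DOTALL lets "." cross line breaks inside a cell.
r1 = re.compile(r'<td class="j">.*?</td>', re.DOTALL)
r2 = re.compile(r'<td class="e">(.*?)</td>', re.DOTALL)

text = sys.stdin.read()
# Delete the "j" cells first, then turn the "e" cells into paragraphs.
sys.stdout.write(r2.sub(r"<p>\1</p>", r1.sub("", text)))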

You can pipe your file through it, e.g. "cat myfile | python filter.py > mynewfile" (assuming you saved the script as filter.py).
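
For the nested case, one possible sketch uses the standard library's html.parser (Python 3) instead of a strict XML parser, on the assumption that the input is an HTML fragment that may not be well-formed XML. The names CellFilter and filter_cells are invented here. It tracks every open <td>, so a table nested inside a "j" cell is dropped along with it. Comments and doctype declarations are not copied through, and end tags are written back in lowercase.

```python
from html.parser import HTMLParser


class CellFilter(HTMLParser):
    """Drop <td class="j"> cells and turn <td class="e"> cells into <p>."""

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []         # pieces of the filtered document
        self.open_cells = []  # class of every <td> that is currently open
        self.skipping = 0     # number of open class="j" cells around us

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            cls = dict(attrs).get("class")
            self.open_cells.append(cls)
            if cls == "j":
                self.skipping += 1
                return
            if cls == "e" and not self.skipping:
                self.out.append("<p>")
                return
        if not self.skipping:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        if not self.skipping:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "td" and self.open_cells:
            cls = self.open_cells.pop()
            if cls == "j":
                self.skipping -= 1
                return
            if cls == "e" and not self.skipping:
                self.out.append("</p>")
                return
        if not self.skipping:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if not self.skipping:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skipping:
            self.out.append("&%s;" % name)

    def handle_charref(self, name):
        if not self.skipping:
            self.out.append("&#%s;" % name)


def filter_cells(text):
    """Return text with the "j" cells removed and the "e" cells as paragraphs."""
    parser = CellFilter()
    parser.feed(text)
    parser.close()
    return "".join(parser.out)
```

To use it as a filter like the script above, read sys.stdin, pass the text to filter_cells, and write the result to sys.stdout.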
