
#1 2007-12-13 18:45:25

gunnihinn
Member
From: Torreón, Mexico
Registered: 2007-10-28
Posts: 81

Replacing everything except regexp with sed

I've had this problem pop up a couple of times in the past week. I've got a string, and I only want one little bit of it that can be expressed exactly with a regular expression (i.e. sed "s/{regexp}//g" removes only the part matching the regexp). How can I delete everything from the string except the part matching the regexp, using sed (or awk, etc., if it's more convenient)?


#2 2007-12-13 18:55:09

Cerebral
Forum Fellow
From: Waterloo, ON, CA
Registered: 2005-04-08
Posts: 3,108

Re: Replacing everything except regexp with sed

If you are guaranteed to only have one instance of the regexp per line, then you can do:

sed -e "/{regexp}/ s/.*\({regexp}\).*/\1/" -e '/{regexp}/ !{ D }'

Still trying to work out what you'd do with multiple matches of the regexp on a single line.
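
For a concrete feel of the single-match case, here is a minimal sketch with a made-up pattern (`id=[0-9]*`), made-up input, and a hypothetical function name. It uses `sed -n` with the `p` flag instead of the second `-e` expression, on the assumption that the goal is simply to print the matched part of the lines that match; it is not the command above verbatim.

```shell
# Hypothetical helper: keep only the part of each line matching id=[0-9]*.
# -n suppresses sed's default output; the p flag prints a line only when
# the substitution succeeded, so lines without a match are dropped.
extract_id() {
    sed -n 's/.*\(id=[0-9]*\).*/\1/p'
}

# Made-up sample input: one line with a match, one without.
printf 'user=bob id=42 shell=bash\nno match here\n' | extract_id
```

Because the leading `.*` is greedy, a line containing several matches would yield only the last one, and a pattern that can match a shorter string further right (a bare `[0-9]*`, say) may come out truncated.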

Last edited by Cerebral (2007-12-13 18:57:21)


#3 2007-12-13 19:32:19

test1000
Member
Registered: 2005-04-03
Posts: 834

Re: Replacing everything except regexp with sed

sed is arcane. it's probably faster to do it manually :P


KISS = "It can scarcely be denied that the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience." - Albert Einstein


#4 2007-12-13 21:52:44

shining
Pacman Developer
Registered: 2006-05-10
Posts: 2,043

Re: Replacing everything except regexp with sed

one stupid question : what's wrong with grep and maybe its -o, --only-matching option?
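
A minimal sketch of what that looks like, assuming a grep that supports `-o` (GNU grep does). The pattern `ab*c`, the sample input, and the function name are made up for illustration.

```shell
# Hypothetical helper: print every match of ab*c, one match per line.
# -o (--only-matching) prints only the matched parts of matching lines.
only_matches() {
    grep -o 'ab*c'
}

# Made-up sample input: two matches on the first line, none on the second.
printf 'xx abbc yy ac zz\nnothing\n' | only_matches
```

Lines without a match produce no output at all, and several matches on one line each get their own output line.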


pacman roulette : pacman -S $(pacman -Slq | LANG=C sort -R | head -n $((RANDOM % 10)))


#5 2007-12-13 22:39:16

gunnihinn
Member
From: Torreón, Mexico
Registered: 2007-10-28
Posts: 81

Re: Replacing everything except regexp with sed

Cerebral: Thanks, I'll try that tomorrow (have had a bit too much beer tonight, don't drink and bash :D)

test1000: Wish I could, but this problem's been appearing in relation to selecting stuff for conky to show, renaming many similarly named files, and so on. Basically the sort of stuff one doesn't like to do manually.

shining: As far as I know, grep only works on entire lines. If the string you're working with is only one line, or you only want to print a substring of one of the strings in the entire set, you've got to resort to other measures.

[edit] After a very quick violation of the don't-drink-and-bash instructions above, I've seen that shining's suggestion works perfectly. I did not know grep could do that. I'll try out Cerebral's suggestion tomorrow, but thanks guys. 8-)

Last edited by gunnihinn (2007-12-13 22:43:25)


#6 2007-12-17 13:59:21

test1000
Member
Registered: 2005-04-03
Posts: 834

Re: Replacing everything except regexp with sed

you can also use awk. it's more straight-forward.
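
One way this might look in awk, as a sketch only: the function name and sample input are made up, and the loop relies on the standard `match()` function, which sets `RSTART` and `RLENGTH`.

```shell
# Hypothetical helper: print every match of the regex given as $1,
# one per line, in the order the matches appear.
awk_matches() {
    awk -v re="$1" '{
        line = $0
        # match() returns the start position of the leftmost match (0 if none)
        while (match(line, re)) {
            if (RLENGTH == 0) break          # guard against empty matches looping forever
            print substr(line, RSTART, RLENGTH)
            line = substr(line, RSTART + RLENGTH)   # continue after this match
        }
    }'
}

# Made-up sample input with two matches of ab*c.
printf 'xx abbc yy ac zz\n' | awk_matches 'ab*c'
```

Since each pass re-runs the regex on the remainder of the line, an anchored pattern such as `^foo` can match again at the start of the remainder, so this sketch is best kept to unanchored patterns.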



#7 2007-12-17 14:57:20

Gilneas
Member
From: Netherlands
Registered: 2006-10-22
Posts: 320

Re: Replacing everything except regexp with sed

A sed script that processes the input character by character avoids the greedy regexp inconvenience.

sed -ne ':beg;s#{REGEXP}##;t sub;s#.##;t beg;b;:sub;a {REGEXP}
b beg'

('a' needs a newline, I think)
The output should be like grep -o (every match on a separate line).

I tried to use 'x;s#$#{REGEXP}#;x' and at the end of script an 'x;p' which a failed s#.## jumps to,

sed -ne ':beg;s#e##;t sub;s#.##;t beg;b end;:sub;x;s#$#match#;x;b beg;:end;x;p'

(looks for 'e' and prints 'match' for each (more convenient to try out))
Somehow this goes in an infinite loop as soon as you type the 'e', and I can't figure out why.
This would be a much nicer solution, I never really liked the 'a' and 'i' commands in sed.
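
One likely explanation for the loop: `t` branches if any substitution has succeeded since the last input line was read or the last `t` branch was taken, and a plain `b` does not clear that flag. After `s#$#match#` succeeds, `b beg` jumps back with the flag still set, so the next `t sub` fires even though `s#e##` found nothing. A sketch of one possible fix, replacing `b beg` with `t beg` so the flag gets consumed (written with newlines instead of semicolons; the function name is made up):

```shell
# Hypothetical helper: same toy script as above (looks for 'e', collects
# 'match' in the hold space), with 'b beg' after the hold-space edit
# replaced by 't beg'.
count_e_as_match() {
    sed -n '
        :beg
        s#e##
        t sub
        s#.##
        t beg
        b end
        :sub
        x
        s#$#match#
        x
        # s#$#match# always succeeds, so this t always branches,
        # and taking it resets the substitution flag
        t beg
        :end
        x
        p
    '
}

# Made-up sample input containing three e characters.
printf 'hello eel\n' | count_e_as_match
```

The matches come out concatenated on a single line here, as in the toy script above; getting one per line would need a newline added to each appended piece.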

Last edited by Gilneas (2007-12-17 14:58:27)


#8 2007-12-17 15:17:09

Cerebral
Forum Fellow
From: Waterloo, ON, CA
Registered: 2005-04-08
Posts: 3,108

Re: Replacing everything except regexp with sed

Gilneas wrote:

A sed script that processes the input character by character avoids the greedy regexp inconvenience.

sed -ne ':beg;s#{REGEXP}##;t sub;s#.##;t beg;b;:sub;a {REGEXP}
b beg'

Forgive me if I'm wrong, but that won't work if your regular expression is, in fact, a regular expression. For example, if you're searching for a*bc (i.e. any number of a's, followed by a 'b' and a 'c') and you match aaaaaaaaabc, how are you going to re-insert that with the a command? The a command only appends fixed text, not whatever the regexp happened to match.


#9 2007-12-17 15:31:13

Gilneas
Member
From: Netherlands
Registered: 2006-10-22
Posts: 320

Re: Replacing everything except regexp with sed

Ah right, then you'd need another hold space to make this work. (for backing up the string you were working on). So that's a no-go, it seems.


#10 2007-12-17 16:00:33

Cerebral
Forum Fellow
From: Waterloo, ON, CA
Registered: 2005-04-08
Posts: 3,108

Re: Replacing everything except regexp with sed

I'd just stick with the grep command supplied above.  Didn't know it had that feature - it's pretty slick, that grep.


#11 2007-12-17 16:07:40

Gilneas
Member
From: Netherlands
Registered: 2006-10-22
Posts: 320

Re: Replacing everything except regexp with sed

One more I came up with: it moves each match of the regexp to the back of the pattern space, and ends when the pattern space consists only of matches ('regexp*').

sed -ne ':beg;s#^\(ab*c\)\(.*\)$#\2\1#;t sub;:bad s#.##;t beg;b;:sub;s#^\(ab*c\)*$#&#p;T bad'

It probably won't work for all regexps, and it doesn't preserve order.

Edit: It should work a bit nicer with s#^\(ab*c\)\(.*\)$#\2\1# instead of s#\(ab*c\)\(.*\)$#\2\1#.

Last edited by Gilneas (2007-12-18 14:36:02)
