I've had this problem pop up a couple of times in the past week. I've got a string, and I only want one little bit of it, which can be expressed exactly with a regular expression (i.e. sed "s/{regexp}//g" removes precisely the part I want to keep). How can I delete everything from the string except the part matching the regexp, using sed (or awk, etc. if it's more convenient)?
If you are guaranteed to only have one instance of the regexp per line, then you can do:
sed -e "/{regexp}/ s/.*\({regexp}\).*/\1/" -e '/{regexp}/ !{ D }'
Still trying to work out what you'd do with multiple matches of the regexp on a single line.
Last edited by Cerebral (2007-12-13 18:57:21)
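For what it's worth, here is a minimal sketch of the one-match-per-line approach above, condensed into a single substitution with -n and the p flag. The function name extract_one is made up for illustration, and the sketch assumes the regexp is a basic regular expression containing no '/'. One caveat it demonstrates: the leading .* is greedy, so a variable-length regexp can come back truncated.

```shell
#!/bin/sh
# extract_one is a hypothetical helper, not something from the thread.
# $1 is a basic regular expression (BRE) that must not contain '/'.
# Lines without a match print nothing because of -n and the p flag.
extract_one() {
    sed -n "s/.*\($1\).*/\1/p"
}

# Fixed-width pattern: the whole match survives.
printf 'foo 123 bar\nno digits here\n' | extract_one '[0-9]\{3\}'   # prints 123

# Variable-width pattern: the greedy leading .* swallows "foo 12",
# so only the last digit is left for the group.
printf 'foo 123 bar\n' | extract_one '[0-9][0-9]*'                  # prints 3
```

If the regexp can vary in length, the grep and awk suggestions further down the thread avoid this truncation.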
sed is arcane. it's probably faster to do it manually
KISS = "It can scarcely be denied that the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience." - Albert Einstein
one stupid question : what's wrong with grep and maybe its -o, --only-matching option?
pacman roulette : pacman -S $(pacman -Slq | LANG=C sort -R | head -n $((RANDOM % 10)))
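A quick sketch of the grep -o suggestion above, for anyone who wants to try it. The sample input and the helper name only_matches are invented for illustration. Note that -o is available in GNU and BSD grep but is not part of POSIX grep.

```shell
#!/bin/sh
# only_matches is a hypothetical wrapper around grep -o.
# $1 is a basic regular expression; each match is printed on its own line,
# including several matches from the same input line.
only_matches() {
    grep -o -e "$1"
}

# Two matches on the first line, none on the second, one on the third.
printf 'id=42 and id=7\nnothing here\nid=1000\n' | only_matches 'id=[0-9]*'
```

Lines with no match produce no output at all, so the result is just the list of matched substrings in input order.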
Cerebral: Thanks, I'll try that tomorrow (have had a bit too much beer tonight, don't drink and bash).
test1000: Wish I could, but this problem's been appearing in relation to selecting stuff for conky to show, renaming many similarly named files, and so on. Basically the sort of stuff one doesn't like to do manually.
shining: As far as I know, grep only works on entire lines. If the string you're working with is only one line, or you only want to print a substring of one of the strings in the entire set, you've got to resort to other measures.
[edit] After a very quick violation of the don't-drink-and-bash instructions above, I've seen that shining's suggestion works perfectly. I did not know grep could do that. I'll try out Cerebral's suggestion tomorrow, but thanks guys.
Last edited by gunnihinn (2007-12-13 22:43:25)
You can also use awk. It's more straightforward.
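The post above doesn't say which awk approach it has in mind; one possible sketch uses match() with RSTART and RLENGTH to print every match on a line. The function name all_matches_awk is made up for illustration. awk uses extended regular expressions, and -v processes backslash escapes in the value, so patterns containing backslashes would need extra care.

```shell
#!/bin/sh
# all_matches_awk is a hypothetical helper, not something from the thread.
# $1 is an extended regular expression (ERE), passed to awk as a string.
all_matches_awk() {
    awk -v re="$1" '{
        line = $0
        # match() sets RSTART and RLENGTH to the leftmost-longest match.
        while (match(line, re)) {
            if (RLENGTH == 0) break          # stop on empty matches
            print substr(line, RSTART, RLENGTH)
            line = substr(line, RSTART + RLENGTH)   # keep scanning the rest
        }
    }'
}

printf 'xaaabc yy bc\n' | all_matches_awk 'a*bc'
```

Because the remainder of the line is re-scanned from a new starting point each time, a regexp anchored with ^ would not behave as it does in grep -o.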
A sed script that processes the input character by character avoids the greedy regexp inconvenience.
sed -ne ':beg;s#{REGEXP}##;t sub;s#.##;t beg;b;:sub;a {REGEXP}
b beg'
('a' needs a newline, I think.)
The output should be like grep -o (every match on a separate line).
I also tried using 'x;s#$#{REGEXP}#;x' to collect the matches in the hold space, with an 'x;p' at the end of the script that a failed s#.## jumps to:
sed -ne ':beg;s#e##;t sub;s#.##;t beg;b end;:sub;x;s#$#match#;x;b beg;:end;x;p'
(This looks for 'e' and prints 'match' for each one, which is more convenient to try out.)
Somehow this goes into an infinite loop as soon as you type the 'e', and I can't figure out why.
This would be a much nicer solution; I never really liked the 'a' and 'i' commands in sed.
Last edited by Gilneas (2007-12-17 14:58:27)
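The infinite loop described above is consistent with how t works: it branches if any substitution has succeeded since the last input line was read or the last t was taken. In the hold-space script, s#$#match# always succeeds, and the following b beg does not clear that flag, so the next t sub fires even when s#e## has failed. A hedged sketch of one fix is to replace that b beg with t beg, which branches and resets the flag at the same time. The function name count_e is invented, and the script is split across -e options only for readability and portability.

```shell
#!/bin/sh
# count_e is a hypothetical name; the script is the hold-space variant
# from the post above with 'b beg' after the hold-space append
# replaced by 't beg'.
count_e() {
    sed -n \
        -e ':beg' \
        -e 's#e##' \
        -e 't sub' \
        -e 's#.##' \
        -e 't beg' \
        -e 'b end' \
        -e ':sub' \
        -e 'x' \
        -e 's#$#match#' \
        -e 'x' \
        -e 't beg' \
        -e ':end' \
        -e 'x' \
        -e 'p'
}

# "see here" contains four e characters.
printf 'see here\n' | count_e   # prints matchmatchmatchmatch
```

This only fixes the looping; it still appends a fixed string per match, so it shares the limitation of the 'a' version for regexps that can match different text.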
A sed script that processes the input character by character avoids the greedy regexp inconvenience.
sed -ne ':beg;s#{REGEXP}##;t sub;s#.##;t beg;b;:sub;a {REGEXP}
b beg'
Forgive me if I'm wrong, but that won't work if your regular expression is, in fact, a regular expression rather than a fixed string. For example, if you're searching for a*bc (i.e. any number of a's, followed by a 'b' and a 'c') and you match aaaaaaaaabc, how are you going to re-insert that with the a command? It only appends literal text.
Ah right, then you'd need another hold space to make this work (for backing up the string you were working on). So that's a no-go, it seems.
I'd just stick with the grep command supplied above. Didn't know it had that feature - it's pretty slick, that grep.
One more I came up with: it moves each match of the regexp to the back of the pattern space, and ends when the pattern space consists only of matches (i.e. it matches 'regexp*'):
sed -ne ':beg;s#^\(ab*c\)\(.*\)$#\2\1#;t sub;:bad s#.##;t beg;b;:sub;s#^\(ab*c\)*$#&#p;T bad'
It probably won't work for all regexps, and it doesn't preserve order.
Edit: It should work a bit nicer with s#^\(ab*c\)\(.*\)$#\2\1# instead of s#\(ab*c\)\(.*\)$#\2\1#.
Last edited by Gilneas (2007-12-18 14:36:02)
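For comparison, here is a sketch of a different sed technique (not the one in the post above) that does preserve order: wrap every match in newlines, then strip everything that lies between matches. It assumes GNU sed, since it relies on \n in the replacement and inside bracket expressions, and it assumes the regexp contains no '/' and never matches the empty string. The name all_matches_sed is made up.

```shell
#!/bin/sh
# all_matches_sed is a hypothetical helper; GNU sed is assumed.
# $1 is a basic regular expression without '/' that never matches "".
all_matches_sed() {
    sed -n \
        -e "/$1/!b" \
        -e "s/$1/\n&\n/g" \
        -e 's/^[^\n]*\n//' \
        -e 's/\n[^\n]*$//' \
        -e 's/\n[^\n]*\n/\n/g' \
        -e 'p'
}
# Step by step: skip lines with no match, surround each match with
# newlines, drop the text before the first match, drop the text after
# the last match, then collapse each gap between matches to one newline.

printf 'xaaabc yy bc z\nnothing\n' | all_matches_sed 'a*bc'
```

Input lines never contain a newline when sed reads them, so the inserted newlines are safe to use as temporary markers.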