I'm doing some shell scripting and have run into the following problem. Say you have a string like "OneTwoThree", i.e. a string consisting of several words jammed together, with each word starting with a capital letter. You want to insert a space between the words, turning the above example into "One Two Three". Perl's split and awk -F[A-Z] don't work because they discard the separator, which here is the capital letter itself. I've managed with the script below, which is very similar to how I would do this in C: look at each character and, if it is a capital letter, insert a space before it. My method works; I'm just wondering about other ways to achieve this.
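#!/bin/bash
# Insert a space before each capital letter of $1; the result is left in $string.
function sep_words() {
    local i char
    string=
    for ((i = 0; i < ${#1}; i++)); do
        char=${1:i:1}
        # No space before the very first character, so the result has no leading blank.
        if [[ -n $string && $char =~ [A-Z] ]]; then
            string+=" "
        fi
        string+=$char
    done
}
for word in "$@"; do
    sep_words "$word"
    echo "$string"
done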
As an example, the above script gives the following output:
[pmorris@barium ~] $ ./split.sh OneTwoThree HereIsAString
One Two Three
Here Is A String
I have been learning regular expressions recently and
$ echo "OneTwoThree HereIsAString" | sed -e 's/\([A-Z][a-z]*\)/ &/g'
produces
 One Two Three  Here Is A String
Last edited by kishd (2008-06-11 18:44:58)
Ah, that is much simpler. I struggled with a regex first and couldn't come up with a good one. The only difference I see is that a space is inserted before the first "word": "OneTwoThree" becomes " One Two Three". Not a big deal, though. On a side note, you don't need the parentheses to capture the match when using '&'.
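If the leading space does matter, one possible fix is a second sed expression that strips it again. This is only a sketch: the function name split_caps is made up for the example, it assumes one jammed-together word per line, and it uses [[:upper:]] instead of [A-Z] so the match should not depend on how the current locale orders a range.

```shell
#!/bin/bash
# split_caps is a hypothetical helper name, not something from the original script.
split_caps() {
    # First expression: put a space before every capital letter ('&' is the whole match,
    # so no capturing parentheses are needed).
    # Second expression: drop the single space that ends up at the start of the line.
    printf '%s\n' "$1" | sed -e 's/[[:upper:]]/ &/g' -e 's/^ //'
}

split_caps OneTwoThree      # One Two Three
split_caps HereIsAString    # Here Is A String
```

Since the function writes to stdout, it can be used in a command substitution, e.g. result=$(split_caps "$word").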
Hm, weird stuff: I came up with a similar sed rule, and it is not working:
> echo OneTwoThree | sed 's/\([A-Z]\)/ \1/g'
O n e T w o T h r e e
It works fine with the C locale, though:
> echo OneTwoThree | LANG=C sed 's/\([A-Z]\)/ \1/g'
One Two Three
So fun locale stuff again (I am using fr_FR.utf8).
Another reason to use POSIX character classes:
echo OneTwoThree | sed 's/\([[:upper:]]\)/ \1/g'
It's strange anyway, since upper and lower case letters sit in separate blocks in UTF-8, but I don't really know how these things work.
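As far as I understand it, the encoding is not the deciding factor here: outside the C locale, a range such as [A-Z] may be evaluated using the locale's collation order, and in many locales that order interleaves upper and lower case letters, so the range can also match lowercase ones. Below is a small sketch of the two workarounds from this thread side by side. The function names are made up for the example, and LC_ALL is used rather than LANG because LC_ALL takes precedence over LANG and the other LC_* variables.

```shell
#!/bin/bash
# Both function names are hypothetical, chosen only for this example.

# Workaround 1: force the C locale for sed alone, so [A-Z] is the plain ASCII range.
split_c_locale() {
    printf '%s\n' "$1" | LC_ALL=C sed 's/[A-Z]/ &/g'
}

# Workaround 2: use the POSIX class, which matches uppercase letters in the
# current locale without relying on how a range is ordered.
split_upper_class() {
    printf '%s\n' "$1" | sed 's/[[:upper:]]/ &/g'
}

split_c_locale OneTwoThree
split_upper_class OneTwoThree
```

Both calls should print " One Two Three" (with the leading space discussed earlier), whatever locale the shell itself is running in.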
OP: completely off-topic, but I'm a Morris too!