#1 2008-06-11 18:11:26

PeteMo
Member
From: H'Burg, VA
Registered: 2006-01-26
Posts: 191

Separate words in a string without spaces

I'm doing some shell scripting and have run into the following problem.  Say you have a string like "OneTwoThree", i.e. a string consisting of several words jammed together, with each word starting with a capital letter.  You want to insert a space between the words, turning the above example into "One Two Three".  Using Perl's split or awk -F[A-Z] doesn't work because both remove the separator.  I've managed it with the script below, which is very similar to how I would do this in C: look at each character and, if it is a capital letter, insert a space before it.  My method works; I'm just wondering about other ways to achieve this.

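#!/bin/bash

# Insert a space before every capital letter except a leading one.
# The result is left in the global variable "string".
function sep_words() {
    local i char
    string=
    for ((i = 0; i < ${#1}; i++)); do
        char=${1:i:1}
        # Skip the space when nothing has been collected yet, so the
        # output has no leading blank even when it is quoted.
        if [[ $char =~ [A-Z] && -n $string ]]; then
            string+=" "
        fi
        string+=$char
    done
}

# Quote "$@" and the expansions so arguments are not re-split or globbed.
for word in "$@"; do
    sep_words "$word"
    echo "$string"
done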

As an example, the above script gives the following output:

[pmorris@barium ~] $ ./split.sh OneTwoThree HereIsAString
One Two Three
Here Is A String

#2 2008-06-11 18:39:03

kishd
Member
Registered: 2006-06-14
Posts: 401

Re: Separate words in a string without spaces

I have been learning regular expressions recently and

$ echo "OneTwoThree HereIsAString" | sed -e 's/\([A-Z][a-z]*\)/ &/g'

produces

$ One Two Three  Here Is A String

Last edited by kishd (2008-06-11 18:44:58)


---for there is nothing either good or bad, but only thinking makes it so....
Hamlet, W Shakespeare

#3 2008-06-11 19:41:35

PeteMo
Member
From: H'Burg, VA
Registered: 2006-01-26
Posts: 191

Re: Separate words in a string without spaces

kishd wrote:

I have been learning regular expressions recently and

$ echo "OneTwoThree HereIsAString" | sed -e 's/\([A-Z][a-z]*\)/ &/g'

produces

$ One Two Three  Here Is A String

Ah, that is much simpler.  I struggled with a regex first and couldn't come up with a good one.  The only difference I see is that a space is also inserted before the first word: "OneTwoThree" becomes " One Two Three".  Not a big deal, though.  On a side note, you don't need the parentheses to capture the match when using '&', since '&' always stands for the whole match.
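
Both points can be folded into one command. The following is only a sketch for a single word per call: it drops the parentheses, relies on '&', and removes the leading space with a second sed expression. The function name sep_words_sed is hypothetical, and [[:upper:]] is used in place of [A-Z] on the assumption that the result should not depend on the locale's collation order.

```shell
#!/bin/bash

# Hypothetical helper: split one CamelCase word with sed.
sep_words_sed() {
    # First expression: put a space in front of every uppercase letter
    # ('&' is the whole match, so no capture group is needed).
    # Second expression: strip the space that lands before the first word.
    printf '%s\n' "$1" | sed -e 's/[[:upper:]]/ &/g' -e 's/^ //'
}

sep_words_sed OneTwoThree
sep_words_sed HereIsAString
```

If several words are passed in one string separated by spaces, each later word still gains an extra space, as in kishd's output.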

#4 2008-06-11 21:45:03

shining
Pacman Developer
Registered: 2006-05-10
Posts: 2,043

Re: Separate words in a string without spaces

Hm, weird stuff: I came up with a similar sed rule, and it is not working:

> echo OneTwoThree | sed 's/\([A-Z]\)/ \1/g'
 O n e T w o T h r e e

It works fine with the C locale, though:

> echo OneTwoThree | LANG=C sed 's/\([A-Z]\)/ \1/g'
 One Two Three

So fun locale stuff again (I am using fr_FR.utf8).
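
A note on the workaround: LANG=C only takes effect when neither LC_ALL nor LC_COLLATE is set, because those take precedence over LANG. A slightly more robust sketch is to set LC_ALL=C for the one sed invocation, which leaves the rest of the session in its own locale. The function name split_caps_c_locale is an assumption made for this example.

```shell
#!/bin/bash

# Hypothetical helper: run the [A-Z] rule with the C locale forced.
split_caps_c_locale() {
    # LC_ALL overrides LC_COLLATE and LANG, and the assignment prefix
    # applies to this sed process only.
    printf '%s\n' "$1" | LC_ALL=C sed 's/[A-Z]/ &/g'
}

split_caps_c_locale OneTwoThree
```

As with the commands above, the result keeps the leading space before the first word.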


pacman roulette : pacman -S $(pacman -Slq | LANG=C sort -R | head -n $((RANDOM % 10)))

#5 2008-06-12 12:59:44

carlocci
Member
From: Padova - Italy
Registered: 2008-02-12
Posts: 368

Re: Separate words in a string without spaces

shining wrote:

Hm, weird stuff: I came up with a similar sed rule, and it is not working:

> echo OneTwoThree | sed 's/\([A-Z]\)/ \1/g'
 O n e T w o T h r e e

It works fine with the C locale, though:

> echo OneTwoThree | LANG=C sed 's/\([A-Z]\)/ \1/g'
 One Two Three

So fun locale stuff again (I am using fr_FR.utf8).

Another reason to use POSIX character classes:

echo OneTwoThree | sed 's/\([[:upper:]]\)/ \1/g'

It's strange anyway, as uppercase and lowercase letters sit in separate ranges in UTF-8, but I don't really know how these things work.

#6 2008-06-12 15:32:49

rson451
Member
From: Annapolis, MD USA
Registered: 2007-04-15
Posts: 1,233

Re: Separate words in a string without spaces

OP: completely off topic, but I'm a Morris too!


archlinux - please read this and this — twice — then ask questions.
--
http://rsontech.net | http://github.com/rson
