#1 2005-08-06 02:42:25

elasticdog
Member
From: Washington, USA
Registered: 2005-05-02
Posts: 995

How to keep delimiters from a string split?

I'm trying to learn some Ruby, and am wondering what is the best way to keep delimiters when doing a string split.  I basically want to take a sentence and split all of the words up, but also keep track of spacing and punctuation to reconstruct it accurately later on.  Fooling around, I came up with this bit of code, which works, but it seems to be a lot messier than it should be...

words = line.split(/(\s+)|([[:punct:]])/)

I wasn't really sure why adding the or part of the regex made it keep the delimiter in the resulting array.  Is there some other string method I don't know about?

-=EDIT=- To clarify, this is what the line above produces:

line = "Ruby, ruby, ruby!"
=> "Ruby, ruby, ruby!"
words = line.split(/(\s+)|([[:punct:]])/)
=> ["Ruby", ",", "", " ", "ruby", ",", "", " ", "ruby", "!"]

Which is mostly what I'm after, but there are the extra zero-length elements words[2] and words[6].

#2 2005-08-08 15:35:10

phrakture
Arch Overlord
From: behind you
Registered: 2003-10-29
Posts: 7,879

Re: How to keep delimiters from a string split?

elasticdog wrote:
line = "Ruby, ruby, ruby!"
=> "Ruby, ruby, ruby!"
words = line.split(/(\s+)|([[:punct:]])/)
=> ["Ruby", ",", "", " ", "ruby", ",", "", " ", "ruby", "!"]

Which is mostly what I'm after, but there are the extra zero-length elements words[2] and words[6].

Because you're splitting on [[:punct:]] *and* whitespace, the zero-length elements are what sits between "," and " ".  You'd get the same result splitting "a,b,,c,,d" on commas, with zero-length elements between the double commas.
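
To see this in isolation, here is a small sketch (the variable names are made up for illustration). It also shows something not spelled out above: as far as the String#split documentation goes, it is the capture groups (the parentheses), not the "or", that make split hand back the delimiters.

```ruby
# Two adjacent delimiters leave a zero-length string between them.
doubled = "a,b,,c,,d".split(",")
# doubled is ["a", "b", "", "c", "", "d"]

line = "Ruby, ruby, ruby!"

# No capture groups: the delimiters are dropped, the empty strings remain.
without_groups = line.split(/\s+|[[:punct:]]/)

# Capture groups: each matched delimiter is returned in the array as well.
with_groups = line.split(/(\s+)|([[:punct:]])/)

# One way to discard the zero-length elements afterwards.
cleaned = with_groups.reject { |w| w.empty? }
```

Since the empty strings contribute nothing, cleaned.join should still give back the original line.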

#3 2005-08-08 21:58:54

elasticdog
Member
From: Washington, USA
Registered: 2005-05-02
Posts: 995

Re: How to keep delimiters from a string split?

Ahhh!  That does make sense...

Any thoughts on how to achieve this in a cleaner manner?  I tried making two separate splits, one that just kept the words, and the other that just kept the punctuation.  Merging those two arrays later, though, proved to be a bit more difficult.  I think I need to wrap my head around Ruby's block structure a bit more.
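
One possibility, as a rough sketch rather than a definitive answer: String#scan collects the tokens directly instead of splitting around them, so there is nothing to merge afterwards. The pattern below is an assumption about what counts as a token (anything that is neither a word character nor whitespace is treated as punctuation), and the variable names are only for illustration.

```ruby
line = "Ruby, ruby, ruby!"

# Assumption: a token is a run of word characters, a run of whitespace,
# or any single other character (treated here as punctuation).
tokens = line.scan(/\w+|\s+|[^\w\s]/)
# tokens is ["Ruby", ",", " ", "ruby", ",", " ", "ruby", "!"]

# Every character lands in exactly one token, so join rebuilds the line.
rebuilt = tokens.join
```

Because the three alternatives cover every character, no zero-length elements appear and the sentence can be reconstructed exactly.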

#4 2005-08-08 22:27:19

phrakture
Arch Overlord
From: behind you
Registered: 2003-10-29
Posts: 7,879

Re: How to keep delimiters from a string split?

In cases like this I don't maintain whitespace; I just enforce my own format on the output.  So I'd split by commas and trim the whitespace off each piece, and on output I'd just spit out each one plus ", " to make it look clean.  It's easier to format only when you need to see it, and scrap the formatting otherwise.
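
A minimal sketch of that idea, assuming a comma-separated line with sloppy spacing (the sample input and variable names are made up):

```ruby
line = "Ruby ,ruby,   ruby!"

# Split on commas and trim the whitespace off each piece.
parts = line.split(",").map { |part| part.strip }
# parts is ["Ruby", "ruby", "ruby!"]

# Enforce one format only when it is time to print.
output = parts.join(", ")
# output is "Ruby, ruby, ruby!"
```

Note that this throws the original spacing away on purpose, so it only fits when exact reconstruction isn't required.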

#5 2005-08-08 22:37:44

elasticdog
Member
From: Washington, USA
Registered: 2005-05-02
Posts: 995

Re: How to keep delimiters from a string split?

Good point.  Thanks for the advice...I'm just messing around with this stuff and thought it was strange the way split handled the particular regex I came up with.
