Any thoughts on how to achieve this in a cleaner manner? I tried making two separate splits, one that kept just the words and another that kept just the punctuation. Merging those two arrays afterwards proved to be a bit more difficult, though. I think I need to wrap my head around Ruby's block structure a bit more.
line = "Ruby, ruby, ruby!"
=> "Ruby, ruby, ruby!"
words = line.split(/(\s+)|([[:punct:]])/)
=> ["Ruby", ",", "", " ", "ruby", ",", "", " ", "ruby", "!"]
Which is mostly what I'm after, but there are the extra zero-length elements words[2] and words[6].
Because you're splitting on both [[:punct:]] *and* whitespace, each zero-length element is the empty string that sits between a "," and the " " right after it. You'd get the same result splitting "a,b,,c,,d" on commas, with zero-length elements between the double commas.
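To see this for yourself, here is a small sketch (the variable names are just for illustration) that compares the comma-only split with the original one, and shows one possible way to drop the empty strings afterwards:

```ruby
# Splitting on a single delimiter: two adjacent delimiters
# leave an empty string between them.
simple = "a,b,,c,,d".split(",")
# simple is ["a", "b", "", "c", "", "d"]

line = "Ruby, ruby, ruby!"

# The same thing happens here: "," and " " are adjacent delimiters,
# so an empty string appears between them.
words = line.split(/(\s+)|([[:punct:]])/)

# One option: discard the zero-length strings after the fact.
cleaned = words.reject(&:empty?)
# cleaned is ["Ruby", ",", " ", "ruby", ",", " ", "ruby", "!"]
```

Filtering afterwards is probably the smallest change to the original code, though it does mean building the array twice.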
words = line.split(/(\s+)|([[:punct:]])/)
I wasn't really sure why adding the or part of the regex made it keep the delimiter in the resulting array. Is there some other string method I don't know about?
-=EDIT=- To clarify, this is what the line above produces:
line = "Ruby, ruby, ruby!"
=> "Ruby, ruby, ruby!"
words = line.split(/(\s+)|([[:punct:]])/)
=> ["Ruby", ",", "", " ", "ruby", ",", "", " ", "ruby", "!"]
Which is mostly what I'm after, but there are the extra zero-length elements words[2] and words[6].
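On why the delimiters are kept: as far as I can tell it is the capture groups (the parentheses) rather than the "or" itself, since String#split returns whatever the groups matched along with the pieces. The sketch below contrasts the two, and also tries String#scan as one possible alternative that avoids the empty strings. The token pattern is my own assumption about what should count as a word, so adjust it to taste:

```ruby
line = "Ruby, ruby, ruby!"

# Alternation without capture groups: delimiters are thrown away.
without_groups = line.split(/\s+|[[:punct:]]/)
# without_groups is ["Ruby", "", "ruby", "", "ruby"]

# Same alternation with capture groups: matched delimiters are returned too.
with_groups = line.split(/(\s+)|([[:punct:]])/)
# with_groups is ["Ruby", ",", "", " ", "ruby", ",", "", " ", "ruby", "!"]

# Alternative: describe the tokens instead of the delimiters.
# Assumed token types: a run of word characters, a single
# punctuation character, or a run of whitespace.
tokens = line.scan(/\w+|[[:punct:]]|\s+/)
# tokens is ["Ruby", ",", " ", "ruby", ",", " ", "ruby", "!"]
```

Because scan collects matches rather than the gaps between them, nothing zero-length can end up in the result.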