Hi
I'm trying to match the first group of whitespace in a string in order to split the string there.
What I'm trying to accomplish is separating a command from its arguments:
mono program.exe arg1 arg2
mono___program.exe arg1 arg2
My regex matches the first group of whitespace after the first word, but also all the following groups:
(?<=\w)\s+
mono___program.exe__arg1__arg2
I just can't figure out how to prevent the matches between ...exe___arg1___arg2.
The match should work with any number of arguments (0 to n) and also with programs other than mono.
Any ideas? You can work on your regex here:
http://gskinner.com/RegExr/
Last edited by Archdove (2012-06-13 10:02:08)
Offline
The question is ill-posed. I shouldn't need to go to a website to test a regex. I already have grep, awk and lex installed, just like every other Unix machine on the planet - why can't I use them?
Offline
I never said you can't use grep, awk or whatever. I simply said that you can work on the regex there.
For people like me, with no mad regex skills, it's quite convenient to see the matching graphically. (yes vim ... yes)
But above all, this ensures a common base for all of the people helping, so we (I) don't face problems with differing regex standards.
As to my ill-posed question ... Really?
Question:
How can I match the first group of whitespace after the first word? Please refer to post 1.
Offline
How about:
^[^ ]+([ ]+)(.*)
?
Offline
@theDOC: This seems to match everything, or nothing if there is preceding whitespace.
Offline
In your case it matches
mono
<spaces>
program.exe arg1 arg2
isn't that correct?
Offline
Well, what you're saying is correct, yes. But the match seems to be:
nothing on " mono program.exe foo bar "
everything on "mono program.exe foo bar"
I tried it with the website posted above. Which tool do you use to debug regexes?
Offline
Well, I don't know the range of your possible input lines...
This will skip (possible) preceding spaces:
^[ ]*[^ ]+([ ]+)(.*)
PS: I tried with your posted website, too.
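For anyone following along, here is a rough sketch of how the second pattern behaves in JavaScript (chosen only because Node is used elsewhere in this thread). One likely reason it looks like the pattern matches "everything" is that the overall match spans the whole line, while the whitespace of interest is only in group 1. The helper name `firstGap` and the returned field names are made up for illustration.

```javascript
// The pattern as posted: optional leading spaces, the first word, then
// group 1 = the first run of spaces and group 2 = the rest of the line.
var FIRST_GAP = /^[ ]*[^ ]+([ ]+)(.*)/;

// Hypothetical helper: returns { gap, rest, gapStart } or null.
function firstGap(line) {
    var m = FIRST_GAP.exec(line);
    if (m === null) {
        return null; // no space after the first word, i.e. no arguments
    }
    // The match is anchored at index 0, so the gap starts where the whole
    // match ends minus the lengths of the two captured groups.
    var gapStart = m[0].length - m[1].length - m[2].length;
    return { gap: m[1], rest: m[2], gapStart: gapStart };
}

var r = firstGap('  mono   program.exe arg1 arg2');
console.log(r.gapStart, JSON.stringify(r.gap), JSON.stringify(r.rest));
```

The split position is then `gapStart`, and `rest` already holds everything after the gap, so the command is whatever precedes `gapStart` (minus any leading spaces).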
Offline
Do you have to use RegEx, or can you use something else? I wrote a simple (and vastly improvable) state machine that does what I think you want to achieve.
#!/usr/bin/env node
/*
 * Given a string, return a map in the form { cmd: 'args', cmd2: 'args' }.
 * States: 0 = skipping leading spaces, 1 = reading the command,
 *         2 = skipping spaces after the command, 3 = reading the arguments.
 */
function parseCommand(s) {
    // Declare everything locally so nothing leaks into the global scope.
    var m = {};
    var state = 0;
    var currentcmd = '';
    var currentargs = '';
    for (var i = 0; i < s.length; i++) {
        var c = s[i];
        //console.log(c, ' ', state);
        switch (state) {
        case 0:
            if (c != ' ') {
                state = 1;
                currentcmd += c;
            }
            break;
        case 1:
            if (c == ' ') {
                state = 2;
            } else {
                currentcmd += c;
            }
            break;
        case 2:
            if (c != ' ') {
                state = 3;
                currentargs += c;
            }
            break;
        case 3:
            if (c == '\n') {
                // End of line: store the finished command and start over.
                state = 0;
                m[currentcmd] = currentargs;
                currentcmd = '';
                currentargs = '';
            } else {
                currentargs += c;
            }
            break;
        }
    }
    // Store the last command, unless the input ended right after a newline.
    if (currentcmd !== '') {
        m[currentcmd] = currentargs;
    }
    return m;
}

var cmds = ['mono program.exe arg1 arg2',
            ' mono program.exe arg1 arg2',
            'cmd1 arg1 arg2 arg3\n cmd2 arg11 arg12 etc '];
cmds.forEach(function(c) {
    console.log(c, parseCommand(c));
});
Offline
Most regex tools stop at the first match unless told to do otherwise (e.g. with a global flag), so all you should need is \s+
Edit: If that doesn't suit you, maybe (?<=(^\w*))\s+ would work. It depends on whether lookarounds (thanks for introducing me to these, btw) can be variable length in whatever tool you're using (they can't in grep).
Edit: Also can't in perl... so much for that.
But I don't understand what you hope to accomplish by matching the whitespace. Isn't the stuff before and after the whitespace what you really want?
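As a hedged illustration of the lookbehind idea: JavaScript engines that implement ES2018 lookbehind (recent Node.js versions, as far as I know) do accept variable-length lookbehinds, so something along these lines should work there. The sketch uses `\S+` instead of `\w*` so that leading whitespace itself cannot satisfy the lookbehind; the names `FIRST_RUN` and `splitAtFirstRun` are invented for the example.

```javascript
// Assumes an engine with variable-length lookbehind (ES2018).
// (?<=^\s*\S+) : only match whitespace that directly follows the first word,
//                which may itself be preceded by leading whitespace.
var FIRST_RUN = /(?<=^\s*\S+)\s+/;

// Hypothetical helper: split a line at the first whitespace run after the
// first word. Later runs never match, because the lookbehind would have to
// stretch from the start of the string across earlier whitespace.
function splitAtFirstRun(line) {
    return line.split(FIRST_RUN);
}

console.log(splitAtFirstRun('mono   program.exe  arg1 arg2'));
console.log(splitAtFirstRun('mono'));
```

Because only one position in the string can satisfy the lookbehind, `split` needs no global-flag handling or limit argument here. Leading whitespace stays attached to the first element, so trim first if that matters.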
Last edited by alphaniner (2012-06-13 14:58:14)
Offline
I did it manually like facek suggested, but much simpler. I just thought that doing it with regex would be fun, but it isn't that easy, it seems.
The reason I tried to match the whitespace is that I want to split the string at that position, leaving me with an array of two strings (with a single instruction).
Matching either part of the string just felt like going the long way around, since two instructions would be necessary.
So thanks for all the help. Regex is just the wrong tool for accomplishing what I wanted.
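For later readers, a manual split along these lines could look roughly like the sketch below. This is not the code actually used above, just one plausible way to do it in JavaScript; `splitCommand` is a made-up name.

```javascript
// Hypothetical helper: split a command line into [command, arguments]
// at the first run of whitespace.
function splitCommand(line) {
    var trimmed = line.trim();           // drop leading/trailing whitespace
    var gap = trimmed.search(/\s/);      // index of first whitespace, or -1
    if (gap === -1) {
        return [trimmed, ''];            // command without arguments
    }
    // Everything before the gap is the command; the rest, minus the
    // whitespace run itself, is the argument string.
    return [trimmed.slice(0, gap), trimmed.slice(gap).trim()];
}

console.log(splitCommand(' mono   program.exe arg1 arg2 '));
console.log(splitCommand('mono'));
```

It always returns exactly two strings, which avoids special-casing commands without arguments at the call site.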
Offline
I couldn't really understand what you want to achieve.
But from what I understand, if you want 'mono' and 'program.exe arg1 arg2' from 'mono program.exe arg1 arg2',
then I'd do something like:
a='mono program.exe arg1 arg2'
args=${a#* } #$args will contain 'program.exe arg1 arg2'
prog=${a%% *} #$prog will contain 'mono'
you can find some explanations here: http://www.ibm.com/developerworks/linux … index.html
But only if that is what you want at all...
Offline
Or, to get arg1 and arg2:
awk '{print $3, $4}'
But then this is sounding more and more like a homework question by the minute.
Offline