I wasn't aware that [A-Z] was a valid grep string. I believe egrep (or simply grep -e) handles POSIX (extended) regular expressions.
As far as I know, it is. Bracket ranges like [A-Z] are part of basic regular expressions. This, however (using the OR operator):
[A-B]|[a-b]
won't work unless you use:
grep -E
(By the way, the option for extended regex is '-E', not '-e'.)
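A minimal sketch of the difference, assuming a POSIX-style grep: in a basic regex an unescaped | is just a literal pipe character, while with -E it means alternation. The sample input and variable names are made up for illustration, and LC_ALL=C is used only to keep the ranges predictable.

```shell
# Hypothetical sample input: one candidate character per line.
input=$(printf 'A\nb\nz\n')

# Basic regex (default): the unescaped | is a literal pipe, so no line matches.
# grep exits non-zero when nothing matches, hence the "|| true".
bre_count=$(printf '%s\n' "$input" | LC_ALL=C grep -c '[A-B]|[a-b]' || true)

# Extended regex (-E): | is alternation, so the lines 'A' and 'b' match.
ere_count=$(printf '%s\n' "$input" | LC_ALL=C grep -E -c '[A-B]|[a-b]' || true)

echo "basic regex matches:    $bre_count"
echo "extended regex matches: $ere_count"
```

Quoting the pattern also keeps the shell from expanding the brackets as a filename glob before grep sees them.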
About LC_COLLATE:
Setting LC_COLLATE=C does indeed make this problem go away; unsetting it causes [A-Z] to match lowercase as well.
mico: I confirmed your testing as well. With LC_COLLATE=C, grep did not color a lowercase 'j' when using the [A-Z] range.
I'm in deep water here, but I hope someone knows what's going on "behind the scenes". I could probably learn something useful from this; I hope so, at least.
$ echo J | grep --color=always [A-Z]
J (orange)
$ echo j | grep --color=always [A-Z]
j (white)
$ echo j | grep --color=always -i [A-Z]
j (orange)
$ echo j | grep --color=always [a-z]
j (orange)
I think what may be happening here is a two-stage process (when LC_COLLATE is not set to 'C'):
1) The character ('j') is first matched against the pattern [A-Z] using the collation order from the LC_COLLATE variable;
2) it is then colorized based on the pattern as if the locale were set to the default, 'LC_COLLATE=C'.
So, the string is found based on the looser collation of the LC_COLLATE variable (where [A-Z] == [Aa-Zz]), but that result is then colorized based on the stricter LC_COLLATE=C (where [A-Z] == [A-Z] != [Aa-Zz]).
Just a theory.
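One way to probe this, as a rough sketch: -c counts matching lines and -o prints only the matched text, so matching can be looked at separately from colorizing. The variable names are made up here, and LC_ALL=C is used instead of LC_COLLATE=C on the assumption that LC_ALL might be set in someone's environment and would override it. What the last command prints depends on the locale and the grep version, so no result is claimed for it.

```shell
# C locale: [A-Z] is a strict ASCII range, so 'j' is not matched at all.
c_lower=$(printf 'j\n' | LC_ALL=C grep -c '[A-Z]' || true)

# C locale: 'J' is matched; -o prints just the matched text.
c_upper=$(printf 'J\n' | LC_ALL=C grep -o '[A-Z]')

echo "C locale, j: $c_lower matching line(s)"
echo "C locale, J: matched text '$c_upper'"

# Current locale: the count here varies with locale settings and grep version.
cur_lower=$(printf 'j\n' | grep -c '[A-Z]' || true)
echo "current locale, j: $cur_lower matching line(s)"
```

If the last count is 1 while the colored output still shows 'j' uncolored, that would be consistent with the two-stage idea above.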
Last edited by MrWeatherbee (2007-11-29 17:36:52)
interesting.
$ echo j | LC_COLLATE=C grep "[A-Z]"
$
$ echo j | grep "[A-Z]"
j
LC_ALL is unset and the other locale variables are set to en_US.utf8.
but why different results on different grep versions (see initial post)?
Last edited by lloeki (2007-11-29 18:12:15)
To know recursion, you must first know recursion.
but why different results on different grep versions (see initial post)?
Notice it's also, apparently, two different machines. (look at the prompts)
Their environments are probably different.
Downgraded boogie
boogie:> pacman -Q | grep grep
grep 2.5.1a-4
boogie:> echo j | grep "[A-Z]"
boogie:> locale
LANG=en_US.utf8
LC_CTYPE="en_US.utf8"
LC_NUMERIC="en_US.utf8"
LC_TIME="en_US.utf8"
LC_COLLATE="en_US.utf8"
LC_MONETARY="en_US.utf8"
LC_MESSAGES="en_US.utf8"
LC_PAPER="en_US.utf8"
LC_NAME="en_US.utf8"
LC_ADDRESS="en_US.utf8"
LC_TELEPHONE="en_US.utf8"
LC_MEASUREMENT="en_US.utf8"
LC_IDENTIFICATION="en_US.utf8"
LC_ALL=
boogie:>
--HAPS
I guess the problem really isn't with grep, though it seems strange to change your locale for just one value.
I'm using [[:upper:]] and [[:lower:]] in scripts, because it's too much of a pain to keep the old version of bash around.
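For anyone wanting to do the same, a small sketch of the character-class approach (variable names are just for illustration). The POSIX classes are defined by character type rather than by collation order, so they should not be affected by the LC_COLLATE range behavior discussed in this thread.

```shell
# [[:upper:]] should match only uppercase letters, whatever LC_COLLATE says.
upper_on_j=$(printf 'j\n' | grep -c '[[:upper:]]' || true)
upper_on_J=$(printf 'J\n' | grep -c '[[:upper:]]' || true)

# [[:lower:]] is the counterpart for lowercase letters.
lower_on_j=$(printf 'j\n' | grep -c '[[:lower:]]' || true)

echo "[[:upper:]] on j: $upper_on_j"
echo "[[:upper:]] on J: $upper_on_J"
echo "[[:lower:]] on j: $lower_on_j"
```

Note the doubled brackets: the class name [:upper:] has to sit inside a bracket expression.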
--HAPS