
#26 2007-11-29 17:12:50

MrWeatherbee
Member
Registered: 2007-08-01
Posts: 277

Re: problem with grep

Bison wrote:

I wasn't aware that [A-Z] was a valid grep string. I believe egrep (or simply grep -e) handles POSIX (extended) regular expressions.

As far as I know it is; bracket ranges are part of basic regex. Now this, for example (using the OR operator):

[A-B]|[a-b]

won't work unless you do:

grep -E

(The option for extended regex is '-E', not '-e', by the way.)
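
For anyone who wants to see the difference, here is a minimal sketch. It assumes a POSIX-style grep on the PATH and pins LC_ALL=C so the ranges mean the same thing on every machine; the variable names bre_result and ere_result are made up for illustration.

```shell
# In basic regex (no -E), '|' is an ordinary character, so this pattern
# only matches one of A-B, then a literal '|', then one of a-b.
bre_result=$(printf 'a\nB\nc\n' | LC_ALL=C grep '[A-B]|[a-b]' || true)

# With -E, '|' is the alternation (OR) operator.
ere_result=$(printf 'a\nB\nc\n' | LC_ALL=C grep -E '[A-B]|[a-b]' || true)

echo "basic:    '$bre_result'"
echo "extended: '$ere_result'"
```

GNU grep also accepts '\|' in basic mode as an extension, but as far as I know that is not portable, so -E is the safer habit.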

Offline

#27 2007-11-29 17:26:52

MrWeatherbee
Member
Registered: 2007-08-01
Posts: 277

Re: problem with grep

Sekre wrote:

About LC_COLLATE:
setting it to LC_COLLATE=C does indeed take this problem away; unsetting it causes grep to react to lowercase as well.

mico: I confirmed your testing as well; it did indeed not color lowercase 'j' using the [A-Z] range.

I'm in deep water here, but I hope someone knows what's going on "behind the scenes". I could probably learn something good from this, I hope so at least.

$ echo J | grep --color=always [A-Z]
J (orange)
$ echo j | grep --color=always [A-Z]
j (white)
$ echo j | grep --color=always -i [A-Z]
j (orange)
$ echo j | grep --color=always [a-z]
j (orange)

I think what may be happening here is a two-stage process (when LC_COLLATE is not set to 'C'):

1) The character ('j') is first found in the pattern [A-Z] using the LC_COLLATE variable;
2) it then gets colorized based on the pattern as if locale were set to a default, 'LC_COLLATE=C'.

So, the string is found based on the looser collation of the LC_COLLATE variable (where [A-Z] == [Aa-Zz]), but that result is then colorized based on the stricter LC_COLLATE=C (where [A-Z] == [A-Z] != [Aa-Zz]).

Just a theory.

Last edited by MrWeatherbee (2007-11-29 17:36:52)

Offline

#28 2007-11-29 18:11:56

lloeki
Member
From: France
Registered: 2007-02-20
Posts: 456
Website

Re: problem with grep

interesting.

$ echo j |  LC_COLLATE=C grep "[A-Z]"
$
$ echo j | grep "[A-Z]"
j

(LC_ALL is unset and the other locale variables are at en_US.utf8.)

but why different results on different grep versions (see initial post)?

Last edited by lloeki (2007-11-29 18:12:15)


To know recursion, you must first know recursion.

Offline

#29 2007-11-29 18:16:49

Cerebral
Forum Fellow
From: Waterloo, ON, CA
Registered: 2005-04-08
Posts: 3,108
Website

Re: problem with grep

lloeki wrote:

but why different results on different grep versions (see initial post)?

Notice that it's also, apparently, two different machines (look at the prompts).

Their environments are probably different.
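
A rough way to compare two machines is sketched below. It assumes the usual precedence (LC_ALL overrides LC_COLLATE, which overrides LANG, with empty values treated as unset) and falls back to "C" when nothing is set, which is typical but implementation-defined. The variable names are only for illustration.

```shell
# Work out which locale grep would use for collation in this shell.
effective_collate=${LC_ALL:-${LC_COLLATE:-${LANG:-C}}}
echo "effective collation locale: $effective_collate"

# Pin the locale for a single command so the result no longer depends
# on the machine's environment. In the C locale, [A-Z] is uppercase only.
pinned_lower=$(echo j | LC_ALL=C grep '[A-Z]' || true)
pinned_upper=$(echo J | LC_ALL=C grep '[A-Z]' || true)

echo "C locale, j: '$pinned_lower'"
echo "C locale, J: '$pinned_upper'"
```

If the first line differs between the two boxes, that alone could explain the different grep results.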

Offline

#30 2007-12-15 21:58:18

sullivanva
Member
From: Herndon, VA USA
Registered: 2005-07-21
Posts: 126

Re: problem with grep

Downgraded boogie

boogie:> pacman -Q | grep grep
grep 2.5.1a-4
boogie:> echo j | grep "[A-Z]"
boogie:> locale
LANG=en_US.utf8
LC_CTYPE="en_US.utf8"
LC_NUMERIC="en_US.utf8"
LC_TIME="en_US.utf8"
LC_COLLATE="en_US.utf8"
LC_MONETARY="en_US.utf8"
LC_MESSAGES="en_US.utf8"
LC_PAPER="en_US.utf8"
LC_NAME="en_US.utf8"
LC_ADDRESS="en_US.utf8"
LC_TELEPHONE="en_US.utf8"
LC_MEASUREMENT="en_US.utf8"
LC_IDENTIFICATION="en_US.utf8"
LC_ALL=
boogie:>

--HAPS

Offline

#31 2008-04-07 04:34:41

sullivanva
Member
From: Herndon, VA USA
Registered: 2005-07-21
Posts: 126

Re: problem with grep

I guess the problem really isn't with grep, though it seems strange to change your locale for the sake of one value.

I'm using [[:upper:]] and [[:lower:]] in scripts, because it's too much of a pain to keep the old version of bash around.
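
A minimal sketch of what that looks like, assuming any POSIX grep; the variable names are hypothetical. The character classes depend on the character type, not on where a letter sorts, so they should behave the same whatever LC_COLLATE is set to.

```shell
# [[:upper:]] matches uppercase letters only, whatever the collation order.
upper_hit=$(echo J | grep '[[:upper:]]' || true)
upper_miss=$(echo j | grep '[[:upper:]]' || true)

# [[:lower:]] is the lowercase counterpart.
lower_hit=$(echo j | grep '[[:lower:]]' || true)

echo "upper on J: '$upper_hit'"
echo "upper on j: '$upper_miss'"
echo "lower on j: '$lower_hit'"
```

Note the doubled brackets: the class name [:upper:] has to sit inside a bracket expression.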


--HAPS

Offline
