#1 2010-01-07 16:53:40

Cyrusm
Member
From: Bozeman, MT
Registered: 2007-11-15
Posts: 1,053

regex. looking for an easier way....

I'm just working on a quick cleanup script, and what I want to accomplish is to move a specific set of .jpg and .png files from my downloads directory to another directory.

the file names are all exactly 13 numeric digits and end in either .jpg or .png

in bash I'm using the globbing expression [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9].jpg (and .png)
to get these files, and I was wondering if there is a more elegant way to go about this.
I'm open to anything, zsh, perl, python, awk, sed, whatever. regional expressions are a weak point for me, and I'm trying to fix that.


Hofstadter's Law:
           It always takes longer than you expect, even when you take into account Hofstadter's Law.


#2 2010-01-07 17:00:39

Minishark
Member
Registered: 2009-09-30
Posts: 23

Re: regex. looking for an easier way....

Something like this perhaps?

^[0-9]{13}\.(jpg|png)$


#3 2010-01-07 17:06:53

Cyrusm
Member
From: Bozeman, MT
Registered: 2007-11-15
Posts: 1,053

Re: regex. looking for an easier way....

I'll give it a try, thanks for the response



#4 2010-01-07 17:10:20

Profjim
Member
From: NYC
Registered: 2008-03-24
Posts: 658

Re: regex. looking for an easier way....

s/regional/regular/.

Yes, there are more elegant regexes to handle this. You'll find that there are a variety of different regex syntaxes around. Traditional shell glob syntax, which you're using, is one. (Bash also has an extended glob syntax that I've only recently learned about.)

Another common one is the regex syntax used in contexts like this: sed ..., grep .... That's POSIX basic regex (BRE) syntax. Then there's the "extended" form of that (ERE), which occurs in contexts like this: sed -r ..., egrep ..., and bash's [[ blah =~ pattern ]]. The main difference is that the former treats characters like +, { and ( as literals, so you need \+ (a GNU extension), \{ and \( to get the regex operators. The latter does the opposite.

Then there's the Perl-style syntax used by Perl, PCRE, Python, and so on. That's quite widespread. It differs from the "extended" regex syntax mentioned a moment ago mostly in advanced respects. Then there are various other local regex syntaxes, such as the one used by vim, the native one in Lua, and so on.

In some of these, you can match the pattern you're describing by using something like this:

[0-9]{13}\.jpg

But honestly? For a quick and dirty script, I'd just use your long globbing expression myself, rather than try to figure out which tool permits {13} and which requires \{13\} and so on.

If your purpose is to learn more about regexes in general, just google things like regex tutorials and so on. There are a lot of resources available.
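
To make the basic/extended difference concrete, here is a minimal sketch using grep. The sample file names are made up for illustration, and the two commands are only meant to show the same match written in each syntax.

```bash
#!/usr/bin/env bash
# Sample names, invented for this example.
names='1262908420123.jpg
1262908420124.png
12629084201234.jpg
holiday.jpg'

# Basic regex (plain grep): the interval braces need backslashes, and
# alternation is not portable, so each extension gets its own -e pattern.
bre_matches=$(printf '%s\n' "$names" |
    grep -e '^[0-9]\{13\}\.jpg$' -e '^[0-9]\{13\}\.png$')

# Extended regex (grep -E): braces, parentheses and | work unescaped.
ere_matches=$(printf '%s\n' "$names" | grep -E '^[0-9]{13}\.(jpg|png)$')

printf 'BRE:\n%s\nERE:\n%s\n' "$bre_matches" "$ere_matches"
```

Both commands should select the same two names; the 14-digit name is rejected because the pattern is anchored at both ends.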


#5 2010-01-07 17:38:35

Cyrusm
Member
From: Bozeman, MT
Registered: 2007-11-15
Posts: 1,053

Re: regex. looking for an easier way....

The extglob is a pretty nice tool!  here's what I have so far...

[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]\.@(jpg|png)

and that cuts one line out of my script.  I can't find anything in the documentation for repetition (the {13} from above),
so I think I'm going to stick with this until I can find something better.
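
For reference, a minimal sketch of the extglob approach as a small function. The name list_images is hypothetical, and the sketch uses @(jpg|png), which matches exactly one of the alternatives; +(jpg|png) matches one or more, so it would also accept a name ending in .jpgjpg.

```bash
#!/usr/bin/env bash
shopt -s extglob    # enables @(...), +(...), and the other extended globs

# Hypothetical helper: print the names in directory $1 that consist of
# 13 digits followed by .jpg or .png.
list_images() {
    local d='[0-9]' pat f
    # The pattern is built in a variable only to keep the line short; the
    # unquoted expansion in the loop is still treated as an extended glob.
    pat="$d$d$d$d$d$d$d$d$d$d$d$d$d.@(jpg|png)"
    for f in "$1"/$pat; do
        # An unmatched glob is left as literal text, so check it exists.
        [[ -e $f ]] && printf '%s\n' "${f##*/}"
    done
    return 0
}
```

Something like list_images ~/downloads would then print the candidates before anything is moved.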

Profjim: thanks for the info!  learned a lot about extglob today



#6 2010-01-07 17:45:51

Profjim
Member
From: NYC
Registered: 2008-03-24
Posts: 658

Re: regex. looking for an easier way....

I don't think bash's glob syntax has any native way to do the PATTERN{13}. You'd have to supply the filenames to something like sed or awk (or maybe expr?) and have them manipulate it, then your script would have to check the result string. Doubt it's worth it for this use case.
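
One workaround that stays inside bash is to generate the long glob instead of typing it. This is only a sketch: it does not add a repetition operator to glob syntax, it just builds the same 13-fold pattern with printf, and the helper name list_by_glob is hypothetical.

```bash
#!/usr/bin/env bash
# printf reuses its format once per argument, and %.0s prints each
# argument as a zero-length string, so this emits "[0-9]" 13 times.
digits=$(printf '[0-9]%.0s' {1..13})

# Hypothetical helper: list matching names in directory $1.
list_by_glob() {
    local f
    # $digits is left unquoted on purpose so the shell expands it as a glob.
    for f in "$1"/$digits.jpg "$1"/$digits.png; do
        # An unmatched glob is left as literal text, so check it exists.
        [ -e "$f" ] && printf '%s\n' "${f##*/}"
    done
    return 0
}
```

Whether this is more readable than the spelled-out glob is a matter of taste; it mainly helps if the digit count might change later.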


#7 2010-01-07 19:29:07

tlvb
Member
From: Sweden
Registered: 2008-10-06
Posts: 297

Re: regex. looking for an easier way....

If you don't need to rename them in any way, ls+egrep+xargs+mv should be enough, e.g. something like

ls|egrep "^[0-9]{13}\.(png|jpg)$"|xargs -I% mv % target_directory

If renaming is needed the egrep can be replaced with a suitable sed script and some modification of the xargs call.
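
A rough sketch of what that renaming variant could look like. The img_ prefix and the function name move_renamed are made up for illustration, sed -E is assumed to be available (GNU sed also accepts -r), and a read loop stands in for xargs to keep the old/new pairing simple.

```bash
#!/usr/bin/env bash
# Hypothetical helper: move matching files from the current directory into
# directory $1, adding an "img_" prefix to each name.
move_renamed() {
    local old new
    # sed prints "old-name new-name" for matching lines only (-n plus the
    # p flag); & in the replacement stands for the whole matched name.
    ls | sed -nE 's/^[0-9]{13}\.(png|jpg)$/& img_&/p' |
        while read -r old new; do
            mv -- "$old" "$1/$new"
        done
}
```

Splitting on whitespace is safe here only because the regex guarantees the matched names contain none.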

Last edited by tlvb (2010-01-07 19:36:06)


I need a sorted list of all random numbers, so that I can retrieve a suitable one later with a binary search instead of having to iterate through the generation process every time.


#8 2010-01-07 20:16:51

hbekel
Member
Registered: 2008-10-04
Posts: 311

Re: regex. looking for an easier way....

No need to use external tools, bash supports extended regular expressions itself:

for f in *; do [[ "$f" =~ ^[0-9]{13}\.(jpg|png)$ ]] && mv -- "$f" target; done


#9 2010-01-07 20:48:41

Cyrusm
Member
From: Bozeman, MT
Registered: 2007-11-15
Posts: 1,053

Re: regex. looking for an easier way....

hbekel's code works exactly as I'd like it to, thanks

final results:

for files in *; do [[ "$files" =~ ^[0-9]{13}\.(jpg|png)$ ]] && mv -- "$files" "$targetdir"; done
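
For anyone who wants this as a reusable piece, here is a slightly fuller sketch of the same loop. The function name move_images and its two arguments are assumptions, not part of the script above; the regex is anchored with ^ and $ so that names with extra digits or a trailing suffix are skipped.

```bash
#!/usr/bin/env bash
# Hypothetical helper: move files named <13 digits>.jpg or .png
# from directory $1 into directory $2.
move_images() {
    local src=$1 dest=$2 f
    local re='^[0-9]{13}\.(jpg|png)$'
    mkdir -p -- "$dest" || return
    for f in "$src"/*; do
        # Test only the file name, not the directory part of the path.
        if [[ -f $f && ${f##*/} =~ $re ]]; then
            mv -- "$f" "$dest/"
        fi
    done
}
```

It could be called as move_images ~/downloads "$targetdir". Keeping the regex in a variable avoids quoting surprises inside [[ ]].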

Last edited by Cyrusm (2010-01-08 12:56:29)


