
#1 2012-11-27 17:58:18

Thme
Member
From: Raleigh NC
Registered: 2012-01-22
Posts: 105

[Solved]Bash:Manipulating arrays of paths? the right way or not?

Note: Refer to the following posts for better solutions and/or alternatives.
This is a mock segment of a script I'm writing, and I need to know whether I'm doing it the correct way, or at least in a proper bash way.
The following works, but will it always parse in a sorted manner? I have a set of directories named by date in the YYYY-MM-DD (international) format, and I want to be sure they will always be treated as directory paths even when special characters (newlines, tabs, spaces, etc.) are encountered in the path. The number of these directories changes and is compared to a set NUM variable; when there are more than that number, the oldest ones are removed, which brings up a second question. Being dated as YYYY-MM-DD, they sort from oldest to newest (lexicographical order), but do they always? And is this the right way to deal with elements in an array separated by nulls? Any comments or suggestions would be appreciated, as I'm learning bash for the time being (perl is next) and gathered this from bits and pieces on various wikis and forums. I basically want to know whether this is a correct approach to extracting and performing actions on elements of an array that are directory paths.

#!/bin/bash
oifs="$IFS"
IFS=$'\0'
DIRTREE=(/path/to/dated/directories/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/)
NUM=5
HOWMANYMORE=$(echo $(( ${#DIRTREE[@]} - $NUM )))
if (( ${#DIRTREE[@]} > $NUM )) ; then

     rm -rv "${DIRTREE[@]:0:$HOWMANYMORE}" 

fi
IFS="$oifs"

Note: I have tested this, for those wondering.

Last edited by Thme (2012-11-30 16:58:13)


"Hidden are the ways for those who pass by, for light is perished and darkness comes into being." Nephthys:
Ancient Egyptian Coffin Texts


#2 2012-11-28 15:09:33

p0x8
Member
Registered: 2012-09-20
Posts: 70

Re: [Solved]Bash:Manipulating arrays of paths? the right way or not?

Interesting. I had never explored the array syntax of bash. Maybe because I would never approach the problem like that.

You have to trust bash to implicitly sort $DIRTREE when resolving the path, which would make me uncomfortable. But the only actual issue I can see in your solution is in the handling of a large number of removals. The expansion of the directory names in the rm command can exceed the argument length limit.

And a minor nitpick: if $HOWMANYMORE is only needed inside the if block it could be set inside it.
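To put a number on the argument length concern: the limit applies when an external command such as rm is executed, and it can be queried with getconf. The sketch below only estimates sizes; the paths are the placeholder paths from the first post and nothing is created or deleted.

```shell
#!/bin/bash
# Ask the system for the limit on the combined size of arguments plus
# environment that can be handed to an external command.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX is ${arg_max} bytes"

# Estimate what a list of paths would cost on a command line: each
# argument takes its length plus one terminating NUL byte.
# These are placeholder paths; brace expansion just generates strings.
dirs=(/path/to/dated/directories/2012-11-0{1..9}/)
total=0
for d in "${dirs[@]}"; do
    (( total += ${#d} + 1 ))
done
echo "${#dirs[@]} paths need roughly ${total} bytes"
```

On a typical Linux system the limit is large enough that a few hundred dated directories are nowhere near it, but that is an observation about common defaults, not a guarantee.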


#3 2012-11-28 19:44:26

Thme
Member
From: Raleigh NC
Registered: 2012-01-22
Posts: 105

Re: [Solved]Bash:Manipulating arrays of paths? the right way or not?

Good point. I missed the maximum argument length, but I believe it can be resolved with xargs...

#!/bin/bash
oifs="$IFS"
IFS=$'\0'
DIRTREE=(/path/to/dated/directories/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/)
NUM=5
HOWMANYMORE=$(echo $(( ${#DIRTREE[@]} - $NUM )))
if (( ${#DIRTREE[@]} > $NUM )) ; then

     echo "${DIRTREE[@]:0:$HOWMANYMORE}" | xargs rm -rv 

fi
IFS="$oifs"

Oh, and I should have mentioned that the $NUM variable is sourced from a user config in the actual script, hence the necessity of the arithmetic, $HOWMANYMORE, etc. After a bit of reading I found that sorting glob or regex expansion in an "almost" ASCII-betical order is normal behavior for bash and other shells, "almost" meaning that it's case-insensitive and special characters are ignored in the sorting process. So for that part I know I'm safe using regex in pathname expansion and depending on it to be sorted numerically, since the main path is always going to be the same and the rest are all numeric directory names. I greatly appreciate the feedback and your noticing the possible issue with maximum argument length.
I'll leave this open a few days to see if there are any other suggestions or observations, then mark it as solved.
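A quick way to check the ordering claim is to create a few dated directories out of order and look at what the glob returns. This is only a sketch in a throwaway directory made with mktemp. The bash manual documents that pathname expansion results are sorted (according to the LC_COLLATE locale setting), and because every name here has the same length and layout, with only the digits differing, that order should coincide with chronological order.

```shell
#!/bin/bash
# Work in a temporary directory so nothing real is touched.
tmp=$(mktemp -d)

# Create the directories in non-chronological order on purpose.
mkdir "$tmp/2012-11-27" "$tmp/2011-01-05" "$tmp/2012-02-10" "$tmp/2011-12-31"

# Same glob pattern as in the thread; the shell sorts the matches.
dirs=("$tmp"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/)

# Strip the leading path and trailing slash to show just the names.
names=()
for d in "${dirs[@]}"; do
    d=${d%/}
    names+=("${d##*/}")
done
printf '%s\n' "${names[@]}"

rm -r "$tmp"
```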

Last edited by Thme (2012-11-28 19:46:17)

#4 2012-11-28 20:23:53

alphaniner
Member
From: Ancapistan
Registered: 2010-07-12
Posts: 2,810

Re: [Solved]Bash:Manipulating arrays of paths? the right way or not?

You don't need the echo when calculating HOWMANYMORE:

HOWMANYMORE=$(( ${#DIRTREE[@]} - NUM ))

Note: The lack of $ before NUM is not a typo.
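For anyone wondering why the missing $ works: inside $(( )) and (( )), bash treats bare names as variables and evaluates them. The same applies to the offset and length of an array slice. A small sketch with stand-in elements instead of real paths:

```shell
#!/bin/bash
DIRTREE=(a b c d e f g)   # stand-in elements, not real directories
NUM=5

# Bare variable names are evaluated inside arithmetic expansion.
HOWMANYMORE=$(( ${#DIRTREE[@]} - NUM ))
echo "$HOWMANYMORE"

# The (( )) test and the slice length are arithmetic contexts as well.
if (( ${#DIRTREE[@]} > NUM )); then
    slice=("${DIRTREE[@]:0:HOWMANYMORE}")
    printf 'would remove: %s\n' "${slice[@]}"
fi
```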

I wasn't aware of the ${array[@]:0:n} functionality, so I would have used a loop to iterate over the elements of the array:

#!/bin/bash
oifs="$IFS"
IFS=$'\0'
dirtree=(whatev)
num=5
howmanymore=$(( ${#dirtree[@]} - num ))
count=0
while (( count < howmanymore ))
do
     rm -rv "${dirtree[$count]}"
     let count++
done

Note 1: Variables should be seen, not heard.
Note 2: The lack of $ before the variable names in the while test is also not a typo.

If echo "${DIRTREE[@]:0:$HOWMANYMORE}" | xargs rm -rv was supposed to get around the argument length limit, I don't think it will have that effect.  Also, the [0-9] type stuff is still globbing, not regex.
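If xargs is wanted anyway, one common pattern is to feed it NUL-terminated names. This is a sketch, not something from the thread: printf is a shell builtin, so the argument limit should not apply to it, and xargs then batches the names into as many rm invocations as needed. The -0 option is available in GNU and BSD xargs but was not part of older POSIX specifications, so treat its availability as an assumption. The directory names below are made up and live in a temporary directory.

```shell
#!/bin/bash
tmp=$(mktemp -d)
# Names containing spaces, to show they survive the pipe intact.
mkdir "$tmp/2012-01-01 old" "$tmp/2012-01-02 old" "$tmp/2012-01-03 new"

DIRTREE=("$tmp"/2012-*/)
HOWMANYMORE=2

# One NUL-terminated name per element; xargs -0 splits only on NUL.
# The -- guards against names that begin with a dash.
printf '%s\0' "${DIRTREE[@]:0:HOWMANYMORE}" | xargs -0 rm -rv --

remaining=("$tmp"/*/)
printf 'left: %s\n' "${remaining[@]}"
rm -r "$tmp"
```

By contrast, echo "${DIRTREE[@]...}" joins the names with spaces and plain xargs splits them again on whitespace, so a path containing a space would be broken apart.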

Last edited by alphaniner (2012-11-28 20:32:08)


But whether the Constitution really be one thing, or another, this much is certain - that it has either authorized such a government as we have had, or has been powerless to prevent it. In either case, it is unfit to exist.
-Lysander Spooner


#5 2012-11-28 20:58:45

mcover
Member
From: Germany
Registered: 2007-01-25
Posts: 134

Re: [Solved]Bash:Manipulating arrays of paths? the right way or not?

Why are you setting IFS?

IFS is only used for word splitting after variable expansion and when using the builtin read. From the Bash manpage:

Pathname Expansion
       After word splitting, unless the -f option has been set, bash scans each word for the characters *, ?, and [.

So you don't need to set IFS, as the pathname expansion will expand to words regardless.
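This is easy to verify with directory names that contain whitespace. The sketch below uses made-up names in a temporary directory and leaves IFS at its default; each match should still end up as exactly one array element.

```shell
#!/bin/bash
tmp=$(mktemp -d)
# One plain name, one with a space, one with a tab.
mkdir "$tmp/2012-11-27" "$tmp/2012-11-28 with space" "$tmp/2012-11-29"$'\t'"tab"

# IFS is untouched: glob results are not subject to word splitting.
dirs=("$tmp"/*/)
count=${#dirs[@]}
echo "$count elements"

# Quoting "${dirs[@]}" keeps each element whole when it is used.
for d in "${dirs[@]}"; do
    printf '<%s>\n' "$d"
done

rm -r "$tmp"
```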


#6 2012-11-28 21:43:41

Thme
Member
From: Raleigh NC
Registered: 2012-01-22
Posts: 105

Re: [Solved]Bash:Manipulating arrays of paths? the right way or not?

OK, so I may have gotten ahead of myself a bit when using IFS. After rereading Greg's wiki on it, it is mainly used when parsing the output of a command into an array, such as find with the -print0 option. I was mainly trying to avoid any mishaps or errors in the script when encountering special characters, but since I'm not parsing a command's output and bash is expanding the directory list with globbing, I can just remove that. Also, the while loop makes so much more sense for iterating over the directories with rm: better than calling xargs externally, and xargs may not know how to split them up in this case. I understand loops and don't know why it didn't occur to me to use one here. So now I'm using alphaniner's example with the IFS stuff removed, as pointed out by mcover.
Again, I really appreciate the help. I'm self-taught, and this has given me a lot that I didn't get from, or understand in, the wikis and tutorials I've been reading.
BTW, alphaniner, I have to say this: I love that signature quote about voting...

Last edited by Thme (2012-11-29 17:11:59)

#7 2012-11-28 23:53:32

alphaniner
Member
From: Ancapistan
Registered: 2010-07-12
Posts: 2,810

Re: [Solved]Bash:Manipulating arrays of paths? the right way or not?

Actually, I think combining what you taught me with a for loop would be even better:

for i in "${DIRTREE[@]:0:$HOWMANYMORE}"
do
     rm -rv "$i"
done

This definitely works.  If it's reliable, it would truly be a breakthrough for my own scripts.  I never really liked having to resort to using a 'count', especially once I learned about ranges

$ echo {0..5}
0 1 2 3 4 5

and found they don't work properly with variables

$ var=5
$ echo {0..$var}
{0..5}

so I couldn't do

var=5
for i in {0..$var}
do
     foo ${array[$i]}
done

Also, thanks for the sig compliment.  Though if I'm honest, it is a bit wordy for my taste.  I just really needed to let off some steam around maintaintheillusionofselfgovernmentelection day...

Last edited by alphaniner (2012-11-28 23:54:04)



#8 2012-11-29 00:26:01

Thme
Member
From: Raleigh NC
Registered: 2012-01-22
Posts: 105

Re: [Solved]Bash:Manipulating arrays of paths? the right way or not?

Wow, it just gets better. I didn't know a for loop would iterate over each element in an array. I understood the while loop, but the for loop is even simpler. Is it possible to use multiple arrays within a single for loop somehow? That would greatly reduce the code and syntax in my current script. Hell, just the for loop method alone cuts my script down a bit, even if I had to use multiple for loops in it.

#9 2012-11-29 01:21:01

alphaniner
Member
From: Ancapistan
Registered: 2010-07-12
Posts: 2,810

Re: [Solved]Bash:Manipulating arrays of paths? the right way or not?

Basic use of a for loop with an entire array is just

for i in "${array[@]}"
do
     ...
done

But you can use shell globbing directly such as for i in * or just provide a list such as for i in a b c d e (that's actually what the shell sees if you do for i in {a..e})

As far as using two arrays in a single loop, it depends on exactly what you want to do.  If you have two arrays where arrayA[0] is meant to be paired with arrayB[0] and so on, the count method works there (and seems a bit less ugly).  Alternately you can nest for loops if you want to use every element of arrayA with every element of arrayB, for example:

a=(1 2 3)
b=(a b c)
for i in "${a[@]}"
do
     for j in "${b[@]}"
     do
          echo "$i $j"
     done
done

would return

1 a
1 b
1 c
2 a
2 b
2 c
3 a
3 b
3 c
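For the paired case mentioned above (arrayA[0] with arrayB[0] and so on), a manual counter is not strictly needed: "${!a[@]}" expands to the list of indices of a, which can drive the loop. A sketch, assuming both arrays have the same indices:

```shell
#!/bin/bash
a=(1 2 3)
b=(a b c)

pairs=()
# "${!a[@]}" yields the indices 0 1 2; use each to read both arrays.
for i in "${!a[@]}"; do
    pairs+=("${a[i]} ${b[i]}")
done
printf '%s\n' "${pairs[@]}"
```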

There's also associative arrays, but I've never worked with them.
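As a rough illustration of associative arrays (bash 4.0 or later), here is a sketch that maps a name to a number. The keys and values are invented for the example. Note that bash does not promise any particular order when iterating over the keys, so they are a poor fit when order matters.

```shell
#!/bin/bash
# Associative arrays must be declared with -A (bash >= 4.0).
declare -A keep
keep[set1]=3     # hypothetical: how many directories to keep per set
keep[set2]=5
keep[set3]=10

# "${!keep[@]}" expands to the keys, in no guaranteed order.
for name in "${!keep[@]}"; do
    echo "$name keeps ${keep[$name]} directories"
done
```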

Last edited by alphaniner (2012-11-29 01:22:01)



#10 2012-11-29 04:59:28

Thme
Member
From: Raleigh NC
Registered: 2012-01-22
Posts: 105

Re: [Solved]Bash:Manipulating arrays of paths? the right way or not?

I'll have to look into associative arrays; I've only read briefly about them. I've used regular globbing in loops before, I just didn't know it would treat an array the same way. This is great: it just greatly improved a few of my scripts, and I gained some insight on how to approach a few other scripting ideas I had with arrays and such.

#11 2012-11-29 10:22:27

p0x8
Member
Registered: 2012-09-20
Posts: 70

Re: [Solved]Bash:Manipulating arrays of paths? the right way or not?

alphaniner wrote:

so I couldn't do

var=5
for i in {0..$var}
do
     foo ${array[$i]}
done

You could, sort of, but it feels dirty:

var=5
for i in $(eval echo {0..$var})
do
     foo ${array[$i]}
done
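An alternative that avoids eval altogether is bash's C-style for loop, whose bounds are arithmetic expressions and can therefore use variables directly. A sketch with a made-up array:

```shell
#!/bin/bash
array=(zero one two three four five)   # example data only
var=3

out=()
# Arithmetic for loop: variables work in the bounds, no eval needed.
for (( i = 0; i <= var; i++ )); do
    out+=("${array[i]}")
done
printf '%s\n' "${out[@]}"
```

Since brace expansion happens before variable expansion, {0..$var} can never see the value of var, which is why the eval trick is needed in the first place.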


#12 2012-11-29 14:05:09

alphaniner
Member
From: Ancapistan
Registered: 2010-07-12
Posts: 2,810

Re: [Solved]Bash:Manipulating arrays of paths? the right way or not?

p0x8 wrote:

You could, sort of, but it feels dirty

True enough.  I knew eval could be used for that purpose, but I didn't know exactly how. 

And so I will now be holding you responsible for any and all abuses of eval in my scripts. :P

Last edited by alphaniner (2012-11-29 14:05:58)



#13 2012-11-29 17:08:09

Thme
Member
From: Raleigh NC
Registered: 2012-01-22
Posts: 105

Re: [Solved]Bash:Manipulating arrays of paths? the right way or not?

So I played with associative arrays, but they weren't exactly suited to combinations like this and weren't as predictable in terms of sorting. This is how I got it to perform an action over a set of arrays, based on the earlier suggestions and playing around with command substitution. It does it in ONE for loop (well, two, but you'll see what I mean):

#!/bin/bash
NUM=3
set1=(/home/maat/dateddirs/set1/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/) 
set2=(/home/maat/dateddirs/set2/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/) 
set3=(/home/maat/dateddirs/set3/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/) 
CT1=$(( ${#set1[@]} - NUM ))
CT2=$(( ${#set2[@]} - NUM ))
CT3=$(( ${#set3[@]} - NUM ))
# Build an array of the directories to act on, i.e. those exceeding
# the number the user wants to keep.
combined=($(for X in "${set1[@]:0:$CT1}" "${set2[@]:0:$CT2}" "${set3[@]:0:$CT3}" ; do echo "$X" ; done))

for Xdirs in "${combined[@]}" 
do
	rm -rv "$Xdirs"
done

I think the command substitution part was neat. I adapted it somewhat from a post on merging arrays in the LQ forums, http://www.linuxquestions.org/questions … es-882286/, mainly the one by "Nominal Animal" in that thread.
Any thoughts on this approach when using multiple arrays?

#14 2012-11-29 20:55:42

aesiris
Member
Registered: 2012-02-25
Posts: 97

Re: [Solved]Bash:Manipulating arrays of paths? the right way or not?

there is a special syntax for appending elements to an array:

array+=(element1 element2 ...)

you can simplify to

NUM=3
combined=()
for dir in set1 set2 set3; do
   new=(/home/maat/dateddirs/"$dir"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/)
   (( count= ${#new[@]} - NUM ))
   combined+=("${new[@]:0:$count}")
done

EDIT
Another comment: the for ... echo approach fails, breaking up any element that contains spaces (or tabs, newlines, etc.).
Compare this:

$ test=("a b" c d)
$ for i in "${test[@]}"; do echo "$i"; done
a b
c
d
$ new=( $(for i in "${test[@]}"; do echo "$i"; done) )
$ for i in "${new[@]}"; do echo "$i"; done
a
b
c
d
$ new=()
$ for i in "${test[@]}"; do new+=("$i"); done
$ for i in "${new[@]}"; do echo "$i"; done
a b
c
d
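When an array really does have to be rebuilt from a command's output, one way to keep elements intact is to separate them with NUL bytes and read them back with mapfile. This is a sketch that assumes bash 4.4 or later, since that is where mapfile gained the -d option; the < <(...) part is process substitution.

```shell
#!/bin/bash
# Elements with a space and an embedded newline, the awkward cases.
test=("a b" c $'d\ne')

# Emit each element NUL-terminated and read them back split on NUL.
# -d '' sets the delimiter to NUL, -t strips it from each element.
mapfile -d '' -t new < <(printf '%s\0' "${test[@]}")

echo "${#new[@]} elements"
for i in "${new[@]}"; do
    printf '<%s>\n' "$i"
done
```

For simply copying or merging arrays that already exist in the shell, the += form shown above remains the simpler choice.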

Last edited by aesiris (2012-11-29 21:04:22)


#15 2012-11-30 02:22:31

Thme
Member
From: Raleigh NC
Registered: 2012-01-22
Posts: 105

Re: [Solved]Bash:Manipulating arrays of paths? the right way or not?

aesiris wrote:

there is a special syntax for appending elements to an array:

array+=(element1 element2 ...)

you can simplify to

NUM=3
combined=()
for dir in set1 set2 set3; do
   new=(/home/maat/dateddirs/"$dir"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/)
   (( count= ${#new[@]} - NUM ))
   combined+=("${new[@]:0:$count}")
done

This works really well. However, in my case I still need the other sets in separate arrays for other purposes in my personal script, so I adapted the approach a little and got this, which now uses the array+=(foo) syntax properly. I was aware of this feature in bash arrays and experimented with it, but had no success adding elements from other arrays until you demonstrated it in your example by adding them through a for loop and doing the arithmetic there. My current one is now very similar. I'm not sure how to reduce it further, if that's possible, but I need the other arrays as I do something completely different with them:

#!/bin/bash
NUM=10
set1=(/home/maat/datedfolders/set1/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/) 
set2=(/home/maat/datedfolders/set2/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/) 
set3=(/home/maat/datedfolders/set3/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/) 

CT1=$(( ${#set1[@]} - NUM ))
CT2=$(( ${#set2[@]} - NUM ))
CT3=$(( ${#set3[@]} - NUM ))

combined=()
for X in "${set1[@]:0:$CT1}" "${set2[@]:0:$CT2}" "${set3[@]:0:$CT3}"
do
	combined+=("$X")
done

for Xdirs in "${combined[@]}" 
do
	rm -rv "$Xdirs"
done

I'm considering changing the title of this thread, as it covers a little more than I originally expected and shows some pitfalls as well as ways to approach array manipulation, which can be applied to a wide range of uses in bash scripting. Others may find totally different uses for what we've discussed here, so...

Last edited by Thme (2012-11-30 10:41:02)

#16 2012-11-30 15:22:11

p0x8
Member
Registered: 2012-09-20
Posts: 70

Re: [Solved]Bash:Manipulating arrays of paths? the right way or not?

Instead of two loops and a temporary array you could just do:

for X in "${set1[@]:0:$CT1}" "${set2[@]:0:$CT2}" "${set3[@]:0:$CT3}" 
do
    rm -rv "$X"
done

Unless you also need the combined list somewhere else.
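Two edge cases are not covered by the scripts in this thread. If a set holds fewer directories than NUM, the count goes negative, and the bash manual describes a negative length in an array slice as an expansion error. If the glob matches nothing, bash by default leaves the pattern as a literal string, so the array holds one bogus element. The sketch below guards against both. prune_oldest is a hypothetical helper name, not something from the thread, and the demo runs in a temporary directory.

```shell
#!/bin/bash
# prune_oldest KEEP BASEDIR  (hypothetical helper for illustration)
# Removes the oldest dated directories under BASEDIR, keeping KEEP.
prune_oldest() {
    local keep=$1 base=$2
    local -a dirs
    local excess d

    # nullglob makes an unmatched pattern expand to nothing.
    # Note: this turns nullglob off afterwards regardless of its
    # earlier state, which is a simplification.
    shopt -s nullglob
    dirs=("$base"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/)
    shopt -u nullglob

    excess=$(( ${#dirs[@]} - keep ))
    # Nothing to do when the count is zero or negative.
    (( excess > 0 )) || return 0

    for d in "${dirs[@]:0:excess}"; do
        rm -rv -- "$d"
    done
}

# Demo: seven directories, keep five, then ask to keep ten.
tmp=$(mktemp -d)
mkdir "$tmp"/2012-11-2{1..7}
prune_oldest 5 "$tmp"
left=("$tmp"/*/)
prune_oldest 10 "$tmp"     # fewer than ten exist, so nothing is removed
left_after=("$tmp"/*/)
echo "${#left[@]} left, then ${#left_after[@]}"
rm -r "$tmp"
```

The helper does not check that KEEP is a number; since NUM comes from a user config in the real script, validating it there would be a sensible addition.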


#17 2012-11-30 16:07:29

Thme
Member
From: Raleigh NC
Registered: 2012-01-22
Posts: 105

Re: [Solved]Bash:Manipulating arrays of paths? the right way or not?

p0x8 wrote:

Instead of two loops and a temporary array you could just do...
...Unless you also need the combined list somewhere else.

Sigh... one of my "duh" moments. I was kind of in a haste testing and rewriting the actual script this is being used in. I'll just mark that up as one of the many times I've overcomplicated things in my own bash ventures. Anyhow, I think enough has been done here on my original post, so I'm marking this as solved.
