My goal is for xargs to call my "process" function and to "process" everything that my find command finds. After wading through pages of Google results, I am no closer to a working solution. Here is an over-simplified version of my script:
#!/bin/bash
process() {
bsize=$(ls -l "{}" | awk '{print $5}')
sqlite3 "{}" vacuum
sqlite3 "{}" reindex
asize=$(ls -l "{}" | awk '{print $5}')
dsize=$(echo "scale=2; ($bsize-$asize)/1048576" | bc)
echo "{}" reduced by $dsize Mbytes
}
find $HOME/.mozilla/firefox/fd3df43Q.default -type f -name '*.sqlite' -print0 | xargs -0 process
Question #1: When run, the xargs complains that it can't find the function. Where is my flaw?
xargs: process: No such file or directory
Question #2: I actually doubt that my usage of "{}" in the process function will work. How can I make the process function generic so that xargs can actually use it with the find?
Last edited by graysky (2012-08-31 11:56:51)
CPU-optimized Linux-ck packages @ Repo-ck • AUR packages • Zsh and other configs
Offline
I looked into this before and could not find anything useful dealing with xargs. It runs its command in a separate process from your shell, so it has no access to your bash functions. You have a couple of options, though:
1) Put "process" into its own file that you can execute with xargs
2) Use a `while read -d $'\0' var` loop to keep it in-process.
As for question number 2, functions get arguments just like regular bash scripts. $1 is the first argument, etc. By default xargs picks a large number of arguments to pass all at once. You can use '-n 1' to specify to use only one input per invocation.
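To make option 1 concrete, here is a minimal sketch. The script path `/tmp/process.sh`, the demo directory, and the echo body are placeholders, not the poster's actual cleanup logic; the point is just that xargs can exec a real file where it cannot see a shell function.

```shell
# Hypothetical standalone version of the "process" function.
# Inside the script, the filename arrives as $1, not as "{}".
cat > /tmp/process.sh <<'EOF'
#!/bin/bash
echo "processing: $1"
EOF
chmod +x /tmp/process.sh

# -print0 / -0 keep filenames with spaces intact; -n 1 runs the
# script once per file, matching the one-file-at-a-time logic.
mkdir -p /tmp/xargs-demo
touch "/tmp/xargs-demo/a b.sqlite"
find /tmp/xargs-demo -type f -name '*.sqlite' -print0 | xargs -0 -n 1 /tmp/process.sh
```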
Offline
Thank you for the advice. I took your suggestion to break it into two files and used the xargs -0 -n 1 construct. It works great now!
Offline
Glad I could be of some help
I should have mentioned: if you want to cut down on the number of bash processes that xargs spawns, it is most likely a trivial change to wrap the logic in the new file in something like
while [[ -n $1 ]]; do
...
shift
done
Then you don't need the -n switch for xargs and it should spawn far fewer processes, though it probably makes no practical difference in this case.
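A runnable sketch of that wrapper (the script name and echo body are just placeholders):

```shell
# One bash process handles the whole batch of filenames that
# xargs packs into a single invocation.
cat > /tmp/process_many.sh <<'EOF'
#!/bin/bash
while [[ -n $1 ]]; do
    echo "processing: $1"
    shift
done
EOF
chmod +x /tmp/process_many.sh

# No -n switch: xargs passes many arguments at once by default.
printf '%s\0' one.sqlite two.sqlite | xargs -0 /tmp/process_many.sh
```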
Offline
I should mention that skydrome provided the solution to using two files: https://github.com/graysky2/profile-cleaner/issues/1
Offline
Or better yet, just inline the script. xargs isn't the problem here, nor is it needed:
#!/bin/bash
find $HOME/.mozilla/firefox/*.default -type f -name '*.sqlite' -exec bash -c $'
for db; do
b=$(stat -c %s "$db");
sqlite3 "$db" \'vacuum; reindex\'
a=$(stat -c %s "$db");
awk -v b="$b" -v a="$a" \'BEGIN { printf "reduced by %.2f MiB\\n", (b-a)/(2**20) }\'
done' _ {} +
Offline
@falconindy - I thought xargs is more efficient than find -exec ?
Offline
If you need xargs, you don't have to create a second script; you can use a command-line parameter to dispatch to the function:
#!/bin/bash
process() {
echo "$1"
echo "$2"
}
# delegate to subprogram
if [ "$1" = "-process" ]; then
shift 1
process "$@"
exit $?
fi
# test subprogram
echo "param1 param2" | xargs $SHELL "$0" -process
Edit: In this case you don't need any extra processes. A bash-loop works:
files=$(find $HOME/.mozilla/firefox/*.default -type f -name '*.sqlite')
for file in $files; do
echo "File: $file"
echo " Do cleanup here..."
done
Last edited by progandy (2012-09-01 14:53:15)
| alias CUTF='LANG=en_XX.UTF-8@POSIX ' |
Offline
Edit: In this case you don't need any extra processes. A bash-loop works:
files=$(find $HOME/.mozilla/firefox/*.default -type f -name '*.sqlite')
for file in $files; do
echo "File: $file"
echo " Do cleanup here..."
done
This breaks as soon as there's whitespace in the filenames. I suppose a while loop works just fine though...
while read -rd '' db; do
# stuff with "$db"
done < <(find ... -print0)
@falconindy - I thought xargs is more efficient than find -exec ?
Why would invoking an extra process to do exactly the same job be more efficient? The way you're using it is single threaded -- there is no parallel processing unless you use the -P flag... but you don't even need to do that...
p() {
# do stuff with "$1"
}
while read -rd '' db; do
p "$db" &
done < <(find ... -print0)
wait
Last edited by falconindy (2012-09-01 15:05:57)
Offline
if you need xargs you won't have to create a second script. you can use a commandline parameter to switch to the function:
... snip ...
Oh, neat. I thought there might be a way to export the function and then have a new bash inherit it or something; didn't even think of doing it this way.
@falconindy: Thanks for pointing out that I missed an '-r' in my while read loop; that could make a big difference.
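For the record, exporting the function does work when the child process is bash: `export -f` puts the function into the environment, and a child bash re-imports it. A minimal sketch (the function name `p` is arbitrary):

```shell
# Define a function and export it to child bash processes.
p() {
    echo "got: $1"
}
export -f p

# xargs spawns bash -c, which inherits the exported function.
# The trailing _ becomes $0 inside bash -c; the filename is $1.
printf '%s\0' file1 file2 | xargs -0 -n 1 bash -c 'p "$1"' _
```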
Offline