How can I calculate the cost of piping something (at least approximately)?
For example, I am using the following command:
time gzip -kfc1 file.tar | ssh armend@xeon-server "cat > /dev/null"
It compresses a file on one computer and sends it to another over a wireless connection (16 Mbps). The entire command finishes in about 15 seconds.
I would like to know how long the "gzip -kfc1 file.tar" and "ssh armend@xeon-server "cat > /dev/null"" portions each take, that is, the compression and the wireless transfer.
How can I separate those two parts? I guess I can estimate the cost of the wireless transfer from the network transfer speed, but I would like to know whether it somehow cuts into the real execution time of the first task.
Running the two tasks separately adds extra overhead, since I write the output of the first task to the drive and then use cat (with a pipe) to send it over the wireless network.
You can't really separate them, since ssh may begin working on the first part of the stream before gzip is done with the last part. The commands are run and the pipes are set up simultaneously, so if ssh appears to wait for gzip to finish, it's only because the pipe is buffered.
If you did run and time them separately, the results would (probably) add up to more than the total time of the piped operation.
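That said, you can get a rough per-stage picture without breaking the pipeline apart. Here is a minimal bash sketch, not a definitive method: the function name time_stages and the log file arguments are made up for illustration, and it relies on bash's `time` keyword reporting to the stderr of the enclosing group. Keep in mind that the "real" figure for gzip includes any time it spends blocked on a full pipe, so the user and sys figures are the better indicator of how much work compression actually did.

```shell
#!/usr/bin/env bash
# Sketch: time both stages of a pipeline while they still run concurrently.
# time_stages is a hypothetical helper, not a standard tool.
TIMEFORMAT='real %R  user %U  sys %S'

# Usage: time_stages INPUT PRODUCER_LOG CONSUMER_LOG CONSUMER_CMD [ARGS...]
time_stages() {
    local input=$1 producer_log=$2 consumer_log=$3
    shift 3
    # Each `time` report goes to the stderr of its brace group, so it is
    # redirected to a separate file. Anything the command itself prints
    # on stderr ends up in the same file.
    { time gzip -fc1 "$input"; } 2> "$producer_log" |
        { time "$@"; } 2> "$consumer_log"
}
```

With the command from the question, the call would look something like: time_stages file.tar gzip.time ssh.time ssh armend@xeon-server "cat > /dev/null" (gzip.time and ssh.time are just example file names).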
Right. I tried to run them separately and the execution times were greater...
I would expect to see little or no cost to piping, since all it should be doing is setting the stdout file handle in the sending program to be the stdin file handle in the receiving program.
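If you want to check that expectation on your own machine, one rough way is to push a known amount of data through a plain pipe with no compression and no network involved. This is only a sketch: pipe_bytes is a made-up helper name, and the 256 MiB size is an arbitrary choice. The throughput should come out far above what a 16 Mbps link can carry, but measure rather than take my word for it.

```shell
#!/usr/bin/env bash
# Sketch: move N MiB of zeros through an ordinary pipe and count what arrives.
# pipe_bytes is a hypothetical helper; the default size is arbitrary.
pipe_bytes() {
    local mib=${1:-256}
    # bs is given in bytes so the same command works with GNU and BSD dd.
    dd if=/dev/zero bs=1048576 count="$mib" 2>/dev/null | wc -c
}

# Wall-clock time for the whole transfer; divide the byte count by the
# "real" figure to get an approximate pipe throughput.
time pipe_bytes 256
```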
If I run the command: time gzip -fc1 file.tar | ssh armend@server "cat > /dev/null" -- it finishes in 15.9 seconds.
If I run the two parts separately, as: time gzip -fc1 file.tar > new; time cat new | ssh armend@server "cat > /dev/null"....
My execution times are ~9 seconds to compress and 15.5 seconds to send.
Right. gzip and ssh run concurrently, so the time for the whole thing to complete is basically just the time it takes for the slower command (ssh, in your case). Naively, you might say the "cost" of piping is at most 400 milliseconds (15.9 seconds piped versus 15.5 seconds for the transfer alone). That ignores (1) the seek time of your hard drive when reading from a file, (2) the difference in the setup times of gzip and ssh, during which ssh may try to read but block instead, and (3) any slowing effects caused by running two CPU-intensive processes concurrently.
However, I doubt your network speed is consistent enough to read much into the difference between 15.9 seconds and 15.5. If you're curious, run it a few dozen times and see how much it varies. I suspect you'll find this variation makes it impossible to measure any pipe overhead. Even ignoring that variable, the difference between times will still be dominated by the issues mentioned parenthetically in the previous paragraph.
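For the "run it a few dozen times" part, something along these lines would do. It is a minimal bash sketch under my own assumptions: repeat_timing is a made-up name, it uses bash's `time` keyword with TIMEFORMAT=%R to get wall-clock seconds, and it assumes a locale that prints a dot as the decimal separator.

```shell
#!/usr/bin/env bash
# Sketch: run a command N times and summarize the wall-clock times.
# repeat_timing is a hypothetical helper, not a standard tool.
# Usage: repeat_timing N CMD [ARGS...]
repeat_timing() {
    local runs=$1 i
    shift
    local TIMEFORMAT=%R   # make `time` print only elapsed seconds
    for ((i = 0; i < runs; i++)); do
        # The command's own output is discarded; only the `time` report,
        # written to the group's stderr, reaches the pipe to awk.
        { time "$@" > /dev/null 2>&1; } 2>&1
    done | awk '
        NR == 1 { min = max = $1 }
        { sum += $1; if ($1 < min) min = $1; if ($1 > max) max = $1 }
        END { if (NR) printf "runs=%d min=%.3f max=%.3f mean=%.3f\n", NR, min, max, sum / NR }'
}
```

Since the thing being timed is a pipeline, wrap it in a shell: repeat_timing 30 bash -c 'gzip -fc1 file.tar | ssh armend@xeon-server "cat > /dev/null"'. If the gap between min and max is much larger than 0.4 seconds, that difference is noise.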
I'm not really sure what you're looking for. What kind of measurement could be considered a "cost" of piping? Difference in setup time perhaps? You might measure that with a homebrew script and a carefully constructed command line, but what good would it do you?
tl;dr - your question is not well defined.