#1 2013-06-17 22:50:08

gauthma
Member
Registered: 2010-02-16
Posts: 222

[Solved] diff vs rsync

I have two folders, on different machines, with the same content. One of the machines has access to both folders through an sshfs mount. The filesystem is the same on both machines: ext4. Running diff -ur a/ b/ yields no output, i.e. the folders' content is the same. However, running "rsync --progress -avz a/ b/" causes ALL the content to start being copied (in this example, b/ is the sshfs mount point). Moreover, the folders are about 1 GB in size, and both contain a lot of files. If I interrupt rsync's copying and resume afterwards, the copying starts ***at the point where it had been interrupted***. So I guess my question is... what is causing rsync to behave in this way? (Not the way in which it resumes where it left off; rather, why does it think that files that are in fact equal, according to diff at least, are different?)

Last edited by gauthma (2013-06-18 00:39:11)

#2 2013-06-17 22:54:52

WonderWoofy
Member
From: Los Gatos, CA
Registered: 2012-05-19
Posts: 8,414

Re: [Solved] diff vs rsync

Access time?  The content of the files may be the same, but the metadata may not be.  If I am not mistaken, rsync will replace older copies of the files with whatever is marked as newer.  I don't think that rsync actually checks the content of the files anyway, just filenames and access times.  But I may be totally wrong here.  What if you do "touch b/*" then try again?

#3 2013-06-17 23:21:24

thisoldman
Member
From: Pittsburgh
Registered: 2009-04-25
Posts: 1,172

Re: [Solved] diff vs rsync

From the rsync man page, under OPTIONS:

-c, --checksum
        This changes the way rsync checks if the files have been changed and are
        in need of a transfer.  Without this option, rsync uses a "quick check"
        that (by default) checks if each file's size and time of last
        modification match between the sender and receiver.  This option changes
        this to compare a 128-bit checksum for each file that has a matching size.

Edit: Seems the -c option could mean lots of disk thrashing because both copies have to be read in full to compute the checksums.
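
To make the difference between the "quick check" and a content comparison concrete, here is a minimal Python sketch. It is not rsync's actual code: the function names are made up for illustration, and the mtime comparison is simplified to whole seconds. It only shows how two files that diff would call identical can still fail a size-plus-mtime test.

```python
import filecmp
import os
import tempfile


def quick_check_differs(src, dst):
    """Simplified model of a quick check: compare size and mtime only.

    Hypothetical helper; mtimes are truncated to whole seconds here.
    """
    s, d = os.stat(src), os.stat(dst)
    return s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime)


def content_differs(src, dst):
    """Byte-for-byte comparison, closer to what diff reports."""
    return not filecmp.cmp(src, dst, shallow=False)


def demo():
    """Create two identical files whose mtimes are one hour apart."""
    with tempfile.TemporaryDirectory() as tmp:
        a = os.path.join(tmp, "a.txt")
        b = os.path.join(tmp, "b.txt")
        for path in (a, b):
            with open(path, "w") as fh:
                fh.write("same content\n")
        # Same bytes, different modification times.
        os.utime(a, (1_000_000_000, 1_000_000_000))
        os.utime(b, (1_000_003_600, 1_000_003_600))
        return quick_check_differs(a, b), content_differs(a, b)


if __name__ == "__main__":
    print(demo())
```

In this model the quick check flags the pair as different while the content comparison does not, which matches the symptom in the first post.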

Last edited by thisoldman (2013-06-17 23:25:00)

#4 2013-06-17 23:27:15

progandy
Member
Registered: 2012-05-17
Posts: 5,286

Re: [Solved] diff vs rsync

Why are you using sshfs? I think it would be better to use ssh directly. This way rsync can create checksums and diffs on the remote system. With sshfs, rsync reads the file from sshfs like a local file. That means it has to be transferred over the network first.

       Access via remote shell:
         Pull: rsync [OPTION...] [USER@]HOST:SRC... [DEST]
         Push: rsync [OPTION...] SRC... [USER@]HOST:DEST

Last edited by progandy (2013-06-17 23:32:17)


| alias CUTF='LANG=en_XX.UTF-8@POSIX ' |

#5 2013-06-18 00:38:56

gauthma
Member
Registered: 2010-02-16
Posts: 222

Re: [Solved] diff vs rsync

Boy can abstractions wreak havoc... Following WonderWoofy's suggestion made all the files on the receiving (remote) end *newer* than the originals (the local machine was "ahead in time" relative to the remote one). Given that I was not using rsync's --update option, the wretched thing just started copying the files all over again, from the beginning (instead of from the point where it had previously stopped). In the end, the solution was to follow progandy's advice and use ssh directly: now everything works as I'd expect it to.
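
For anyone landing here later, this is roughly how I now picture the decision, as a small Python sketch. It is only a model based on my reading of the man page, not rsync's real logic; would_transfer and its parameters are names I made up.

```python
def would_transfer(src_size, src_mtime, dst_size, dst_mtime, update=False):
    """Toy model of whether a file gets copied (hypothetical helper).

    With update=True (modelling --update), a destination file that is
    newer than the source is skipped. Otherwise any mismatch in size or
    modification time triggers a transfer.
    """
    if update and dst_mtime > src_mtime:
        return False
    return src_size != dst_size or src_mtime != dst_mtime


if __name__ == "__main__":
    # Destination was touched, so it is newer than the source.
    print(would_transfer(100, 1000, 100, 2000))               # no --update
    print(would_transfer(100, 1000, 100, 2000, update=True))  # with --update
    # Size and mtime both match: nothing to do.
    print(would_transfer(100, 1000, 100, 1000))
```

Under this model, touching the files in b/ guarantees a timestamp mismatch, so without --update every file is copied again.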

Thank you for the help!
