#1 2014-04-20 20:15:55

htuttle
Member
Registered: 2014-04-20
Posts: 3

Ten bytes difference in hard drive images of same drive

Hi,
  I was recently imaging an 8GB hard drive using dc3dd on Arch Linux and ran into a problem that seems pretty weird to me, and I'm not sure where to begin figuring out what went wrong. Any help would be greatly appreciated, even just a few ideas about possible causes that I could look into on my own. Hopefully this is the right forum for this question, and if not I look forward to the ensuing excoriation.
  So, here's the problem...
  I hook up a 3.5'' IDE cable from my 8GB hard drive to the input of a write-blocker, and then hook up the write-blocker's USB output cable to my laptop running Arch Linux. The device is recognized and I can see it sitting at "/dev/sdb"; it is not mounted and I never mount it. I then plug in a 500GB USB external hard drive, which I can see at "/dev/sdc", and I mount the external drive at "/mnt/ext-drive". Then I image the hard drive with "dc3dd if=/dev/sdb hash=sha1 log=~/mylog.txt hof=/mnt/ext-drive/image1.img", which also calculates the SHA-1 hash of the input and the output. The two hashes match: both are "4cbd040533a2f43fc6691d773d510cda70f4126a".
So, then...
  I hook up a 3.5'' IDE-to-USB adapter cable from my 8GB hard drive directly to my laptop running Arch Linux (no write-blocker this time). The device is recognized and I can see it sitting at "/dev/sdb"; it is not mounted and I never mount it. I then plug in a 500GB USB external hard drive, which I can see at "/dev/sdc", and I mount the external drive at "/mnt/ext-drive". Then I image the hard drive with "dc3dd if=/dev/sdb hash=sha1 log=~/mylog2.txt hof=/mnt/ext-drive/image2.img", which also calculates the SHA-1 hash of the input and the output. The two hashes match each other: both are "a530b3c812dd27d972c709c15744c7ccb3f062eb". Different from the first image, right? Indeed, diff says "binary files differ". cmp shows the differences: only ten bytes differ, at five locations scattered through the middle of one of the partitions, each a word-sized (two-byte) pair...
But, wait, there's more...
  Again, I hook up a 3.5'' IDE cable from my 8GB hard drive to the input of a write-blocker, and then hook up the write-blocker's USB output cable to my laptop running Arch Linux. The device is recognized and I can see it sitting at "/dev/sdb"; it is not mounted and I never mount it. I then plug in a 500GB USB external hard drive, which I can see at "/dev/sdc", and I mount the external drive at "/mnt/ext-drive". Then I image the hard drive with "dc3dd if=/dev/sdb hash=sha1 log=~/mylog.txt hof=/mnt/ext-drive/image3.img", which also calculates the SHA-1 hash of the input and the output. The two hashes match: both are "4cbd040533a2f43fc6691d773d510cda70f4126a". diff says image1.img and image3.img are exactly the same.
Can anyone help me figure out what the hell is going on? Could this problem stem from write-caching on the drive itself, or maybe from caching that Linux does? If this is a software issue with Linux, where would I begin to look in the source to see what is going on and whether the drive is actually being written to even though it is not mounted? Or is this not possible, and a hardware issue more likely? Could the IDE-to-USB cable be at fault? Any ideas would be greatly appreciated. Thank you very much. Cheers,
h
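
The cmp comparison described in this post (ten differing bytes at five locations) can be reproduced with a short script. This is only a sketch under stated assumptions: the function names are made up for illustration, and the 512-byte sector size is a guess that fits most IDE drives but was not stated in the thread. It lists every differing offset and merges neighbouring offsets into runs, so two-byte pairs show up as runs of length 2 together with the sector they fall in.

```python
SECTOR_SIZE = 512  # assumption: typical logical sector size for an IDE drive


def diff_chunk(a, b, base=0):
    """Return (offset, byte_a, byte_b) for each differing position in two
    equal-length chunks; `base` is the file offset of the chunk start."""
    return [(base + i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def group_runs(offsets):
    """Merge sorted, consecutive offsets into (start, length) runs."""
    runs = []
    for off in offsets:
        if runs and off == runs[-1][0] + runs[-1][1]:
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            runs.append((off, 1))
    return runs


def sector_of(offset):
    """Map a byte offset in the image to its sector number."""
    return offset // SECTOR_SIZE


def diff_images(path_a, path_b, chunk_size=1 << 20):
    """Compare two image files chunk by chunk and return all differences."""
    diffs = []
    base = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a, b = fa.read(chunk_size), fb.read(chunk_size)
            if len(a) != len(b):
                raise ValueError("images differ in length")
            if not a:  # both files exhausted
                break
            if a != b:
                diffs.extend(diff_chunk(a, b, base))
            base += len(a)
    return diffs


# Example (paths taken from the post above):
# diffs = diff_images("/mnt/ext-drive/image1.img", "/mnt/ext-drive/image2.img")
# for start, length in group_runs([off for off, _, _ in diffs]):
#     print(start, length, sector_of(start))
```

Knowing the sector numbers makes it possible to re-read just those sectors from the drive instead of re-imaging all 8GB.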

Last edited by htuttle (2014-04-21 22:47:14)

#2 2014-04-21 16:05:23

R00KIE
Forum Fellow
From: Between a computer and a chair
Registered: 2008-09-14
Posts: 4,734

Re: Ten bytes difference in hard drive images of same drive

If I had to guess I'd point my finger at the write blocker first. If you can connect the disk to a native IDE interface and image it again, I'd say the image done that way should serve as reference.

You can also try to read the same bytes directly from the disk, once through the write blocker and once through the IDE-to-USB adapter, and check what comes out. Compare the results with what you have in the images. I would also check whether there are any SMART errors that could point to a failing disk.
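
Reading the same bytes directly from the disk could look roughly like the sketch below. The helper names are hypothetical, and the offsets would have to come from the cmp output (none are given in the thread). The device is opened read-only, and reading a raw device node such as /dev/sdb normally requires root.

```python
import os


def read_span(path, offset, length):
    """Read `length` bytes starting at byte `offset` from a device node or
    image file. The file is opened read-only, so nothing is written."""
    fd = os.open(path, os.O_RDONLY)
    try:
        data = b""
        while len(data) < length:
            # pread may return fewer bytes than requested, so loop.
            chunk = os.pread(fd, length - len(data), offset + len(data))
            if not chunk:  # reached the end of the device or file
                break
            data += chunk
        return data
    finally:
        os.close(fd)


def same_span(paths, offset, length):
    """Return True if every path yields identical bytes for the span."""
    spans = [read_span(p, offset, length) for p in paths]
    return all(s == spans[0] for s in spans)


# Example with a hypothetical offset taken from cmp:
# print(read_span("/dev/sdb", 123456, 2).hex())
# print(same_span(["/dev/sdb", "/mnt/ext-drive/image1.img"], 123456, 2))
```

Running this a few times through each adapter would show whether the suspect bytes read back consistently or change from one read to the next.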

I have no experience with write blockers, so I can't be of more help. I'm not sure you will get much help here since it is a very specific topic. You may want to look for forums specializing in data recovery / forensic analysis, since that seems to be what you are trying to do.


R00KIE
Tm90aGluZyB0byBzZWUgaGVyZSwgbW92ZSBhbG9uZy4K

#3 2014-04-21 22:46:47

htuttle
Member
Registered: 2014-04-20
Posts: 3

Re: Ten bytes difference in hard drive images of same drive

Thanks for the response, R00KIE,

  I'll try a computer forensics forum for help with the write-blocker and IDE-to-USB cable issue.

  But, regarding Arch/Linux, do you know whether any data should/does/could get written to an unmounted drive when an external USB drive is initially plugged in (but not mounted)? I know Arch does not mount partitions by default, so I assumed no data should ever get written anywhere on the external USB drive, but could such a thing somehow happen just by plugging the drive in?

Cheers,
htuttle

#4 2014-04-22 00:39:46

R00KIE
Forum Fellow
From: Between a computer and a chair
Registered: 2008-09-14
Posts: 4,734

Re: Ten bytes difference in hard drive images of same drive

I think it might happen if you are dealing with LVM or RAID.

For LVM, I think that if there is an LVM VG on the disk, the LVs in that VG will be made available automatically. I don't know whether this implies any writes to the disk to update LVM metadata.

For RAID arrays I suppose there are some cases where write operations may be initiated automatically, besides the possible metadata update when the array is assembled. I'm thinking of cases where the machine was shut down before a rebuild/resync could finish, which might then be resumed automatically once all RAID members are present (I'm not really sure about this one).

I don't know if there are any filesystems for which the kernel might resume unfinished background operations as soon as the filesystem is detected. The only example I am sure about, ext4's lazy init, requires you to mount the partition for the background operation to resume.

Except for when something tries to be smart and automount partitions, I'd say that it should be quite safe to plug the disk directly if you are dealing with a disk with "simple" partitions.
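
Whether the kernel actually issued any writes to the unmounted disk can be checked empirically rather than guessed. The kernel keeps per-device I/O counters in /sys/block/<dev>/stat, where the fifth field is the number of completed write requests and the seventh is the number of sectors written. The sketch below uses hypothetical helper names and assumes the source disk is "sdb" as in the first post. Note that these counters only cover requests that went through the kernel's block layer, so they say nothing about what an adapter or the drive firmware might do on its own, and they start again from zero when the device is re-plugged.

```python
def parse_block_stat(text):
    """Extract the write counters from the contents of /sys/block/<dev>/stat.

    Fields are whitespace-separated; index 4 is completed write requests
    and index 6 is sectors written."""
    fields = [int(x) for x in text.split()]
    return {"write_ios": fields[4], "write_sectors": fields[6]}


def read_write_counters(dev):
    """Read the current write counters for a block device such as 'sdb'."""
    with open(f"/sys/block/{dev}/stat") as f:
        return parse_block_stat(f.read())


# Example: check right after plugging in, and again after imaging.
# print(read_write_counters("sdb"))
```

If both counters are still zero after the imaging run, the kernel did not submit any writes to that device during the session.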

If you want to do an image for forensic reasons then a write blocker is a must as I suppose you have to prevent any kind of data modification.

On another note, I may be terribly wrong, so don't take my words as authoritative in any way ;) Maybe someone else will chime in and either confirm or deny what I have just written.


R00KIE
Tm90aGluZyB0byBzZWUgaGVyZSwgbW92ZSBhbG9uZy4K

#5 2014-04-22 08:13:24

mich41
Member
Registered: 2012-06-22
Posts: 796

Re: Ten bytes difference in hard drive images of same drive

It may have been some random error of this second adapter. If something had been written to this HDD, then re-reading it using the first adapter shouldn't give the same result as before. It's unlikely that modifications stayed in the HDD cache and weren't flushed before power-down, because the drive had lots of time to flush them.
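
The reasoning in the paragraph above can be written down as a small decision rule over the three image hashes. This is a sketch with made-up function names, not part of dc3dd: it hashes an image in chunks and then interprets a trusted read, a suspect read, and a trusted re-read.

```python
import hashlib


def sha1_of(path, chunk_size=1 << 20):
    """Compute the SHA-1 hex digest of a file without loading it whole."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def classify(before, suspect, after):
    """Interpret three hashes: trusted read, suspect read, trusted re-read."""
    if before == suspect == after:
        return "all reads agree"
    if before == after:
        # The disk reads the same before and after, so nothing persistent
        # changed on it; the odd one out was a bad read.
        return "suspect read was wrong; disk contents unchanged"
    if suspect == after:
        return "disk contents changed between the first and second read"
    return "inconclusive: all three reads differ"


# Hashes reported in the first post for image1, image2 and image3:
# classify("4cbd040533a2f43fc6691d773d510cda70f4126a",
#          "a530b3c812dd27d972c709c15744c7ccb3f062eb",
#          "4cbd040533a2f43fc6691d773d510cda70f4126a")
```

With the hashes from the first post, the rule lands in the second branch, which matches the conclusion that nothing persistent was written to the drive.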

FWIW, I've imaged a few HDDs and memory cards on Arch and I've never seen any spontaneous modifications. And, again FWIW, I've had one USB-IDE adapter go crazy during operation and start responding with errors to every I/O request, so it obviously wasn't quite rock solid. So much for my anecdotal "evidence" :)

Have you used these adapters in the past? Do you trust them?

Last edited by mich41 (2014-04-22 08:46:57)

#6 2014-04-23 15:57:38

htuttle
Member
Registered: 2014-04-20
Posts: 3

Re: Ten bytes difference in hard drive images of same drive

mich41 wrote:

It may have been some random error of this second adapter. If something had been written to this HDD, then re-reading it using the first adapter shouldn't give the same result as before. It's unlikely that modifications stayed in the HDD cache and weren't flushed before power-down, because the drive had lots of time to flush them.

FWIW, I've imaged a few HDDs and memory cards on Arch and I've never seen any spontaneous modifications. And, again FWIW, I've had one USB-IDE adapter go crazy during operation and start responding with errors to every I/O request, so it obviously wasn't quite rock solid. So much for my anecdotal "evidence" :)

Have you used these adapters in the past? Do you trust them?

I've used both adapters in the past. The one without the write-blocker I have used to make images of drives for backups, and I have re-created functioning backups from those images (but I didn't check that *every* byte was exactly the same as the original, say with an md5 hash). The adapter with the write-blocker is newer, and I've used it to make images that are byte-for-byte the same...

So, it's looking more and more like the culprit is probably the USB-to-IDE adapter cable...

Just for my own education, do you guys know where in the source I would look to find out more about how a hot-plugged external USB hard drive is recognized? Is that the job of the kernel, or is it some utility's job? Cheers,
h

#7 2014-04-23 20:18:20

R00KIE
Forum Fellow
From: Between a computer and a chair
Registered: 2008-09-14
Posts: 4,734

Re: Ten bytes difference in hard drive images of same drive

htuttle wrote:

Just for my own education, do you guys know where in the source I would look to find out more about how a hot-plugged external USB hard drive is recognized? Is that the job of the kernel, or is it some utility's job?

If I'm not wrong, you would have to look into the kernel and udev.
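
One way to see the hand-off between the two without reading any source: when the kernel registers a disk it exposes the properties of the resulting device event as KEY=value lines in /sys/block/<dev>/uevent, and udev works from the same kind of data. The sketch below, with a hypothetical helper name and illustrative values for a disk at sdb, just parses that text into a dictionary.

```python
def parse_uevent(text):
    """Parse the KEY=value lines of a sysfs uevent file into a dict."""
    return dict(line.split("=", 1) for line in text.splitlines() if "=" in line)


def read_uevent(dev):
    """Read and parse /sys/block/<dev>/uevent for a disk such as 'sdb'."""
    with open(f"/sys/block/{dev}/uevent") as f:
        return parse_uevent(f.read())


# Illustrative contents for a whole-disk device named sdb:
# parse_uevent("MAJOR=8\nMINOR=16\nDEVNAME=sdb\nDEVTYPE=disk\n")
```

Watching the kernel log and the udev event stream while plugging the drive in shows the same sequence live.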


R00KIE
Tm90aGluZyB0byBzZWUgaGVyZSwgbW92ZSBhbG9uZy4K
