
#1 2013-08-22 12:15:53

oojeiph01
Member
Registered: 2013-08-22
Posts: 3

[solved] Data corruption while reading from disk

Hi

Recently I had problems copying big files on my system, and then I noticed some strange behaviour.
I booted an Arch Live CD (so it should be a clean system), mounted one of my hard drives and did the following:

dd if=/dev/urandom of=bigfile bs=1M count=2048
echo 3 > /proc/sys/vm/drop_caches
md5sum bigfile
echo 3 > /proc/sys/vm/drop_caches
md5sum bigfile

And I got two different checksums!

I tried it on different hard drives so I guess it is not a hard drive failure.
It happens only when processing big files (> 1GB).
I ran Memtest86 and it said everything is ok.

Now I think it could be a hardware failure of the mainboard.
But before exchanging parts of the system, I'd like to ask whether any of you has advice for me or can tell me what could cause such behaviour.

Among other things I use this system to run virtual machines. And if I do the same thing in a virtual machine (running on the same system, on the same hard drive) everything is fine.
It's a desktop PC with an Asus P5Q mainboard and an Intel Core2Duo processor. I used the Arch Linux 2013-04-01 Live CD.

Thanks for any help, and let me know if you need further information.

Greets

Last edited by oojeiph01 (2013-08-22 17:31:14)


#2 2013-08-22 13:18:21

alphaniner
Member
From: Ancapistan
Registered: 2010-07-12
Posts: 2,810

Re: [solved] Data corruption while reading from disk

According to the drop_caches entry in the sysctl docs, "the user should run sync first".

When I'm doing anything with dd I always just run sync afterwards. The purpose of sync is to "flush file system buffers". So when the prompt returns after running sync, it's a pretty good indication that the buffers have been flushed. The same isn't true of the echo command. When your command prompt is returned after that command, it just means the echo command completed. It has nothing to do with whether or not all data has been successfully flushed to disk.
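A minimal sketch of that ordering, for illustration only: `flush_and_drop_caches` is a hypothetical helper name, not something from the sysctl docs. It runs sync first and only then writes to drop_caches, and it assumes a Linux system where /proc/sys/vm/drop_caches is writable by root.

```shell
#!/bin/sh
# Hypothetical helper: flush dirty buffers, then drop clean caches.
# Returns 0 if caches were dropped, 1 if drop_caches is not writable
# (for example when not running as root).
flush_and_drop_caches() {
    # sync blocks until buffered writes have been flushed.
    sync
    if [ -w /proc/sys/vm/drop_caches ]; then
        # 3 = free the page cache plus dentries and inodes.
        echo 3 > /proc/sys/vm/drop_caches
    else
        echo "drop_caches is not writable; run this as root" >&2
        return 1
    fi
}
```

Calling `flush_and_drop_caches` after the dd and before each md5sum should give the effect described above, assuming it is run as root.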

Last edited by alphaniner (2013-08-22 13:19:05)



#3 2013-08-22 13:41:22

oojeiph01
Member
Registered: 2013-08-22
Posts: 3

Re: [solved] Data corruption while reading from disk

alphaniner wrote:

According to the drop_caches entry in the sysctl docs, "the user should run sync first".

When I'm doing anything with dd I always just run sync afterwards. The purpose of sync is to "flush file system buffers". So when the prompt returns after running sync, it's a pretty good indication that the buffers have been flushed. The same isn't true of the echo command. When your command prompt is returned after that command, it just means the echo command completed. It has nothing to do with whether or not all data has been successfully flushed to disk.

Thanks for your quick answer. Of course you are right and I should have used sync after generating the random file. But the purpose of "echo 3 > /proc/sys/vm/drop_caches" was to make sure that the file is read from disk and not from cache in memory. I retried it using the sync command:

# dd if=/dev/urandom of=bigfile bs=1M count=2048
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 144.327 s, 14.9 MB/s

# sync

# md5sum bigfile 
b551b97bf112d882e33a0835ba5c8f07  bigfile

# md5sum bigfile 
b551b97bf112d882e33a0835ba5c8f07  bigfile

# echo 3 > /proc/sys/vm/drop_caches 

# md5sum bigfile 
efc158b842012da157396ef65280787e  bigfile

# echo 3 > /proc/sys/vm/drop_caches 

# md5sum bigfile 
a5ce3538d0a7ad3264fadd2d082878d1  bigfile

As you can see, while the file is still in cache, md5sum reports the same checksum, as expected. But each time I clear the cache, forcing the data to be read from disk, I get a different result. It makes no sense to me, because the system runs fine apart from this problem. Why are there no errors when booting the system or running a VM, which also requires reading a lot of data?
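For anyone who wants to repeat the check above more than twice, here is a rough sketch. `count_distinct_checksums` is a made-up helper name, and the default of 5 runs is an arbitrary assumption. It reads the file several times, dropping the cache before each read when it is allowed to, and prints how many different MD5 sums it saw. A healthy system should print 1.

```shell
#!/bin/sh
# Hypothetical helper: read FILE a number of times (default 5) and print
# how many distinct MD5 sums were observed. 1 means every read agreed.
count_distinct_checksums() {
    file=$1
    runs=${2:-5}
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Drop the page cache when permitted (root only), so the next
        # read comes from disk instead of memory.
        if [ -w /proc/sys/vm/drop_caches ]; then
            sync
            echo 3 > /proc/sys/vm/drop_caches
        fi
        # Print only the hash, not the file name.
        md5sum < "$file" | cut -d' ' -f1
        i=$((i + 1))
    done | sort -u | wc -l | tr -d ' '
}
```

Example use, as root: `count_distinct_checksums bigfile 5`. Without root the cache is not dropped, so the reads may be served from memory and the result says little about the disk path.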


#4 2013-08-22 17:30:54

oojeiph01
Member
Registered: 2013-08-22
Posts: 3

Re: [solved] Data corruption while reading from disk

OK, it looks like Memtest didn't run long enough, or it's some other kind of fault. Anyway, the problem seems to be fixed if I use other RAM modules...

