#1 2010-07-02 08:26:23

Spacenick
Member
From: Germany
Registered: 2010-04-02
Posts: 168

EXT4 Read Corruption

Hello Arch community,
I've got a really weird problem. At first I thought it was just ordinary bugs making some applications crash after a reboot (probably since the last kernel + glibc update).
But further investigation shows that the situation is actually the following:

On every reboot my rootfs (ext4) mounts normally, but once I log in some files are read back corrupted.
This makes several applications crash because they can't read their libs; they die with segmentation faults, illegal hardware instructions, or because they can't read the gconf data. It also happened that my wallpaper didn't load anymore.
This seems to happen only to files on my ext4 partition, which mainly holds non-home stuff.

The really weird thing is that with every reboot different files are read corrupted, and the files that were corrupted on the previous boot seem to be OK again. I know it sounds crazy, but I could actually verify this by md5summing a lib that led to a program crash and rebooting (without changing anything); after the reboot it had a different md5 hash. Files written to disk don't seem to get corrupted: I wrote a random file with dd, tee'ed the output to md5sum, and it was still not corrupted after several reboots. I also couldn't make it happen to any of 1000 random data files I had created for this purpose. But it seems to hit files on the partition quite randomly.
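For anyone who wants to repeat the md5sum comparison over a whole directory instead of a single lib, here is a minimal sketch of the idea in Python. The function names (`md5_of`, `snapshot`, `changed_files`) are made up for this post, and it assumes the result of one boot is saved somewhere trustworthy and compared against a fresh snapshot after the next reboot; it is not the exact procedure I used.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 of one file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root):
    """Map every regular file below root to its MD5 (symlinks are skipped)."""
    hashes = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_symlink() or not path.is_file():
            continue
        try:
            hashes[str(path)] = md5_of(path)
        except OSError:
            # Unreadable file: leave it out rather than abort the whole run.
            continue
    return hashes


def changed_files(old, new):
    """List paths present in both snapshots whose hashes differ."""
    return sorted(p for p in old.keys() & new.keys() if old[p] != new[p])
```

Typical use would be to dump `snapshot("/usr/lib")` to a file with `json`, reboot, take a second snapshot and print `changed_files(previous, current)`. Comparing twice within the same boot probably says little, since the second read may simply come from the page cache.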

I also had zsh crash with a "Bus error", only for it to work normally on the next reboot.
One possibly important thing I noticed is that on roughly every second reboot fsck runs and reports a filesystem with errors. I also had the following in my everything.log:

Jun 30 00:33:34 mercury kernel: EXT4-fs error (device sda6): htree_dirblock_to_tree: bad entry in directory #1057298: directory entry across blocks - block=4203654offset=0(0), inode=1936016475, rec_len=226408, name_len=111
Jun 30 00:33:34 mercury kernel: EXT4-fs error (device sda6): htree_dirblock_to_tree: bad entry in directory #1057298: directory entry across blocks - block=4203654offset=0(0), inode=1936016475, rec_len=226408, name_len=111
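As an aside, the numbers in that log line can be packed back into the bytes they were presumably read from. The sketch below assumes the usual on-disk directory entry header (32-bit inode, 16-bit rec_len, 8-bit name_len, little-endian) and that the kernel printed rec_len after unpacking it as `(raw & 65532) | ((raw & 3) << 16)`; both are assumptions on my side, not something the log states.

```python
import struct

# Values copied from the EXT4-fs error line above.
inode = 1936016475
rec_len = 226408
name_len = 111

# Assumption: invert the kernel's rec_len unpacking to get the raw 16-bit field.
raw_rec_len = (rec_len & 0xFFFC) | (rec_len >> 16)

# Repack in on-disk order, little-endian, without padding.
header = struct.pack("<IHB", inode, raw_rec_len, name_len)
print(header)  # b'[Deskto'
```

If those assumptions hold, the "directory entry" starts with the text `[Deskto`, which looks like the beginning of some text file rather than directory data. That would fit the picture of blocks being read back with the wrong content, but it is only a guess.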

I also tried booting kernel-lts several times, but the problem happens there as well.
However, Mac OS X on the same machine runs without problems, and from a LiveCD I couldn't find anything weird with the partition either: fsck says it's OK, and even chrooting into it and running several of the applications that sometimes crash worked there. It also happens that on some boots everything seems to work.

This is really the craziest bug I've ever encountered and I'm left with no idea what could cause it.
Maybe someone here has a fresh idea; otherwise I'll probably report it to LKML.
Greetings, Niklas

EDIT: This could be related, though I'm not using LVM: http://lkml.org/lkml/2010/5/20/221

Last edited by Spacenick (2010-07-02 08:27:15)
