#26 2012-10-24 19:00:33

graysky
Member
From: The worse toilet in Scotland
Registered: 2008-12-01
Posts: 8,818

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

Ted released a patch in this post which he believes may fix the issue. He goes on to write, "...we know that my patch definitely restores the behaviour previous to commit eeecef0af5, so it can't hurt, but we do want to make 100% sure that it really fixes the problem."

I have patched this into 3.6.3 just fine.

Last edited by graysky (2012-10-24 19:06:38)


CPU-optimized Linux-ck packages @ Repo-ck • AUR packages • Zsh and other configs

#27 2012-10-24 19:10:49

headkase
Member
From: Canada
Registered: 2011-12-06
Posts: 1,583

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

graysky wrote:

Ted released a patch in this post which he believes may fix the issue. He goes on to write, "...we know that my patch definitely restores the behaviour previous to commit eeecef0af5, so it can't hurt, but we do want to make 100% sure that it really fixes the problem."

I have patched this into 3.6.3 just fine.

Do you mean you patched it into your Linux-CK kernel repo you maintain?


We all make choices, but in the end, our choices make us.

#28 2012-10-24 19:15:40

graysky
Member
From: The worse toilet in Scotland
Registered: 2008-12-01
Posts: 8,818

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

@headkase - No. I haven't updated the AUR or the repo with Ted's patch, which restores the behaviour from before commit eeecef0af5.

It should be fine since, in Ted's own words, "...we know that my patch definitely restores the behaviour previous to commit eeecef0af5, so it can't hurt, but we do want to make 100% sure that it really fixes the problem." But I am still reluctant to push it since this seems to be an evolving situation... perhaps I'll change my mind. I dunno.

Last edited by graysky (2012-10-24 19:26:34)


#29 2012-10-24 19:16:26

headkase
Member
From: Canada
Registered: 2011-12-06
Posts: 1,583

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

OK, thank you for the clarification, graysky. :)


#30 2012-10-24 19:46:58

graysky
Member
From: The worse toilet in Scotland
Registered: 2008-12-01
Posts: 8,818

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

Consulting higher powers :)

https://bugs.archlinux.org/task/32204


#31 2012-10-24 19:48:51

Janarto
Member
From: Paris
Registered: 2008-09-23
Posts: 80

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

https://lkml.org/lkml/2012/10/23/690

Ted Ts'o is pushing a new patch, but beware of the latest kernels and of the changes that were backported.

#32 2012-10-24 19:50:17

karol
Archivist
Registered: 2009-05-06
Posts: 25,433

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

#33 2012-10-24 20:11:16

graysky
Member
From: The worse toilet in Scotland
Registered: 2008-12-01
Posts: 8,818

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

--Deleted due to thread merge--

Last edited by graysky (2012-10-24 23:18:46)


#34 2012-10-24 20:22:35

Morn
Member
Registered: 2012-09-02
Posts: 365

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

Is 3.5.6 also affected by this bug or did the kernel developers backport it to 3.5.7 only? If 3.5.6 is also bad, I guess I'd need to downgrade even further.

#35 2012-10-24 20:25:29

Inxsible
Forum Fellow
From: Chicago
Registered: 2008-06-09
Posts: 9,079

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

Janarto's thread merged.


Forum Rules

There's no such thing as a stupid question, but there sure are a lot of inquisitive idiots !

#36 2012-10-24 20:45:30

graysky
Member
From: The worse toilet in Scotland
Registered: 2008-12-01
Posts: 8,818

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

Inxsible wrote:

Janarto's thread merged.

Ah, now it looks like I double posted ;p


#37 2012-10-24 20:47:15

jrussell
Member
From: Cape Town, South Africa
Registered: 2012-08-16
Posts: 510

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

Sorry for a maybe obvious question, but what would the symptoms be? Would you just not be able to boot? Or would the checking fail? Or would you not even realise what has happened until later?


bitcoin: 1G62YGRFkMDwhGr5T5YGovfsxLx44eZo7U

#38 2012-10-24 20:49:13

headkase
Member
From: Canada
Registered: 2011-12-06
Posts: 1,583

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

jrussell wrote:

Sorry for a maybe obvious question, but what would the symptoms be? Would you just not be able to boot? Or would the checking fail? Or would you not even realise what has happened until later?

Your filesystem would be corrupted, with unpredictable results. It might be recovered, or you might have lost your high-resolution scan of the Mona Lisa for good.


#39 2012-10-24 20:54:56

Janarto
Member
From: Paris
Registered: 2008-09-23
Posts: 80

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

Inxsible wrote:

Janarto's thread merged.

Thanks, sorry for the inconvenience

#40 2012-10-24 21:06:56

mcover
Member
From: Germany
Registered: 2007-01-25
Posts: 133

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

Morn wrote:

Is 3.5.6 also affected by this bug or did the kernel developers backport it to 3.5.7 only? If 3.5.6 is also bad, I guess I'd need to downgrade even further.

3.5.6 does not have said commit "jbd2: don't write superblock when if its empty", therefore I would assume it is fine. If you look at the git log [1], the tag for 3.5.6 comes before that commit.

[1] http://git.kernel.org/?p=linux/kernel/g … 3.5.y;pg=1
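For a quick local check of the running kernel against the releases discussed here, a minimal Python sketch follows. It assumes the suspect releases are exactly the two named in the thread title (3.6.2 and 3.6.3); the names `SUSPECT_RELEASES`, `parse_release` and `is_suspect` are made up for illustration, and the list would need extending if other stable releases turn out to carry the backport.

```python
import platform
import re

# Assumption: only the releases named in the thread title are listed here.
SUSPECT_RELEASES = {(3, 6, 2), (3, 6, 3)}


def parse_release(release):
    """Return (major, minor, patch) from a string such as '3.6.3-1-ARCH', or None."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", release)
    if match is None:
        return None
    # Missing components (e.g. '3.6') are treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def is_suspect(release):
    """True if the release parses to one of the versions listed above."""
    return parse_release(release) in SUSPECT_RELEASES


if __name__ == "__main__":
    running = platform.release()  # same string that `uname -r` reports
    verdict = "listed as suspect" if is_suspect(running) else "not in the suspect list"
    print(f"{running}: {verdict}")
```

This only compares version numbers; it cannot tell whether a distribution kernel has already been patched or has the commit reverted.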

#41 2012-10-24 21:40:18

Morn
Member
Registered: 2012-09-02
Posts: 365

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

Thanks, mcover! I suppose I'll stick with 3.5.6 in that case until 3.7 comes out. I had originally downgraded from 3.6 because of network issues, which may have been entirely unrelated to the 3.6 kernel of course. But I'm still glad I reverted early. Better safe than sorry where the filesystem is concerned. I've seen quite a few hard drives that were totaled by ReiserFS 3, so I was really hoping EXT4 would prove rock-solid by comparison. :)

#42 2012-10-25 02:49:16

ontobelli
Member
From: Mexico City
Registered: 2011-02-06
Posts: 127

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

It looks like my original analysis may not have been correct. At least, Eric and I haven't been able to figure out a way to trigger the problem based on my hypothesis of what had been going wrong. Still, the commit in question *does* change things, and so it's still the most likely culprit. (There were no ext4-related changes between v3.6..v3.6.1 and v3.6.2..v3.6.3, and I've looked at all of the changes between v3.6.2 and v3.6.3; all of the other changes look innocuous.) I have a patch (sent around 1:23 am Eastern on Wed., Oct. 24th to the ext4 list on the relevant mail thread) which should revert the problematic change in behavior, as well as put in a check which looks for the original conditions which might have triggered the patch, and prints a warning plus a stack trace so we can really understand what is going on. I don't want to consider this fixed until we have a reproduction case, so we can state with 100% certainty that we understand how it was triggered, and so we know that the proposed patch really does fix things.

That being said, please note that Fedora 17 is apparently on 3.6.2, and so far we only have two users who have reported the problem (or more specifically, both have reproduced file system corruptions with very similar symptoms, one running v3.6.2 and one running v3.6.3). The fact that they have reported the problem on very different hardware (one using a USB stick, the other using a Software RAID-5 setup), means it's not likely a hardware-induced problem. However, this could potentially just be bad luck, since the fs corruption that was reported could have been explained by a random hardware glitch. With two users reporting it, though, we have to treat it as potentially a real bug, and so I've gone back and re-audited all of the ext4 related commits that went into the v3.6.x stable kernel series.

If you think you have a related, similar bug, please check which kernel version you are using, and get the EXT4 error messages from the syslogs, and report it to me and the ext4 list. And if you can reproduce it reliably, I definitely want to hear from you. :-)

Thanks!!

-- Ted

http://phoronix.com/forums/showthread.p … post293374

FYI.
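To collect the EXT4 error messages Ted asks for, something like the Python sketch below may help. It assumes ext4 errors show up in the kernel log with the prefix "EXT4-fs error"; the function name `ext4_error_lines` and the sample lines are hypothetical and only there to show the filtering.

```python
import re

# Assumption: ext4 error reports in the kernel log contain this prefix.
EXT4_ERROR = re.compile(r"EXT4-fs error")


def ext4_error_lines(lines):
    """Return the log lines that contain an ext4 error message."""
    return [line.rstrip("\n") for line in lines if EXT4_ERROR.search(line)]


# Hypothetical sample lines, for illustration only (not real log output).
sample = [
    "Oct 24 19:00:01 host kernel: EXT4-fs (sda1): mounted filesystem\n",
    "Oct 24 19:05:12 host kernel: EXT4-fs error (device sda1): example message\n",
]

for line in ext4_error_lines(sample):
    print(line)
```

On a real system you would pass an open log file (or saved `dmesg` output) instead of `sample`; where the kernel log lives depends on your syslog setup, so that path is left out here.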

#43 2012-10-25 03:05:17

JLloyd13
Member
Registered: 2012-06-24
Posts: 107

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

Will this (possible) bug affect all mounted ext4 filesystems or just root? I ask because my / is btrfs but my /home is ext4, and others might be in a similar position. Time to back up.

#44 2012-10-25 05:22:31

dolby
Member
From: 1992
Registered: 2006-08-08
Posts: 1,581

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

When I woke up, my system was mostly unresponsive, and when I tried to log in on a VT I got an I/O error for /dev/sda.
A hard reboot and the journal recovery at startup made that go away, but I think it must be because of this bug.
As there is nothing in the logs, how do you check for corrupted data?


There shouldn't be any reason to learn more editor types than emacs or vi -- mg (1)
You learn that sarcasm does not often work well in international forums. That is why we avoid it. -- ewaller (arch linux forum moderator)

#45 2012-10-25 07:38:02

felixonmars
Developer/TU
From: Wuhan, China
Registered: 2011-04-15
Posts: 62

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

I was lucky to stick with 3.5.6 because of the compcache (zram) issue :P
Just waiting for the two bugs to be solved.


PGP key: 30D7CB92
Key fingerprint: B597 1F2C 5C10 A9A0 8C60  030F 786C 63F3 30D7 CB92

#46 2012-10-25 08:50:34

blackout23
Member
Registered: 2011-11-16
Posts: 780

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

I don't power down my laptop anyway. Tried btrfs a month ago but the rollback feature did not work quite right following the wiki.

#47 2012-10-25 12:28:56

punkrockguy318
Member
From: New Jersey
Registered: 2004-02-15
Posts: 707

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

switching to linux-lts was a simple workaround for the time being for me


If I have the gift of prophecy and can fathom all mysteries and all knowledge, and if I have a faith that can move mountains, but have not love, I am nothing.   1 Corinthians 13:2

#48 2012-10-25 12:44:19

jrussell
Member
From: Cape Town, South Africa
Registered: 2012-08-16
Posts: 510

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

punkrockguy318 wrote:

switching to linux-lts was a simple workaround for the time being for me

My shutdowns with systemd and linux-lts are much much slower than with 3.6.2, any chance you are on systemd and have slower shutdowns with the lts?


#49 2012-10-25 13:32:05

89c51
Member
Registered: 2012-06-05
Posts: 677

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

It seems it's not as scary as we first thought.

https://plus.google.com/117091380454742 … cc5tMiCgq7

#50 2012-10-25 15:13:10

Grinch
Member
Registered: 2010-11-07
Posts: 265

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

89c51 wrote:

It seems it's not as scary as we first thought.

https://plus.google.com/117091380454742 … cc5tMiCgq7

Yes, filesystem corruption is a really nasty prospect, and it's a good thing this problem has been identified. Thankfully, though, it seems very hard to get bitten by this.
