
#51 2012-10-25 15:29:05

dash
Member
Registered: 2012-02-12
Posts: 8

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

What if I mount my ext4 filesystem using "ext3" as the filesystem type?

Offline

#52 2012-10-25 16:08:52

oldtimeyjunk
Member
From: /world/europe/uk/england
Registered: 2011-04-30
Posts: 202
Website

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

demaio wrote:
oldtimeyjunk wrote:

My laptop is running 3.6.2, and I've done countless reboots, including having to go through several accidental power outages when I've not had a battery in. Nothing has happened...

Just because nothing happened to you doesn't mean Ted Ts'o is wrong ;-)

I didn't say he was wrong.


"... being a Linux user is sort of like living in a house inhabited by a large family of carpenters and architects. Every morning when you wake up, the house is a little different. Maybe there is a new turret, or some walls have moved. Or perhaps someone has temporarily removed the floor under your bed." - Unix for Dummies, 2nd Edition

Offline

#53 2012-10-25 18:38:58

Roken
Member
From: South Wales, UK
Registered: 2012-01-16
Posts: 1,281

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

So it looks like the device has to be (amongst other things) mounted with -nobarrier.

I have no idea what -nobarrier is or does, and I certainly don't use it as a mount option. Is this something that is likely to be set by default, or would the user have to specify it in mount options?


Ryzen 5900X 12 core/24 thread - RTX 3090 FE 24 Gb, Asus Prime B450 Plus, 32Gb Corsair DDR4, Cooler Master N300 chassis, 5 HD (1 NvME PCI, 4SSD) + 1 x optical.
Linux user #545703

/ is the root of all problems.

Offline

#54 2012-10-25 19:07:26

teateawhy
Member
From: GER
Registered: 2012-03-05
Posts: 1,138
Website

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

phoronix forums wrote:

... Actually using the nobarrier mount options (which can only be used if you are using hardware raid or an enterprise storage array) is actually pretty rare outside ...

It seems barriers stay on unless you explicitly pass the 'nobarrier' mount option (e.g. mount -o nobarrier, or in the options field of fstab).
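
If you are unsure whether any of your filesystems are mounted that way, here is a minimal sketch in Python that scans /proc/mounts-style text for ext4 entries with barriers turned off. It assumes the option shows up as either nobarrier or barrier=0 in the options column (which spelling appears may depend on the kernel version); the function names are made up for illustration.

```python
def parse_mounts(text):
    """Yield (device, mountpoint, fstype, options) from /proc/mounts-style text."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype, opts = fields[:4]
        yield device, mountpoint, fstype, opts.split(",")


def barriers_disabled(options):
    """True if the option list explicitly turns write barriers off."""
    return "nobarrier" in options or "barrier=0" in options


def ext4_without_barriers(text):
    """Return mount points of ext4 filesystems mounted without barriers."""
    return [
        mountpoint
        for _, mountpoint, fstype, options in parse_mounts(text)
        if fstype == "ext4" and barriers_disabled(options)
    ]


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as f:
            affected = ext4_without_barriers(f.read())
    except OSError:
        affected = []  # /proc/mounts not available on this system
    for mountpoint in affected:
        print("barriers disabled on", mountpoint)
```

If it prints nothing, none of the currently mounted ext4 filesystems have barriers explicitly disabled, at least as far as this simple check can tell.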

phoronix forums wrote:

Right now new features get added under experimental feature flags or mount options. One of the users who ran into problems were using experimental new features that are not enabled by default.

Edit:
Also you need to do the following to reproduce the bug.
umount -l
Now do not wait until umount is finished. (common sense?)
poweroff

This turns out not to be as bad as it was perceived in the beginning.

http://cdn.memegenerator.net/instances/ … 936247.jpg

Last edited by teateawhy (2012-10-25 19:16:03)

Offline

#55 2012-10-25 22:02:03

punkrockguy318
Member
From: New Jersey
Registered: 2004-02-15
Posts: 711
Website

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

jrussell wrote:
punkrockguy318 wrote:

switching to linux-lts was a simple workaround for the time being for me

My shutdowns with systemd and linux-lts are much much slower than with 3.6.2, any chance you are on systemd and have slower shutdowns with the lts?

For whatever reason, my shutdowns, boots and overall performance seem a little sluggish on linux-lts compared to 3.6.2. I don't remember it being this slow when 3.0.x was out and running on this machine, but I can live with it until this gets patched.

I'm glad to hear I'm not the only one with performance issues with this kernel and current userspace.

Edit: after rebooting into 3.6.3, my performance is a lot better -- I'm just sticking with that for now since I don't use nobarrier.

Last edited by punkrockguy318 (2012-10-25 22:09:00)


If I have the gift of prophecy and can fathom all mysteries and all knowledge, and if I have a faith that can move mountains, but have not love, I am nothing.   1 Corinthians 13:2

Offline

#56 2012-10-25 23:11:49

Roken
Member
From: South Wales, UK
Registered: 2012-01-16
Posts: 1,281

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

Well, then. Bottom line:

There's a set of circumstances that have to occur before the bug kicks in:

1. You need to be using an enterprise storage array or a RAID configuration
2. You need to mount the FS with the "nobarrier" switch
3. You then need to reboot twice in succession quickly

The combination of all three is rare (I've been a Linux user a long time and didn't even know about the nobarrier switch). The average user is NOT going to be affected by this.



Offline

#57 2012-10-26 13:57:36

Xi0N
Member
From: Bilbao - Spain
Registered: 2007-11-29
Posts: 832
Website

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

Roken wrote:

Well, then. Bottom line:

There's a set of circumstances that have to occur before the bug kicks in:

1. You need to be using an enterprise storage array or a RAID configuration
2. You need to mount the FS with the "nobarrier" switch
3. You then need to reboot twice in succession quickly

The combination of all three is rare (I've been a Linux user a long time and didn't even know about the nobarrier switch). The average user is NOT going to be affected by this.

raid configuration... as in software raid?
I don't use the nobarrier thingy so I guess I'm safe.....

Offline

#58 2012-10-27 23:23:15

NullNix
Member
Registered: 2012-10-27
Posts: 1

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

Original reporter here (an Arch Linux user for years, though not on the failing system, which is a homebrew LFS for the sake of maintaining insane degrees of control over precisely what is running on it).

nobarrier turns off barriers, which are indications in the stream of data in transit to the disk that everything before the barrier must be written before anything after the barrier is written. They're critical for maintaining data integrity: among other things, ext4 journal commits drop a barrier into the stream. Obviously, they have quite a performance hit, but it's one you have to eat unless you like massive data loss.
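
To make the ordering guarantee concrete, here is a toy model in Python. It is only a sketch of the rule described above, not of how the kernel block layer actually works: the device is assumed to reorder writes freely between barriers, never across one, and power may fail after any number of writes. All names (BARRIER, possible_crash_states, and so on) are invented for the illustration.

```python
from itertools import permutations

BARRIER = "BARRIER"  # marker in the write stream, not a real write


def split_epochs(stream):
    """Split a write stream into groups of writes separated by barriers."""
    epochs, current = [], []
    for item in stream:
        if item == BARRIER:
            epochs.append(current)
            current = []
        else:
            current.append(item)
    epochs.append(current)
    return epochs


def possible_crash_states(stream, honor_barriers=True):
    """Return every set of writes that could be on disk when power fails.

    Writes inside one epoch may land in any order; an epoch only starts
    once all earlier epochs are fully on disk. With honor_barriers=False
    the whole stream is treated as a single epoch.
    """
    if honor_barriers:
        epochs = split_epochs(stream)
    else:
        epochs = [[w for w in stream if w != BARRIER]]
    states = {frozenset()}  # power failed before anything was written
    done = []  # writes from epochs that have fully completed
    for epoch in epochs:
        for order in permutations(epoch):
            for n in range(1, len(order) + 1):
                states.add(frozenset(done) | frozenset(order[:n]))
        done.extend(epoch)
    return states


def commit_without_data(state, commit, data):
    """True if the commit record is on disk but some journal data is not."""
    return commit in state and not set(data) <= state


stream = ["data-1", "data-2", BARRIER, "commit"]
data = ["data-1", "data-2"]
with_barrier = possible_crash_states(stream, honor_barriers=True)
without_barrier = possible_crash_states(stream, honor_barriers=False)
print(any(commit_without_data(s, "commit", data) for s in with_barrier))
print(any(commit_without_data(s, "commit", data) for s in without_barrier))
```

With the barrier honored, no crash state contains the commit record without the data it commits. Without it, the commit record can reach the disk alone, which is the kind of state that leaves a journal looking valid when it is not.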

There is one single situation in which you can safely avoid using barriers, and get the nice big write speed improvement: if you know that once you have sent your data to the disk controller it *will* hit the disk, no matter what. It's up to you to define 'no matter what', but generally the definition used is 'even if the power fails before the disk write is complete'. I can only think of two situations where that is true: if you have a SAN where the disk power supplies are independent of the supply of the local machine, and if you have a battery-backed hardware RAID array (with a working battery). Software RAID is definitely not safe -- it's actually even less safe than a single normal disk. A UPS is also not safe, because a UPS can run out while the disk controller cache is non-empty. (The battery on hardware RAID arrays is sized so that it will never run out even if the entire cache needs flushing. Since on my array that takes around a second this is not a terribly onerous requirement for a battery.)

You can generate scenarios in which you can lose data with nobarrier even in the presence of battery-backed hardware RAID (a battery that fails at *just* the second the power fails; an earthquake; an EMP from a high-altitude nuke; invasion by a flotilla of semiconductor-eating aliens). But they're contrived enough that they can generally be ignored.

Offline

#59 2012-10-28 23:31:07

yaffare
Member
Registered: 2011-12-29
Posts: 71

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

Now that 3.6.4 has been released, does anybody know if it contains the fix?


systemd is like pacman. enjoys eating up stuff.

Offline

#60 2012-10-29 02:03:35

ontobelli
Member
From: Mexico City
Registered: 2011-02-06
Posts: 127

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

yaffare wrote:

Now that 3.6.4 has been released, does anybody know if it contains the fix?

No, there is no fix because there is no "problem" for 99.99% of users.

https://bbs.archlinux.org/viewtopic.php … 4#p1181344

Offline

#61 2012-11-01 04:57:58

ontobelli
Member
From: Mexico City
Registered: 2011-02-06
Posts: 127

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

Offline

#62 2012-11-02 01:05:30

oldtimeyjunk
Member
From: /world/europe/uk/england
Registered: 2011-04-30
Posts: 202
Website

Re: [SOLVED] EXT4 Data Corruption Bug Linux 3.6.2 & 3.6.3

Wooohoooo!


"... being a Linux user is sort of like living in a house inhabited by a large family of carpenters and architects. Every morning when you wake up, the house is a little different. Maybe there is a new turret, or some walls have moved. Or perhaps someone has temporarily removed the floor under your bed." - Unix for Dummies, 2nd Edition

Offline
