
#1 2013-07-28 16:43:18

g3n3r1c
Member
Registered: 2013-07-16
Posts: 17

Filesystem corruption --- how to prevent?

Sometimes one has to do a hard reset.

When reading about btrfs, its copy-on-write feature caught my attention. 1. Does btrfs completely prevent filesystem corruption on a hard reset?

I'm asking because my / (ext4) filesystem on my laptop recently got corrupted after a hard reset. I blame powertop's suggestion of setting /proc/sys/vm/dirty_writeback_centisecs to 1500, which I guess I'll never do again. I lost no data, so it's not important. But I have to set up this laptop again --- ext4, btrfs, or something else? (In any case backups will be made --- so btrfs's experimental status doesn't bother me.)

2. Are there any filesystem-agnostic system settings which prevent filesystem corruption on a hard reset? If so --- do they have performance penalties or other drawbacks?

Also: 3. What about the 'sync' mount option?

Also: 4. What about other layers like lvm, dm-crypt, dm-raid etc?

Last edited by g3n3r1c (2013-07-28 16:51:23)


#2 2013-07-28 17:03:00

Scimmia
Fellow
Registered: 2012-09-01
Posts: 11,952

Re: Filesystem corruption --- how to prevent?

I'm using btrfs on a laptop with a bad battery and a loose power port. Sudden losses of power are too common. The only issue I've had is the space cache getting corrupted, in which case I just remount with the clear_cache option.


#3 2013-07-28 17:21:01

g3n3r1c
Member
Registered: 2013-07-16
Posts: 17

Re: Filesystem corruption --- how to prevent?

@Scimmia: Thanks for replying! A quick question about btrfs: Did you encounter any other issues than the space cache thing?


#4 2013-07-28 17:33:21

Scimmia
Fellow
Registered: 2012-09-01
Posts: 11,952

Re: Filesystem corruption --- how to prevent?

Nope. I've been running that way on that laptop for about a year now (the battery worked before that). The space cache issue has happened to me three times and no other problems have popped up at all. I've had sudden power losses hundreds of times. I specifically chose btrfs for its data integrity features.


#5 2013-07-28 17:42:20

g3n3r1c
Member
Registered: 2013-07-16
Posts: 17

Re: Filesystem corruption --- how to prevent?

@Scimmia: OK, I'm sold --- I'll use btrfs for the laptop too :)

I'm not marking this thread as 'solved' for now because I'd like to read more about these things.


#6 2013-07-28 20:10:45

firekage
Member
From: Eastern Europe, Poland
Registered: 2013-06-30
Posts: 623

Re: Filesystem corruption --- how to prevent?

Scimmia wrote:

I'm using btrfs on a laptop with a bad battery and a loose power port. Sudden losses of power are too common. The only issue I've had is the space cache getting corrupted, in which case I just remount with the clear_cache option.

Could you explain what I need to do to start Arch with the clear_cache option? I'm learning Arch ;)


#7 2013-07-28 20:18:00

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,636

Re: Filesystem corruption --- how to prevent?

Is it true that there is no fsck utility that can repair btrfs at this point in time?  If so, why in the world would you [scimmia] use that fs on your laptop with the problems you described?


CPU-optimized Linux-ck packages @ Repo-ck • AUR packages • Zsh and other configs


#8 2013-07-28 20:19:34

WonderWoofy
Member
From: Los Gatos, CA
Registered: 2012-05-19
Posts: 8,414

Re: Filesystem corruption --- how to prevent?

The clear_cache option is a mount option.  You should do some research into how to change mount options in general. If you are going to be using btrfs, you really need to know about these kinds of things.  It would probably do you some good to read through the btrfs wiki as well (not the Arch wiki's btrfs page, but the actual btrfs wiki).  There is a lot of info there that is essential to understanding how to use btrfs and what it can offer.  I think that is a good idea on any distribution, but particularly so if you use something like Arch Linux, which requires that you configure all the things yourself.
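
As a rough illustration of what remounting with that option could look like: the sketch below only prints the command rather than running it, and the helper name clear_cache_cmd is made up for this example. Whether a remount is enough, or a fresh mount is needed, may depend on the kernel version.

```shell
#!/bin/sh
# Print (not run) the remount command for a given mount point.
# clear_cache_cmd is a hypothetical helper; run its output as root.
clear_cache_cmd() {
    printf 'mount -o remount,clear_cache %s\n' "$1"
}

clear_cache_cmd /

# For a one-time rebuild of the cache on the root filesystem at boot,
# the kernel parameter rootflags= can carry the same mount option:
#   rootflags=clear_cache
```

The option only needs to be given once; it should not be left permanently in fstab or on the kernel command line.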


#9 2013-07-28 20:22:40

WonderWoofy
Member
From: Los Gatos, CA
Registered: 2012-05-19
Posts: 8,414

Re: Filesystem corruption --- how to prevent?

graysky wrote:

Is it true that there is no fsck utility that can repair btrfs at this point in time?  If so, why in the world would you [scimmia] use that fs on your laptop with the problems you described?

No, btrfsck (and "btrfs check", which is the same thing) can repair the filesystem these days.  It used to be unable to, but there have been drastic improvements to all aspects of btrfs.

That said, the btrfsck tool is not what you would likely resort to first in the event of filesystem damage.  It is recommended to first try mounting with the recovery mount option.  There is an explanation of the steps to take in the btrfs wiki.
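
A minimal sketch of that order of operations, assuming a placeholder device /dev/sdXn and mount point /mnt. The helper name recovery_mount_cmd is made up, and it only prints the command. Note that newer kernels (4.6 and later) call this option usebackuproot.

```shell
#!/bin/sh
# Print (not run) a read-only recovery mount for a damaged btrfs device.
# recovery_mount_cmd is a hypothetical helper; device and target are placeholders.
recovery_mount_cmd() {
    printf 'mount -t btrfs -o ro,recovery %s %s\n' "$1" "$2"
}

recovery_mount_cmd /dev/sdXn /mnt

# Only if mounting fails, and after copying off whatever is still readable:
#   btrfs check /dev/sdXn            # read-only check
#   btrfs check --repair /dev/sdXn   # attempts a repair; can make things worse
```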


#10 2013-07-28 20:31:30

WorMzy
Administrator
From: Scotland
Registered: 2010-06-16
Posts: 12,318

Re: Filesystem corruption --- how to prevent?

btrfs is a copy-on-write filesystem, so if something goes wrong (e.g. a power cut) mid-write, you still have a usable copy of the data. It's not infallible, and it's still under heavy development, but it's pretty useful in situations like the one Scimmia has encountered. If ext4 were used in this situation, filesystem corruption would be a regular occurrence.


Sakura:-
Mobo: MSI MAG X570S TORPEDO MAX // Processor: AMD Ryzen 9 5950X @4.9GHz // GFX: AMD Radeon RX 5700 XT // RAM: 32GB (4x 8GB) Corsair DDR4 (@ 3000MHz) // Storage: 1x 3TB HDD, 6x 1TB SSD, 2x 120GB SSD, 1x 275GB M2 SSD

Making lemonade from lemons since 2015.


#11 2013-07-28 20:45:37

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,636

Re: Filesystem corruption --- how to prevent?

Ah, I stand corrected.



#12 2013-07-28 20:50:07

roentgen
Member
Registered: 2011-03-15
Posts: 91

Re: Filesystem corruption --- how to prevent?

> Sometimes one has to do a hard reset.

You should first try sysrq.
Check the wiki for how to enable it, then try rebooting with Alt+Print Screen+REISUB.
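
For reference, a small sketch of what each key in that sequence asks the kernel to do, following the kernel's SysRq documentation. The helper name reisub_meaning is made up for illustration; the keys are pressed one at a time while holding Alt+Print Screen.

```shell
#!/bin/sh
# Print what each key in the REISUB sequence asks the kernel to do.
# reisub_meaning is a hypothetical helper, not a system tool.
reisub_meaning() {
    case "$1" in
        R|r) echo "take keyboard out of raw mode" ;;
        E|e) echo "send SIGTERM to all processes except init" ;;
        I|i) echo "send SIGKILL to all processes except init" ;;
        S|s) echo "sync all mounted filesystems" ;;
        U|u) echo "remount all filesystems read-only" ;;
        B|b) echo "reboot immediately" ;;
        *)   echo "unknown"; return 1 ;;
    esac
}

for k in R E I S U B; do
    printf '%s: %s\n' "$k" "$(reisub_meaning "$k")"
done

# Enable all SysRq functions until the next reboot (needs root):
#   sysctl -w kernel.sysrq=1
```

The S and U steps are the ones that matter for filesystem safety: they flush dirty data and remount read-only before the reboot.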


#13 2013-07-28 20:57:53

g3n3r1c
Member
Registered: 2013-07-16
Posts: 17

Re: Filesystem corruption --- how to prevent?

@WonderWoofy: I know about the btrfs wiki ...

@graysky: Scimmia already stated he uses btrfs on his laptop because of an unreliable battery/power supply.

@roentgen: I know about the magic SysRq key ... but that doesn't work in every situation, and enabling it raises security concerns --- sometimes one has to do a hard reset.

Edit: @all: Thank you for replying! I've been interested in this topic for a long time.

Last edited by g3n3r1c (2013-07-28 21:06:22)


#14 2013-07-28 21:27:38

cfr
Member
From: Cymru
Registered: 2011-11-27
Posts: 7,144

Re: Filesystem corruption --- how to prevent?

With the default settings for ext4 etc., I had very good experiences with that, to be honest. I had frequent and repeated shutdowns of the dirtiest kind and I never lost data except whatever of my work was new since my last save or autosave. I am talking about shutdowns of the sort you'd expect if you had a sudden power cut on a desktop with no backup. Even though this is a laptop, the shutdowns were always like that. Instant death. I never got filesystem corruption on ext4, though. Usually the system needed to replay part of the filesystem journal but it always recovered it automatically - I never even had to run fsck manually to recover. Given the way my laptop was behaving at the time - and this went on for months thanks to Lenovo's crappy customer "service" - I was fairly impressed. I had laptop mode tools set to let no more than 360s elapse with unsaved data on AC (when most of the crashes happened) or 600s on battery (only 1 or maybe 2 of the incidents). Those are just the defaults, I think.

Were you using non-default options with ext4? (Not journalling or changes to the use of barriers etc.)

Last edited by cfr (2013-07-28 21:30:03)


CLI Paste | How To Ask Questions

Arch Linux | x86_64 | GPT | EFI boot | refind | stub loader | systemd | LVM2 on LUKS
Lenovo x270 | Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz | Intel Wireless 8265/8275 | US keyboard w/ Euro | 512G NVMe INTEL SSDPEKKF512G7L


#15 2013-07-28 21:47:46

g3n3r1c
Member
Registered: 2013-07-16
Posts: 17

Re: Filesystem corruption --- how to prevent?

cfr wrote:

Were you using non-default options with ext4? (Not journalling or changes to the use of barriers etc.)

I was using (wrote it all down :)):

Creation:

mke2fs -vt ext4 -E lazy_itable_init=1 -O dir_index,uninit_bg /dev/disk/by-uuid/XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX

(for every volume)

/etc/fstab

tmpfs /tmp tmpfs auto,async,noatime,nodev,nodiratime,noexec,nosuid,nouser,rw 0 0
UUID=037c4679-a4ef-42a1-bc9e-17a4c41b6283 swap swap pri=1 0 0
UUID=fd8bf3d6-9693-41c2-95ea-c39b3b0d9443 / ext4 auto,async,dev,exec,nouser,noatime,nodiratime,suid,rw 0 1
UUID=4a8985e4-be31-4757-88cd-5f958e4ba6b9 /boot ext4 auto,async,noatime,nodev,nodiratime,noexec,nosuid,nouser,rw 0 1
UUID=b5826354-772c-462f-86f5-c55b625b11cc /var ext4 auto,async,noatime,nodev,nodiratime,noexec,nosuid,nouser,rw 0 1
UUID=6f47fe27-3cbe-4f72-bd3b-d0d10fc4fff3 /home ext4 noauto,async,noatime,nodev,nodiratime,nouser,rw,acl,x-systemd.automount 0 2

I think this does nothing to journaling.


The system was using a cryptdevice directly on the disk without partitions, with LVM on top and ext4 on top of the logical volumes; /boot was a partition on a USB key.


When / got corrupted I was still able to access /var and /home from a live medium.

Last edited by g3n3r1c (2013-07-28 22:07:30)


#16 2013-07-28 22:11:24

Scimmia
Fellow
Registered: 2012-09-01
Posts: 11,952

Re: Filesystem corruption --- how to prevent?

graysky wrote:

Is it true that there is no fsck utility that can repair btrfs at this point in time?  If so, why in the world would you [scimmia] use that fs on your laptop with the problems you described?

btrfsck exists, but even now, it is pretty basic and doesn't repair things like the fsck tools for other filesystems.

That said, one of the major goals of btrfs is to never have filesystem corruption that needs fsck in the first place. Everything is checked in real time as it's written and read, COW ensures that nothing is overwritten until after an updated copy is fully written somewhere else, metadata is duplicated and checked each time it's accessed, plus other features. It is designed from the ground up for data integrity.

ZFS doesn't have a fsck tool at all, yet it's considered one of the most stable filesystems out there for data integrity.


#17 2013-07-28 22:11:31

cfr
Member
From: Cymru
Registered: 2011-11-27
Posts: 7,144

Re: Filesystem corruption --- how to prevent?

Mine were created with default options but I can't see that making a difference. My effective mount options were rw,noatime,data=ordered (with errors=remount-ro for the root partition). But data=ordered is default, I believe (I don't specify it in fstab - it just appears in mount's output) and noatime is a superset of nodiratime. async is the only one I'm not sure about.

By the way, the final digit in the fstab lines for /boot and /var should be a "2". Only the root partition should be "1".  And you can remove the line for tmpfs if you use systemd as it is no longer necessary.
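
A quick way to spot such entries, as a sketch: the function below reads fstab-formatted text on stdin and prints every mount point other than / whose sixth field (the fsck pass number) is 1. The name check_fsck_pass is made up for this example.

```shell
#!/bin/sh
# List fstab entries whose sixth field (fsck pass) is 1 but whose mount point is not /.
# check_fsck_pass is a hypothetical helper; it reads fstab-formatted text on stdin.
check_fsck_pass() {
    awk '$1 !~ /^#/ && NF >= 6 && $6 == 1 && $2 != "/" { print $2 }'
}

# Usage:
#   check_fsck_pass < /etc/fstab
```

Run against the fstab quoted earlier in the thread, this would be expected to flag /boot and /var.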

EDIT: Oh, and I was also using LVM-on-LUKS but with an unencrypted /boot and ESP on hard disk outside the LUKS container.

Last edited by cfr (2013-07-28 22:13:11)



#18 2013-07-28 22:17:56

g3n3r1c
Member
Registered: 2013-07-16
Posts: 17

Re: Filesystem corruption --- how to prevent?

cfr wrote:

async is the only one I'm not sure about

'async' is part of 'defaults'.

cfr wrote:

By the way, the final digit in the fstab lines for /boot and /var should be a "2". Only the root partition should be "1".  And you can remove the line for tmpfs if you use systemd as it is no longer necessary.

Thank you for mentioning that --- I'll do that the next time I set up ext4.

Edit: Funny fact: I had to hard reset because the system locked up completely when running Windows in qemu kvm ...

Last edited by g3n3r1c (2013-07-28 22:28:33)


#19 2013-07-29 16:39:52

g3n3r1c
Member
Registered: 2013-07-16
Posts: 17

Re: Filesystem corruption --- how to prevent?

@Scimmia: May I ask which mount options you use for your btrfs (sub)volumes? I just set up the laptop with btrfs. Do you think "inode_cache" (btrfs mount options) is safe? I also wonder if I should use "recovery", "check_int" and "check_int_data".

Last edited by g3n3r1c (2013-07-29 16:46:00)


#20 2013-07-29 16:46:31

WonderWoofy
Member
From: Los Gatos, CA
Registered: 2012-05-19
Posts: 8,414

Re: Filesystem corruption --- how to prevent?

inode_cache *should* be safe.  But I have read on the mailing list of people having issues with it when trying to use some of the more advanced and less mature features.

I know you asked Scimmia, but I use noatime,compress=lzo,autodefrag.  With the most recent kernels, space_cache is on by default, and the ssd option is automatically detected and applied if necessary.  If you use an old or cheap ssd, it may be better to use ssd_spread, but this is not usually the case.  Also, the btrfs people recommend that you do not use the discard (TRIM) mount option, but instead run fstrim occasionally as necessary.  I just have an anacron job that runs fstrim on my btrfs filesystem on a weekly basis.
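
Spelled out as an fstab entry, those options might look like the line printed by this sketch. The helper name btrfs_fstab_line and the UUID are placeholders, and the trailing "0 0" assumes no boot-time fsck for btrfs.

```shell
#!/bin/sh
# Print an fstab line using the mount options quoted above.
# btrfs_fstab_line is a hypothetical helper; the UUID is a placeholder.
btrfs_fstab_line() {
    printf 'UUID=%s %s btrfs noatime,compress=lzo,autodefrag 0 0\n' "$1" "$2"
}

btrfs_fstab_line XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX /

# Periodic TRIM instead of the discard mount option (needs root),
# for example from a weekly cron or anacron script:
#   fstrim -v /
```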


#21 2013-07-29 16:52:19

g3n3r1c
Member
Registered: 2013-07-16
Posts: 17

Re: Filesystem corruption --- how to prevent?

@WonderWoofy: Thanks for sharing your knowledge :). I guess I'll disable discard then! As for "ssd"/"ssd_spread", I'm unsure --- the laptop has an SSD with an Indilinx Barefoot controller (it supports TRIM though --- it was one of the first to support it, if I remember correctly).

Last edited by g3n3r1c (2013-07-29 16:52:48)


#22 2013-07-29 16:58:40

WonderWoofy
Member
From: Los Gatos, CA
Registered: 2012-05-19
Posts: 8,414

Re: Filesystem corruption --- how to prevent?

I'd say let it use the default 'ssd'.  It will do it on its own... though it doesn't hurt anything to specify the mount option explicitly.


#23 2013-07-29 17:38:00

g3n3r1c
Member
Registered: 2013-07-16
Posts: 17

Re: Filesystem corruption --- how to prevent?

I know this is a little off-topic, but: would 'ssd_spread' make sense for a partition on a USB key? Also: is there a way to restrict subvolume size, e.g. for /var or a /net subvolume that (samba) users can write to? (I could find no info.)

Now that I have btrfs and the laptop is all set up again, I'm beginning to wonder whether it would be safe to follow powertop's suggestion of setting /proc/sys/vm/dirty_writeback_centisecs to 1500 again *g* ...

Last edited by g3n3r1c (2013-07-29 20:36:28)


#24 2013-07-29 22:24:23

WonderWoofy
Member
From: Los Gatos, CA
Registered: 2012-05-19
Posts: 8,414

Re: Filesystem corruption --- how to prevent?

You can restrict subvolume size by using quotas.  It is a relatively new feature, and is not very well documented.  But it is in the btrfs wiki and usage can be found in the commit itself.  Also, it is included in "btrfs quota --help" and "btrfs qgroup --help".
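
Roughly, the sequence looks like the commands printed by this sketch. The helper name quota_cmds, the paths, and the 10G limit are placeholders, and the commands are only printed, not executed.

```shell
#!/bin/sh
# Print (not run) the commands to cap a subvolume's size with qgroups.
# quota_cmds is a hypothetical helper; paths and the size are placeholders.
quota_cmds() {
    fs=$1; subvol=$2; size=$3
    printf 'btrfs quota enable %s\n' "$fs"
    printf 'btrfs qgroup limit %s %s\n' "$size" "$subvol"
    printf 'btrfs qgroup show %s\n' "$fs"
}

quota_cmds / /var 10G
```

The last command lists the qgroups so the limit can be checked after it is set.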

I have no idea about using 'ssd_spread' for a flash drive.  You'll have to ask the btrfs people. 

I do use vm.dirty_writeback_centisecs=1500, and it is fine.
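
For context, the value is in hundredths of a second, so 1500 means the kernel's flusher threads wake up every 15 seconds instead of the default 5 (500). A small sketch for inspecting the current setting; the helper name centisecs_to_seconds and the sysctl.d file name are made up for this example.

```shell
#!/bin/sh
# Convert a centisecond value (as used by vm.dirty_writeback_centisecs) to whole seconds.
# centisecs_to_seconds is a hypothetical helper name, not a system tool.
centisecs_to_seconds() {
    echo "$(( $1 / 100 ))"
}

# Show the current writeback interval if the sysctl file is readable.
f=/proc/sys/vm/dirty_writeback_centisecs
if [ -r "$f" ]; then
    echo "flusher threads wake every $(centisecs_to_seconds "$(cat "$f")") s"
fi

# Change it until the next reboot (needs root):
#   sysctl -w vm.dirty_writeback_centisecs=1500
# Make it persistent, e.g. in /etc/sysctl.d/99-writeback.conf:
#   vm.dirty_writeback_centisecs = 1500
```

A longer interval saves power but widens the window of unwritten data that a hard reset can lose.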


#25 2013-07-29 23:09:48

g3n3r1c
Member
Registered: 2013-07-16
Posts: 17

Re: Filesystem corruption --- how to prevent?

Thanks a bunch for looking up this quota thing! After reading quota in btrfs I'm unsure if it is possible to alter an existing btrfs with subvolumes to use quota?! "Once a BTRFS has been created, quota must be enabled before any subvolume is added [...] If quotas weren't enabled, you must first enable them, then create a qgroup (quota group) for each of those subvolume" --- does that mean that it can be done with a live cd when btrfs is not mounted?

Well, I'll give vm.dirty_writeback_centisecs another try ...

With btrfs another thing comes to mind: Is it possible to boot into an existing snapshot directly? Something like 'Last working state'. I mean without having to boot into a live system and having to change fstab?

Last edited by g3n3r1c (2013-07-29 23:11:12)
