#1 2019-03-01 18:12:13

edh
Wiki Maintainer
Registered: 2012-05-14
Posts: 23

[solved]Root partition not mounting on boot, btrfs issue or something

My system has started failing on boot and dumping me to the emergency shell.  I should explain that this system has been working fine for some years; there hadn't been any software updates for a few days beforehand and it was working just fine.  Rather weirdly, I'd just plugged my keyboard and mouse back in to different USB ports and booted up fine, but noticed weird, slow mouse behaviour.  I rebooted and got a kernel panic which I couldn't escape from, so I had to hold down the power button for 5 seconds.  I don't know if the different USB ports (they are USB 3, not USB 2) caused this, but I did swap them back afterwards, thinking this could be the cause of the panic as well as the mouse issues.

Having then gone to boot up I've been met with:

mount: /new_root: wrong fs type, bad option, bad superblock, on /dev/sda2, missing codepage or helper program, or other error

Trying to mount from the emergency shell fails on the same issue.  Obviously something on the btrfs root partition is screwed up somehow.

Booting from an Arch Linux USB image allows me to mount the drive, which seems odd.  Most search results on this error suggest running fsck.  As I'm using btrfs, this differs from the norm somewhat.  I've not had issues before, so I'm not familiar with btrfs and its tools, but I have run:

btrfs rescue zero-log /dev/sda2

and

btrfs rescue super-recover /dev/sda2

Neither reports any issues.

btrfs check is quite verbose with its output, and one particular line stands out:

checksum verify failed on 78348289 found AFBED6DF wanted 00AA637E

The fact this is a previously working system with no issues mounting root before suggests it isn't a bad fstab configuration, but could it be?

The output of btrfs rescue suggests the superblock is not bad, but could it be?

What else can cause such an error to appear?  Is there anything NON filesystem wise that might cause it?  As I can mount the partition from a USB live environment I'm a bit confused.

Any ideas much appreciated.

Last edited by edh (2019-03-03 09:41:31)

#2 2019-03-01 18:32:22

loqs
Member
Registered: 2014-03-06
Posts: 17,195

Re: [solved]Root partition not mounting on boot, btrfs issue or something

At the emergency shell please check if the btrfs module is loaded.
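A minimal sketch of that check, written for a plain POSIX shell since the emergency shell may lack some tools. The function name `module_loaded` is made up for illustration; it reads /proc/modules, which holds the same data lsmod prints. A driver compiled into the kernel does not appear there, hence the hint about /sys/module.

```shell
# module_loaded NAME [MODULES_FILE]
# Succeeds if NAME appears as a module name in /proc/modules
# (or in MODULES_FILE, which exists only to make the function testable).
module_loaded() {
    name=$1
    file=${2:-/proc/modules}
    [ -r "$file" ] || return 1
    while read -r mod _rest; do
        if [ "$mod" = "$name" ]; then
            return 0
        fi
    done < "$file"
    return 1
}

if module_loaded btrfs; then
    echo "btrfs module is loaded"
else
    echo "btrfs not listed in /proc/modules (if built in, /sys/module/btrfs exists instead)"
fi
```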

#3 2019-03-01 19:02:17

mxfm
Member
Registered: 2015-10-23
Posts: 163

Re: [solved]Root partition not mounting on boot, btrfs issue or something

The message 'checksum verify failed' means that the filesystem is damaged and you should 1) stop executing commands which write to the filesystem, 2) do backups if you can mount the drive read-only, 3) test the RAM and 4) look at the linux-btrfs mailing list for similar threads.
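As a rough sketch of step 2, assuming it is run as root from a live environment: the helper names `run` and `backup_readonly` and the paths under /mnt are hypothetical, and the destination should sit on a different disk. With DRY_RUN=1 the commands are only printed, so they can be reviewed before anything touches the damaged device.

```shell
# Print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# backup_readonly DEVICE MOUNTPOINT DESTINATION
# Mount DEVICE read-only, copy everything to DESTINATION, unmount again.
backup_readonly() {
    dev=$1
    mnt=$2
    dest=$3
    run mkdir -p "$mnt" "$dest" &&
    run mount -o ro "$dev" "$mnt" &&
    run cp -a "$mnt/." "$dest/" &&
    run umount "$mnt"
}

# Preview only; set DRY_RUN=0 to really run the commands.
DRY_RUN=1
backup_readonly /dev/sda2 /mnt/broken /mnt/backup/root
```

If even a read-only mount fails, btrfs restore can copy files off an unmounted device, at the cost of needing more care with its options.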

Does "btrfs insp dump-s -f /dev/sdXXX" mention any errors?

Last edited by mxfm (2019-03-01 19:43:45)

#4 2019-03-01 21:10:00

edh
Wiki Maintainer
Registered: 2012-05-14
Posts: 23

Re: [solved]Root partition not mounting on boot, btrfs issue or something

Thanks for the pointers.  lsmod from the emergency shell shows btrfs to be loaded.

The drive is a root partition, so if it gets hosed I can just reinstall the system, but I would rather get to the source of the problem and fix it, so a backup is not a particular concern for me.  I have done a Memtest run, which picked nothing up.  I did Google the 'checksum verify failed' issue before, but didn't find much to go on.

I did look at 'btrfs insp dump' and it doesn't seem to be a valid command.  Did you mean btrfs inspect-internal?  Or btrfs image?  A lot of the pages on btrfs carry many warnings about data loss, so I just want to make sure I am not sending the wrong command and losing any hope of recovery.

#5 2019-03-02 05:34:06

mxfm
Member
Registered: 2015-10-23
Posts: 163

Re: [solved]Root partition not mounting on boot, btrfs issue or something

edh wrote:

Thanks for the pointers.  lsmod from the emergency shell shows btrfs to be loaded.

The drive is a root partition, so if it gets hosed I can just reinstall the system, but I would rather get to the source of the problem and fix it, so a backup is not a particular concern for me.  I have done a Memtest run, which picked nothing up.  I did Google the 'checksum verify failed' issue before, but didn't find much to go on.

I did look at 'btrfs insp dump' and it doesn't seem to be a valid command.  Did you mean btrfs inspect-internal?  Or btrfs image?  A lot of the pages on btrfs carry many warnings about data loss, so I just want to make sure I am not sending the wrong command and losing any hope of recovery.

I just typed 'btrfs insp dump-s -f /dev/mydrive' and it worked. Note that it is dump-s, not dump. It is short for 'btrfs inspect-internal dump-super -f'. The '-f' switch does not write anything: it is short for --full and only prints more of the superblock (the switch that forces parsing even if the superblock is garbage is -F). Which version of btrfs-progs do you use? It is unlikely to give a clue, though, because the superblock seems to be fine ...
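Spelled out in full, the command can be piped through a small filter. This is only a sketch: the helper name `super_ok` is invented, and it assumes the csum and magic lines of the dump carry a "[match]" marker when they verify, which is how btrfs-progs output looked to me but should be compared against your own version. The command only reads from the device.

```shell
# super_ok: read `btrfs inspect-internal dump-super` output on stdin and
# succeed only if both the csum and the magic line are marked "[match]".
super_ok() {
    awk '
        $1 == "csum"  && /\[match\]/ { csum = 1 }
        $1 == "magic" && /\[match\]/ { magic = 1 }
        END { exit !(csum && magic) }
    '
}

# Usage (as root; /dev/sda2 is the partition from this thread):
#   btrfs inspect-internal dump-super -f /dev/sda2 | super_ok \
#       && echo "primary superblock verifies" \
#       || echo "primary superblock looks damaged"
```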

What does btrfs scrub say on the mounted system? Can you rebuild the initramfs and try to boot it?

Last edited by mxfm (2019-03-02 05:41:20)

#6 2019-03-02 09:09:51

edh
Wiki Maintainer
Registered: 2012-05-14
Posts: 23

Re: [solved]Root partition not mounting on boot, btrfs issue or something

I have made some progress.  By running:

btrfs check --repair --init-csum-tree --init-extent-tree /dev/sda2

from a live USB stick, I can now reboot and get beyond the emergency shell.  Some services fail to start, in particular anything network based like ntpd and dhcpd, X doesn't start, and authentication at a command prompt fails for every account, so it's still unusable, but at least it's further than an emergency shell!

Trouble is I can't get to the point where I can access systemctl to see what errors are shown.

Btrfs scrub reports one uncorrectable error.

I have done ldconfig and mkinitcpio -p linux which completed OK.

I am guessing the failure to boot may be from this uncorrectable error.  Would trying to reinstall all packages from the cache be a worthwhile thing to try?

#7 2019-03-02 16:09:50

mxfm
Member
Registered: 2015-10-23
Posts: 163

Re: [solved]Root partition not mounting on boot, btrfs issue or something

edh wrote:

I have made some progress.  By running:

btrfs check --repair --init-csum-tree --init-extent-tree /dev/sda2

from a live USB stick, I can now reboot and get beyond the emergency shell.  Some services fail to start, in particular anything network based like ntpd and dhcpd,

The fact that networking is affected is most likely just a symptom of random corruption on the btrfs volume.

edh wrote:

X doesn't start, and authentication at a command prompt fails for every account, so it's still unusable, but at least it's further than an emergency shell!

I don't think it is a big improvement. Some important system files are corrupted and you cannot boot, so there is nothing to be glad about.
The fact that you can mount the partition after running some commands does not mean there are other commands you can run that will fix the data corruption.

edh wrote:

I am guessing the failure to boot may be from this uncorrectable error.  Would trying to reinstall all packages from the cache be a worthwhile thing to try?

Perhaps it can fix things, if the data corruption affected only files that belong to packages.

#8 2019-03-02 16:26:54

d_fajardo
Member
Registered: 2017-07-28
Posts: 1,563

Re: [solved]Root partition not mounting on boot, btrfs issue or something

How old is the drive? HDD or SSD? Any errors from smartctl? Just wondering if it's hardware.
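One hedged way to read the smartctl result without wading through the whole report is its exit status, which the smartmontools man page documents as a bitmask. The helper name `smart_flags` is made up for this sketch and it decodes only some of the bits; /dev/sda is assumed from the /dev/sda2 partition in this thread.

```shell
# smart_flags STATUS
# Decode selected bits of smartctl's exit status into readable lines.
smart_flags() {
    status=$1
    if [ $((status & 2)) -ne 0 ]; then echo "device open failed"; fi
    if [ $((status & 8)) -ne 0 ]; then echo "health check reports DISK FAILING"; fi
    if [ $((status & 16)) -ne 0 ]; then echo "prefail attribute at or below threshold"; fi
    if [ $((status & 64)) -ne 0 ]; then echo "device error log contains errors"; fi
    if [ $((status & 128)) -ne 0 ]; then echo "self-test log contains errors"; fi
    return 0
}

# Usage (as root, with smartmontools installed):
#   smartctl -H -A /dev/sda; smart_flags $?
```

No output from smart_flags means none of the decoded bits were set, which is not a guarantee that the drive is healthy.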

#9 2019-03-02 18:49:31

edh
Wiki Maintainer
Registered: 2012-05-14
Posts: 23

Re: [solved]Root partition not mounting on boot, btrfs issue or something

Hi, the drive is a 3-year-old SSD, and I think the kernel panic is more likely the cause of the issue than the drive itself.

I have reinstalled all packages with:

pacman -Qqn | pacman -S -

This didn't fix it.

Booting into the rescue console and logging in as root, journalctl -xb indicates what may be the cause of the problem: everything that fails, fails with 'No space left on device'.

The disk actually has plenty of space on it, but df shows a mismatch in free space.  On a 31GB partition I have 20GB used, yet 0 bytes free!  I hadn't seen this before, as all of my previous work was done from a USB stick, which showed the free space correctly.  As a result, most of the btrfs tools don't work under the rescue terminal but do when booting from a USB stick.

This is confusing: how can a filesystem list free space incorrectly when booted one way but not another?
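One possible explanation, which may or may not be what happened here: btrfs first allocates raw space to data and metadata chunks and fills those later, so writes can fail with 'No space left on device' while the chunks themselves are far from full. The sketch below shows how full each chunk type is; the helper name `chunk_fill` is invented, and it assumes the "Type, profile: total=N, used=N" line format printed by btrfs filesystem df with the -b (raw bytes) option.

```shell
# chunk_fill: read `btrfs filesystem df -b MOUNTPOINT` output on stdin and
# print "TYPE PROFILE PERCENT%" for every line, i.e. used/total per chunk type.
chunk_fill() {
    awk -F'[ ,:=]+' '
        /total=/ && /used=/ {
            printf "%s %s %d%%\n", $1, $2, ($4 > 0 ? 100 * $6 / $4 : 0)
        }
    '
}

# Usage (as root, on the mounted filesystem):
#   btrfs filesystem df -b / | chunk_fill
```

Low percentages next to a device with no unallocated space left would point at this kind of allocation problem rather than at genuinely missing space.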

Last edited by edh (2019-03-02 18:49:45)

#10 2019-03-02 19:55:05

edh
Wiki Maintainer
Registered: 2012-05-14
Posts: 23

Re: [solved]Root partition not mounting on boot, btrfs issue or something

...and I'm back.

Googling the btrfs free space issue shows that someone previously fixed this with btrfs balance.  For example:

btrfs balance start -v -dusage=50 /

However, as the rescue console showed the disk as being full, I had to boot a live USB stick, mount the drive, run the command, then reboot and hope it worked.
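A common variation, sketched here as an assumption rather than as what was actually run in this thread, is to raise the usage filter in steps, so that nearly empty chunks are reclaimed first and each balance has more room to work with. The helper name `balance_steps` and the percentages are illustrative; it only prints the commands, which can then be run by hand as root against the mount point used from the live USB stick.

```shell
# balance_steps MOUNTPOINT [PERCENT...]
# Print one `btrfs balance start` command per usage threshold, lowest first.
balance_steps() {
    mnt=$1
    shift
    [ $# -gt 0 ] || set -- 0 10 25 50
    for pct in "$@"; do
        echo "btrfs balance start -v -dusage=$pct $mnt"
    done
}

balance_steps /mnt
```

The -dusage=N filter restricts the balance to data chunks that are less than N percent full, which is why the small values finish quickly.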

Thanks for all the suggestions.

Edit: spoke too soon, dmesg shows:

[  354.428071] BTRFS warning (device sda2): sda2 checksum verify failed on 67108864 wanted 6B16EE45 found 1B11F74F level 0
[  354.428106] BTRFS: error (device sda2) in btrfs_finish_ordered_io:3065: errno=-5 IO failure
[  354.428112] BTRFS info (device sda2): forced readonly
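To see whether the same blocks keep failing, the warnings can be boiled down to one short line each. This is a sketch that assumes the message format shown above (newer kernels may print the checksums differently), and the helper name `csum_failures` is made up.

```shell
# csum_failures: read kernel log text on stdin and print
# "DEVICE BYTENR WANTED FOUND" for each btrfs checksum failure warning.
csum_failures() {
    sed -n 's/.*BTRFS warning (device \([^)]*\)).*checksum verify failed on \([0-9]*\) wanted \([0-9A-Fa-f]*\) found \([0-9A-Fa-f]*\).*/\1 \2 \3 \4/p'
}

# Usage: count how often each block fails
#   dmesg | csum_failures | sort | uniq -c
```

The same byte number failing repeatedly points at one damaged block, while many different ones would suggest a wider problem.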

At least I have a system that boots so it's easier to do something about it now.

Edit 2: after MANY different uses of btrfs check, and also manually deleting a few corrupted files by their inode references, it all seems to be back together.  As this was done from a live USB stick, I'm afraid I don't have all the commands I ran to hand, but it does show btrfs check can fix a lot.  Whilst I've fixed similar things before with fsck, that doesn't work on btrfs, which is a completely different challenge.  Luckily it's not something that should be needed very often.

Last edited by edh (2019-03-03 09:40:55)

#11 2019-03-03 13:23:08

mxfm
Member
Registered: 2015-10-23
Posts: 163

Re: [solved]Root partition not mounting on boot, btrfs issue or something

'errno=-5 IO failure' (errno 5 is EIO) is sometimes mentioned on the btrfs mailing list; it typically indicates a broken disk.

Last edited by mxfm (2019-03-03 13:23:25)

#12 2019-03-05 20:52:35

edh
Wiki Maintainer
Registered: 2012-05-14
Posts: 23

Re: [solved]Root partition not mounting on boot, btrfs issue or something

Things have been running fine for the last couple of days.  I've checked dmesg a few times and no errors show, so I believe it is now fully solved.
