
#1 2016-03-24 19:32:41

Stevearch
Member
From: North Wales
Registered: 2014-04-21
Posts: 80

[BUG] Possible BTRFS Raid bug

Hey guys,

I have a BTRFS RAID 10 setup in VirtualBox (I'm getting to grips with the filesystem).
I have the RAID mounted at /mnt like so -

 
[root@Xen ~]# btrfs filesystem show /mnt/
Label: none  uuid: ad1d95ee-5cdc-420f-ad30-bd16158ad8cb
        Total devices 4 FS bytes used 1.00GiB
        devid    1 size 2.00GiB used 927.00MiB path /dev/sdb
        devid    2 size 2.00GiB used 927.00MiB path /dev/sdc
        devid    3 size 2.00GiB used 927.00MiB path /dev/sdd
        devid    4 size 2.00GiB used 927.00MiB path /dev/sde

And -

[root@Xen ~]# btrfs filesystem usage /mnt/
Overall:
    Device size:                   8.00GiB
    Device allocated:              3.62GiB
    Device unallocated:            4.38GiB
    Device missing:                  0.00B
    Used:                          2.00GiB
    Free (estimated):              2.69GiB      (min: 2.69GiB)
    Data ratio:                       2.00
    Metadata ratio:                   2.00
    Global reserve:               16.00MiB      (used: 0.00B)

Data,RAID10: Size:1.50GiB, Used:1.00GiB
   /dev/sdb      383.50MiB
   /dev/sdc      383.50MiB
   /dev/sdd      383.50MiB
   /dev/sde      383.50MiB

Metadata,RAID10: Size:256.00MiB, Used:1.16MiB
   /dev/sdb       64.00MiB
   /dev/sdc       64.00MiB
   /dev/sdd       64.00MiB
   /dev/sde       64.00MiB

System,RAID10: Size:64.00MiB, Used:16.00KiB
   /dev/sdb       16.00MiB
   /dev/sdc       16.00MiB
   /dev/sdd       16.00MiB
   /dev/sde       16.00MiB

Unallocated:
   /dev/sdb        1.55GiB
   /dev/sdc        1.55GiB
   /dev/sdd        1.55GiB
   /dev/sde        1.55GiB

Right so everything looks good and I stuck some dummy files in there too -

[root@Xen ~]# ls -lh /mnt/
total 1.1G
-rw-r--r-- 1 root root 1.0G May 30  2008 1GB.zip
-rw-r--r-- 1 root root   28 Mar 24 15:16 hello
-rw-r--r-- 1 root root    6 Mar 24 16:12 niglu
-rw-r--r-- 1 root root    4 Mar 24 15:32 test

The bug appears when you try to test its ability to handle a dead drive.
If you follow the instructions here: https://btrfs.wiki.kernel.org/index.php … ed_devices

It tells you to mount the drive with the 'degraded' option; however, this just does not work. Allow me to show -

1) I power off the VM and remove one of the drives (Simulating a drive being pulled from a machine)
2) Power on the VM
3) Check DMESG - Everything looks good
4) Check how BTRFS is feeling -

Label: none  uuid: ad1d95ee-5cdc-420f-ad30-bd16158ad8cb
        Total devices 4 FS bytes used 1.00GiB
        devid    1 size 2.00GiB used 1.31GiB path /dev/sdb
        devid    2 size 2.00GiB used 1.31GiB path /dev/sdc
        devid    3 size 2.00GiB used 1.31GiB path /dev/sdd
        *** Some devices missing

So far so good, /dev/sde is missing and BTRFS has detected this.
5) Try and mount it as per the wiki so I can remove the bad drive and replace it with a good one -

[root@Xen ~]# mount -o degraded /dev/sdb /mnt/
mount: wrong fs type, bad option, bad superblock on /dev/sdb,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.

Ok, this is not good, I check DMESG -

[root@Xen ~]# dmesg | tail
[    4.416445] e1000: enp0s3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[    4.416672] IPv6: ADDRCONF(NETDEV_CHANGE): enp0s3: link becomes ready
[    4.631812] snd_intel8x0 0000:00:05.0: white list rate for 1028:0177 is 48000
[    7.091047] floppy0: no floppy controllers found
[   27.488345] BTRFS info (device sdb): allowing degraded mounts
[   27.488348] BTRFS info (device sdb): disk space caching is enabled
[   27.488349] BTRFS: has skinny extents
[   27.489794] BTRFS warning (device sdb): devid 4 uuid ebcd53d9-5956-41d9-b0ef-c59d08e5830f is missing
[   27.491465] BTRFS: missing devices(1) exceeds the limit(0), writeable mount is not allowed
[   27.520231] BTRFS: open_ctree failed

So here lies the problem - BTRFS needs all the devices to be present in order to mount it as writeable; however, if a drive dies spectacularly (as they can do), you don't have that luxury. As a result you cannot mount the remaining drives read-write and fix the problem.
Now you ARE able to mount it read-only, but then you can't apply the fix that the wiki recommends, see here -

[root@Xen ~]# mount -o ro,degraded /dev/sdb /mnt/
[root@Xen ~]# btrfs device delete missing /mnt/
ERROR: error removing device 'missing': Read-only file system

So it's a catch-22: you need all the drives, otherwise it won't let you mount read-write.
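
For anyone following along, the line that decides this is the "missing devices(1) exceeds the limit(0)" message in dmesg. Here is a rough sketch, in plain sh, for pulling those two numbers out of a kernel log. The function name missing_vs_limit is made up for illustration, and the pattern only matches the message wording shown above, which may differ on other kernel versions:

```sh
#!/bin/sh
# Sketch: extract the "missing" and "limit" counts from the kernel's
# refusal message. Reads kernel log text on stdin.
# Hypothetical helper name; not part of btrfs-progs.
missing_vs_limit() {
    # Matches lines such as:
    #   BTRFS: missing devices(1) exceeds the limit(0), writeable mount is not allowed
    # and prints "missing=1 limit=0". Prints nothing if no such line exists.
    sed -n 's/.*missing devices(\([0-9][0-9]*\)) exceeds the limit(\([0-9][0-9]*\)).*/missing=\1 limit=\2/p'
}
```

Something like `dmesg | missing_vs_limit` should then show at a glance whether the kernel thinks the array has lost more devices than it tolerates for a writeable mount.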

Could this problem be caused by them being Virtual Drives? Apologies for the wall of text but I wanted you guys to be able to see exactly what I'm doing.

Last edited by Stevearch (2016-03-26 12:46:24)


#2 2016-03-24 22:51:21

Tutti
Member
Registered: 2015-02-26
Posts: 117

Re: [BUG] Possible BTRFS Raid bug

As I understand it, the minimum number of devices for RAID 10 is four, isn't it?

BTRFS wiki wrote:

In case of raidXX layout, you cannot go below the minimum number of the device required. So before removing a device (even the missing one) you may need to add a new one. For example if you have a raid1 layout with two device, and a device fails, you must:

    mount in degraded mode
    add a new device
    remove the missing device
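
The three wiki steps above could be sketched in sh roughly as follows. This is only an illustration: the helper names run and wiki_recover and the device names in the usage line are placeholders, and by default the script just prints the commands instead of running them. Note that on the setup in post #1 the first step already fails for a read-write mount, so treat this as the wiki's intended sequence, not a confirmed fix:

```sh
#!/bin/sh
# Sketch of the wiki's sequence: mount degraded, add a device, remove "missing".
# Prints the commands by default; set DRY_RUN=0 to really run them (as root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: wiki_recover SURVIVING_DEV NEW_DEV MOUNTPOINT
# All three arguments are placeholders you have to fill in yourself.
wiki_recover() {
    run mount -o degraded "$1" "$3" &&
    run btrfs device add "$2" "$3" &&
    run btrfs device delete missing "$3"
}
```

For example, `wiki_recover /dev/sdb /dev/sdf /mnt` prints the three commands in order, so you can check them before running anything for real.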


#3 2016-03-24 23:16:33

WorMzy
Forum Moderator
From: Scotland
Registered: 2010-06-16
Posts: 11,846
Website

Re: [BUG] Possible BTRFS Raid bug

Leading on from what Tutti posted -- can you add a device while it is mounted read-only? If not, then I would agree that it is a bug and you should point this out on the btrfs mailing list.


Sakura:-
Mobo: MSI MAG X570S TORPEDO MAX // Processor: AMD Ryzen 9 5950X @4.9GHz // GFX: AMD Radeon RX 5700 XT // RAM: 32GB (4x 8GB) Corsair DDR4 (@ 3000MHz) // Storage: 1x 3TB HDD, 6x 1TB SSD, 2x 120GB SSD, 1x 275GB M2 SSD

Making lemonade from lemons since 2015.


#4 2016-03-25 10:59:13

Stevearch
Member
From: North Wales
Registered: 2014-04-21
Posts: 80

Re: [BUG] Possible BTRFS Raid bug

Thanks for getting back to me.

Tried this -

[root@Xen ~]# mount -o ro,degraded /dev/sdb /mnt/
[root@Xen ~]# btrfs device add /dev/sde /mnt/
ERROR: error adding device '/dev/sde': Read-only file system

This is really not good; a basic feature like this should be working by now.
I guess I will have to raise this with the BTRFS guys - can you point me in the right direction WorMzy?

Moral of the story here - BTRFS Raid 10 is good unless the drive dies spectacularly, then you are screwed.


#5 2016-03-25 11:26:09

WorMzy
Forum Moderator
From: Scotland
Registered: 2010-06-16
Posts: 11,846
Website

Re: [BUG] Possible BTRFS Raid bug



#6 2016-03-25 11:55:50

Stevearch
Member
From: North Wales
Registered: 2014-04-21
Posts: 80

Re: [BUG] Possible BTRFS Raid bug

Cheers WorMzy,

Email sent :)


#7 2016-03-25 17:29:56

Stevearch
Member
From: North Wales
Registered: 2014-04-21
Posts: 80

Re: [BUG] Possible BTRFS Raid bug

**UPDATE**

I've had a reply from a member of the BTRFS team -

btrfs replace is also the recommended way to replace a failed device
nowadays. The wiki is outdated.

On Debian Stretch with Linux 4.4.6, btrfs-progs 4.4 in VirtualBox
5.0.16 with 4*2GB VDIs

# mkfs.btrfs -m raid10 -d raid10 /dev/sdb /dev/sdc /dev/sdd /dev/sde

# mount /dev/sdb /mnt
# touch /mnt/test
# umount /mnt

Everything fine so far.

# wipefs -a /dev/sde

*reboot*

# mount /dev/sdb /mnt
mount: wrong fs type, bad option, bad superblock on /dev/sdb,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.

# dmesg | tail
[   85.979655] BTRFS info (device sdb): disk space caching is enabled
[   85.979660] BTRFS: has skinny extents
[   85.982377] BTRFS: failed to read the system array on sdb
[   85.996793] BTRFS: open_ctree failed

Not very informative! An information regression?

# mount -o degraded /dev/sdb /mnt

# dmesg | tail
[  919.899071] BTRFS info (device sdb): allowing degraded mounts
[  919.899075] BTRFS info (device sdb): disk space caching is enabled
[  919.899077] BTRFS: has skinny extents
[  919.903216] BTRFS warning (device sdb): devid 4 uuid 8549a275-f663-4741-b410-79b49a1d465f is missing

# touch /mnt/test2
# ls -l /mnt/
total 0
-rw-r--r-- 1 root root 0 mar 25 15:17 test
-rw-r--r-- 1 root root 0 mar 25 15:42 test2

# btrfs device remove missing /mnt
ERROR: error removing device 'missing': unable to go below four devices on raid10

As expected.

# btrfs replace start -B missing /dev/sde /mnt
ERROR: source device must be a block device or a devid

Would have been nice if missing worked here too. Maybe it does in
btrfs-progs 4.5?

# btrfs replace start -B 4 /dev/sde /mnt

# dmesg | tail
[ 1618.170619] BTRFS info (device sdb): dev_replace from <missing disk> (devid 4) to /dev/sde started
[ 1618.184979] BTRFS info (device sdb): dev_replace from <missing disk> (devid 4) to /dev/sde finished

Repaired!

# umount /mnt
# mount /dev/sdb /mnt
# dmesg | tail
[ 1729.917661] BTRFS info (device sde): disk space caching is enabled
[ 1729.917665] BTRFS: has skinny extents

I've tested their method and it does work; however, this still isn't an answer if your drive craps out on you completely.
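
Since btrfs replace wanted a devid rather than 'missing' in the reply above, here is a rough sh sketch for working out which devid is gone from the output of btrfs filesystem show. The helper name missing_devids is made up, and it assumes devids simply run from 1 up to the "Total devices" count, which holds for a freshly created array like this test one but is not guaranteed after earlier add/remove operations:

```sh
#!/bin/sh
# Sketch: list devids that are absent from "btrfs filesystem show" output.
# Reads that output on stdin. Hypothetical helper, not part of btrfs-progs.
# Assumption: devids are numbered 1..N, with N from the "Total devices" line.
missing_devids() {
    awk '
        /Total devices/ { total = $3 }      # e.g. "Total devices 4 FS bytes used 1.00GiB"
        $1 == "devid"   { seen[$2] = 1 }    # e.g. "devid    1 size 2.00GiB ..."
        END { for (i = 1; i <= total; i++) if (!(i in seen)) print i }
    '
}

# Hypothetical usage on a degraded read-write mount at /mnt, with /dev/sde
# as the blank replacement disk (adjust both to your own setup):
#   devid=$(btrfs filesystem show /mnt | missing_devids)
#   btrfs replace start -B "$devid" /dev/sde /mnt
#   btrfs replace status /mnt
```

The usage lines are left as comments on purpose, since they need root and a filesystem that is already mounted with -o degraded.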

It's also worth noting what he says about the wiki being out of date - best to consult the up-to-date man pages rather than the wiki at this point, it seems. :/

Last edited by Stevearch (2016-03-25 18:39:08)


#8 2016-03-26 12:37:11

Stevearch
Member
From: North Wales
Registered: 2014-04-21
Posts: 80

Re: [BUG] Possible BTRFS Raid bug

** Update 2 **

It's a bug.

We need this issue be fixed for the real production usage.

  Patch set of hot spare contains the fix for this. Currently I am
  fixing an issue (#5) which Yauhen reported and thats related to the
  auto replace. Refreshed v2 will be out soon.

Thanks, Anand


#9 2016-08-28 16:10:52

bartki
Member
Registered: 2015-12-01
Posts: 5

Re: [BUG] Possible BTRFS Raid bug

Hi,

is there anything new on this topic? I have the same issue with a raid1 setup with 2 drives, of which one failed while balancing. Suggestions welcome.

@Stevearch: Could you please point me to the thread on the btrfs-mailinglist regarding your issue?!

Cheers, Bart


#10 2019-01-05 13:11:49

gsauthof
Member
Registered: 2019-01-05
Posts: 1
Website

Re: [BUG] Possible BTRFS Raid bug

bartki wrote:

Hi,

is there anything new on this topic? I have the same issue with a raid1 setup with 2 drives, of which one failed while balancing. Suggestions welcome.

@Stevearch: Could you please point me to the thread on the btrfs-mailinglist regarding your issue?!

Cheers, Bart


The links to @Stevearch's mailinglist messages he quoted in this thread:

Subject: Possible Raid Bug
From: Stephen Williams
Date: Fri, 25 Mar 2016 11:49:20 +0000
https://www.spinics.net/lists/linux-btrfs/msg53392.html

Subject: Re: Possible Raid Bug
From: Anand Jain
Date: Sat, 26 Mar 2016 11:08:50 +0800
https://www.spinics.net/lists/linux-btrfs/msg53437.html

As far as I can see, the situation hasn't improved much.

For example, it still isn't possible to add a new device to a read-only mounted Btrfs filesystem:

ERROR: error adding device '/dev/sdX': Read-only file system

Also, the btrfs replace command still doesn't work on a read-only mounted Btrfs filesystem. When executed in the background (the default), the command exits successfully but nothing is replaced. When run in the foreground (with -B), you get an error message like:

ERROR: ioctl(DEV_REPLACE_START) failed on "/mnt/foo": Read-only file system
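
Because the background form exits successfully even though nothing happens, it may be worth checking for a read-only mount before starting a replace. Here is a minimal sh sketch that reads /proc/mounts-style lines. The helper name is_readonly is invented for illustration, and it does not handle mountpoints containing spaces (those are escaped in /proc/mounts):

```sh
#!/bin/sh
# Sketch: succeed (exit 0) if the given mountpoint is mounted read-only.
# Reads /proc/mounts-style lines on stdin:
#   device mountpoint fstype options dump pass
# Hypothetical helper; if a mountpoint appears more than once, the last
# matching line wins.
is_readonly() {
    awk -v mp="$1" '
        $2 == mp {
            n = split($4, opt, ",")
            ro = 0
            for (i = 1; i <= n; i++) if (opt[i] == "ro") ro = 1
            found = 1
        }
        END { exit ((found && ro) ? 0 : 1) }
    '
}

# Hypothetical usage:
#   is_readonly /mnt < /proc/mounts && echo "/mnt is read-only, replace will not run"
```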


#11 2019-01-05 13:57:40

WorMzy
Forum Moderator
From: Scotland
Registered: 2010-06-16
Posts: 11,846
Website

Re: [BUG] Possible BTRFS Raid bug

Please do not necrobump. If you are having problems with your btrfs filesystem, then please open a new topic and include relevant information about it.

https://wiki.archlinux.org/index.php/Co … bumping%22

Closing.

