
#1 2020-07-08 04:29:59

Registered: 2012-02-11
Posts: 22

[SOLVED] bcache stopped registering

Sometime in the last six months I rebooted my machine and found that only 2 of my 5 bcache backing devices were registered. /dev/sdb1 and /dev/sde1 always register correctly, but /dev/sda1, /dev/sdc1, and /dev/sdd1 do not.

I've been working around it by adding "break" to my kernel command line to drop into the initrd shell, where I manually register the remaining devices.
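For anyone wanting to reproduce the workaround, here is a minimal sketch of manual registration. It assumes the standard bcache sysfs interface (/sys/fs/bcache/register) and that the bcache module is already loaded; the function name and the BCACHE_REGISTER variable are made up for illustration, and this is not necessarily the exact set of commands used above.

```shell
# Path of the bcache registration file. It is a variable only so the
# function can be tried against a scratch file instead of the real sysfs node.
BCACHE_REGISTER=${BCACHE_REGISTER:-/sys/fs/bcache/register}

# register_bcache is a hypothetical helper: write each device path to the
# registration file, one write per device, and report any failure.
register_bcache() {
    for dev in "$@"; do
        echo "$dev" > "$BCACHE_REGISTER" || echo "failed to register $dev" >&2
    done
}
```

From the initrd shell this would be called as root, e.g. register_bcache /dev/sda1 /dev/sdc1 /dev/sdd1, before continuing the boot.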

The symptoms seem really similar to this issue:

But unlike that bug, it's very consistent. sdb1 and sde1 always register.

Does anyone know how I should go about troubleshooting this? Here's my mkinitcpio.conf:

# The following modules are loaded before any boot hooks are
# run.  Advanced users may wish to specify all system modules
# in this array.  For instance:
#     MODULES=(piix ide_disk reiserfs)

# This setting includes any additional binaries a given user may
# wish into the CPIO image.  This is run last, so it may be used to
# override the actual binaries included by a given hook
# BINARIES are dependency parsed, so you may safely ignore libraries

# This setting is similar to BINARIES above, however, files are added
# as-is and are not parsed in any way.  This is useful for config files.

# This is the most important setting in this file.  The HOOKS control the
# modules and scripts added to the image, and what happens at boot time.
# Order is important, and it is recommended that you do not change the
# order in which HOOKS are added.  Run 'mkinitcpio -H <hook name>' for
# help on a given hook.
# 'base' is _required_ unless you know precisely what you are doing.
# 'udev' is _required_ in order to automatically load modules
# 'filesystems' is _required_ unless you specify your fs modules in MODULES
# Examples:
##   This setup specifies all modules in the MODULES setting above.
##   No raid, lvm2, or encrypted root is needed.
#    HOOKS=(base)
##   This setup will autodetect all modules for your system and should
##   work as a sane default
#    HOOKS=(base udev autodetect block filesystems)
##   This setup will generate a 'full' image which supports most systems.
##   No autodetection is done.
#    HOOKS=(base udev block filesystems)
##   This setup assembles a pata mdadm array with an encrypted root FS.
##   Note: See 'mkinitcpio -H mdadm' for more information on raid devices.
#    HOOKS=(base udev block mdadm encrypt filesystems)
##   This setup loads an lvm2 volume group on a usb device.
#    HOOKS=(base udev block lvm2 filesystems)
##   NOTE: If you have /usr on a separate partition, you MUST include the
#    usr, fsck and shutdown hooks.
HOOKS=(base udev autodetect modconf block bcache filesystems keyboard btrfs fsck)

# Use this to compress the initramfs image. By default, gzip compression
# is used. Use 'cat' to create an uncompressed image.

# Additional options for the compressor

Last edited by bobpaul (2020-09-28 17:06:24)


#2 2020-09-16 17:35:14

Registered: 2012-02-11
Posts: 22

Re: [SOLVED] bcache stopped registering

I don't reboot often, so this has been pretty low priority, but I think I've found the cause. The bcache-register script depends on blkid to identify devices that have bcache on them, and on the affected drives blkid doesn't identify the bcache volumes. I think this is the result of swapping in drives that were previously formatted with ext4 without first running wipefs on them:

Here's a drive that is registering on bootup:

$ sudo blkid -p /dev/sdb1
/dev/sdb1: UUID="4d671cc6-2cdd-4140-aff9-6faa4d62ec90" TYPE="bcache" USAGE="other" PART_ENTRY_SCHEME="gpt" PART_ENTRY_NAME="toshiba-4tb-Y637" PART_ENTRY_UUID="1e712c54-af04-464d-8163-a0fb9367b058" PART_ENTRY_TYPE="a19d880f-05fc-4d3b-a006-743f0f84911e" PART_ENTRY_NUMBER="1" PART_ENTRY_OFFSET="2048" PART_ENTRY_SIZE="7814035087" PART_ENTRY_DISK="8:16"
$ sudo wipefs /dev/sdb1
DEVICE OFFSET TYPE   UUID                                 LABEL
sdb1   0x1018 bcache 4d671cc6-2cdd-4140-aff9-6faa4d62ec90

and here's one of the drives that doesn't register on bootup:

$ sudo blkid -p /dev/sdd1
blkid: /dev/sdd1: ambivalent result (probably more filesystems on the device, use wipefs(8) to see more details)
$ sudo wipefs /dev/sdd1
DEVICE OFFSET TYPE   UUID                                 LABEL
sdd1   0x1018 bcache 0609a139-8870-4073-a8a5-70d97853707d 
sdd1   0x438  ext4   a5f8f689-4106-4ee4-bc8b-cc4d3759521a 
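
The same check can be run across all the backing devices at once. This is a rough sketch that assumes the column layout of the wipefs listings shown here (a header row starting with DEVICE, then one row per signature); older util-linux versions print a different layout. The function names are made up for illustration.

```shell
# Count signature rows in a `wipefs` listing read from stdin.
# The header row (first field "DEVICE") and blank lines are skipped.
count_signatures() {
    awk 'NF > 0 && $1 != "DEVICE" { n++ } END { print n + 0 }'
}

# Report every device that carries more than one signature.
# Needs root, because it runs wipefs in its read-only listing mode.
check_devices() {
    for dev in "$@"; do
        n=$(wipefs "$dev" | count_signatures)
        if [ "$n" -gt 1 ]; then
            echo "$dev: $n signatures found"
        fi
    done
}
```

Run as root with the devices from this thread, e.g. check_devices /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1. Any device it reports is a candidate for the "ambivalent result" from blkid.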

I'm not going to attempt to fix this non-destructively. These are part of a btrfs array, so I'll just remove each drive from the array, use wipefs to remove the magic strings, re-create the bcache backing device, and then re-add the drive to the array.

But... I think it might be possible to fix it in place with

$ sudo wipefs -o 0x438 /dev/sdd1

to remove the erroneous ext4 string. But I'm not familiar enough with these structures to be certain it won't cause data corruption.
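
If someone does want to try the in-place route, here is a cautious sketch that only prints a dry-run command for each non-bcache signature instead of erasing anything. It assumes the wipefs listing layout shown above; stale_offsets and plan_wipe are hypothetical helper names, and whether bcache and btrfs are unaffected by the erase is not verified here.

```shell
# Read a `wipefs` listing on stdin and print the OFFSET of every signature
# whose TYPE column differs from the type to keep (first argument).
stale_offsets() {
    keep=$1
    awk -v keep="$keep" 'NF >= 3 && $1 != "DEVICE" && $3 != keep { print $2 }'
}

# Print (do not run) a dry-run wipefs command per stale signature.
# $1 is the device, $2 is the text of its wipefs listing.
plan_wipe() {
    dev=$1
    listing=$2
    printf '%s\n' "$listing" | stale_offsets bcache | while read -r off; do
        echo "wipefs --no-act --offset $off $dev"
    done
}
```

Usage would be something like plan_wipe /dev/sdd1 "$(sudo wipefs /dev/sdd1)". The --no-act flag makes wipefs go through the motions without writing. For a real run, wipefs also has a --backup option that saves the erased bytes to a file in the home directory first, which seems worth using given the uncertainty.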

Last edited by bobpaul (2020-09-16 17:36:21)

