One of my users in a research environment triggered an out-of-memory condition on a server that mounts a 52TB btrfs partition. I had to power cycle the server.
After the reboot my btrfs partition cannot be mounted in read-write mode.
Mar 19 15:10:52 mamut kernel: BTRFS error (device dm-5): open_ctree failed
Mar 19 15:10:52 mamut kernel: BTRFS info (device dm-5): use lzo compression, level 0
Mar 19 15:10:52 mamut kernel: BTRFS info (device dm-5): disk space caching is enabled
Mar 19 15:10:52 mamut kernel: BTRFS info (device dm-5): has skinny extents
Mar 19 15:10:52 mamut systemd[1]: mnt-storage.mount: Mount process exited, code=killed, status=15/TERM
Mar 19 15:10:52 mamut systemd[1]: mnt-storage.mount: Failed with result 'timeout'.
Mar 19 15:10:52 mamut systemd[1]: Failed to mount /mnt/storage.
Mar 19 15:10:52 mamut kernel: BTRFS error (device dm-5): super_total_bytes 52798547820544 mismatch with fs_devices total_rw_bytes 105597095641088
Mar 19 15:10:52 mamut kernel: BTRFS error (device dm-5): failed to read chunk tree: -22
Mar 19 15:10:52 mamut kernel: BTRFS error (device dm-5): open_ctree failed
[...]
Mar 19 15:15:52 mamut systemd-helper[9798]: IO Error (subvolume is not a btrfs subvolume).
Mar 19 15:15:52 mamut systemd-helper[9798]: number cleanup for 'storage' failed.
Mar 19 15:15:52 mamut systemd-helper[9798]: running timeline cleanup for 'storage'.
Mar 19 15:15:52 mamut systemd-helper[9798]: IO Error (subvolume is not a btrfs subvolume).
Mar 19 15:15:52 mamut systemd-helper[9798]: timeline cleanup for 'storage' failed.
Mar 19 15:15:52 mamut systemd-helper[9798]: running empty-pre-post cleanup for 'storage'.
Mar 19 15:15:52 mamut systemd-helper[9798]: IO Error (subvolume is not a btrfs subvolume).
Mar 19 15:15:52 mamut systemd-helper[9798]: empty-pre-post cleanup for storage failed.
Mar 19 15:15:52 mamut systemd[1]: snapper-cleanup.service: Main process exited, code=exited, status=1/FAILURE
Mar 19 15:15:52 mamut systemd[1]: snapper-cleanup.service: Failed with result 'exit-code'.
The super_total_bytes=52798547820544 is the correct size of the partition in bytes reported by fdisk.
fs_devices total_rw_bytes=105597095641088 is exactly twice that.
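The doubling can be confirmed with a quick arithmetic check. This is a minimal Python sketch that uses only the two values from the kernel log above; the variable names are illustrative.

```python
# Values copied from the kernel log line quoted above.
super_total_bytes = 52798547820544
total_rw_bytes = 105597095641088

# divmod gives an exact integer quotient and remainder (no float rounding).
ratio, remainder = divmod(total_rw_bytes, super_total_bytes)
print(ratio, remainder)  # prints "2 0": exactly twice, nothing left over

# The same size expressed in TiB (binary units), for comparison with tools
# that report decimal TB.
print(round(super_total_bytes / 2**40, 2))
```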
I tried running btrfs check but got this error:
# btrfs check /dev/mapper/fc_trunk-part3
Opening filesystem to check...
Checking filesystem on /dev/mapper/fc_trunk-part3
UUID: 40a2e65b-f34a-4d33-946d-055d93fe7ffa
[1/7] checking root items
ERROR: failed to repair root items: Input/output error
Now, I know about `btrfs rescue fix-device-size`, but I have never run it before. The man page says:
fix-device-size <device>
fix device size and super block total bytes values that are do
not match
Kernel 4.11 starts to check the device size more strictly and
this might mismatch the stored value of total bytes. See the
exact error message below. Newer kernel will refuse to mount the
filesystem where the values do not match. This error is not fatal
and can be fixed. This command will fix the device size values if
possible.
BTRFS error (device sdb): super_total_bytes 92017859088384 mismatch with fs_devices total_rw_bytes 92017859094528
The mismatch may also exhibit as a kernel warning:
WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559 btrfs_update_device+0x1c5/0x1d0 [btrfs]
Kernel version did change after reboot, but both versions are > 4.11 and previously I had no problems mounting this partition.
The partition:
- is big and will take a lot of time, and space I don't have, to back up
- has critical data for my research
- has snapshots
- can be mounted with -o rescue,ro
Is it safe to call `btrfs rescue fix-device-size`? Can I fix it in some other safe way?
Last edited by merilius (2019-03-19 16:41:41)
Perhaps these threads in linux-btrfs can help:
https://lore.kernel.org/linux-btrfs/9dc … 2@gmx.com/
https://lore.kernel.org/linux-btrfs/c70 … edalo.com/
https://lore.kernel.org/linux-btrfs/bde … a@gmx.com/
If you trust that information, the problem is not that bad and running the rescue command is relatively safe. There is some probability of data loss, but since you do not want to back up your critical data, it is not really critical.
Thanks @mxfm,
I saw the first post. That user had a tiny difference between the two sizes. Same in the second post.
However, the third one is the same as mine! Twice as large.
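To tell the two kinds of mismatch apart (a few bytes off versus an exact multiple), the kernel message can be parsed and compared. This is only a sketch: `describe_mismatch` is a hypothetical helper name, and the regular expression assumes the message format exactly as quoted in this thread.

```python
import re

# Matches the kernel message format quoted in this thread (an assumption:
# other kernel versions may word the message differently).
MISMATCH_RE = re.compile(
    r"super_total_bytes (\d+) mismatch with fs_devices total_rw_bytes (\d+)"
)


def describe_mismatch(log_line):
    """Return (difference_in_bytes, exact_multiple_or_None) for a log line."""
    match = MISMATCH_RE.search(log_line)
    if match is None:
        raise ValueError("no super_total_bytes mismatch in line")
    super_total, total_rw = (int(group) for group in match.groups())
    quotient, remainder = divmod(total_rw, super_total)
    # Report a multiple only when total_rw is an exact multiple of super_total.
    multiple = quotient if remainder == 0 else None
    return total_rw - super_total, multiple


# Example from the man page: the values differ by a few KiB.
man_page = ("BTRFS error (device sdb): super_total_bytes 92017859088384 "
            "mismatch with fs_devices total_rw_bytes 92017859094528")
# The message from this thread: the second value is exactly double.
mine = ("BTRFS error (device dm-5): super_total_bytes 52798547820544 "
        "mismatch with fs_devices total_rw_bytes 105597095641088")

print(describe_mismatch(man_page))  # (6144, None)
print(describe_mismatch(mine))      # (52798547820544, 2)
```

A small difference matches the case the man page describes, while an exact multiple matches the third linked thread.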
After a long btrfs check --repair (this time with no IO errors) it finally worked. And I have exactly the same problem with long mount time.
Am I right that after running the repair command you fixed the 'super_total_bytes' error, but now mounting the partition is very slow?
This explains the slow mount.
Unless using the new BG_TREE feature I purposed, the slow mount can't
really be solved.
Thanks,
Qu
It is unlikely you will find any help beyond btrfs developers.