#1 2017-10-11 19:43:14

Convergence
Member
Registered: 2005-07-02
Posts: 377

BTRFS RAID1 Recovery

I have a BTRFS raid1 array that won't mount rw. It gives some kind of "corrupt leaf, bad block order ..." error on sda1.  The RAID consists of sda and sdb1.  I should also mention that both drives have passed SMART long tests without error.

Here is the output of btrfs check /dev/sda:

Checking filesystem on /dev/sda
UUID: fd80eff9-fda6-47ca-bf3c-0aed4d01ae84
checking extents
corrupt extent record: key 1237686263808 168 163840
bad key ordering 123 124
bad block 2181710577664
ERROR: errors found in extent allocation tree or chunk allocation
checking free space cache
There is no free space entry for 1206796767232-1206796701696
There is no free space entry for 1206796767232-1207452041216
cache appears valid but isn't 1206378299392
ERROR: errors found in free space cache
found 881052917760 bytes used, error(s) found
total csum bytes: 0
total tree bytes: 156893184
total fs tree bytes: 0
total extent tree bytes: 156024832
btree space waste bytes: 30955309
file data blocks allocated: 355467264
 referenced 355467264

I searched, but I can't find any specific instructions on raid1 recovery.  I thought that because it mirrors the data completely, I'd have more flexibility for data recovery if I ever needed it.  My options, as far as I can see, are:

- Run "btrfs check --repair /dev/sda", which I hope will intelligently identify which drive is bonkers (assuming it's only one of them) and fix the data using checksums or whatever.  Risky, because I've heard bad things about btrfs check --repair.

- Try to figure out which drive is bonkers in some other way, format it with xfs or ext4, and just copy the data from the undamaged one to the newly formatted drive.  Kind of risky, because there would be only one copy of the data for a while.  This has the benefit of abandoning BTRFS altogether, which is pretty appealing at this point.  I think there is a command that aids with this, something like "btrfs rescue".

- Buy yet another drive and use btrfs's rescue feature to copy the data to it.  This is probably the safest approach, but I'd rather not buy another drive.

Thanks in advance.


It's a very deadly weapon to know what you're doing
---  William Murderface

#2 2017-10-12 01:13:26

Convergence
Member
Registered: 2005-07-02
Posts: 377

Re: BTRFS RAID1 Recovery

I just went ahead and ran btrfs check --repair on /dev/sda.  This is what I got:

btrfs check --repair /dev/sda 
enabling repair mode
Checking filesystem on /dev/sda
UUID: fd80eff9-fda6-47ca-bf3c-0aed4d01ae84
checking extents
corrupt extent record: key 1237686263808 168 163840
bad key ordering 123 124
corrupt extent record: key 1237686263808 168 163840
incorrect offsets 9817 9711
Shifting item nr 122 by 106 bytes in block 2181335908352
Deleting bogus item [1206796718080,168,16384] at slot 123 on block 2181335908352
Shifting item nr 124 by 53 bytes in block 2181335908352
Shifting item nr 125 by 53 bytes in block 2181335908352
Shifting item nr 126 by 53 bytes in block 2181335908352
Shifting item nr 127 by 53 bytes in block 2181335908352
Shifting item nr 128 by 53 bytes in block 2181335908352
Shifting item nr 129 by 53 bytes in block 2181335908352
Shifting item nr 130 by 53 bytes in block 2181335908352
Shifting item nr 131 by 53 bytes in block 2181335908352
Shifting item nr 132 by 53 bytes in block 2181335908352
Shifting item nr 133 by 53 bytes in block 2181335908352
Shifting item nr 134 by 53 bytes in block 2181335908352
Shifting item nr 135 by 53 bytes in block 2181335908352
Shifting item nr 136 by 53 bytes in block 2181335908352
Shifting item nr 137 by 53 bytes in block 2181335908352
Shifting item nr 138 by 53 bytes in block 2181335908352
Shifting item nr 139 by 53 bytes in block 2181335908352
Shifting item nr 140 by 53 bytes in block 2181335908352
Shifting item nr 141 by 53 bytes in block 2181335908352
Shifting item nr 142 by 53 bytes in block 2181335908352
Shifting item nr 143 by 53 bytes in block 2181335908352
Shifting item nr 144 by 53 bytes in block 2181335908352
Shifting item nr 145 by 53 bytes in block 2181335908352
Shifting item nr 146 by 53 bytes in block 2181335908352
Shifting item nr 147 by 53 bytes in block 2181335908352
Shifting item nr 148 by 53 bytes in block 2181335908352
Shifting item nr 149 by 53 bytes in block 2181335908352
Shifting item nr 150 by 53 bytes in block 2181335908352
Shifting item nr 151 by 53 bytes in block 2181335908352
Shifting item nr 152 by 53 bytes in block 2181335908352
Shifting item nr 153 by 53 bytes in block 2181335908352
Shifting item nr 154 by 53 bytes in block 2181335908352
Shifting item nr 155 by 53 bytes in block 2181335908352
Shifting item nr 156 by 53 bytes in block 2181335908352
Shifting item nr 157 by 53 bytes in block 2181335908352
Shifting item nr 158 by 53 bytes in block 2181335908352
Shifting item nr 159 by 53 bytes in block 2181335908352
Shifting item nr 160 by 53 bytes in block 2181335908352
Shifting item nr 161 by 53 bytes in block 2181335908352
Shifting item nr 162 by 53 bytes in block 2181335908352
Shifting item nr 163 by 53 bytes in block 2181335908352
Shifting item nr 164 by 53 bytes in block 2181335908352
Shifting item nr 165 by 53 bytes in block 2181335908352
Shifting item nr 166 by 53 bytes in block 2181335908352
Shifting item nr 167 by 53 bytes in block 2181335908352
Shifting item nr 168 by 53 bytes in block 2181335908352
Shifting item nr 169 by 53 bytes in block 2181335908352
Shifting item nr 170 by 53 bytes in block 2181335908352
Shifting item nr 171 by 53 bytes in block 2181335908352
Shifting item nr 172 by 53 bytes in block 2181335908352
Shifting item nr 173 by 53 bytes in block 2181335908352
Shifting item nr 174 by 53 bytes in block 2181335908352
Shifting item nr 175 by 53 bytes in block 2181335908352
Shifting item nr 176 by 53 bytes in block 2181335908352
Shifting item nr 177 by 53 bytes in block 2181335908352
Shifting item nr 178 by 53 bytes in block 2181335908352
Shifting item nr 179 by 53 bytes in block 2181335908352
Shifting item nr 180 by 53 bytes in block 2181335908352
Shifting item nr 181 by 53 bytes in block 2181335908352
Shifting item nr 182 by 53 bytes in block 2181335908352
Shifting item nr 183 by 53 bytes in block 2181335908352
Shifting item nr 184 by 53 bytes in block 2181335908352
Shifting item nr 185 by 53 bytes in block 2181335908352
Shifting item nr 186 by 53 bytes in block 2181335908352
Shifting item nr 187 by 53 bytes in block 2181335908352
Shifting item nr 188 by 53 bytes in block 2181335908352
Shifting item nr 189 by 53 bytes in block 2181335908352
Shifting item nr 190 by 53 bytes in block 2181335908352
Shifting item nr 191 by 53 bytes in block 2181335908352
Shifting item nr 192 by 53 bytes in block 2181335908352
Shifting item nr 193 by 53 bytes in block 2181335908352
corrupt extent record: key 1237686263808 168 163840
corrupt extent record: key 1237686263808 168 163840
corrupt extent record: key 1237686263808 168 163840
Incorrect local backref count on 1199849078784 root 5 owner 2723970 offset 24117248 found 0 wanted 1 back 0x55e027f8a640
Backref disk bytenr does not match extent record, bytenr=1199849078784, ref bytenr=17592197054464
Backref 1199849078784 root 5 owner 2723970 offset 7340032 num_refs 0 not found in extent tree
Incorrect local backref count on 1199849078784 root 5 owner 2723970 offset 7340032 found 1 wanted 0 back 0x55e02ead9e20
backpointer mismatch on [1199849078784 524288]
repair deleting extent record: key 1199849078784 168 524288
adding new data backref on 1199849078784 root 5 owner 2723970 offset 7340032 found 1
Repaired extent references for 1199849078784
ref mismatch on [1206796701696 36864] extent item 1, found 3
Incorrect local backref count on 1206796701696 root 5 owner 6195 offset 0 found 0 wanted 1 back 0x55e03a01f7d0
Backref disk bytenr does not match extent record, bytenr=1206796701696, ref bytenr=0
Backref 1206796701696 root 5 owner 6189 offset 0 num_refs 0 not found in extent tree
Incorrect local backref count on 1206796701696 root 5 owner 6189 offset 0 found 1 wanted 0 back 0x55e030ac2a00
Backref bytes do not match extent backref, bytenr=1206796701696, ref bytes=36864, backref bytes=16384
Backref 1206796701696 root 5 owner 6190 offset 0 num_refs 0 not found in extent tree
Incorrect local backref count on 1206796701696 root 5 owner 6190 offset 0 found 1 wanted 0 back 0x55e03a01fb60
Backref disk bytenr does not match extent record, bytenr=1206796701696, ref bytenr=1206796718080
Backref bytes do not match extent backref, bytenr=1206796701696, ref bytes=36864, backref bytes=16384
Backref 1206796701696 root 5 owner 6192 offset 0 num_refs 0 not found in extent tree
Incorrect local backref count on 1206796701696 root 5 owner 6192 offset 0 found 1 wanted 0 back 0x55e037728d30
Backref disk bytenr does not match extent record, bytenr=1206796701696, ref bytenr=1206796734464
Backref bytes do not match extent backref, bytenr=1206796701696, ref bytes=36864, backref bytes=32768
backpointer mismatch on [1206796701696 36864]
attempting to repair backref discrepency for bytenr 1206796701696
Backrefs don't agree with each other and extent record doesn't agree with anybody, so we can't fix bytenr 1206796701696 bytes 36864
failed to repair damaged filesystem, aborting

So I guess that cuts down on my options.  Fortunately it didn't run very long, so the damage is unlikely to be that extensive. 

My question now is this: 
How dangerous would it be to just format one of the drives with XFS, and run 'btrfs restore' on the remaining drive, copying the data to the XFS drive?  I really wish I had a spare TiB or two lying around, but I don't.
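
One low-risk way to gauge this before formatting anything, offered as a sketch under assumptions rather than a tested procedure: `btrfs restore` has a dry-run flag (`-D`) that lists what it would copy without writing, so it can be tried while both disks are still btrfs. The source device and the `/mnt/rescue` target below are placeholders, and `run` is a hypothetical helper that only prints each command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only. SRC and DEST are placeholders, not a recommendation of
# which disk to keep; run() is a hypothetical helper for this example.
SRC=/dev/sdb1        # btrfs member to read from (assumed)
DEST=/mnt/rescue     # mount point of the freshly formatted XFS disk (assumed)

# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# 1. List what restore can reach without writing anything (-D = dry run).
run btrfs restore -D -v "$SRC" "$DEST"

# 2. Only once that listing looks complete: copy for real, skipping
#    unreadable files instead of aborting (-i = ignore errors).
run btrfs restore -i -v "$SRC" "$DEST"
```

If the dry-run listing is missing directories that matter, that would be a reason not to reformat the second disk yet.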



#3 2017-10-12 08:15:31

mich41
Member
Registered: 2012-06-22
Posts: 796

Re: BTRFS RAID1 Recovery

I'm no btrfs expert, but your fsck doesn't seem to report any I/O errors, bad checksums or mismatches between the two disks. You also say that the long self-test passed, which is a rather strong argument against the I/O error theory. If there are no I/O error messages in dmesg either, then you can be sure that's not the issue. I therefore think both disks are fine and the problem is that some software bug caused bogus data to be written to the filesystem, in two identical, correctly checksummed, fully redundant and still perfectly readable copies.
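
The dmesg check could be done with a small filter like the one below. It is only a sketch: `io_errors` is a made-up helper name, and the patterns are assumptions about what device-level trouble typically looks like in the kernel log, not an exhaustive list.

```shell
#!/bin/sh
# Sketch only: the patterns are assumptions about typical kernel
# messages, not a complete catalogue.

# Read log lines on stdin and keep those that hint at device-level trouble.
io_errors() {
    grep -iE 'i/o error|medium error|csum failed|read error corrected'
}

# Usage (as root):  dmesg | io_errors
# No output (exit status 1) means none of the patterns matched.
```

An empty result supports the "disks are fine" reading; any hit would be worth looking at before touching either drive.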

Is it possible to mount the filesystem read-only using just one disk? And then mount separately using the other disk? If so, I'd do that and check which disk can be used to recover important data (probably either both of them or none) and if it's both or at least one then umount and mkfs the other one and rescue the data.
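
A rough sketch of that one-disk-at-a-time check, assuming the `ro` and `degraded` mount options are what is needed here. `/mnt/probe` is a placeholder mount point and `run` is a hypothetical helper that prints commands unless `DRY_RUN=0` is set. As far as I know the kernel assembles every member device it has already scanned, so testing one disk truly on its own probably means physically disconnecting the other one first.

```shell
#!/bin/sh
# Sketch only. MNT is a placeholder; run() is a hypothetical helper
# that prints commands unless DRY_RUN=0 is set.
MNT=/mnt/probe

run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Try one mirror half read-only. "degraded" allows mounting with a
# member missing; "ro" keeps btrfs from writing to the disk.
probe() {
    run mount -o ro,degraded "$1" "$MNT" || return 1
    run ls "$MNT"
    run umount "$MNT"
}

# Device names as given in the thread.
probe /dev/sda
probe /dev/sdb1
```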

Or go ask somewhere else where people are experts on recovering broken btrfs.

In the future, use Linux software RAID and XFS :P


#4 2017-10-12 23:50:48

Convergence
Member
Registered: 2005-07-02
Posts: 377

Re: BTRFS RAID1 Recovery

mich41:  You have some good points.  I will try to mount it RO one disk at a time over the weekend when I get some time.  If one of the drives is good, I'll format the other with XFS and look into software RAID.  I might just skip RAID altogether and just back up instead.

Thanks!


