
#1 2010-05-19 06:00:22

XtrmGmr99
Member
Registered: 2009-04-14
Posts: 128

RAID0 error... oh boy...

Hello,

I've been working on my server adding a few disks to the LVM. This required some of the files that I usually back up to the server to be moved to my main computer -- a BIOS (fake) RAID0 array that uses two 160GB Hitachi disks. The disk manipulation on the server, oddly enough, went fine. After I was done working on it, I turned back to my computer to find an incredibly slow desktop (was booted into Windows 7) -- it eventually crashed and I had to perform a hard reboot.

When booting, it shows an error on the RAID disk. It won't boot into either Arch or Windows 7. On Windows 7, it sometimes gets to the logo. On Arch, it seems to read at least a little from the drive because it tries to mount the RAID array via dmraid. It says, among other things, "ERROR: isw: wrong number of devices in RAID set "isw_bicdeefigd_Volume0" [1/2] on /dev/sda". It then goes on to say that device mapper failed, blah blah blah, and drops me to a recovery shell.

If I'm not mistaken, for it to get that far would require it to read from /boot, correct? If I remember correctly, everything was on the RAID0 array (i.e. I did NOT have a separate partition for /boot), so it successfully reads at least part of the disk... Correct me if I'm wrong.

Considering that I have my backups on this RAID array due to unfortunate timing, it's somewhat crucial I'm at least able to access some files if not be completely bootable.

I have an Ubuntu installation that I can boot off my flash drive to perform recovery if need be. The drives are SMART enabled if that would help at all.

Can anyone help me? I know that RAID0 is not the smartest thing to use, but it's usually safe since I make backups to the server. It's just that, due to me working on the server, the backups are on the array. I've never done any sort of data recovery, so I'm not really sure where to start... I've read that I could try to disable the RAID in the BIOS and recreate it on Linux via dmraid. I've also read that performing a few simple diagnostics would clear errors, allowing it to boot. But I'm going in blind here and would really like some assistance.

Thank you so much to whoever can help!

#2 2010-05-19 08:41:10

fukawi2
Ex-Administratorino
From: .vic.au
Registered: 2007-09-28
Posts: 6,222
Website

Re: RAID0 error... oh boy...

RAID-0 is named because that's how many files you get back if it fails ;)

I can't imagine disabling the mobo raid and trying to assemble it in Linux will work unless you know *exactly* how the mobo assembles it (stripe size etc). You would have to force it, and if you get anything wrong then I imagine it will only make it worse.

Here's what I would do:
1. TURN THAT COMPUTER OFF. Don't make it worse than it already is.
2. Boot a LiveCD of some sort (Arch installer, Ubuntu, whatever).
3. Take an image of both drives using dd, writing to a separate disk.
4. Work on the disk images -- try assembling them with mdadm, run data recovery software over them, etc.
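
To illustrate why the stripe size matters when assembling by hand: in a two-disk RAID0, consecutive chunks of the volume alternate between the two members, so interleaving them with the wrong chunk size scrambles the data. The following is a toy sketch on small temporary files, not a recovery tool. The function names `stripe` and `unstripe` and the 8-byte chunk are made up for the demonstration, and it assumes the data starts at offset 0 of each member, which a real controller layout may not guarantee.

```shell
#!/bin/sh
# Toy model of a two-member RAID0: chunk i of the volume lives on
# member (i % 2) at chunk index (i / 2). All names are hypothetical.

work=$(mktemp -d)

# stripe SRC CHUNK OUT0 OUT1: split SRC across two members
stripe() {
    src=$1; chunk=$2; out0=$3; out1=$4
    : > "$out0"; : > "$out1"
    size=$(wc -c < "$src")
    n=$(( (size + chunk - 1) / chunk ))
    i=0
    while [ "$i" -lt "$n" ]; do
        if [ $((i % 2)) -eq 0 ]; then dst=$out0; else dst=$out1; fi
        dd if="$src" bs="$chunk" skip="$i" count=1 2>/dev/null >> "$dst"
        i=$((i + 1))
    done
}

# unstripe IN0 IN1 CHUNK OUT: interleave the members back into one volume
unstripe() {
    in0=$1; in1=$2; chunk=$3; out=$4
    : > "$out"
    size=$(wc -c < "$in0")
    n=$(( 2 * ((size + chunk - 1) / chunk) ))
    i=0
    while [ "$i" -lt "$n" ]; do
        if [ $((i % 2)) -eq 0 ]; then src=$in0; else src=$in1; fi
        dd if="$src" bs="$chunk" skip=$((i / 2)) count=1 2>/dev/null >> "$out"
        i=$((i + 1))
    done
}

# 64 bytes of test data: eight 8-byte runs, A through H.
printf '%s' 'AAAAAAAABBBBBBBBCCCCCCCCDDDDDDDDEEEEEEEEFFFFFFFFGGGGGGGGHHHHHHHH' > "$work/volume"

stripe "$work/volume" 8 "$work/m0" "$work/m1"
unstripe "$work/m0" "$work/m1" 8 "$work/right"    # same chunk size as the "controller"
unstripe "$work/m0" "$work/m1" 16 "$work/wrong"   # wrong guess

if cmp -s "$work/volume" "$work/right"; then right_result=match; else right_result=differ; fi
if cmp -s "$work/volume" "$work/wrong"; then wrong_result=match; else wrong_result=differ; fi
echo "chunk 8:  $right_result"
echo "chunk 16: $wrong_result"
rm -rf "$work"
```

With the wrong chunk size the output has the right length but the runs come out in the wrong order, which is why guessing on the real disks instead of on images is risky.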

#3 2010-05-19 13:21:47

XtrmGmr99
Member
Registered: 2009-04-14
Posts: 128

Re: RAID0 error... oh boy...

How would I take disc images of them? The individual drives don't have any partitions, unless you can take an image of the entire disc...

#4 2010-05-19 14:14:48

.:B:.
Forum Fellow
Registered: 2006-11-26
Posts: 5,819
Website

Re: RAID0 error... oh boy...

Dd?


Got Leenucks? :: Arch: Power in simplicity :: Get Counted! Registered Linux User #392717 :: Blog thingy

#5 2010-05-19 15:16:34

XtrmGmr99
Member
Registered: 2009-04-14
Posts: 128

Re: RAID0 error... oh boy...

But wouldn't I need two discs to dd the hard drives to? Or will dd'ing the images to two files on a single 500GB hard drive work?

I'd try it, but I'm not currently at the house, so... >_>

#6 2010-05-19 16:17:45

.:B:.
Forum Fellow
Registered: 2006-11-26
Posts: 5,819
Website

Re: RAID0 error... oh boy...

You can dd to an image, so as long as the destination HD is big enough you can put as many images on it as you want. You can also compress the images, which brings them down to roughly the size of the used space on the original disk (rather than its full size), provided the unused space is mostly zeros.
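
A rough sketch of the compression idea, using a small temporary file as a stand-in for a disk image. The file names are made up for the example, and the stand-in is almost entirely zeros, which is the best case for compression; an image full of old, non-zeroed data would shrink far less.

```shell
#!/bin/sh
# Stand-in for a disk image: 1 MiB of zeros with a little data at the start.
work=$(mktemp -d)
img="$work/disk.img"
dd if=/dev/zero of="$img" bs=1024 count=1024 2>/dev/null
printf 'some file data' | dd of="$img" conv=notrunc 2>/dev/null

# Compress to a separate file and keep the raw image untouched.
gzip -c "$img" > "$img.gz"

raw_size=$(wc -c < "$img")
gz_size=$(wc -c < "$img.gz")
echo "raw: $raw_size bytes, compressed: $gz_size bytes"

# To work on the image again it has to be expanded back to full size.
gzip -dc "$img.gz" > "$work/restored.img"
if cmp -s "$img" "$work/restored.img"; then roundtrip=identical; else roundtrip=changed; fi
echo "round trip: $roundtrip"
rm -rf "$work"
```

The catch is the last step: a compressed image is fine for storage, but it needs the full amount of space again before it can be treated like a disk.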


#7 2010-05-19 23:16:16

fukawi2
Ex-Administratorino
From: .vic.au
Registered: 2007-09-28
Posts: 6,222
Website

Re: RAID0 error... oh boy...

Something like this:

dd if=/dev/sdX of=/mnt/big_disk/disk1.img bs=8M
dd if=/dev/sdY of=/mnt/big_disk/disk2.img bs=8M

Replace sdX and sdY with the actual disks. This will take an image of each whole disk, block for block, so you will need at least 320 GB of free space.

You can't compress them since you wouldn't be able to work on them and treat them like disks then ;)
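
One possible way to wrap the dd commands above so that each image is checked against its source afterwards. `image_disk` is a hypothetical helper name, and the demonstration copies a temporary file rather than a real /dev/sdX so it can be run safely. Note that the comparison reads the source a second time, which may not be desirable on a drive that is actively failing.

```shell
#!/bin/sh
# image_disk SRC DST: copy SRC to DST block for block, then compare the two.
# Hypothetical helper; on a real system SRC would be a device such as /dev/sdX.
image_disk() {
    src=$1; dst=$2
    dd if="$src" of="$dst" bs=8M 2>/dev/null || return 1
    cmp -s "$src" "$dst"
}

# Demonstration on a 256 KiB stand-in file instead of a real disk.
work=$(mktemp -d)
dd if=/dev/urandom of="$work/fake_disk" bs=1024 count=256 2>/dev/null

if image_disk "$work/fake_disk" "$work/disk1.img"; then
    image_status=verified
else
    image_status=failed
fi
echo "image: $image_status"
rm -rf "$work"
```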

#8 2010-05-20 03:34:38

XtrmGmr99
Member
Registered: 2009-04-14
Posts: 128

Re: RAID0 error... oh boy...

Thanks! :)

Okay, I've got the backups. I didn't add the bs= parameter to it though, just a basic dd if=? of=?:

$ ls -lh
total 307G
drwx------ 2 root root  16K 2010-05-19 19:36 lost+found
-rw-r--r-- 1 root root 154G 2010-05-19 20:57 sda_backup.img
-rw-r--r-- 1 root root 154G 2010-05-19 22:28 sdb_backup.img

Interestingly, it's only 307GB.

I found someone with a very similar problem. So far what he has said matches my situation with little difference. However, after he makes full backups, he then goes into the BIOS and re-initializes the RAID array which wipes the partition table. He fixed it by comparing hexdump data and seeing where there was a difference in the newly-initialized array and the backups: "Apparently 0x2000 and 0x200 bytes were zeroed out from disks, respectively. Used dd to restore them from backup"

He then restored those bytes from the backup using some terminal magic.
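
The kind of comparison he describes can be approximated with `cmp -l`, which lists every differing byte along with its 1-based offset. The sketch below is only an illustration on temporary files: a "backup" filled with 0xff bytes and a copy whose first 0x200 bytes have been zeroed. The file names are invented for the example.

```shell
#!/bin/sh
work=$(mktemp -d)
backup="$work/backup.img"
wiped="$work/wiped.img"

# Backup stand-in: 4 KiB of 0xff bytes, so zeroed bytes are guaranteed to differ.
dd if=/dev/zero bs=1024 count=4 2>/dev/null | LC_ALL=C tr '\000' '\377' > "$backup"
cp "$backup" "$wiped"
# Simulate the re-initialisation: zero the first 0x200 (512) bytes in place.
dd if=/dev/zero of="$wiped" bs=512 count=1 conv=notrunc 2>/dev/null

# cmp -l prints "offset byte1 byte2" per difference; offsets are decimal and start at 1.
first=$(cmp -l "$backup" "$wiped" | head -n 1 | awk '{ print $1 - 1 }')
last=$(cmp -l "$backup" "$wiped" | tail -n 1 | awk '{ print $1 - 1 }')
count=$(cmp -l "$backup" "$wiped" | wc -l)
printf 'differences from 0x%x to 0x%x (%d bytes)\n' "$first" "$last" "$count"
rm -rf "$work"
```

On full-size disks this would be slow and the output could be huge if the disks differ a lot, so limiting the comparison to the first few megabytes (GNU cmp has `-n` for that) is probably the sensible starting point.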

I'm not really sure where to go from here. I can try his method of going to the BIOS and setting a new RAID with the same parameters, which should fix the disk sync, but then I have to face doing something that's way out of my league (I barely understand the command he used to fix the partition table after initializing the disks). Or I can try to use mdadm, but again, I've never used it before and really don't know where to start.

Thanks for the help, everyone!

EDIT: and, also, if I do screw up the physical disks, I can just dd the backups back onto them, right (assuming I don't touch the backups...)?

Last edited by XtrmGmr99 (2010-05-20 13:58:05)

#9 2010-05-20 08:20:21

fukawi2
Ex-Administratorino
From: .vic.au
Registered: 2007-09-28
Posts: 6,222
Website

Re: RAID0 error... oh boy...

XtrmGmr99 wrote:

Interestingly, it's only 307GB.

Rounding and manufacturer marketing confusion ;)

XtrmGmr99 wrote:

I found someone with a very similar problem.

Best waterblock?? Wrong link I think ;)

XtrmGmr99 wrote:

EDIT: and, also, if I do screw up the physical disks, I can just dd the backups back onto them, right (assuming I don't touch the backups...)?

In theory, yes.... I make no guarantees tho ;)

#10 2010-05-20 13:57:47

XtrmGmr99
Member
Registered: 2009-04-14
Posts: 128

Re: RAID0 error... oh boy...

fukawi2 wrote:
XtrmGmr99 wrote:

I found someone with a very similar problem.

Best waterblock?? Wrong link I think ;)

/facepalm, my fault. Link has been fixed... =P

#11 2010-05-20 22:29:56

fukawi2
Ex-Administratorino
From: .vic.au
Registered: 2007-09-28
Posts: 6,222
Website

Re: RAID0 error... oh boy...

That could work, but I've never done it so I don't have the experience to comment on the validity. Seems to make sense though.

#12 2010-05-20 22:54:23

Maximalminimalist
Member
Registered: 2009-09-20
Posts: 112

Re: RAID0 error... oh boy...

I was once able to recover 900 GB from a corrupt (non-RAID) partition. The only thing I can tell you:

Use ddrescue for making an image of a hard disk. ;)
(In Ubuntu the package is named gddrescue.)

(Maybe take a look at my thread about that, though I don't think it would help: http://bbs.archlinux.org/viewtopic.php?id=91896)

Last edited by Maximalminimalist (2010-05-20 22:57:13)

#13 2010-05-21 03:09:03

XtrmGmr99
Member
Registered: 2009-04-14
Posts: 128

Re: RAID0 error... oh boy...

I fixed it!

I went ahead and decided to just re-create the array from the BIOS and see what happens.

Going to Google, I found out that the boot code occupies the first 446 bytes of the disk, followed by the 64-byte partition table and a 2-byte boot signature, which brings the total to 512 bytes (one hard disk sector, the MBR). When re-initializing the broken array via the BIOS and keeping the exact same settings (stripe size, volume name), it should only wipe the first sector of the hard disk and keep the rest of the data. So after I re-initialized the array, all my data was put back together; however, the map containing the partition information was gone.
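
For reference, the classic MBR sector is laid out as 446 bytes of boot code, four 16-byte partition entries starting at offset 446, and the two signature bytes 55 aa at offset 510. A small sketch of checking those fields with `od` follows. It runs against a synthetic sector built in a temporary file (one entry of type 0x83 in the first slot), and `read_hex` is just a helper name chosen for this example.

```shell
#!/bin/sh
# Build a synthetic 512-byte sector: zeros, partition type 0x83 in slot 1,
# and the 55 aa signature in the last two bytes.
work=$(mktemp -d)
sector="$work/sector.bin"
dd if=/dev/zero of="$sector" bs=512 count=1 2>/dev/null
# Offset 450 = 446 (start of table) + 4 (type byte within the entry); \203 is 0x83.
printf '\203' | dd of="$sector" bs=1 seek=450 conv=notrunc 2>/dev/null
# \125\252 is 0x55 0xaa in octal.
printf '\125\252' | dd of="$sector" bs=1 seek=510 conv=notrunc 2>/dev/null

# read_hex FILE OFFSET LENGTH: print LENGTH bytes at OFFSET as lowercase hex
read_hex() {
    od -An -tx1 -j"$2" -N"$3" "$1" | tr -d ' \n'
}

signature=$(read_hex "$sector" 510 2)
types=""
slot=0
while [ "$slot" -lt 4 ]; do
    # Each entry is 16 bytes; the partition type is the 5th byte of the entry.
    types="$types$(read_hex "$sector" $((446 + slot * 16 + 4)) 1) "
    slot=$((slot + 1))
done
echo "signature: $signature"
echo "partition types: $types"
rm -rf "$work"
```

Pointing `read_hex` at a backup image instead of the synthetic sector would show whether the image still carries a signature and a populated table before anything is written back.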

Looking at the first 512 bytes of data from the first disk confirmed that re-initializing the disks wiped the vital partition information, replacing it with 0's:

$ sudo hexdump -C -n 512 /dev/sda
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200

So I went back to the backup, which has a working MBR and partition table, and copied the first 512 bytes to /dev/sda:

$ sudo dd if='/media/New Volume/sda_backup.img' of='/dev/sda' bs=1 count=512
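
For anyone rehearsing this on image files before touching the real disk, one detail matters: when the output is a regular file, dd truncates it unless `conv=notrunc` is given. The sketch below shows a cautious version of the same step on temporary stand-in files (all names invented here): save the sector that is about to be overwritten, copy the first sector from the backup, then verify.

```shell
#!/bin/sh
work=$(mktemp -d)
backup="$work/sda_backup.img"   # stand-in for the good image
target="$work/target.img"       # stand-in for the re-initialised disk

# Two 8 KiB stand-ins: the backup is all 0xff, the target has a zeroed first sector.
dd if=/dev/zero bs=1024 count=8 2>/dev/null | LC_ALL=C tr '\000' '\377' > "$backup"
cp "$backup" "$target"
dd if=/dev/zero of="$target" bs=512 count=1 conv=notrunc 2>/dev/null

# 1. Keep a copy of the sector that is about to be overwritten.
dd if="$target" of="$work/sector0.before" bs=512 count=1 2>/dev/null
# 2. Copy the first sector from the backup; conv=notrunc leaves the rest
#    of a regular file in place instead of cutting it off after 512 bytes.
dd if="$backup" of="$target" bs=512 count=1 conv=notrunc 2>/dev/null
# 3. Verify that the size is unchanged and the content now matches the backup.
size_after=$(wc -c < "$target")
if cmp -s "$backup" "$target"; then restore_status=match; else restore_status=mismatch; fi
echo "size: $size_after bytes, restore: $restore_status"
rm -rf "$work"
```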

Using hexdump, I verified that the copy did indeed work, and the MBR info from the backup wrote to the disk just fine:

$ sudo hexdump -C -n 512 /dev/sda
00000000  eb 48 90 d0 bc 00 7c 8e  c0 8e d8 be 00 7c bf 00  |.H....|......|..|
00000010  06 b9 00 02 fc f3 a4 50  68 1c 06 cb fb b9 04 00  |.......Ph.......|
00000020  bd be 07 80 7e 00 00 7c  0b 0f 85 0e 01 83 c5 10  |....~..|........|
00000030  e2 f1 cd 18 88 56 00 55  c6 46 11 05 c6 46 03 02  |.....V.U.F...F..|
00000040  ff 00 00 20 01 00 00 00  00 02 fa 90 90 f6 c2 80  |... ............|
00000050  75 02 b2 80 ea 59 7c 00  00 31 c0 8e d8 8e d0 bc  |u....Y|..1......|
00000060  00 20 fb a0 40 7c 3c ff  74 02 88 c2 52 be 7f 7d  |. ..@|<.t...R..}|
00000070  e8 34 01 f6 c2 80 74 54  b4 41 bb aa 55 cd 13 5a  |.4....tT.A..U..Z|
00000080  52 72 49 81 fb 55 aa 75  43 a0 41 7c 84 c0 75 05  |RrI..U.uC.A|..u.|
00000090  83 e1 01 74 37 66 8b 4c  10 be 05 7c c6 44 ff 01  |...t7f.L...|.D..|
000000a0  66 8b 1e 44 7c c7 04 10  00 c7 44 02 01 00 66 89  |f..D|.....D...f.|
000000b0  5c 08 c7 44 06 00 70 66  31 c0 89 44 04 66 89 44  |\..D..pf1..D.f.D|
000000c0  0c b4 42 cd 13 72 05 bb  00 70 eb 7d b4 08 cd 13  |..B..r...p.}....|
000000d0  73 0a f6 c2 80 0f 84 ea  00 e9 8d 00 be 05 7c c6  |s.............|.|
000000e0  44 ff 00 66 31 c0 88 f0  40 66 89 44 04 31 d2 88  |D..f1...@f.D.1..|
000000f0  ca c1 e2 02 88 e8 88 f4  40 89 44 08 31 c0 88 d0  |........@.D.1...|
00000100  c0 e8 02 66 89 04 66 a1  44 7c 66 31 d2 66 f7 34  |...f..f.D|f1.f.4|
00000110  88 54 0a 66 31 d2 66 f7  74 04 88 54 0b 89 44 0c  |.T.f1.f.t..T..D.|
00000120  3b 44 08 7d 3c 8a 54 0d  c0 e2 06 8a 4c 0a fe c1  |;D.}<.T.....L...|
00000130  08 d1 8a 6c 0c 5a 8a 74  0b bb 00 70 8e c3 31 db  |...l.Z.t...p..1.|
00000140  b8 01 02 cd 13 72 2a 8c  c3 8e 06 48 7c 60 1e b9  |.....r*....H|`..|
00000150  00 01 8e db 31 f6 31 ff  fc f3 a5 1f 61 ff 26 42  |....1.1.....a.&B|
00000160  7c be 85 7d e8 40 00 eb  0e be 8a 7d e8 38 00 eb  ||..}.@.....}.8..|
00000170  06 be 94 7d e8 30 00 be  99 7d e8 2a 00 eb fe 47  |...}.0...}.*...G|
00000180  52 55 42 20 00 47 65 6f  6d 00 48 61 72 64 20 44  |RUB .Geom.Hard D|
00000190  69 73 6b 00 52 65 61 64  00 20 45 72 72 6f 72 00  |isk.Read. Error.|
000001a0  bb 01 00 b4 0e cd 10 ac  3c 00 75 f4 c3 00 00 00  |........<.u.....|
000001b0  00 00 00 00 00 00 00 00  46 82 88 ac 00 00 80 20  |........F...... |
000001c0  21 00 07 df 13 0c 00 08  00 00 00 20 03 00 00 df  |!.......... ....|
000001d0  14 0c 07 fe ff ff 00 28  03 00 00 e0 3a 20 80 fe  |.......(....: ..|
000001e0  ff ff 83 fe ff ff 00 08  3e 20 fd f3 0e 00 00 fe  |........> ......|
000001f0  ff ff 05 fe ff ff fd fb  4c 20 30 58 0b 06 55 aa  |........L 0X..U.|
00000200

I restarted the computer and booted back into my recovery Ubuntu distribution. dmraid successfully identified all the partitions in the array and I was able to mount them! So far, I haven't come across any corrupted data and I'm performing backups now. I doubt I'll be able to boot back into the partitions themselves; I know Arch is set up to look for the 'isw_bicdeefigd_Volume0' RAID set; re-initializing the array changed the ID to 'isw_egdbiidgh_Volume0', so it's doubtful it'll boot unless I change it up a bit. I'm not sure what Windows 7 would do, and I don't plan to find out until the backups are complete. =P
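
If the Arch configuration refers to the array by its old set name, one approach would be to substitute the new name wherever it appears. Which files are affected is an assumption here (fstab and the bootloader menu are likely candidates), and the partition suffixes in the sample are made up, so this sketch works on a throwaway sample file rather than on real configuration.

```shell
#!/bin/sh
old='isw_bicdeefigd_Volume0'
new='isw_egdbiidgh_Volume0'
work=$(mktemp -d)
conf="$work/fstab.sample"   # hypothetical sample, not the real /etc/fstab

printf '%s\n' \
    "/dev/mapper/${old}p3  /      ext4  defaults  0 1" \
    "/dev/mapper/${old}p4  /home  ext4  defaults  0 2" > "$conf"

# Write the edited copy next to the original instead of editing in place.
sed "s/$old/$new/g" "$conf" > "$conf.new"

old_left=$(grep -c "$old" "$conf.new")
new_found=$(grep -c "$new" "$conf.new")
echo "old references left: $old_left, new references: $new_found"
rm -rf "$work"
```

Keeping the original file and reviewing the `.new` copy before swapping it in leaves an easy way back if the substitution catches something unintended.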

So this has been a very successful (and educational) venture for me. :)

Thanks to all that helped. :)
