#1 2018-02-18 04:30:50

Bugattikid2012
Member
Registered: 2014-09-24
Posts: 58

Software RAID1 User error with bootloader woes

Been running a few servers on one system for several years, but my drives are at the end of their lifespan.  I can't just clone my OS because: 1. the fakeRAID hardware controller's maximum supported drive size is smaller than my new drives, and 2. the old drives use MBR, while the new drives have to be GPT, which complicates cloning a bit and messes up the bootloader anyway. 

Because of these issues, I am going to try a software RAID reinstall (though I'm not against the idea of cloning my old drives and setting up a new software RAID from there).

I gave it my best effort, but I can't for the life of me get my bootloader set up. 

Followed the instructions on the following pages:

https://wiki.archlinux.org/index.php/RAID#Installation
https://wiki.archlinux.org/index.php/So … ID_and_LVM
https://wiki.archlinux.org/index.php/GRUB#RAID

To summarize what I did, here is basically a step-by-step guide of my futile attempts.  There will be changes noted towards the end, and unless specifically mentioned, nothing else was changed in my reinstallation process:

I used gdisk to create a RAID "partition" (gdisk code fd00) on both drives.  Also on both drives, I added a /boot, /, /SWAP, and a bios_grub partition (gdisk code ef02) for my GPT-on-BIOS issues.  Mind that for this first attempt, this was all inside of the RAID "partition" I mentioned earlier.  From here, I ran

# mdadm --create --verbose --level=1 --metadata=1.2 --raid-devices=2 /dev/md0 /dev/sda1 /dev/sdb1

Which of course turned the entire RAID "partition" into /dev/md0, and from here we were left with /dev/md0p1-4. 

While it's irrelevant for RAID1 writes, it is useful to properly configure strides and stripes, or so I've read.  So I ran this for my partitions inside of the RAID "partition":

# mkfs.ext4 -v -L myarray -m 0.5 -b 4096 -E stride=128,stripe-width=256 /dev/md0p3

This was done for /boot and /
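(For reference, the rule of thumb I followed for those -E values, assuming a 512 KiB chunk and 4 KiB blocks:

stride       = chunk size / block size = 512 KiB / 4 KiB = 128
stripe-width = stride * data disks     = 128 * 2         = 256

though as I said, RAID1 doesn't actually stripe, so these shouldn't matter either way.)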

# echo 'DEVICE partitions' > /etc/mdadm.conf
# mdadm --detail --scan >> /etc/mdadm.conf
# mdadm --assemble --scan

Ran the installation through pacstrap, and then

# mdadm --detail --scan >> /mnt/etc/mdadm.conf

All according to the wiki, from the best of my understanding.

Added mdadm_udev to mkinitcpio.conf in the chrooted environment, as well as adding dm_mod to the modules list, as the second wiki page I linked mentions. 

From here, I've tried installing grub to both /dev/sda and /dev/sdb (in one installation, as mentioned in the wiki page for GRUB), and in a second attempt from the top, I've also tried installing it to /dev/md0, which to my understanding SHOULDN'T work anyways. 

From another attempt, from the top, I've tried moving my swap and boot into separate RAID "partitions".  It should be noted that I could not get /boot to automatically show up in my fstab, but I was able to manually add it.  I can get GRUB to "successfully" install to both /dev/sda and /dev/sdb JUST LIKE BEFORE, but again, when I try to boot to grub, I get a rescue environment. 

From yet another attempt, from the top, I've tried moving my /boot out of a RAID and just manually mirroring it on one of my drives.  It should be noted again that I could not get /boot to automatically show up in my fstab, but I was able to manually add it.  Obviously, I can't have both partitions mounted as /boot, but it only needs one /boot partition anyways, so I figured it'd be worth a shot.  Of course, the grub installation is seemingly okay(?) as I can again reach a rescue environment like before, and am unable to find my desired partitions. 

And yet another attempt from the top, I even screwed around with syslinux (with mdadm metadata version 0.90), but I had nothing but trouble with it.  Didn't even try to boot, as I'm positive it wasn't set up properly. 

In each of these attempts (minus the syslinux one), I attempted to boot into my installation with my supergrubdisk .iso on a multiboot USB.  I would select the RAID option, but each time it was unable to find anything other than /boot, and I don't believe it was able to find that in the first attempt (where /boot was inside one large raid "partition").  I also want to mention that in each attempt I made sure that the previous RAID settings were not in effect by following the first wiki link's instructions to remove previous attempts.

It also should be noted that I did NOT try to set special kernel options for GRUB nor syslinux as the wiki mentions it shouldn't be necessary anymore, and I was unable to figure out how to set it up with GRUB in the way it was asking for.  I'm willing to try if someone could give me a hand with what to do, as I am unfamiliar with configuring GRUB.

Clearly I'm screwing something up, despite no other errors showing up (that I'm aware of).  The RAID itself seems to be working properly on all partitions, I'm just not installing my bootloader properly I guess.  I don't care what bootloader I use; I just want it to work, and I don't have access to UEFI on this computer unfortunately.  I have experience using GRUB, but hardly any configuring it (I'm used to rEFInd personally).  I have no experience with syslinux, but if given a helping hand I'll have no objection to using it here.  I don't particularly care if my /boot or /swap is inside of the RAID1 setup, but I guess I'd prefer it if it's not any extra trouble. 

I've spent a solid 12 hours on this today, and I'm not finding any resources that are of help.  I'd really appreciate it if I could get this working this weekend, but I'll be sure to have patience as well.  Thanks a ton for your time and assistance!

Edit: I just got to thinking about it, and I'm going to try to clone my old drive over, and then without touching RAID at all, I'm going to get the bootloader working.  From there I should be able to easily enable software RAID and sync the drives, right?  I guess I'd still have to update the bootloader at that point though.  Thoughts?

Second Edit: The old partition has quite a few bad sectors, so I'm not sure if this will work or not.  Cloning it takes forever, this still won't solve my issues with the bootloader, and I still don't know if I'll have to do anything to disable any RAID settings that may be present on the old partition, as it's been several years since I set it up.

Last edited by Bugattikid2012 (2018-02-18 06:38:20)

Offline

#2 2018-02-18 10:06:00

frostschutz
Member
Registered: 2013-11-15
Posts: 1,417

Re: Software RAID1 User error with bootloader woes

Bugattikid2012 wrote:

From here, I ran

# mdadm --create --verbose --level=1 --metadata=1.2 --raid-devices=2 /dev/md0 /dev/sda1 /dev/sdb1

Which of course created a loop of the entire raid "partition" into /dev/md0, and from here we were left with /dev/md0p1-4.

While it's normal (unavoidable even) for fakeraid to be partitioned, with software raid it is preferable to partition the disks themselves, and then have /dev/md1-4 instead of /dev/md0p1-4.

For a /boot partition you can also use --level=1 --metadata=1.0 - that simplifies things for the bootloader because it no longer needs to be RAID-aware. --metadata=1.0 puts the metadata at the end of the partition, so to a non-RAID-aware bootloader it looks like a regular filesystem.
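For example, reusing your device names (assuming sda1/sdb1 are the /boot partitions):

# mdadm --create --verbose --level=1 --metadata=1.0 --raid-devices=2 /dev/md1 /dev/sda1 /dev/sdb1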

In your case, not only does your bootloader need to be RAID-aware, it also has to handle partitions on the RAID itself. It's not entirely impossible, but why make things difficult for yourself...

Bugattikid2012 wrote:

when I try to boot to grub, I get a rescue environment.

Which is probably why you ran into this problem.


Bugattikid2012 wrote:

I would select the RAID option, but each time it was unable to find anything other than /boot, and I don't believe it was able to find that in the first attempt (where /boot was inside one large raid "partition").

Well, the bootloader only needs to find /boot (your kernel, your initramfs). Finding all the other stuff is then the kernel's (or initramfs') headache.



Also, with such questions you should show a few things about your setup, like lsblk, cat /proc/mdstat, parted -l, grub config, mdadm.conf, ... the exact commands you used to install grub, and their outputs.
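Something like this, roughly:

# lsblk -f
# cat /proc/mdstat
# parted -l
# cat /etc/mdadm.conf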

Offline

#3 2018-02-18 21:00:42

Bugattikid2012
Member
Registered: 2014-09-24
Posts: 58

Re: Software RAID1 User error with bootloader woes

frostschutz wrote:

While it's normal (unavoidable even) for fakeraid to be partitioned, with software raid it is preferable to partition the disks themselves, and then have /dev/md1-4 instead of /dev/md0p1-4.

I'm not sure I follow what you're asking me to do here.  Are you saying that each partition should be in its own RAID "partition"?  If so, I mentioned that I have tried this in my post.

Or are you saying that I should NOT have the raid "partition" that contains the other partitions, and instead just have the normal partitions, then put the actual partitions themselves into the md0 loops?

frostschutz wrote:

For a /boot partition you can also use --level=1 --metadata=1.0 - that simplifies things for the bootloader because it no longer needs to be RAID-aware. --metadata=1.0 puts the metadata at the end of the partition, so to a non-RAID-aware bootloader it looks like a regular filesystem.

I tried this with metadata version 0.9, but with no different result, as mentioned above.  Also, I think GRUB is supposed to work with higher versions of metadata anyways, but it doesn't really matter to me.  It's /boot, not my /.  I'm not picky about it.  I'll give it a try soon.

frostschutz wrote:

In your case, not only does your bootloader need to be RAID-aware, it also has to handle partitions on the RAID itself. It's not entirely impossible, but why make things difficult for yourself...

So to clarify, you are saying that I can possibly avoid issues by putting my /boot into a metadata 1.0 RAID setup as compared to 1.2?  Or are you saying that my /boot should not be in a software RAID at all?  I think you meant the first, yes?

frostschutz wrote:

Well, the bootloader only needs to find /boot (your kernel, your initramfs). Finding all the other stuff is then the kernel's (or initramfs') headache.

SGD is a little bit different from that: it's specifically designed to find the OS when the installed bootloader can't.  I was trying to use it to get into my OS to confirm the OS was working and that the bootloader was the issue.  Despite not being able to confirm this, I think it's still pretty obvious that my bootloader is the issue, but please correct me if I'm wrong. 

frostschutz wrote:

Also, with such questions you should show a few things about your setup, like lsblk, cat /proc/mdstat, parted -l, grub config, mdadm.conf, ... the exact commands you used to install grub, and their outputs.

You're right.  In my mind I was thinking these were pretty useless as I have detailed my steps anyways, but I guess they can't hurt.  /proc/mdstat showed the arrays were still resyncing, and looked normal.  The grub config was default, and I mentioned earlier that I ran the default grub installation commands, as detailed in the wiki pages I linked. 

Thanks for your help.  I should be able to continue working on this in about an hour or so, and I'll try doing this without the RAID "partition".  I don't quite understand why the wiki would recommend that if it isn't commonplace, and it really makes no mention of what I should do with the bootloader, or how to set it up properly.

Offline

#4 2018-02-19 00:44:23

frostschutz
Member
Registered: 2013-11-15
Posts: 1,417

Re: Software RAID1 User error with bootloader woes

Bugattikid2012 wrote:

I'm not sure I follow what you're asking me to do here.  Are you saying that each partition should be in its own RAID "partition"?

Several RAIDs made up of partitions:

md3 : active raid6 sdb3[8] sdf3[7] sdh3[5] sdg3[9] sde3[3] sdd3[2] sdc3[1]
      1220692480 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/7] [UUUUUUU]
      
md2 : active raid6 sdb2[8] sdf2[7] sdh2[5] sdg2[9] sde2[3] sdd2[2] sdc2[1]
      1220692480 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/7] [UUUUUUU]
      
md1 : active raid6 sdb1[8] sdf1[7] sdh1[5] sdg1[9] sde1[3] sdd1[2] sdc1[1]
      1220692480 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/7] [UUUUUUU]

md1, md2, md3 are built on top of partitions. There are no partitions on top of md1, md2, md3. You fdisk/parted /dev/sdX, but not /dev/mdX.

On top of each md you either have a filesystem directly (this would be the case for /boot, and for / and /home if you only need those two), or maybe LUKS and/or LVM (if you need these), depending on your preferences.
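So for your two-disk RAID1 case the target layout would look roughly like this in lsblk (names illustrative; sdb carries the same partitions and arrays):

sda                 disk
├─sda1              part
│ └─md1             raid1  /boot
├─sda2              part
│ └─md2             raid1  /
└─sda3              part
  └─md3             raid1  [SWAP]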

Bugattikid2012 wrote:

So to clarify, you are saying that I can possibly avoid issues by putting my /boot into a metadata 1.0 RAID setup as compared to 1.2?

Yes. 0.90 is fine too; however, 0.90 is really archaic, and 1.0 is the new 0.90 (both have metadata at the end so it can be ignored by the bootloader). 1.2 has its metadata 4K from the start, 1.1 directly at the start (basically, you never use 1.1 for anything).
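In short, the superblock locations:

0.90  end of device (archaic)
1.0   end of device
1.1   start of device
1.2   4 KiB from the start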

Bugattikid2012 wrote:

SGD is a little bit different from that: it's specifically designed to find the OS when the installed bootloader can't.

Yes, but even then, SGD should only need /boot: the kernel, the initramfs, and in addition the grub.cfg within your /boot.

Bugattikid2012 wrote:

I don't quite understand why the wiki would recommend that if it isn't commonplace, and it really makes no mention of what I should do with the bootloader, or how to set it up properly.

It's difficult to write good documentation with several authors and different setups involved. And even if it's somewhat good, it might still be hard to follow. You could fill an entire wiki about RAID alone, let alone cover everything in one article in a general Linux wiki...

https://wiki.archlinux.org/index.php/RA … _.28GPT.29 says "It is highly recommended to pre-partition the disks to be used in the array.", and notes "Note: It is also possible to create a RAID directly on the raw disks (without partitions), but not recommended because it can cause problems ...".

It does mention mdXpY once later on. Well, the setup exists (particularly with fakeraid, which could also be handled by /dev/mdX), so it should be documented somewhere... but that really belongs in the fakeraid article https://wiki.archlinux.org/index.php/In … _Fake_RAID where it is used because it is unavoidable, not because it's any good... https://wiki.archlinux.org/index.php/So … ID_and_LVM does not put partitions on md either; instead it uses LVM, which is fine (although LVM is also capable of doing RAID by itself, it is more difficult to handle in case of failure).

Offline

#5 2018-02-19 01:27:00

Bugattikid2012
Member
Registered: 2014-09-24
Posts: 58

Re: Software RAID1 User error with bootloader woes

I am still very confused, and I think we may be miscommunicating, or maybe I'm not understanding/following you.

frostschutz wrote:

Several RAIDs made up of partitions

md1, md2, md3 are built on top of partitions. There are no partitions on top of md1, md2, md3. You fdisk/parted /dev/sdX, but not /dev/mdX.

On top of each md you either have a filesystem directly (this would be the case for /boot, and for / and /home if you only need those two), or maybe LUKS and/or LVM (if you need these), depending on your preferences.

Okay, I am not following this part very well.  I can interpret this two different ways.  Which of the following are you more closely referring to here?

Option A:

Create multiple RAID partitions, and INSIDE of these partitions, create an ext4 filesystem for /, /home, etc.
Then, choose the RAID partitions to be part of the md0-4 devices.  

Example partitioning process assuming empty drives:
Individual RAID partition for /boot, /SWAP, /, /home, etc., with hex code FD00, on each individual /dev/sda and /dev/sdb, etc.
Build array with
# mdadm --create --verbose --level=1 --metadata=1.2 --raid-devices=2 /dev/md0 /dev/sda1 /dev/sdb1
where sda1 and sdb1 would be the partitions created above for /boot.  Would then repeat the process for md1 with /SWAP, md2 for /, etc.

# echo 'DEVICE partitions' > /etc/mdadm.conf
# mdadm --detail --scan >> /etc/mdadm.conf
# mdadm --assemble --scan

Then, I would partition NEW PARTITIONS inside of the previously created FD00 partitions, and format them as such:
# mkfs.ext4 -v -L myarray -m 0.5 -b 4096 -E stride=128,stripe-width=256 /dev/md0

From here I would install with pacstrap and continue as normal.

Option B

Do NOT create a RAID partition, and instead choose the real partitions to make up the md0-4 devices
Individual standard partitions that you would use in a normal installation, but manually applied to both /dev/sda and /dev/sdb, etc.

Build array with
# mdadm --create --verbose --level=1 --metadata=1.2 --raid-devices=2 /dev/md0 /dev/sda1 /dev/sdb1
where sda1 and sdb1 would be the partitions created above for /boot.  Would then repeat the process for md1 with /SWAP, md2 for /, etc.

# echo 'DEVICE partitions' > /etc/mdadm.conf
# mdadm --detail --scan >> /etc/mdadm.conf
# mdadm --assemble --scan

From here I would install with pacstrap and continue as normal.

Because the wiki very clearly mentions to create a RAID partition, I have not attempted option B, as explained in my first post.  Which one are you instructing me to follow?  I am confused between these two options after reading your responses.  Sorry about that.


frostschutz wrote:

md1, md2, md3 are built on top of partitions. There are no partitions on top of md1, md2, md3. You fdisk/parted /dev/sdX, but not /dev/mdX.

No, I stated in my first post that I did create my filesystems inside of /dev/mdX after creating the RAID partitions and setting the RAID partitions to mirror each other.

frostschutz wrote:

says "It is highly recommended to pre-partition the disks to be used in the array.", and notes "Note: It is also possible to create a RAID directly on the raw disks (without partitions), but not recommended because it can cause problems ...".

I linked to this page, and I am asking for clarification as to what this means, because obviously it doesn't mean what I think it means; otherwise I wouldn't be in this mess. 

Thanks again for your time.  I know I keep asking you to explain it more, but I am just not following you otherwise.  My bad. 

I also wanted to be clear: I am not trying to be sarcastic, passive aggressive, etc., nor saying you are at fault for not making this clear enough.  I am merely asking you to clarify a bit to help me understand.  Thank you very much.

Edit: I'm going to LITERALLY copy EVERY command I run into here now, so hopefully it becomes dead clear as to where I am wrong:

# lsblk
sda and sdb are both unpartitioned GPT tabled drives

# gdisk /dev/sda
    n > 1 > 2048 > +150M > fd00 
    n > 2 > *FIRST AVAILABLE SECTOR* > +5M > fd00
    n > 3 > *same as above* > +8G > fd00
    n > 4 > *same as above* > -500M > fd00
    p > w > y
# gdisk /dev/sdb
    n > 1 > 2048 > +150M > fd00 
    n > 2 > *FIRST AVAILABLE SECTOR* > +5M > fd00
    n > 3 > *same as above* > +8G > fd00
    n > 4 > *same as above* > -500M > fd00
    p > w > y
# lsblk
    Confirmed it looks good
# mdadm --create --verbose --level=1 --metadata=1.2 --raid-devices=2 /dev/md0 /dev/sda4 /dev/sdb4
    This creates the loop for my / 
# mdadm --create --verbose --level=1 --metadata=1.0 --raid-devices=2 /dev/md1 /dev/sda1 /dev/sdb1
    This creates the loop for my /boot
# mdadm --create --verbose --level=1 --metadata=1.0 --raid-devices=2 /dev/md2 /dev/sda2 /dev/sdb2
    This creates the loop for my grub_bios 1MiB partition
# mdadm --create --verbose --level=1 --metadata=1.2 --raid-devices=2 /dev/md3 /dev/sda3 /dev/sdb3
    This creates the loop for my /swap

# cat /proc/mdstat
    Looks good, attempting sync, wiki says you can work while this is happening

# echo 'DEVICE partitions' > /etc/mdadm.conf
# mdadm --detail --scan >> /etc/mdadm.conf

# nano /etc/mdadm.conf
    Looks appropriate according to wiki's example

# mdadm --assemble --scan

# mkfs.ext4 -v -L *nameofpartition* -m 0.5 -b 4096 -E stride=128,stripe-width=256 /dev/md0
# mkfs.ext4 -v -L *nameofpartition*  -m 0.5 -b 4096 -E stride=128,stripe-width=256 /dev/md1
    Did NOT run a command for md2, as it is my understanding that it should NOT be partitioned, and this worked earlier today on a non-raid install.
# mkswap /dev/md3 

# gdisk /dev/md2
    n > 1 > *default* > *default* > ef02
    p > w > y

From this point onwards, it basically becomes a normal installation.

# swapon /dev/md3
# mount /dev/md0 /mnt
# mkdir /mnt/boot
# mount /dev/md1 /mnt/boot

# cp /etc/pacman.d/mirrorlist /etc/pacman.d/mirrorlist.backup
# sed -i 's/^#Server/Server/' /etc/pacman.d/mirrorlist.backup
# rankmirrors -n 12 /etc/pacman.d/mirrorlist.backup > /etc/pacman.d/mirrorlist

# pacstrap /mnt base base-devel

# genfstab -U /mnt >> /mnt/etc/fstab
# nano /mnt/etc/fstab
    I always had to manually add /boot in previous RAID attempts, but it showed up on its own this time.

# arch-chroot /mnt
# ln -sf /usr/share/zoneinfo/MyRegion/MyCity /etc/localtime
# nano /etc/locale.gen
    Uncomment UTF-8
# locale-gen
# nano /etc/locale.conf
   add LANG=en_US.UTF-8
# nano /etc/hostname
    Add hostname
# nano /etc/hosts
    Add hostname and local static IP address

Add mdadm_udev to "HOOKS" in /etc/mkinitcpio.conf
# mkinitcpio -p linux

# passwd

# pacman -S grub
# grub-install --target=i386-pc --debug /dev/sda
# grub-install --target=i386-pc --debug /dev/sdb
    AS PER WIKI - note that trying /dev/md1 would result in an error as mentioned in a previous post, and I don't think this would be correct anyways, even without that error

And out pops grub as follows:

grub-install: warning: this GPT partition label contains no BIOS Boot Partition; embedding won't be possible.
grub-install: error: embedding is not possible, but this is required for RAID and LVM install.

So where's the screw up?

Edit:

By suggestion, I took /boot and grub_BIOS out of their individual raids, and let them be outside of RAID with appropriate partitions.  
I then doubled them up manually on /dev/sda1, sda2, as well as sdb1 and sdb2.
In order to do this, I used gparted to really make sure my RAID settings were dead, as they like to ghost their way back in sometimes.  
I'm able to install grub this way, but when I boot to it I'm left with a rescue environment type situation and I don't get to the menu at all.

Last edited by Bugattikid2012 (2018-02-19 06:30:50)

Offline

#6 2018-02-19 04:15:45

NoSuck
Member
Registered: 2015-03-04
Posts: 157
Website

Re: Software RAID1 User error with bootloader woes

Given that you are open to the idea of setting up a new software RAID from scratch, would you be against setting up a ZFS dataset instead?  I ask because ZFS was designed to mitigate the very hassles you now describe.  Just put your RAID controller in JBoD mode (when available) and follow the guide (ignoring all the DKMS/dependency warnings, if you use the repo).

In either case, you seem to want to boot from your data tank.  Is there a specific reason you don't want to keep the system modular by booting from a dedicated OS drive?  You have new drives and everything.  Now is the time to make a change for the better.

Good luck either way.

Last edited by NoSuck (2018-02-19 04:17:23)

Offline

#7 2018-02-19 05:10:00

Bugattikid2012
Member
Registered: 2014-09-24
Posts: 58

Re: Software RAID1 User error with bootloader woes

NoSuck wrote:

Given that you are open to the idea of setting up a new software RAID from scratch, would you be against setting up a ZFS dataset instead?  I ask because ZFS was designed to mitigate the very hassles you now describe.  Just put your RAID controller in JBoD mode (when available) and follow the guide (ignoring all the DKMS/dependency warnings, if you use the repo).

I'm not against it, I'm just unfamiliar with it.  I've heard a lot of good things about it though.  You mention a RAID controller - unfortunately I can't use my hardware's fakeRAID controller, as it only supports drives up to roughly 700 GiB.  Are you talking about the software RAID setup? 

Would ZFS be a replacement for mdadm, or would it run alongside it?  It *sounds* like ZFS has data mirroring/scrubbing without the need for an actual mirrored drive, which could allow me to set up my drives in RAID0 with mdadm and effectively have a RAID0+1, right?  Surely this is too good to be true.

NoSuck wrote:

In either case, you seem to want to boot from your data tank.  Is there a specific reason you don't want to keep the system modular by booting from a dedicated OS drive?  You have new drives and everything.  Now is the time to make a change for the better.

Do you mean have a separate drive for the actual bulk of my data, and have my root on a different drive?  Really my root is what I want to protect, as that's where all of my configuration files are.  Sure, I could keep the larger data separate, but I don't have any other drives lying around, and I'm not sure what the real benefit would be.  I'm open to the idea though.

Edit: Also please check the last post before this, as I made a major edit to it with a literal step by step guide for my installation process.  Something in that is wrong apparently.

Last edited by Bugattikid2012 (2018-02-19 05:23:20)

Offline

#8 2018-02-19 07:12:55

NoSuck
Member
Registered: 2015-03-04
Posts: 157
Website

Re: Software RAID1 User error with bootloader woes

Bugattikid2012 wrote:

Would ZFS be a replacement for mdadm, or would it run alongside it?

A replacement.

Bugattikid2012 wrote:

Do you mean have a separate drive for the actual bulk of my data, and have my root on a different drive?

Yes, but if you only have two physical drives (with no plans of adding more--something possible with ZFS), there would not be much point.  One of the benefits, however, would be the ability to house the OS and tmpfs on a fast SSD (or even a flash drive) while keeping heavy data on HDD.

Bugattikid2012 wrote:

It *sounds* like ZFS has data mirroring/scrubbing without the need for an actual mirrored drive...

I am not sure what you mean, but RAID1 with ZFS is called "mirror" and would yield the smaller size of your two drives.  I confess that I do not run ZFS on / (on account of the aforementioned modularity), but many Arch users do.

Last edited by NoSuck (2018-02-19 07:16:58)

Offline

#9 2018-02-19 08:11:34

Bugattikid2012
Member
Registered: 2014-09-24
Posts: 58

Re: Software RAID1 User error with bootloader woes

I gotta say, it does look like a promising alternative to mdadm, and, assuming I could get ZFS to work, it would certainly solve my issues, as ZFS takes over after the system has already booted up.  Unfortunately it looks like I won't be able to test it out for a few days, but I'll try to read up on it before I come back to it.  Thanks for the suggestion!  I'd always thought ZFS was a file system, which is only partially true/half of its job.  Thanks a ton!

I need to ask though, which package or repository should I choose when installing this?  The ones independent of the kernel sound useful to me, but I'd like your input seeing as you have some experience with it.

In the meantime, if anyone has suggestions on fixing my mdadm configuration issue, I'm still interested in learning what I did wrong, even if I choose to use ZFS (I think I will).

Last edited by Bugattikid2012 (2018-02-19 08:14:37)

Offline

#10 2018-02-19 11:39:46

frostschutz
Member
Registered: 2013-11-15
Posts: 1,417

Re: Software RAID1 User error with bootloader woes

Everything looks fine to me except this one thing

# mdadm --create --verbose --level=1 --metadata=1.0 --raid-devices=2 /dev/md2 /dev/sda2 /dev/sdb2
    This creates the loop for my grub_bios 1MiB partition

Creating a RAID for bios_grub is wrong, because bios_grub is an exception. It has to be a plain partition: no RAID, no filesystem, no nothing. Just a partition with the correct partition type for bios_grub. The only thing that ever writes here is grub-install; nothing else is allowed to even touch it.
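In your gdisk notation, had it been created correctly from the start, that partition would simply be (size as in your layout):

# gdisk /dev/sda
    n > 2 > *default* > +5M > ef02
    p > w > y

with no mdadm --create and no mkfs ever run on it. The parted commands below fix up your existing partitions instead.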

grub-install: warning: this GPT partition label contains no BIOS Boot Partition; embedding won't be possible.
grub-install: error: embedding is not possible, but this is required for RAID and LVM install.

Seems like the partition type is not set to bios_grub.

Try:

# remove mdadm from bios_grub
mdadm --stop /dev/md2
mdadm --zero-superblock /dev/sda2
mdadm --zero-superblock /dev/sdb2

# prepare for legacy bios boot (blindly assumes that partition no 2 is your 1MB part)
parted /dev/sda disk_set pmbr_boot on
parted /dev/sda set 2 bios_grub on
parted /dev/sdb disk_set pmbr_boot on
parted /dev/sdb set 2 bios_grub on

# re-install grub for each drive
grub-install /dev/sda
grub-install /dev/sdb

You will also have to edit your mdadm.conf.

Remove the /dev/md2 reference which no longer exists, then redo initramfs / mkinitcpio.
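That is, after editing mdadm.conf, regenerate as before:

# mkinitcpio -p linux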

mdadm --detail --scan tends to be too verbose; you may simplify each md entry to UUID only, for example:

ARRAY /dev/md3 UUID=43741ea2:cc1aa3ea:59aa239c:91fd9e61

Additional conditions such as metadata=, name=, devices=, etc. can cause problems in some cases. Basically you should just have simple entries in mdadm.conf as above, plus a MAILADDR so the mdadm monitor can notify you of problems. This also requires a sendmail system to be set up; look into it once your system is booting properly. Also configure smartmontools monitoring at that time.
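So the whole file ends up as little more than this (UUIDs from your own --detail --scan output; the mail address is just an example):

DEVICE partitions
ARRAY /dev/md0 UUID=...
ARRAY /dev/md1 UUID=...
ARRAY /dev/md3 UUID=43741ea2:cc1aa3ea:59aa239c:91fd9e61
MAILADDR you@example.com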

Offline

#11 2018-02-23 06:36:54

Bugattikid2012
Member
Registered: 2014-09-24
Posts: 58

Re: Software RAID1 User error with bootloader woes

frostschutz, I'll give that a try soon and get back with you on that.

NoSuck, how on Earth is ZFS supposed to be easier?  The wiki pages are even more confusing than before!  They're all over the place, and the wording is pretty poor in my opinion. 

/ZFS mentions an installation process of the ZFS package(s), but /Installing_Arch_Linux_on_ZFS mentions putting ZFS into an archiso.  Are these alternatives to each other, or are they both required?  Is the main difference that the /ZFS page references an already installed system, whereas the /Installing_Arch_Linux_on_ZFS page is from a fresh installation? 

I figure since I'll once again have limited time to work on this, I'll go ahead and go through the installation process that I plan to follow with you, like I did above for mdadm.  I'm sure I'll make a mistake somewhere by misinterpreting these installation guides, so please don't hesitate to tell me where I'm wrong.  The following process is what I plan to do, start to finish, with my questions mixed in as I go.

Partition the drive with a bf00 partition for /, and set up /boot and bios_grub or whatever as normal
 
Set up swap as the wiki suggests:
# zfs create -V 8G -b $(getconf PAGESIZE) \
              -o logbias=throughput -o sync=always \
              -o primarycache=metadata \
              -o com.sun:auto-snapshot=false <pool>/swap
# mkswap -f /dev/zvol/<pool>/swap
# swapon /dev/zvol/<pool>/swap

Add the following to /etc/fstab:
/dev/zvol/<pool>/swap none swap discard 0 0

# mkfs.ext4 /dev/*path to /boot*

# modprobe zfs
and look for zfs modules to confirm they are working as desired - Could you elaborate on what I am looking for specifically?  The wiki does not.

# zpool create -f zroot /dev/disk/by-id/id-to-partition-partx
???  Again, wiki doesn't explain much here, or at least I must be missing it, too tired, etc.  UUID?  so like /dev/sda/*UUID GOES HERE*/*What goes here?*
Could someone provide an example maybe?
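My best guess is that it wants one of the names out of ls /dev/disk/by-id/, so something like this (id made up)?

# zpool create -f zroot /dev/disk/by-id/ata-WDC_WD40EFRX-68N32N0_WD-WCC7K1234567-part4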

So a dataset is like a partition but is special to ZFS.  The wiki mentions separating the majority of your files into your /home, but I wasn't planning on having a /home.  Is this only for snapshots, or what?  I guess the snapshot is of /, and therefore large data shouldn't be a part of it?  Is that the logic going on?  If so, I guess I'd go with these commands:

# zfs create -o encryption=on -o keyformat=passphrase -o mountpoint=none zroot/encr
# zfs create -o mountpoint=none zroot/encr/ROOT
# zfs create -o mountpoint=none zroot/encr/data

Mentions acl must be enabled on /, as it contains journalctl logs?  So I'd just add acl to my fstab options for my root directory?  

# zfs umount -a
Unmounting to edit properties of the datasets I guess?

# zfs set mountpoint=/ zroot/ROOT/default
# zfs set mountpoint=legacy zroot/data/home
Why not do this when we created the datasets just above?

Says to add this to /etc/fstab, but I don't see the acl option in the example:
# <file system>        <dir>         <type>    <options>              <dump> <pass>
zroot/ROOT/default / zfs defaults,noatime 0 0
zroot/data/home /home zfs defaults,noatime 0 0

Would this command be necessary if I have a /boot partition?
# zpool set bootfs=zroot/ROOT/default zroot

# zpool export zroot
# zpool import -d /dev/disk/by-id -R /mnt/zroot -l zroot
Again, what is /dev/disk by id?  /dev/sd*value*/*UUID?*

# cp /etc/zfs/zpool.cache /mnt/etc/zfs/zpool.cache
OR # zpool set cachefile=/etc/zfs/zpool.cache zroot
if I don't have that file apparently

Mount the above datasets, install the system as normal, then 

# genfstab -U -p /mnt >> /mnt/etc/fstab
"Comment out all non-legacy datasets apart from the root dataset, the swap file and the boot/EFI partition. It is a convention to replace the swap's uuid with /dev/zvol/zroot/swap."
Could I get an example on this?  

Add the repository to pacman, sign the key, and install zfs-dkms?  The guide mentions two other packages specifically, but wouldn't I want the one that doesn't depend on my kernel version?  Would there be any obvious issue with this?

Edit mkinitcpio.conf hooks to be like:
HOOKS="base udev autodetect modconf block keyboard zfs filesystems"

"When using systemd in the initrd, you need to install mkinitcpio-sd-zfsAUR and add the sd-zfs hook after the systemd hook instead of the zfs hook."
So would that apply to me or not?  I know I use systemd, but I don't understand what it means by "in the initrd".

# ZPOOL_VDEV_NAME_PATH=1 grub-mkconfig -o /boot/grub/grub.cfg
# grub-install --target=i386-pc /dev/sdx

# exit
# umount /mnt/boot 
# zfs umount -a
# zpool export zroot

# zpool set cachefile=/etc/zfs/zpool.cache <pool>
So I need to do this for each pool.  How do I list my pools?  Would this just be the datasets or whatever that I created earlier?  I guess it'd be the name of the dataset?
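I'm guessing this would list them?

# zpool list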

# systemctl enable zfs.target

# zgenhostid $(hostid)
Then add the id to /etc/hostid?

Then regenerate mkinitcpio.  If I just ran pacman -S linux, that'd give the same effect right?

I know that's a lot of questions, but I'd like to learn about ZFS, as it looks like it is really packed with features.  I just wish the wiki was a bit easier to follow, but maybe the issue lies with me instead.  All of that above and I still haven't set up mirroring, or any advanced features other than encryption.  If I can get the above working, it shouldn't be that hard to get mirroring going as well, by the looks of it. 

I want to learn more about what I'm doing, and not just follow whatever you, the wiki, and other resources tell me if possible.

Thanks for your time and help, I genuinely do appreciate it.

Offline

#12 2018-02-23 08:43:05

NoSuck
Member
Registered: 2015-03-04
Posts: 157
Website

Re: Software RAID1 User error with bootloader woes

I apologize for throwing out a resource that I didn't actually use.  I retract my recommendation.  Ignore the wiki for now.

Actual commands from $HISTFILE:

# Add repo.
■ sudoedit /etc/pacman.conf
■ tail -n2 /etc/pacman.conf
[archzfs]
Server = http://archzfs.com/$repo/$arch
■ sudo pacman-key -r 5E1ABF240EE7A126
■ sudo pacman-key --lsign-key 5E1ABF240EE7A126
■ sudo pacman -Syy
■ sudo pacman -S zfs-linux zfs-utils-linux spl-linux spl-utils-linux
# Make a pool.
■ sudo zpool create -fm /mnt/tank -o ashift=12 tank raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg
# Set the cache file.  (It is automatically created and need not exist.)
■ sudo zpool set cachefile=/etc/zfs/zpool.cache tank
# Make the pool automatically mount.
■ sudo systemctl enable zfs-import.target zfs-mount.service zfs-import-cache.service zfs-share.service zfs-zed.service

Everything after that is maintenance (i.e. a cron job for "zpool scrub tank").
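E.g. a weekly scrub in root's crontab (schedule arbitrary):

0 3 * * 0 /usr/bin/zpool scrub tank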

Both #zfs and #zfsonlinux on freenode are helpful places.  They should be able to provide more insight with using ZFS on root, for example.

Bugattikid2012 wrote:

UUID?  so like /dev/sda/*UUID GOES HERE*/*What goes here?*

sudo blkid
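And for the /dev/disk/by-id names themselves (output illustrative):

■ ls /dev/disk/by-id/
ata-WDC_WD40EFRX-68N32N0_WD-WCC7K1234567
ata-WDC_WD40EFRX-68N32N0_WD-WCC7K1234567-part1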

Offline

#13 2018-02-25 20:44:24

Bugattikid2012
Member
Registered: 2014-09-24
Posts: 58

Re: Software RAID1 User error with bootloader woes

No dice with the mdadm suggestions unfortunately.  I get the same bashlike rescue environment from GRUB as before.

NoSuck, I have a few questions regarding the commands you used. 

For starters, do I need to put ZFS into an archiso?  I'm assuming not since you told me to ignore the wiki and made no specific mention of it. 

What about the fact that these are listed as being on the AUR?  I'm assuming that adding the ZFS repository is an alternative to using the AUR packages?  And the package names are the same?

At what point in your commands do I install the base OS?  And what specifically would I mount to / before installing?  What would that look like?

Thanks again all for the persistent help, I do appreciate it.

Edit: Looks like installing ZFS to the archiso is needed.  Quite a pain but I guess there's no other way for them to do it due to licensing.

Last edited by Bugattikid2012 (2018-02-25 21:35:48)

Offline

#14 2018-02-26 00:04:18

Bugattikid2012
Member
Registered: 2014-09-24
Posts: 58

Re: Software RAID1 User error with bootloader woes

I tried to create an archiso, but got an error claiming that /dev/loop1 was unable to be mounted, and it spit me out to a rescue menu.  I'm sure I didn't do it properly, but I followed the instructions to a T, at least I think I did.

NoSuck wrote:

Both #zfs and #zfsonlinux on freenode are helpful places.  They should be able to provide more insight with using ZFS on root, for example.

I wish I could concur.  They were pretty rude to me and didn't help much.  After about 20 minutes of passive-aggressive, talking-down-to-you remarks from two guys, one of the channel mods stepped in and suggested I install Arch to a standard ext4 partition, then use a zfs-iso from zfsonlinux.org to set up the ZFS partitions, and then move my root into the newly created ZFS pool.  What do you think?

I'm really getting fed up with how hard it is to get this working right.  I didn't expect either ZFS or mdadm to have such horrible documentation.  Is there something that I'm missing here?

Offline

#15 2018-02-26 00:38:11

frostschutz
Member
Registered: 2013-11-15
Posts: 1,417

Re: Software RAID1 User error with bootloader woes

Bugattikid2012 wrote:

No dice with the mdadm suggestions unfortunately.  I get the same bashlike rescue environment from GRUB as before.

It would be nice if you provided updated outputs for everything again. (parted -l, grub-install, etc. etc.)

Bugattikid2012 wrote:

I get the same bashlike rescue environment from GRUB as before.

Does that refer to the GRUB prompt itself (which is not quite bash-like) or an actual shell (where commands such as dmesg, cat /proc/cmdline, etc. work)?  If the latter, you'd be in an initramfs shell, so the bootloader at least would be in working order...

Bugattikid2012 wrote:

They were pretty rude to me and didn't help much.

Usually happens when it seems like you're not cooperating or otherwise deliberately wasting people's time.

I can't help with ZFS, as I've never used it, sorry.

Offline

#16 2018-02-26 01:41:32

Bugattikid2012
Member
Registered: 2014-09-24
Posts: 58

Re: Software RAID1 User error with bootloader woes

frostschutz wrote:
Bugattikid2012 wrote:

No dice with the mdadm suggestions unfortunately.  I get the same bashlike rescue environment from GRUB as before.

It would be nice if you provided updated outputs for everything again. (parted -l, grub-install, etc. etc.)

Bugattikid2012 wrote:

I get the same bashlike rescue environment from GRUB as before.

Does that refer to the GRUB prompt itself (which is not quite bash-like) or an actual shell (where commands such as dmesg, cat /proc/cmdline, etc. work)?  If the latter, you'd be in an initramfs shell, so the bootloader at least would be in working order...

Bugattikid2012 wrote:

They were pretty rude to me and didn't help much.

Usually happens when it seems like you're not cooperating or otherwise deliberately wasting people's time.

I can't help with ZFS, as I've never used it, sorry.


I was referring to GRUB's actual prompt, only by name.  It's not even close to bash-like in reality. 

I was certainly not being rude, disingenuous, etc.  They were jerks through and through.  This certainly isn't my first time using IRC, I am aware how things work over there.

Thanks for your help anyways, I appreciate it all the same.  I'm considering getting a RAID controller at this point, but I really don't want to blow money like that if I could get one of these methods to work.  Unfortunately I don't have any output, as I've already moved past that point.  My bad; it just seems useless in my head when I'm providing literally each step I took, so the thought doesn't occur to me as often.

Offline
