Hello,
I'm partial to running RAID arrays on my personal PCs for obvious reasons, and lately I've been running into peculiar issues that I can't pin down. First off, my setup is generally a latest-generation consumer-grade PC with an Intel chip and NVMe M.2 disks running in a RAID5 configuration.
As recently as a year ago, running a BIOS-level RAID array (Intel VMD, RST, etc.) was an exercise in patience, tinkering, and overall staying true to Linux's do-it-yourself spirit. But within the past year there have been (from what I can tell) advancements in mdadm compatibility with BIOS-level RAID controllers. Now, with VMD at the very least, if you create an array at the BIOS level, mdadm will recognize it and start the container and the volume as md127 and md126 respectively.
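For context, this is roughly how I check that mdadm has picked up the firmware array (sketched from memory; the md numbers and /dev/nvme* names are examples and may differ on another setup):

# show what mdadm auto-assembled at boot
cat /proc/mdstat

# details of the IMSM container and the actual RAID5 volume
mdadm --detail /dev/md127
mdadm --detail /dev/md126

# RAID metadata as seen on one of the member disks
mdadm --examine /dev/nvme0n1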
With that being said, I had primarily used mdadm and "software RAID" with NVMe disks without issue. Recently, though, I have had issues with these arrays and have yet to discern whether the problem is at the hardware level, the software level, or just bad luck. Multiple times over the past 12 months an md array has destroyed itself with no prospect of recovery. In general, I would run a command and then the entire OS would cease functioning, with one or more of the disks belonging to the array having been removed; mdadm would tell me the disks I was trying to re-add were already part of an array, yet it would not add them back at boot. I would have to stop the array, wipe the disk, and then re-add it. This issue is not isolated to a specific disk, but all disks are the same type and brand: Samsung 990 Pro 1TB M.2s. I'm aware from the wiki that running RAID5 carries the risk of having to rebuild the array, but I feel like this failure rate is excessive. These issues have recently persisted into RAID1 arrays as well.
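For the record, when a disk drops out, this is roughly the recovery sequence I end up running (device names below are placeholders, not my exact layout):

# stop the broken array
mdadm --stop /dev/md126

# wipe the stale RAID metadata from the disk that got kicked out
mdadm --zero-superblock /dev/nvme1n1
wipefs -a /dev/nvme1n1

# re-assemble the degraded array from the surviving members
mdadm --assemble --run /dev/md126 /dev/nvme0n1 /dev/nvme2n1

# add the wiped disk back and watch the rebuild
mdadm /dev/md126 --add /dev/nvme1n1
cat /proc/mdstat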
Now I'm having trouble getting a RAID array, either software or hardware, to even survive long enough to set up disk encryption. So, my question to the community before I go out and replace all three disks is whether software RAID or hardware (BIOS) RAID is even practical with NVMe disks in general. I feel like it should be, but it's probably not a use case the maintainers prioritize, considering this is a consumer-grade PC. I'm mainly wondering if anyone has any insight, tips or tricks, or otherwise a workaround short of me throwing money at the issue with new hardware.
The disks I'm using are relatively new, about a year old. The md arrays I'm running are LUKS encrypted, then put into an LVM2 volume group, with LVs created for my /home, /var, /opt, and root partitions. The disks, again, are Samsung 990 Pro 1TB M.2 NVMe disks. I know there was a firmware issue with this specific model, but unless these disks have another, separate issue, all the firmware has been updated as of about eight months ago. I'm running a Gigabyte Z790 Aorus Master motherboard with an Intel i9-12900K CPU and 32GB of RAM.
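For completeness, the stack is built roughly like this (from memory; the vg/lv names and sizes here are placeholders, not my exact values):

# encrypt the md array and open it
cryptsetup luksFormat /dev/md126
cryptsetup open /dev/md126 cryptraid

# put LVM on top of the opened mapping
pvcreate /dev/mapper/cryptraid
vgcreate vg0 /dev/mapper/cryptraid

# logical volumes for each mount point
lvcreate -L 60G -n root vg0
lvcreate -L 20G -n var vg0
lvcreate -L 20G -n opt vg0
lvcreate -l 100%FREE -n home vg0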
I'm currently on my laptop, so I can't give the output of smartctl or any other info until I get home.