Hello,
after upgrading to linux 5.8.8.arch1-1 my system starts up in emergency mode, because it is unable to mount partitions that are located on my RAID 0. The respective systemd units fail with a timeout.
Sep 11 20:18:01 GaliusBonus systemd[1]: dev-disk-by\x2duuid-64C646AEC6467FF2.device: Job dev-disk-by\x2duuid-64C646AEC6467FF2.device/start failed with result 'timeout'.
Sep 11 20:18:01 GaliusBonus systemd[1]: mnt-raid.mount: Job mnt-raid.mount/start failed with result 'dependency'.
Sep 11 20:18:01 GaliusBonus systemd[1]: Dependency failed for /mnt/raid.
Sep 11 20:18:01 GaliusBonus systemd[1]: Timed out waiting for device /dev/disk/by-uuid/64C646AEC6467FF2.
Sep 11 20:18:01 GaliusBonus systemd[1]: dev-disk-by\x2duuid-64C646AEC6467FF2.device: Job dev-disk-by\x2duuid-64C646AEC6467FF2.device/start timed out.
Sep 11 20:18:01 GaliusBonus systemd[1]: dev-disk-by\x2duuid-19333c8d\x2dc400\x2d4123\x2d89d5\x2d6590c3a04ea1.device: Job dev-disk-by\x2duuid-19333c8d\x2dc400\x2d4123\x2d89d5\x2d6590c3a04ea1.device/start failed with result 'timeout'.
All physical hard drives are listed by "fdisk -l"; only the RAID "md" device is missing. Downgrading to kernel version 5.8.7 fixed the issue, so I assume something changed in the kernel regarding software RAID. I am also dual-booting Windows, and the RAID works perfectly there. What am I missing, or how can I troubleshoot this further?
When asked about the RAID platform, mdadm reports the same under both kernel versions:
mdadm --detail-platform
Platform : Intel(R) Rapid Storage Technology
Version : 11.1.0.1413
RAID Levels : raid0 raid1 raid10 raid5
Chunk Sizes : 4k 8k 16k 32k 64k 128k
2TB volumes : supported
2TB disks : supported
Max Disks : 6
Max Volumes : 2 per array, 4 per controller
I/O Controller : /sys/devices/pci0000:00/0000:00:1f.2 (SATA)
Port5 : /dev/sdd (S1D5NSCF342734B)
Port3 : /dev/sdb (S3Z2NB0M525985W)
Port4 : /dev/sdc (Z1DB5B6N)
Port2 : /dev/sda (Z1DB4Q7R)
Port0 : - no device attached -
Port1 : - no device attached -
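As a rough troubleshooting sketch (not a definitive procedure), one could check whether the kernel has assembled any md array at all by reading /proc/mdstat. The helper name list_md_arrays is made up for illustration; it only prints the array names it finds on standard input, so empty output under 5.8.8 would match the missing md device described above.

```shell
# list_md_arrays (hypothetical helper): read /proc/mdstat-style text on stdin
# and print the name of every md array listed in it, one per line.
list_md_arrays() {
    # Array lines look like "md126 : active raid0 ..."; print the first field.
    awk '/^md[^ ]* *:/ { print $1 }'
}

# Usage on a live system:
#   list_md_arrays < /proc/mdstat
```

Comparing the output under 5.8.7 and 5.8.8 would show whether the array is never assembled or is assembled under a different name.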
Thank you in advance.
Offline
I had the same issue.
But because I experimented too much with different RAID settings, I have now completely broken it and (I think) lost a VM image that I had no backup of.
Edit: I also opened a followup topic for my problem: https://bbs.archlinux.org/viewtopic.php … 1#p1925781
Last edited by Shinigami92 (2020-09-11 21:55:52)
Offline
There are a few dm related commits in https://cdn.kernel.org/pub/linux/kernel … eLog-5.8.8
Can you locate the causal commit, either by reverting them one by one or by bisecting between 5.8.7 and 5.8.8?
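For the bisecting route, a minimal sketch using git's built-in bisect commands could look like this. The wrapper name start_bisect is made up for illustration, and the tag names in the usage comment are an assumption about how the two releases are tagged in the checkout being used.

```shell
# start_bisect GOOD BAD (hypothetical wrapper): begin a bisection between a
# known-good and a known-bad revision in the current git checkout.
start_bisect() {
    good="$1"
    bad="$2"
    git bisect start &&
    git bisect bad "$bad" &&
    git bisect good "$good"
}

# Usage, inside the kernel source checkout (tag names are an assumption):
#   start_bisect v5.8.7 v5.8.8
#   ...build, boot and test the revision git checks out...
#   git bisect good     # if the RAID is detected, otherwise: git bisect bad
#   git bisect reset    # once git has named the first bad commit
```

Each round needs a full build and reboot, so reverting only the suspicious commits may be quicker when there are few candidates.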
Offline
I had the same issue. Thanks for pointing me to downgrading the kernel.
I'll keep checking here to see when it is safe to upgrade, as I don't think I'm skilled enough to understand kernel commit jargon.
Last edited by Garzet (2020-09-12 09:06:35)
Offline
Follow Arch_Build_System#Retrieve_PKGBUILD_source to obtain the PKGBUILD and config.
Change the prepare function of the PKGBUILD to the following:
prepare() {
  cd $_srcname

  echo "Setting version..."
  scripts/setlocalversion --save-scmversion
  echo "-$pkgrel" > localversion.10-pkgrel
  echo "${pkgbase#linux}" > localversion.20-pkgname

  git revert -n 4469ea5972ab9c3064af6dcc0d76c1dfa6bb7913
  git revert -n b3c76fdbb11988c5775b684980aabc02886e5d41
  git revert -n d02a33a248258cc0c2803f7af318ddcd8d83ba16
  git revert -n 0a495d145f59939cba68849a721e6cf27babce34
  git revert -n 372236a01bc548c3a0fdb02eb362144a3b10a233

  local src
  for src in "${source[@]}"; do
    src="${src%%::*}"
    src="${src##*/}"
    [[ $src = *.patch ]] || continue
    echo "Applying patch $src..."
    patch -Np1 < "../$src"
  done

  echo "Setting config..."
  cp ../config .config
  make olddefconfig

  make -s kernelrelease > version
  echo "Prepared $pkgbase version $(<version)"
}
This reverts some dm commits. Enable parallel compilation to reduce build time. Build the package and test if the issue is still present.
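As a sketch of the parallel compilation step: make takes its job count from MAKEFLAGS, which is usually set in /etc/makepkg.conf. Exporting it in the shell instead, as below, is an assumption that only holds if makepkg.conf does not set MAKEFLAGS itself; nproc from GNU coreutils is assumed to be available.

```shell
# Use one make job per available CPU (nproc is part of GNU coreutils).
jobs="$(nproc)"
export MAKEFLAGS="-j${jobs}"
echo "MAKEFLAGS=${MAKEFLAGS}"

# Then build from the directory that contains the PKGBUILD:
#   makepkg -s
```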
Offline
I built the kernel with the commits listed by loqs reverted, but the issue stayed the same.
I will try to revert the other dm related commits soon.
Offline
The following reverts more commits and rebuilds only what has changed (this should now cover all block commits, but not the NVMe ones):
cd src/archlinux-linux
git revert -n bf8fe7b755c2ccdf8fd739ad71dd0d035588511a
git revert -n 3c761332597d1dc3bc527ba5924f300dc43ae9a2
git revert -n 70d22582c3eb6d50c30574019777d546fbd5cc81
git revert -n dea6f05d372a2117b581e17a3638a72d696ac6aa
git revert -n e37bc36aaff38fdf8fafc52bc88ad98ed1ff7a88
git revert -n 329c9ffc81cfb985c6d131e94e6d220d7c1b19ca
git revert -n a7a42c1e5023cdac2bbc1038689509595d279cd2
git revert -n b7df98a8b7b8abce596e9696d5c3183fc4c0019d
git revert -n 692d0626557451c4b557397f20b7394b612d0289
cd ../..
makepkg -e
Offline
Reverting the additional patches fixed the issue for me.
Offline
So it was one of those nine extra commits. The following reapplies four of them.
cd src/archlinux-linux
git cherry-pick -n 692d0626557451c4b557397f20b7394b612d0289
git cherry-pick -n b7df98a8b7b8abce596e9696d5c3183fc4c0019d
git cherry-pick -n a7a42c1e5023cdac2bbc1038689509595d279cd2
git cherry-pick -n 329c9ffc81cfb985c6d131e94e6d220d7c1b19ca
cd ../..
makepkg -e
Then repeat reverting or cherry-picking as needed until you have a single commit left as the cause.
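The repeat step amounts to halving the list of suspects each round. A minimal sketch of that bookkeeping follows; the helper names first_half and second_half are made up for illustration and are not part of git or makepkg. Reapply the first half with git cherry-pick -n, rebuild, and test: if the RAID disappears again, the culprit is in the first half, otherwise it is in the second.

```shell
# first_half (hypothetical helper): print the first half of the given commit
# IDs, rounded up. This is the batch to reapply in the next round.
first_half() {
    n=$(( ($# + 1) / 2 ))
    i=0
    for c in "$@"; do
        i=$((i + 1))
        if [ "$i" -le "$n" ]; then
            printf '%s\n' "$c"
        fi
    done
}

# second_half (hypothetical helper): print the remaining commit IDs, which
# stay reverted during the next round.
second_half() {
    n=$(( ($# + 1) / 2 ))
    i=0
    for c in "$@"; do
        i=$((i + 1))
        if [ "$i" -gt "$n" ]; then
            printf '%s\n' "$c"
        fi
    done
}

# Usage with the four commits reapplied above:
#   first_half 692d0626557451c4b557397f20b7394b612d0289 \
#              b7df98a8b7b8abce596e9696d5c3183fc4c0019d \
#              a7a42c1e5023cdac2bbc1038689509595d279cd2 \
#              329c9ffc81cfb985c6d131e94e6d220d7c1b19ca
```

With four suspects this takes two more builds, compared with up to three when testing the commits one at a time.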
Edit:
If anyone else wants to test, please try the following change to the PKGBUILD, which reverts the five commits that Nuckal777 would not be reapplying:
prepare() {
  cd $_srcname

  echo "Setting version..."
  scripts/setlocalversion --save-scmversion
  echo "-$pkgrel" > localversion.10-pkgrel
  echo "${pkgbase#linux}" > localversion.20-pkgname

  git revert -n bf8fe7b755c2ccdf8fd739ad71dd0d035588511a
  git revert -n 3c761332597d1dc3bc527ba5924f300dc43ae9a2
  git revert -n 70d22582c3eb6d50c30574019777d546fbd5cc81
  git revert -n dea6f05d372a2117b581e17a3638a72d696ac6aa
  git revert -n e37bc36aaff38fdf8fafc52bc88ad98ed1ff7a88

  local src
  for src in "${source[@]}"; do
    src="${src%%::*}"
    src="${src##*/}"
    [[ $src = *.patch ]] || continue
    echo "Applying patch $src..."
    patch -Np1 < "../$src"
  done

  echo "Setting config..."
  cp ../config .config
  make olddefconfig

  make -s kernelrelease > version
  echo "Prepared $pkgbase version $(<version)"
}
Last edited by loqs (2020-09-13 00:23:33)
Offline
I had no luck with my other topic and have fallen back to linux-lts for now.
I reformatted my raid and lost my VM, and will install it from scratch ¯\_(ツ)_/¯
Hope this bug will be fixed soon
Offline
Please test cherry-picking https://git.kernel.org/pub/scm/linux/ke … 5957a60e1a
Offline
After cherry-picking those four commits, the RAID is not recognized, so one of the following should be the cause:
692d0626557451c4b557397f20b7394b612d0289
b7df98a8b7b8abce596e9696d5c3183fc4c0019d
a7a42c1e5023cdac2bbc1038689509595d279cd2
329c9ffc81cfb985c6d131e94e6d220d7c1b19ca
Please test cherry-picking https://git.kernel.org/pub/scm/linux/ke … 5957a60e1a
On top of a clean 5.8.8 build?
Offline
Yes, on a clean 5.8.8; that will be the only change needed.
Offline
Commit 692d0626557451c4b557397f20b7394b612d0289 is causing the issue on my machine. I will try to test the 5.8.8 + cherry-pick 88ce2a530cc9865a894454b2e40eba5957a60e1a tomorrow.
Offline
If it is 692d0626557451c4b557397f20b7394b612d0289, then 88ce2a530cc9865a894454b2e40eba5957a60e1a will not fix it. Try just reverting 692d0626557451c4b557397f20b7394b612d0289 on 5.8.8 to confirm.
Ignore that; I got confused between the backport commit IDs and the upstream commit IDs.
https://bugs.archlinux.org/task/67891
Nuckal777, thank you for all the testing.
Edit:
Fixed in linux 5.8.9.arch2-1
Last edited by loqs (2020-09-14 02:09:25)
Offline