Hello guys,
After more than two years a new version of mdadm has landed, and my own alarm script has trouble dealing with it. :-/
Unfortunately, it looks like the new version churns through the devices during the boot phase, testing whether they are active or not.
Of course, my first step was to reassemble the arrays and regenerate mdadm.conf, followed by a mkinitcpio run.
At the moment my script includes a dirty hack based on a file check, which causes other problems when a "real" error happens.
Does anyone have a solution for the new version testing devices on boot?
(I tested a package downgrade, of course, and everything is fine again.)
My script output:
journalctl -b -u mdmonitor.service
Mär 03 09:54:26 daten-box systemd[1]: Started MD array monitor.
Mär 03 09:54:26 daten-box mdadm[719]: mdadm: DeviceDisappeared event detected on md device /dev/md/md127
Mär 03 09:54:26 daten-box mdadm[727]: Assume boot in progress. Waiting...
Mär 03 09:54:26 daten-box mdadm[719]: mdadm: DeviceDisappeared event detected on md device /dev/md/md1
Mär 03 09:54:26 daten-box mdadm[746]: Assume boot in progress. Waiting...
Mär 03 09:54:26 daten-box mdadm[719]: mdadm: NewArray event detected on md device /dev/md127
Mär 03 09:54:26 daten-box mdadm[755]: Assume boot in progress. Waiting...
Mär 03 09:54:26 daten-box mdadm[719]: mdadm: NewArray event detected on md device /dev/md1
Mär 03 09:54:26 daten-box md_monitor_alarm.sh[769]: md_monitor_alarm.sh called with NewArray /dev/md1 arguments and no /var fs is present
Mär 03 09:54:26 daten-box mdadm[765]: Assume boot in progress Waiting...
mdadm.conf:
...
MAILADDR <hidden>
PROGRAM /usr/local/bin/md_monitor_alarm.sh
# old arch package 4.2.2 #ARRAY /dev/md/daten-box:127 metadata=1.2 UUID=da1092e8:99e02141:26ae0fc4:e82b972a
# old arch package 4.2.2 #ARRAY /dev/md1 metadata=1.2 UUID=b615704c:787643d8:2dcd3a41:52cca4c6
# Reassemble with my rescue system and 'mkinitcpio -P' done
ARRAY /dev/md1 metadata=1.2 name=debian-rescue:pool UUID=b615704c:787643d8:2dcd3a41:52cca4c6
ARRAY /dev/md127 metadata=1.2 name=debian-rescue:ssd UUID=da1092e8:99e02141:26ae0fc4:e82b972a
My script: (I'm a lazy bitch, so no comments about the script style please)
#!/usr/bin/bash
#
# MD raid alarm script
# Copyright Akusari 2023 (<hidden>)
#
# Called by mdadm
# $1 = event
# $2 = device
#
# Start conditions
#
if [ -z "$1" ]; then
    echo "Missing first argument!"
    exit 255
fi
if [ -z "$2" ]; then
    echo "Missing second argument!"
    exit 255
fi
#
# Logging
#
call_date="$(/usr/bin/date +%Y-%m-%d-%T)"
script_name="$(basename "$0")"
# "$*" joins all arguments into one string for the log message
message="$script_name called with $* arguments"
log_file="/var/log/monitor_md_alarm.log"
# Dirty hack: a missing log file is taken to mean /var is not mounted yet (early boot)
if [ ! -f "$log_file" ]; then
    echo "$message and no /var fs is present" | /usr/bin/systemd-cat -p warning -t "$script_name"
    echo "Assume boot in progress! Waiting..."
    exit 254
fi
echo "${call_date}: $message" >> "$log_file"
echo "$message" | /usr/bin/systemd-cat -p warning -t "$script_name"
#
# conditions
#
if [ "$1" == "TestMessage" ]; then
    echo "${call_date}: Abort because Test-Mode detected" >> "$log_file"
    echo "$1" | mail -Ssendwait -s "$(hostname) $(basename "$0") device $2" root@<hidden>
    exit 0
fi
# $2 is a device path such as /dev/md1, but sysfs expects the kernel name (md1).
# readlink -f also resolves /dev/md/<name> symlinks to the real device node.
md_name="$(basename "$(readlink -f "$2")")"
if [ "$(cat "/sys/block/${md_name}/md/sync_action" 2>/dev/null)" == "check" ]; then
    echo "Abort raid alarm because there is a raid check on $2 running" >> "$log_file"
    echo "$1" | mail -Ssendwait -s "$(hostname) $(basename "$0") device $2" root@<hidden>
    exit 0
fi
if [ "$1" == "NewArray" ]; then
    echo "New Array $2 detected - There is nothing to do for us" >> "$log_file"
    echo "$1" | mail -Ssendwait -s "$(hostname) $(basename "$0") device $2" root@<hidden>
    exit 0
fi
# The specific Rebuild events are tested before the generic "Rebuild"* pattern;
# in the other order these two branches could never be reached.
if [ "$1" == "RebuildStarted" ]; then
    echo "Rebuild Array $2 detected - There is nothing to do for us" >> "$log_file"
    echo "$1" | mail -Ssendwait -s "$(hostname) $(basename "$0") device $2" root@<hidden>
    exit 0
fi
if [ "$1" == "RebuildFinished" ]; then
    echo "Abort raid alarm because Raid rebuild $2 finished" >> "$log_file"
    echo "$1" | mail -Ssendwait -s "$(hostname) $(basename "$0") device $2" root@<hidden>
    exit 0
fi
# Any remaining Rebuild* event (rebuild progress reports)
if [[ $1 == "Rebuild"* ]]; then
    echo "Abort raid alarm because there is a raid rebuilding on $2 running" >> "$log_file"
    echo "$1" | mail -Ssendwait -s "$(hostname) $(basename "$0") device $2" root@<hidden>
    exit 0
fi
#
# Run endless
#
while true; do
    if [ ! -f /.md_silent ]; then
        echo -e '\a' > /dev/console
    fi
    sleep 30
    if [ -f /.md_exit ]; then
        rm /.md_exit
        break
    fi
done
exit 0
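As a possible alternative to the log-file check, the script could ask systemd whether boot is still in progress. This is only a sketch, assuming a systemd system where `systemctl is-system-running` reports `initializing` or `starting` during boot; `boot_in_progress` is a hypothetical helper name, and a real failure that happens during boot would still be suppressed, just like with the current hack.

```shell
# Sketch: report success while systemd still considers the system to be booting.
# boot_in_progress is a hypothetical helper, not part of mdadm or systemd.
boot_in_progress() {
    # An optional first argument overrides the state (handy for dry runs).
    # Otherwise query systemd; "|| true" is needed because is-system-running
    # exits non-zero for every state except "running".
    state="${1:-$(systemctl is-system-running 2>/dev/null || true)}"
    case "$state" in
        initializing|starting) return 0 ;;
        *) return 1 ;;
    esac
}
```

In the alarm script this could replace the test on the log file, for example `if boot_in_progress; then exit 254; fi`, so that a missing log file no longer doubles as the boot indicator.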
Regards
Akusari
Last edited by Akusari (2024-03-03 09:29:47)
black listed users: seth WorMzy
Please stay away from this thread - Thanks.
Offline
Seems like a bug, but maybe raise it on the mdadm mailing list. If you're not overly reliant on the device name being /dev/md#, then you could just change it in your mdadm.conf to match what mdadm 'wants' to call it. i.e. /dev/md/md127 and /dev/md/md1.
FWIW I can reproduce locally if I change
ARRAY /dev/md/ssdraid metadata=1.2 name=sakura:ssdraid UUID=b7a499c0:c424e415:011ce8fd:934931ab
to
ARRAY /dev/md127 metadata=1.2 name=sakura:ssdraid UUID=b7a499c0:c424e415:011ce8fd:934931ab
mdadm creates /dev/md/md127 first, then removes that and creates /dev/md127, triggering the DeviceDisappeared and NewArray events.
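Applied to the mdadm.conf from the first post, the renaming suggested above might look roughly like this. It is a sketch only: the name= and UUID values are copied from that post, and only the device paths are changed.

```
# Sketch: device paths changed to the names mdadm creates on its own
ARRAY /dev/md/md1 metadata=1.2 name=debian-rescue:pool UUID=b615704c:787643d8:2dcd3a41:52cca4c6
ARRAY /dev/md/md127 metadata=1.2 name=debian-rescue:ssd UUID=da1092e8:99e02141:26ae0fc4:e82b972a
```

After a change like this the initramfs would need to be regenerated again (mkinitcpio -P), and anything that refers to the old /dev/md1 or /dev/md127 paths would presumably have to be adjusted as well.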
Sakura:-
Mobo: MSI MAG X570S TORPEDO MAX // Processor: AMD Ryzen 9 5950X @4.9GHz // GFX: AMD Radeon RX 5700 XT // RAM: 32GB (4x 8GB) Corsair DDR4 (@ 3000MHz) // Storage: 1x 3TB HDD, 6x 1TB SSD, 2x 120GB SSD, 1x 275GB M2 SSD
Making lemonade from lemons since 2015.
Offline
Seems like a bug, but maybe raise it on the mdadm mailing list. If you're not overly reliant on the device name being /dev/md#, then you could just change it in your mdadm.conf to match what mdadm 'wants' to call it. i.e. /dev/md/md127 and /dev/md/md1.
Yes, I also think it's a bug and not a feature. Thanks anyway :-)
I suspect your (good) suggestion could still cause trouble, because 99% of the wikis and documentation around mdadm use the old /dev/mdX convention, so a lot of users might be affected.
It should be fixed either way.
Regards
Akusari
Last edited by Akusari (2024-03-04 20:25:29)
Offline
You can set MONITORDELAY in mdadm.conf but I'm not sure if it would help in this case.
You could just ignore calls with unusual array names in your script. Or ask the linux-raid mailing list for advice.
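Ignoring the unusual names could be sketched as follows, assuming the transient boot-time names always have the form /dev/md/md<number> as in the journal output above, and that mdadm.conf itself keeps the /dev/mdX names. `is_transient_md_name` is a hypothetical helper, not something mdadm provides.

```shell
# Sketch: succeed only for names of the form /dev/md/md<digits>,
# e.g. /dev/md/md127; /dev/md1 or /dev/md/ssd do not match.
# is_transient_md_name is a hypothetical helper name.
is_transient_md_name() {
    # Strip the expected prefix; if nothing was stripped, the prefix is missing.
    suffix="${1#/dev/md/md}"
    [ "$suffix" != "$1" ] || return 1
    case "$suffix" in
        ''|*[!0-9]*) return 1 ;;  # empty or contains a non-digit
        *) return 0 ;;            # digits only
    esac
}
```

Near the top of the alarm script ($1 = event, $2 = device) this could be used as `if is_transient_md_name "$2"; then exit 0; fi`. If the arrays are later renamed to /dev/md/md1 and /dev/md/md127 in mdadm.conf, this filter would hide real events and would have to go.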
Offline
You can set MONITORDELAY in mdadm.conf but I'm not sure if it would help in this case.
The problem is in mdadm itself (an upstream issue), so MONITORDELAY doesn't help.
You could just ignore calls with unusual array names in your script.
Yeah, but I have switched to the symbolic /dev/md/X naming scheme, and that works for me.
BTW: a package downgrade is no longer an option since the Arch team announced a mkinitcpio upgrade: https://archlinux.org/news/mkinitcpio-h … microcode/
But thanks anyway for your help :-)
Regards
Akusari
Last edited by Akusari (2024-03-04 21:19:01)
Offline