
#1 2025-09-18 19:34:27

akira86
Member
Registered: 2009-01-16
Posts: 124

[SOLVED] pacman→hook→udev-reload→watchdog did not stop→reboot→corrupt

Hello,

I think this might be a bug, but before reporting it I thought I would discuss it here (in case the "bug" is more on my side/config than in Arch Linux itself).

I run an "XPS 13 9300".

On a normal boot I have:

root@mde # journalctl -b | grep watchdog
sept. 18 21:11:35 archlinux kernel: NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
sept. 18 21:11:51 mde boltd[733]: watchdog: enabled [pulse: 90s]

root@mde # lsmod | grep wdt
iTCO_wdt               16384  0
intel_pmc_bxt          16384  1 iTCO_wdt
iTCO_vendor_support    12288  1 iTCO_wdt
intel_oc_wdt           12288  0

root@mde # wdctl
Device:        /dev/watchdog0
Identity:      intel_oc_wdt [version 0]
Timeout:       60 seconds
FLAG           DESCRIPTION               STATUS BOOT-STATUS
KEEPALIVEPING  Keep alive ping reply          0           0
MAGICCLOSE     Supports magic close char      0           0
SETTIMEOUT     Set timeout (in seconds)       0           0

On some pacman upgrades/installs, `30-systemd-udev-reload.hook` gets triggered:

root@mde # cat /usr/share/libalpm/hooks/30-systemd-udev-reload.hook
[Trigger]
Type = Path
Operation = Install
Operation = Upgrade
Operation = Remove
Target = usr/lib/udev/rules.d/*

[Action]
Description = Reloading device manager configuration...
When = PostTransaction
Exec = /usr/share/libalpm/scripts/systemd-hook udev-reload

which shows up like this in the journal:

sept. 18 00:11:08 mde pacman[5777]: running '30-systemd-udev-reload.hook'...
sept. 18 00:11:09 mde boltd[747]: [2062da76-7aab-XPS 13 9300                ] udev: device changed: authorized -> authorized
sept. 18 00:11:09 mde systemd-logind[754]: Watching system buttons on /dev/input/event1 (Power Button)
sept. 18 00:11:09 mde boltd[747]: [01d9e555-0eca-XPS 13 9300                ] udev: device changed: authorized -> authorized
sept. 18 00:11:09 mde systemd-logind[754]: Watching system buttons on /dev/input/event2 (Sleep Button)
sept. 18 00:11:09 mde systemd-logind[754]: Watching system buttons on /dev/input/event0 (Lid Switch)
sept. 18 00:11:09 mde systemd[1655]: Reached target Bluetooth.
sept. 18 00:11:09 mde systemd[1655]: Reached target Sound Card.
sept. 18 00:11:09 mde systemd-logind[754]: Watching system buttons on /dev/input/event5 (Yubico Yubikey NEO OTP+U2F+CCID)
sept. 18 00:11:09 mde systemd-logind[754]: Watching system buttons on /dev/input/event11 (Intel HID events)
sept. 18 00:11:09 mde systemd-logind[754]: Watching system buttons on /dev/input/event12 (Intel HID 5 button array)
sept. 18 00:11:09 mde kernel: watchdog: watchdog1: watchdog did not stop!
sept. 18 00:11:09 mde kernel: watchdog: watchdog0: watchdog did not stop!
sept. 18 00:11:09 mde systemd-logind[754]: Watching system buttons on /dev/input/event3 (AT Translated Set 2 keyboard)

Notice the `watchdog did not stop!` ↑

And 60s later, the system reboots abruptly.
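
To check whether an earlier abrupt reboot was preceded by the same kernel message, something like the following could be used. It is only a sketch: the function name `count_watchdog_not_stopped` is made up, and it assumes the journal of the affected boot was written to disk before the reset.

```shell
#!/bin/sh
# Count kernel messages reporting a watchdog that was left running.
# Reads journal text on stdin, so it can be fed from any boot.
count_watchdog_not_stopped() {
    # grep -c prints the number of matching lines; it exits non-zero
    # when there are none, hence the "|| true".
    grep -c 'watchdog did not stop!' || true
}

# Example (kernel messages of the previous boot):
#   journalctl -k -b -1 | count_watchdog_not_stopped
```

A non-zero count for the boot that ended in the reset would point at the watchdog rather than at a plain crash.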

Sometimes pacman hasn't finished running its hooks by then, and here comes the fun → I have already ended up with a corrupted ramdisk and a corrupted EFI /boot partition.
So if it's actually an Arch Linux bug, I think it's a big one.
(I think an inexperienced user would have a hard time fixing a corrupted ramdisk or EFI partition.)

I managed to figure out it was a watchdog problem, and blacklisted the two watchdog modules that were loaded by default:

root@mde # cat /etc/modprobe.d/blacklist.conf
blacklist intel_oc_wdt
blacklist iTCO_wdt

Now I can update my system safely.
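
A small sketch for confirming that the blacklist took effect after a reboot. The helper name `loaded_wdt_modules` is hypothetical, and matching on the `_wdt` suffix is an assumption that fits the two modules here; watchdog drivers with other names would be missed.

```shell
#!/bin/sh
# Print the names of loaded modules whose name ends in "_wdt".
# Reads `lsmod` output on stdin; the first column is the module name,
# so modules that merely list a *_wdt module under "Used by" are ignored.
loaded_wdt_modules() {
    awk '$1 ~ /_wdt$/ { print $1 }'
}

# Example:
#   lsmod | loaded_wdt_modules
```

Empty output means no such module is loaded. Keep in mind that a `blacklist` line only stops automatic loading through module aliases; the module can still be loaded explicitly with modprobe.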

→ So what do you think?
Is it an Arch Linux bug, or did I miss something?

Last edited by akira86 (2025-09-22 07:59:41)


#2 2025-09-18 19:49:19

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 69,987

Re: [SOLVED] pacman→hook→udev-reload→watchdog did not stop→reboot→corrupt


#3 2025-09-18 20:32:34

akira86
Member
Registered: 2009-01-16
Posts: 124

Re: [SOLVED] pacman→hook→udev-reload→watchdog did not stop→reboot→corrupt

It might be related, as the "systemd" package owns some files in `/usr/lib/udev/rules.d/` (which would trigger a udev reload during the update).

But prMoriarty talks about an update in a container, and says the container crashed, not his whole laptop.
If it were a watchdog problem (inside the kernel), I think his whole system would have rebooted.
I don't know much about watchdogs though… maybe the kernel has a way to make watchdogs work inside containers.


#4 2025-09-18 20:38:57

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 69,987

Re: [SOLVED] pacman→hook→udev-reload→watchdog did not stop→reboot→corrupt

Ignore the specific symptoms: does "sudo touch /etc/systemd/do-not-udevadm-trigger-on-update" prevent this?
You're currently sidestepping the issue by not using the watchdog, but the real cause is whatever triggers the watchdog in the first place.
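
For what it's worth, a minimal sketch of the check that this marker file presumably gates. It assumes the hook script does nothing more than test whether the file exists before calling `udevadm trigger`; the actual logic lives in /usr/share/libalpm/scripts/systemd-hook and may differ. The function name `udev_trigger_state` is made up for illustration.

```shell
#!/bin/sh
# Report whether the udev trigger would be skipped, ASSUMING the hook
# script only tests for the existence of the marker file.
# $1: marker path (defaults to the path quoted above).
udev_trigger_state() {
    marker=${1:-/etc/systemd/do-not-udevadm-trigger-on-update}
    if [ -e "$marker" ]; then
        echo "skipped"
    else
        echo "enabled"
    fi
}

# Example:
#   udev_trigger_state
```

Removing the marker file again should restore the default behaviour under the same assumption.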


#5 2025-09-21 12:49:33

akira86
Member
Registered: 2009-01-16
Posts: 124

Re: [SOLVED] pacman→hook→udev-reload→watchdog did not stop→reboot→corrupt

seth wrote:

does "sudo touch /etc/systemd/do-not-udevadm-trigger-on-update" prevent this?


Yes, it does.

I reverted my module blacklist, so:

root@mde # lsmod | grep wdt
iTCO_wdt               16384  0
intel_pmc_bxt          16384  1 iTCO_wdt
iTCO_vendor_support    12288  1 iTCO_wdt
intel_oc_wdt           12288  0

Then I reinstalled every package so that all hooks get triggered:

root@mde # pacman -Qqn | pacman -S -

→ no reboot

Still, will disabling `udevadm trigger` on update break anything? It must be there for a reason… I wonder.

Last edited by akira86 (2025-09-21 12:50:19)


#6 2025-09-21 13:38:55

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 69,987

Re: [SOLVED] pacman→hook→udev-reload→watchdog did not stop→reboot→corrupt

Downstream from linked thread and bug, https://github.com/systemd/systemd-stab … 4631-L4656
This is only relevant if there are changes to udev rules that shall then be applied to devices already present - "rarely ever".
The trigger is, however, as you can see in that bug, prone to causing all sorts of issues with badly responding hardware.
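
If a changed rule ever does need to be applied to a present device while the automatic trigger is off, one option might be to replay events for a single subsystem only. The sketch below just prints the commands for review instead of running them; the function name and the "input" example value are hypothetical, and whether a narrower trigger avoids the watchdog problem on this hardware is untested.

```shell
#!/bin/sh
# Print (not run) the commands that would reload udev rules and replay
# "change" events for one subsystem only.
# $1: subsystem name, e.g. "input" (example value, not from the thread).
print_targeted_udev_trigger() {
    subsystem=$1
    if [ -z "$subsystem" ]; then
        echo "usage: print_targeted_udev_trigger SUBSYSTEM" >&2
        return 1
    fi
    printf '%s\n' \
        "udevadm control --reload" \
        "udevadm trigger --action=change --subsystem-match=$subsystem" \
        "udevadm settle"
}

# Example:
#   print_targeted_udev_trigger input
```

The printed commands need root when run by hand.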

If it helps you at all: I keep it off on principle, because I consider the off chance of some random udev rule not being applied immediately WAY less harmful than the (likewise remote?) risk of a pacman -Syupsreboot

In any case, please always remember to mark resolved threads by editing the subject of your initial post - so others will know that there's no task left, but maybe a solution to find.
Thanks.

