#1 2023-01-25 21:52:37

bachtiar
Member
Registered: 2005-02-08
Posts: 64

Watchdog does not reboot the machine

I'm trying to configure my system for automatic reboot on lockup using a watchdog.

In dmesg I see the watchdog device being detected:

NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
...
iTCO_wdt iTCO_wdt: Found a Intel PCH TCO device (Version=4, TCOBASE=0x0400)
iTCO_wdt iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)

This makes me believe that the watchdog device is present and working. I also have /dev/watchdog and /dev/watchdog0.

As per https://0pointer.de/blog/projects/watchdog.html, I configured RuntimeWatchdogSec=20 in /etc/systemd/system.conf, and checked it with wdctl:

Device:        /dev/watchdog0
Identity:      iTCO_wdt [version 0]
Timeout:       20 seconds
Timeleft:      19 seconds
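
For anyone reproducing this, here is a minimal shell sketch of reading the setting back from the config file. The helper name runtime_watchdog_sec is made up for illustration, and the demo file contents are an example rather than my real system.conf. It only looks at the single file it is given and ignores any drop-ins (for example under /etc/systemd/system.conf.d/), so it is a rough check, not what systemd itself ends up using.

```shell
#!/bin/sh
# Hypothetical helper: print the last uncommented RuntimeWatchdogSec= value
# found in the file passed as $1 (prints nothing if the option is not set).
runtime_watchdog_sec() {
    sed -n 's/^[[:space:]]*RuntimeWatchdogSec=[[:space:]]*//p' "$1" | tail -n 1
}

# Demo on a throwaway file with example contents, so it runs without root.
demo_conf=$(mktemp)
printf '%s\n' '[Manager]' '#RuntimeWatchdogSec=off' 'RuntimeWatchdogSec=20' > "$demo_conf"
runtime_watchdog_sec "$demo_conf"    # prints: 20
rm -f "$demo_conf"

# On the real machine: runtime_watchdog_sec /etc/systemd/system.conf
```

The commented-out line in the demo file is skipped because the pattern only matches lines that begin with the option name (after optional whitespace).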

So far so good. One minor observation: "Timeleft" always reads 19 seconds. I would expect it to count down, but then again I'm not sure how it's supposed to work.
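
To see whether "Timeleft" ever moves, one could sample it repeatedly instead of checking once. A rough sketch, assuming wdctl keeps printing the "Key:   value" layout quoted above; the helper name wd_field is hypothetical, and the demo runs on the quoted text rather than on a live device.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one field (e.g. "Timeleft")
# from wdctl-style "Key:   value" output read on stdin.
wd_field() {
    awk -F': *' -v key="$1" '$1 == key { print $2; exit }'
}

# Demo on the output quoted above, so it runs without a watchdog device.
sample='Device:        /dev/watchdog0
Identity:      iTCO_wdt [version 0]
Timeout:       20 seconds
Timeleft:      19 seconds'
printf '%s\n' "$sample" | wd_field Timeleft    # prints: 19 seconds

# On the real machine (as root), sample once per second for 30 seconds:
# for i in $(seq 30); do wdctl /dev/watchdog0 | wd_field Timeleft; sleep 1; done
```

If the sampled value never changes over a full timeout period, that would at least confirm the observation is not a one-off reading.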

But then I tried to simulate a kernel crash as described in https://unix.stackexchange.com/a/66205:

sysctl debug.kdb.panic=1
echo c > /proc/sysrq-trigger

This froze the kernel as intended, but the machine did not reboot after 20 seconds as I expected. Why? Is the watchdog not working, have I not configured it correctly, or am I not testing it correctly? Or is it something else...?

PS: There are no watchdog-related settings in BIOS...

Last edited by bachtiar (2023-01-26 06:39:23)
