emergency.service can't be overridden any more

Salkay · 2024-12-25 10:25:03

My home server has an issue with one hard drive not starting properly after a system reboot. The drive is in my fstab, so it prevents the system from fully booting. To work around this, I previously set my system to reboot automatically after such a failure by running systemctl edit emergency.service and adding:

[Service]
ExecStartPre=/bin/sh -c "read -t 30 || /bin/systemctl reboot"

This has worked fine for months. However, in the last few weeks, it has stopped working: the system seems to proceed to emergency/rescue mode without performing the reboot. I had a look in journalctl and found the following messages:

Dec 25 17:05:40 qi (plymouth)[516]: emergency.service: Unable to locate executable 'plymouth': No such file or directory
Dec 25 17:05:40 qi systemd[1]: Startup finished in 44.719s (firmware) + 6.013s (loader) + 6.767s (kernel) + 21.640s (userspace) = 1min 19.1>
Dec 25 17:05:55 qi systemd[1]: emergency.service: Control process exited, code=exited, status=1/FAILURE
Dec 25 17:05:55 qi systemd[1]: emergency.service: Failed with result 'exit-code'.

I can see that /usr/lib/systemd/system/emergency.service (surprisingly?) contains a reference to plymouth, as per the excerpt:

[Service]
Environment=HOME=/root
WorkingDirectory=-/root
ExecStartPre=-plymouth --wait quit
ExecStart=-/usr/lib/systemd/systemd-sulogin-shell emergency

Looking back at my logs and backups, I don't think I've ever had plymouth installed. (Should it be a dependency of systemd, which provides /usr/lib/systemd/system/emergency.service?) However, I think this service has contained the plymouth reference for ages, so perhaps it is not relevant. Nevertheless, I tried to override this line by changing my systemctl edit emergency.service override to:

[Service]
ExecStartPre=
ExecStartPre=/bin/sh -c "read -t 15 || /bin/systemctl reboot"

I no longer get the plymouth error, but it still fails to reboot, with the following in journalctl:

Dec 25 17:42:46 qi systemd[1]: emergency.service: Main process exited, code=exited, status=1/FAILURE
Dec 25 17:42:46 qi systemd[1]: emergency.service: Failed with result 'exit-code'.

I did some other troubleshooting, and I think the line seems to execute, but just doesn't cause a reboot any more. How can I get this working again?
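
For reference, the override relies on read -t returning a non-zero status when no input arrives before the timeout, so that the command after || runs. Below is a minimal sketch of that pattern as a standalone bash function. The name wait_or_fallback is hypothetical, and the sketch assumes bash, since read -t is not guaranteed in every /bin/sh.

```bash
#!/bin/bash
# wait_or_fallback SECONDS COMMAND [ARGS...]
# Hypothetical helper: wait up to SECONDS for a line of input on stdin.
# If a line arrives in time, do nothing; otherwise run COMMAND.
wait_or_fallback() {
    local timeout=$1
    shift
    # read exits non-zero on timeout (and also on EOF or other read errors),
    # so the fallback runs in any case where no complete line was read.
    read -r -t "$timeout" || "$@"
}

# Usage mirroring the override (commented out so the sketch is safe to run):
# wait_or_fallback 15 /bin/systemctl reboot
```

With this, pressing Enter within the timeout skips the fallback, which is how the override leaves a window to stay in the emergency shell.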

EDIT

I did a bit more troubleshooting. If I change the line to the following

ExecStartPre=/bin/sh -c "echo waiting 15 seconds; read -t 15 || echo timed out; ls /bin/systemctl; /bin/systemctl reboot; echo should have rebooted"

I see all the "echo"s when I connect a physical screen to the server. However, when /bin/systemctl reboot should execute, I see

Failed to connect to system scope bus via local transport: No such file or directory

It might be related to this issue. I patched the file and built systemd, but it still seems broken for me.
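
A possible workaround, untested in this setup: the error above means systemctl could not reach the service manager, and according to systemctl(1), passing --force twice makes systemctl perform the reboot itself without contacting the manager. The drop-in below is a sketch that uses this as a fallback. It assumes that systemctl reboot exits non-zero when the connection fails. Note that a double --force reboot does not stop services or unmount file systems, so it may cause data loss.

```ini
# Drop-in created with: systemctl edit emergency.service
[Service]
# Clear the ExecStartPre= list inherited from the packaged unit.
ExecStartPre=
# Wait 15 s for input (Enter); otherwise try a normal reboot, then a forced one.
# Assumption: "systemctl reboot" returns non-zero if it cannot reach the manager.
ExecStartPre=/bin/sh -c "read -t 15 || /bin/systemctl reboot || /bin/systemctl --force --force reboot"
```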

Last edited by Salkay (2025-01-01 06:23:28)

Salkay · 2025-01-16 10:43:27

This seems to have been fixed by a recent update, perhaps by one of the changes in systemd 257.2-2, which I updated to yesterday. I'm not really sure which change was relevant. In any case, the emergency shell now reboots as expected. FWIW, I now see the following in journalctl:

Jan 16 19:57:40 qi systemd[1]: emergency.service: Control process exited, code=killed, status=15/TERM
Jan 16 19:57:40 qi systemd[1]: emergency.service: Failed with result 'signal'.

However, the reference to plymouth in the service is still confusing. If I don't override it, I still get the following in journalctl.

Jan 16 21:28:52 qi (plymouth)[510]: emergency.service: Unable to locate executable 'plymouth': No such file or directory

I don't really understand why there should be a reference to plymouth in the service, but I guess it's doing no harm.
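
On the "doing no harm" part: as far as systemd.service(5) describes it, a leading - on an Exec line marks the command as allowed to fail, so the failure is recorded in the log but otherwise has no effect on the unit. That would match the single log line above. Here is an annotated copy of the excerpt; the comments are added for illustration and are not part of the packaged file.

```ini
[Service]
Environment=HOME=/root
# Leading "-" on the path: WorkingDirectory is optional, a missing /root is not fatal.
WorkingDirectory=-/root
# Leading "-" on the command: a failure here (such as a missing plymouth
# binary) is logged but ignored, and startup continues with ExecStart=.
ExecStartPre=-plymouth --wait quit
# Same prefix: a non-zero exit from the emergency shell is not treated as a unit failure.
ExecStart=-/usr/lib/systemd/systemd-sulogin-shell emergency
```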

Arch Linux
