You are not logged in.
It seems an recurring systemd bug is rearing its head again. When I shut down, I get a message that "a stop job is running for session C1 user XXXX" then it makes me wait a minute and a half before it will complete the shutdown.
Searches on this has revealed that variations of this have happened a number of times in the past, in various distros using systemd, due to a variety of reasons. "Solutions" have always been either to downgrade systemd, or use a workaround to force a quicker shutdown despite systemd. Neither of which is a solution. Then, apparently, systemd is fixed and the small number of users actually affected move on...
So, my great suspicion is that there is nothing really to be done until systemd is fixed and updated, but just in case someone who understands these things wants to look at the relevant portion of my journal from the last shutdown, and finds it to be NOT a systemd bug but something else that I (or someone else) can fix, I'll include it here:
Sep 30 06:06:20 XXXXarch polkitd[555]: Registered Authentication Agent for unix-process:7834:157345 (system bus name :1.24 [/usr/bin/pkttyagent --notify-fd 5 -
Sep 30 06:06:20 XXXXarch systemd-logind[468]: System is rebooting.
Sep 30 06:06:20 XXXXarch systemd[1]: Stopped target Graphical Interface.
Sep 30 06:06:20 XXXXarch systemd[1]: Deactivating swap /dev/sdb5...
Sep 30 06:06:20 XXXXarch systemd[1]: Stopping Manage, Install and Generate Color Profiles...
Sep 30 06:06:20 XXXXarch systemd[1]: Stopped target Sound Card.
Sep 30 06:06:20 XXXXarch systemd[1]: Stopping RealtimeKit Scheduling Policy Service...
Sep 30 06:06:20 XXXXarch systemd[1]: Stopping Session c1 of user XXXX.
Sep 30 06:06:20 XXXXarch slim[509]: XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0.0"
Sep 30 06:06:20 XXXXarch slim[509]: after 3805 requests (3805 known processed) with 0 events remaining.
Sep 30 06:06:20 XXXXarch systemd[1]: Starting Generate shutdown-ramfs...
Sep 30 06:06:20 XXXXarch systemd[1]: Stopping Authorization Manager...
Sep 30 06:06:20 XXXXarch slim[509]: XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0.0"
Sep 30 06:06:20 XXXXarch slim[509]: after 11579 requests (11579 known processed) with 0 events remaining.
Sep 30 06:06:20 XXXXarch NetworkManager[473]: <warn> error requesting auth for org.freedesktop.NetworkManager.settings.modify.hostname: (0) Authorization chec
Sep 30 06:06:20 XXXXarch NetworkManager[473]: <warn> error requesting auth for org.freedesktop.NetworkManager.settings.modify.own: (0) Authorization check fai
Sep 30 06:06:20 XXXXarch NetworkManager[473]: <warn> error requesting auth for org.freedesktop.NetworkManager.settings.modify.system: (0) Authorization check
Sep 30 06:06:20 XXXXarch NetworkManager[473]: <warn> error requesting auth for org.freedesktop.NetworkManager.wifi.share.open: (0) Authorization check failed:
Sep 30 06:06:20 XXXXarch dbus[476]: [system] Activating via systemd: service name='org.freedesktop.PolicyKit1' unit='polkit.service'
Sep 30 06:06:20 XXXXarch systemd[1]: Stopping CUPS Scheduler...
Sep 30 06:06:20 XXXXarch pulseaudio[3086]: XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0.0"
Sep 30 06:06:20 XXXXarch pulseaudio[3086]: after 12 requests (6 known processed) with 0 events remaining.
Sep 30 06:06:20 XXXXarch avahi-daemon[1045]: Got SIGTERM, quitting.
Sep 30 06:06:20 XXXXarch systemd[1]: Starting Store Sound Card State...
Sep 30 06:06:20 XXXXarch avahi-daemon[1045]: Leaving mDNS multicast group on interface eth0.IPv4 with address 192.168.1.97.
Sep 30 06:06:20 XXXXarch systemd[1]: Removed slice system-systemd\x2dfsck.slice.
Sep 30 06:06:20 XXXXarch avahi-daemon[1045]: avahi-daemon 0.6.31 exiting.
Sep 30 06:06:20 XXXXarch systemd[1]: Unmounting /home/XXXX/.Work...
Sep 30 06:06:20 XXXXarch NetworkManager[473]: <info> wpa_supplicant stopped
Sep 30 06:06:20 XXXXarch systemd[1]: Stopping WPA supplicant...
Sep 30 06:06:20 XXXXarch apcupsd[469]: apcupsd exiting, signal 15
Sep 30 06:06:20 XXXXarch systemd[1]: Stopping Avahi mDNS/DNS-SD Stack...
Sep 30 06:06:20 XXXXarch apcupsd[469]: apcupsd shutdown succeeded
Sep 30 06:06:20 XXXXarch systemd[1]: Stopped target Multi-User System.
Sep 30 06:06:20 XXXXarch NetworkManager[473]: <warn> disconnected by the system bus.
Sep 30 06:06:20 XXXXarch systemd[1]: Stopping MariaDB database server...
Sep 30 06:06:20 XXXXarch NetworkManager[473]: <warn> error requesting auth for org.freedesktop.NetworkManager.wifi.share.protected: (0) Authorization check fa
Sep 30 06:06:20 XXXXarch systemd[1]: Stopping Hard drive temperature monitor daemon...
Sep 30 06:06:20 XXXXarch NetworkManager[473]: <warn> error requesting auth for org.freedesktop.NetworkManager.wifi.share.open: (0) Authorization check failed:
Sep 30 06:06:20 XXXXarch systemd[1]: Stopping APC UPS Monitor...
Sep 30 06:06:20 XXXXarch NetworkManager[473]: <info> caught SIGTERM, shutting down normally.
Sep 30 06:06:20 XXXXarch systemd[1]: Stopping D-Bus System Message Bus...
Sep 30 06:06:20 XXXXarch NetworkManager[473]: <info> exiting (success)
Sep 30 06:06:20 XXXXarch systemd[1]: Stopping Entropy Harvesting Daemon...
Sep 30 06:06:20 XXXXarch systemd[2767]: pulseaudio.service: Main process exited, code=exited, status=1/FAILURE
Sep 30 06:06:20 XXXXarch systemd[2767]: Stopped Sound Service.
Sep 30 06:06:20 XXXXarch systemd[2767]: pulseaudio.service: Unit entered failed state.
Sep 30 06:06:20 XXXXarch systemd[2767]: pulseaudio.service: Failed with result 'exit-code'.
Sep 30 06:06:20 XXXXarch mysqld[470]: 150930 6:06:20 [Note] /usr/bin/mysqld: Normal shutdown
Sep 30 06:06:20 XXXXarch mysqld[470]: 150930 6:06:20 [Note] Event Scheduler: Purging the queue. 0 events
Sep 30 06:06:20 XXXXarch haveged[474]: haveged: Stopping due to signal 15
Sep 30 06:06:20 XXXXarch haveged[474]: haveged starting up
Sep 30 06:06:20 XXXXarch NetworkManager[473]: g_dbus_connection_real_closed: Remote peer vanished with error: Underlying GIOStream returned 0 bytes on an async
Sep 30 06:06:20 XXXXarch wpa_supplicant[554]: Successfully initialized wpa_supplicant
Sep 30 06:06:20 XXXXarch slim[509]: real transparency off.... depth: 24
Sep 30 06:06:20 XXXXarch slim[509]: xRandr: Found crtc's: 2
Sep 30 06:06:20 XXXXarch slim[509]: xRandr: Linking output DVI-I-1 with crtc 0
Sep 30 06:06:20 XXXXarch slim[509]: NO XSETTINGS manager, tint2 use config 'launcher_icon_theme'.
Sep 30 06:06:20 XXXXarch systemd[2767]: Stopped target Default.
Sep 30 06:06:20 XXXXarch systemd[2767]: Stopping D-Bus User Message Bus...
Sep 30 06:06:20 XXXXarch systemd[1]: Unmounted /run/user/1000/gvfs.
Sep 30 06:06:20 XXXXarch slim[509]: (II) Server terminated successfully (0). Closing log file.
Sep 30 06:06:20 XXXXarch systemd[2767]: Stopped D-Bus User Message Bus.
Sep 30 06:06:20 XXXXarch systemd[2767]: Stopped target Basic System.
Sep 30 06:06:20 XXXXarch systemd[2767]: Stopped target Sockets.
Sep 30 06:06:20 XXXXarch systemd[2767]: Closed Sound System.
Sep 30 06:06:20 XXXXarch systemd[2767]: Stopped target Timers.
Sep 30 06:06:20 XXXXarch systemd[2767]: Stopped target Paths.
Sep 30 06:06:20 XXXXarch systemd[2767]: Closed D-Bus User Message Bus Socket.
Sep 30 06:06:20 XXXXarch systemd[2767]: Reached target Shutdown.
Sep 30 06:06:20 XXXXarch systemd[2767]: Starting Exit the Session...
Sep 30 06:06:21 XXXXarch systemd[1]: Unmounted /home/XXXX/.Work.
Sep 30 06:06:21 XXXXarch systemd[2767]: Received SIGRTMIN+24 from PID 7875 (kill).
Sep 30 06:06:21 XXXXarch systemd[2768]: pam_unix(systemd-user:session): session closed for user XXXX
Sep 30 06:06:21 XXXXarch systemd[1]: Stopped User Manager for UID 1000.
Sep 30 06:06:21 XXXXarch mkinitcpio[7842]: ==> Starting build: none
Sep 30 06:06:21 XXXXarch mkinitcpio[7842]: -> Running build hook: [sd-shutdown]
Sep 30 06:06:21 XXXXarch systemd-logind[468]: Failed to abandon session scope: Transport endpoint is not connected
Sep 30 06:06:21 XXXXarch systemd[1]: slim.service: Main process exited, code=exited, status=1/FAILURE
Sep 30 06:06:21 XXXXarch systemd[1]: Stopped SLiM Simple Login Manager.
Sep 30 06:06:21 XXXXarch systemd[1]: slim.service: Unit entered failed state.
Sep 30 06:06:21 XXXXarch systemd[1]: slim.service: Failed with result 'exit-code'.
Sep 30 06:06:21 XXXXarch mysqld[470]: 150930 6:06:21 [Note] InnoDB: FTS optimize thread exiting.
Sep 30 06:06:21 XXXXarch mysqld[470]: 150930 6:06:21 [Note] InnoDB: Starting shutdown...
Sep 30 06:06:21 XXXXarch mkinitcpio[7842]: ==> Build complete.
Sep 30 06:06:21 XXXXarch systemd[1]: Started Generate shutdown-ramfs.
Sep 30 06:06:23 XXXXarch mysqld[470]: 150930 6:06:23 [Note] InnoDB: Shutdown completed; log sequence number 1617383
Sep 30 06:06:23 XXXXarch mysqld[470]: 150930 6:06:23 [Note] /usr/bin/mysqld: Shutdown complete
Sep 30 06:06:23 XXXXarch systemd[1]: Stopped MariaDB database server.
Sep 30 06:06:28 XXXXarch systemd-coredump[7843]: Process 5856 (BrowserBlocking) of user 1000 dumped core.
Sep 30 06:07:50 XXXXarch systemd[1]: session-c1.scope: Stopping timed out. Killing.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped Session c1 of user XXXX.
Sep 30 06:07:50 XXXXarch systemd[1]: session-c1.scope: Unit entered failed state.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopping Login Service...
Sep 30 06:07:50 XXXXarch systemd[1]: Stopping Permit User Sessions...
Sep 30 06:07:50 XXXXarch systemd[1]: Removed slice user-1000.slice.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped Permit User Sessions.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped target Remote File Systems.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped Login Service.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped target Basic System.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped target Paths.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped Dispatch Password Requests to Console Directory Watch.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped Forward Password Requests to Wall Directory Watch.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped CUPS Scheduler.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped target Sockets.
Sep 30 06:07:50 XXXXarch systemd[1]: Closed CUPS Scheduler.
Sep 30 06:07:50 XXXXarch systemd[1]: Closed Avahi mDNS/DNS-SD Stack Activation Socket.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped target Slices.
Sep 30 06:07:50 XXXXarch systemd[1]: Removed slice User and Session Slice.
Sep 30 06:07:50 XXXXarch systemd[1]: Closed D-Bus System Message Bus Socket.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped target System Initialization.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped Setup Virtual Console.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped target Encrypted Volumes.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopping Load/Save Random Seed...
Sep 30 06:07:50 XXXXarch systemd[1]: Stopping Update UTMP about System Boot/Shutdown...
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped Apply Kernel Variables.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped Load Kernel Modules.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped target Swap.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped Load/Save Random Seed.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped Update UTMP about System Boot/Shutdown.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped Create Volatile Files and Directories.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped target Local File Systems.
Sep 30 06:07:50 XXXXarch systemd[1]: Unset automount home-XXXX-.Work.automount.
Sep 30 06:07:50 XXXXarch systemd[1]: Unmounting /home...
Sep 30 06:07:50 XXXXarch systemd[1]: Unmounting /run/user/1000...
Sep 30 06:07:50 XXXXarch systemd[1]: Unmounting Temporary Directory...
Sep 30 06:07:50 XXXXarch systemd[1]: Unmounted /run/user/1000.
Sep 30 06:07:50 XXXXarch systemd[1]: Unmounted Temporary Directory.
Sep 30 06:07:50 XXXXarch systemd[1]: Unmounted /home.
Sep 30 06:07:50 XXXXarch systemd[1]: Unset automount home.automount.
Sep 30 06:07:50 XXXXarch systemd[1]: Reached target Unmount All Filesystems.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped target Local File Systems (Pre).
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped Create Static Device Nodes in /dev.
Sep 30 06:07:50 XXXXarch systemd[1]: Stopped Remount Root and Kernel File Systems.
Sep 30 06:07:50 XXXXarch systemd[1]: Reached target Shutdown.
Sep 30 06:07:50 XXXXarch systemd[1]: Reached target Final Step.
Sep 30 06:07:50 XXXXarch systemd[1]: Starting Reboot...
Sep 30 06:07:50 XXXXarch systemd[1]: Shutting down.
Sep 30 06:07:50 XXXXarch systemd-shutdown[1]: Sending SIGTERM to remaining processes...
Sep 30 06:07:50 XXXXarch systemd-journal[179]: Journal stopped
Beyond that, it's just an annoyance to have to sit through a 90s countdown before I can shut down.
Thanks for any input,
Andre
Offline
but just in case someone who understands these things wants to look at the relevant portion of my journal from the last shutdown, and finds it to be NOT a systemd bug but something else that I (or someone else) can fix
You can determine whether or not this is a systemd-related bug by simply downgrading the package.
If that fixes it, be sure to file a detailed bug report upstream.
EDIT: Also: https://wiki.freedesktop.org/www/Softwa … eventually
EDIT2: This bit looks relevant:
Sep 30 06:06:28 XXXXarch systemd-coredump[7843]: Process 5856 (BrowserBlocking) of user 1000 dumped core. Sep 30 06:07:50 XXXXarch systemd[1]: session-c1.scope: Stopping timed out. Killing.
Last edited by Head_on_a_Stick (2015-09-30 11:57:54)
Freedom for Öcalan!
Offline
If I DID downgrade systemd, would I need to downgrade something else with it, or just that? (How would I know?)
But again, my point is that this same behavior has been seen for a long time with systemd. I don't think it's any real issue elsewhere.
Offline
Well this MAY (without my input) have gone away on its own. But since we all know computers don't work that way...?
I found that if I booted into my Openbox desktop, did nothing for a couple of minutes, then rebooted, I didn't have the timeout problem. OK, so then I booted and launched Chrome, did nothing else, rebooted, and I have the timeout problem. I repeated this a few times each way to verify this behavior - leading me to believe that Chrome was causing this (which was hinted at in my code windows above).
So then I tried the same thing with Firefox (which I reasoned was also Internet-related, but NOT from the AUR. Firefox, however, did the same thing. Hmmm.
So then I tried just some random program, thinking that maybe it was caused by the user starting a program, as opposed to doing nothing, then rebooting. Libreoffice Write did not cause the problem.
So at the next reboot I tried Chrome again. It worked perfectly on reboot. Firefox the same. I verified it by a few tries. The problem stopped on its own.
Since I did nothing to solve it, and it DOES seem to mirror a recurring issue with systemd, I'm not marking it solved. If anyone has any ideas as to what happened, or where to look for clues, I'm all ears...
Offline
I'm pretty sure this is a systemd bug. I also have been having this bug since upgrading both my main desktop Arch install as well as my laptop's systemd package. In the case of my laptop it can't seem to stop anything. On my desktop I have the added bonus of it failing to dismount my drives. I've been trying to fix my desktop first since unclean dismounts is a serious issue in my eyes.
[edit] I should specify that I'm referring to the 225 -> 226 upgrade. Not the 226-1 -> 226-3 update.
Last edited by nstgc (2015-10-01 17:00:20)
Offline
Update: The problem really hasn't gone away. It just presents itself according to its own desires and schedule. It doesn't always happen on reboot/shutdown, but it sometimes does. I have no idea why one session should cause this while another doesn't. And based on searches of this problem in the past, I don't think it actually matters much what the logs say since all of it is just a symptom of another problem that is NOT being logged. In my case, I suspect using Chrome is somehow "causing" it, but as that shouldn't matter, and as I am doing nothing unusual with Chrome...
Please do not interpret this as an anti-systemd screed. I am neutral on the subject. This problem has apparently be happening for years now.
Offline
I also have this problem since the 225 -> 226 upgrade.
I have tried all workarounds mentioned in this bugreport:
https://bugs.freedesktop.org/show_bug.cgi?id=70593
None of them have removed the issue.
Of course using the -f option works for an immediate shutdown, but I don't feel like something this drastic is appropriate for a 90s annoyance...
Offline
That bug report is about 2 years old. I've only started getting issues recently. Wouldn't that mean that its a new bug? (That's a question, not an interrogative statement.)
Offline
I am not sure, the symptoms are the same, the actual issue might be different. But as @andrekp mentioned, this thing has been going around for quite a time.
I too have only started getting the issues since the upgrade mentioned, do you think we should file a new bug report as it seems to happen to quite a few systems?
EDIT: what caught my eye in the bug report linked before is that it spans from 2013-2015 and the workarounds vary from system to system and don't work for everyone equally as one would expect.
Last edited by karlch (2015-10-02 17:00:03)
Offline
I am experiencing the same problem after I upgraded from 226-1 to 226-3. But it only happened occasionally so I am not sure if it was introduced long before 226-1 (I rarely rebooted my machine unless there was a major upgrade)
Offline
I added the shutdown hook to mkinitpcio.conf and that seems to have worked for me.
Offline
I don't think that the 'shutdown' hook is the solution, because I had it enabled in mkinitcpio, and I still got the 90s delay when shutting down.
Searching threads with similar issues with shutdown in recent systemd releases, I saw this reply from Poettering:
Support for the unified cgroup hierarchy is now merged, see #1116. If you boot with that turned on the problem will go away now.
The link is https://github.com/systemd/systemd/pull … -137005614
I am trying that, with the 'systemd' hook in mkinitcpio (removing the 'shutdown' hook, because the sd-shutdown hook is automatically used now).
And the 90s delay seems not to occur any longer with this kernel command line option and the systemd hook in mkinitcpio.
I get this warning in journal:
oct. 04 10:07:39 localhost kernel: cgroup: sane_behavior: this is still under development and its behaviors will change, proceed at your own risk
but the line:
sept. 21 23:13:13 arch64 systemd[1]: session-c1.scope: Unit entered failed state.
which shows the 90s delay, is not there any longer.
This needs confirmation from further shutdowns in later days.
Others could test if this improves the shutting down issue, for them also.
Résumé of what I did and try now:
oct. 04 10:07:39 localhost kernel: Kernel command line: BOOT_IMAGE=../vmlinuz-linux root=UUID=xxxxxxxxxxxxxxxx rw vga=773 systemd.unified_cgroup_hierarchy=1 initrd=../intel-ucode.img,../initramfs-linux.img
and in /etc/mkinitcpio.conf:
#HOOKS="base udev autodetect modconf block filesystems keyboard shutdown"
HOOKS="base systemd autodetect modconf block keyboard sd-vconsole fsck filesystems"
Last edited by berbae (2015-10-04 10:12:52)
Offline
Yeah. Shutdown isn't a solution. Last night I had the 90s wait with session-c1.scope failing to shutdown. However, at least everything dismounted.
I'll try the systemd hook, but first I'm going to see if the shutdown hook at least can ensure everything is unmounted.
Offline
Using the 'systemd' hook seems to solve it for me. I had a safe way to trigger the delay and I do not receive it anymore, thank you!
EDIT: Just reappeared, so I guess I shouted out too early, has been a while though so it seems less frequent at least ...
Last edited by karlch (2015-10-09 15:15:37)
Offline
I don't think that the 'shutdown' hook is the solution, because I had it enabled in mkinitcpio, and I still got the 90s delay when shutting down.
Searching threads with similar issues with shutdown in recent systemd releases, I saw this reply from Poettering:
Support for the unified cgroup hierarchy is now merged, see #1116. If you boot with that turned on the problem will go away now.
The link is https://github.com/systemd/systemd/pull … -137005614
I am trying that, with the 'systemd' hook in mkinitcpio (removing the 'shutdown' hook, because the sd-shutdown hook is automatically used now).
And the 90s delay seems not to occur any longer with this kernel command line option and the systemd hook in mkinitcpio.
I get this warning in journal:
oct. 04 10:07:39 localhost kernel: cgroup: sane_behavior: this is still under development and its behaviors will change, proceed at your own risk
but the line:
sept. 21 23:13:13 arch64 systemd[1]: session-c1.scope: Unit entered failed state.
which shows the 90s delay, is not there any longer.
This needs confirmation from further shutdowns in later days.
Others could test if this improves the shutting down issue, for them also.
Résumé of what I did and try now:
oct. 04 10:07:39 localhost kernel: Kernel command line: BOOT_IMAGE=../vmlinuz-linux root=UUID=xxxxxxxxxxxxxxxx rw vga=773 systemd.unified_cgroup_hierarchy=1 initrd=../intel-ucode.img,../initramfs-linux.img
and in /etc/mkinitcpio.conf:
#HOOKS="base udev autodetect modconf block filesystems keyboard shutdown" HOOKS="base systemd autodetect modconf block keyboard sd-vconsole fsck filesystems"
My mkinitcpio.conf has:
HOOKS="base udev autodetect modconf block filesystems keyboard fsck"
This makes me wonder if I should still have base and udev in there at all? Or is there any real harm in having all three "base udev systemd..." in there? (Or is NOT using the systemd hook the real problem here?)
Offline
I do not say that NOT using the 'systemd' hook is the problem, but I wanted to replace the 'shutdown' hook with the default 'sd-shutdown', so I think that using the 'systemd' hook is preferable for that: so everything is controlled by systemd, even in the initramfs, at boot and at shutdown.
But I do not say that the shutdown issue is caused by not doing this. It's just to have things netter and cleaner, in my opinion.
The 'systemd' hook should replace the 'udev' hook, but the 'base' hook can be kept to have an emergency system, but it's not required.
The shutdown issue results from a user session entering a failed state, but the cause is not clear to me; this seems linked to cgroups management, from what I understand reading threads related to similar issues; but I do not fully grasp all the mechanism about that.
I tested the 'systemd.unified_cgroup_hierarchy=1' kernel command line parameter, thinking that it will improve things, especially with the shutdown.
I seem not to have the 90s delay with it enabled, but I get that now:
oct. 04 23:18:20 arch64 systemd[1]: Failed unmounting Temporary Directory.
oct. 04 23:18:20 arch64 systemd[1]: Failed unmounting /home.
without the 90s delay and the 'session-c1.scope: Unit entered failed state'.
So, I am not sure this is the solution either.
I am trying now the 'ctrl-alt-del.target' linked to the 'poweroff.target' with:
# systemctl enable poweroff.target
So I can completely log out before powering off, which I think would clean the user session, and then power off the machine with ctrl-alt-del without being logged in.
I test that now with and without the ''systemd.unified_cgroup_hierarchy=1' parameter.
It takes days to be able to reach a conclusion, because I do not want to shutdown many times in a row...
I am still testing...
Offline
So the systemd hook can completely replace the udev hook?
Offline
@berbae, I have 3 systems with simular setup, xfce4-session with i3. systemd 226 all upgraded.
Sometimes with logout I manage to have clean shutdown (now wait for c2,c3 session 1min30s).
I disabled lightdm, Networkmanager to see if these sessions were causing the problem. It still persisted.
Also shutting down from xfce4 panel gives the waiting sometimes. I checked many times to see the what session is still running, I cannot find nothing until now.
From what I understand goettering (github link) says: find your troubling session and shut it down clean. Is this the right way?
poweroff systemctl poweroff -i shutdown -h
all sometimes shutdown immediately sometimes with the: a stop job is still running for session c2(3) with user bla. 1min30s.
If you find something let us know.
Offline
On my machine, I have a reproducible way to trigger the shutdown/poweroff problem:
Playing some Youtube videos with chromium/chromium-pepper-flash, then quitting chromium.
That leaves chromium-pepper-flash processes in an undefined state, still attached to the user session scope;
these processes are still there, even after logout, and the session scope enters an unclean state in systemd;
the processes cannot be killed with kill, killall or pkill commands; they are in an undefined state (maybe zombies).
The scope doesn't appear with the command 'loginctl list-sessions', but it's there in the output of 'systemd-cgls' with the chromium processes.
So having this way to reproduce the problem, I can compare the results with and without the 'systemd.unified_cgroup_hierarchy=1' parameter.
1) Without the parameter, when I log out the user session and shutdown/poweroff
The journal shows:
oct. 08 23:22:32 arch64 login[319]: pam_unix(login:session): session closed for user berbae
oct. 08 23:22:32 arch64 systemd[1]: getty@tty1.service: Service has no hold-off time, scheduling restart.
oct. 08 23:22:32 arch64 systemd[1]: Stopping Session c1 of user berbae.
...
90s delay
...
oct. 08 23:24:02 arch64 systemd[1]: session-c1.scope: Stopping timed out. Killing.
oct. 08 23:24:02 arch64 systemd[1]: Stopped Session c1 of user berbae.
oct. 08 23:24:02 arch64 systemd[1]: session-c1.scope: Unit entered failed state.
oct. 08 23:24:02 arch64 systemd-logind[317]: Removed session c1.
oct. 08 23:24:02 arch64 systemd[1]: Stopping User Manager for UID 1000...
...
the shutdown/poweroff continues normally to the end
there is no problem to unmount /tmp and /home
...
2) With the parameter enabled, when I log out and shutdown/poweroff in the same conditions as in the preceding case
The journal now shows:
oct. 07 23:24:09 arch64 login[263]: pam_unix(login:session): session closed for user berbae
oct. 07 23:24:09 arch64 systemd[1]: getty@tty1.service: Service has no hold-off time, scheduling restart.
oct. 07 23:24:09 arch64 systemd[1]: Stopped Getty on tty1.
oct. 07 23:24:09 arch64 systemd[1]: Started Getty on tty1.
oct. 07 23:24:09 arch64 systemd[1]: Stopped Session c1 of user berbae. <---------------there is no 90s delay
oct. 07 23:24:09 arch64 systemd-logind[220]: Removed session c1.
oct. 07 23:24:09 arch64 systemd[1]: Stopping User Manager for UID 1000...
...
the shutdown/poweroff continues normally
...
until
oct. 07 23:24:14 arch64 mkinitcpio[4151]: -> Running build hook: [sd-shutdown]
...
oct. 07 23:24:14 arch64 systemd[1]: Unmounting Temporary Directory...
oct. 07 23:24:14 arch64 systemd[1]: Unmounting /home...
oct. 07 23:24:14 arch64 systemd[1]: Unmounting /opt...
oct. 07 23:24:14 arch64 systemd[1]: Unmounting /boot...
oct. 07 23:24:14 arch64 umount[4238]: umount: /home: target is busy
oct. 07 23:24:14 arch64 umount[4238]: (In some cases useful info about processes that
oct. 07 23:24:14 arch64 umount[4238]: use the device is found by lsof(8) or fuser(1).)
oct. 07 23:24:14 arch64 umount[4237]: umount: /tmp: target is busy
oct. 07 23:24:14 arch64 umount[4237]: (In some cases useful info about processes that
oct. 07 23:24:14 arch64 umount[4237]: use the device is found by lsof(8) or fuser(1).)
oct. 07 23:24:14 arch64 systemd[1]: tmp.mount: Mount process exited, code=exited status=32
oct. 07 23:24:14 arch64 systemd[1]: Failed unmounting Temporary Directory.
oct. 07 23:24:14 arch64 systemd[1]: home.mount: Mount process exited, code=exited status=32
oct. 07 23:24:14 arch64 systemd[1]: Failed unmounting /home.
...
then the shutdown/poweroff continues to the end.
So the problem seems to disappear with the kernel parameter, but it is not solved in fact, because now the unmounting of /tmp and /home
is prevented, these file system being still busy.
'fuser' reveals that this is due to the same processes remaining in indeterminate state as in the preceding case.
All these tests were done with the 'systemd' hook in mkinitcpio which I intend to keep for now.
3) My conclusions:
As logging out before shutdown/poweroff doesn't solve the issue, I remove the 'ctrl-alt-del.target' linked to 'poweroff.target';
(I always regarded this as a security risk and I mask the 'ctrl-alt-del.target' unit).
As the 'systemd.unified_cgroup_hierarchy=1' kernel parameter only seems to solve the issue, and it is not quite ready to use
presently, I removed it also.
The shutdown/poweroff 90s delay problem is due to a user session scope going into an unclean state;
when this happens systemd waits during this delay for the state to become clean again, then it continues the shutdown/poweroff
sequence until the end.
So this may not be a systemd bug introduced recently, but could reveal problems with applis not quitting cleanly, leaving
processes behind in a zombie or undefined state (like for example chromium/chromium-pepper-flash).
This is the feeling I have come to, after all these tests and the reading of threads related to problems similar to this one.
The solutions should come from the applis not quitting cleanly; I don't think the systemd developers will do something about
that.
So to me, there is no systemd solution to this issue presently.
Last edited by berbae (2015-10-09 14:24:10)
Offline
Interesting. Thanks for your work on this. I think I saw a similar thing with pepper-flash in at least some of my journals after rebooting after this problem. Maybe the Chrome dev needs to be nudged...
While it may not be an actual systemd problem, I do think the system should be able to poweroff reasonably even if some applications aren't closing properly. Maybe with an appropriate warning that is slightly more obvious than it clearly is (considering the time this has been going on). Maybe that's a systemd job, or maybe not, but did this happen before systemd? (did the implementation of systemd FIND the extant problem, or cause it to be one?)
In any case, is there a safe way to reduce the 90s wait to some more reasonable period (like 30s), that maintains system integrity, while not forcing every shutdown to take a couple of minutes. (I've seen it in a couple of config files, but I'm not sure which is being invoked)
-Andre
On my machine, I have a reproducible way to trigger the shutdown/poweroff problem:
Playing some Youtube videos with chromium/chromium-pepper-flash, then quitting chromium.
That leaves chromium-pepper-flash processes in an undefined state, still attached to the user session scope;
these processes are still there, even after logout, and the session scope enters an unclean state in systemd;
the processes cannot be killed with kill, killall or pkill commands; they are in an undefined state (maybe zombies).The scope doesn't appear with the command 'loginctl list-sessions', but it's there in the output of 'systemd-cgls' with the chromium processes.
So having this way to reproduce the problem, I can compare the results with and without the 'systemd.unified_cgroup_hierarchy=1' parameter.
1) Without the parameter, when I log out the user session and shutdown/poweroff
The journal shows:
oct. 08 23:22:32 arch64 login[319]: pam_unix(login:session): session closed for user berbae oct. 08 23:22:32 arch64 systemd[1]: getty@tty1.service: Service has no hold-off time, scheduling restart. oct. 08 23:22:32 arch64 systemd[1]: Stopping Session c1 of user berbae. ... 90s delay ... oct. 08 23:24:02 arch64 systemd[1]: session-c1.scope: Stopping timed out. Killing. oct. 08 23:24:02 arch64 systemd[1]: Stopped Session c1 of user berbae. oct. 08 23:24:02 arch64 systemd[1]: session-c1.scope: Unit entered failed state. oct. 08 23:24:02 arch64 systemd-logind[317]: Removed session c1. oct. 08 23:24:02 arch64 systemd[1]: Stopping User Manager for UID 1000... ... the shutdown/poweroff continues normally to the end there is no problem to unmount /tmp and /home ...
2) With the parameter enabled, when I log out and shutdown/poweroff in the same conditions as in the preceding case
The journal now shows:
oct. 07 23:24:09 arch64 login[263]: pam_unix(login:session): session closed for user berbae oct. 07 23:24:09 arch64 systemd[1]: getty@tty1.service: Service has no hold-off time, scheduling restart. oct. 07 23:24:09 arch64 systemd[1]: Stopped Getty on tty1. oct. 07 23:24:09 arch64 systemd[1]: Started Getty on tty1. oct. 07 23:24:09 arch64 systemd[1]: Stopped Session c1 of user berbae. <---------------there is no 90s delay oct. 07 23:24:09 arch64 systemd-logind[220]: Removed session c1. oct. 07 23:24:09 arch64 systemd[1]: Stopping User Manager for UID 1000... ... the shutdown/poweroff continues normally ... until oct. 07 23:24:14 arch64 mkinitcpio[4151]: -> Running build hook: [sd-shutdown] ... oct. 07 23:24:14 arch64 systemd[1]: Unmounting Temporary Directory... oct. 07 23:24:14 arch64 systemd[1]: Unmounting /home... oct. 07 23:24:14 arch64 systemd[1]: Unmounting /opt... oct. 07 23:24:14 arch64 systemd[1]: Unmounting /boot... oct. 07 23:24:14 arch64 umount[4238]: umount: /home: target is busy oct. 07 23:24:14 arch64 umount[4238]: (In some cases useful info about processes that oct. 07 23:24:14 arch64 umount[4238]: use the device is found by lsof(8) or fuser(1).) oct. 07 23:24:14 arch64 umount[4237]: umount: /tmp: target is busy oct. 07 23:24:14 arch64 umount[4237]: (In some cases useful info about processes that oct. 07 23:24:14 arch64 umount[4237]: use the device is found by lsof(8) or fuser(1).) oct. 07 23:24:14 arch64 systemd[1]: tmp.mount: Mount process exited, code=exited status=32 oct. 07 23:24:14 arch64 systemd[1]: Failed unmounting Temporary Directory. oct. 07 23:24:14 arch64 systemd[1]: home.mount: Mount process exited, code=exited status=32 oct. 07 23:24:14 arch64 systemd[1]: Failed unmounting /home. ... then the shutdown/poweroff continues to the end.
So the problem seems to disappear with the kernel parameter, but it is not solved in fact, because now the unmounting of /tmp and /home
is prevented, these file system being still busy.
'fuser' reveals that this is due to the same processes remaining in indeterminate state as in the preceding case.All these tests were done with the 'systemd' hook in mkinitcpio which I intend to keep for now.
3) My conclusions:
As logging out before shutdown/poweroff doesn't solve the issue, I remove the 'ctrl-alt-del.target' linked to 'poweroff.target';
(I always regarded this as a security risk and I mask the 'ctrl-alt-del.target' unit).As the 'systemd.unified_cgroup_hierarchy=1' kernel parameter only seems to solve the issue, and it is not quite ready to use
presently, I removed it also.The shutdown/poweroff 90s delay problem is due to a user session scope going into an unclean state;
when this happens systemd waits during this delay for the state to become clean again, then it continues the shutdown/poweroff
sequence until the end.So this may not be a systemd bug introduced recently, but could reveal problems with applis not quitting cleanly, leaving
processes behind in a zombie or undefined state (like for example chromium/chromium-pepper-flash).This is the feeling I have come to, after all these tests and the reading of threads related to problems similar to this one.
The solutions should come from the applis not quitting cleanly; I don't think the systemd developers will do something about
that.So to me, there is no systemd solution to this issue presently.
Offline
@berbae, Thanks for the work. It is indeed a smart way you reproduced. I just don't understand why such a long delay the shutdown must have.
In my opinion 90s is not needed, as when the user wants to shutdown it is a definite, absolute process. It must not go on for even 2s.
When I start up my system: i3 with xfce4 panels, it starts my session: urxvt (2x), thunar,geany,googlechrome. When I shutdown not always I manage/want to run down all the workspaces to stop all the running apps.
This is just my opinion.
From the github discussion, I thought to understand the developers also have kind of standoff.
Again thank you @berbae
Last edited by boban_dj (2015-10-10 05:23:50)
Offline
To reduce the delay from the default value 90s, it is in the /etc/systemd/system.conf file:
from
#DefaultTimeoutStartSec=90s
#DefaultTimeoutStopSec=90s
to
DefaultTimeoutStartSec=10s
DefaultTimeoutStopSec=10s
followed by:
# systemctl daemon-reload
I think that 10s for modern computers should be a long enough duration for just current usage.
While it may not be an actual systemd problem, I do think the system should be able to poweroff reasonably even if some applications aren't closing properly
That's already the case, without any changes, the system is shutting down cleanly, after the waiting delay in the case of an user session scope being in a unclean state.
This is seen in the fact that the /tmp and /home file systems, and all the others, are unmounting without errors, which means all unclean things still linked to them have disappeared.
When I shutdown not always I manage/want to run down all the workspaces to stop all the running apps.
The problem is not with applis continuing running at shutdown, if they quit cleanly when receiving the SIGTERM signal, leaving nothing behind them.
The problem comes when applis, like chrome or chromium/chromium-pepper-flash and others, do not act in this way to finish running.
Will their devs be opened to changed that? I don't know.
Last edited by berbae (2015-10-10 21:16:00)
Offline
My system not unmounting my home directory seems to be unrelated to c1-session 's failure to close. It happened again today, but c1 closed. Ran mkinicpio with the systemd hook. I was only using the shutdown hook. Hopefully I'll have some luck with that.
I'm not using any special kernel parameters at boot . I also am not having any issues with /tmp unmounting.
Everytime I shutdown my computer, for about a week, I have exited Gnome, and logged out of all getties. I then log back into at TTY1, and run
killall --user nstgc
sync
systemctl poweroff
Offline
How are you all starting your sessions? Via a login manager (which one?), or TTY?
Sakura:-
Mobo: MSI MAG X570S TORPEDO MAX // Processor: AMD Ryzen 9 5950X @4.9GHz // GFX: AMD Radeon RX 5700 XT // RAM: 32GB (4x 8GB) Corsair DDR4 (@ 3000MHz) // Storage: 1x 3TB HDD, 6x 1TB SSD, 2x 120GB SSD, 1x 275GB M2 SSD
Making lemonade from lemons since 2015.
Offline
@nstgc
Your problem is definitely different, because "zombie" processes cannot be killed with the killall command.
All messages 'Failed unmounting /home' have not the same origins.
If c1 session scope closes, that's something outside the session which keeps /home busy.
Did you try 'fuser -m /home' to list what processes still use the file system?
BTW you can remove the 'shutdown' hook, because there is the default 'sd-shutdown' hook which is then automatically run.
Offline