You are not logged in.

#1 2018-09-08 02:47:49

jdoggsc
Member
Registered: 2011-08-15
Posts: 98

computer takes a long time to suspend

Hey all,

I have been dealing with this problem for a while now, and I want to finally figure it out.  I need some help, though.  After I tell my computer to suspend, it consistently takes nearly exactly 2 minutes before it enters sleep.  The fact that it's a predicable 120 seconds made me think it was like that systemd 1m 30sec countdown for services that won't shut down cleanly that I've seen in previous situations when trying to shut down or restart.   

I've checked journalctl output and dmesg output, but I've found nothing that I can run with. After reading around about debugging this, I found that it can be useful to append "initcall_debug ignore_loglevel" parameters to the kernel boot line in grub, so the following output includes this information, but I don't gather any new value in it.


I initiated a suspend command using systemctl suspend.

journalctl output (dmesg is identical, but with cryptic timestamps):

Sep 07 21:29:00 jdoggsc-lappy sudo[821]:     jdoggsc : TTY=pts/2 ; PWD=/home/jdoggsc ; USER=root ; COMMAND=/usr/bin/systemctl suspend
Sep 07 21:29:00 jdoggsc-lappy sudo[821]: pam_unix(sudo:session): session opened for user root by (uid=0)
Sep 07 21:29:00 jdoggsc-lappy sudo[821]: pam_unix(sudo:session): session closed for user root
Sep 07 21:29:01 jdoggsc-lappy systemd[1]: Reached target Sleep.
Sep 07 21:29:01 jdoggsc-lappy systemd[1]: Starting Suspend...
Sep 07 21:29:01 jdoggsc-lappy systemd-sleep[834]: Suspending system...
Sep 07 21:29:01 jdoggsc-lappy kernel: PM: suspend entry (deep)
Sep 07 21:29:01 jdoggsc-lappy kernel: PM: Syncing filesystems ... done.
Sep 07 21:31:03 jdoggsc-lappy kernel: Freezing user space processes ... (elapsed 0.001 seconds) done.
Sep 07 21:31:03 jdoggsc-lappy kernel: OOM killer disabled.
Sep 07 21:31:03 jdoggsc-lappy kernel: Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
Sep 07 21:31:03 jdoggsc-lappy kernel: Suspending console(s) (use no_console_suspend to debug)
Sep 07 21:31:03 jdoggsc-lappy kernel: input input19: calling input_dev_suspend+0x0/0x50 @ 834, parent: card0
Sep 07 21:31:03 jdoggsc-lappy kernel: input input19: input_dev_suspend+0x0/0x50 returned 0 after 0 usecs
Sep 07 21:31:03 jdoggsc-lappy kernel: input input18: calling input_dev_suspend+0x0/0x50 @ 834, parent: card0
Sep 07 21:31:03 jdoggsc-lappy kernel: input input18: input_dev_suspend+0x0/0x50 returned 0 after 0 usecs
Sep 07 21:31:03 jdoggsc-lappy kernel: input input17: calling input_dev_suspend+0x0/0x50 @ 834, parent: card0
Sep 07 21:31:03 jdoggsc-lappy kernel: input input17: input_dev_suspend+0x0/0x50 returned 0 after 0 usecs
Sep 07 21:31:03 jdoggsc-lappy kernel: input input16: calling input_dev_suspend+0x0/0x50 @ 834, parent: card0
Sep 07 21:31:03 jdoggsc-lappy kernel: input input16: input_dev_suspend+0x0/0x50 returned 0 after 0 usecs
Sep 07 21:31:03 jdoggsc-lappy kernel: input input13: calling input_dev_suspend+0x0/0x50 @ 834, parent: serio1
Sep 07 21:31:04 jdoggsc-lappy kernel: input input13: input_dev_suspend+0x0/0x50 returned 0 after 0 usecs
Sep 07 21:31:04 jdoggsc-lappy kernel: input input15: calling input_dev_suspend+0x0/0x50 @ 834, parent: 1-1.5:1.0
Sep 07 21:31:04 jdoggsc-lappy kernel: input input15: input_dev_suspend+0x0/0x50 returned 0 after 0 usecs
Sep 07 21:31:04 jdoggsc-lappy kernel: coretemp coretemp.0: calling platform_pm_suspend+0x0/0x50 @ 834, parent: platform

obviously it keeps going, and suspends successfully, but the 2-minute pause is clearly visible.

What is happening between 21:29:01 and 21:31:03?  Syncing filesystems seems to have completed by 21:29:01, and freezing user space processes only took 0.001 seconds, but happened at 21:31:03, so it's not like it took 2 minutes to freeze user space processes, right?

Please let me know if you have any ideas of causes or things to check.

Thanks,

Offline

#2 2018-09-09 23:42:58

jdoggsc
Member
Registered: 2011-08-15
Posts: 98

Re: computer takes a long time to suspend

I just found this on systemd's github page.  It's the same issue I'm having, but with a longer delay (5-6 minutes rather than 2).  As the OP in the thread reported, I also have the same behavior whether I initiate the suspend with systemd or by typing

echo mem > /sys/power/state

The systemd dev closed it as not being an issue with systemd, but rather with the kernel.  Yay! I posted this in the right place!

The question now is where to start (and generally, how) to debug the kernel?  I've had this computer since the beginning part of this year, so I've been through several kernel versions.  It's been this way since I first put arch on it, and nothing has ever changed in this regard.

Offline

#3 2018-09-10 00:03:39

loqs
Member
Registered: 2014-03-06
Posts: 17,196

Re: computer takes a long time to suspend

The systemd bug report had a linked kernel bug that had a debugging patch attached to it.  https://bugzilla.kernel.org/show_bug.cgi?id=101101
You could try seeing if that patch still applies and post updated journal output from the patched kernel if it does.

Offline

#4 2018-09-10 00:29:13

jdoggsc
Member
Registered: 2011-08-15
Posts: 98

Re: computer takes a long time to suspend

yes. I just finished reading through that kernel.org thread.  I'll give it a try, I think.  It's frustrating to me that the OP dismissed the issue since he only wanted to unplug his external hdd during suspend, and found that he could do that after a few seconds.  It's a bigger frustration for me.  Even if I install the patch (which, I think, would only give additional debug messages), I will probably need to start a thread on kernel.org like snarkatt did.  I'd like to probe the minds of the arch community first, though, since the OP of those threads was using Manjaro.  It might be coincidence, but we'll see.

Offline

#5 2018-09-11 04:32:42

jdoggsc
Member
Registered: 2011-08-15
Posts: 98

Re: computer takes a long time to suspend

So i spent a couple evenings rebuilding a kernel using the Arch Build System walk-through.  I changed the PKGBUILD to include the for_101101.patch file from the bugzilla.kernel.org page.  From what I could see, the patch process worked, but it happened so quickly it was hard to see that step explicitly.  Anyway, I just rebooted after installing that custom kernel, but the output looks the same.

Jun 22 11:05:03 jdoggsc-lappy polkitd[623]: Registered Authentication Agent for unix-process:773:3044 (system bus name :1.64 [/usr/bin/pkttyagent --notify-fd 5 --fallback], object path /org/free>
Jun 22 11:05:03 jdoggsc-lappy polkitd[623]: Unregistered Authentication Agent for unix-process:773:3044 (system bus name :1.64, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, local>
Jun 22 11:05:04 jdoggsc-lappy systemd[1]: Reached target Sleep.
Jun 22 11:05:04 jdoggsc-lappy systemd[1]: Starting Suspend...
Jun 22 11:05:04 jdoggsc-lappy systemd-sleep[789]: Suspending system...
Jun 22 11:05:04 jdoggsc-lappy kernel: PM: suspend entry (deep)
Jun 22 11:09:38 jdoggsc-lappy kernel: PM: Syncing filesystems ... done.
Jun 22 11:09:38 jdoggsc-lappy kernel: Freezing user space processes ... (elapsed 0.001 seconds) done.
Jun 22 11:09:38 jdoggsc-lappy kernel: OOM killer disabled.
Jun 22 11:09:38 jdoggsc-lappy kernel: Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
Jun 22 11:09:38 jdoggsc-lappy kernel: Suspending console(s) (use no_console_suspend to debug)

except here, systemd shows a 4-minute delay from initiating suspend to when the computer actually sleeps.  I'll give it another try tomorrow, I think, to make sure the patch applied correctly.  I'll check the patched source as well, to ensure the change made was consistent with patch contents--maybe even try to make a judgment as to whether the patch is still applicable.  it patched without problems on the 4.18.7 kernel source that I downloaded first from kernel.org before opting to use the ABS.

Offline

#6 2018-09-13 01:45:27

jdoggsc
Member
Registered: 2011-08-15
Posts: 98

Re: computer takes a long time to suspend

update:  I added messages and pauses to the stdout of the PKGBUILD script and verified that the patch applied correctly.  I checked the source files to confirm.  However, using that kernel, I don't see any of those messages in my journal when I suspend.  That's a dead end, unfortunately.

Offline

#7 2018-09-24 01:00:19

jdoggsc
Member
Registered: 2011-08-15
Posts: 98

Re: computer takes a long time to suspend

update: I guess I didn't realize when I compiled custom kernel through the ABS system that it creates a separate entry in GRUB. By default, it does not boot the custom kernel.  That is why I never saw those messages.  The messages come as programmed when I use the custom kernel, but little is gleaned from them.

The difficulty in debugging this is that I need to know at which point in the suspend process the delay happens.  To do that, I have 2 options: dmesg and journalctl

dmesg shows kernel messages as a timestep in S.nanosec format.  Those sources show drastically different things:

I primarily edited suspend.c in the kernel source under (kernel/power/suspend.c) and added printouts so I can see the timestep when those points in the process are hit. 

This is what journalctl shows:

Sep 23 20:34:39 jdoggsc-lappy kernel: PM: suspend entry (deep)
Sep 23 20:34:39 jdoggsc-lappy kernel: PM: enter_state: entered
Sep 23 20:34:39 jdoggsc-lappy kernel: PM: enter_state: before pm_mutex lock
Sep 23 20:34:39 jdoggsc-lappy kernel: PM: enter_state: pm_mutex lock acquired
Sep 23 20:34:39 jdoggsc-lappy kernel: PM: Syncing filesystems ... 
Sep 23 20:34:39 jdoggsc-lappy kernel: PM: Preparing system for sleep (deep)
Sep 23 20:36:43 jdoggsc-lappy kernel: Freezing user space processes ... (elapsed 0.001 seconds) done.
Sep 23 20:36:43 jdoggsc-lappy kernel: OOM killer disabled.
Sep 23 20:36:43 jdoggsc-lappy kernel: Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
Sep 23 20:36:43 jdoggsc-lappy kernel: PM: Suspending system (deep)

the delay previously was somewhere around the "Syncing filesystems ..." message.  So, looking at the c-source:

static int enter_state(suspend_state_t state)
{
	int error;
    pr_info("%s: entered\n", __func__); //added by me
	trace_suspend_resume(TPS("suspend_enter"), state, true);
	if (state == PM_SUSPEND_TO_IDLE) {
#ifdef CONFIG_PM_DEBUG
		if (pm_test_level != TEST_NONE && pm_test_level <= TEST_CPUS) {
			pr_warn("Unsupported test mode for suspend to idle, please choose none/freezer/devices/platform.\n");
			return -EAGAIN;
		}
#endif
	} else if (!valid_state(state)) {
		return -EINVAL;
	}
	pr_info("%s: before pm_mutex lock\n", __func__);//added by me
	if (!mutex_trylock(&pm_mutex))
		return -EBUSY;
    pr_info("%s: pm_mutex lock acquired\n", __func__);//added by me
	if (state == PM_SUSPEND_TO_IDLE)
		s2idle_begin();

#ifndef CONFIG_SUSPEND_SKIP_SYNC
	trace_suspend_resume(TPS("sync_filesystems"), 0, true);
	pr_info("Syncing filesystems ... ");
	ksys_sync();
	//pr_cont("done.\n");//edited by me
	trace_suspend_resume(TPS("sync_filesystems"), 0, false);
#endif

	pr_info("Preparing system for sleep (%s)\n", mem_sleep_labels[state]);//edited by me

I wondered if the delay to print "Syncing filesystems ... done." was that the timestamp for that line was for the "done." part, and the delay was because ksys_sync() was taking a long time to execute.  That's why in the code above, the "done." part is commented-out.  The journalctl printout this time (versus my previous post) seems to indicate this completed before the delay.

dmesg output:

[   83.358713] PM: suspend entry (deep)
[   83.358717] PM: enter_state: entered
[   83.358720] PM: enter_state: before pm_mutex lock
[   83.358723] PM: enter_state: pm_mutex lock acquired
[   83.358724] PM: Syncing filesystems ... 
[   83.369092] PM: Preparing system for sleep (deep)
[   83.369698] Freezing user space processes ... (elapsed 0.001 seconds) done.
[   83.371163] OOM killer disabled.
[   83.371165] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[   83.372414] PM: Suspending system (deep)
[   83.372542] Suspending console(s) (use no_console_suspend to debug)
[   83.373011] wlp2s0: deauthenticating from xx:xx:xx:xx:xx:xx by local choice (Reason: 3=DEAUTH_LEAVING)
[   83.393004] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[   83.396715] sd 0:0:0:0: [sda] Stopping disk
[   83.770548] ACPI: EC: interrupt blocked
[   83.786863] ACPI: Preparing to enter system sleep state S3
[   83.789004] ACPI: EC: event blocked
[   83.789006] ACPI: EC: EC stopped
[   83.789008] PM: Saving platform NVS memory
[   83.790412] Disabling non-boot CPUs ...
[   83.847815] smpboot: CPU 1 is now offline
[   83.887931] smpboot: CPU 2 is now offline
[   83.921127] smpboot: CPU 3 is now offline

this feels useless to me because it indicates the entire process happening in 0.6 seconds, whereas the process actually took 2 minutes.

I read on a forum that I cannot easily find anymore, that someone had similar issues caused by their SSD.  I transferred my SSD from my previous computer (which did not have these issues), but decided to check for improvement anyway by removing the ssd and booting from a live disk vis usb.  No change.

I was relying on timestamps to help me find the process hang-up, but the delay printed in journactl seems to be as long as the time from when I issue the suspend command to when i push the power button to resume.  Journalctl cannot chronicle the needed information for me because it's already been stopped as a process before the hang-up occurs.

No one has responded yet on kernel.org's bugzilla.

I'm still open to suggestions.

Offline

#8 2018-09-24 06:38:09

seth
Member
Registered: 2012-09-03
Posts: 49,992

Re: computer takes a long time to suspend

sync # FS is now sync'd - how long did that take?
echo mem > /sys/power/state # sleeping faster now?

Offline

#9 2018-09-25 22:06:13

jdoggsc
Member
Registered: 2011-08-15
Posts: 98

Re: computer takes a long time to suspend

Thanks for your response. I tried what you suggested.

date;sync;date

result:

Tue Sep 25 17:45:18 EDT 2018
Tue Sep 25 17:45:19 EDT 2018

although this isn't the most precise, it tells the story well enough--it takes less than 1 second for the sync command to complete. Also, no change to suspend time.

I didn't do a good job of summarizing my findings in my last post. My hypothesis is that the delay in my journalctl log is misplaced, and not representative of where in the process the actual delay happens. so I think I need to go down the rabbit hole of kernel c source...

Offline

#10 2018-09-27 05:01:57

jdoggsc
Member
Registered: 2011-08-15
Posts: 98

Re: computer takes a long time to suspend

I found a fascinating tool for kernel developers.  it's comes with the kernel source and is found in the tools/power/pm-graph folder.  it's a python script called sleepgraph.py (examples of its use and how it's aided in kernel development)
it requires some kernel config parameters (which are all enabled in arch's standard kernel) as well as an addition to the kernel boot parameters at the bootloader.  The script analyzes the detailed script and processes the data into a html file that shows macroscopically how long both suspend and resume take and why--whether because of a particular driver or a particular function call, or something else.

The short version: the kernel legitimately seems to be taking ~600ms to suspend--yet 2 minutes to physically sleep. I honestly don't know what to make of this.  Running the script without parameters, the tool issues a rtcwake command 15 seconds after suspending.  Suspend takes 0.6sec and then the computer takes an additional 2 minutes to power off.Then the rtcwake of 15 seconds is effective.  my computer stays off for 15 seconds (2:15 after suspend) and then wakes.

I'm not understanding how the kernel is comprehending this 2:00 wait time...

Offline

#11 2018-09-27 06:39:46

seth
Member
Registered: 2012-09-03
Posts: 49,992

Re: computer takes a long time to suspend

I'm not understanding how the kernel is comprehending this 2:00 wait time...

It asks the rtc ;-)

Try to get rid of all external devices you can get rid off, in doubt, program a "sleep 20; systemctl suspend" and even unplug the keyboard.

Offline

Board footer

Powered by FluxBB