You are not logged in.

#1 2016-04-04 19:55:45

nvteighen
Member
Registered: 2014-03-13
Posts: 31

Hibernation not resuming, sometimes

Hi,
In the past I've already written about this issue, which has driven me absolutely crazy for months, but I hadn't enough knowledge back then to be able to recollect the data that I'm now able to give here for helping me debug this (and eventually, file a proper bug report). I'll try to be as organized as possible so that this thread is really useful to other people.

1. The symptoms

System does not resume after hibernation, but in an apparently random fashion. I cannot identify any conditions in which this happens.

Logs for a succesful hibernation/resume cycle:

-- Logs begin at sáb 2014-07-05 12:49:32 CEST, end at lun 2016-04-04 18:43:22 CEST. --
abr 04 13:54:12 UGI systemd[1]: Reached target Sleep.
abr 04 13:54:12 UGI systemd[1]: Starting Hibernate...
abr 04 13:54:12 UGI systemd-sleep[10096]: Suspending system...
abr 04 13:54:12 UGI kernel: PM: Hibernation mode set to 'platform'
abr 04 15:55:50 UGI kernel: PM: Syncing filesystems ... done.
abr 04 15:55:51 UGI kernel: Freezing user space processes ... (elapsed 0.003 seconds) done.
abr 04 15:55:51 UGI kernel: PM: Marking nosave pages: [mem 0x00000000-0x00000fff]
...
abr 04 15:55:51 UGI kernel: PM: Basic memory bitmaps created
abr 04 15:55:51 UGI kernel: PM: Preallocating image memory... done (allocated 315315 pages)
abr 04 15:55:51 UGI kernel: PM: Allocated 1261260 kbytes in 2.46 seconds (512.70 MB/s)
abr 04 15:55:51 UGI kernel: Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
abr 04 15:55:51 UGI kernel: Suspending console(s) (use no_console_suspend to debug)
abr 04 15:55:51 UGI kernel: PM: freeze of devices complete after 285.061 msecs
abr 04 15:55:51 UGI kernel: PM: late freeze of devices complete after 13.788 msecs
abr 04 15:55:51 UGI kernel: PM: noirq freeze of devices complete after 2.317 msecs
abr 04 15:55:51 UGI kernel: ACPI: Preparing to enter system sleep state S4
...
abr 04 15:55:51 UGI kernel: PM: Creating hibernation image:
abr 04 15:55:51 UGI kernel: PM: Need to copy 313623 pages
abr 04 15:55:51 UGI kernel: PM: Normal pages needed: 313623 + 1024, available pages: 705083
[...]
abr 04 15:55:52 UGI kernel: PM: Image restored successfully.
abr 04 15:55:52 UGI kernel: PM: Basic memory bitmaps freed
abr 04 15:55:52 UGI kernel: Restarting tasks ... done.

Logs for an unsuccesful hibernation/resume cycle on the very same day:

-- Logs begin at sáb 2014-07-05 12:49:32 CEST, end at lun 2016-04-04 18:36:01 CEST. --
abr 04 16:54:52 UGI systemd[1]: Reached target Sleep.
abr 04 16:54:52 UGI systemd[1]: Starting Hibernate...
abr 04 16:54:52 UGI systemd-sleep[12958]: Suspending system...
abr 04 16:54:52 UGI kernel: PM: Hibernation mode set to 'platform'
-- Reboot --
abr 04 18:18:44 localhost systemd-journald[70]: Runtime journal (/run/log/journal/) is 8.0M, max 192.0M, 184.0M free.
abr 04 18:18:44 localhost kernel: microcode: CPU0 microcode updated early to revision 0x29, date = 2013-06-12
abr 04 18:18:44 localhost kernel: Initializing cgroup subsys cpuset
abr 04 18:18:44 localhost kernel: Initializing cgroup subsys cpu
abr 04 18:18:44 localhost kernel: Initializing cgroup subsys cpuacct
abr 04 18:18:44 localhost kernel: Linux version 4.4.5-1-ARCH (builduser@tobias) (gcc version 5.3.0 (GCC) ) #1 SMP PREEMPT Thu Mar 10 07:38:19 CET 2016
abr 04 18:18:44 localhost kernel: Command line: initrd=\intel-ucode.img initrd=\initramfs-linux.img root=UUID=6252a794-fd05-4f8c-9c65-984b106dd193 rw resume=UUID=d5f153dd-712a-4cc7-a712-1e109d9861da
...
abr 04 18:18:45 localhost systemd[1]: Found device TOSHIBA_MQ01ABD050 3.
abr 04 18:18:45 localhost systemd[1]: Starting Resume from hibernation using device /dev/disk/by-uuid/d5f153dd-712a-4cc7-a712-1e109d9861da...
abr 04 18:18:45 localhost kernel: PM: Starting manual resume from disk
abr 04 18:18:45 localhost kernel: PM: Hibernation image partition 8:3 present
abr 04 18:18:45 localhost kernel: PM: Looking for hibernation image.
abr 04 18:18:45 localhost systemd-hibernate-resume[137]: Could not resume from '/dev/disk/by-uuid/d5f153dd-712a-4cc7-a712-1e109d9861da' (8:3).
abr 04 18:18:45 localhost systemd[1]: Started Resume from hibernation using device /dev/disk/by-uuid/d5f153dd-712a-4cc7-a712-1e109d9861da.
abr 04 18:18:45 localhost systemd[1]: Reached target Local File Systems (Pre).
abr 04 18:18:45 localhost systemd[1]: Reached target Local File Systems.
abr 04 18:18:45 localhost systemd[1]: Reached target System Initialization.
abr 04 18:18:45 localhost systemd[1]: Reached target Basic System.
abr 04 18:18:45 localhost kernel: PM: Image not found (code -22)
abr 04 18:18:45 localhost kernel: PM: Hibernation image not present or could not be loaded.
abr 04 18:18:45 localhost systemd[1]: Found device TOSHIBA_MQ01ABD050 2.
abr 04 18:18:45 localhost systemd[1]: Starting File System Check on /dev/disk/by-uuid/6252a794-fd05-4f8c-9c65-984b106dd193...
abr 04 18:18:45 localhost systemd-fsck[141]: /dev/sda2: recovering journal

As you may notice:

  • The system configuration allows hibernating (more on the config below)

  • The logs for the whole hibernation process are written into disk after the system is resumed. I assume that the entries get written into memory while the system is freezing processes. That's why in the succesful case the timestamp for the freezing of processes and sync'ing of disks is 2 hours later than the "Suspending system..." message. This means that in the unsuccesful case chances are high that the issue hits the system while resuming, so that the logs about the whole hibernation process get lost.

  • The "Could not resume from", "Hibernation image not present or could not be loaded", etc. messages are also found in normal boot sequences after a regular shutdown.

2. System configuration

The current configuration can be summarized as follows:

2.1. lspci

$ lspci
00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller (rev 04)
00:16.0 Communication controller: Intel Corporation 7 Series/C210 Series Chipset Family MEI Controller #1 (rev 04)
00:1a.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1b.0 Audio device: Intel Corporation 7 Series/C210 Series Chipset Family High Definition Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 1 (rev c4)
00:1c.1 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 2 (rev c4)
00:1c.3 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 4 (rev c4)
00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation HM76 Express Chipset LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 7 Series Chipset Family 6-port SATA Controller [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family SMBus Controller (rev 04)
02:00.0 Network controller: Qualcomm Atheros AR9485 Wireless Network Adapter (rev 01)
03:00.0 Ethernet controller: Qualcomm Atheros AR8161 Gigabit Ethernet (rev 10)

2.2 Boot and hibernation sequence controlled exclusively by systemd
I wanted to rule out possible suspects that could be interfering with the resume process, so I've been changing everything to systemd-components so to avoid unknown incompatibilities between different pieces of code. In theory one would expect that systemd-coded tools have been tested with each other.

For testing, I replaced the base, udev and resume triad of hooks for the systemd hook in the initramfs configuration. asus_nb_wmi is the only module integrated into the initramfs.

/etc/mkinitcpio.conf
-------------------------
MODULES="asus_nb_wmi"
HOOKS="systemd autodetect modconf block filesystems keyboard fsck"

Interestingly, systemd in Arch lacks a conf file that man systemd-sleep refers to, which controls the parameters that are passed onto /sys/power/disk and /sys/power/state. The contents of this file make it explicit to use the 'platform' hibernation mode (as opposed to non-ACPI 'shutdown') and 'disk' as the suspension state (which is what you want for hibernation; 'mem' is for suspending to RAM).

/etc/systemd/sleep.conf
-------------------------------
[Sleep]
HibernateMode=platform
HibernateState=disk

Finally, I also changed GRUB for systemd-boot, with the following bootloader entry for Arch:

/boot/loader/entries/arch.conf
--------------------------------------
title Arch Linux
linux /vmlinuz-linux    
initrd /intel-ucode.img
initrd /initramfs-linux.img
options root=UUID=6252a794-fd05-4f8c-9c65-984b106dd193 rw resume=UUID=d5f153dd-712a-4cc7-a712-1e109d9861da

However, none of these changes have seem to have any positive effect or have revealed anything that might be interesting for solving the problem.

3. Other tests and measures

  • BIOS updated (AsusTEK X55CR)

  • The sequence of pm_tests described by the kernel documentation doesn't fail at any point, but I've noticed that the 'devices' test makes the GPU driver crash (i915).

  • Using the 'shutdown' hibernation mode seems to be more reliable, but the problem with that mode is that it shuts the system down as if it was shut down using the power button (the kernel cannot tell ACPI to "gently transition" to S0 if there's a hibernation image). This means that disks are stopped all in a sudden and I'm not quite comfortable with that. Things are not supposed to work like that.

  • Be aware that hibernation worked on Arch on this very same system, as well as on Debian previously.

4. Conclusion
I just can't think of anything else that I could test for to understand the nature of this problem. I haven't tested the lts kernel yet (I'll do it as soon as I can), but I'd rather like to find a solution that is helpful for the users of the current kernel if someone else is also having these issues (there are threads here and there talking about issues that seem to be similar, but nobody has conducted pm_tests or posted log files). I'd really appreciate if anyone of you knows of any path I haven't tried or any subtle configuration option I may have skipped.

Sorry for the huge post sad

Last edited by nvteighen (2016-04-04 19:56:47)

Offline

#2 2016-04-05 02:43:34

firekage
Member
From: Eastern Europe, Poland
Registered: 2013-06-30
Posts: 617

Re: Hibernation not resuming, sometimes

In fact, i have similar problem with 6-th Intel generation processor - Skylake. I had also disable HT in order to have working hibernation, but it must be something more thant that because it could write image of hibernation, but often when resume id being done, after image is restored - hard freez. I event can't debug it because there are no logs at all when computer freezes. They are only writed/stored when it works...but not with hibernation. What is strange:

-suspend to RAM always works
-disable HT helps, but sometimes suspend to RAM can fail. I have HT enabled on Atom n2600 and there is no problem at all
-all kernel debug test was done, and they work

Also, you mentioned about "platform" mode - in my opinion this option is one of many that gave problems to Intel users. I previously had Intel 4670K - the same problem for almost 1.5 year with suspend to RAM. Earlier than that, i had X4-955 from AMD where suspend to RAM and hibernation workes like a charm. I thing that "platform" should be somehow changed to "shutdown" but even if i did it with "echo" option, it is still being reverted to "platform" after one cycle.

In fact, i'm barely able to use S3 with Skylake and Arch. Yesterday, even with HT disabled, resume from suspend to RAM freezed machine.


RAM modules are ok - being tested again and again for few days with Memtest86 bootable iso with all test, even manually done.

Offline

#3 2016-04-05 18:17:00

nvteighen
Member
Registered: 2014-03-13
Posts: 31

Re: Hibernation not resuming, sometimes

firekage wrote:

In fact, i have similar problem with 6-th Intel generation processor - Skylake. I had also disable HT in order to have working hibernation, but it must be something more thant that because it could write image of hibernation, but often when resume id being done, after image is restored - hard freez. I event can't debug it because there are no logs at all when computer freezes. They are only writed/stored when it works...but not with hibernation. What is strange:

-suspend to RAM always works
-disable HT helps, but sometimes suspend to RAM can fail. I have HT enabled on Atom n2600 and there is no problem at all
-all kernel debug test was done, and they work

Also, you mentioned about "platform" mode - in my opinion this option is one of many that gave problems to Intel users. I previously had Intel 4670K - the same problem for almost 1.5 year with suspend to RAM. Earlier than that, i had X4-955 from AMD where suspend to RAM and hibernation workes like a charm. I thing that "platform" should be somehow changed to "shutdown" but even if i did it with "echo" option, it is still being reverted to "platform" after one cycle.

In fact, i'm barely able to use S3 with Skylake and Arch. Yesterday, even with HT disabled, resume from suspend to RAM freezed machine.


RAM modules are ok - being tested again and again for few days with Memtest86 bootable iso with all test, even manually done.

The behavior of "platform" being reverted is expected: it is the default mode as set by the kernel. If you want to set it persistently to "shutdown", the route within systemd is to edit /etc/systemd/sleep.conf (see 'man systemd-sleep').

Suspend to RAM works, except for a weird bug that affects GNOME (sometimes glyphs are corrupted after waking the system up link to wiki). By the way, these days on the Arch reddit some people have stated that xf86-server-intel should be dropped in favor of the native modesetting driver. Remember that I have observed that there is a GPU crash in one of the pm_tests; it is the only device that fails the test, so it is worth checking out what happens when the driver is uninstalled. Here's the reddit thread.

Offline

#4 2016-04-06 02:27:43

firekage
Member
From: Eastern Europe, Poland
Registered: 2013-06-30
Posts: 617

Re: Hibernation not resuming, sometimes

I don't have installed xf86-video-intel and don't have xf86-server-intel, i own nvidia and the same is with nouveau and nvidia drivers.

Offline

#5 2016-07-02 13:32:57

gladston3
Member
Registered: 2016-07-02
Posts: 1

Re: Hibernation not resuming, sometimes

Hi,

I have the exact same problem and I am also unable to get any useful logs. I even bought another laptop since I am so dependent on a functional hibernation. Unfortunately the problem occurs on the new laptop too )):

Did you get any further or did you even find a solution?

Some guys suggested to me to use a serial console for debugging. Have you tried that?

cheers
-gladston3

Offline

#6 2016-07-03 23:12:33

nvteighen
Member
Registered: 2014-03-13
Posts: 31

Re: Hibernation not resuming, sometimes

gladston3 wrote:

Hi,

I have the exact same problem and I am also unable to get any useful logs. I even bought another laptop since I am so dependent on a functional hibernation. Unfortunately the problem occurs on the new laptop too )):

Did you get any further or did you even find a solution?

Some guys suggested to me to use a serial console for debugging. Have you tried that?

cheers
-gladston3

No, I haven't been able to get any further. There's the hack using "shutdown" as the hibernation mode, but I don't like it at all: it stops the hard disk all of a sudden (it forces a power-off to ACPI S0 and leaves hibernation to be detected by software) and even though you know things are being stored in your swap partition, I don't want that sudden power-off to happen.

I don't have a clue about using serial consoles for debugging. I hope someone here does.

Offline

#7 2016-07-04 07:09:11

Bradan
Member
Registered: 2015-10-13
Posts: 21

Re: Hibernation not resuming, sometimes

Hello,

I have similar problems since a few years. Also using an intel cpu and nvidia gpu:

> lscpu
Architektur:           x86_64
CPU Operationsmodus:   32-bit, 64-bit
Byte-Reihenfolge:      Little Endian
CPU(s):                8
Liste der Online-CPU(s):0-7
Thread(s) pro Kern:    2
Kern(e) pro Socket:    4
Sockel:                1
NUMA-Knoten:           1
Anbieterkennung:       GenuineIntel
Prozessorfamilie:      6
Modell:                58
Modellname:            Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
Stepping:              9
CPU MHz:               2179.851
Maximale Taktfrequenz der CPU:3900,0000
Minimale Taktfrequenz der CPU:1600,0000
BogoMIPS:              6803.92
Virtualisierung:       VT-x
L1d Cache:             32K
L1i Cache:             32K
L2 Cache:              256K
L3 Cache:              8192K
NUMA-Knoten0 CPU(s):   0-7
Markierungen:          fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts
lspci
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor DRAM Controller (rev 09)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor PCI Express Root Port (rev 09)
00:02.0 Display controller: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller (rev 09)
00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller (rev 04)
00:16.0 Communication controller: Intel Corporation 7 Series/C210 Series Chipset Family MEI Controller #1 (rev 04)
00:1a.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1b.0 Audio device: Intel Corporation 7 Series/C210 Series Chipset Family High Definition Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 1 (rev c4)
00:1c.3 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 4 (rev c4)
00:1c.4 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 5 (rev c4)
00:1c.5 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 6 (rev c4)
00:1c.6 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 7 (rev c4)
00:1c.7 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 8 (rev c4)
00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation Z77 Express Chipset LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 7 Series/C210 Series Chipset Family 6-port SATA Controller [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family SMBus Controller (rev 04)
01:00.0 VGA compatible controller: NVIDIA Corporation GK106 [GeForce GTX 650 Ti] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GK106 HDMI Audio Controller (rev a1)
02:00.0 Serial controller: MosChip Semiconductor Technology Ltd. PCIe 9912 Multi-I/O Controller
02:00.1 Serial controller: MosChip Semiconductor Technology Ltd. PCIe 9912 Multi-I/O Controller
02:00.2 Parallel controller: MosChip Semiconductor Technology Ltd. PCIe 9912 Multi-I/O Controller
03:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA Controller (rev 01)
04:00.0 Ethernet controller: Broadcom Corporation NetLink BCM57781 Gigabit Ethernet PCIe (rev 10)
05:00.0 PCI bridge: ASMedia Technology Inc. ASM1083/1085 PCIe to PCI Bridge (rev 03)
07:00.0 Network controller: Qualcomm Atheros AR9485 Wireless Network Adapter (rev 01)
08:00.0 USB controller: ASMedia Technology Inc. ASM1042 SuperSpeed USB Host Controller

I've created a thread once here, where I mentioned it as a second problem.

nvteighen: you could create a systemd-before-script that halts your hard disks properly. Wouldn't that be the solution?

Offline

#8 2016-07-07 17:34:38

nvteighen
Member
Registered: 2014-03-13
Posts: 31

Re: Hibernation not resuming, sometimes

Well, "shutdown" mode halts the filesystem correctly, but the sound it makes when doing it is the same as in a hard reset... This is not necessarily harmful, but I just don't like it.

@firekage: In order to have "shutdown" permanently set, you must create /etc/systemd/sleep.conf with this content (the file is not created by default on Arch Linux):

[Sleep]
HibernateMode=shutdown
HibernateState=disk

Offline

#9 2016-12-15 21:53:32

matiasm
Member
From: Misiones-Arg
Registered: 2010-08-01
Posts: 21
Website

Re: Hibernation not resuming, sometimes

Hi there, I'm having the same issue here. A few months ago (not really sure when it started) hibernation and resuming was working flawlessly, and suddenly resuming started to fail randomly as stated by the OP.
I use GDM as display manager and have a nvidia card. I've also tried with the LDM and mate desktop.
Normally I use the regular kernel but also I've been trying with the LTS kernel and nvidia-lts driver and I even tried to install ICE but couldn't make the ice kernel boot (didn't try much after it failed to boot but I followed the wiki and made through the ice kernel compilation and installation).

I've noticed that in some failure situations it seems that the general system resuming had worked but the GDM (or gnome itself) fails to recover and shows the regular login window (with all users listed) and not the "locked login window" that usually shows after a successful recover or after a locked screen with just my pass being asked (sorry if Im not using the correct names for this login windows, and let me know if If not being clear). If I check the "uptime" it shows the system has correctly recovered from hibernation, but my desktop session hasn't.
I've even noticed that after failing and showing the regular login window, selecting my user and entering the password, the screen blanks and it doesn't enter the desktop, it shows up the "locked login window" like it should have done in the first instance tongue

I pasted some logs here in case they help:
everything.log
kernel.log
Xorg.0.log

I copy here some logs that called my attention today after failing a second resuming from a "hybrid-sleep" (that is, hybrid-sleeping worked once resuming correctly and after sleeping a 2nd time it failed to recover):

Dec 15 15:57:12 psyd2 kernel: pcieport 0000:00:15.1: System wakeup disabled by ACPI
Dec 15 15:57:12 psyd2 kernel: usb usb5: root hub lost power or was reset
Dec 15 15:57:12 psyd2 kernel: sd 0:0:0:0: [sda] Starting disk
Dec 15 15:57:12 psyd2 kernel: r8169 0000:09:00.0 eth0: link down
Dec 15 15:57:12 psyd2 kernel: PM: resume of devices complete after 324.616 msecs
Dec 15 15:57:12 psyd2 kernel: ata8: SATA link down (SStatus 0 SControl 300)
Dec 15 15:57:12 psyd2 kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Dec 15 15:57:12 psyd2 kernel: ata1.00: configured for UDMA/133
Dec 15 15:57:12 psyd2 kernel: PM: Basic memory bitmaps freed
Dec 15 15:57:12 psyd2 kernel: Restarting tasks ... done.
Dec 15 15:57:12 psyd2 rtkit-daemon[1622]: The canary thread is apparently starving. Taking action.
Dec 15 15:57:12 psyd2 systemd-sleep[22802]: System resumed. 
Dec 15 15:57:12 psyd2 systemd[1]: Time has been changed
Dec 15 15:57:12 psyd2 laptop-mode[23165]: enabled, not active [unchanged]
Dec 15 15:57:12 psyd2 systemd[1]: Started Hybrid Suspend+Hibernate.
Dec 15 15:57:12 psyd2 laptop-mode[23184]: Laptop mode
Dec 15 15:57:12 psyd2 systemd[1]: sleep.target: Unit not needed anymore. Stopping.
Dec 15 15:57:12 psyd2 laptop-mode[23187]: enabled, not active [unchanged]
Dec 15 15:57:12 psyd2 systemd[1]: Stopped target Sleep.
Dec 15 15:57:12 psyd2 systemd[1]: Reached target Hybrid Suspend+Hibernate.
Dec 15 15:57:12 psyd2 systemd[1]: hybrid-sleep.target: Unit is bound to inactive unit systemd-hybrid-sleep.service. Stopping, too.
Dec 15 15:57:12 psyd2 systemd[1]: Stopped target Hybrid Suspend+Hibernate.
Dec 15 15:57:12 psyd2 laptop-mode[23206]: Laptop mode
Dec 15 15:57:12 psyd2 laptop-mode[23213]: enabled, not active [unchanged]
...
Dec 15 15:57:13 psyd2 /usr/lib/gdm/gdm-x-session[1713]: (EE) NVIDIA(0):  *** Aborting ***
Dec 15 15:57:13 psyd2 /usr/lib/gdm/gdm-x-session[1713]: (EE)
Dec 15 15:57:13 psyd2 /usr/lib/gdm/gdm-x-session[1713]: Fatal server error:
Dec 15 15:57:13 psyd2 /usr/lib/gdm/gdm-x-session[1713]: (EE) EnterVT failed for screen 0
Dec 15 15:57:13 psyd2 /usr/lib/gdm/gdm-x-session[1713]: (EE) Server terminated with error (1). Closing log file.
Dec 15 15:57:13 psyd2 chromium.desktop[2411]: [2411:2411:1215/155713:ERROR:chrome_browser_main_extra_parts_x11.cc(62)] X IO error received (X server probably went away)
Dec 15 15:57:13 psyd2 chromium.desktop[2411]: [2475:2475:1215/155713:ERROR:x11_util.cc(87)] X IO error received (X server probably went away)
Dec 15 15:57:13 psyd2 pulseaudio[1780]: XIO:  fatal IO error 11 (Resource temporarily unavailable) on X server ":1"
Dec 15 15:57:13 psyd2 pulseaudio[1780]:       after 14 requests (6 known processed) with 0 events remaining.
Dec 15 15:57:13 psyd2 audacious.desktop[2197]: audacious: Fatal IO error 104 (Connection reset by peer) on X server :1.
Dec 15 15:57:13 psyd2 patchage.desktop[3304]: patchage: Fatal IO error 11 (Resource temporarily unavailable) on X server :1.
Dec 15 15:57:13 psyd2 gnome-terminal-[1915]: gnome-terminal-server: Fatal IO error 104 (Connection reset by peer) on X server :1.
Dec 15 15:57:13 psyd2 firefox.desktop[2206]: /usr/lib/firefox/plugin-container: Fatal IO error 11 (Resource temporarily unavailable) on X server :1.
Dec 15 15:57:14 psyd2 gnome-shell[1773]: gnome-shell: Fatal IO error 0 (Success) on X server :1.
Dec 15 15:57:14 psyd2 gdm-x-session[1713]: X server was killed with status 6
Dec 15 15:57:14 psyd2 gnome-session[1720]: gnome-session-binary[1720]: GLib-GObject-CRITICAL: g_object_unref: assertion 'G_IS_OBJECT (object)' failed
Dec 15 15:57:14 psyd2 gnome-session-binary[1720]: GLib-GObject-CRITICAL: g_object_unref: assertion 'G_IS_OBJECT (object)' failed
Dec 15 15:57:14 psyd2 gnome-settings-[1874]: gnome-settings-daemon: Fatal IO error 11 (Resource temporarily unavailable) on X server :1.
Dec 15 15:57:14 psyd2 gnome-shell[23588]: Unable to initialize Clutter: Unable to open display ':1'
Dec 15 15:57:14 psyd2 gnome-session[1720]: gnome-session-binary[1720]: WARNING: App 'org.gnome.Shell.desktop' exited with code 1
Dec 15 15:57:14 psyd2 gnome-session-binary[1720]: WARNING: App 'org.gnome.Shell.desktop' exited with code 1
Dec 15 15:57:14 psyd2 gnome-session-binary[1720]: Unrecoverable failure in required component org.gnome.Shell.desktop
Dec 15 15:57:14 psyd2 gnome-session[1720]: gnome-session-binary[1720]: WARNING: App 'org.gnome.Shell.desktop' respawning too quickly
Dec 15 15:57:14 psyd2 gnome-session-binary[1720]: WARNING: App 'org.gnome.Shell.desktop' respawning too quickly
Dec 15 15:57:14 psyd2 gnome-session[1720]: Unable to init server: Could not connect: Connection refused
Dec 15 15:57:14 psyd2 gnome-shell[1576]: driver/gl/cogl-framebuffer-gl.c:1022: GL error (1286): Invalid framebuffer operation
Dec 15 15:57:14 psyd2 kernel: gnome-session-f[23595]: segfault at 0 ip 00007fd300ec0c59 sp 00007fff5ef3f0e0 error 4 in libgtk-3.so.0.2200.5[7fd300bde000+6fa000]
Dec 15 15:57:14 psyd2 gnome-shell[1576]: _st_paint_shadow_with_opacity: assertion 'shadow_pipeline != NULL' failed
Dec 15 15:57:14 psyd2 gnome-shell[1576]: driver/gl/cogl-framebuffer-gl.c:1186: GL error (1286): Invalid framebuffer operation
Dec 15 15:57:14 psyd2 systemd[1]: Started Process Core Dump (PID 23596/UID 0).
Dec 15 15:57:14 psyd2 systemd-coredump[23597]: Process 23595 (gnome-session-f) of user 1000 dumped core.
Stack trace of thread 23595:
#0  0x00007fd300ec0c59 n/a (libgtk-3.so.0)
#1  0x00007fd300d5c968 n/a (libgtk-3.so.0)
#2  0x00007fd300d71a54 n/a (libgtk-3.so.0)
#3  0x00007fd300d5dc9c n/a (libgtk-3.so.0)
#4  0x00007fd300d7197c n/a (libgtk-3.so.0)
#5  0x00007fd300d719d5 n/a (libgtk-3.so.0)
#6  0x00007fd300d5e602 n/a (libgtk-3.so.0)
#7  0x00007fd3006c130f g_type_create_instance (libgobject-2.0.so.0)
#8  0x00007fd3006a31fb n/a (libgobject-2.0.so.0)
#9  0x00007fd3006a4c0d g_object_newv (libgobject-2.0.so.0)
#10 0x00007fd3006a53c4 g_object_new (libgobject-2.0.so.0)
#11 0x00007fd300d79eba n/a (libgtk-3.so.0)
#12 0x00007fd300f57c47 n/a (libgtk-3.so.0)
#13 0x00007fd3006c130f g_type_create_instance (libgobject-2.0.so.0)
#14 0x00007fd3006a31fb n/a (libgobject-2.0.so.0)
#15 0x00007fd3006a4c0d g_object_newv (libgobject-2.0.so.0)
#16 0x00007fd3006a53c4 g_object_new (libgobject-2.0.so.0)
#17 0x0000000000401db3 n/a (gnome-session-failed)
#18 0x00007fd2ffff6291 __libc_start_main (libc.so.6)
#19 0x00000000004021ba n/a (gnome-session-failed)

And this seems to be a few seconds later trying to start a new session:

Dec 15 16:01:42 psyd2 gdm[1316]: GdmManager: trying to open reauthentication channel for user mat
Dec 15 16:01:42 psyd2 gdm[1316]: GdmSession: starting conversation gdm-password
Dec 15 16:01:42 psyd2 gdm[1316]: GdmSessionWorkerJob: Starting worker...
Dec 15 16:01:42 psyd2 gdm[1316]: GdmSessionWorkerJob: Running session_worker_job process: gdm-session-worker [pam/gdm-password] /usr/lib/gdm/gdm-session-worker
Dec 15 16:01:42 psyd2 gdm[1316]: GdmSessionWorkerJob: : SessionWorkerJob on pid 23671
Dec 15 16:01:42 psyd2 gdm-password][23671]: Enabling debugging
Dec 15 16:01:42 psyd2 gdm-password][23671]: GdmSessionWorker: connecting to address: unix:abstract=/tmp/dbus-ja8OP763
Dec 15 16:01:42 psyd2 gdm[1316]: GdmDBusServer: new connection 0x1f17c30
Dec 15 16:01:42 psyd2 gdm[1316]: GdmSession: Handling new connection from worker
Dec 15 16:01:42 psyd2 gdm[1316]: GdmSession: Authenticating new connection
Dec 15 16:01:42 psyd2 gdm-password][23671]: AccountsService: ActUserManager: calling 'ListCachedUsers'
Dec 15 16:01:42 psyd2 gdm-password][23671]: AccountsService: Failed to identify the current session: No such device or address
Dec 15 16:01:42 psyd2 gdm-password][23671]: AccountsService: ActUserManager: seat unloaded, so trying to set loaded property
Dec 15 16:01:42 psyd2 gdm-password][23671]: AccountsService: ActUserManager: Listing cached users, so not setting loaded property
Dec 15 16:01:42 psyd2 gdm-password][23671]: AccountsService: ActUserManager: Listing cached users, so not setting loaded property
Dec 15 16:01:42 psyd2 gdm[1316]: GdmSession: worker connection is 0x1f17c30
Dec 15 16:01:42 psyd2 gdm[1316]: GdmSession: Emitting conversation-started signal
Dec 15 16:01:42 psyd2 gdm[1316]: GdmManager: session conversation started for service gdm-password
Dec 15 16:01:42 psyd2 gdm[1316]: GdmSession: Setting user: 'mat'
Dec 15 16:01:42 psyd2 gdm[1316]: GdmSession: Beginning setup for user mat
Dec 15 16:01:42 psyd2 gdm[1316]: GdmSession: getting session command for file 'gnome.desktop'
Dec 15 16:01:42 psyd2 gdm-password][23671]: AccountsService: ActUserManager: ListCachedUsers finished, will set loaded property after list is fully loaded
Dec 15 16:01:42 psyd2 gdm-password][23671]: AccountsService: ActUserManager: tracking new user with object path /org/freedesktop/Accounts/User1001
Dec 15 16:01:42 psyd2 gdm-password][23671]: AccountsService: ActUserManager: tracking new user with object path /org/freedesktop/Accounts/User1002
Dec 15 16:01:42 psyd2 gdm[1316]: GdmSession: checking if file 'gnome.desktop' is wayland session: yes
Dec 15 16:01:42 psyd2 gdm[1316]: GdmSession: setting session to type 'wayland'
Dec 15 16:01:42 psyd2 gdm[1316]: GdmSession: getting session command for file 'gnome.desktop'
Dec 15 16:01:42 psyd2 gdm[1316]: GdmSession: Conversation started

The full logs are in the stated links above. Please excuse me if this is too much info pasted, and also let me know any other info that can help.
Thanks in advance

Offline

#10 2016-12-21 19:10:42

matiasm
Member
From: Misiones-Arg
Registered: 2010-08-01
Posts: 21
Website

Re: Hibernation not resuming, sometimes

So summing-up, my problem seems not to be with hibernation and resuming but with the desktops recovering the sessions.
nvteighen is something similar happening to you??

Offline

#11 2017-01-16 09:09:02

nartes
Member
From: Minsk, Belarus
Registered: 2014-07-05
Posts: 62
Website

Re: Hibernation not resuming, sometimes

I am experiencing the same problem as well. That's why I've tried to use pure VTY interface, or to use weston instead of GNOME.
For some reasons, it is hard to repeat the bad behaviour instantly after boot up.
Also the problem seems to correlate with the amount of loaded RAM,
at least in my case.
There is an additional problem,i.e. my GNOME on Wayland DM begins flickering
after resuming from hibernation.
I had the issue with weston.
Here is my bug report to weston, with the same problem.
https://bugs.freedesktop.org/show_bug.cgi?id=97815

Do you have some idea how to make use of all those logs?
I think we need to debug something deeper, because
the 'Reboot' entry in the systemd log sounds like
a message from black box.

I don't even know, which packages might be relevant,
in order to try to rollback them.

May somebody else provide the output for the command

journalctl -r | grep -E 'Image mismatch|Reboot|Hibernate|Image not found'

<nartes> grawity, pardon me. These lines are the problem actually.
<nartes> abr 04 18:18:45 localhost kernel: PM: Image not found (code -22)
<nartes> abr 04 18:18:45 localhost kernel: PM: Hibernation image not present or could not be loaded.
<grawity> that's usually normal – your initramfs initiates the resume later

Is there somebody who has "PM: Image mismatch..." in journalctl?
In my case, it is usually "PM: Image mismatch: memory size",
that causes broken resuming after hibernation sleep.

Here is a more detailed log
broken-hibernation.log
I've posted an issue on kernel.org
hibernation image corruption bug report

Last edited by nartes (2017-01-16 15:28:48)

Offline

#12 2017-01-17 13:46:46

matiasm
Member
From: Misiones-Arg
Registered: 2010-08-01
Posts: 21
Website

Re: Hibernation not resuming, sometimes

I've been trying the systemctl hybrid-sleep option to see if there are any different results but today I've got the same problem again.

Maybe it is related to the RAM amount used as nartes mentioned since it happens more often when I'm using gimp with many opened files, but it's just a raw assumption hmm

Today after getting the failed resuming I went to a tty (which showed the correct uptime, a couple of days since the last restart and few correct resuming) and tried to systemctl restart gdm but it would work, it seemed to be trying over and over to start gdm but without success. Then I just rebooted.

Here are the recent results of

journalctl -r | grep -E 'Image mismatch|Reboot|Hibernate|Image not found'

Jan 17 09:10:04 psyd2 systemd[1]: Starting Reboot...
Jan 17 08:57:35 psyd2 systemd[1]: Stopped target Hybrid Suspend+Hibernate.
Jan 17 08:57:35 psyd2 systemd[1]: Reached target Hybrid Suspend+Hibernate.
Jan 17 08:57:35 psyd2 systemd[1]: Started Hybrid Suspend+Hibernate.
Jan 16 18:06:49 psyd2 systemd[1]: Starting Hybrid Suspend+Hibernate...

And here I pasted some logs from this morning. I copied the journalctl -r since yesterday's hibernation until today's gdm retries...

Some lines caught my attention. Yesterday when started the hibernation it ended in:

Jan 16 18:06:49 psyd2 systemd-sleep[487]: Suspending system...
Jan 16 18:06:49 psyd2 kernel: PM: Hibernation mode set to 'suspend'
Jan 16 18:06:49 psyd2 systemd[1]: Starting Hybrid Suspend+Hibernate...
Jan 16 18:06:49 psyd2 systemd[1]: Reached target Sleep.

and today when trying to resume it went like this:

Jan 17 08:57:35 psyd2 kernel: PM: Hibernation image created (1630561 pages copied)
Jan 17 08:57:35 psyd2 kernel: PM: Normal pages needed: 1630561 + 1024, available pages: 2521214
Jan 17 08:57:35 psyd2 kernel: PM: Need to copy 1630561 pages
Jan 17 08:57:35 psyd2 kernel: PM: Creating hibernation image:
...
Jan 17 08:57:35 psyd2 kernel: Suspending console(s) (use no_console_suspend to debug)
Jan 17 08:57:35 psyd2 kernel: Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
Jan 17 08:57:35 psyd2 kernel: PM: Allocated 6527084 kbytes in 9.23 seconds (707.15 MB/s)
Jan 17 08:57:35 psyd2 kernel: PM: Preallocating image memory... done (allocated 1631771 pages)
Jan 17 08:57:35 psyd2 kernel: PM: Basic memory bitmaps created
Jan 17 08:57:35 psyd2 kernel: PM: Marking nosave pages: [mem 0xd0000000-0xd3ffffff]
Jan 17 08:57:35 psyd2 kernel: PM: Marking nosave pages: [mem 0xdd800000-0x100000fff]
Jan 17 08:57:35 psyd2 kernel: PM: Marking nosave pages: [mem 0xdd083000-0xdd7f3fff]
Jan 17 08:57:35 psyd2 kernel: PM: Marking nosave pages: [mem 0xdca36000-0xdcc3bfff]
Jan 17 08:57:35 psyd2 kernel: PM: Marking nosave pages: [mem 0xda70c000-0xdca34fff]
Jan 17 08:57:35 psyd2 kernel: PM: Marking nosave pages: [mem 0x0009e000-0x000fffff]
Jan 17 08:57:35 psyd2 kernel: PM: Marking nosave pages: [mem 0x00000000-0x00000fff]
Jan 17 08:57:35 psyd2 kernel: Double checking all user space processes after OOM killer disable... (elapsed 0.000 seconds) 
Jan 17 08:57:35 psyd2 kernel: Freezing user space processes ... (elapsed 0.002 seconds) done.
Jan 17 08:57:34 psyd2 kernel: PM: Syncing filesystems ... done.

I don't know about PM's ways, but it seemed strange to me that it tried to create the hibernation image today, when trying to resume, maybe it's related to the hybrid-sleep option? Again, the full logs are here.
Cheers!

Offline

#13 2017-01-17 21:37:34

nartes
Member
From: Minsk, Belarus
Registered: 2014-07-05
Posts: 62
Website

Re: Hibernation not resuming, sometimes

matiasm, since uptime didn'tdoesn't change, perhaps, your hibernation is not corrupted.
The problem may be with gdm indeed. I am not aware of correct way to restart
gdm. I usually try to do

sudo su -c (systemctl stop gdm && killall gdm killall gnome && systemctl start gdm)

Can you try to install lightdm and to reproduce the bug again?

pacman -S lightdm
systemctl disable gdm && systemtl enable lightdm && systemctl restart

In the case of nvteighen the issue is that

abr 04 18:18:45 localhost kernel: PM: Image not found (code -22)

I had experienced this behaviour. Usually system
boots up without resuming from hibernation, and begins
cheking evindently correputed filesystem.

But in my case the reason for the broken resuming is

Jan 16 15:59:10 siarhei_asus kernel: PM: Image mismatch: memory size

Kernel writes that when number of pages don't match.
See the functionI've copied few relevant functions
from the source code of linux kernel.

/* "kernel/power/snapshot.c" */
...

2164 static int check_header(struct swsusp_info *info)                                    
2165 {                                                                                    
2166         char *reason;                                                                
2167                                                                                      
2168         reason = check_image_kernel(info);                                           
2169         if (!reason && info->num_physpages != get_num_physpages())                   
2170                 reason = "memory size";                                              
2171         if (reason) {                                                                
2172                 printk(KERN_ERR "PM: Image mismatch: %s\n", reason);                 
2173                 return -EPERM;                                                       
2174         }                                                                            
2175         return 0;                                                                    
2176 }

/* "include/linux/mm.h" */
...

1824 static inline unsigned long get_num_physpages(void)                                  
1825 {                                                                                    
1826         int nid;                                                                     
1827         unsigned long phys_pages = 0;                                                
1828                                                                                      
1829         for_each_online_node(nid)                                                    
1830                 phys_pages += node_present_pages(nid);                               
1831                                                                                      
1832         return phys_pages;                                                           
1833 }    

Last edited by nartes (2017-01-17 21:39:52)

Offline

Board footer

Powered by FluxBB