#1 2022-04-13 09:55:28

erenon
Member
Registered: 2022-04-13
Posts: 5

[SOLVED] Resume from Suspend fails

On 2022-04-08, my Arch Linux desktop failed to resume from `systemctl suspend`:
the screen stayed blank and the keyboard's Num Lock LED was off.
According to etckeeper, I had run `sudo pacman -Syu` two days earlier, which
upgraded a large number of packages, including:

    -linux 5.16.15.arch1-1
    +linux 5.17.1.arch1-1
    -vulkan-radeon 21.3.7-2
    +vulkan-radeon 22.0.1-3
    -wayland 1.20.0-1
    +wayland 1.20.0-2

Since then the failure has recurred at random: roughly one in three resumes fails.
After a suspend/failed-resume/reset-by-button sequence, journalctl says:

    systemd-journald[274]: File /var/log/journal/<hash>/system.journal corrupted or uncleanly shut down, renaming and replacing.

`journalctl -b -1` shows the boot before the one that failed to resume.
Apparently the journal is not flushed properly when resume fails.

Following https://01.org/node/3721, I did the following:

- booted with `initcall_debug ignore_loglevel no_console_suspend`
- `echo 0 > /sys/power/pm_async`
- `echo 1 > /proc/sys/kernel/sysrq`
- `systemctl suspend`
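
The steps above could be scripted roughly as follows. This is only a minimal sketch: the function names and the `SYSROOT` prefix (which allows a dry run against a scratch directory) are illustrative assumptions, not something taken from the 01.org guide.

```shell
#!/bin/sh
# Sketch only. SYSROOT is a hypothetical prefix for dry runs against a
# scratch directory; leave it empty on a real system (and run as root).
SYSROOT="${SYSROOT:-}"

# Print every debug parameter from the list above that is absent from the
# kernel command line; print nothing if all three are present.
missing_debug_params() {
    cmdline=$(cat "$SYSROOT/proc/cmdline")
    for param in initcall_debug ignore_loglevel no_console_suspend; do
        case " $cmdline " in
            *" $param "*) ;;
            *) echo "$param" ;;
        esac
    done
}

# Make device suspend/resume synchronous and enable all SysRq functions.
prepare_suspend_debug() {
    echo 0 > "$SYSROOT/sys/power/pm_async"
    echo 1 > "$SYSROOT/proc/sys/kernel/sysrq"
}
```

Once `missing_debug_params` prints nothing and `prepare_suspend_debug` has run, `systemctl suspend` starts the test cycle.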

Disabling pm_async makes it consistently fail, with different failure modes:

- On suspend (when I press a key on the keyboard), the PC speaker beeps once, which it didn't use to do
- Sometimes there is no video output; sometimes there is, and it is correct
- Usually the Num Lock LED is off and the keyboard and mouse are unresponsive
- The machine responds to ping, but sshd is unavailable (the connection succeeds, but there is no further response)

Once I managed to capture the console output after such a failed resume,
see: https://ibb.co/sspnb55 . I'd like to highlight:

    ata1: found unknown device (class 0)

After a normal boot there is no such message. A non-system HDD is attached to ata1.

Short system info: man:Gigabyte Technology Co., Ltd. | plat:970A-DS3P | cpu:AMD FX(tm)-8320 Eight-Core Processor | bios:F2j | biosdate:12/29/2014 | numcpu:8 | memsz:16349744 | memfr:13925272 | os:Arch Linux

Any idea how I can debug this further?

Last edited by erenon (2022-04-20 06:36:03)

#2 2022-04-13 15:38:52

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 75,390

Re: [SOLVED] Resume from Suspend fails

So, what /is/ (on) ata1, and do you experience the same problem w/ the lts kernel (referencing your incomplete list of updated packages)?
What if you pass "pcie_aspm=off" to the kernel?

Since you mentioned "wayland" and "vulkan-radeon" - do you suspend from a GUI session and does the problem exist from a multi-user.target (console, no GUI) login?

#3 2022-04-13 18:37:57

erenon
Member
Registered: 2022-04-13
Posts: 5

Re: [SOLVED] Resume from Suspend fails

> So, what /is/ (on) ata1

A HDD with data. It is mounted under /mnt/..., but the issue can be triggered even if it is not mounted at all.
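
To double-check which disk sits on a given ATA port, the sysfs path of each block device can be inspected. A minimal sketch, assuming libata exposes the port name (such as `ata1`) as a component of the path that `/sys/block/*` points to; the function name and the `SYS_BLOCK` override are made up for illustration:

```shell
#!/bin/sh
# Sketch only: list the block devices whose sysfs path runs through a given
# libata port. SYS_BLOCK is a hypothetical override for testing; it defaults
# to the real /sys/block.
devices_on_ata_port() {
    port=$1                                  # e.g. "ata1"
    for link in "${SYS_BLOCK:-/sys/block}"/*; do
        [ -e "$link" ] || continue           # skip an unmatched glob
        case "$(readlink -f "$link")" in
            */"$port"/*) basename "$link" ;; # path contains /ataN/
        esac
    done
}
```

Calling `devices_on_ata_port ata1` should then print the kernel name of the disk (for example `sda`), which can be compared against `lsblk`.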

> do you experience the same problem w/ the lts kernel
I'll need to try that.

> pcie_aspm=off
I'll try that as well. Note: suspend/resume was working for years, it broke recently. No hardware change was made.

> Since you mentioned "wayland" and "vulkan-radeon" - do you suspend from a GUI session and does the problem exist from a multi-user.target (console, no GUI) login?
The issue can be triggered both from a GUI session (i3 + systemctl suspend from a terminal emulator) and remotely over ssh (with no login on the tty console).

#4 2022-04-13 18:59:25

erenon
Member
Registered: 2022-04-13
Posts: 5

Re: [SOLVED] Resume from Suspend fails

It is reproducible with pcie_aspm=off.
It is reproducible with linux-lts.

#5 2022-04-13 19:29:10

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 75,390

Re: [SOLVED] Resume from Suspend fails

So it's probably not the kernel update and, going by your previous assertion, probably none of the other listed packages either.
Post the pacman log for other suspicious packages and … how painful would it be to remove the drive on ata1?
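
The upgrades from a single day can be pulled out of the pacman log with something like the sketch below. It assumes the usual `[timestamp] [ALPM] upgraded name (old -> new)` line format and the default log location `/var/log/pacman.log`; the function name is illustrative.

```shell
#!/bin/sh
# Sketch only: print "package old -> new" for every upgrade that pacman
# logged on a given day. The second argument (log path) defaults to the
# standard /var/log/pacman.log.
upgrades_on() {
    day=$1                                   # e.g. "2022-04-06"
    log=${2:-/var/log/pacman.log}
    # Expected line format (the timestamp here is only illustrative):
    # [2022-04-06T09:13:02+0200] [ALPM] upgraded linux (5.16.15.arch1-1 -> 5.17.1.arch1-1)
    grep -F "[$day" "$log" |
        sed -n 's/^.*\[ALPM\] upgraded \([^ ]*\) (\(.*\))$/\1 \2/p'
}
```

For example, `upgrades_on 2022-04-06` would list what changed on that day, one package per line.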

#6 2022-04-13 19:39:40

erenon
Member
Registered: 2022-04-13
Posts: 5

Re: [SOLVED] Resume from Suspend fails

Full list of suspected packages:

   Package changes:
    -alsa-card-profiles 1:0.3.48-1
    +alsa-card-profiles 1:0.3.49-1
    -at-spi2-core 2.44.0-1
    -atk 2.36.0-1
    +at-spi2-core 2.44.0-2
    +atk 2.38.0-1
    -ca-certificates-mozilla 3.76-1
    +ca-certificates-mozilla 3.77-1
    -cairo 1.17.6-1
    +cairo 1.17.6-2
    -chromium 99.0.4844.82-1
    +chromium 100.0.4896.75-1
    -cmake 3.22.3-1
    +cmake 3.23.0-1
    -cups-filters 1.28.12-1
    -curl 7.82.0-1
    +cups-filters 1.28.14-1
    +curl 7.82.0-2
    -expat 2.4.7-1
    +expat 2.4.8-1
    -ffmpeg 2:5.0-5
    +ffmpeg 2:5.0-6
    -firefox 98.0.1-1
    +firefox 99.0-1
    -fontconfig 2:2.13.96-1
    +fontconfig 2:2.14.0-1
    -freetype2 2.11.1-1
    +freetype2 2.12.0-1
    -ghostscript 9.55.0-4
    +ghostscript 9.56.1-1
    -github-cli 2.6.0-1
    -glib-networking 1:2.70.1-1
    +github-cli 2.7.0-1
    +glib-networking 1:2.72.0-1
    -groff 1.22.4-6
    -grub 2:2.06-4
    +groff 1.22.4-7
    +grub 2:2.06-5
    -gvim 8.2.4464-1
    +gvim 8.2.4651-1
    -harfbuzz 4.0.1-1
    +harfbuzz 4.2.0-1
    -hwloc 2.7.0-1
    +hwloc 2.7.1-1
    -imagemagick 7.1.0.28-1
    +imagemagick 7.1.0.29-1
    -iproute2 5.16.0-1
    +iproute2 5.17.0-1
    -json-glib 1.6.6-1
    +json-glib 1.6.6-2
    -kmod 29-2
    +kmod 29-3
    -libarchive 3.6.0-1
    +libarchive 3.6.0-2
    -libcanberra 0.30+2+gc0620e4-5
    +libcanberra 1:0.30+r2+gc0620e4-1
    -libevdev 1.12.0-1
    +libevdev 1.12.1-1
    -libgcrypt 1.9.4-1
    +libgcrypt 1.10.1-1
    -libinput 1.20.0-1
    +libinput 1.20.0-2
    -libnetfilter_conntrack 1.0.8-1
    +libnetfilter_conntrack 1.0.9-1
    -librsvg 2:2.54.0-1
    +librsvg 2:2.54.0-2
    -libsecret 0.20.5-1
    +libsecret 0.20.5-2
    -libsndfile 1.0.31-1
    +libsndfile 1.1.0-2
    -libsoup3 3.0.5-1
    +libsoup3 3.0.6-1
    -libstemmer 2.2.0-1
    -libsysprof-capture 3.42.1-3
    +libstemmer 2.2.0-2
    +libsysprof-capture 3.44.0-1
    -libtiff 4.3.0-1
    +libtiff 4.3.0-2
    -libtool 2.4.6+59+gb55b1cc8-2
    +libtool 2.4.7-1
    -libva-mesa-driver 21.3.7-2
    +libva-mesa-driver 22.0.1-3
    -libwacom 2.1.0-1
    +libwacom 2.2.0-1
    -libx11 1.7.3.1-1
    +libx11 1.7.5-1
    -libxcursor 1.2.0-2
    +libxcursor 1.2.1-1
    -linux 5.16.15.arch1-1
    +linux 5.17.1.arch1-1
    -luajit 2.1.0.beta3.r391.g8b8304f1-1
    +luajit 2.1.0.beta3.r397.g20aea939-1
    -mdadm 4.2-1
    +mdadm 4.2-2
    -mesa 21.3.7-2
    -miniupnpc 2.2.2-2
    -minizip 1:1.2.11-5
    +mesa 22.0.1-3
    +miniupnpc 2.2.3-1
    +minizip 1:1.2.12-1
    -nodejs 17.7.2-1
    +nodejs 17.8.0-1
    -nspr 4.33-1
    -nss 3.76-1
    +nspr 4.33-2
    +nss 3.77-1
    -perf 5.16-1
    -perl 5.34.0-3
    +perf 5.17-1
    +perl 5.34.1-1
    -pipewire 1:0.3.48-1
    +pipewire 1:0.3.49-1
    -pipewire-pulse 1:0.3.48-1
    +pipewire-pulse 1:0.3.49-1
    -prometheus 2.33.4-1
    +prometheus 2.34.0-1
    -python 3.10.2-1
    +python 3.10.4-1
    -python-click 8.0.4-1
    +python-click 8.1.2-1
    -python-cryptography 36.0.1-1
    +python-cryptography 36.0.2-1
    -python-pillow 9.0.1-1
    +python-pillow 9.1.0-1
    -python-pyparsing 3.0.3-1
    +python-pyparsing 3.0.7-1
    -re2 1:20220201-1
    +re2 1:20220401-1
    -shared-mime-info 2.0+115+gd74a913-1
    +shared-mime-info 2.0+144+g13695c7-1
    -sqlite 3.38.1-1
    +sqlite 3.38.2-1
    -strace 5.16-1
    +strace 5.17-1
    -util-linux 2.37.4-1
    -util-linux-libs 2.37.4-1
    +util-linux 2.38-1
    +util-linux-libs 2.38-1
    -vim-runtime 8.2.4464-1
    +vim-runtime 8.2.4651-1
    -virtualbox-host-modules-arch 6.1.32-16
    +virtualbox-host-modules-arch 6.1.32-19
    -vulkan-radeon 21.3.7-2
    +vulkan-radeon 22.0.1-3
    -wayland 1.20.0-1
    +wayland 1.20.0-2
    -wget 1.21.2-1
    +wget 1.21.3-1
    -xorg-setxkbmap 1.3.2-2
    +xorg-setxkbmap 1.3.3-1
    -zip 3.0-9
    +zip 3.0-10
    -zlib 1:1.2.11-5
    +zlib 1:1.2.12-1

The previous update was 15 days earlier. During that window, no failure was observed.

I'll try w/o the drive on ata1 tomorrow.

#7 2022-04-20 06:35:39

erenon
Member
Registered: 2022-04-13
Posts: 5

Re: [SOLVED] Resume from Suspend fails

Since the most recent lts kernel upgrade, I'm no longer able to reproduce the issue. I'm marking this as solved, for now.

#8 2022-04-20 16:00:09

abbaswasim
Member
Registered: 2022-04-14
Posts: 1

Re: [SOLVED] Resume from Suspend fails

Sorry for posting in a [SOLVED] thread.

I have had the same issue for a week or so now. It also started after an update. I'm happy to provide all the details (full dmesg etc.), but I have been experimenting and have narrowed it down to my Magic Mouse. If I boot and don't load btusb, I can suspend and resume fine. But if I move and click my mouse (which loads btusb etc.), I can't resume from suspend.
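
One way to test the btusb theory further would be a systemd sleep hook that unloads the module before suspend and reloads it afterwards. This is a minimal, untested sketch, assuming btusb can be unloaded cleanly at that point; the function name and the `MODPROBE` override are illustrative:

```shell
#!/bin/sh
# Sketch only, untested against the hardware in this thread. Installed as an
# executable file under /usr/lib/systemd/system-sleep/, systemd-sleep calls it
# with $1 = "pre" or "post" and $2 = the sleep type (e.g. "suspend").
# MODPROBE is a hypothetical override used for dry runs.
btusb_sleep_hook() {
    case "${1:-}" in
        pre)  "${MODPROBE:-modprobe}" -r btusb ;;   # unload before sleeping
        post) "${MODPROBE:-modprobe}" btusb ;;      # reload after waking
    esac
}

# Run the hook only when systemd-sleep supplies its two arguments.
if [ "$#" -ge 2 ]; then
    btusb_sleep_hook "$1" "$2"
fi
```

If resume works reliably with the hook in place, that would point more firmly at the Bluetooth stack rather than the graphics driver.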

The difference between a successful and a failed resume is the following dmesg lines:

$ diff suspend_success.log suspend_fail.log
--- suspend_success.log	2022-04-20 16:34:00.459569417 +0100
+++ suspend_fail.log	2022-04-20 16:35:02.663803357 +0100
@@ -59,3 +59,39 @@
 [   67.649079] ata7.00: supports DRM functions and may not be fully accessible
 [   67.654679] ata7.00: supports DRM functions and may not be fully accessible
 [   68.589971] xhci_hcd 0000:86:00.2: xHC error in resume, USBSTS 0x401, Reinit
+[   93.668390] hid-generic 0005:004C:0269.0003: unknown main item tag 0x0
+[   93.672357] hid_magicmouse: unknown parameter 'scroll_delay_pos_x' ignored
+[   93.672365] hid_magicmouse: unknown parameter 'scroll_delay_pos_y' ignored
+[   93.743597] magicmouse 0005:004C:0269.0003: unknown main item tag 0x0
+[   94.212286] hid-generic 0005:004C:0269.0003: unknown main item tag 0x0
+[   96.263431] hid_magicmouse: unknown parameter 'scroll_delay_pos_x' ignored
+[   96.263439] hid_magicmouse: unknown parameter 'scroll_delay_pos_y' ignored
+[   96.363470] magicmouse 0005:004C:0269.0003: unknown main item tag 0x0
+[  114.974837] Call Trace:
+[  114.974842]  <TASK>
+[  114.974843]  restore_processor_state+0x273/0x2e0
+[  114.974855]  x86_acpi_suspend_lowlevel+0x11a/0x160
+[  114.974862]  acpi_suspend_enter+0x53/0x1f0
+[  114.974867]  suspend_devices_and_enter+0x6ee/0x7d0
+[  114.974873]  pm_suspend.cold+0x2fb/0x342
+[  114.974879]  state_store+0x71/0xd0
+[  114.974884]  kernfs_fop_write_iter+0x11c/0x1b0
+[  114.974891]  new_sync_write+0x15c/0x1f0
+[  114.974897]  vfs_write+0x1eb/0x280
+[  114.974900]  ksys_write+0x67/0xe0
+[  114.974903]  do_syscall_64+0x5c/0x80
+[  114.974910]  entry_SYSCALL_64_after_hwframe+0x44/0xae
+[  114.974915] RIP: 0033:0x7f628fa9a257
+[  114.974920] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
+[  114.974922] RSP: 002b:00007ffec5d30ac8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
+[  114.974925] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f628fa9a257
+[  114.974927] RDX: 0000000000000004 RSI: 00007ffec5d30bb0 RDI: 0000000000000004
+[  114.974928] RBP: 00007ffec5d30bb0 R08: 00005638c6201200 R09: 0000000000000000
+[  114.974929] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
+[  114.974931] R13: 00005638c61fd3c0 R14: 0000000000000004 R15: 00007f628fb937a0
+[  114.974933]  </TASK>
+[  115.174776] powercap intel-rapl:1: PM: parent intel-rapl should not be sleeping
+[  116.270453] xhci_hcd 0000:00:14.0: xHC error in resume, USBSTS 0x411, Reinit
+[  116.645506] ata7.00: supports DRM functions and may not be fully accessible
+[  116.651128] ata7.00: supports DRM functions and may not be fully accessible
+[  117.592982] xhci_hcd 0000:86:00.2: xHC error in resume, USBSTS 0x401, Reinit

My current workaround is to ssh to the machine and call the following, which brings the display back.

    sudo systemctl restart display-manager

Also, I don't think I have any graphics driver issues, but I am on NVIDIA.

[Edit] Probably relevant patches https://lore.kernel.org/lkml/413ce7e5-1 … tel.com/t/

Last edited by abbaswasim (2022-04-20 16:01:09)

#9 2023-12-21 16:43:21

0d201fa73
Member
Registered: 2023-12-21
Posts: 1

Re: [SOLVED] Resume from Suspend fails

The only workaround I have found so far is to set up a keyboard shortcut for hibernate. Hibernating seems to let something in the hardware or driver reset properly.

Last edited by 0d201fa73 (2023-12-27 22:45:10)
