Hi,
Recently, the system sometimes (but not always) fails to deactivate swap on shutdown. That means waiting out a 90-second timeout (shutdown usually takes less than 10 seconds, though I haven't timed it), but more importantly, the computer does not power off. This is especially annoying on a laptop: I need to be able to shut down, close the lid, and put it in the bag without it wasting battery and/or overheating until it shuts itself down.
I'm not 100% sure that the failure to power off is a real issue, though. I normally use Plymouth and need to press ESC to see what is going on during shutdown. Perhaps it's by design (of Plymouth or the kernel): if the user has asked to see the kernel messages, it holds its horses on powering off.
This is on btrfs, and the whole disk is encrypted. Normally, I power off via KDE Plasma menus.
I'm not sure how to troubleshoot further. Here are the journal entries from a previous shutdown attempt:
Oct 07 16:27:35 catonthemove systemd[1]: Stopped Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling.
Oct 07 16:27:35 catonthemove systemd[1]: Reached target System Shutdown.
Oct 07 16:29:05 catonthemove systemd[1]: dev-mapper-cat\x2d\x2dvg\x2dswap.swap: Deactivation timed out. Stopping.
Oct 07 16:29:05 catonthemove systemd[1]: dev-mapper-cat\x2d\x2dvg\x2dswap.swap: Swap process exited, code=killed, status=15/TERM
Oct 07 16:29:05 catonthemove systemd[1]: Failed deactivating swap /dev/mapper/cat--vg-swap.
Oct 07 16:29:05 catonthemove systemd[1]: Failed deactivating swap /dev/dm-2.
Oct 07 16:29:05 catonthemove systemd[1]: Failed deactivating swap /dev/cat-vg/swap.
Oct 07 16:29:05 catonthemove systemd[1]: Failed deactivating swap /dev/disk/by-id/dm-name-cat--vg-swap.
Oct 07 16:29:05 catonthemove systemd[1]: Failed deactivating swap /dev/disk/by-id/dm-uuid-LVM-IQk69rKPbyfqqyVWyJsmUllKb8qoKAOEoPVzkTGu1xChBIZ3XSK23w9dpKWzLfZU.
Oct 07 16:29:05 catonthemove systemd[1]: Failed deactivating swap /dev/disk/by-uuid/e12c5a05-a085-4ccc-bbf1-8cd5f6511d99.
Oct 07 16:29:05 catonthemove systemd[1]: Failed deactivating swap /dev/disk/by-diskseq/4.
Oct 07 16:29:05 catonthemove systemd[1]: Reached target Unmount All Filesystems.
Oct 07 16:29:05 catonthemove systemd[1]: Reached target Late Shutdown Services.
Oct 07 16:29:05 catonthemove systemd[1]: systemd-poweroff.service: Deactivated successfully.
Oct 07 16:29:05 catonthemove systemd[1]: Finished System Power Off.
Oct 07 16:29:05 catonthemove systemd[1]: Reached target System Power Off.
Oct 07 16:29:05 catonthemove systemd[1]: Shutting down.
Oct 07 16:29:05 catonthemove systemd-shutdown[1]: Syncing filesystems and block devices.
Oct 07 16:29:05 catonthemove systemd-shutdown[1]: Sending SIGTERM to remaining processes...
Oct 07 16:29:05 catonthemove systemd-journald[585]: Received SIGTERM from PID 1 (systemd-shutdow).
Oct 07 16:29:05 catonthemove systemd-udevd[638]: Failed to remove file descriptor "config-serialization" from the store, ignoring: Connection refused
Oct 07 16:29:05 catonthemove systemd-udevd[638]: Failed to push serialization fd to service manager: Connection refused
Oct 07 16:29:05 catonthemove systemd-journald[585]: Journal stopped
Cheers!
Is the power-off failure strictly tied to "sometimes (but not always) the system does not deactivate swap" or does it also not power off when the swap can be deactivated?
Do you also use zswap or a zram swap device?
Deactivating swap means reading the swapped-out memory back into RAM. Any chance you're (still) low on that (FS cache) when this happens?
For reference, this seems to be more common:
https://github.com/systemd/systemd/issues/38167
https://discussion.fedoraproject.org/t/ … wap/161585
No, I have not noticed any shutdown failures when swap has been correctly deactivated. I've only noticed the computer is slow to shut down and when I press ESC, it's waiting for the timeout (for being unable to deactivate swap). This is 100% of the failed cases.
EDIT: By "sometimes (but not always)" I mean that sometimes the laptop does shut down correctly.
I have not noticed any abnormal RAM usage (I could check this fairly easily with some monitoring, but I'd presume I would have noticed other symptoms). I cannot see why I would be low on FS cache at shutdown, since all processes should have been terminated (the X.Org session is, and that should include practically everything on the laptop that uses a significant amount of RAM).
EDIT: Possibly of interest: I also use hybrid suspend to disk/RAM quite often, i.e. I just close the lid. It hibernates correctly and powers off (save for RAM). I believe that almost certainly rules out RAM usage as the cause, though I'm not 100% sure: a stray process eating RAM and swap could (by chance?) have shown up only on shutdown, which would explain why the kernel cannot let go of swap. But I'd presume the log would say something about that process specifically if that were the case.
I do not use zswap nor zram.
Last edited by Wild Penguin (2025-10-07 15:07:55)
suspending/hibernating won't swapoff, or does it in your setup?
The systemd bug suggests this happens w/ 6.15 but stops w/ 6.16; what kernel do you currently boot?
How frequent is this?
Can you get into the habit of logging out of your session and shutting down from the DM, or even taking that down too (isolate multi-user.target) before the shutdown, and see whether you can still somehow trigger this?
suspending/hibernating won't swapoff, or does it in your setup?
The systemd bug suggests this to happen w/ 6.15 but stops w/ 6.16, what kernel do you currently boot?
How frequent is this?
It does in this case. The swap is in its own LVM logical volume (LV). Hibernate goes into the swap.
I'm having this on 6.16 (zen branch).
I'm not sure about the frequency, and IIRC the swap deactivation failures are sometimes not written to the log at all! So it's a bit difficult to determine the frequency after the fact. But it happens often enough to be a nuisance: I've stopped shutting down the computer and always hibernate instead (despite not actually needing to).
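For anyone who wants a rough number anyway, the journal of a given boot can be searched for the timeout message shown in the first post. This is only a sketch: the helper name `count_swap_timeouts` is made up for illustration, and since the failures are apparently not always logged, the count is at best a lower bound.

```shell
# Count swap deactivation timeouts in journal text read from stdin.
# The pattern matches lines such as
#   "...swap.swap: Deactivation timed out. Stopping."
count_swap_timeouts() {
    # grep -c prints 0 but exits non-zero when nothing matches;
    # "|| true" keeps the helper usable under "set -e".
    grep -c '\.swap: Deactivation timed out' || true
}

# Usage, for the previous boot:
#   journalctl -b -1 --no-pager | count_swap_timeouts
```

Running it for a few earlier boots (`-b -2`, `-b -3`, ...) gives a feel for how often the timeout was hit.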
$ sudo lsblk && sudo blkid
[sudo] password for ville:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
nvme0n1 259:0 0 476,9G 0 disk
├─nvme0n1p1 259:1 0 1,5G 0 part /efi
└─nvme0n1p3 259:2 0 475,4G 0 part
└─cryptlvm 253:0 0 475,4G 0 crypt
├─cat--vg-ArchRoot 253:1 0 459,4G 0 lvm /
└─cat--vg-swap 253:2 0 16G 0 lvm [SWAP]
/dev/mapper/cat--vg-ArchRoot: UUID="144f9968-3614-4e60-a832-b3b5b7da8bc0" UUID_SUB="eef9b1b6-a5bc-4d2c-ba1d-fd62c97f945d" BLOCK_SIZE="4096" TYPE="btrfs"
/dev/nvme0n1p3: UUID="57c23c8c-e374-423c-b71f-34fb4c679950" TYPE="crypto_LUKS" PARTUUID="36ae2f33-66b0-4abf-976d-e03bc1b25df2"
/dev/nvme0n1p1: UUID="C8AA-6F33" BLOCK_SIZE="512" TYPE="vfat" PARTUUID="8db8856f-1dd3-404f-b822-945b2cbac19f"
/dev/mapper/cat--vg-swap: UUID="e12c5a05-a085-4ccc-bbf1-8cd5f6511d99" TYPE="swap"
/dev/mapper/cryptlvm: UUID="ZHiZfH-Z4Se-ed4T-TMxZ-gWff-EkBr-VR0Ihk" TYPE="LVM2_member"
/etc/fstab:
# Static information about the filesystems.
# See fstab(5) for details.
# <file system> <dir> <type> <options> <dump> <pass>
# /dev/mapper/ubuntu--vg-ArchRoot
/dev/mapper/cat--vg-ArchRoot / btrfs rw,relatime,ssd,space_cache=v2,subvolid=5,subvol=/ 0 0
# /dev/nvme0n1p1
/dev/nvme0n1p1 /efi vfat rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro 0 2
# /dev/mapper/ubuntu--vg-ubuntu--lv
# /dev/mapper/cat--vg-ubuntu--lv /ubunturoot ext4 rw,relatime,noauto 0 2
# /dev/mapper/ubuntu--vg-swap
/dev/mapper/cat--vg-swap none swap defaults 0 0
(EDIT: Never mind the references to Ubuntu in fstab. I had Ubuntu previously, and its installer created the LVM setup, which I, being a bit lazy, just decided to repurpose for Arch.)
/proc/cmdline:
cryptdevice=UUID=57c23c8c-e374-423c-b71f-34fb4c679950:cryptlvm splash resume=UUID=e12c5a05-a085-4ccc-bbf1-8cd5f6511d99 root=UUID=144f9968-3614-4e60-a832-b3b5b7da8bc0 rw
Last edited by Wild Penguin (2025-10-07 15:47:02)
You're aware that you could hibernate into a swapfile (that's not actually being used as swap)?
Out of curiosity: there have previously been complaints about swap in LVM being dead-slow, do you experience something like that?
Other than that: keep trying to shut down from the DM or the multi-user.target and let's see whether this has any relevance, I guess…
I've been experiencing exactly the same problem for a couple of weeks now. I don't use LVM though, just regular partitions:
/dev/nvme0n1p1 4096 618495 614400 300M EFI System
/dev/nvme0n1p2 618496 3766343679 3765725184 1,8T Linux filesystem
/dev/nvme0n1p3 3766343680 3907028991 140685312 67,1G Linux swap
The problem is that I don't know where to begin looking for clues on how/why this is happening.
In my case, I'd say I get those errors 8 out of 10 times, and in 5 or 6 of those 8 the system won't power off/restart, so I have to press the power button to switch it off.
Last edited by superlinuxero (2025-10-10 13:25:09)
After doing some tests, I think I've found a workaround that at least works for me. I created a systemd service that turns off swap before shutdown.
Create a file /etc/systemd/system/swapoff-before-shutdown.service with the following contents:
[Unit]
Description=swapoff partition before shutdown
[Service]
Type=oneshot
RemainAfterExit=true
ExecStop=/usr/bin/swapoff /dev/nvme0n1p3
TimeoutSec=infinity
[Install]
WantedBy=multi-user.target
Replace nvme0n1p3 with your own swap partition, then run:
sudo systemctl daemon-reload
sudo systemctl enable swapoff-before-shutdown.service
Hope it works for you
Hi,
Sorry for not coming back to this - I've been busy with some actual work. I'm still having the issue.
I haven't tried superlinuxero's workaround yet. However, I've made some progress, and I believe this is a kernel memory leak triggered by ... something, some corner case (I suppose if it were common, we'd have more replies???).
FWIW IIRC I **did** try to deactivate swap manually a few times, when I noticed that 10GiB+ (!!!) of swap was in use. This is not normal on this laptop (it's used for typical coding / browsing / office work on the go), and the swap usage does not go away even if I close (all) applications.
On closer inspection I've noticed something peculiar in the kernel logs, starting around the same time this problem started (end of September). I believe I could pinpoint the exact kernel version, if this is a kernel bug.
From September 23rd onwards, I'm sometimes getting this in the logs (it seems to hit all kinds of random processes):
Oct 01 16:51:13 catonthemove kernel: kworker/u64:22: page allocation failure: order:0, mode:0x100c02(GFP_NOIO|__GFP_HIGHMEM|__GFP_HARDWALL), nodemask=(null),cpuset=/,mems_allowed=0
Oct 01 16:48:36 catonthemove bluetoothd[804]: Battery Provider Manager destroyed
Oct 01 16:51:13 catonthemove kernel: CPU: 14 UID: 0 PID: 31035 Comm: kworker/u64:22 Tainted: G W 6.16.8-zen3-1-zen #1 PREEMPT(full) 3d19e534e753f1537075da4e4d2c167804c539e3
Oct 01 16:51:13 catonthemove bluetoothd[804]: Battery Provider Manager created
Oct 01 16:51:13 catonthemove kernel: Tainted: [W]=WARN
Oct 01 16:51:13 catonthemove kernel: Hardware name: LENOVO 21A0CTO1WW/21A0CTO1WW, BIOS R1MET61W (1.31 ) 03/31/2025
Oct 01 16:51:13 catonthemove kernel: Workqueue: async async_run_entry_fn
Oct 01 16:51:13 catonthemove kernel: Call Trace:
Oct 01 16:51:13 catonthemove kernel: <TASK>
Oct 01 16:51:13 catonthemove kernel: dump_stack_lvl+0x5d/0x80
Oct 01 16:51:13 catonthemove kernel: warn_alloc+0x163/0x190
Oct 01 16:51:13 catonthemove kernel: __alloc_frozen_pages_noprof+0xa8d/0x1130
Oct 01 16:51:13 catonthemove kernel: __alloc_pages_noprof+0xe/0x20
Oct 01 16:51:13 catonthemove systemd-sleep[31017]: System returned from sleep operation 'hybrid-sleep'.
Oct 01 16:51:13 catonthemove kernel: __ttm_pool_alloc+0x57a/0xb00 [ttm 991a864e9b4961f0e04d509e24511718f6e2ca2a]
Oct 01 16:51:13 catonthemove kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 01 16:51:13 catonthemove kernel: ttm_pool_alloc+0x7d/0xb0 [ttm 991a864e9b4961f0e04d509e24511718f6e2ca2a]
Oct 01 16:51:13 catonthemove kernel: amdgpu_ttm_tt_populate+0x86/0xd0 [amdgpu 523c3db77b50464b5e9186dd5086a52254b5c2eb]
Oct 01 16:51:13 catonthemove kernel: ttm_tt_populate+0xa1/0x190 [ttm 991a864e9b4961f0e04d509e24511718f6e2ca2a]
Oct 01 16:51:13 catonthemove kernel: ttm_bo_populate+0x32/0xb0 [ttm 991a864e9b4961f0e04d509e24511718f6e2ca2a]
Oct 01 16:51:13 catonthemove kernel: ttm_bo_handle_move_mem+0x192/0x1a0 [ttm 991a864e9b4961f0e04d509e24511718f6e2ca2a]
Oct 01 16:51:13 catonthemove kernel: ttm_bo_evict+0xc0/0x210 [ttm 991a864e9b4961f0e04d509e24511718f6e2ca2a]
Oct 01 16:51:13 catonthemove kernel: ttm_bo_evict_first+0x26d/0x2c0 [ttm 991a864e9b4961f0e04d509e24511718f6e2ca2a]
Oct 01 16:51:13 catonthemove kernel: ttm_resource_manager_evict_all+0x4c/0x160 [ttm 991a864e9b4961f0e04d509e24511718f6e2ca2a]
Oct 01 16:51:13 catonthemove kernel: amdgpu_device_suspend+0xf5/0x180 [amdgpu 523c3db77b50464b5e9186dd5086a52254b5c2eb]
Oct 01 16:51:13 catonthemove kernel: amdgpu_pmops_freeze+0x1f/0x70 [amdgpu 523c3db77b50464b5e9186dd5086a52254b5c2eb]
Oct 01 16:51:13 catonthemove kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 01 16:51:13 catonthemove kernel: pci_pm_freeze+0xc5/0x130
Oct 01 16:51:13 catonthemove kernel: ? __pfx_pci_pm_freeze+0x10/0x10
Oct 01 16:51:13 catonthemove kernel: dpm_run_callback+0x41/0x1e0
Oct 01 16:51:13 catonthemove kernel: ? pm_runtime_barrier+0x55/0x90
Oct 01 16:51:13 catonthemove kernel: device_suspend+0x455/0x9e0
Oct 01 16:51:13 catonthemove kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 01 16:51:13 catonthemove kernel: async_suspend+0x21/0x30
Oct 01 16:51:13 catonthemove kernel: async_run_entry_fn+0x36/0x140
Oct 01 16:51:13 catonthemove kernel: process_one_work+0x193/0x350
Oct 01 16:51:13 catonthemove kernel: worker_thread+0x254/0x3a0
Oct 01 16:51:13 catonthemove kernel: ? __pfx_worker_thread+0x10/0x10
Oct 01 16:51:13 catonthemove kernel: kthread+0xfc/0x240
Oct 01 16:51:13 catonthemove kernel: ? __pfx_kthread+0x10/0x10
Oct 01 16:51:13 catonthemove kernel: ? __pfx_kthread+0x10/0x10
Oct 01 16:51:13 catonthemove kernel: ret_from_fork+0x1c4/0x1f0
Oct 01 16:51:13 catonthemove kernel: ? __pfx_kthread+0x10/0x10
Oct 01 16:51:13 catonthemove kernel: ret_from_fork_asm+0x1a/0x30
Oct 01 16:51:13 catonthemove kernel: </TASK>
Oct 01 16:51:13 catonthemove kernel: Mem-Info:
Oct 01 16:51:13 catonthemove kernel: active_anon:90490 inactive_anon:574195 isolated_anon:0
active_file:46761 inactive_file:275790 isolated_file:0
unevictable:14 dirty:0 writeback:0
slab_reclaimable:46705 slab_unreclaimable:59522
mapped:134627 shmem:16113 pagetables:14020
sec_pagetables:953 bounce:0
kernel_misc_reclaimable:0
free:28657 free_pcp:215 free_cma:0
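A note for reading the Mem-Info block above: its counters are page counts, not kB. Assuming the usual 4 KiB page size on x86_64, they can be converted with a small helper (the name `pages_to_mib` is made up for this sketch):

```shell
# Convert a page count from the kernel's Mem-Info dump to MiB.
# $1: number of pages
# $2: page size in KiB (assumption: 4, the common x86_64 default)
pages_to_mib() {
    echo $(( $1 * ${2:-4} / 1024 ))
}

# Example with the "free:" counter from the dump above:
#   pages_to_mib 28657    # prints 111
```

So under that assumption, the allocation failed with roughly 111 MiB reported as free.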
Also, I get a lot of this:
Oct 10 15:39:28 catonthemove kernel: pagefault_out_of_memory: 334 callbacks suppressed
Oct 10 15:39:28 catonthemove kernel: Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
Oct 10 15:39:28 catonthemove kernel: Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
Oct 10 15:39:28 catonthemove kernel: Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
Oct 10 15:39:28 catonthemove kernel: Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
Oct 10 15:39:28 catonthemove kernel: Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
Oct 10 15:39:28 catonthemove kernel: Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
Oct 10 15:39:28 catonthemove kernel: Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
Oct 10 15:39:28 catonthemove kernel: Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
Oct 10 15:39:28 catonthemove kernel: Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
Oct 10 15:39:28 catonthemove kernel: Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
superlinuxero, have you noticed any weird memory usage patterns (or errors in the logs)?
I believe these log entries are only symptoms. They don't pinpoint what is leaking memory or why; they only tell me I'm running out of memory (and I didn't spot anything pointing to the culprit in the logs - I will upload logs later, maybe someone else spots something...).
Curiously, I have a desktop computer, also with Arch, and it does not have this kind of problem (using the same -zen branch kernel...). EDIT: The desktop uses a swapfile on btrfs (in a subvolume), no LVM, no swap partition(s), and is set up to be able to hibernate to the swapfile.
And as for why hibernation works: I suspect it falls back to plain suspend-to-RAM if it cannot free the swap. I just haven't run out of battery, so I haven't noticed anything and have had no data loss (hmmm... I suppose this should be reflected in the logs somehow?).
As another workaround: now that I know about this behavior, I only shut down when needed (perhaps a bit paradoxically, hibernation is currently the safer option and more certain to work). I sync first, close everything, try to disable swap (which will most probably fail in my case), and then power off manually with SysRq (REISUO) at or after the systemd timeout. I suppose just pressing the power button would amount to exactly the same, since systemd should have unmounted or remounted everything read-only by that point anyway.
Last edited by Wild Penguin (2025-10-15 13:18:19)
You're aware that you could hibernate into a swapfile (that's not actually being used as swap)?
I could; however, this setup worked just fine previously. It would be a workaround, but there is certainly some regression at play here.
Out of curiosity: there have previously been complaints about swap in LVM being dead-slow, do you experience something like that?
None whatsoever, albeit the system normally does not use that much swap. It's mainly there for hibernation (and "just in case" I need it) ... the absence of slowdowns is, perhaps, even more evident now that I've actually run into quite heavy swap usage (see my previous reply). The system seems responsive.
Last edited by Wild Penguin (2025-10-15 13:21:14)
AMD is OTR for "leaking" RAM into GART/GTT though should™ surrender that on demand.
FWIW IIRC I **did** try to deactivate swap manually, a few times, when I noticed there are 10GiB+ (!!!) of swap used.
When you witness that next time:
cat /proc/meminfo
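The full output is long, so for posting here it may help to pull out just the swap-related fields. A small sketch (the helper name `swap_fields` is invented; the field names are the ones /proc/meminfo prints):

```shell
# Print only the memory/swap fields relevant to this thread.
# $1: meminfo-style file to read (defaults to /proc/meminfo)
swap_fields() {
    grep -E '^(MemFree|MemAvailable|SwapCached|SwapTotal|SwapFree|Zswap|Zswapped):' \
        "${1:-/proc/meminfo}"
}

# Usage:
#   swap_fields
```

The Zswap and Zswapped lines only show up on kernels built with zswap support.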
There's ~1GB in use but only 27MB "free" in the meminfo from the allocation failure…
If this is a regression, do you experience this w/ an earlier or the LTS kernel?
Thanks Seths,
I could indeed check whether it goes away on the LTS kernel, or on an earlier regular / LTS one...
Digging further into the logs, the last known good kernel seems to be:
Sep 12 12:12:31 catonthemove kernel: Linux version 6.16.5-zen1-1-zen (linux-zen@archlinux) (gcc (GCC) 15.2.1 20250813, GNU ld (GNU Binutils) 2.45.0) #1 ZEN SMP PREEMPT_DYNAMIC Thu, 04 Sep 2025 23:17:59 +0000
And the first bad one:
Sep 25 17:03:56 catonthemove kernel: Linux version 6.16.8-zen3-1-zen (linux-zen@archlinux) (gcc (GCC) 15.2.1 20250813, GNU ld (GNU Binutils) 2.45.0) #1 ZEN SMP PREEMPT_DYNAMIC Mon, 22 Sep 2025 22:08:18 +0000
Also, this started after that Kernel version (up to and including 6.16.10):
Oct 15 16:27:24 catonthemove kernel: ------------[ cut here ]------------
Oct 15 16:27:24 catonthemove kernel: WARNING: CPU: 3 PID: 3803 at kernel/power/main.c:47 pm_restrict_gfp_mask+0x48/0x50
Oct 15 16:27:24 catonthemove kernel: Modules linked in: ccm rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device uhid cmac algif_hash algif_skcipher af_alg joydev mousedev bnep nls_iso8859_1 snd_acp_legacy_mach vfat snd_acp_mach fat snd_soc_nau8821 snd_soc_dmic snd_acp3x_rn snd_acp3x_pdm_dma snd_sof_amd_acp70 snd_sof_amd_acp63 snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_pci_ps snd_soc_acpi_amd_match snd_amd_sdw_acpi soundwire_amd soundwire_generic_allocation soundwire_bus snd_ctl_led snd_soc_sdca amd_atl intel_rapl_msr snd_hda_codec_realtek intel_rapl_common snd_soc_core snd_hda_codec_generic mt7921e snd_hda_scodec_component snd_compress mt7921_common snd_hda_codec_hdmi ac97_bus uvcvideo mt792x_lib btusb snd_pcm_dmaengine videobuf2_vmalloc snd_rpl_pci_acp6x btrtl mt76_connac_lib uvc snd_hda_intel snd_acp_pci btintel videobuf2_memops snd_amd_acpi_mach snd_intel_dspcfg mt76 snd_intel_sdw_acpi videobuf2_v4l2 btbcm snd_acp_legacy_common kvm_amd mac80211
Oct 15 16:27:24 catonthemove kernel: r8169 snd_hda_codec ucsi_acpi videobuf2_common btmtk snd_pci_acp6x kvm snd_pci_acp5x typec_ucsi libarc4 realtek videodev snd_hda_core irqbypass mdio_devres typec bluetooth snd_rn_pci_acp3x snd_hwdep mc rapl roles snd_acp_config pcspkr think_lmi snd_pcm sp5100_tco libphy snd_soc_acpi cfg80211 firmware_attributes_class wmi_bmof thunderbolt psmouse nxp_nci_i2c snd_timer snd_pci_acp3x i2c_piix4 mdio_bus nxp_nci k10temp i2c_smbus nci nfc i2c_scmi mac_hid uinput i2c_dev crypto_user ntsync loop nfnetlink ip_tables x_tables dm_crypt encrypted_keys trusted asn1_encoder tee dm_mod amdgpu amdxcp i2c_algo_bit thinkpad_acpi drm_ttm_helper sparse_keymap polyval_clmulni ttm platform_profile ghash_clmulni_intel drm_exec snd sha512_ssse3 gpu_sched rtsx_pci_sdmmc sha1_ssse3 soundcore drm_suballoc_helper mmc_core aesni_intel serio_raw rfkill drm_panel_backlight_quirks nvme drm_buddy video nvme_core drm_display_helper nvme_keyring ccp rtsx_pci cec nvme_auth xhci_pci_renesas wmi
Oct 15 16:27:24 catonthemove kernel: CPU: 3 UID: 0 PID: 3803 Comm: systemd-sleep Not tainted 6.16.10-zen1-1-zen #1 PREEMPT(full) d5775a3d5e17649e0d275aca4c82b129fab7507b
Oct 15 16:27:24 catonthemove kernel: Hardware name: LENOVO 21A0CTO1WW/21A0CTO1WW, BIOS R1MET61W (1.31 ) 03/31/2025
Oct 15 16:27:24 catonthemove kernel: RIP: 0010:pm_restrict_gfp_mask+0x48/0x50
Oct 15 16:27:24 catonthemove kernel: Code: 03 85 c0 75 25 8b 05 37 21 8a 02 89 05 31 71 39 03 24 3f 89 05 29 21 8a 02 e9 3f ea cc ff 0f 0b 8b 05 1c 71 39 03 85 c0 74 db <0f> 0b eb d7 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90
Oct 15 16:27:24 catonthemove kernel: RSP: 0018:ffffccfe8aefbb10 EFLAGS: 00010206
Oct 15 16:27:24 catonthemove kernel: RAX: 0000000001ffffff RBX: 0000020434a099bb RCX: 0000000000000009
Oct 15 16:27:24 catonthemove kernel: RDX: 0000000000000000 RSI: ffff8aaca8415ce8 RDI: ffffffffaf6551a0
Oct 15 16:27:24 catonthemove kernel: RBP: 0000000000000003 R08: ffff8aac0a75b028 R09: ffff8aac1168fb78
Oct 15 16:27:24 catonthemove kernel: R10: ffff8aac1168fb18 R11: 0000000000000000 R12: 0000000000000002
Oct 15 16:27:24 catonthemove kernel: R13: fffffffffffffff2 R14: 0000000000000000 R15: 0000000000000000
Oct 15 16:27:24 catonthemove kernel: FS: 00007f54c2aca880(0000) GS:ffff8aae71b99000(0000) knlGS:0000000000000000
Oct 15 16:27:24 catonthemove kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 15 16:27:24 catonthemove kernel: CR2: 0000000000000000 CR3: 0000000117a7a000 CR4: 0000000000f50ef0
Oct 15 16:27:24 catonthemove kernel: PKRU: 55555554
Oct 15 16:27:24 catonthemove kernel: Call Trace:
Oct 15 16:27:24 catonthemove kernel: <TASK>
Oct 15 16:27:24 catonthemove kernel: dpm_suspend_start+0x7b/0x120
Oct 15 16:27:24 catonthemove kernel: suspend_devices_and_enter+0x15a/0x890
Oct 15 16:27:24 catonthemove kernel: ? memory_bm_next_pfn+0x3a/0xd0
Oct 15 16:27:24 catonthemove kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 15 16:27:24 catonthemove kernel: hibernate.cold+0x3c2/0x477
Oct 15 16:27:24 catonthemove kernel: state_store+0x102/0x150
Oct 15 16:27:24 catonthemove kernel: kernfs_fop_write_iter+0x149/0x200
Oct 15 16:27:24 catonthemove kernel: vfs_write+0x32c/0x4f0
Oct 15 16:27:24 catonthemove kernel: __x64_sys_write+0x70/0xe0
Oct 15 16:27:24 catonthemove kernel: do_syscall_64+0x81/0x970
Oct 15 16:27:24 catonthemove kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 15 16:27:24 catonthemove kernel: ? do_syscall_64+0x81/0x970
Oct 15 16:27:24 catonthemove kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 15 16:27:24 catonthemove kernel: ? do_syscall_64+0x81/0x970
Oct 15 16:27:24 catonthemove kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 15 16:27:24 catonthemove kernel: ? exc_page_fault+0x7e/0x180
Oct 15 16:27:24 catonthemove kernel: entry_SYSCALL_64_after_hwframe+0x76/0x7e
Oct 15 16:27:24 catonthemove kernel: RIP: 0033:0x7f54c22931ce
Oct 15 16:27:24 catonthemove kernel: Code: 4d 89 d8 e8 64 be 00 00 4c 8b 5d f8 41 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 11 c9 c3 0f 1f 80 00 00 00 00 48 8b 45 10 0f 05 <c9> c3 83 e2 39 83 fa 08 75 e7 e8 13 ff ff ff 0f 1f 00 f3 0f 1e fa
Oct 15 16:27:24 catonthemove kernel: RSP: 002b:00007ffe47e1e320 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
Oct 15 16:27:24 catonthemove kernel: RAX: ffffffffffffffda RBX: 000055a8f4199310 RCX: 00007f54c22931ce
Oct 15 16:27:24 catonthemove kernel: RDX: 0000000000000005 RSI: 000055a8f41a4ae0 RDI: 0000000000000007
Oct 15 16:27:24 catonthemove kernel: RBP: 00007ffe47e1e330 R08: 0000000000000000 R09: 0000000000000000
Oct 15 16:27:24 catonthemove kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000005
Oct 15 16:27:24 catonthemove kernel: R13: 0000000000000005 R14: 000055a8f41a4ae0 R15: 00007ffe47e1e4c0
Oct 15 16:27:24 catonthemove kernel: </TASK>
Oct 15 16:27:24 catonthemove kernel: ---[ end trace 0000000000000000 ]---
EDIT: Possibly related, or maybe not; I just found this via search and have yet to digest it: https://gitlab.freedesktop.org/drm/amd/-/issues/4573
Last edited by Wild Penguin (2025-10-15 14:11:04)
I just tried swapoff with this situation:
$ free -h
total used free shared buff/cache available
Mem: 11Gi 2,8Gi 229Mi 59Mi 8,7Gi 8,7Gi
Swap: 15Gi 3,4Gi 12Gi
The swapoff command had still not finished after a few minutes, and free now shows:
$ free -h
total used free shared buff/cache available
Mem: 11Gi 3,3Gi 1,3Gi 86Mi 7,3Gi 8,2Gi
Swap: 1,5Gi 0B -2Gi
I'll try a newer kernel (and perhaps wait for 6.18, maybe the issue goes away...).
Also, shutdown does not finish, just as in the first post...
Last edited by Wild Penguin (2025-10-15 14:37:02)
Swap: 1,5Gi 0B -2Gi
Wtf?
Are you sure about the zswap situation? Journal?
Wtf?
My words exactly!
Are you sure about the zswap situation? Journal?
Well, it seems I was mistaken! Zswap is enabled by default these days (it seems). I had missed this entirely.
(on both of my computers:)
$ grep -r . /sys/module/zswap/parameters/
/sys/module/zswap/parameters/enabled:Y
/sys/module/zswap/parameters/shrinker_enabled:Y
/sys/module/zswap/parameters/max_pool_percent:20
/sys/module/zswap/parameters/compressor:zstd
/sys/module/zswap/parameters/zpool:zsmalloc
/sys/module/zswap/parameters/accept_threshold_percent:90
I'm still scratching my head over why it shows negative free swap (how does that make any sense???).
Zswap is probably not the culprit here, however. Probably, as in: I did not even know I was using it, and have not touched it (or its configuration).
Disable zswap, re-trigger the problem.
I'm betting against your probability - zswap can completely blow up in your face if the swapped memory is sufficiently compressible.
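For the test, zswap can be switched off at runtime through the same sysfs parameter that the grep above lists. A minimal sketch, with the helper name `set_zswap_enabled` made up here; the optional second argument exists only so the function can be tried against a scratch file instead of the real parameter:

```shell
# Enable (1) or disable (0) zswap at runtime. Needs root for the real path.
# $1: 0 or 1
# $2: parameter file (defaults to the zswap "enabled" sysfs parameter)
set_zswap_enabled() {
    printf '%s\n' "$1" > "${2:-/sys/module/zswap/parameters/enabled}"
}

# Usage (as root):
#   set_zswap_enabled 0
#   cat /sys/module/zswap/parameters/enabled    # should now read N
#
# To keep it off across reboots, add zswap.enabled=0 to the kernel
# command line instead.
```

As far as I know, disabling zswap at runtime only stops new pages from going into the compressed pool; pages already stored there are not written out immediately, so a reboot with `zswap.enabled=0` is probably the cleaner test.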
Hello!
Jumping in here as I am facing a similar, if not the same, issue.
I also have negative values in my free:
$ free -h
total used free shared buff/cache available
Mem: 31Gi 12Gi 8,6Gi 1,8Gi 11Gi 18Gi
Swap: 31Gi 0B -13Gi
And my zswap parameters are also at the default:
$ grep -r . /sys/module/zswap/parameters/
/sys/module/zswap/parameters/enabled:Y
/sys/module/zswap/parameters/shrinker_enabled:Y
/sys/module/zswap/parameters/max_pool_percent:20
/sys/module/zswap/parameters/compressor:zstd
/sys/module/zswap/parameters/zpool:zsmalloc
/sys/module/zswap/parameters/accept_threshold_percent:90
My zswap debug seems fine as all values are on the same order of magnitude as the example on the wiki (except reject_reclaim_fail).
$ sudo grep -r . /sys/kernel/debug/zswap/
/sys/kernel/debug/zswap/stored_pages:177306
/sys/kernel/debug/zswap/pool_total_size:99532800
/sys/kernel/debug/zswap/written_back_pages:387508
/sys/kernel/debug/zswap/decompress_fail:0
/sys/kernel/debug/zswap/reject_compress_poor:0
/sys/kernel/debug/zswap/reject_compress_fail:112926
/sys/kernel/debug/zswap/reject_kmemcache_fail:0
/sys/kernel/debug/zswap/reject_alloc_fail:0
/sys/kernel/debug/zswap/reject_reclaim_fail:58855
/sys/kernel/debug/zswap/pool_limit_hit:0
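One number that can be derived from these counters is the effective compression ratio: uncompressed bytes stored (stored_pages times the page size) divided by pool_total_size. A rough sketch, assuming 4096-byte pages; the helper name `zswap_ratio` is invented for illustration:

```shell
# Print the zswap compression ratio with two decimals.
# $1: stored_pages
# $2: pool_total_size in bytes
# $3: page size in bytes (assumption: 4096)
zswap_ratio() {
    if [ "$2" -eq 0 ]; then
        echo "n/a"          # empty pool, nothing to divide by
        return 0
    fi
    # Integer arithmetic only: ratio scaled by 100, truncated
    zr=$(( $1 * ${3:-4096} * 100 / $2 ))
    printf '%d.%02d\n' $(( zr / 100 )) $(( zr % 100 ))
}

# With the values above:
#   zswap_ratio 177306 99532800    # prints 7.29
```

So roughly 7:1 here, i.e. the swapped-out memory is quite compressible.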
BUT I also checked meminfo as suggested above, and this is where I am at a loss for words...
$ cat /proc/meminfo
MemTotal: 32758384 kB
MemFree: 9161376 kB
MemAvailable: 20139120 kB
Buffers: 5796 kB
Cached: 10828016 kB
SwapCached: 4326320 kB
Active: 7523612 kB
Inactive: 14217352 kB
Active(anon): 5474320 kB
Inactive(anon): 5143464 kB
Active(file): 2049292 kB
Inactive(file): 9073888 kB
Unevictable: 1684 kB
Mlocked: 1684 kB
SwapTotal: 33554428 kB
SwapFree: 18446744073695824876 kB
Zswap: 97200 kB
Zswapped: 709216 kB
Dirty: 4508 kB
Writeback: 0 kB
AnonPages: 8190900 kB
Mapped: 1401304 kB
Shmem: 1452672 kB
KReclaimable: 318876 kB
Slab: 754936 kB
SReclaimable: 318876 kB
SUnreclaim: 436060 kB
KernelStack: 46448 kB
PageTables: 121132 kB
SecPageTables: 4528 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 49933620 kB
Committed_AS: 41024284 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 135456 kB
VmallocChunk: 0 kB
Percpu: 27744 kB
HardwareCorrupted: 0 kB
AnonHugePages: 589824 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
FileHugePages: 0 kB
FilePmdMapped: 0 kB
CmaTotal: 0 kB
CmaFree: 0 kB
Unaccepted: 0 kB
Balloon: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB
DirectMap4k: 11656568 kB
DirectMap2M: 21803008 kB
DirectMap1G: 0 kB
Why the hell does it think I have 18 ZETTABYTES of free swap!? (zstd can't be THAT good...)
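For what it's worth, that SwapFree value is what a small negative number looks like when printed as an unsigned 64-bit integer: 2^64 - 18446744073695824876 = 13726740 kB, about 13 GiB, which matches the -13Gi in the free output above. That the kernel's free-swap counter itself went negative is my interpretation, not something confirmed here. A sketch of the conversion in plain POSIX shell (the helper name `unwrap_kb` is made up; it only handles 20-digit values, which is what a wrapped small negative number looks like):

```shell
# 2^64 = 18446744073709551616, split into the leading digits and the
# last 10 digits so that every intermediate result fits into a signed
# 64-bit shell integer.
TWO64_HI=1844674407
TWO64_LO=3709551616

# Reinterpret a kB counter that wrapped around 2^64 as a negative number.
# $1: decimal value as printed in /proc/meminfo
unwrap_kb() {
    uw_v=$1
    if [ "${#uw_v}" -lt 20 ]; then
        # Fewer than 20 digits: not a wrapped small negative, print as is.
        printf '%s\n' "$uw_v"
        return 0
    fi
    uw_hi=${uw_v%??????????}            # everything but the last 10 digits
    uw_lo=${uw_v#"$uw_hi"}              # the last 10 digits
    uw_lo=${uw_lo#"${uw_lo%%[!0]*}"}    # strip leading zeros (avoid octal)
    uw_lo=${uw_lo:-0}
    uw_deficit=$(( (TWO64_HI - uw_hi) * 10000000000 + (TWO64_LO - uw_lo) ))
    printf '%s\n' "-$uw_deficit"
}

# Usage:
#   unwrap_kb 18446744073695824876    # prints -13726740
```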
This system somehow suspends (or rather hybrid sleeps) just fine. But when trying to shut down fully it hangs and needs to be hard stopped.
The journal of the last boot then shows it failed to unmount /var/cache (a separate btrfs volume on my system) and got stuck:
09:36:56.601 UTC init.scope user.slice: Consumed 5h 38min 34.913s CPU time, 26G memory peak, 7.4G memory swap peak.
09:36:56.668 UTC init.scope var-cache.mount: Mount process exited, code=exited, status=32/n/a
09:36:56.669 UTC init.scope Failed unmounting /var/cache.
Does zswap keep pages here that fail to write back? (the "buff/cache" column in $ free -h seems to indicate so)
I wanted to try disabling zswap as suggested, but now I'd like some help in finding the root cause of this first.
Zswap must have been running fine for over a year now, with these issues only manifesting around the time of kernel 6.16.8 or so. (Just a time-frame, not putting blame on the kernel version)
I'm going to keep this system running for now if anyone has any suggestions on investigating where this went wrong.
zstd can't be THAT good...
https://en.wikipedia.org/wiki/Zip_bomb but I assume something's off w/ the accounting and the "overfreed" (z)swap gets accumulated (so you're swapping out 1GB, which is only counted as 100MB in zswap but then discounted as 1GB - which would be bad enough)
09:36:56.669 UTC init.scope Failed unmounting /var/cache.
it failed to unmount /var/cache (a separate btrfs volume on my system)
This isn't at all related to swapoff but
it hangs and needs to be hard stopped
since you probably(?) restarted the failing boot w/ the power button(?) some parts of the journal might have been lost.
Avoid that, https://wiki.archlinux.org/title/Genera … l_messages and monitor the shutdown to see where it really hangs.
Also wait - the systemd default timeout is 90s
This isn't at all related to swapoff but
it hangs and needs to be hard stopped
since you probably(?) restarted the failing boot w/ the power button(?) some parts of the journal might have been lost.
I feel so stupid for not having realized that myself...
Also wait - the systemd default timeout is 90s
Yeah, I gave it 15 minutes just to be safe, but I use Plymouth and didn't hit Esc until after I had waited, and then it didn't do anything.
Good news is simply watching the shutdown without Plymouth in the way revealed the hangup.
It was, to no one's surprise, swapoff.
However after the timeout the system finishes the shutdown process, declares it has reached Power Off, and then doesn't actually power off...
It just sits there on the last framebuffer and does nothing.
So now I'm just confused, because it seems like a separate issue, but they only ever occur together.
The system shuts down and reboots just fine with little or no swap used.
Edit: By the way you were completely right, /var/cache is not related and also fails on otherwise successful shutdowns.
Last edited by namco-4 (Yesterday 11:22:22)
Disable zswap, re-trigger the problem.
[…] zswap can completely blow up in your face if the swapped memory is sufficiently compressible.
https://wiki.archlinux.org/title/Zswap#Toggling_zswap