Hello!
I own an HP 250 G8 notebook (a terrible PoS!).
This system has an i5-1135G7 CPU. Transcend TS250 4TB SSD.
This notebook has a multiboot setup with Windows 11 (but I boot into Windows only about once a year, so the Windows drivers or whatever are not the culprit here!).
My Linux installation is Arch Linux, set up by following the manual installation instructions. I had to install manually because the 'archinstall' script did not work with the btrfs subvolumes I am using.
Now nearly everything is working fine, but there's one very annoying and concerning issue:
Every time I try to hibernate the notebook (or every time it does this automatically as set in the XFCE power management GUI), the system becomes unbootable.
It goes so far that the NVMe drive I am using (Transcend TS250) even disappears completely from the BIOS/UEFI - it just is not there anymore! The computer can't detect it in any way, as if it weren't installed.
I then have to power the notebook off and on, enter the UEFI setup, toggle some arbitrary option there off and on again, save the setup, and repeat all this between 3 and 20 times until the NVMe drive is detected again.
After the drive has finally been detected again, the GRUB installation is broken, so the notebook boots into Windows instead.
So I have to plug in a USB drive with the Arch ISO on it and manually repair GRUB using arch-chroot and grub-install.
I have not encountered any data loss, yet. So I think it's not caused by the SSD itself. Maybe it's a firmware issue of this notebook that's not mitigated by the kernel?
Or some issue with the intel storage drivers?
Please tell me which info you need for finding the reasons.
I hope you can help me fix this problem!
Thank you!
Last edited by Elmario (2023-09-01 14:32:20)
Offline
but I boot into Windows only about once a year
Which is more or less irrelevant, 3rd link below. Mandatory.
Disable it (it's NOT the BIOS setting!) and reboot Windows and Linux twice for voodoo reasons.
The symptoms basically yell that this is the cause.
Offline
Hello!
What do you mean by 'Disable it'?
I tried to remove Windows from UEFI-boot and only have it available in Grub, but could not find how to do so ..
Offline
Did you follow the link? Windows doesn't really shut down, but defaults to hibernation to fake a fast boot process. That's gonna confuse the shit out of your BIOS because you're now running (and even hibernating) two OSes at the same time.
Offline
Ah, you mean FastStartup. I already have this disabled. It's a thing I always do after installing Windows somewhere. I just need to have hibernation work for my archlinux setup.
And: I need to report that things have gotten even worse.
I did a BIOS update from F.52 to F.62 and now the issue occurred two times in a row just by a simple shutdown from my Arch system!
(I will adapt the title.)
Last edited by Elmario (2023-08-31 17:26:41)
Offline
The usual suspects: https://wiki.archlinux.org/title/Solid_ … leshooting
Add
nvme_core.default_ps_max_latency_us=0 iommu=soft
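For anyone following along with GRUB, here is a minimal sketch of how these two parameters could be appended to GRUB_CMDLINE_LINUX_DEFAULT. The helper name add_kernel_params is made up for illustration, and the sketch assumes GNU sed and a single-line, double-quoted GRUB_CMDLINE_LINUX_DEFAULT entry; other boot loaders are configured differently.

```shell
#!/bin/sh
# Hypothetical helper (not part of GRUB): append kernel parameters to the
# GRUB_CMDLINE_LINUX_DEFAULT line of a grub defaults file, skipping any
# parameter that is already present.
add_kernel_params() {
    file=$1
    shift
    for param in "$@"; do
        # Current value between the double quotes
        current=$(sed -n 's/^GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"$/\1/p' "$file")
        case " $current " in
            *" $param "*) ;;  # already there, nothing to do
            *) sed -i "s|^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"\$|GRUB_CMDLINE_LINUX_DEFAULT=\"\1 $param\"|" "$file" ;;
        esac
    done
}

# Possible usage, as root (keep a backup of the file first):
#   cp /etc/default/grub /etc/default/grub.bak
#   add_kernel_params /etc/default/grub nvme_core.default_ps_max_latency_us=0 iommu=soft
#   grub-mkconfig -o /boot/grub/grub.cfg
```

The parameters only take effect after the config has been regenerated and the system rebooted; /proc/cmdline shows what the running kernel was actually booted with.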
Offline
Does this work for Intel hardware? (Sorry, I didn't write the specs above at first, I just edited them in!)
Well, I will try it anyway.
Last edited by Elmario (2023-09-01 14:09:10)
Offline
That works for everything. You won't have an IOMMU, which will break VFIO (passing devices through to a VM), but we first need to see whether we can control the problem at hand by manipulating the NVMe at all.
Offline
You were right!
I actually had 'iommu=intel' in my kernel line. Instead of making it 'soft' I just tried removing the iommu parameter completely, and it's fine. I tried suspend and shutdown, both are working now!
Of course I didn't have this parameter set arbitrarily, as I frequently use VMs and sometimes need to passthrough a thing... and having this removed will probably create some issues there, because otherwise I wouldn't have set it before .. (I think I had to pass through one whole USB root hub before, for making some low level USB-controller firmware flashing tools work from a Windows VM).
Thank you very much!
I'm gonna read some more about the iommu parameters now. I guess 'soft' is a compatibility option that should give the best results in case nothing else is working, right?
Last edited by Elmario (2023-09-01 14:30:31)
Offline
iommu=intel isn't legal to begin with, https://raw.githubusercontent.com/torva … meters.txt
Last edited by seth (2023-09-01 14:32:36)
Offline
Yes, I just noticed I couldn't find it in this document, too. So I had a look at the line I commented out, again: I wrote bogus.
The parameter I just removed was 'intel_iommu=on' instead of what I wrote.
So I will now try replacing it with 'iommu=soft' and add 'nvme_core.default_ps_max_latency_us=0' if needed. I hope that this will be the compromise for having both things working (boot and passthrough).
Last edited by Elmario (2023-09-01 14:50:12)
Offline
OK, it's working fine using 'iommu=soft'.
So that's probably the best option right now. Seems like I can leave the SSD power management enabled.
Now I just have to solve that other problem that has occurred since I updated the notebook's BIOS: there's no sound device anymore. I hope this is not connected to the iommu parameter too ..
Thank you!
Last edited by Elmario (2023-09-01 15:09:39)
Offline
Hello!
I'm sorry to report that the issue is back.
It had been working much better since I changed the kernel line.
But three days ago the issue reappeared. I then added 'nvme_core.default_ps_max_latency_us=0' in addition to 'iommu=soft'.
But today it happened again.
So it seems that these parameters made it less frequent, but did not make it disappear.
What else could I do about this?
Thank you!
Offline
Do you have a system journal of a boot that ended in a hibernation you could not resume?
Eg. for the previous one
sudo journalctl -b -1 | curl -F 'file=@-' 0x0.st
Offline
Here's this log from right now (I think I did not reboot since the first boot after the last crash, but I am not 100% sure):
As far as I remember the notebook was not hibernated but shutdown before this crash.
I don't know if the command you posted is sufficient in case of a preceding shutdown (as opposed to hibernation), too ..
I will wait for the issue to happen again and immediately post a log in case this one isn't suitable.
Last edited by Elmario (2023-09-17 22:01:49)
Offline
Sep 17 18:26:34 250g8 kernel: PM: Image not found (code -22)
Sep 17 16:30:05 250g8 kernel: Command line: BOOT_IMAGE=/@/boot/vmlinuz-linux root=UUID=4843fb47-bb68-4a90-9e76-8b1a8719aa8e rw rootflags=subvol=@ loglevel=3 quiet nvme_core.default_ps_max_latency_us=0 iommu=soft
There's no "resume" parameter??
https://wiki.archlinux.org/title/Power_ … parameters
Are you defining the image location at runtime?
How exactly? And what location?
Offline
Just for making sure (as I don't know if this is relevant to your question): Did you read that this time it probably wasn't from Hibernation but from a Shutdown?
The resume location... hm, I have a swap location set in fstab. Isn't that enough? (Hibernation generally is working, so it has to be set sufficiently somewhere.. I don't remember where if this fstab entry isn't enough.)
My '/etc/default/grub' doesn't contain any 'resume' string.
I installed arch using the manual installation (because the auto script couldn't handle btrfs subvolumes) instructions and made additional configurations as needed afterwards.
[ladmin@250g8 ~]$ cat /etc/fstab
# Static information about the filesystems.
# See fstab(5) for details.
# <file system> <dir> <type> <options> <dump> <pass>
# /dev/nvme0n1p5
UUID=4843fb47-bb68-4a90-9e76-8b1a8719aa8e / btrfs rw,relatime,ssd,discard=async,space_cache=v2,subvol=/@ 0 0
# /dev/nvme0n1p5
UUID=4843fb47-bb68-4a90-9e76-8b1a8719aa8e /home btrfs rw,relatime,ssd,discard=async,space_cache=v2,subvol=/@home 0 0
# /dev/nvme0n1p5
UUID=4843fb47-bb68-4a90-9e76-8b1a8719aa8e /var btrfs rw,relatime,ssd,discard=async,space_cache=v2,subvol=/@var 0 0
# /dev/nvme0n1p5
UUID=4843fb47-bb68-4a90-9e76-8b1a8719aa8e /home/ladmin/Schreibtisch btrfs rw,relatime,ssd,discard=async,space_cache=v2,subvol=/@desktop 0 0
# /dev/nvme0n1p5
UUID=4843fb47-bb68-4a90-9e76-8b1a8719aa8e /home/ladmin/Downloads btrfs rw,relatime,ssd,discard=async,space_cache=v2,subvol=/@downloads 0 0
# /dev/nvme0n1p1
UUID=EECE-A231 /boot/efi vfat rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,utf8,errors=remount-ro 0 2
# /dev/nvme0n1p6
UUID=b44e24c0-3400-4df8-a163-cc9291ef5da6 none swap defaults 0 0
Well, I just read 'https://wiki.archlinux.org/title/Power_ … _hibernate' and the paragraph '4.1 About swap partition/file size' made me remember, that I upgraded my notebook's RAM from 16 GByte to 64 GByte in between.
While I think there shouldn't be an issue with the swap partition being too small, because it always has been 64 GByte, may there possibly be another connection between bigger RAM size and this issue?
(Anyway: I just enlarged the swap partition from 64 GByte to 128 GByte, just to be sure!)
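As a quick sanity check of the sizes discussed here, the following sketch compares SwapTotal against MemTotal from /proc/meminfo. The function name swap_covers_ram is made up for illustration, and "swap at least as large as RAM" is only a conservative rule of thumb assumed for this sketch; the hibernation image can be smaller than RAM.

```shell
#!/bin/sh
# Hypothetical helper: print MemTotal and SwapTotal (in kB) and return
# success if swap is at least as large as RAM. Reads /proc/meminfo by
# default; a file in the same format can be passed as the first argument.
swap_covers_ram() {
    awk '/^MemTotal:/  { mem = $2 }
         /^SwapTotal:/ { swap = $2 }
         END {
             printf "MemTotal=%d kB SwapTotal=%d kB\n", mem, swap
             exit (swap >= mem ? 0 : 1)
         }' "${1:-/proc/meminfo}"
}

# Possible usage:
#   swap_covers_ram && echo "swap >= RAM" || echo "swap is smaller than RAM"
```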
I also added a resume parameter to the kernel line. The full kernel line in /etc/default/grub now is:
'GRUB_CMDLINE_LINUX_DEFAULT="loglevel=3 quiet nvme_core.default_ps_max_latency_us=0 iommu=soft resume=UUID=b44e24c0-3400-4df8-a163-cc9291ef5da6"'
Last edited by Elmario (2023-09-18 07:59:03)
Offline
Please use [code][/code] tags. Edit your post in this regard.
Did you read that this time it probably wasn't from Hibernation but from a Shutdown?
Sep 17 18:26:34 250g8 kernel: PM: Image not found (code -22)
Sep 17 18:26:34 250g8 systemd-sleep[18740]: Entering sleep state 'hibernate'...
Sep 17 18:26:34 250g8 kernel: PM: hibernation: hibernation entry
Isn't that enough? (Hibernation generally is working, so it has to be set sufficiently somewhere.. I don't remember where if this fstab entry isn't enough.)
How does the kernel know where to put the hibernation image and where to get it from after the reboot?
The userspace could set /sys/power/resume* before the hibernation, but the resuming kernel then doesn't know where to get it.
Please post a journal from a boot you think successfully hibernated and resumed.
Edit: make sure you also ran grub-mkconfig
Edit #2: unrelated sidebar
You have NetworkManager and dhcpcd enabled concurrently. Pick one, disable the other.
Last edited by seth (2023-09-18 08:01:43)
Offline
Sorry, I edited this much after making the initial post. My thoughts often actually proceed by writing them down
OK, so this last broken boot was resulting from a hibernation; good to know.
(As I said, I wasn't sure about this anymore).
I will now do a hibernation and post again.
Offline
First reboot and also make sure that the resume parameter shows up in /proc/cmdline
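A small sketch of that check: the function below (resume_param is a made-up name) scans a kernel command line for a resume= entry and prints its value, or fails if there is none.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the resume= parameter found in
# the kernel command line given as the first argument; return 1 if the
# command line has no resume= parameter.
resume_param() {
    # Unquoted on purpose: split the command line into words
    for word in $1; do
        case $word in
            resume=*)
                printf '%s\n' "${word#resume=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Possible usage on the running system:
#   resume_param "$(cat /proc/cmdline)" || echo "no resume= parameter on the kernel command line"
```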
Offline
The results after a successful wake-up from hibernation:
http://0x0.st/HO17.txt
Offline
First reboot and also make sure that the resume parameter shows up in /proc/cmdline
Oh, this came a bit too late. I already uploaded another log. But the parameter is there:
[ladmin@250g8 ~]$ cat /proc/cmdline
BOOT_IMAGE=/@/boot/vmlinuz-linux root=UUID=4843fb47-bb68-4a90-9e76-8b1a8719aa8e rw rootflags=subvol=@ loglevel=3 nvme_core.default_ps_max_latency_us=0 iommu=soft resume=UUID=b44e24c0-3400-4df8-a163-cc9291ef5da6
Btw:
I usually commit these changes by using my own 'update-grub' script:
[ladmin@250g8 ~]$ cat /usr/bin/update-grub
#!/bin/sh
set -e
grub-mkconfig -o /boot/grub/grub.cfg "$@"
grub-install
I hope that's sufficient? (Or maybe it's the reason for all this trouble )
Last edited by Elmario (2023-09-20 10:46:21)
Offline
http://0x0.st/HO17.txt is from a regular boot w/ "resume=UUID=b44e24c0-3400-4df8-a163-cc9291ef5da6" that after ~4 minutes ends in
Sep 18 10:00:47 250g8 kernel: Command line: BOOT_IMAGE=/@/boot/vmlinuz-linux root=UUID=4843fb47-bb68-4a90-9e76-8b1a8719aa8e rw rootflags=subvol=@ loglevel=3 nvme_core.default_ps_max_latency_us=0 iommu=soft resume=UUID=b44e24c0-3400-4df8-a163-cc9291ef5da6
…
Sep 18 10:04:29 250g8 systemd[1]: Starting Hibernate...
Sep 18 10:04:29 250g8 systemd-sleep[6130]: Entering sleep state 'hibernate'...
Sep 18 10:04:29 250g8 kernel: PM: Image not found (code -22)
Sep 18 10:04:29 250g8 kernel: PM: hibernation: hibernation entry
the hibernation, if entered, wasn't resumed.
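To pull lines like these out of a journal locally, a filter along the following lines can help. It is only a sketch: pm_lines is a made-up name, and the patterns are simply taken from the log excerpts quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read a journal on stdin and keep only the kernel
# power-management ("kernel: PM: ...") and systemd-sleep lines.
pm_lines() {
    grep -E 'kernel: PM: |systemd-sleep\['
}

# Possible usage, previous boot:
#   sudo journalctl -b -1 | pm_lines
```

If the output shows the hibernation being entered but the same boot never continues afterwards, the image was not resumed.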
Offline
Ah, OK. Then it's the broken XFCE session management that has been confusing me all the time. I assumed hibernation was working because I ended up on the desktop right where I left it, with the same windows open at the same positions and such, even though I have session saving disabled.
This explains the somewhat strange behaviour when 'hibernating', or rather trying to:
I click on hibernate and the system seems to shut down (black screen), then the full desktop becomes visible again, and then the notebook powers off.
I guessed this might not be normal, but the overall behaviour distracted me from this thought.
So actually the notebook probably never hibernates - but sometimes breaks. How could we continue from here?
Edit:
OK, I made another observation:
'systemctl suspend' does actually work. The 'shutdown to hibernation' is MUCH faster, without the desktop reappearing in between, and the 'reboot from hibernation' is nearly instant.
It's just the 'hibernation button' in XFCE's GUI that's not working.
Here's the log from after a successful 'systemctl suspend':
(Well, it generated the same filename and URL as before, so I don't know if this worked or if it is just showing the old data?)
Just checked the log: it's still the old data.
So do I need to reboot first and then hibernate for 'sudo journalctl -b -1 | curl -F 'file=@-' 0x0.st' to work?
Edit2:
I rebooted, then suspended using 'systemctl suspend' and then issued 'sudo journalctl -b -1 | curl -F 'file=@-' 0x0.st' again, but again it did not create a new file...?
Last edited by Elmario (2023-09-18 08:31:05)
Offline
Open an xterm, type "snafu" (don't press enter, that's not a command) hibernate and resume.
If the same terminal shows up and says "snafu", just as you left it, you resumed a hibernating system.
nb. that "systemctl suspend" is NOT hibernation (S4) but "suspend to RAM" (S3), and that is expected to be way faster than hibernation, which writes the RAM to disk, powers down, powers up and reads the previous RAM content back from disk to restore the status quo ante.
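To see which of these sleep states the kernel offers, /sys/power/state can be read. The sketch below just annotates its entries; describe_sleep_states is a made-up name, and the mapping to systemctl commands is the usual one but depends on the systemd sleep configuration.

```shell
#!/bin/sh
# Hypothetical helper: annotate the sleep states listed in /sys/power/state
# (or in a file with the same format passed as the first argument).
describe_sleep_states() {
    for state in $(cat "${1:-/sys/power/state}"); do
        case $state in
            freeze)  echo "freeze: suspend-to-idle" ;;
            standby) echo "standby: power-on suspend" ;;
            mem)     echo "mem: suspend, typically to RAM (systemctl suspend)" ;;
            disk)    echo "disk: hibernation, suspend to disk (systemctl hibernate)" ;;
            *)       echo "$state: not covered by this sketch" ;;
        esac
    done
}

# Possible usage:
#   describe_sleep_states
```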
The boot of http://0x0.st/HO17.txt never entered nor woke from an S3.
nb. that the exact command from #14 reflects the previous boot because of the "-1", not the ongoing one (a system resumed from hibernation is a single boot, despite the interim power-down).
Offline