Pages: 1
Ever since upgrading my installation late last week (kernel 4.16.13-1), my computer fails to get past this point around 40% of the time, with the cursor not blinking anymore once it hits this state (photo rather than a log dump due to the issue brought up below the image):
https://i.imgur.com/xsn2sbH.jpg
journalctl -b-1 just shows the log for the last successful boot rather than the freeze at boot, so it doesn't provide any insight. It's almost as though the drives fail to mount, but I usually see it taking a few seconds recovering the journal on / after the required hard reset.
I thought it could be this since I have an Intel CPU (Haswell), but I haven't noticed a difference since making the changes. Plus, it didn't happen until I updated my installation, and I've been running pacman -Syu every couple weeks for a while now.
Since my / is an SSD (sda below; sdb is an HDD), I thought it might have something to do with fstrim, so I ran it once and it cleared out 6 GB, then set it up to run daily. Back when I set up this system several years ago, common wisdom was to use -o discard, which seems to no longer be the case. Here's my fstab:
/dev/disk/by-uuid/e1484ea3-4eb5-45b9-88e2-ee3bc74ce77a /boot ext2 noatime 0 1 # sda2
/dev/disk/by-uuid/33b8254c-a43f-4e90-83d3-7bd8204bd0cc / ext4 noatime,data=ordered,discard 0 1 # sda3
/dev/disk/by-uuid/9736d6d8-0a7a-4236-b770-006ee942c69c swap swap defaults 0 0 # sdb1
/dev/disk/by-uuid/b5b37dbf-e89d-4e6d-a40b-790b896e3546 /var reiserfs defaults 0 1 # sdb2
/dev/disk/by-uuid/b8b60567-f612-4f0d-be49-94512af86b82 /home xfs defaults 0 1 # sdb8
I'm not opposed to removing discard, but I don't think it's related to the issue here.
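For anyone who does want to drop discard in favour of periodic fstrim, here is a rough sketch of the options edit. The strip_discard name is made up for illustration and is not an existing tool; it only rewrites an options string and does not touch /etc/fstab itself.

```shell
#!/bin/sh
# Hypothetical helper (not from this thread): remove the "discard" mount
# option from a comma-separated fstab options string, keeping the other
# options in their original order. Falls back to "defaults" if nothing is left.
strip_discard() {
    opts=$(printf '%s\n' "$1" | tr ',' '\n' | grep -vx 'discard' | paste -sd, -)
    printf '%s\n' "${opts:-defaults}"
}

# Example: the options currently on / in the fstab above.
# strip_discard 'noatime,data=ordered,discard'
```

Periodic TRIM would then come from running fstrim on a schedule, for example via the fstrim.timer unit shipped with util-linux.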
I also just got a couple of new 8bitdo USB gamepads. I've had problems where my machine fails to boot due to USB devices, but it does still seem to happen without the gamepads plugged in.
If it means anything, I have a reasonably recent nVidia GPU and have been using nvidia-dkms for it. I haven't had any problems until now.
Does anybody have any suggestions on what to try or where to look for more information? I just removed quiet from my kernel parameters, but won't get back to this computer until tomorrow. I'll update this thread with any new findings.
UPDATE: I just saw this and will see if it helps when I get a chance. The machine is a desktop running on AC rather than a laptop, though.
Mod note: Please post urls or thumbnails to images -- V1del
Last edited by V1del (2018-06-16 18:51:32)
Offline
The intermittency (i.e. it seems to work some of the time) makes me more inclined to think it's the issue with the random seed; see this thread for more information and a potential fix.
Last edited by V1del (2018-06-16 18:54:40)
Offline
Here's where it stops without quiet in the kernel parameters:
https://i.imgur.com/jsTeQcZ.jpg
I have a feeling that this latest OS update might've done my SSD in since it's many years old at this point. What would be the best way to test it for errors? Is there a good live image tool for such a purpose?
Last edited by takenji1989 (2018-06-17 21:05:07)
Offline
You were told in your first post to only post thumbnails: read the Code of Conduct and do not do it again: http://wiki.archlinux.org/index.php/Cod … s_and_code
Offline
Samsung SSDs are notorious as well; disable SATA power saving, as mentioned in the thread you linked. If you want to check your SSD for errors anyway, you can use basically any live image and run a long SMART self-test. After the estimated time has elapsed, look at/post the output of
smartctl -a $device
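A minimal sketch of the two steps, assuming smartmontools is available on the live image. The function names and the DRY_RUN switch are invented here so the commands can be previewed without touching a drive; only the smartctl options themselves (-t long, -a) are real.

```shell
#!/bin/sh
# Wrapper: with DRY_RUN=1, print the smartctl command instead of running it.
# (DRY_RUN and these function names are assumptions made for this sketch.)
smart() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "smartctl $*"
    else
        smartctl "$@"
    fi
}

# Step 1: start the long (extended) self-test; smartctl reports an
# estimated completion time when the test is started.
smart_start_long() { smart -t long "$1"; }

# Step 2: once that time has passed, dump all SMART information,
# including the self-test log.
smart_report() { smart -a "$1"; }
```

So for the SSD in question that would be smart_start_long /dev/sda, wait, then smart_report /dev/sda (both as root).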
Offline
Here's the long SMART self-test output. While my drives already were set to max_performance:
/sys/class/scsi_host/host0/link_power_management_policy:max_performance
/sys/class/scsi_host/host1/link_power_management_policy:max_performance
/sys/class/scsi_host/host2/link_power_management_policy:max_performance
/sys/class/scsi_host/host3/link_power_management_policy:max_performance
/sys/class/scsi_host/host4/link_power_management_policy:max_performance
/sys/class/scsi_host/host5/link_power_management_policy:max_performance
I did add a udev rule for it.
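For reference, output in the path:value form above can be produced with something like the following sketch. The optional directory argument is my own addition so it can be pointed at a test tree, and the udev rule it prints is only an assumed example of what such a rule might look like, not necessarily the one used here.

```shell
#!/bin/sh
# Print "path:value" for each SATA host's link power management policy.
# The optional first argument overrides the sysfs directory (an assumption
# added for testing; the default is the real sysfs location).
lpm_report() {
    root=${1:-/sys/class/scsi_host}
    grep -H . "$root"/host*/link_power_management_policy
}

# Print an example udev rule that pins the policy when a host appears.
# Assumed rule text; it would go in a file under /etc/udev/rules.d/.
lpm_udev_rule() {
    printf '%s\n' 'ACTION=="add", SUBSYSTEM=="scsi_host", KERNEL=="host*", ATTR{link_power_management_policy}="max_performance"'
}
```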
Offline
No, that drive couldn't be healthier (at least from what I'm seeing; SMART output can be weird to interpret sometimes). Did you do what I linked to wrt the random seed?
Offline
So I did try replacing the SSD with a new WD one to eliminate that, but the old one was definitely healthy during the ddrescue copy. The boot issue still persists.
I guess I'll give patching the kernel a try. haveged didn't change anything.
It seems like the symptom of the entropy issue tends to be boots taking maybe a minute or two, but my machine can be left stuck waiting for hours in that state. I haven't seen any other cases like that.
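One way to see how long a boot that did succeed waited on entropy is to look for the kernel's "random: crng init done" line in dmesg. A small sketch, with crng_ready_time being a name I made up; it just pulls the timestamp out of kernel log text fed to it on stdin and assumes the default bracketed dmesg timestamps.

```shell
#!/bin/sh
# Read kernel log text on stdin and print the timestamp (seconds since
# boot) of the first "random: crng init done" line. Prints nothing if the
# line is absent. Assumes the default "[  12.345678] message" dmesg format.
crng_ready_time() {
    sed -n 's/^\[ *\([0-9.]*\)] random: crng init done.*/\1/p' | head -n 1
}

# Example (on a boot that got through):
# dmesg | crng_ready_time
```

A large value there would point at the entropy wait; it can't say anything about boots that never finished, since those leave no log to read.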
Offline
Have you tried linux-lts? Also, please post the output of dmesg from a successful boot so others can see if they spot anything unusual in it.
Offline
Here's dmesg from boot. I had to boot three times in a row to get it to successfully go just now, and it wasn't always at the SSD this time. Sometimes it was at one of the other filesystems, and when I boot without quiet set, the random seed load/save happens right after the filesystems mount. V1del was right all along.
I'll give linux-lts a shot.
Offline
I didn't get the freezes at boot the few times I booted 4.17 after updating, but another issue with 4.17 forced me to switch to linux-lts. No problems with 4.14 so far, but I'll report back later and update the title of this thread if the problem is fully gone.
Offline