
#1 2020-10-30 02:49:05

TheDcoder
Member
Registered: 2020-06-06
Posts: 51
Website

Boot process occasionally freezes during GUI initialization

Hi everyone,

I recently switched to an SSD and put my root partition on it. This has resulted in lightning-fast boot times, but unfortunately there has been a new side effect: sometimes the boot gets stuck just before or during (I am not sure which) GUI initialization. This never happened when I was still using my mechanical HDD. The issue is easily worked around by a forced shutdown followed by a reboot.

I suspect it might have something to do with the Nvidia graphics drivers that I am using. I recently removed the nvidia kernel modules from my mkinitcpio.conf since I wasn't really using DRM kernel mode setting. I use Intel integrated graphics as my primary graphics, so I don't need Nvidia to load as fast as possible.

My systemd journal doesn't show any abnormal warnings or errors related to the GUI or display drivers; the last two entries are about flushing the journal to persistent storage. I don't want my journal to be publicly available forever, so I am pasting it here (it only contains the entries from my previous boot, where this issue happened).

Happy to provide more info/logs.

Thanks for the assistance in advance!

Offline

#2 2020-10-30 08:08:59

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 11,272

Re: Boot process occasionally freezes during GUI initialization

Re-add the modules to mkinitcpio.conf; this is the main thing that doing so fixes. You don't have to use the modesetting parameter, and there is no inherent link to it, but adding the modules to the initramfs will ensure they load earlier and are ready in time. (... which is, by the way, not exclusive to nvidia; all graphics drivers run into this given a sufficiently fast SSD)

If you are currently using linux 5.9, you might want to omit nvidia_uvm from the "usual" list, as that's the module that is currently having legalese-related loading issues.

Edit: I'm a slight idiot and didn't read completely, but as mentioned, this is not exclusive to nvidia, so you'd have to add i915 to the modules for the intel case

... What's weird in that journal excerpt is that there is no graphics module loaded in the entire section, neither intel nor nvidia, so it might also be a lower-level firmware issue, in which case maybe look out for UEFI/firmware updates.
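For anyone following along, here is a rough sketch of what the early-loading setup described above could look like. The MODULES line in the sample file is an assumption based on the commonly used module list (with nvidia_uvm left out, as suggested for 5.9), and the helper name has_early_module and the temporary file are made up for illustration. On a real system the file to edit is /etc/mkinitcpio.conf, and the images need to be rebuilt afterwards.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if MODULE is listed in the MODULES=(...) line
# of the given mkinitcpio.conf-style file.
has_early_module() {
    module=$1
    conf=$2
    # Take the uncommented MODULES=(...) line, strip the wrapper,
    # put one module per line and look for an exact match.
    sed -n 's/^MODULES=(\(.*\))[[:space:]]*$/\1/p' "$conf" \
        | tr -s ' ' '\n' \
        | grep -qx "$module"
}

# Sample config in a temp file so the real /etc/mkinitcpio.conf is untouched.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# Assumed example: Intel plus Nvidia modules, nvidia_uvm omitted.
MODULES=(i915 nvidia nvidia_modeset nvidia_drm)
EOF

for m in i915 nvidia nvidia_uvm; do
    if has_early_module "$m" "$sample"; then
        echo "$m: listed for early loading"
    else
        echo "$m: not listed"
    fi
done
rm -f "$sample"

# On the real system, after editing /etc/mkinitcpio.conf, rebuild all presets:
#   sudo mkinitcpio -P
```

Pointing the helper at /etc/mkinitcpio.conf instead of the sample file gives a quick check of what is actually configured.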

Last edited by V1del (2020-10-30 08:40:56)

Offline

#3 2020-10-30 12:14:03

TheDcoder
Member
Registered: 2020-06-06
Posts: 51
Website

Re: Boot process occasionally freezes during GUI initialization

Thank you for your reply V1del! You are certainly not an idiot in the slightest because you did read completely eventually big_smile

Thanks for the pointer about early KMS start. Unlike you, I had somewhat misread this text:

Intel, Nouveau, ATI and AMDGPU drivers already enable KMS automatically for all chipsets, so you need not install it manually.

...and began falsely believing that the Intel module/driver is automatically included in the initramfs... should've known better and read the heading of the section, which says "Late KMS start". I followed the steps to include i915 and hopefully the issue will not happen again smile

Oh, and I checked for a BIOS/firmware update like you suggested and found one which was released a year ago, so thanks for that as well; maybe it will fix some of the other strange things I've seen.

Offline

#4 2020-10-31 02:22:36

TheDcoder
Member
Registered: 2020-06-06
Posts: 51
Website

Re: Boot process occasionally freezes during GUI initialization

I am sadly back to report that the issue is not resolved; it happened again this morning sad

First of all here is a screen-cap of the exact moment where it got stuck:

https://i.imgur.com/43mD3rV.jpeg

The journal yet again shows no clear indications... No idea what is going on at the moment. My system booted fine after the force shutdown, what gives?


moderator edit -- replaced oversized image with link.
Pasting pictures and code

Last edited by 2ManyDogs (2020-10-31 12:18:58)

Offline

#5 2020-10-31 10:41:04

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 11,272

Re: Boot process occasionally freezes during GUI initialization

That isn't an issue, and it's not necessarily related to graphics-card-based hangups, assuming it's literally just waiting for the man-db regeneration.  What command are you using to look at the journal? None of the services mentioned on that screen appear in it at all, so I don't think the journal is conclusive enough. If you are using the -k switch, post/check a complete

sudo journalctl -b-1

and maybe also check the xorg log: even if xorg doesn't come up, it should write a log if an attempt at starting it is made: https://wiki.archlinux.org/index.php/Xorg#General
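In case it helps, these are the journalctl invocations in question, wrapped in small shell functions. The function names are made up here; the options themselves are standard journalctl ones. -b -1 selects the previous boot, and -k limits the output to kernel messages, which is why a -k view would not show any of the service messages.

```shell
#!/bin/sh
# List the boots the journal knows about (offset 0 = current, -1 = previous).
list_boots() {
    journalctl --list-boots --no-pager
}

# Complete journal of the previous boot (kernel and userspace messages).
prev_boot_full() {
    journalctl -b -1 --no-pager
}

# Kernel messages only from the previous boot; service output is not shown.
prev_boot_kernel() {
    journalctl -k -b -1 --no-pager
}

# Usage on a systemd machine (may need sudo or membership in the
# systemd-journal group to see system messages):
#   prev_boot_full | tail -n 50
```

If the tail of the full output stops well before the point where the screen froze, that suggests the last messages never made it to disk.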

Last edited by V1del (2020-10-31 10:42:25)

Offline

#6 2020-10-31 12:08:00

TheDcoder
Member
Registered: 2020-06-06
Posts: 51
Website

Re: Boot process occasionally freezes during GUI initialization

V1del wrote:

That isn't an issue and not necessarily related to graphics card based hangups assuming it's literally just waiting for the man-db regeneration.

No, it is definitely not waiting for man-db; the entire screen was frozen for a good minute before I captured it. And most of the time it gets stuck on the "Starting graphical interface" message or some other nearby message.

V1del wrote:

What command are you using to look at the journal? All those services that are mentioned here do not appear at all, which I don't think is conclusive enough if you are using the -k switch post/check a complete

sudo journalctl -b-1

I am using the same command that you posted (except I use a space between -b and -1, and also used -p 7 to include all messages), I did wonder why all that wasn't appearing in the journal... by the way, what is the -k switch?

V1del wrote:

maybe also check a xorg log, even if xorg doesn't pop up if an attempt at starting it is made it should write a log: https://wiki.archlinux.org/index.php/Xorg#General

Okay, I just checked and the logs from this morning's boot are gone, so I will have to wait until it happens again. By the way, should I be able to switch to another virtual console by pressing Ctrl+Alt+F2 at this stage? If only X is stuck then it should be possible for me to do that. I recall that it was not possible, but I will try it again to confirm.

Offline

#7 2020-11-03 05:00:40

TheDcoder
Member
Registered: 2020-06-06
Posts: 51
Website

Re: Boot process occasionally freezes during GUI initialization

This happened again today; here is the stuck boot screen. I tried switching to another virtual console using Ctrl+Alt+Fn (n = number) but it did not work.

X is not producing logs for the failed boot, the .old log is from yesterday and the new one is from my current boot. The systemd journal is showing nothing special, it is similar to the previous 2 I posted.

Why is this bug so elusive? sad

Offline

#8 2020-11-03 08:04:49

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 11,272

Re: Boot process occasionally freezes during GUI initialization

That's the problem with race conditions: they can look fine for many tries, fail catastrophically once, and then look fine again. What is somewhat more concerning to me is that your journal is that incomplete (assuming it still has similar contents even when you use the command I suggested), as that means the kernel froze so hard that it couldn't even write a journal. That might imply that your SSD has issues. What's the SSD vendor here? Try running a SMART test.

Offline

#9 2020-11-03 08:31:23

TheDcoder
Member
Registered: 2020-06-06
Posts: 51
Website

Re: Boot process occasionally freezes during GUI initialization

I am starting to hate these race conditions :-/

It is indeed concerning that the kernel is freezing badly enough that it couldn't write a journal.

V1del wrote:

Which might have the implication that your SSD has issues. what's the SSD vendor here? Try running a SMART test

I hope not. My SSD is a WD Blue SN550 if I recall correctly; smartctl doesn't really show self-tests for it:

root@arch ~# smartctl -c /dev/nvme0
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.9.2-xanmod1-1] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Firmware Updates (0x14):            2 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Maximum Data Transfer Size:         128 Pages
Warning  Comp. Temp. Threshold:     80 Celsius
Critical Comp. Temp. Threshold:     85 Celsius
Namespace 1 Features (0x02):        NA_Fields

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     3.50W    2.10W       -    0  0  0  0        0       0
 1 +     2.40W    1.60W       -    0  0  0  0        0       0
 2 +     1.90W    1.50W       -    0  0  0  0        0       0
 3 -   0.0200W       -        -    3  3  3  3     3900   11000
 4 -   0.0050W       -        -    4  4  4  4     5000   39000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         2
 1 -    4096       0         1

but the `nvme` tool does offer a `device-self-test` option so I ran a short self-test and here is the result:

Device Self Test Log for NVME device:nvme0
Current operation  : 0
Current Completion : 0%
Self Test Result[0]:
  Operation Result             : 0
  Self Test Code               : 1
  Valid Diagnostic Information : 0
  Power on hours (POH)         : 0x14
  Vendor Specific              : 0 0

I don't really understand the result, and the "Valid Diagnostic Information" field is set to 0 which usually means false, does this mean that there is no valid diagnostic information??? For extra information, here is the SMART log:

root@arch ~# nvme smart-log /dev/nvme0
Smart Log for NVME device:nvme0 namespace-id:ffffffff
critical_warning			: 0
temperature				: 43 C
available_spare				: 100%
available_spare_threshold		: 10%
percentage_used				: 0%
endurance group critical warning summary: 0
data_units_read				: 2,375,061
data_units_written			: 201,477
host_read_commands			: 8,420,548
host_write_commands			: 917,568
controller_busy_time			: 19
power_cycles				: 56
power_on_hours				: 20
unsafe_shutdowns			: 13
media_errors				: 0
num_err_log_entries			: 1
Warning Temperature Time		: 0
Critical Composite Temperature Time	: 0
Thermal Management T1 Trans Count	: 0
Thermal Management T2 Trans Count	: 0
Thermal Management T1 Total Time	: 0
Thermal Management T2 Total Time	: 0

There are no critical warnings, which is good. `num_err_log_entries` is 1, but I looked at the error log and there are no errors; maybe the error was overwritten by all the other 64 "success" logs I saw in the error logs? I am assuming `unsafe_shutdowns` is 13 because I force shutdown the laptop when the boot freezes.
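The manual reading of the SMART log above can be sketched as a small filter. This is only an illustration: the helper name check_smart_log and the choice of which fields to flag are assumptions, not an official health check, and it expects the plain-text layout that `nvme smart-log` printed above.

```shell
#!/bin/sh
# Hypothetical helper: reads `nvme smart-log` text on stdin and prints a
# rough verdict based on a few fields (field selection is an assumption).
check_smart_log() {
    awk -F':' '
        {
            key = $1; val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim the field name
            gsub(/[^0-9]/, "", val)            # keep digits only: "100%" -> 100
            v[key] = val + 0
        }
        END {
            bad = 0
            if (v["critical_warning"] != 0) { print "critical_warning is set"; bad = 1 }
            if (v["media_errors"] != 0)     { print "media_errors > 0"; bad = 1 }
            if (v["available_spare"] < v["available_spare_threshold"]) {
                print "available_spare is below its threshold"; bad = 1
            }
            if (!bad) print "no obvious media or spare problems"
            print "unsafe_shutdowns: " v["unsafe_shutdowns"]
        }'
}

# Values copied from the log above.
check_smart_log <<'EOF'
critical_warning          : 0
available_spare           : 100%
available_spare_threshold : 10%
unsafe_shutdowns          : 13
media_errors              : 0
EOF

# On a live system:
#   sudo nvme smart-log /dev/nvme0 | check_smart_log
```

A clean result here only says the drive is not reporting wear or media errors; it does not rule out power-state or controller quirks.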

Not sure if the SSD is at fault here. It has been working well so far and I have rebuilt the kernel many times, so it would not make sense for this issue to be related to the data stored on the SSD. It always gets stuck before the "Starting (or started) Graphical Interface" message, which makes me believe that it might have something to do with graphics... probably the Nvidia graphics.

Offline

#10 2020-11-04 14:00:20

TheDcoder
Member
Registered: 2020-06-06
Posts: 51
Website

Re: Boot process occasionally freezes during GUI initialization

It has happened again today, but in the evening (it usually happens in the morning when I first start my computer on a new day). Does anyone have any ideas that I can try?

Offline

#11 2020-11-04 14:29:49

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 11,272

Re: Boot process occasionally freezes during GUI initialization

The values look alright-ish. Something you can try is capping the NVMe power-saving latency: assuming WD drives have similar issues to Samsung ones, try something like

nvme_core.default_ps_max_latency_us=5500

on the kernel params to keep the drive out of its deepest power-saving states (a value of 0 disables NVMe power saving entirely).
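To see why a cap of 5500 would matter for this particular drive, here is a rough sketch that compares the exit latency of each non-operational power state (the rows marked "-" in the smartctl -c table earlier in the thread) against the cap. The helper name apst_states_allowed is made up, and comparing only the exit latency is a simplification: the kernel's real selection logic has more to it and has changed between versions.

```shell
#!/bin/sh
# Hypothetical helper: reads the "Supported Power States" rows from
# `smartctl -c` on stdin and reports, for each non-operational state,
# whether its exit latency fits under the given cap (in microseconds).
apst_states_allowed() {
    cap_us=$1
    awk -v cap="$cap_us" '
        $1 ~ /^[0-9]+$/ && $2 == "-" {
            exit_lat = $NF                     # last column is Ex_Lat
            verdict = (exit_lat > cap) ? "excluded" : "allowed"
            printf "PS%d exit_lat=%dus -> %s\n", $1, exit_lat, verdict
        }'
}

# Rows copied from the smartctl output posted above.
apst_states_allowed 5500 <<'EOF'
 0 +     3.50W    2.10W       -    0  0  0  0        0       0
 1 +     2.40W    1.60W       -    0  0  0  0        0       0
 2 +     1.90W    1.50W       -    0  0  0  0        0       0
 3 -   0.0200W       -        -    3  3  3  3     3900   11000
 4 -   0.0050W       -        -    4  4  4  4     5000   39000
EOF

# After rebooting with the parameter, these show what the kernel picked up:
#   cat /proc/cmdline
#   cat /sys/module/nvme_core/parameters/default_ps_max_latency_us
```

With these numbers both low-power states (PS3 and PS4) exceed the cap, so for this drive the setting should effectively keep it in the operational states.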

Also in hopes of getting better logs, enable sysrq and in case of crash try to reboot with the REISUB method and hope more journal could be written: https://wiki.archlinux.org/index.php/Ke … el_(SysRq)
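As a rough aid for the SysRq part: kernel.sysrq is either 0 (off), 1 (everything on), or a bitmask, and REISUB only works fully if all the functions it uses are allowed. The sketch below checks that. The helper name reisub_ready is made up, and the bit values are taken from the kernel's sysrq documentation as I understand it, so treat them as an assumption to double-check.

```shell
#!/bin/sh
# Hypothetical helper: does a kernel.sysrq value permit every key in REISUB?
# Assumed bit values (from the kernel sysrq documentation):
#   4   keyboard control, unraw   (R)
#   64  signalling of processes   (E, I)
#   16  sync                      (S)
#   32  remount read-only         (U)
#   128 reboot/poweroff           (B)
reisub_ready() {
    value=$1
    needed=$((4 + 64 + 16 + 32 + 128))
    [ "$value" -eq 1 ] && return 0          # 1 enables all functions
    [ $((value & needed)) -eq "$needed" ]   # otherwise every needed bit must be set
}

current=$(cat /proc/sys/kernel/sysrq 2>/dev/null || echo 0)
if reisub_ready "$current"; then
    echo "kernel.sysrq=$current: REISUB should be fully available"
else
    echo "kernel.sysrq=$current: some REISUB keys are disabled"
fi

# To enable everything until the next reboot:
#   sudo sysctl kernel.sysrq=1
# To make it persistent, put "kernel.sysrq = 1" into a file under
# /etc/sysctl.d/ (the file name is your choice).
```

The key sequence itself is Alt+SysRq held down while pressing R, E, I, S, U, B in turn, with a short pause between keys.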

Is the system stable once you are in? Does it only happen during the bootup phase?

Last edited by V1del (2020-11-04 14:30:51)

Offline

#12 2020-11-04 15:26:03

TheDcoder
Member
Registered: 2020-06-06
Posts: 51
Website

Re: Boot process occasionally freezes during GUI initialization

V1del wrote:

Values look alrightish, something you can try is upping the nvme power saving delay time, assuming WD drives have similar issues to samsung, try something like on the kernel params to disable nvme power saving.

Okay, will try.

V1del wrote:

Also in hopes of getting better logs, enable sysrq and in case of crash try to reboot with the REISUB method and hope more journal could be written: https://wiki.archlinux.org/index.php/Ke … el_(SysRq)

Wow! I never knew something like this existed, will definitely try this one too smile

V1del wrote:

Is the system stable once you are in? Does it only happen during the bootup phase?

Yes, this issue only happens during boot.

But I may have identified a potentially related issue where sometimes OpenGL would fail with the discrete Nvidia graphics, I am experiencing that issue right now and it goes away after a reboot:

TheDcoder@arch ~> prime-run glxgears
X Error of failed request:  BadValue (integer parameter out of range for operation)
  Major opcode of failed request:  152 (GLX)
  Minor opcode of failed request:  3 (X_GLXCreateContext)
  Value in failed request:  0x0
  Serial number of failed request:  25
  Current serial number in output stream:  26
TheDcoder@arch ~ [1]> glxgears
Running synchronized to the vertical refresh.  The framerate should be
approximately the same as the monitor refresh rate.
309 frames in 5.0 seconds = 61.643 FPS
X connection to :0.0 broken (explicit kill or server shutdown).

The first command fails because the GLX context cannot be created. In this state no OpenGL-related application works with the Nvidia graphics; I have tried WebGL (3D in-browser graphics) and Minecraft. The second command demonstrates that GLX works fine with the integrated Intel graphics.

Offline

#13 2020-11-04 15:55:35

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 11,272

Re: Boot process occasionally freezes during GUI initialization

That does sound like the race condition issues though. Which modules do you have in your mkinitcpio now? When in doubt, add both the intel and nvidia ones, and don't forget to rebuild the images when making adjustments.

FWIW if they aren't in there, this does start to make more sense to me: since you are trying to use prime-run, the nvidia drivers need to be present when xorg attempts to start anyway, as they hook into the running xorg session to provide prime.
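A quick way to answer the "which modules" question on a running system is to look at what is actually loaded. The sketch below filters /proc/modules for the usual graphics modules. The helper name gpu_modules_loaded and the module list are assumptions for illustration, and the initramfs path in the last comment depends on which kernel package is installed.

```shell
#!/bin/sh
# Hypothetical helper: print the graphics-related modules found in a
# /proc/modules-style file (defaults to the live /proc/modules).
gpu_modules_loaded() {
    awk '$1 ~ /^(i915|nouveau|nvidia|nvidia_modeset|nvidia_drm|nvidia_uvm)$/ { print $1 }' \
        "${1:-/proc/modules}" | sort
}

# Show what the running kernel has loaded, if /proc/modules is readable.
if [ -r /proc/modules ]; then
    gpu_modules_loaded
fi

# To check what is packed into the initramfs rather than what is loaded now
# (image path is an assumption, adjust it to the installed kernel):
#   lsinitcpio /boot/initramfs-linux.img | grep -E 'i915|nvidia'
```

Modules that show up here but not in the initramfs listing are being loaded late, which is the situation the race would need.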

Last edited by V1del (2020-11-04 16:20:46)

Offline

#14 2020-11-05 03:29:43

TheDcoder
Member
Registered: 2020-06-06
Posts: 51
Website

Re: Boot process occasionally freezes during GUI initialization

The issue happened again this morning and I tried the SysRq shortcuts, but none of them had any observable effect. I tried multiple combinations to make sure that the kernel was actually getting the SysRq key input. I am guessing they weren't responsive either because I couldn't figure out the right key combinations or because I was using a custom kernel build with a minimal configuration, which could have excluded the SysRq diagnostic keys altogether. So I am back to the standard linux-zen kernel now and hopefully it will work the next time.

About the modules in mkinitcpio: I recently removed the nvidia modules from it because they were causing issues during shutdown (systemd apparently doesn't like them)... and more recently, as I mentioned in one of the earlier posts in the thread, I added the i915 Intel graphics module to the initramfs image. Aside from that I have nothing loaded.

Let's ignore the Nvidia OpenGL issue for now, as it seems to be unrelated after reading your explanation, it's not a big deal.

Offline

#15 2020-11-05 08:07:14

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 11,272

Re: Boot process occasionally freezes during GUI initialization

At this point I'd say test the LTS kernel; maybe there's a regression in 5.9 on your system. Even if the graphics driver goes down, the kernel should still be able to write logs as long as it is still alive.

Offline

#16 2020-11-11 05:14:14

TheDcoder
Member
Registered: 2020-06-06
Posts: 51
Website

Re: Boot process occasionally freezes during GUI initialization

A minor update, I have tried both the LTS kernel and the zen kernel, they both have the same issue and I couldn't get SysRq to work in them... It is unlikely but maybe something is causing a hard kernel freeze.

Offline

#17 2020-11-11 06:11:59

TheDcoder
Member
Registered: 2020-06-06
Posts: 51
Website

Re: Boot process occasionally freezes during GUI initialization

I just tried the SysRq shutdown command after booting to the `multi-user` systemd target, which basically boots into a console, and it definitely worked. So it leads me to believe that the kernel is indeed freezing hard hmm

Not sure what to do at this point...

Offline

#18 2020-11-14 04:34:30

TheDcoder
Member
Registered: 2020-06-06
Posts: 51
Website

Re: Boot process occasionally freezes during GUI initialization

Fixing this issue (X is unloading the Nvidia module) also fixed the freeze; I no longer experience it. It is definitely some kind of race condition which is somehow hard-locking the kernel. I'll leave the investigation to the experts; I am happy that the issue is fixed big_smile

Offline
