#1 2019-05-25 17:52:51

Han Vinke
Member
Registered: 2017-02-18
Posts: 17

nvidia 390xx with kernel 5.1.x

Since kernel 5.1 I have had stability problems with my system. Most of the time it fails to boot, giving errors like

"kernel: [drm:nv_drm_master_drop [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] nv_drm_atomic_helper_disable_all failed with error code -22 !" and

"BUG: unable to handle kernel paging request at ffffe8feb2b9cd88
May 25 19:23:36 arch.localhost kernel: #PF error: [normal kernel read fault]"

My system currently:
nvidia-390xx-dkms  390.116-23
5.1.4-arch1-1-ARCH

Are other users experiencing similar problems? I am running LVM with LUKS on a Samsung SSD; however, I have no problems with the LUKS header etc.
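
For anyone who wants to check whether a failed boot hit the same two errors quoted above, here is a minimal sketch that filters a kernel log for them. The function name and the sample log are made up for illustration; the patterns are taken only from the messages in this post, so other variants of the errors would be missed.

```python
import re

# Patterns derived from the two error messages quoted in this post.
ERROR_PATTERNS = [
    re.compile(r"nv_drm_atomic_helper_disable_all failed with error code -?\d+"),
    re.compile(r"BUG: unable to handle kernel paging request at [0-9a-f]+"),
]


def find_boot_errors(log_text):
    """Return the log lines that match any of the known error patterns."""
    return [
        line
        for line in log_text.splitlines()
        if any(pattern.search(line) for pattern in ERROR_PATTERNS)
    ]


# Hypothetical sample log, assembled from the lines quoted above plus one
# unrelated line that should be ignored.
SAMPLE_LOG = """\
kernel: [drm:nv_drm_master_drop [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] nv_drm_atomic_helper_disable_all failed with error code -22 !
kernel: usb 1-1: new high-speed USB device number 2 using xhci_hcd
kernel: BUG: unable to handle kernel paging request at ffffe8feb2b9cd88
"""

matches = find_boot_errors(SAMPLE_LOG)
```

In practice the text could come from something like the output of `journalctl -k -b -1` (kernel messages of the previous boot) saved to a file and passed to `find_boot_errors`.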

#2 2019-05-26 20:59:29

Han Vinke
Member
Registered: 2017-02-18
Posts: 17

Re: nvidia 390xx with kernel 5.1.x

Since installing kernel 5.1.5 the problems have gone away.
It most probably had to do with this recent kernel bug: "Linux 5.1.5 Kernel Fixes The Latest Data Corruption Bug"

#3 2019-05-28 07:33:08

Han Vinke
Member
Registered: 2017-02-18
Posts: 17

Re: nvidia 390xx with kernel 5.1.x

Well, as it turns out, I was too quick to draw conclusions. After installing 5.1.5-arch1-2-ARCH the problem returned. I think I'll timeshift back to 5.1.5-arch1-1-ARCH, which seemed to have fewer problems.

#4 2019-05-29 22:01:44

Han Vinke
Member
Registered: 2017-02-18
Posts: 17

Re: nvidia 390xx with kernel 5.1.x

I figured out the cause of the problem, quite unexpectedly. As it turns out, it was caused by a duplicate user account folder in [/var/lib].
One folder was correctly named [AccountsService], but there was a second folder called [accountsservice]. It was somehow created in the past; I have no idea whether it was created by a program or by myself.

For some reason none of the previous kernels failed during boot because of this, but 5.1 must be picky.
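
To look for this kind of duplicate, a small sketch like the following lists directory entries whose names differ only in letter case. The function name is hypothetical, and whether such a collision has anything to do with the boot failures is only my observation, not something the code can confirm.

```python
import os
from collections import defaultdict


def find_case_collisions(directory):
    """Group entries of `directory` whose names differ only by letter case."""
    groups = defaultdict(list)
    for name in os.listdir(directory):
        # casefold() gives a case-insensitive key for comparison.
        groups[name.casefold()].append(name)
    # Keep only groups with more than one spelling, in a stable order.
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)
```

Calling `find_case_collisions("/var/lib")` on a system like mine would be expected to report the AccountsService/accountsservice pair; an empty list means no names collide.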

#5 2019-05-31 20:54:17

Dextrey1
Member
Registered: 2019-05-31
Posts: 2

Re: nvidia 390xx with kernel 5.1.x

I'm having the exact same problem with the same software versions.

Han Vinke wrote:

I figured out the cause of the problem, quite unexpectedly. As it turns out, it was caused by a duplicate user account folder in [/var/lib].
One folder was correctly named [AccountsService], but there was a second folder called [accountsservice]. It was somehow created in the past; I have no idea whether it was created by a program or by myself.

For some reason none of the previous kernels failed during boot because of this, but 5.1 must be picky.

How did you determine this was the cause of the problem in your case? I don't have AccountsService installed and thus don't have those folders in /var/lib but the kernel is always segfaulting whenever I try to start Xorg.

#6 2019-06-03 14:13:35

Han Vinke
Member
Registered: 2017-02-18
Posts: 17

Re: nvidia 390xx with kernel 5.1.x

@Dextrey1

I thought I was the only one experiencing errors.
I admit I am still having problems, but far fewer after that change. The usual boot error has now changed to: [cinnamon-session: GLib-GIO-CRITICAL: t+9525.01187s: g_dbus_connection_call_sync_internal: assertion 'G_IS_DBUS_CONNECTION (connection)' failed]. I am also seeing: [kernel: BUG: unable to handle kernel paging request at ffffff0e7d000008]. But those error messages might have changed because of the upgrade to 5.1.6-arch1-1-ARCH and Cinnamon 4.0.10.

Booting a few times in fallback mode usually helps the system boot normally again. Maybe changing some hooks or modules in mkinitcpio.conf would help. Still testing...

#7 2019-06-03 16:08:35

Dextrey1
Member
Registered: 2019-05-31
Posts: 2

Re: nvidia 390xx with kernel 5.1.x

Han Vinke wrote:

@Dextrey1

I thought I was the only one experiencing errors.
I admit I am still having problems, but far fewer after that change. The usual boot error has now changed to: [cinnamon-session: GLib-GIO-CRITICAL: t+9525.01187s: g_dbus_connection_call_sync_internal: assertion 'G_IS_DBUS_CONNECTION (connection)' failed]. I am also seeing: [kernel: BUG: unable to handle kernel paging request at ffffff0e7d000008]. But those error messages might have changed because of the upgrade to 5.1.6-arch1-1-ARCH and Cinnamon 4.0.10.

Booting a few times in fallback mode usually helps the system boot normally again. Maybe changing some hooks or modules in mkinitcpio.conf would help. Still testing...

It seems to be a problem with either the new kernel versions or the proprietary Nvidia driver. I downgraded to the 5.0.3-arch1-1-ARCH kernel (with the matching Nvidia driver version) and the system works fine. The open-source Nouveau driver also mitigates the problem, even with the latest kernel.

#8 2019-06-04 06:13:31

Han Vinke
Member
Registered: 2017-02-18
Posts: 17

Re: nvidia 390xx with kernel 5.1.x

I am currently testing different settings. The first one is with DRM KMS disabled for nvidia, which seems to work fine. According to NVIDIA, its DRM KMS support is still considered experimental.
On the other hand, when trying with early KMS enabled, it seems the option drm.edid_firmware=[/PATH-TO_EDID.BIN]/edid.bin in GRUB is also important. I am still holding my breath, but so far, so good.
To create a valid edid.bin for your monitor you can use nvidia-settings.
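
To double-check which of these settings a running system actually booted with, a rough sketch like this parses a kernel command line and reports the two relevant parameters. The function names and the sample command line are made up for illustration (the edid path in it is a placeholder, not my real one). It assumes the nvidia KMS setting is passed as nvidia-drm.modeset on the command line; if it is set through a modprobe configuration file instead, it will not show up here. Quoted values containing spaces are not handled.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {parameter: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        # Dashes and underscores are interchangeable in kernel parameter
        # names, so normalize the key (but not the value) to underscores.
        params[key.replace("-", "_")] = value if sep else None
    return params


def kms_summary(cmdline):
    """Report the KMS-related parameters discussed in this post."""
    params = parse_cmdline(cmdline)
    return {
        "nvidia_drm.modeset": params.get("nvidia_drm.modeset"),
        "drm.edid_firmware": params.get("drm.edid_firmware"),
    }


# Hypothetical command line; the edid path is a placeholder.
SAMPLE_CMDLINE = (
    "BOOT_IMAGE=/vmlinuz-linux root=/dev/mapper/vg-root rw "
    "nvidia-drm.modeset=1 drm.edid_firmware=edid/edid.bin quiet"
)
summary = kms_summary(SAMPLE_CMDLINE)
```

On a live system the input would be the contents of /proc/cmdline. A value of None for a parameter just means it was not given on the command line.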

#9 2019-06-04 07:11:50

Han Vinke
Member
Registered: 2017-02-18
Posts: 17

Re: nvidia 390xx with kernel 5.1.x

Sorry, I did not read your last reply carefully. If even Nouveau mitigates the problem, then it is obviously not an nvidia problem.

#10 2019-06-04 07:19:29

Han Vinke
Member
Registered: 2017-02-18
Posts: 17

Re: nvidia 390xx with kernel 5.1.x

@Dextrey1

Did you also test Nouveau with the modeset=0 option, by any chance, to disable KMS?

#11 2019-06-08 21:36:02

Han Vinke
Member
Registered: 2017-02-18
Posts: 17

Re: nvidia 390xx with kernel 5.1.x

I have tested again with another, newer video card (GTX 1060). With the 390xx drivers: same problem. With the 430.14 driver installed: not a single failed boot, no errors at all.

So my guess: this "Data Loss Bug Due To Overly Aggressive FSTRIM" that appeared in kernel 5.1 is only solved for the 430.14 driver, but is still present in the legacy 390xx series driver.
