#51 2021-05-03 15:00:59

evil.genius
Member
Registered: 2016-02-13
Posts: 9

Re: [SOLVED] nvidia driver causes kernel panic

ammonium wrote:

Could it be the resolution or the signal? When I look at the nvidia-settings on the DP-X specific page it says that the signal is TMDS, is this different from normal?

Both of my monitors are running at 1440p over DP. One works and one doesn't, so I don't think resolution is the issue. The only difference between the two is that one is using four lanes and the other (non-working) one uses two.

I think TMDS is only for HDMI and not DP, so likely not relevant. I'm not sure though.

This thread on the nvidia forum seems to be the same problem and has received a response from an nvidia dev:
https://forums.developer.nvidia.com/t/4 … t/175782/8

Offline

#52 2021-05-03 16:22:49

ShayBox
Member
Registered: 2021-05-03
Posts: 5

Re: [SOLVED] nvidia driver causes kernel panic

Same problem
Previously: 1080p DP, 720p DP to VGA, 2nd 720p HDMI to VGA, Valve Index DP
Now: 1440p DP, 1080p DP, 720p HDMI to VGA, Valve Index DP
It seems to happen when the total size of the X screen exceeds some limit, regardless of the connector type, but I could be wrong.
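
One way to compare setups on this point is to read the combined X screen size from xrandr. A minimal sketch, assuming the usual `Screen 0: minimum ..., current W x H, maximum ...` header line that xrandr prints; `x_screen_size` is a hypothetical helper name, and no particular size limit is known from this thread:

```shell
# Print the current X screen size as "WIDTH HEIGHT".
# Reads xrandr-style output on stdin and uses the first "Screen N:" line.
x_screen_size() {
  sed -n 's/^Screen [0-9]*: .*current \([0-9]*\) x \([0-9]*\),.*/\1 \2/p' | head -n 1
}
```

Usage would be `xrandr --query | x_screen_size`, run inside the X session on a configuration that still boots.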

Offline

#53 2021-05-03 22:16:58

ammonium
Member
Registered: 2021-04-21
Posts: 5

Re: [SOLVED] nvidia driver causes kernel panic

ammonium wrote:

So I tested again now using a "DP to HDMI converter" (using the DP-out from the GPU and converting it to HDMI) and again it worked fine. Unfortunately this adapter/converter is limited to 1080p and the monitor is 1440p

Just an update: with another 'DP to HDMI adapter' it also works at the monitor's native resolution. I'm using 4k over HDMI plus 1440p over DP converted to HDMI, and everything works correctly with kernel 5.11.16-arch1-1 and Nvidia driver 465.27. A weird workaround, but that's what we've got for now.

Offline

#54 2021-05-04 08:20:03

O_o
Member
Registered: 2021-05-04
Posts: 1

Re: [SOLVED] nvidia driver causes kernel panic

Not a kernel panic but related: after an upgrade on April 29, my machine started freezing randomly, with a pattern of most freezes happening when the display went to sleep. This happened 2-3 times an hour, so it meant either a hardware issue or an upgrade issue.

Found this thread and downgraded packages as per MetalMatze's comment #24 and this fixed things for me.

AMD Ryzen 7 2700X with GeForce RTX 3070
Problematic packages: nvidia-465.27-2 (the probable culprit) and linux-5.11.16

Offline

#55 2021-05-04 14:49:50

cs_95cc64dd
Member
Registered: 2020-10-29
Posts: 3

Re: [SOLVED] nvidia driver causes kernel panic

I had the same problem. The copy-paste fix was:

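#!/bin/bash
# Download the nvidia 460.67 and linux 5.11.13 packages from the Arch Linux
# Archive, skipping files that are already present, then install them.
# Stop at the first failed download so a partial set is never installed.
set -e

for x in \
  nvidia-460.67-5-x86_64.pkg.tar.zst \
  nvidia-dkms-460.67-1-x86_64.pkg.tar.zst \
  nvidia-settings-460.67-1-x86_64.pkg.tar.zst \
  nvidia-utils-460.67-1-x86_64.pkg.tar.zst \
; do
  test -f "$x" || \
    wget "https://archive.archlinux.org/repos/2021/04/08/extra/os/x86_64/$x"
done

for x in \
  linux-5.11.13.arch1-1-x86_64.pkg.tar.zst \
  linux-headers-5.11.13.arch1-1-x86_64.pkg.tar.zst \
; do
  test -f "$x" || \
    wget "https://archive.archlinux.org/repos/2021/04/15/core/os/x86_64/$x"
done

# Run as root. This installs every .zst file in the current directory,
# so run the script from an otherwise empty directory.
pacman -U ./*.zst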

Offline

#56 2021-05-04 17:34:06

ShayBox
Member
Registered: 2021-05-03
Posts: 5

Re: [SOLVED] nvidia driver causes kernel panic

Easier command:
downgrade nvidia-dkms nvidia-utils nvidia-settings lib32-nvidia-utils
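
If the `downgrade` helper (an AUR package) is not installed, pacman can also install a specific old version straight from an Arch Linux Archive URL. A minimal sketch that only builds such a URL; `ala_url` is a hypothetical helper name, and the `/packages/<first letter>/<name>/` layout is an assumption about the archive that is worth checking in a browser first:

```shell
# Build an Arch Linux Archive URL for one package file.
# Usage: ala_url NAME VERSION [ARCH]   (ARCH defaults to x86_64)
ala_url() {
  name=$1 version=$2 arch=${3:-x86_64}
  # The archive groups packages by the first letter of their name.
  first=$(printf '%s' "$name" | cut -c1)
  printf 'https://archive.archlinux.org/packages/%s/%s/%s-%s-%s.pkg.tar.zst\n' \
    "$first" "$name" "$name" "$version" "$arch"
}
```

For example, `sudo pacman -U "$(ala_url nvidia-utils 460.67-1)"` would fetch and install that one version, using the version string from the script in post #55.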

Last edited by ShayBox (2021-05-04 17:34:34)

Offline

#57 2021-05-06 09:05:05

dummys
Member
Registered: 2015-12-29
Posts: 9

Re: [SOLVED] nvidia driver causes kernel panic

Any news about this issue? I don't want to downgrade...

Offline

#58 2021-05-06 18:33:12

Twister915
Member
Registered: 2021-05-06
Posts: 1

Re: [SOLVED] nvidia driver causes kernel panic

I've tried to upgrade every few days for the last month, and each time I use the latest packages I get a kernel panic while loading nvidia.

I only have to downgrade the nvidia packages to 460.67-4; I have not downgraded the kernel. I downgrade nvidia, nvidia-dkms, and nvidia-utils, and then my system boots. I am on kernel 5.11.16-arch1-1.

The following is the kernel panic on boot with latest nvidia packages (version 465.27-4):

BUG: kernel NULL pointer dereference, address: 000000000000001c
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0 
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 11 PID: 812 Comm: systemd-udevd Tainted: P           OE     5.11.16-arch1-1 #1
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570 Pro4, BIOS P3.00 06/17/2020
RIP: 0010:_nv032271rm+0xe/0x80 [nvidia]
Code: 89 d2 e8 45 88 f7 c4 48 83 c4 08 48 83 c5 50 c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 85 ff 74 5f 48 8b 17 48 89 f8 48 8b 0a <39> 71 04 74 53 8b 52 10 48 29 d0 48 8b 10 48 8b 12 39 72 04 74 42
RSP: 0018:ffff97b24389f578 EFLAGS: 00010282
RAX: ffff88fd439e8008 RBX: ffff88fd439e8008 RCX: 0000000000000018
RDX: ffffffffc4742400 RSI: 0000000000497031 RDI: ffff88fd439e8008
RBP: ffff88fd439d5c40 R08: 0000000000000020 R09: ffff88fd439d5c68
R10: ffff88fd058a0008 R11: 0000000010000010 R12: 00000000007ef3cb
R13: 0000000000000000 R14: 00000000000927c0 R15: ffff88fd058a0008
FS:  00007fcff5050a40(0000) GS:ffff890beecc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000001c CR3: 0000000106c38000 CR4: 0000000000350ee0
Call Trace:
 ? _nv032275rm+0x15/0x90 [nvidia]
 ? _nv039682rm+0x18/0xc0 [nvidia]
 ? _nv039641rm+0xf1/0x1f0 [nvidia]
 ? _nv018507rm+0xc2/0x180 [nvidia]
 ? _nv018462rm+0x19a/0x750 [nvidia]
 ? _nv032436rm+0x14b/0x200 [nvidia]
 ? _nv032436rm+0xab/0x200 [nvidia]
 ? _nv000859rm+0x2a5/0x470 [nvidia]
 ? _nv009647rm+0x4c3/0x650 [nvidia]
 ? _nv032313rm+0x11f/0x270 [nvidia]
 ? _nv032310rm+0x15d/0x1a0 [nvidia]
 ? _nv015534rm+0x232/0x330 [nvidia]
 ? _nv015556rm+0x7fd/0x1020 [nvidia]
 ? _nv027155rm+0x22c/0x4f0 [nvidia]
 ? _nv017787rm+0x303/0x5e0 [nvidia]
 ? _nv017788rm+0x30/0xa0 [nvidia]
 ? _nv017789rm+0xe1/0x220 [nvidia]
 ? _nv022829rm+0xed/0x220 [nvidia]
 ? _nv023065rm+0x30/0x60 [nvidia]
 ? _nv000704rm+0x16da/0x22b0 [nvidia]
 ? rm_init_adapter+0xc5/0xe0 [nvidia]
 ? kthread_create_on_node+0x51/0x70
 ? nv_open_device+0x122/0x8a0 [nvidia]
 ? nvidia_dev_get+0x63/0xb0 [nvidia]
 ? nvkms_open_gpu+0x4e/0x90 [nvidia_modeset]
 ? _nv000010kms+0x40/0x260 [nvidia_modeset]
 ? printk+0x68/0x7f
 ? security_kernfs_init_security+0x2a/0x40
 ? nv_drm_load+0xac/0x3ae [nvidia_drm]
 ? nv_drm_master_drop+0x60/0x60 [nvidia_drm]
 ? drm_dev_register+0xc8/0x1b0 [drm]
 ? nv_drm_probe_devices+0x184/0x210 [nvidia_drm]
 ? 0xffffffffc0a8e000
 ? do_one_initcall+0x57/0x220
 ? do_init_module+0x5c/0x270
 ? load_module+0x243e/0x2610
 ? __do_sys_init_module+0x136/0x1b0
 ? do_syscall_64+0x33/0x40
 ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
Modules linked in: nvidia_drm(POE+) nvidia_modeset(POE) ucsi_ccg typec_ucsi intel_rapl_msr typec wmi_bmof nvidia(POE) snd_hda_codec_realtek snd_hda_codec_generic iwlmvm ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg m>
 i2c_nvidia_gpu soundcore fb_sys_fops curve25519_x86_64 rfkill dca libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libblake2s blake2s_x86_64 ip6_udp_tunnel wmi udp_tunnel libcurve25519_generic libchacha libblake2s_generic pinctrl_am>
CR2: 000000000000001c
---[ end trace b5ea4402a89e97ae ]---
RIP: 0010:_nv032271rm+0xe/0x80 [nvidia]
Code: 89 d2 e8 45 88 f7 c4 48 83 c4 08 48 83 c5 50 c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 85 ff 74 5f 48 8b 17 48 89 f8 48 8b 0a <39> 71 04 74 53 8b 52 10 48 29 d0 48 8b 10 48 8b 12 39 72 04 74 42
RSP: 0018:ffff97b24389f578 EFLAGS: 00010282
RAX: ffff88fd439e8008 RBX: ffff88fd439e8008 RCX: 0000000000000018
RDX: ffffffffc4742400 RSI: 0000000000497031 RDI: ffff88fd439e8008
RBP: ffff88fd439d5c40 R08: 0000000000000020 R09: ffff88fd439d5c68
R10: ffff88fd058a0008 R11: 0000000010000010 R12: 00000000007ef3cb
R13: 0000000000000000 R14: 00000000000927c0 R15: ffff88fd058a0008
FS:  00007fcff5050a40(0000) GS:ffff890beecc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000001c CR3: 0000000106c38000 CR4: 0000000000350ee0

Any workarounds? How do I keep CUDA working?
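
To compare traces like this one across reports, the faulting symbol can be pulled out of a saved log. A minimal sketch, assuming the log is plain text on stdin; `oops_rip` is a hypothetical helper name, not part of any tool mentioned in this thread:

```shell
# Print "SYMBOL MODULE" for each distinct faulting RIP line that lies in a
# kernel module. RIP lines without a [module] suffix are skipped.
oops_rip() {
  sed -n 's/.*RIP: [0-9a-f]*:\([^ +]*\)+[^ ]* \[\([^]]*\)].*/\1 \2/p' | sort -u
}
```

With a persistent journal, `journalctl -k -b -1 | oops_rip` would read the previous boot's kernel log. For the trace in this post it prints `_nv032271rm nvidia`.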

Offline

#59 2021-05-06 19:37:57

seth
Member
Registered: 2012-09-03
Posts: 22,053

Re: [SOLVED] nvidia driver causes kernel panic

The workaround is to use 460 and there's not going to be a solution until nvidia releases an update for the driver.
CUDA should™ still work, though? It's not bound to the exact driver version.

Online

#60 2021-05-07 03:09:48

ShayBox
Member
Registered: 2021-05-03
Posts: 5

Re: [SOLVED] nvidia driver causes kernel panic

CUDA still works for me on 460

Offline

#61 2021-05-10 20:35:37

Jubijub
Member
From: Lausanne, Switzerland
Registered: 2018-04-04
Posts: 22
Website

Re: [SOLVED] nvidia driver causes kernel panic

Same issue here: 1080 Ti with an AMD Ryzen 5950X, also with a screen (Dell U3419W) connected via DP.

I can install the drivers and boot to a tty if I don't set the modeset option in the bootloader parameters and don't load the nvidia modules in mkinitcpio.conf. In that state, though, a simple nvidia-smi hangs the system.
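
To confirm whether modesetting is actually enabled on a given boot, the kernel command line can be checked. A minimal sketch, assuming "the modeset" refers to the `nvidia-drm.modeset=1` kernel parameter; `has_nvidia_modeset` is a hypothetical helper name:

```shell
# Succeed if the given kernel command line file enables nvidia-drm modesetting.
# Defaults to the running kernel's /proc/cmdline. The module name is accepted
# with either a dash or an underscore.
has_nvidia_modeset() {
  tr ' ' '\n' < "${1:-/proc/cmdline}" | grep -qx 'nvidia[-_]drm\.modeset=1'
}
```

For example: `has_nvidia_modeset && echo "modeset enabled"`. This only inspects the boot parameters; it does not detect options set through modprobe configuration files.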

Offline

#62 2021-05-10 21:35:18

walmartshopper
Member
Registered: 2010-03-31
Posts: 33

Re: [SOLVED] nvidia driver causes kernel panic

Also affected here with an RTX 3080. I'm running four 1440p monitors and using all the ports on the GPU, so I can't just switch from DP to HDMI. However, downgrading to the last 460 release and ignoring all nvidia packages in pacman.conf still works, and the dkms version still builds on the latest kernel.
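
Ignoring packages is done with an `IgnorePkg` line in the `[options]` section of pacman.conf, for example `IgnorePkg = nvidia-dkms nvidia-utils nvidia-settings` (the exact package list depends on what is installed). A minimal sketch for checking what is currently held back; `ignored_pkgs` is a hypothetical helper name:

```shell
# Print the package names listed on IgnorePkg lines of a pacman.conf,
# one per line. Commented-out lines are skipped.
# Usage: ignored_pkgs [FILE]   (FILE defaults to /etc/pacman.conf)
ignored_pkgs() {
  sed -n 's/^[[:space:]]*IgnorePkg[[:space:]]*=[[:space:]]*//p' "${1:-/etc/pacman.conf}" |
    tr -s ' \t' '\n\n' | sed '/^$/d'
}
```

Running `ignored_pkgs` before a full upgrade is a quick reminder that the held-back packages will need to be removed from the list once a fixed driver is released.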

Offline

#63 2021-05-10 22:29:55

Cavsfan
Member
From: USA
Registered: 2015-07-08
Posts: 100

Re: [SOLVED] nvidia driver causes kernel panic

This exact same thing has been happening to me. I noticed a while back that my fans were not turning. Luckily I found this thread.
As mentioned, this temporarily fixed the problem:

sudo pacman -U nvidia-460.67-5-x86_64.pkg.tar.zst nvidia-utils-460.67-1-x86_64.pkg.tar.zst nvidia-settings-460.67-1-x86_64.pkg.tar.zst linux-5.11.11.arch1-1-x86_64.pkg.tar.zst linux-headers-5.11.11.arch1-1-x86_64.pkg.tar.zst

I spent a lot of time re-installing nvidia drivers, etc., and nothing worked until I saw the above.
I was hoping this thread was not closed, as the problem is definitely still there.

So far, other updates have gone smoothly.

Edit: I guess it was more than just the fans for most. The fans were all I lost, but still, one needs fans.
        Nvidia Geforce GTX 980 Ti card
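
Since `pacman -U` with local file names needs every file to be present, it can help to check the list first. A minimal sketch, assuming the packages sit in one directory such as pacman's default cache `/var/cache/pacman/pkg`; `missing_pkgs` is a hypothetical helper name:

```shell
# Print every listed package file that is missing from DIR.
# Returns non-zero if at least one file is missing.
# Usage: missing_pkgs DIR FILE...
missing_pkgs() {
  dir=$1; shift
  status=0
  for f in "$@"; do
    if [ ! -f "$dir/$f" ]; then
      printf '%s\n' "$f"
      status=1
    fi
  done
  return "$status"
}
```

For example, `missing_pkgs /var/cache/pacman/pkg nvidia-460.67-5-x86_64.pkg.tar.zst nvidia-utils-460.67-1-x86_64.pkg.tar.zst` lists whichever of the two files still has to be downloaded.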

Last edited by Cavsfan (2021-06-22 20:00:21)

Offline

#64 2021-05-19 04:00:45

vinumoses
Member
Registered: 2021-05-01
Posts: 5

Re: [SOLVED] nvidia driver causes kernel panic

Has anyone who was affected by the DisplayPort bug for the Nvidia GTX / RTX series graphics cards with the 465.xxx graphics driver tried out the new Nvidia 465.31-2 driver to see if it fixes this? (I've switched over to the Nouveau driver in the interim, while waiting for a fix.)

Offline

#65 2021-05-19 07:01:55

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 13,247

Re: [SOLVED] nvidia driver causes kernel panic

The changelog doesn't read like it would fix this, but I'm not affected: https://www.nvidia.de/Download/driverRe … px/175763/

Offline

#66 2021-05-19 18:52:40

keibak
Member
Registered: 2017-05-24
Posts: 46

Re: [SOLVED] nvidia driver causes kernel panic

I'm sorry to report that the kernel panic persists with nvidia-465.31-1:

Mai 19 20:39:59 Angband kernel: WARNING: CPU: 9 PID: 521 at mm/kfence/core.c:133 kfence_unprotect+0x18/0x30
Mai 19 20:39:59 Angband kernel: Modules linked in: nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp mei_hdcp coretemp iTCO_wdt eeepc_wmi ee1004 asus_wmi intel_pmc_bxt iTCO_vendor_support wmi_bmof sparse_keymap intel_wmi_thunderbolt mxm_wmi kvm_intel nls_iso8859_1 vfat fat snd_hda_codec_realtek snd_hda_codec_generic wl(POE) kvm snd_hda_codec_hdmi ledtrig_audio irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi crypto_simd cfg80211 snd_hda_codec cryptd snd_hda_core rapl intel_cstate drm_kms_helper intel_uncore snd_hwdep snd_pcm snd_timer cec e1000e snd mei_me pcspkr i2c_i801 syscopyarea sysfillrect i2c_smbus rfkill sysimgblt mei soundcore fb_sys_fops wmi video mac_hid acpi_pad vboxnetflt(OE) vboxnetadp(OE) vboxdrv(OE) drm uhid sg crypto_user fuse agpgart bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 usbhid sr_mod xhci_pci crc32c_intel cdrom
Mai 19 20:39:59 Angband kernel:  xhci_pci_renesas
Mai 19 20:39:59 Angband kernel: CPU: 9 PID: 521 Comm: Xorg Tainted: P      D W  OE     5.12.4-arch1-2 #1
Mai 19 20:39:59 Angband kernel: Hardware name: System manufacturer System Product Name/PRIME Z370-A, BIOS 2401 07/12/2019
Mai 19 20:39:59 Angband kernel: RIP: 0010:kfence_unprotect+0x18/0x30
Mai 19 20:39:59 Angband kernel: Code: 05 ec ba 93 01 00 c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 81 e7 00 f0 ff ff 31 f6 e8 fd fe ff ff 84 c0 74 01 c3 <0f> 0b c6 05 bf ba 93 01 00 c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f
Mai 19 20:39:59 Angband kernel: RSP: 0018:ffffbb1b01007bf0 EFLAGS: 00010046
Mai 19 20:39:59 Angband kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffbb1b01007bcc
Mai 19 20:39:59 Angband kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Mai 19 20:39:59 Angband kernel: RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000000000000
Mai 19 20:39:59 Angband kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Mai 19 20:39:59 Angband kernel: R13: ffffbb1b01007c88 R14: 0000000000000086 R15: 0000000000000000
Mai 19 20:39:59 Angband kernel: FS:  0000000000000000(0000) GS:ffff952faec40000(0000) knlGS:0000000000000000
Mai 19 20:39:59 Angband kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mai 19 20:39:59 Angband kernel: CR2: 0000000000000086 CR3: 0000000099810004 CR4: 00000000003706e0
Mai 19 20:39:59 Angband kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mai 19 20:39:59 Angband kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Mai 19 20:39:59 Angband kernel: Call Trace:
Mai 19 20:39:59 Angband kernel:  page_fault_oops+0x9d/0x2d0
Mai 19 20:39:59 Angband kernel:  exc_page_fault+0x67/0x170
Mai 19 20:39:59 Angband kernel:  asm_exc_page_fault+0x1e/0x30
Mai 19 20:39:59 Angband kernel: RIP: 0010:_nv009371rm+0x3c/0x340 [nvidia]
Mai 19 20:39:59 Angband kernel: Code: 07 0f 1f 44 00 00 31 d2 48 8b 07 48 85 c0 75 1a e9 a1 02 00 00 66 0f 1f 84 00 00 00 00 00 48 8b 48 10 48 85 c9 74 17 48 89 c8 <48> 39 30 77 ef 0f 83 29 02 00 00 48 8b 48 18 48 85 c9 75 e9 48 89
Mai 19 20:39:59 Angband kernel: RSP: 0018:ffffbb1b01007d38 EFLAGS: 00010002
Mai 19 20:39:59 Angband kernel: RAX: 0000000000000086 RBX: ffffbb1b01007d80 RCX: 0000000000000086
Mai 19 20:39:59 Angband kernel: RDX: ffffbb1b01007dd0 RSI: 0000000000000209 RDI: ffffffffc377f658
Mai 19 20:39:59 Angband kernel: RBP: ffff95289826dff0 R08: 000000000000001f R09: 0000000000000000
Mai 19 20:39:59 Angband kernel: R10: ffffffffffffff01 R11: 0000000000000000 R12: 0000000000000000
Mai 19 20:39:59 Angband kernel: R13: ffffffffc377fe40 R14: ffff952895aae800 R15: ffffffffc377ca80
Mai 19 20:39:59 Angband kernel:  ? _nv039634rm+0xdf/0x1e0 [nvidia]
Mai 19 20:39:59 Angband kernel:  ? rm_cleanup_file_private+0x42/0x140 [nvidia]
Mai 19 20:39:59 Angband kernel:  ? nv_acpi_uninit+0x20/0xe0 [nvidia]
Mai 19 20:39:59 Angband kernel:  ? nvidia_close+0x150/0x310 [nvidia]
Mai 19 20:39:59 Angband kernel:  ? nvidia_frontend_close+0x2b/0x50 [nvidia]
Mai 19 20:39:59 Angband kernel:  ? __fput+0x8c/0x230
Mai 19 20:39:59 Angband kernel:  ? task_work_run+0x5c/0x90
Mai 19 20:39:59 Angband kernel:  ? do_exit+0x375/0xa50
Mai 19 20:39:59 Angband kernel:  ? do_sys_openat2+0xb0/0x160
Mai 19 20:39:59 Angband kernel:  ? rewind_stack_do_exit+0x17/0x17
Mai 19 20:39:59 Angband kernel: ---[ end trace 18515c0fd1051668 ]---
Mai 19 20:39:59 Angband kernel: BUG: kernel NULL pointer dereference, address: 0000000000000086
Mai 19 20:39:59 Angband kernel: #PF: supervisor read access in kernel mode
Mai 19 20:39:59 Angband kernel: #PF: error_code(0x0000) - not-present page
Mai 19 20:39:59 Angband kernel: PGD 0 P4D 0 
Mai 19 20:39:59 Angband kernel: Oops: 0000 [#2] PREEMPT SMP PTI
Mai 19 20:39:59 Angband kernel: CPU: 9 PID: 521 Comm: Xorg Tainted: P      D W  OE     5.12.4-arch1-2 #1
Mai 19 20:39:59 Angband kernel: Hardware name: System manufacturer System Product Name/PRIME Z370-A, BIOS 2401 07/12/2019
Mai 19 20:39:59 Angband kernel: RIP: 0010:_nv009371rm+0x3c/0x340 [nvidia]
Mai 19 20:39:59 Angband kernel: Code: 07 0f 1f 44 00 00 31 d2 48 8b 07 48 85 c0 75 1a e9 a1 02 00 00 66 0f 1f 84 00 00 00 00 00 48 8b 48 10 48 85 c9 74 17 48 89 c8 <48> 39 30 77 ef 0f 83 29 02 00 00 48 8b 48 18 48 85 c9 75 e9 48 89
Mai 19 20:39:59 Angband kernel: RSP: 0018:ffffbb1b01007d38 EFLAGS: 00010002
Mai 19 20:39:59 Angband kernel: RAX: 0000000000000086 RBX: ffffbb1b01007d80 RCX: 0000000000000086
Mai 19 20:39:59 Angband kernel: RDX: ffffbb1b01007dd0 RSI: 0000000000000209 RDI: ffffffffc377f658
Mai 19 20:39:59 Angband kernel: RBP: ffff95289826dff0 R08: 000000000000001f R09: 0000000000000000
Mai 19 20:39:59 Angband kernel: R10: ffffffffffffff01 R11: 0000000000000000 R12: 0000000000000000
Mai 19 20:39:59 Angband kernel: R13: ffffffffc377fe40 R14: ffff952895aae800 R15: ffffffffc377ca80
Mai 19 20:39:59 Angband kernel: FS:  0000000000000000(0000) GS:ffff952faec40000(0000) knlGS:0000000000000000
Mai 19 20:39:59 Angband kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mai 19 20:39:59 Angband kernel: CR2: 0000000000000086 CR3: 0000000099810004 CR4: 00000000003706e0
Mai 19 20:39:59 Angband kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mai 19 20:39:59 Angband kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Mai 19 20:39:59 Angband kernel: Call Trace:
Mai 19 20:39:59 Angband kernel:  ? _nv039634rm+0xdf/0x1e0 [nvidia]
Mai 19 20:39:59 Angband kernel:  ? rm_cleanup_file_private+0x42/0x140 [nvidia]
Mai 19 20:39:59 Angband kernel:  ? nv_acpi_uninit+0x20/0xe0 [nvidia]
Mai 19 20:39:59 Angband kernel:  ? nvidia_close+0x150/0x310 [nvidia]
Mai 19 20:39:59 Angband kernel:  ? nvidia_frontend_close+0x2b/0x50 [nvidia]
Mai 19 20:39:59 Angband kernel:  ? __fput+0x8c/0x230
Mai 19 20:39:59 Angband kernel:  ? task_work_run+0x5c/0x90
Mai 19 20:39:59 Angband kernel:  ? do_exit+0x375/0xa50
Mai 19 20:39:59 Angband kernel:  ? do_sys_openat2+0xb0/0x160
Mai 19 20:39:59 Angband kernel:  ? rewind_stack_do_exit+0x17/0x17
Mai 19 20:39:59 Angband kernel: Modules linked in: nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp mei_hdcp coretemp iTCO_wdt eeepc_wmi ee1004 asus_wmi intel_pmc_bxt iTCO_vendor_support wmi_bmof sparse_keymap intel_wmi_thunderbolt mxm_wmi kvm_intel nls_iso8859_1 vfat fat snd_hda_codec_realtek snd_hda_codec_generic wl(POE) kvm snd_hda_codec_hdmi ledtrig_audio irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi crypto_simd cfg80211 snd_hda_codec cryptd snd_hda_core rapl intel_cstate drm_kms_helper intel_uncore snd_hwdep snd_pcm snd_timer cec e1000e snd mei_me pcspkr i2c_i801 syscopyarea sysfillrect i2c_smbus rfkill sysimgblt mei soundcore fb_sys_fops wmi video mac_hid acpi_pad vboxnetflt(OE) vboxnetadp(OE) vboxdrv(OE) drm uhid sg crypto_user fuse agpgart bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 usbhid sr_mod xhci_pci crc32c_intel cdrom
Mai 19 20:39:59 Angband kernel:  xhci_pci_renesas
Mai 19 20:39:59 Angband kernel: CR2: 0000000000000086
Mai 19 20:39:59 Angband kernel: ---[ end trace 18515c0fd1051669 ]---
Mai 19 20:39:59 Angband kernel: RIP: 0010:_nv015537rm+0x1b6/0x330 [nvidia]
Mai 19 20:39:59 Angband kernel: Code: 8b 87 68 05 00 00 ba 01 00 00 00 be 02 00 00 00 e8 cf dc 38 c4 41 83 c5 01 41 83 fd 1f 0f 84 0b 01 00 00 48 8b 45 10 44 89 ee <48> 8b b8 70 01 00 00 48 8b 87 d8 04 00 00 e8 a7 dc 38 c4 89 c3 48
Mai 19 20:39:59 Angband kernel: RSP: 0018:ffffbb1b01007990 EFLAGS: 00010293
Mai 19 20:39:59 Angband kernel: RAX: 0000000000000000 RBX: 0000000000004000 RCX: 0000000000000002
Mai 19 20:39:59 Angband kernel: RDX: 0000000000000004 RSI: 0000000000000002 RDI: 0000000000000000
Mai 19 20:39:59 Angband kernel: RBP: ffff95289827add0 R08: 0000000000000001 R09: ffff95289827acb8
Mai 19 20:39:59 Angband kernel: R10: ffff9528a2d5c008 R11: 0000000010100000 R12: 0000000000004000
Mai 19 20:39:59 Angband kernel: R13: 0000000000000002 R14: ffff952898144010 R15: 0000000000000800
Mai 19 20:39:59 Angband kernel: FS:  0000000000000000(0000) GS:ffff952faec40000(0000) knlGS:0000000000000000
Mai 19 20:39:59 Angband kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mai 19 20:39:59 Angband kernel: CR2: 0000000000000086 CR3: 0000000099810004 CR4: 00000000003706e0
Mai 19 20:39:59 Angband kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mai 19 20:39:59 Angband kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Mai 19 20:39:59 Angband kernel: note: Xorg[521] exited with preempt_count 1
Mai 19 20:39:59 Angband kernel: Fixing recursive fault but reboot is needed!
Mai 19 20:39:59 Angband sddm[511]: Failed to read display number from pipe
Mai 19 20:39:59 Angband sddm[511]: Display server stopping...
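
Journal output like this carries a date, time and host prefix on every line, which makes it awkward to diff against a raw trace from another machine. A minimal sketch that keeps only the kernel messages and strips that prefix; `kernel_lines` is a hypothetical helper name, and the prefix pattern assumes the short syslog-style journal format shown in this post:

```shell
# Keep only kernel messages from journal output on stdin and strip the
# "MONTH DAY TIME HOST kernel: " prefix. Lines from other units are dropped.
kernel_lines() {
  sed -n 's/^[A-Za-z]* *[0-9]* [0-9:]* [^ ]* kernel: //p'
}
```

Usage would be along the lines of `journalctl -b -1 | kernel_lines > trace.txt`, which requires a persistent journal to reach the previous boot.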

Offline

#67 2021-05-20 14:53:49

scyron
Member
Registered: 2021-04-30
Posts: 1

Re: [SOLVED] nvidia driver causes kernel panic

Reporting the same issue on nvidia 465.31-3. But I managed to work around it... Apparently it's related to DP connections and, based on user reports, affects high-res displays (1440p/2k/4k).

I want to thank MetalMatze for providing the directions in this thread; if it weren't for him I would have fallen behind at work, since I need the nvidia drivers functioning properly for what I do.

I was using the latest Arch installation image, archlinux-2021.05.01-x86_64.iso, which comes with gcc-11.1.0-1, and that's where the version jump is. You need to remove and downgrade gcc as well, just to compile nvidia-dkms.

You should downgrade the packages in the following sequence, and don't forget (sadly) to ignore updates for all of them, including the kernel. Only update gcc-10.2.0-6 to gcc-11.1.0-1 after compiling the nvidia driver.

linux-5.11.13.arch1-1-x86_64.pkg.tar.zst
linux-headers-5.11.13.arch1-1-x86_64.pkg.tar.zst
gcc-libs-10.2.0-6-x86_64.pkg.tar.zst
gcc-10.2.0-6-x86_64.pkg.tar.zst
nvidia-460.67-5-x86_64.pkg.tar.zst
nvidia-dkms-460.67-1-x86_64.pkg.tar.zst
nvidia-settings-460.67-1-x86_64.pkg.tar.zst
nvidia-utils-460.67-1-x86_64.pkg.tar.zst

For a permanent solution: get rid of your nvidia card, like I will be doing in the next few months. It's been an entire year of issues with nvidia: from their failing EFI ROMs that screwed up the iommu support (on purpose), to broken drivers that freeze, suspend mode issues and kernel panics. Arch and everything built into linux has never failed me and works flawlessly.

Sorry about the rant, but it needs to be mentioned: I strongly doubt this will ever be addressed. Hope it helps...

Last edited by scyron (2021-05-20 17:26:59)

Offline

#68 2021-05-21 06:25:00

ShayBox
Member
Registered: 2021-05-03
Posts: 5

Re: [SOLVED] nvidia driver causes kernel panic

You only have to downgrade the nvidia packages; the kernel and gcc can stay updated.
This has been my first issue in years. It will be fixed: it's in both 465 and 460 now, so it will start rolling out to distros like Ubuntu and get the traction needed to fix it. Right now it's a handful of people on a couple of rolling-release distros, with one guy looking into it.

Offline

#69 2021-05-21 11:11:41

dummys
Member
Registered: 2015-12-29
Posts: 9

Re: [SOLVED] nvidia driver causes kernel panic

scyron wrote:

Reporting the same issue on nvidia 465.31-3. But I managed to work around it... Apparently it's related to DP connections and, based on user reports, affects high-res displays (1440p/2k/4k).

I want to thank MetalMatze for providing the directions in this thread; if it weren't for him I would have fallen behind at work, since I need the nvidia drivers functioning properly for what I do.

I was using the latest Arch installation image, archlinux-2021.05.01-x86_64.iso, which comes with gcc-11.1.0-1, and that's where the version jump is. You need to remove and downgrade gcc as well, just to compile nvidia-dkms.

You should downgrade the packages in the following sequence, and don't forget (sadly) to ignore updates for all of them, including the kernel. Only update gcc-10.2.0-6 to gcc-11.1.0-1 after compiling the nvidia driver.

linux-5.11.13.arch1-1-x86_64.pkg.tar.zst
linux-headers-5.11.13.arch1-1-x86_64.pkg.tar.zst
gcc-libs-10.2.0-6-x86_64.pkg.tar.zst
gcc-10.2.0-6-x86_64.pkg.tar.zst
nvidia-460.67-5-x86_64.pkg.tar.zst
nvidia-dkms-460.67-1-x86_64.pkg.tar.zst
nvidia-settings-460.67-1-x86_64.pkg.tar.zst
nvidia-utils-460.67-1-x86_64.pkg.tar.zst

For a permanent solution: get rid of your nvidia card, like I will be doing in the next few months. It's been an entire year of issues with nvidia: from their failing EFI ROMs that screwed up the iommu support (on purpose), to broken drivers that freeze, suspend mode issues and kernel panics. Arch and everything built into linux has never failed me and works flawlessly.

Sorry about the rant, but it needs to be mentioned: I strongly doubt this will ever be addressed. Hope it helps...

Sure, but what card will you buy? Are we sure that AMD Radeon support is better in the kernel?

Offline

#70 2021-05-21 16:46:41

ShayBox
Member
Registered: 2021-05-03
Posts: 5

Re: [SOLVED] nvidia driver causes kernel panic

The problem has been fixed in 465.31

Offline

#71 2021-05-21 18:22:40

keibak
Member
Registered: 2017-05-24
Posts: 46

Re: [SOLVED] nvidia driver causes kernel panic

Offline

#72 2021-05-22 21:05:57

Cavsfan
Member
From: USA
Registered: 2015-07-08
Posts: 100

Re: [SOLVED] nvidia driver causes kernel panic

I updated everything yesterday and can confirm that the Nvidia fans still do not run.
I downgraded everything back to the packages MetalMatze mentioned and the fans work.

Edit: * Using display port

Last edited by Cavsfan (2021-05-25 03:34:53)

Offline

#73 2021-05-22 21:15:27

akovia
Member
Registered: 2014-12-16
Posts: 11

Re: [SOLVED] nvidia driver causes kernel panic

I have 465.31 installed and still can't get to my desktop unless I use HDMI.
Display Port is still broken for me.

Offline

#74 2021-05-25 10:42:36

funnypigrun
Member
Registered: 2019-07-14
Posts: 12

Re: [SOLVED] nvidia driver causes kernel panic

Where's Linus when you need him?

Offline

#75 2021-05-28 07:30:01

kokoko3k
Member
Registered: 2008-11-14
Posts: 2,156

Re: [SOLVED] nvidia driver causes kernel panic

dummys wrote:

[..]
sure what card do you will buy ? Are we sure that AMD radeon support is better in the kernel ?

I switched to AMD last year; of course it is better.


Help me to improve ssh-rdp !

Offline
