#1 2024-03-06 16:00:34

Sveske-juice
Member
From: Denmark
Registered: 2022-06-12
Posts: 18

GPU driver failing?

I am dual-booting Arch Linux and Windows. Windows works perfectly. I have an AMD iGPU as well as an NVIDIA dGPU.

When booting into Arch, the screen is mostly filled with corrupted pixels, with only a few slim vertical stripes where the actual content is visible. It happens both in the TTY and in a DE/WM. See the photos below, taken with top running on the live USB:

https://imgur.com/a/LuVfzx5
https://imgur.com/a/1mi7AjW

This started happening around December 2023 after an update. If I manually pin pacman's repositories to a date before the bug appeared, it works, and so does booting an old live USB. I've also tried booting an up-to-date NixOS live USB, and the screen worked as intended.

The issue also appears on a live USB. See below for the $ journalctl -b output:

https://pastebin.com/Y2nEeygU

The only things I noticed in the log are:

Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: RAS: optional ras ta ucode is not available
Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: RAP: optional rap ta ucode is not available
Mar 06 15:15:36 archiso kernel: [drm] psp gfx command LOAD_TA(0x1) failed and response status is (0x7)
Mar 06 15:15:36 archiso kernel: [drm] psp gfx command INVOKE_CMD(0x3) failed and response status is (0x4)
Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: Secure display: Generic Failure.
Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: SECUREDISPLAY: query securedisplay TA failed. ret 0x0
Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: SMU is initialized successfully!

But this seems to be something related to DRM and probably not important. And:

Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: SE 1, SH per SE 1, CU per SH 8, active_cu_number 8
Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0
Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: ring gfx_low uses VM inv eng 1 on hub 0
Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: ring gfx_high uses VM inv eng 4 on hub 0
Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 5 on hub 0
Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 6 on hub 0
Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 7 on hub 0
Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 8 on hub 0
Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 9 on hub 0
Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 10 on hub 0
Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 11 on hub 0
Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 12 on hub 0
Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: ring kiq_0.2.1.0 uses VM inv eng 13 on hub 0
Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: ring sdma0 uses VM inv eng 0 on hub 8
Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: ring vcn_dec uses VM inv eng 1 on hub 8
Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 4 on hub 8
Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 5 on hub 8
Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: amdgpu: ring jpeg_dec uses VM inv eng 6 on hub 8
Mar 06 15:15:36 archiso kernel: [drm] Initialized amdgpu 3.57.0 20150101 for 0000:05:00.0 on minor 2
Mar 06 15:15:36 archiso kernel: fbcon: amdgpudrmfb (fb0) is primary device
Mar 06 15:15:36 archiso kernel: [drm] DSC precompute is not needed.
Mar 06 15:15:36 archiso kernel: Console: switching to colour frame buffer device 240x67
Mar 06 15:15:36 archiso kernel: amdgpu 0000:05:00.0: [drm] fb0: amdgpudrmfb frame buffer device

Hardware info:
Laptop: ASUS ROG Zephyrus G15
CPU: AMD Ryzen 9 4900HS with AMD Radeon graphics
dGPU: NVIDIA RTX 2060 Max-Q

Last edited by Sveske-juice (2024-03-06 16:03:06)

#2 2024-03-06 17:17:53

seth
Member
Registered: 2012-09-03
Posts: 53,234

Re: GPU driver failing?

Mar 06 15:15:36 archiso kernel: nouveau 0000:01:00.0: [drm] Cannot find any crtc or sizes
Mar 06 15:15:36 archiso kernel: nouveau 0000:01:00.0: [drm] Cannot find any crtc or sizes
Mar 06 15:15:36 archiso kernel: nouveau 0000:01:00.0: [drm] Cannot find any crtc or sizes

You're running (the install iso) on nouveau, but nothing seems attached there, so probably not relevant.

Mar 06 15:15:36 archiso kernel: [drm] Initialized simpledrm 1.0.0 20200625 for simple-framebuffer.0 on minor 0
Mar 06 15:15:36 archiso kernel: simple-framebuffer simple-framebuffer.0: [drm] fb0: simpledrmdrmfb frame buffer device

That's the simpledrm device, add "initcall_blacklist=simpledrm_platform_driver_init" to the https://wiki.archlinux.org/title/Kernel_parameters
Alternatively, in case you want to use/try the nvidia blob, enable https://wiki.archlinux.org/title/NVIDIA … de_setting - use the "nvidia_drm.modeset=1" kernel parameter (modprobe.conf won't do!)

Mar 06 15:15:36 archiso kernel: RAS: Correctable Errors collector initialized.
Mar 06 15:15:36 archiso kernel: Unstable clock detected, switching default tracing clock to "global"
                                If you want to keep using the local clock, then add:
                                  "trace_clock=local"
                                on the kernel command line

Make sure Windows fast-start is (still, MS enables it with updates) disabled, 3rd link below.

Is the massive sub"pixel" hinting thing one can see in the photos part of the problem or your camera?
Does this happen w/ the LTS kernel? (What kernel does NixOS run?)
Does it help to add "amdgpu.ppfeaturemask=0xffffbffb" to the https://wiki.archlinux.org/title/Kernel_parameters ?
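
For anyone wondering what that mask changes, here is a minimal Python sketch. It assumes amdgpu.ppfeaturemask is a 32-bit bitfield in which a set bit keeps a power feature enabled, and it compares against an all-ones mask (that baseline is an assumption; the driver's real default may differ). It only lists which bit positions the suggested value clears. What each bit means is defined by enum PP_FEATURE_MASK in the kernel source and is not decoded here.

```python
# Suggested value from the post above.
MASK = 0xFFFFBFFB

# Assumed baseline for comparison: every one of the 32 bits set.
ALL_ONES = 0xFFFFFFFF

# Bits that are set in the baseline but cleared in the suggested mask.
cleared = ALL_ONES & ~MASK

# Positions (0 = least significant bit) of the cleared bits.
cleared_bits = [bit for bit in range(32) if (cleared >> bit) & 1]

print(hex(cleared), cleared_bits)
```

Relative to an all-ones mask, the suggested value turns off exactly two feature bits and leaves the rest alone.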

#3 2024-03-06 18:27:04

Sveske-juice
Member
From: Denmark
Registered: 2022-06-12
Posts: 18

Re: GPU driver failing?

seth wrote:

That's the simpledrm device, add "initcall_blacklist=simpledrm_platform_driver_init" to the https://wiki.archlinux.org/title/Kernel_parameters
Alternatively, in case you want to use/try the nvidia blob, enable https://wiki.archlinux.org/title/NVIDIA … de_setting - use the "nvidia_drm.modeset=1" kernel parameter (modprobe.conf won't do!)

I tried setting the kernel parameters you mentioned:

initcall_blacklist=simpledrm_platform_driver_init nvidia_drm.modeset=1 amdgpu.ppfeaturemask=0xffffbffb

in GRUB on the live USB. The issue still occurred. This is the new output from journalctl:

https://pastebin.com/ppab6zVa
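
When parameters are typed in at the boot menu, it can be worth confirming that they actually reached the kernel. The running kernel's command line is exposed in /proc/cmdline. Below is a minimal Python sketch of such a check. The helper names (missing_params, read_cmdline) and the sample command line are hypothetical and only there for illustration.

```python
from pathlib import Path


def missing_params(cmdline: str, wanted: list[str]) -> list[str]:
    """Return the entries of `wanted` that are not whole tokens in `cmdline`."""
    present = set(cmdline.split())
    return [param for param in wanted if param not in present]


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line of the running kernel (Linux only)."""
    return Path(path).read_text().strip()


wanted = [
    "initcall_blacklist=simpledrm_platform_driver_init",
    "nvidia_drm.modeset=1",
    "amdgpu.ppfeaturemask=0xffffbffb",
]

# Hypothetical command line for illustration, with the last parameter left out.
# On the live system, pass read_cmdline() instead.
sample = "BOOT_IMAGE=/arch/boot/x86_64/vmlinuz-linux " + " ".join(wanted[:2])

print(missing_params(sample, wanted))
```

An empty list means every parameter was picked up, so a typo at the boot prompt can be ruled out.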

Don't know if this might be related?

Mar 06 17:36:18 archiso kernel: nvidia-gpu 0000:01:00.3: enabling device (0000 -> 0002)
Mar 06 17:36:18 archiso kernel: ccp 0000:05:00.2: enabling device (0000 -> 0002)
Mar 06 17:36:18 archiso kernel: ccp 0000:05:00.2: ccp: unable to access the device: you might be running a broken BIOS.
Mar 06 17:36:18 archiso kernel: ccp 0000:05:00.2: tee enabled
Mar 06 17:36:18 archiso kernel: ccp 0000:05:00.2: psp enabled

seth wrote:

Make sure Windows fast-start is (still, MS enables it with updates) disabled, 3rd link below.

I ran this command in cmd on Windows:

> powercfg /H off

Then I cold-rebooted, but the issue persisted.

seth wrote:

Is the massive sub"pixel" hinting thing one can see in the photos part of the problem or your camera?

The subpixel hinting is a part of the issue. It's not the camera. Maybe something is corrupting the framebuffer somehow?

seth wrote:

Does this happen w/ the LTS kernel? (What kernel does NixOS run?)

I don't know how to run the live USB with the LTS kernel. Instead, I downloaded an old live USB image from https://archive.archlinux.org/iso/2023.09.01/ and flashed it. I booted it and there were no issues!

I will try to binary search my way to the specific date that causes the issue. I'll give an update ASAP.

As a side note, with the old live USB I had to manually press the power button, since it kept waiting for systemd-udevd and udevd-worker to quit. This is the journalctl of the working live USB: https://pastebin.com/7JQpHeMT

Last edited by Sveske-juice (2024-03-06 18:34:09)

#4 2024-03-06 19:12:07

Sveske-juice
Member
From: Denmark
Registered: 2022-06-12
Posts: 18

Re: GPU driver failing?

I tested these live USB versions:

https://archive.archlinux.org/iso/2023.09.01/: worked!
https://archive.archlinux.org/iso/2023.12.01/: worked, but with error
https://archive.archlinux.org/iso/2024.01.01/: worked, but with error
https://archive.archlinux.org/iso/2024.02.01/: BRICKED!

So it seems to be some update from January.
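
The narrowing-down above is a binary search over releases. Here is a minimal Python sketch of the idea, assuming the list is sorted oldest to newest, the newest entry is bad, and every release after the first bad one is bad too. The function name first_bad and the boot_test callback are hypothetical; in practice the "test" is booting the ISO and looking at the screen, so here it is simulated with the results reported above.

```python
from typing import Callable, Sequence


def first_bad(versions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest version for which is_bad() is true.

    Assumes versions are sorted oldest to newest, the last one is bad,
    and every version after the first bad one is bad as well.
    """
    lo, hi = 0, len(versions) - 1  # invariant: versions[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid       # first bad version is at mid or earlier
        else:
            lo = mid + 1   # first bad version is after mid
    return versions[lo]


isos = ["2023.09.01", "2023.12.01", "2024.01.01", "2024.02.01"]

# Results reported above: True means the screen was corrupted.
observed = {
    "2023.09.01": False,
    "2023.12.01": False,
    "2024.01.01": False,
    "2024.02.01": True,
}

tested = []  # records which ISOs the search actually had to boot


def boot_test(iso: str) -> bool:
    tested.append(iso)
    return observed[iso]


culprit = first_bad(isos, boot_test)
print(culprit, tested)
```

The same loop works on a list of daily builds; each step halves the range that still has to be booted.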

When testing the various versions, almost all of them logged this error to the journal:

Mar 06 18:43:41 archiso kernel: [drm] Initialized nouveau 1.4.0 20120801 for 0000:01:00.0 on minor 1
Mar 06 18:43:41 archiso kernel: BUG: kernel NULL pointer dereference, address: 0000000000000020
Mar 06 18:43:41 archiso kernel: #PF: supervisor read access in kernel mode
Mar 06 18:43:41 archiso kernel: #PF: error_code(0x0000) - not-present page
Mar 06 18:43:41 archiso kernel: PGD 0 P4D 0 
Mar 06 18:43:41 archiso kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Mar 06 18:43:41 archiso kernel: CPU: 14 PID: 212 Comm: (udev-worker) Not tainted 6.6.3-arch1-1 #1 6156c717f7d423f5954ce718462aaaaa43b9110d
Mar 06 18:43:41 archiso kernel: Hardware name: ASUSTeK COMPUTER INC. ROG Zephyrus G15 GA502IV_GA502IV/GA502IV, BIOS GA502IV.301 09/28/2023
Mar 06 18:43:41 archiso kernel: RIP: 0010:nvif_object_mthd+0xa6/0x200 [nouveau]
Mar 06 18:43:41 archiso kernel: Code: 00 e8 ae 8b 2b d8 49 8b 44 24 08 41 8d 56 20 4c 8d 7c 24 28 49 39 c4 0f 84 f9 00 00 00 4c 89 63 10 31 c9 48 89 de c6 43 06 ff <48> 8b 78 20 48 8b 40 38 48 8b 40 28 e8 d9 ff 2c d8 48 8b 3c 24 4c
Mar 06 18:43:41 archiso kernel: RSP: 0018:ffffc900071f35d0 EFLAGS: 00010246
Mar 06 18:43:41 archiso kernel: RAX: 0000000000000000 RBX: ffffc900071f35d8 RCX: 0000000000000000
Mar 06 18:43:41 archiso kernel: RDX: 0000000000000028 RSI: ffffc900071f35d8 RDI: ffffc900071f35f8
Mar 06 18:43:41 archiso kernel: RBP: ffff88811e86c800 R08: 0000000000000000 R09: 0000000000000000
Mar 06 18:43:41 archiso kernel: R10: ffff888101dc7a80 R11: 0002000000000000 R12: ffff8881213f5540
Mar 06 18:43:41 archiso kernel: R13: ffffc900071f35d8 R14: 0000000000000008 R15: ffffc900071f35f8
Mar 06 18:43:41 archiso kernel: FS:  00007fa81af1c480(0000) GS:ffff88880f980000(0000) knlGS:0000000000000000
Mar 06 18:43:41 archiso kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 06 18:43:41 archiso kernel: CR2: 0000000000000020 CR3: 000000011b194000 CR4: 0000000000350ee0
Mar 06 18:43:41 archiso kernel: Call Trace:
Mar 06 18:43:41 archiso kernel:  <TASK>
Mar 06 18:43:41 archiso kernel:  ? __die+0x23/0x70
Mar 06 18:43:41 archiso kernel:  ? page_fault_oops+0x171/0x4e0
Mar 06 18:43:41 archiso kernel:  ? srso_return_thunk+0x5/0x10
Mar 06 18:43:41 archiso kernel:  ? exc_page_fault+0x7f/0x180
Mar 06 18:43:41 archiso kernel:  ? asm_exc_page_fault+0x26/0x30
Mar 06 18:43:41 archiso kernel:  ? nvif_object_mthd+0xa6/0x200 [nouveau bf42bf36e8326bf06d6fea67bc3ccb5e7f7bbc6b]
Mar 06 18:43:41 archiso kernel:  nvif_conn_hpd_status+0x39/0xf0 [nouveau bf42bf36e8326bf06d6fea67bc3ccb5e7f7bbc6b]
Mar 06 18:43:41 archiso kernel:  nouveau_dp_detect+0x86/0x4f0 [nouveau bf42bf36e8326bf06d6fea67bc3ccb5e7f7bbc6b]
Mar 06 18:43:41 archiso kernel:  ? nvkm_uvmm_new+0x192/0x1d0 [nouveau bf42bf36e8326bf06d6fea67bc3ccb5e7f7bbc6b]
Mar 06 18:43:41 archiso kernel:  nouveau_connector_detect+0xa4/0x5b0 [nouveau bf42bf36e8326bf06d6fea67bc3ccb5e7f7bbc6b]
Mar 06 18:43:41 archiso kernel:  ? srso_return_thunk+0x5/0x10
Mar 06 18:43:41 archiso kernel:  drm_helper_probe_detect+0x88/0xb0
Mar 06 18:43:41 archiso kernel:  drm_helper_probe_single_connector_modes+0x455/0x540
Mar 06 18:43:41 archiso kernel:  ? drm_client_modeset_probe+0x1b9/0x1520
Mar 06 18:43:41 archiso kernel:  drm_client_modeset_probe+0x24b/0x1520
Mar 06 18:43:41 archiso kernel:  ? drm_sched_entity_init+0x106/0x1a0 [gpu_sched 7a23027f949c9f696f001cfeb375cbb5430b11dc]
Mar 06 18:43:41 archiso kernel:  ? nouveau_sched_entity_init+0x8b/0xa0 [nouveau bf42bf36e8326bf06d6fea67bc3ccb5e7f7bbc6b]
Mar 06 18:43:41 archiso kernel:  ? srso_return_thunk+0x5/0x10
Mar 06 18:43:41 archiso kernel:  ? __pm_runtime_suspend+0x4a/0xd0
Mar 06 18:43:41 archiso kernel:  ? srso_return_thunk+0x5/0x10
Mar 06 18:43:41 archiso kernel:  ? nouveau_drm_open+0x8e/0x1e0 [nouveau bf42bf36e8326bf06d6fea67bc3ccb5e7f7bbc6b]
Mar 06 18:43:41 archiso kernel:  __drm_fb_helper_initial_config_and_unlock+0x3d/0x540
Mar 06 18:43:41 archiso kernel:  ? srso_return_thunk+0x5/0x10
Mar 06 18:43:41 archiso kernel:  drm_fbdev_generic_client_hotplug+0x6a/0xb0
Mar 06 18:43:41 archiso kernel:  drm_client_register+0x79/0xc0
Mar 06 18:43:41 archiso kernel:  nouveau_drm_probe+0x258/0x280 [nouveau bf42bf36e8326bf06d6fea67bc3ccb5e7f7bbc6b]
Mar 06 18:43:41 archiso kernel:  local_pci_probe+0x45/0xa0
Mar 06 18:43:41 archiso kernel:  pci_device_probe+0xc1/0x260
Mar 06 18:43:41 archiso kernel:  ? sysfs_do_create_link_sd+0x6e/0xe0
Mar 06 18:43:41 archiso kernel:  really_probe+0x19e/0x3e0
Mar 06 18:43:41 archiso kernel:  ? __pfx___driver_attach+0x10/0x10
Mar 06 18:43:41 archiso kernel:  __driver_probe_device+0x78/0x160
Mar 06 18:43:41 archiso kernel:  driver_probe_device+0x1f/0x90
Mar 06 18:43:41 archiso kernel:  __driver_attach+0xd2/0x1c0
Mar 06 18:43:41 archiso kernel:  bus_for_each_dev+0x88/0xd0
Mar 06 18:43:41 archiso kernel:  bus_add_driver+0x116/0x220
Mar 06 18:43:41 archiso kernel:  driver_register+0x59/0x100
Mar 06 18:43:41 archiso kernel:  ? __pfx_nouveau_drm_init+0x10/0x10 [nouveau bf42bf36e8326bf06d6fea67bc3ccb5e7f7bbc6b]
Mar 06 18:43:41 archiso kernel:  do_one_initcall+0x5d/0x320
Mar 06 18:43:41 archiso kernel:  do_init_module+0x60/0x240
Mar 06 18:43:41 archiso kernel:  init_module_from_file+0x89/0xe0
Mar 06 18:43:41 archiso kernel:  idempotent_init_module+0x120/0x2b0
Mar 06 18:43:41 archiso kernel:  __x64_sys_finit_module+0x5e/0xb0
Mar 06 18:43:41 archiso kernel:  do_syscall_64+0x60/0x90
Mar 06 18:43:41 archiso kernel:  ? srso_return_thunk+0x5/0x10
Mar 06 18:43:41 archiso kernel:  ? ksys_lseek+0x6c/0xb0
Mar 06 18:43:41 archiso kernel:  ? srso_return_thunk+0x5/0x10
Mar 06 18:43:41 archiso kernel:  ? syscall_exit_to_user_mode+0x2b/0x40
Mar 06 18:43:41 archiso kernel:  ? srso_return_thunk+0x5/0x10
Mar 06 18:43:41 archiso kernel:  ? do_syscall_64+0x6c/0x90
Mar 06 18:43:41 archiso kernel:  ? syscall_exit_to_user_mode+0x2b/0x40
Mar 06 18:43:41 archiso kernel:  ? srso_return_thunk+0x5/0x10
Mar 06 18:43:41 archiso kernel:  ? do_syscall_64+0x6c/0x90
Mar 06 18:43:41 archiso kernel:  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Mar 06 18:43:41 archiso kernel: RIP: 0033:0x7fa81b9d373d
Mar 06 18:43:41 archiso kernel: Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c3 95 0c 00 f7 d8 64 89 01 48
Mar 06 18:43:41 archiso kernel: RSP: 002b:00007fff93255678 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
Mar 06 18:43:41 archiso kernel: RAX: ffffffffffffffda RBX: 000056496892f8b0 RCX: 00007fa81b9d373d
Mar 06 18:43:41 archiso kernel: RDX: 0000000000000000 RSI: 00007fa81bae0376 RDI: 0000000000000012
Mar 06 18:43:41 archiso kernel: RBP: 00007fa81bae0376 R08: 0000000000000070 R09: fffffffffffffe90
Mar 06 18:43:41 archiso kernel: R10: 0000000000000050 R11: 0000000000000246 R12: 0000000000020000
Mar 06 18:43:41 archiso kernel: R13: 000056496892ae00 R14: 0000000000000000 R15: 0000564968930110
Mar 06 18:43:41 archiso kernel:  </TASK>
Mar 06 18:43:41 archiso kernel: Modules linked in: uas usb_storage amdgpu(+) nouveau(+) crc32_pclmul amdxcp mxm_wmi crc32c_intel drm_exec drm_buddy gpu_sched sha512_ssse3 drm_suballoc_helper i2c_algo_bit sha256_ssse3 aesni_intel r8169 drm_ttm_helper nvme crypto_simd realtek ttm mdio_devres cryptd nvme_core drm_display_helper video xhci_pci libphy nvme_common cec xhci_pci_renesas wmi
Mar 06 18:43:41 archiso kernel: CR2: 0000000000000020
Mar 06 18:43:41 archiso kernel: ---[ end trace 0000000000000000 ]---
Mar 06 18:43:41 archiso kernel: RIP: 0010:nvif_object_mthd+0xa6/0x200 [nouveau]
Mar 06 18:43:41 archiso kernel: Code: 00 e8 ae 8b 2b d8 49 8b 44 24 08 41 8d 56 20 4c 8d 7c 24 28 49 39 c4 0f 84 f9 00 00 00 4c 89 63 10 31 c9 48 89 de c6 43 06 ff <48> 8b 78 20 48 8b 40 38 48 8b 40 28 e8 d9 ff 2c d8 48 8b 3c 24 4c
Mar 06 18:43:41 archiso kernel: RSP: 0018:ffffc900071f35d0 EFLAGS: 00010246
Mar 06 18:43:41 archiso kernel: RAX: 0000000000000000 RBX: ffffc900071f35d8 RCX: 0000000000000000
Mar 06 18:43:41 archiso kernel: RDX: 0000000000000028 RSI: ffffc900071f35d8 RDI: ffffc900071f35f8
Mar 06 18:43:41 archiso kernel: RBP: ffff88811e86c800 R08: 0000000000000000 R09: 0000000000000000
Mar 06 18:43:41 archiso kernel: R10: ffff888101dc7a80 R11: 0002000000000000 R12: ffff8881213f5540
Mar 06 18:43:41 archiso kernel: R13: ffffc900071f35d8 R14: 0000000000000008 R15: ffffc900071f35f8
Mar 06 18:43:41 archiso kernel: FS:  00007fa81af1c480(0000) GS:ffff88880f980000(0000) knlGS:0000000000000000
Mar 06 18:43:41 archiso kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 06 18:43:41 archiso kernel: CR2: 0000000000000020 CR3: 000000011b194000 CR4: 0000000000350ee0
Mar 06 18:43:41 archiso kernel: note: (udev-worker)[212] exited with irqs disabled

Also, systemd frequently failed to start udev. Full journal: https://pastebin.com/B9LDRXmd

Last edited by Sveske-juice (2024-03-06 19:56:24)

#5 2024-03-06 19:59:23

Sveske-juice
Member
From: Denmark
Registered: 2022-06-12
Posts: 18

Re: GPU driver failing?

After some more testing with daily builds from https://github.com/theCode-Breaker/daily-arch-builds

I tracked the breaking ISO down to 2024-01-15: https://github.com/theCode-Breaker/dail … 2024.01.15

All ISOs after this date fail.

One strange thing I noticed: all the ISOs before the breaking date threw the error mentioned above, but the broken ISOs after the breaking date do not. So somehow the error stops the issue from occurring?

Last edited by Sveske-juice (2024-03-06 20:00:33)

#6 2024-03-06 20:56:48

seth
Member
Registered: 2012-09-03
Posts: 53,234

Re: GPU driver failing?

The errors are from nouveau, it's also (or rather at least the nvidia device is) what causes the udev issues.
While it doesn't seem to be involved in the output chain, you could just try to blacklist it and see what happens: "module_blacklist=nouveau"
On an installed system I'd try the nvidia blob.

Edit: the breaking iso should be the first one w/ linux 6.7, no?

Last edited by seth (2024-03-06 20:59:52)

#7 2024-03-06 22:59:19

Sveske-juice
Member
From: Denmark
Registered: 2022-06-12
Posts: 18

Re: GPU driver failing?

seth wrote:

the breaking iso should be the first one w/ linux 6.7, no?

It indeed is.
It goes from linux 6.6.10.arch1-1 (in the latest ISO that works) to linux 6.7.arch3-1 (in the first one that doesn't).
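
To make the series jump explicit, here is a small Python sketch that pulls the major.minor series out of the two package versions quoted above. The helper name kernel_series is hypothetical, and the parsing only assumes that the version string starts with "major.minor".

```python
import re


def kernel_series(pkgver: str) -> tuple[int, int]:
    """Extract (major, minor) from a kernel package version string."""
    match = re.match(r"(\d+)\.(\d+)", pkgver)
    if match is None:
        raise ValueError(f"unrecognised kernel version: {pkgver!r}")
    return int(match.group(1)), int(match.group(2))


last_good = "6.6.10.arch1-1"   # kernel in the latest working ISO
first_broken = "6.7.arch3-1"   # kernel in the first broken ISO

series_changed = kernel_series(last_good) != kernel_series(first_broken)
print(kernel_series(last_good), kernel_series(first_broken), series_changed)
```

A change of series (rather than a point release within 6.6) suggests the regression came in with the 6.7 merge window rather than a stable backport.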

With nouveau blacklisted, the issue still happens. It also happened using the nvidia blob driver.

When running with kernel param:

module_blacklist=nouveau

I get the following journal: https://pastebin.com/nyf4Mkfv

#8 2024-03-06 23:49:32

seth
Member
Registered: 2012-09-03
Posts: 53,234

Re: GPU driver failing?

nouveau is blocked, so the nvidia chip is most likely not involved.
The only difference seems to be

Mar 06 18:43:41 archiso kernel: amdgpu 0000:05:00.0: amdgpu: AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF

which looks whacko… and is in the 6.6 kernel log.
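
The size in that line can be checked by hand. Here is a minimal Python sketch that recomputes it from the two addresses printed in the log, assuming the range is inclusive and that "M" means MiB (both are assumptions, though the result matches the logged figure).

```python
# Address range copied from the AGP line in the log.
start = 0x000000F800000000
end = 0x0000FFFFFFFFFFFF

# Inclusive range, so add one byte for the end address itself.
size_bytes = end - start + 1

size_mib = size_bytes // 2**20  # mebibytes, for comparison with the log
size_tib = size_bytes / 2**40   # tebibytes, for a sense of scale

print(size_mib, size_tib)
```

That reproduces the 267419648M from the log, which is roughly 255 TiB of aperture on a laptop iGPU, hence "whacko".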

Does it help to add "amdgpu.sg_display=0"?

#9 2024-03-07 09:43:50

Sveske-juice
Member
From: Denmark
Registered: 2022-06-12
Posts: 18

Re: GPU driver failing?

Using

amdgpu.sg_display=0

Unfortunately, the issue still appears. This is the new journal: https://pastebin.com/raPUnQ6C

#10 2024-03-07 13:07:09

ltrump
Member
Registered: 2024-03-07
Posts: 1

Re: GPU driver failing?

I've experienced the same issue with my Zephyrus G15 (GA502); rolling back the kernel to a version lower than 6.7 temporarily solved the problem. There are a lot of similar complaints (on Arch Linux and on other distributions):

https://gitlab.freedesktop.org/drm/amd/-/issues/3207
https://bugzilla.kernel.org/show_bug.cgi?id=218481
https://discussion.fedoraproject.org/t/ … lay/103388

#11 2024-03-09 16:26:42

Sveske-juice
Member
From: Denmark
Registered: 2022-06-12
Posts: 18

Re: GPU driver failing?

As mentioned in the freedesktop discussion, the bug was introduced in https://git.kernel.org/pub/scm/linux/ke … 47bf68201f
