Random Crashing?

vuittonarch · 2024-08-18 11:10:34

Hi all, my system is randomly crashing, and it seems to happen when I go to click something.

I've checked journalctl, and it shows this before the crash:
Aug 18 12:02:32 main kernel: amdgpu 0000:12:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Aug 18 12:02:32 main kernel: amdgpu 0000:12:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Aug 18 12:02:34 main kernel: amdgpu 0000:12:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Aug 18 12:02:34 main kernel: amdgpu 0000:12:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data

dmesg|grep amdgpu:

[   10.207536] [drm] amdgpu kernel modesetting enabled.
[   10.218116] amdgpu: Virtual CRAT table created for CPU
[   10.218129] amdgpu: Topology: Add CPU node
[   10.218232] amdgpu 0000:12:00.0: enabling device (0006 -> 0007)
[   10.219983] amdgpu 0000:12:00.0: amdgpu: Fetched VBIOS from VFCT
[   10.219985] amdgpu: ATOM BIOS: 102-RAPHAEL-006
[   10.303916] amdgpu 0000:12:00.0: vgaarb: deactivate vga console
[   10.303919] amdgpu 0000:12:00.0: amdgpu: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
[   10.303950] amdgpu 0000:12:00.0: amdgpu: VRAM: 512M 0x000000F400000000 - 0x000000F41FFFFFFF (512M used)
[   10.303952] amdgpu 0000:12:00.0: amdgpu: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
[   10.304036] [drm] amdgpu: 512M of VRAM memory ready
[   10.304037] [drm] amdgpu: 31713M of GTT memory ready.
[   10.304753] amdgpu 0000:12:00.0: amdgpu: Will use PSP to load VCN firmware
[   10.326745] amdgpu 0000:12:00.0: amdgpu: reserve 0xa00000 from 0xf41e000000 for PSP TMR
[   10.374430] amdgpu 0000:12:00.0: amdgpu: RAS: optional ras ta ucode is not available
[   10.380234] amdgpu 0000:12:00.0: amdgpu: RAP: optional rap ta ucode is not available
[   10.380236] amdgpu 0000:12:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[   10.381049] amdgpu 0000:12:00.0: amdgpu: SMU is initialized successfully!
[   10.382715] snd_hda_intel 0000:12:00.1: bound 0000:12:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
[   10.460766] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
[   10.460777] kfd kfd: amdgpu: Total number of KFD nodes to be created: 1
[   10.460938] amdgpu: Virtual CRAT table created for GPU
[   10.461027] amdgpu: Topology: Add dGPU node [0x164e:0x1002]
[   10.461029] kfd kfd: amdgpu: added device 1002:164e
[   10.461038] amdgpu 0000:12:00.0: amdgpu: SE 1, SH per SE 1, CU per SH 2, active_cu_number 2
[   10.461041] amdgpu 0000:12:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[   10.461043] amdgpu 0000:12:00.0: amdgpu: ring gfx_0.1.0 uses VM inv eng 1 on hub 0
[   10.461044] amdgpu 0000:12:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 4 on hub 0
[   10.461045] amdgpu 0000:12:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 5 on hub 0
[   10.461046] amdgpu 0000:12:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[   10.461047] amdgpu 0000:12:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[   10.461048] amdgpu 0000:12:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[   10.461049] amdgpu 0000:12:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[   10.461050] amdgpu 0000:12:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[   10.461051] amdgpu 0000:12:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[   10.461052] amdgpu 0000:12:00.0: amdgpu: ring kiq_0.2.1.0 uses VM inv eng 12 on hub 0
[   10.461053] amdgpu 0000:12:00.0: amdgpu: ring sdma0 uses VM inv eng 13 on hub 0
[   10.461054] amdgpu 0000:12:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 8
[   10.461055] amdgpu 0000:12:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 8
[   10.461056] amdgpu 0000:12:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 8
[   10.461057] amdgpu 0000:12:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 8
[   10.461599] amdgpu 0000:12:00.0: amdgpu: Runtime PM not available
[   10.461866] [drm] Initialized amdgpu 3.57.0 20150101 for 0000:12:00.0 on minor 1
[   10.465895] fbcon: amdgpudrmfb (fb0) is primary device
[   10.829513] amdgpu 0000:12:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
[   11.015915] amdgpu 0000:12:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
[   11.131507] amdgpu 0000:12:00.0: [drm] fb0: amdgpudrmfb frame buffer device
[   12.567764] amdgpu 0000:12:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
[   13.071970] amdgpu 0000:12:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
[   13.259184] amdgpu 0000:12:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data

I'm unsure what else I can collect to diagnose this. If there is anything, please let me know.

Thanks for any help anyone can provide.

Last edited by vuittonarch (2024-08-18 11:12:50)

lifiumBTW · 2024-08-18 14:21:54

Did you update the system? A kernel update might help. Other than that, you might want to look into updating the amdgpu driver, or even try other kernels such as LTS, Zen or XanMod.

Last edited by lifiumBTW (2024-08-18 14:22:22)

fish4terrisa-MSDSM · 2024-08-18 14:54:22

Please give me detailed information about the computer that has this problem.
Also, can you post the full dmesg output (after removing any sensitive information)? The rest of it may be related.
And if you are using a laptop, check the laptop page on the ArchWiki and try to find your model there. Maybe someone has already run into this problem and noted the solution.
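If it helps with the "removing sensitive information" part, here is a minimal Python sketch that masks a few common identifiers in dmesg text before posting. The function name `redact` and the three patterns (MAC addresses, UUIDs, IPv4 addresses) are my own assumptions about what counts as sensitive; they are not an exhaustive list, so please still read through the result yourself.

```python
import re

# Illustrative patterns only (an assumption, not a complete list of
# everything that might be sensitive in a kernel log).
MAC_RE = re.compile(r"\b(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\b")
UUID_RE = re.compile(
    r"\b[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\b"
)
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact(text: str) -> str:
    """Return text with MAC addresses, UUIDs and IPv4 addresses masked."""
    text = MAC_RE.sub("<mac>", text)
    text = UUID_RE.sub("<uuid>", text)
    text = IPV4_RE.sub("<ipv4>", text)
    return text
```

You could save the dmesg output to a file, pass its contents through `redact`, and post the result. PCI addresses such as 0000:12:00.0 and the kernel timestamps do not match these patterns, so the amdgpu lines stay readable.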

vuittonarch · 2024-08-19 08:39:12

Hi, it's not a laptop. The system is up to date. mesa seems to have a new version in the extra-testing repository; is this a red flag?
https://archlinux.org/packages/extra-te … 6_64/mesa/

Kernel: 6.10.5-arch1-1
CPU: AMD Ryzen 9 7950X (32) @ 6.142GHz
GPU (Integrated): AMD ATI 12:00.0 Raphael
Memory: 64GB

I'm unsure if dmesg has anything else useful, but this line is scattered throughout:
amdgpu 0000:12:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
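
To get a rough idea of when these messages cluster (for example, shortly before a crash), a small Python sketch like the one below could count them per minute. It assumes the text is in the short journalctl format from my first post ("Aug 18 12:02:32 main kernel: ..."); the function name `dmcub_errors_per_minute` is made up for illustration, and lines without a leading timestamp are skipped.

```python
import re
from collections import Counter

MARKER = "DMCUB error"
# Matches the HH:MM:SS field of a short-format journal line.
CLOCK_RE = re.compile(r"^\d{2}:\d{2}:\d{2}$")


def dmcub_errors_per_minute(journal_text: str) -> Counter:
    """Count DMCUB error lines, keyed by 'Mon DD HH:MM'."""
    counts = Counter()
    for line in journal_text.splitlines():
        if MARKER not in line:
            continue
        parts = line.split()
        # Expect "Aug 18 12:02:32 host kernel: ..."; skip anything else.
        if len(parts) < 3 or not CLOCK_RE.match(parts[2]):
            continue
        month, day, clock = parts[:3]
        counts[f"{month} {day} {clock[:5]}"] += 1
    return counts
```

Feeding it the saved kernel messages from the boot that crashed (for instance the output of `journalctl -k -b -1`) would show whether the errors pile up right before the freeze or are spread evenly across the session.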

Last edited by vuittonarch (2024-08-19 08:40:49)

seth · 2024-08-19 09:06:22

Does the problem exist with
1. The 6.10.2 kernel ?
2. The LTS kernel ?

vuittonarch · 2024-08-19 12:21:28

Well, it's not easy to see if it still exists; the issue is intermittent and happens rarely.

This means it could take weeks to test an array of theories.

I've upgraded the GPU packages and I'm currently testing to see if a crash happens.

If it's still present, I'll try different kernels.

Last edited by vuittonarch (2024-08-19 12:22:31)

archer_cc · 2024-10-08 08:59:43

vuittonarch wrote:

Well, it's not easy to see if it still exists; the issue is intermittent and happens rarely.
This means it could take weeks to test an array of theories.
I've upgraded the GPU packages and I'm currently testing to see if a crash happens.
If it's still present, I'll try different kernels.

Hi bro, has there been any result regarding the amdgpu crashes? I am currently using version 6.10.2 and still experience random crashes.

zesko · 2024-11-14 14:30:35

Maybe you are not alone: https://gitlab.freedesktop.org/drm/amd/-/issues/3647

obelisk · 2024-11-21 22:51:51

Hi, I don't know if this helps:
On my PC I experimented with the DRAM voltage by undervolting it. I decreased the XMP voltage step by step until the system hung very often, for example while I was browsing with Firefox. Then I increased the voltage by two steps. Now my DRAM is still slightly undervolted, but it has been stable for many months.
>Did you change anything in your RAM or CPU settings in the BIOS/UEFI?
>Is the freezing something new, or have you had it from the beginning with this PC?
