The title is self-explanatory. The driver randomly crashes, which usually results in a brief freeze before things go back to normal.
I have zero clue what it could be. I've never had issues with amdgpu drivers before, but this laptop is magical.
[ 2362.399606] ------------[ cut here ]------------
[ 2362.399631] WARNING: CPU: 2 PID: 12 at drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dmub_psr.c:126 dmub_psr_get_state+0xb1/0xd0 [amdgpu]
[ 2362.402395] Modules linked in: snd_seq_dummy snd_seq snd_seq_device rfcomm xt_conntrack xt_MASQUERADE bridge stp llc nf_conntrack_netlink xfrm_user xfrm_algo ip6table_nat ip6table_filter ip6_tables iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c xt_addrtype iptable_filter overlay ccm algif_aead crypto_null des3_ede_x86_64 des_generic libdes algif_skcipher cmac md4 bnep algif_hash af_alg vmnet(OE) vfat fat mousedev snd_soc_dmic snd_soc_acp6x_mach snd_acp6x_pdm_dma snd_sof_amd_acp70 snd_sof_amd_acp63 snd_soc_acpi_amd_match snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp snd_sof intel_rapl_msr amd_atl intel_rapl_common snd_sof_utils snd_pci_ps snd_amd_sdw_acpi soundwire_amd snd_hda_codec_realtek soundwire_generic_allocation soundwire_bus snd_hda_codec_generic snd_hda_scodec_component iwlmvm snd_hda_codec_hdmi snd_soc_core snd_hda_intel snd_compress snd_intel_dspcfg ac97_bus snd_intel_sdw_acpi mac80211 snd_pcm_dmaengine snd_hda_codec
[ 2362.403115] uvcvideo snd_rpl_pci_acp6x snd_acp_pci videobuf2_vmalloc snd_hda_core kvm_amd libarc4 snd_acp_legacy_common uvc ptp snd_hwdep snd_pci_acp6x videobuf2_memops btusb videobuf2_v4l2 pps_core snd_pcm btrtl snd_pci_acp5x videobuf2_common r8169 spd5118 joydev snd_rn_pci_acp3x kvm btintel snd_timer iwlwifi videodev hid_multitouch btbcm ucsi_acpi realtek snd_acp_config btmtk i2c_piix4 snd typec_ucsi snd_soc_acpi mdio_devres bluetooth razermouse(OE) mc rapl pcspkr wmi_bmof typec cfg80211 nvidia_wmi_ec_backlight i2c_smbus k10temp soundcore libphy snd_pci_acp3x roles i2c_hid_acpi i2c_hid amd_pmc mac_hid vmmon(OE) vmw_vmci vboxnetflt(OE) vboxnetadp(OE) vboxdrv(OE) pkcs8_key_parser uinput i2c_dev crypto_user acpi_call(OE) loop nfnetlink ip_tables x_tables hid_asus radeon ext4 crc32c_generic mbcache jbd2 dm_crypt nvidia_uvm(POE) nvidia_drm(POE) nvidia_modeset(POE) cbc encrypted_keys trusted hid_generic asn1_encoder tee usbhid nvidia(POE) dm_mod amdgpu crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni
[ 2362.403902] polyval_generic amdxcp ghash_clmulni_intel drm_exec sha512_ssse3 asus_nb_wmi sha256_ssse3 gpu_sched asus_wmi sha1_ssse3 drm_buddy aesni_intel serio_raw sparse_keymap i2c_algo_bit gf128mul atkbd nvme platform_profile drm_suballoc_helper crypto_simd libps2 vivaldi_fmap rfkill drm_ttm_helper drm_display_helper cryptd ttm nvme_core cec i8042 ccp crc16 video nvme_auth serio wmi
[ 2362.404220] CPU: 2 UID: 0 PID: 12 Comm: kworker/u48:1 Tainted: P OE 6.12.10-arch1-1 #1 ac0cff2c6581af0a10f6e278cbc98026cc1e3dec
[ 2362.404262] Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[ 2362.404273] Hardware name: ASUSTeK COMPUTER INC. ASUS TUF Gaming A17 FA706NF_FA706NF/FA706NF, BIOS FA706NF.305 02/22/2024
[ 2362.404289] Workqueue: dm_vblank_control_workqueue amdgpu_dm_crtc_vblank_control_worker [amdgpu]
[ 2362.406834] RIP: 0010:dmub_psr_get_state+0xb1/0xd0 [amdgpu]
[ 2362.409714] Code: 28 00 00 00 75 31 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e e9 fd 6c 0f cf 83 c3 01 41 c7 04 24 ff 00 00 00 81 fb e9 03 00 00 75 8a <0f> 0b eb c8 3d ff 00 00 00 75 c1 eb f3 e8 8d d7 ea ce 66 66 2e 0f
[ 2362.409734] RSP: 0018:ffffb131400f3ce8 EFLAGS: 00010246
[ 2362.409765] RAX: 0000000000000000 RBX: 00000000000003e9 RCX: 0000000000000002
[ 2362.409784] RDX: 0000000000000000 RSI: 00000000000036b8 RDI: ffff9205ea800000
[ 2362.409799] RBP: 0000000000000000 R08: 0000000000000002 R09: ffff9205c0c66c80
[ 2362.409813] R10: 0000000000000007 R11: 000000000000001c R12: ffffb131400f3d2c
[ 2362.409826] R13: ffffb131400f3cec R14: ffff9205ccaa3f30 R15: ffffb131400f3e0c
[ 2362.409842] FS: 0000000000000000(0000) GS:ffff9208ea500000(0000) knlGS:0000000000000000
[ 2362.409860] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2362.409876] CR2: 00007669c9c46000 CR3: 0000000277222000 CR4: 0000000000f50ef0
[ 2362.409894] PKRU: 55555554
[ 2362.409909] Call Trace:
[ 2362.409940] <TASK>
[ 2362.409954] ? dmub_psr_get_state+0xb1/0xd0 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
[ 2362.412756] ? __warn.cold+0x93/0xf6
[ 2362.412791] ? dmub_psr_get_state+0xb1/0xd0 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
[ 2362.415414] ? report_bug+0xff/0x140
[ 2362.415454] ? handle_bug+0x58/0x90
[ 2362.415480] ? exc_invalid_op+0x17/0x70
[ 2362.415507] ? asm_exc_invalid_op+0x1a/0x20
[ 2362.415550] ? dmub_psr_get_state+0xb1/0xd0 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
[ 2362.418441] ? dmub_psr_get_state+0x53/0xd0 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
[ 2362.421422] dmub_psr_enable+0xc7/0x110 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
[ 2362.424120] edp_set_psr_allow_active+0x280/0x3b0 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
[ 2362.426845] amdgpu_dm_psr_disable+0x5b/0x80 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
[ 2362.429409] amdgpu_dm_crtc_vblank_control_worker+0x257/0x260 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
[ 2362.431946] process_one_work+0x17e/0x330
[ 2362.431986] worker_thread+0x2ce/0x3f0
[ 2362.432020] ? __pfx_worker_thread+0x10/0x10
[ 2362.432044] kthread+0xd2/0x100
[ 2362.432074] ? __pfx_kthread+0x10/0x10
[ 2362.432106] ret_from_fork+0x34/0x50
[ 2362.432130] ? __pfx_kthread+0x10/0x10
[ 2362.432159] ret_from_fork_asm+0x1a/0x30
[ 2362.432213] </TASK>
[ 2362.432225] ---[ end trace 0000000000000000 ]---
PSR issues are somewhat common on the 6.12 kernels, but this is the first time I'm seeing an actual stack trace related to them. Anyhow, adding "amdgpu.dcdebugmask=0x10" to your kernel parameters should disable PSR entirely, at the cost of slightly higher power consumption. You might want to follow: https://bbs.archlinux.org/viewtopic.php?id=301280
Should that not help then we'll likely need more context like a full journal and what it is you're doing, whether a specific workload can trigger it or what have you.
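To confirm after a reboot that the parameter was actually picked up, a minimal Python sketch along these lines could check the kernel command line. It assumes that bit 0x10 of amdgpu.dcdebugmask is the one that disables PSR (as described above), that /proc/cmdline is readable, and that the last occurrence of a repeated parameter is the one that counts; the function name is hypothetical.

```python
from pathlib import Path

PSR_DISABLE_BIT = 0x10  # assumption: the bit suggested above for disabling PSR
PARAM = "amdgpu.dcdebugmask="


def psr_disabled_by_cmdline(cmdline: str) -> bool:
    """Return True if amdgpu.dcdebugmask on the given command line has bit 0x10 set."""
    mask = 0
    for token in cmdline.split():
        if token.startswith(PARAM):
            try:
                # base 0 accepts both "0x10" and "16"
                mask = int(token[len(PARAM):], 0)
            except ValueError:
                mask = 0  # unparsable value: treat as not set
    return bool(mask & PSR_DISABLE_BIT)


if __name__ == "__main__":
    path = Path("/proc/cmdline")
    if path.exists():
        print("PSR disabled via cmdline:", psr_disabled_by_cmdline(path.read_text()))
    else:
        print("/proc/cmdline not found; is this Linux?")
```

This only checks what was passed at boot. It does not query the driver itself, so a "True" here means the parameter was set, not that the warning is gone.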
Last edited by V1del (2025-01-30 17:56:09)