This seems to happen only when the transition is "automatic". I have found only one such case, since I have nothing else of the kind to plug into my USB-C ports: plugging in the power adapter. If the GPU is in D3cold when I do this, the following appears in dmesg:
kern :info : [ 408.308933] pcieport 0000:00:08.3: PME: Spurious native interrupt!
kern :info : [ 408.702579] NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
kern :err : [ 409.999144] ucsi_acpi USBC000:00: unknown error 256
kern :err : [ 409.999149] ucsi_acpi USBC000:00: GET_CABLE_PROPERTY failed (-5)
kern :info : [ 411.528035] pcieport 0000:02:00.0: broken device, retraining non-functional downstream link at 2.5GT/s
kern :info : [ 412.529004] pcieport 0000:02:00.0: retraining failed
kern :info : [ 413.753031] pcieport 0000:02:00.0: broken device, retraining non-functional downstream link at 2.5GT/s
kern :info : [ 414.753353] pcieport 0000:02:00.0: retraining failed
kern :info : [ 414.753367] amdgpu 0000:03:00.0: not ready 1023ms after resume; waiting
kern :info : [ 415.802163] amdgpu 0000:03:00.0: not ready 2047ms after resume; waiting
kern :info : [ 417.913210] amdgpu 0000:03:00.0: not ready 4095ms after resume; waiting
kern :info : [ 422.457311] amdgpu 0000:03:00.0: not ready 8191ms after resume; waiting
kern :info : [ 431.161938] amdgpu 0000:03:00.0: not ready 16383ms after resume; waiting
kern :info : [ 448.057739] amdgpu 0000:03:00.0: not ready 32767ms after resume; waiting
kern :warn : [ 484.410401] amdgpu 0000:03:00.0: not ready 65535ms after resume; giving up
kern :err : [ 484.410448] amdgpu 0000:03:00.0: Unable to change power state from D3cold to D0, device inaccessible
kern :err : [ 488.871088] [drm:gfxhub_v1_1_get_xgmi_info [amdgpu]] *ERROR* Timeout waiting for sem acquire in VM flush!
kern :err : [ 489.038960] amdgpu 0000:03:00.0: amdgpu: Timeout waiting for VM flush ACK!
kern :err : [ 489.206967] [drm:gfxhub_v1_1_get_xgmi_info [amdgpu]] *ERROR* Timeout waiting for sem acquire in VM flush!
kern :err : [ 489.372283] amdgpu 0000:03:00.0: amdgpu: Timeout waiting for VM flush ACK!
kern :info : [ 489.372294] [drm] PCIE GART of 512M enabled (table at 0x00000081FEB00000).
kern :info : [ 489.372324] amdgpu 0000:03:00.0: amdgpu: PSP is resuming...
kern :info : [ 489.412587] amdgpu 0000:03:00.0: amdgpu: reserve 0x1300000 from 0x81fc000000 for PSP TMR
kern :info : [ 489.412632] amdgpu 0000:03:00.0: amdgpu: RAS: optional ras ta ucode is not available
kern :info : [ 489.412673] amdgpu 0000:03:00.0: amdgpu: RAP: optional rap ta ucode is not available
kern :info : [ 489.412676] amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
kern :info : [ 489.412680] amdgpu 0000:03:00.0: amdgpu: SMU is resuming...
kern :info : [ 489.412691] amdgpu 0000:03:00.0: amdgpu: smu driver if version = 0x00000035, smu fw if version = 0x00000040, smu fw program = 0, smu fw version = 0x00525b00 (82.91.0)
kern :info : [ 489.412696] amdgpu 0000:03:00.0: amdgpu: SMU driver if version not matched
kern :info : [ 489.412701] amdgpu 0000:03:00.0: amdgpu: dpm has been disabled
kern :info : [ 489.412708] amdgpu 0000:03:00.0: amdgpu: SMU is resumed successfully!
kern :info : [ 489.412771] [drm] DMUB unsupported on ASIC
kern :err : [ 493.934399] amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
kern :err : [ 494.106518] amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
kern :err : [ 494.278622] amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
kern :err : [ 494.450730] amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
kern :err : [ 494.622738] amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
kern :warn : [ 494.622912] ------------[ cut here ]------------
kern :warn : [ 494.622913] WARNING: CPU: 6 PID: 16809 at drivers/gpu/drm/amd/amdgpu/../display/dc/dcn20/dcn20_hubbub.c:566 hubbub2_get_dchub_ref_freq+0x9e/0xc0 [amdgpu]
kern :warn : [ 494.623032] Modules linked in: rndis_host vfat fat bnep btusb btrtl btintel btbcm btmtk bluetooth crc16 cdc_mbim cdc_wdm cdc_ncm cdc_ether usbnet r8152 mii libphy usbhid snd_seq_dummy snd_hrtimer snd_seq snd_seq_device typec_displayport ccm algif_aead crypto_null des3_ede_x86_64 cbc des_generic libdes algif_skcipher cmac md4 algif_hash af_alg amd_atl intel_rapl_msr intel_rapl_common snd_sof_amd_acp63 snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp snd_sof mt7921e snd_sof_utils mt7921_common snd_pci_ps mt792x_lib snd_amd_sdw_acpi soundwire_amd snd_hda_codec_realtek mt76_connac_lib soundwire_generic_allocation kvm_amd soundwire_bus mt76 joydev snd_hda_codec_generic cros_usbpd_charger snd_hda_scodec_component cros_ec_sysfs cros_usbpd_logger cros_ec_chardev mousedev cros_usbpd_notify cros_ec_debugfs gpio_cros_ec snd_hda_codec_hdmi snd_soc_core kvm snd_hda_intel snd_compress mac80211 snd_intel_dspcfg ac97_bus hid_sensor_als snd_intel_sdw_acpi snd_pcm_dmaengine
kern :warn : [ 494.623098] crct10dif_pclmul hid_sensor_trigger snd_hda_codec crc32_pclmul cros_ec_dev industrialio_triggered_buffer snd_rpl_pci_acp6x libarc4 framework_laptop(OE) polyval_clmulni kfifo_buf hid_sensor_iio_common snd_acp_pci polyval_generic snd_hda_core snd_acp_legacy_common gf128mul ghash_clmulni_intel snd_hwdep industrialio snd_pci_acp6x sha512_ssse3 snd_pcm cfg80211 sha1_ssse3 snd_pci_acp5x aesni_intel snd_rn_pci_acp3x snd_timer ucsi_acpi snd_acp_config crypto_simd cros_ec_lpcs snd_soc_acpi snd hid_sensor_hub amd_pmf typec_ucsi sp5100_tco cryptd xhci_pci hid_multitouch hid_generic cros_ec rapl wmi_bmof thunderbolt pcspkr amdtee typec snd_pci_acp3x soundcore xhci_pci_renesas rfkill i2c_piix4 ccp ryzen_smu(OE) roles i2c_hid_acpi amd_sfh platform_profile i2c_hid amd_pmc tee serio mac_hid pkcs8_key_parser i2c_dev sg crypto_user acpi_call(OE) loop nfnetlink ip_tables x_tables btrfs(E) libcrc32c(E) xor(E) raid6_pq(E) crc32c_generic(E) sha256_ssse3(E) crc32c_intel(E) nvme(E) nvme_core(E) nvme_auth(E) amdgpu(E)
kern :warn : [ 494.623166] drm_display_helper(E) video(E) drm_ttm_helper(E) drm_suballoc_helper(E) dm_mod(E) i2c_algo_bit(E) gpu_sched(E) drm_buddy(E) wmi(E) cec(E) ttm(E) drm_exec(E) amdxcp(E)
kern :warn : [ 494.623179] CPU: 6 PID: 16809 Comm: laptop_mode Tainted: G W OE 6.10.1-zen1-1-zen #1 441785709602f5529ff976c16f8d1d1b70253c34
kern :warn : [ 494.623183] Hardware name: Framework Laptop 16 (AMD Ryzen 7040 Series)/FRANMZCP07, BIOS 03.04 07/09/2024
kern :warn : [ 494.623185] RIP: 0010:hubbub2_get_dchub_ref_freq+0x9e/0xc0 [amdgpu]
kern :warn : [ 494.623237] Code: 83 c0 63 ff ff 3d 20 4e 00 00 77 22 89 5d 00 48 8b 44 24 08 65 48 2b 04 25 28 00 00 00 75 24 48 83 c4 10 5b 5d e9 cd 58 ac e9 <0f> 0b eb de 0f 0b eb da d1 eb 8d 83 c0 63 ff ff 3d 20 4e 00 00 76
kern :warn : [ 494.623239] RSP: 0018:ffffb84e4c733908 EFLAGS: 00010246
kern :warn : [ 494.623242] RAX: 0000000000001000 RBX: 00000000000186a0 RCX: 0000000000000000
kern :warn : [ 494.623244] RDX: ffffb84e4c73390c RSI: 00000000000039e5 RDI: ffff8e8b50a80000
kern :warn : [ 494.623245] RBP: ffff8e8b5352d3b0 R08: ffffb84e4c733908 R09: 000000000000000c
kern :warn : [ 494.623246] R10: ffff8e8b41eb4b00 R11: 0000000000000002 R12: ffff8e8b5352d000
kern :warn : [ 494.623248] R13: ffff8e8b48c7a600 R14: ffff8e8b5352d480 R15: 0000000000000000
kern :warn : [ 494.623249] FS: 00007ddff9461b80(0000) GS:ffff8e91c1d00000(0000) knlGS:0000000000000000
kern :warn : [ 494.623251] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kern :warn : [ 494.623252] CR2: 00007ebc48e6ebb0 CR3: 00000002610f0000 CR4: 0000000000f50ef0
kern :warn : [ 494.623254] PKRU: 55555554
kern :warn : [ 494.623256] Call Trace:
kern :warn : [ 494.623258] <TASK>
kern :warn : [ 494.623260] ? hubbub2_get_dchub_ref_freq+0x9e/0xc0 [amdgpu 0000000000000000000000000000000000000000]
kern :warn : [ 494.623302] ? __warn.cold+0x8e/0xf3
kern :warn : [ 494.623308] ? hubbub2_get_dchub_ref_freq+0x9e/0xc0 [amdgpu 0000000000000000000000000000000000000000]
kern :warn : [ 494.623349] ? report_bug+0xe7/0x210
kern :warn : [ 494.623354] ? handle_bug+0x3c/0x80
kern :warn : [ 494.623357] ? exc_invalid_op+0x19/0xc0
kern :warn : [ 494.623359] ? asm_exc_invalid_op+0x1a/0x20
kern :warn : [ 494.623365] ? hubbub2_get_dchub_ref_freq+0x9e/0xc0 [amdgpu 0000000000000000000000000000000000000000]
kern :warn : [ 494.623415] ? dcn32_init_hw+0x162/0x930 [amdgpu 0000000000000000000000000000000000000000]
kern :warn : [ 494.623506] ? dc_set_power_state+0x67/0x5aa0 [amdgpu 0000000000000000000000000000000000000000]
kern :warn : [ 494.623569] ? amdgpu_dm_update_connector_after_detect+0x58e/0x33e0 [amdgpu 0000000000000000000000000000000000000000]
kern :warn : [ 494.623662] ? srso_alias_return_thunk+0x5/0xfbef5
kern :warn : [ 494.623665] ? _dev_info+0x7d/0x98
kern :warn : [ 494.623671] ? amdgpu_file_to_fpriv+0x63f/0x1110 [amdgpu 0000000000000000000000000000000000000000]
kern :warn : [ 494.623705] ? amdgpu_device_resume+0x7c/0x300 [amdgpu 0000000000000000000000000000000000000000]
kern :warn : [ 494.623745] ? amdgpu_drm_ioctl+0x18b/0xe40 [amdgpu 0000000000000000000000000000000000000000]
kern :warn : [ 494.623778] ? __pfx_pci_pm_runtime_resume+0x10/0x10
kern :warn : [ 494.623782] ? __pfx_pci_pm_runtime_resume+0x10/0x10
kern :warn : [ 494.623784] ? __rpm_callback+0x41/0x170
kern :warn : [ 494.623788] ? __pfx_pci_pm_runtime_resume+0x10/0x10
kern :warn : [ 494.623791] ? rpm_resume+0x5bb/0x850
kern :warn : [ 494.623794] ? __pm_runtime_resume+0x4b/0x80
kern :warn : [ 494.623797] ? amdgpu_dpm_get_dpm_clock_table+0x3434/0x8fc0 [amdgpu 0000000000000000000000000000000000000000]
kern :warn : [ 494.623855] ? kernfs_fop_write_iter+0x13e/0x1f0
kern :warn : [ 494.623860] ? vfs_write+0x31d/0x4a0
kern :warn : [ 494.623865] ? __x64_sys_write+0x72/0xf0
kern :warn : [ 494.623868] ? do_syscall_64+0x82/0x190
kern :warn : [ 494.623870] ? do_fcntl+0x3c4/0x7d0
kern :warn : [ 494.623874] ? srso_alias_return_thunk+0x5/0xfbef5
kern :warn : [ 494.623878] ? srso_alias_return_thunk+0x5/0xfbef5
kern :warn : [ 494.623880] ? __count_memcg_events+0x57/0xf0
kern :warn : [ 494.623883] ? srso_alias_return_thunk+0x5/0xfbef5
kern :warn : [ 494.623885] ? handle_mm_fault+0x77c/0x1580
kern :warn : [ 494.623892] ? srso_alias_return_thunk+0x5/0xfbef5
kern :warn : [ 494.623894] ? do_user_addr_fault+0x5d5/0x860
kern :warn : [ 494.623898] ? srso_alias_return_thunk+0x5/0xfbef5
kern :warn : [ 494.623900] ? srso_alias_return_thunk+0x5/0xfbef5
kern :warn : [ 494.623902] ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
kern :warn : [ 494.623907] </TASK>
kern :warn : [ 494.623908] ---[ end trace 0000000000000000 ]---
kern :warn : [ 494.625639] amdgpu 0000:03:00.0: [drm] REG_WAIT timeout 1us * 1000 tries - dcn32_hubp_pg_control line:176
kern :warn : [ 494.627351] amdgpu 0000:03:00.0: [drm] REG_WAIT timeout 1us * 1000 tries - dcn32_hubp_pg_control line:180
kern :warn : [ 494.630081] amdgpu 0000:03:00.0: [drm] REG_WAIT timeout 1us * 1000 tries - dcn32_hubp_pg_control line:184
kern :warn : [ 494.631799] amdgpu 0000:03:00.0: [drm] REG_WAIT timeout 1us * 1000 tries - dcn32_hubp_pg_control line:188
kern :warn : [ 494.633506] amdgpu 0000:03:00.0: [drm] REG_WAIT timeout 1us * 1000 tries - dcn32_dsc_pg_control line:94
kern :warn : [ 494.635214] amdgpu 0000:03:00.0: [drm] REG_WAIT timeout 1us * 1000 tries - dcn32_dsc_pg_control line:102
kern :warn : [ 494.636917] amdgpu 0000:03:00.0: [drm] REG_WAIT timeout 1us * 1000 tries - dcn32_dsc_pg_control line:110
kern :warn : [ 494.638624] amdgpu 0000:03:00.0: [drm] REG_WAIT timeout 1us * 1000 tries - dcn32_dsc_pg_control line:118
kern :err : [ 503.512449] amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
kern :err : [ 503.685228] amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
kern :err : [ 503.856026] amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
kern :err : [ 504.025010] amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
kern :err : [ 504.196994] amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
kern :err : [ 504.367989] amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
kern :err : [ 504.535814] amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
kern :err : [ 505.214517] amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
kern :err : [ 505.385299] amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
kern :err : [ 505.915875] amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
kern :err : [ 506.087814] amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
kern :err : [ 506.767984] amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
kern :err : [ 506.938694] amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
kern :err : [ 507.467697] amdgpu 0000:03:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
kern :err : [ 507.625277] amdgpu 0000:03:00.0: amdgpu: rlc autoload: gc ucode autoload timeout
kern :err : [ 507.625284] amdgpu 0000:03:00.0: amdgpu: (-110) failed to wait rlc autoload complete
kern :err : [ 507.625287] [drm:amdgpu_file_to_fpriv [amdgpu]] *ERROR* resume of IP block <gfx_v11_0> failed -110
kern :err : [ 507.625376] amdgpu 0000:03:00.0: amdgpu: amdgpu_device_ip_resume failed (-110).
kern :info : [ 509.253058] pcieport 0000:02:00.0: Data Link Layer Link Active not set in 1000 msec
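To make the resume timeline in the log above easier to read, here is a minimal Python sketch that pulls out the `not ready ... after resume` lines and summarizes them. The sample lines are copied from the log; the function names (`resume_waits`, `summarize`) and the regular expression are only illustrative and assume the dmesg format shown here.

```python
import re

# Lines copied from the dmesg output above.
SAMPLE = """\
kern :info : [ 414.753367] amdgpu 0000:03:00.0: not ready 1023ms after resume; waiting
kern :info : [ 415.802163] amdgpu 0000:03:00.0: not ready 2047ms after resume; waiting
kern :info : [ 417.913210] amdgpu 0000:03:00.0: not ready 4095ms after resume; waiting
kern :info : [ 422.457311] amdgpu 0000:03:00.0: not ready 8191ms after resume; waiting
kern :info : [ 431.161938] amdgpu 0000:03:00.0: not ready 16383ms after resume; waiting
kern :info : [ 448.057739] amdgpu 0000:03:00.0: not ready 32767ms after resume; waiting
kern :warn : [ 484.410401] amdgpu 0000:03:00.0: not ready 65535ms after resume; giving up
"""

# Assumed line shape: "[ <timestamp>] <driver> <device>: not ready <N>ms after resume; <outcome>"
NOT_READY = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+\S+\s+(\S+): not ready (\d+)ms after resume; (waiting|giving up)"
)


def resume_waits(log):
    """Return (timestamp, device, wait_ms, outcome) for each matching line."""
    events = []
    for line in log.splitlines():
        m = NOT_READY.search(line)
        if m:
            events.append((float(m.group(1)), m.group(2), int(m.group(3)), m.group(4)))
    return events


def summarize(events):
    """Collect the reported waits, the time spanned by the messages, and the outcome."""
    waits = [wait_ms for _, _, wait_ms, _ in events]
    elapsed = events[-1][0] - events[0][0] if events else 0.0
    gave_up = bool(events) and events[-1][3] == "giving up"
    return {"waits_ms": waits, "elapsed_s": elapsed, "gave_up": gave_up}
```

Run on the sample, this shows the reported wait roughly doubling on every retry until the final message says the kernel is giving up, about 70 seconds after the first `not ready` message.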
Booting with the power adapter plugged in forces the dGPU into D0 for as long as the adapter stays plugged in.
Fedora 40 LiveUSB doesn't have this behavior, as far as I tested: the GPU can go into D3cold while on external power and does not crash when the adapter is plugged in or unplugged.
If I force the GPU into D0 before plugging in the adapter, then the crash does not happen and the GPU inits normally.
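For anyone who wants to reproduce the workaround, this is a rough Python sketch of one way to force the dGPU into D0, using the standard PCI runtime PM attributes in sysfs (`power/control`, `power/runtime_status`, and `power_state`, the last of which only exists on newer kernels). It assumes the dGPU is at `0000:03:00.0` as in the log above; the helper names are made up for illustration and this is not necessarily the exact method used here.

```python
from pathlib import Path

# Assumption: the dGPU sits at this PCI address, as in the dmesg output above.
DGPU = "0000:03:00.0"


def device_dir(bdf=DGPU, sysfs_root="/sys"):
    """Path of the PCI device directory in sysfs."""
    return Path(sysfs_root) / "bus" / "pci" / "devices" / bdf


def read_attr(dev, name):
    """Read one sysfs attribute, or return 'unknown' if it is missing or unreadable."""
    try:
        return (dev / name).read_text().strip()
    except OSError:
        return "unknown"


def power_report(dev):
    """Snapshot of the runtime PM setting and the current power state."""
    return {
        "control": read_attr(dev, "power/control"),
        "runtime_status": read_attr(dev, "power/runtime_status"),
        "power_state": read_attr(dev, "power_state"),
    }


def set_runtime_pm(dev, mode):
    """Write 'on' (keep the device awake, i.e. in D0) or 'auto' (allow runtime suspend)."""
    if mode not in ("on", "auto"):
        raise ValueError("mode must be 'on' or 'auto'")
    (dev / "power/control").write_text(mode + "\n")
```

Calling `set_runtime_pm(device_dir(), "on")` as root before plugging in the adapter should keep the GPU out of D3cold, and `set_runtime_pm(device_dir(), "auto")` allows it to suspend again. `power_report(device_dir())` before and after shows whether the state actually changed.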
I'm fresh out of ideas on this one. What should I do and/or provide?
Last edited by MangoTCF (2024-07-28 21:21:50)