I am seeing thousands of lines like these in journalctl and dmesg, so many that earlier boot events scroll out of the dmesg buffer. My GPU is an AMD RX 560 running under kernel 6.12.9 and KDE. It probably doesn't matter, but the CPU is a 9900X on an MSI X870 Tomahawk motherboard with 32 GB of RAM. The messages usually come in bursts: the one shown below at 10:40:51 appeared 21 times in that second.
Jan 19 10:39:12 ryzen kernel: [drm] scheduler comp_1.0.2 is not ready, skipping
Jan 19 10:39:35 ryzen kernel: [drm] scheduler comp_1.0.2 is not ready, skipping
Jan 19 10:39:35 ryzen kernel: [drm] scheduler comp_1.0.2 is not ready, skipping
Jan 19 10:39:35 ryzen kernel: [drm] scheduler comp_1.0.2 is not ready, skipping
Jan 19 10:39:35 ryzen kernel: [drm] scheduler comp_1.0.2 is not ready, skipping
Jan 19 10:39:35 ryzen kernel: [drm] scheduler comp_1.0.2 is not ready, skipping
Jan 19 10:39:35 ryzen kernel: [drm] scheduler comp_1.0.2 is not ready, skipping
Jan 19 10:39:49 ryzen kernel: [drm] scheduler comp_1.0.2 is not ready, skipping
Jan 19 10:40:51 ryzen kernel: [drm] scheduler comp_1.0.2 is not ready, skipping
<snipped 19 entries at this same time>
Jan 19 10:40:51 ryzen kernel: [drm] scheduler comp_1.0.2 is not ready, skipping
Jan 19 10:42:38 ryzen kernel: [drm] scheduler comp_1.0.2 is not ready, skipping
Jan 19 10:42:39 ryzen kernel: [drm] scheduler comp_1.0.2 is not ready, skipping
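To put a number on bursts like the one at 10:40:51, the log can be grouped by timestamp. The following is a minimal sketch, assuming the default journalctl output format shown above (month, day, and time as the first three fields); the function name `count_skips_per_second` is made up for illustration.

```shell
#!/bin/sh
# Count "scheduler ... is not ready, skipping" lines per second.
# Reads journalctl-style lines ("Jan 19 10:39:35 host kernel: ...") on stdin
# and prints "<count> <month> <day> <time>", busiest second first.
count_skips_per_second() {
    grep -F 'is not ready, skipping' \
        | awk '{ count[$1 " " $2 " " $3]++ }
               END { for (t in count) print count[t], t }' \
        | sort -k1,1nr -k2
}

# Possible use on a live system (kernel messages from the current boot):
#   journalctl -k -b --no-pager | count_skips_per_second | head
```

The function only reads standard input, so it can also be pointed at a saved log file.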
I've searched around, but these messages usually show up alongside more serious events rather than in isolation. Does anybody have any ideas?
I had seen some suggestions that amdgpu.runpm=0 might help, but no luck.
Jan 19 07:47:21 ryzen kernel: Linux version 6.12.9-arch1-1 (linux@archlinux) (gcc (GCC) 14.2.1 20240910, GNU ld (GNU Binutils) 2.43.1) #1 SMP PREEMPT_DYNAMIC Fri, 10 Jan 202>
Jan 19 07:47:21 ryzen kernel: Command line: root=/dev/nvme0n1p5 rw initrd=\initramfs-linux.img module_blacklist=nouveau amdgpu.runpm=0
During boot, I get messages like these, which tell me the messages above are indeed from my AMD GPU:
Jan 19 07:47:21 ryzen kernel: amdgpu 0000:71:00.0: amdgpu: SE 1, SH per SE 1, CU per SH 2, active_cu_number 2
Jan 19 07:47:21 ryzen kernel: amdgpu 0000:71:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
Jan 19 07:47:21 ryzen kernel: amdgpu 0000:71:00.0: amdgpu: ring gfx_0.1.0 uses VM inv eng 1 on hub 0
Jan 19 07:47:21 ryzen kernel: amdgpu 0000:71:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 4 on hub 0
Jan 19 07:47:21 ryzen kernel: amdgpu 0000:71:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 5 on hub 0
Jan 19 07:47:21 ryzen kernel: amdgpu 0000:71:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
Jan 19 07:47:21 ryzen kernel: amdgpu 0000:71:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
Jan 19 07:47:21 ryzen kernel: amdgpu 0000:71:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
Jan 19 07:47:21 ryzen kernel: amdgpu 0000:71:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
Jan 19 07:47:21 ryzen kernel: amdgpu 0000:71:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
Jan 19 07:47:21 ryzen kernel: amdgpu 0000:71:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
Jan 19 07:47:21 ryzen kernel: amdgpu 0000:71:00.0: amdgpu: ring kiq_0.2.1.0 uses VM inv eng 12 on hub 0
Jan 19 07:47:21 ryzen kernel: amdgpu 0000:71:00.0: amdgpu: ring sdma0 uses VM inv eng 13 on hub 0
Jan 19 07:47:21 ryzen kernel: amdgpu 0000:71:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 8
Jan 19 07:47:21 ryzen kernel: amdgpu 0000:71:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 8
Jan 19 07:47:21 ryzen kernel: amdgpu 0000:71:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 8
Jan 19 07:47:21 ryzen kernel: amdgpu 0000:71:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 8
Jan 19 07:47:21 ryzen kernel: amdgpu 0000:71:00.0: amdgpu: runtime pm is manually disabled
Jan 19 07:47:21 ryzen kernel: amdgpu 0000:71:00.0: amdgpu: Runtime PM not available
Last edited by DeKay (2025-01-20 14:44:56)
There have been a bunch of fixes for amdgpu and similar messages in 6.12.10, which has just been released. Can you reproduce the problem there?
V1del wrote: There have been a bunch of fixes for amdgpu and similar messages in 6.12.10, which has just been released. Can you reproduce the problem there?
Thanks for the heads up! I'll give it a try and report back.
V1del wrote: There have been a bunch of fixes for amdgpu and similar messages in 6.12.10, which has just been released. Can you reproduce the problem there?
Wow! 6.12.10 cleaned them all up! This is fantastic! Thank you so much @V1del. I'll mark this as [Solved].
DeKay wrote: I am seeing thousands of lines like these show up in journalctl and dmesg, so much so that earlier dmesg boot events are lost off the top.
Interesting. Shouldn't the messages be getting disabled/silenced after they get spammy? That is, after they have been printed a certain number of times?
ARGH! They're back. I think the problem is caused by putting the PC to sleep. It was fine up until last night when I put it to sleep. I woke it up this morning and the messages are back with one small difference: the number bounces back and forth between 1.0.3 and 1.0.1. There were over 20 messages each at 08:14:00 and 08:14:02.
<snip>
Jan 20 08:14:00 ryzen kernel: [drm] scheduler comp_1.0.3 is not ready, skipping
Jan 20 08:14:00 ryzen kernel: [drm] scheduler comp_1.0.1 is not ready, skipping
Jan 20 08:14:00 ryzen kernel: [drm] scheduler comp_1.0.3 is not ready, skipping
Jan 20 08:14:02 ryzen kernel: [drm] scheduler comp_1.0.1 is not ready, skipping
Jan 20 08:14:02 ryzen kernel: [drm] scheduler comp_1.0.3 is not ready, skipping
Jan 20 08:14:02 ryzen kernel: [drm] scheduler comp_1.0.1 is not ready, skipping
Jan 20 08:14:02 ryzen kernel: [drm] scheduler comp_1.0.3 is not ready, skipping
<snip>
The following appears to be the cause of the problem: the messages start appearing not long after it. Two of these WARNINGs actually show up back to back; I've snipped the second.
Jan 20 07:44:38 ryzen kernel: amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring comp_1.0.1 test failed (-110)
Jan 20 07:44:38 ryzen kernel: usb 1-2: reset high-speed USB device number 2 using xhci_hcd
Jan 20 07:44:38 ryzen kernel: amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring comp_1.0.3 test failed (-110)
Jan 20 07:44:38 ryzen kernel: [drm] UVD and UVD ENC initialized successfully.
Jan 20 07:44:38 ryzen kernel: usb 1-5: reset high-speed USB device number 4 using xhci_hcd
Jan 20 07:44:38 ryzen kernel: [drm] VCE initialized successfully.
Jan 20 07:44:38 ryzen kernel: ------------[ cut here ]------------
Jan 20 07:44:38 ryzen kernel: WARNING: CPU: 12 PID: 11871 at drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:100 generic_reg_update_ex+0x1d2/0x290 [amdgpu]
Jan 20 07:44:38 ryzen kernel: Modules linked in: uas usb_storage snd_seq_dummy snd_hrtimer snd_seq ccm michael_mic vfat fat qrtr_mhi amd_atl intel_rapl_msr intel_rapl_common snd_hda_codec_hdmi qrtr ath12k kvm_amd snd_hda_intel qmi_helpers raid1 mac80211 kvm btusb btrtl libarc4 btintel crct10dif_pclmul btbcm crc32_pclmul btmtk polyval_clmulni polyval_generic cfg80211 snd_>
Jan 20 07:44:38 ryzen kernel: amdxcp i2c_algo_bit drm_ttm_helper ttm drm_suballoc_helper drm_exec gpu_sched drm_buddy nvme drm_display_helper crc32c_intel nvme_core cec video nvme_auth crc16 wmi
Jan 20 07:44:38 ryzen kernel: CPU: 12 UID: 0 PID: 11871 Comm: kworker/u97:122 Not tainted 6.12.10-arch1-1 #1 ac0cff2c6581af0a10f6e278cbc98026cc1e3dec
Jan 20 07:44:38 ryzen kernel: Hardware name: Micro-Star International Co., Ltd. MS-7E51/MAG X870 TOMAHAWK WIFI (MS-7E51), BIOS 1.A20 12/17/2024
Jan 20 07:44:38 ryzen kernel: Workqueue: async async_run_entry_fn
Jan 20 07:44:38 ryzen kernel: RIP: 0010:generic_reg_update_ex+0x1d2/0x290 [amdgpu]
Jan 20 07:44:38 ryzen kernel: Code: 97 15 74 4a 41 c7 47 0c 00 00 00 00 48 8d 04 40 49 8d 04 87 44 89 60 15 44 89 68 19 44 89 70 1d 41 83 47 54 01 e9 7b ff ff ff <0f> 0b e9 87 fe ff ff 44 89 f2 44 89 e6 48 89 df e8 f9 fc ff ff 84
Jan 20 07:44:38 ryzen kernel: RSP: 0018:ffffa1a04bd3b9b8 EFLAGS: 00010246
Jan 20 07:44:38 ryzen kernel: RAX: ffffa1a04bd3b9e0 RBX: ffff8d660ead0200 RCX: 0000000000000000
Jan 20 07:44:38 ryzen kernel: RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff8d660ead0200
Jan 20 07:44:38 ryzen kernel: RBP: ffffa1a04bd3ba38 R08: 0000000000000000 R09: 0000000000000064
Jan 20 07:44:38 ryzen kernel: R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
Jan 20 07:44:38 ryzen kernel: R13: 0000000000000000 R14: ffff8d661a800000 R15: ffff8d66011b9800
Jan 20 07:44:38 ryzen kernel: FS: 0000000000000000(0000) GS:ffff8d6d4e000000(0000) knlGS:0000000000000000
Jan 20 07:44:38 ryzen kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 20 07:44:38 ryzen kernel: CR2: 0000000000000000 CR3: 000000001b022000 CR4: 0000000000f50ef0
Jan 20 07:44:38 ryzen kernel: PKRU: 55555554
Jan 20 07:44:38 ryzen kernel: Call Trace:
Jan 20 07:44:38 ryzen kernel: <TASK>
Jan 20 07:44:38 ryzen kernel: ? generic_reg_update_ex+0x1d2/0x290 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
Jan 20 07:44:38 ryzen kernel: ? __warn.cold+0x93/0xf6
Jan 20 07:44:38 ryzen kernel: ? generic_reg_update_ex+0x1d2/0x290 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
Jan 20 07:44:38 ryzen kernel: ? report_bug+0xff/0x140
Jan 20 07:44:38 ryzen kernel: ? handle_bug+0x58/0x90
Jan 20 07:44:38 ryzen kernel: ? exc_invalid_op+0x17/0x70
Jan 20 07:44:38 ryzen kernel: ? asm_exc_invalid_op+0x1a/0x20
Jan 20 07:44:38 ryzen kernel: ? generic_reg_update_ex+0x1d2/0x290 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
Jan 20 07:44:38 ryzen kernel: ? dm_read_reg_func+0x57/0xe0 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
Jan 20 07:44:38 ryzen kernel: ? generic_reg_get2+0x26/0x50 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
Jan 20 07:44:38 ryzen kernel: dce_aux_configure_timeout+0x100/0x220 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
Jan 20 07:44:38 ryzen kernel: try_to_configure_aux_timeout+0x7a/0xe0 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
Jan 20 07:44:38 ryzen kernel: retrieve_link_cap+0x71/0xda0 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
Jan 20 07:44:38 ryzen kernel: detect_link_and_local_sink+0xbe4/0x1090 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
Jan 20 07:44:38 ryzen kernel: link_detect+0x38/0x4e0 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
Jan 20 07:44:38 ryzen kernel: ? dal_gpio_destroy_irq+0x25/0x40 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
Jan 20 07:44:38 ryzen kernel: ? query_hpd_status+0x6e/0xa0 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
Jan 20 07:44:38 ryzen kernel: dm_resume+0x1fd/0x7b0 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
Jan 20 07:44:38 ryzen kernel: amdgpu_device_ip_resume_phase3+0x72/0xd0 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
Jan 20 07:44:38 ryzen kernel: amdgpu_device_resume+0xaa/0x2c0 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
Jan 20 07:44:38 ryzen kernel: amdgpu_pmops_resume+0x4a/0x80 [amdgpu 339295965688ae1ec04f4a1eca4d6cba2ce4f47c]
Jan 20 07:44:38 ryzen kernel: ? __pfx_pci_pm_resume+0x10/0x10
Jan 20 07:44:38 ryzen kernel: dpm_run_callback+0x47/0x150
Jan 20 07:44:38 ryzen kernel: device_resume+0xb0/0x280
Jan 20 07:44:38 ryzen kernel: async_resume+0x1d/0x30
Jan 20 07:44:38 ryzen kernel: async_run_entry_fn+0x31/0x140
Jan 20 07:44:38 ryzen kernel: process_one_work+0x17b/0x330
Jan 20 07:44:38 ryzen kernel: worker_thread+0x2ce/0x3f0
Jan 20 07:44:38 ryzen kernel: ? __pfx_worker_thread+0x10/0x10
Jan 20 07:44:38 ryzen kernel: kthread+0xcf/0x100
Jan 20 07:44:38 ryzen kernel: ? __pfx_kthread+0x10/0x10
Jan 20 07:44:38 ryzen kernel: ret_from_fork+0x31/0x50
Jan 20 07:44:38 ryzen kernel: ? __pfx_kthread+0x10/0x10
Jan 20 07:44:38 ryzen kernel: ret_from_fork_asm+0x1a/0x30
Jan 20 07:44:38 ryzen kernel: </TASK>
Jan 20 07:44:38 ryzen kernel: ---[ end trace 0000000000000000 ]---
Jan 20 07:44:38 ryzen kernel: ------------[ cut here ]------------
Any ideas what I can do here to recover from this without rebooting?
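One way to test the theory that the failed ring tests at resume lead to the later spam is to compare the ring names in the two kinds of message. This is a rough sketch, assuming the message wording shown in this thread; the helper names `failed_rings` and `skipped_schedulers` are hypothetical.

```shell
#!/bin/sh
# Both helpers read kernel log lines on stdin and print unique ring names.

# Rings whose self-test failed, e.g.
#   "... *ERROR* ring comp_1.0.1 test failed (-110)"
failed_rings() {
    sed -n 's/.*ring \([^ ]*\) test failed.*/\1/p' | sort -u
}

# Schedulers that are later reported as not ready, e.g.
#   "[drm] scheduler comp_1.0.1 is not ready, skipping"
skipped_schedulers() {
    sed -n 's/.*scheduler \([^ ]*\) is not ready, skipping.*/\1/p' | sort -u
}

# Possible use on a live system:
#   journalctl -k -b --no-pager > kernel.log
#   failed_rings < kernel.log
#   skipped_schedulers < kernel.log
```

If both commands print the same names (comp_1.0.1 and comp_1.0.3 in the log above), that would be consistent with the spam coming from rings that never came back after resume.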
So far, it's not clear to me what is wrong with your setup beyond the journal spam.
That said, this kind of spam can be quite expensive; AFAIK it can slow down the device, the GPU, or other activity by a large margin.
Last edited by ReDress (2025-01-20 14:59:59)
Last edited by DeKay (2025-01-20 16:33:59)
Which kernel introduced the issue? Is the issue still present in 6.13?
sudo pacman -U https://pkgbuild.com/\~gromit/linux-bisection-kernels/linux-mainline-6.13-1-x86_64.pkg.tar.zst
Do you need help performing the bisection requested by upstream?
Seems like the stack trace is pointing here:
uint32_t generic_reg_update_ex(const struct dc_context *ctx,
        uint32_t addr, int n,
        uint8_t shift1, uint32_t mask1, uint32_t field_value1,
        ...)
{
    struct dc_reg_value_masks field_value_mask = {0};
    uint32_t reg_val;
    va_list ap;

    va_start(ap, field_value1);

    set_reg_field_values(&field_value_mask, addr, n, shift1, mask1,
            field_value1, ap);

    va_end(ap);

    if (ctx->dmub_srv &&
        ctx->dmub_srv->reg_helper_offload.gather_in_progress)
        return dmub_reg_value_pack(ctx, addr, &field_value_mask);
    /* todo: return void so we can decouple code running in driver from register states */

    /* mmio write directly */
    reg_val = dm_read_reg(ctx, addr);
    reg_val = (reg_val & ~field_value_mask.mask) | field_value_mask.value;
    dm_write_reg(ctx, addr, reg_val);
    return reg_val;
}
There doesn't seem to be much there, though, so maybe the hardware write is failing.
Jan 20 07:44:38 ryzen kernel: WARNING: CPU: 12 PID: 11871 at drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:100 generic_reg_update_ex+0x1d2/0x290 [amdgpu]
To me that points to https://github.com/archlinux/linux/blob … per.c#L100
DeKay wrote: I'm not knowledgeable enough to bisect the issue, unfortunately.
What help do you need with the bisection, DeKay?
DeKay wrote: I'm not knowledgeable enough to bisect the issue, unfortunately.
What help do you need with the bisection, DeKay?
Either way, that sounds like it would be a great idea.
@loqs @ReDress
Thanks for chiming in! I hope it doesn't look like I was ignoring your comments. I thought I was subscribed to the topic but guess not.
As I mentioned in the bug report, I don't really know when the problem started. This is a new build with an old GPU, and I hadn't been looking much at dmesg or journalctl on the old build because it was running well. That leaves me unsure what the starting bound of the bisection would even be. I'm also far from guru level: I've never bisected any code before, never mind the kernel, and I can see it being a drawn-out affair to reproduce the problem on each build. I'm also a little uneasy about diving into unreleased kernels like 6.13, for fear of messing up my system in the process.
My time is also getting chewed up on other higher priority stuff at the moment. Having said all that, maybe I'll give it a shot sometime down the road if my time frees up and upstream doesn't solve it. Can you point me to a good resource on how I'd go about the bisection with an uncertain starting point?
EDIT: a confounding factor is that my X870 motherboard is very recent, and its onboard peripherals, which need modules like ath12k, don't play well with older kernels :-(
Last edited by DeKay (2025-01-22 16:44:32)
I have exactly the same issue with AMD CPU / GPU being reported in dmesg
AMD Ryzen 7 5800X
AMD Radeon RX 580
Happy to help debug if needed.
BTW: I'm a CPU engineer, and so I am used to Linux in general, but new to Arch. I've been a Kubuntu user for years and finally have taken the plunge (which was not as hard as I was expecting).
I have never heard of "Bisection Dekay or Delay"... but willing to learn...
Last edited by Paul_Chaffey (2025-01-22 22:21:16)
Paul_Chaffey wrote: I have exactly the same issue with AMD CPU / GPU being reported in dmesg
...
I have never heard of "Bisection Dekay or Delay"... but willing to learn...
"DeKay" is my user name. "Bisection" is a reference to using "git bisect" to determine the commit that introduced the bug.
It would be fantastic if you took a shot at this @Paul_Chaffey. Please add to the bug report I linked with anything you find. Even an "I see this bug as well" to start with might not hurt so they know it isn't just a one-off with my system.
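For anyone who has not used `git bisect` before, the mechanics can be tried on a throwaway repository first. The sketch below is a toy: it assumes `git` is installed, and the function name, the file names, and the "bug" (a file whose content flips from `good` to `bad` in commit 5) are all invented for illustration. Nothing in it touches the kernel.

```shell
#!/bin/sh
# Toy illustration of `git bisect`: creates a throwaway repository with eight
# commits where a "bug" appears in commit 5, then lets git locate that commit.
# Prints the subject line of the first bad commit.
bisect_demo() (
    set -e
    demo_dir=$(mktemp -d)
    cd "$demo_dir"
    git init -q .
    git config user.email "demo@example.invalid"
    git config user.name "Bisect Demo"

    for i in 1 2 3 4 5 6 7 8; do
        # Commits 1-4 are "good"; the bug lands in commit 5 and stays.
        if [ "$i" -ge 5 ]; then echo bad > status; else echo good > status; fi
        echo "$i" > counter
        git add status counter
        git commit -q -m "commit $i"
    done

    # Untracked test script: exit status 0 means "good", 1 means "bad".
    printf '%s\n' 'grep -qx good status' > check.sh

    # Known-bad end first, known-good end second.
    first_commit=$(git rev-list --max-parents=0 HEAD)
    git bisect start HEAD "$first_commit" > /dev/null

    # `git bisect run` checks out each candidate and runs the test on it.
    git bisect run sh check.sh > /dev/null

    # The culprit stays recorded under refs/bisect/bad until the reset.
    git log -1 --format=%s refs/bisect/bad
    git bisect reset > /dev/null 2>&1
    cd /
    rm -rf "$demo_dir"
)
```

Calling `bisect_demo` should print `commit 5`. For the kernel there is no automatic test script: each step would instead mean building or installing the candidate kernel, booting it, suspending and resuming, and then running `git bisect good` or `git bisect bad` by hand.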
OK, this occurs when resuming from a suspended power state. Here's the dmesg output:
[12762.034284] amdgpu 0000:26:00.0: amdgpu: PCI CONFIG reset
[12762.071637] ACPI: PM: Preparing to enter system sleep state S3
[12762.388086] ACPI: PM: Saving platform NVS memory
[12762.388288] Disabling non-boot CPUs ...
[12762.390088] smpboot: CPU 15 is now offline
[12762.392259] smpboot: CPU 14 is now offline
[12762.394490] smpboot: CPU 13 is now offline
[12762.396595] smpboot: CPU 12 is now offline
[12762.398752] smpboot: CPU 11 is now offline
[12762.400887] smpboot: CPU 10 is now offline
[12762.403034] smpboot: CPU 9 is now offline
[12762.405202] smpboot: CPU 8 is now offline
[12762.405767] Spectre V2 : Update user space SMT mitigation: STIBP off
[12762.407283] smpboot: CPU 7 is now offline
[12762.409282] smpboot: CPU 6 is now offline
[12762.411349] smpboot: CPU 5 is now offline
[12762.413516] smpboot: CPU 4 is now offline
[12762.415433] smpboot: CPU 3 is now offline
[12762.417424] smpboot: CPU 2 is now offline
[12762.419359] smpboot: CPU 1 is now offline
[12762.420425] ACPI: PM: Low-level resume complete
[12762.420450] ACPI: PM: Restoring platform NVS memory
[12762.420777] LVT offset 0 assigned for vector 0x400
[12762.421664] Enabling non-boot CPUs ...
[12762.421733] smpboot: Booting Node 0 Processor 1 APIC 0x2
[12762.434351] CPU1 is up
[12762.434403] smpboot: Booting Node 0 Processor 2 APIC 0x4
[12762.439537] CPU2 is up
[12762.439600] smpboot: Booting Node 0 Processor 3 APIC 0x6
[12762.446420] CPU3 is up
[12762.446474] smpboot: Booting Node 0 Processor 4 APIC 0x8
[12762.453741] CPU4 is up
[12762.453801] smpboot: Booting Node 0 Processor 5 APIC 0xa
[12762.460470] CPU5 is up
[12762.460513] smpboot: Booting Node 0 Processor 6 APIC 0xc
[12762.465987] CPU6 is up
[12762.466054] smpboot: Booting Node 0 Processor 7 APIC 0xe
[12762.470367] CPU7 is up
[12762.470410] smpboot: Booting Node 0 Processor 8 APIC 0x1
[12762.476488] Spectre V2 : Update user space SMT mitigation: STIBP always-on
[12762.476492] CPU8 is up
[12762.476534] smpboot: Booting Node 0 Processor 9 APIC 0x3
[12762.482821] CPU9 is up
[12762.482881] smpboot: Booting Node 0 Processor 10 APIC 0x5
[12762.486824] CPU10 is up
[12762.486871] smpboot: Booting Node 0 Processor 11 APIC 0x7
[12762.492860] CPU11 is up
[12762.492911] smpboot: Booting Node 0 Processor 12 APIC 0x9
[12762.497808] CPU12 is up
[12762.497900] smpboot: Booting Node 0 Processor 13 APIC 0xb
[12762.502939] CPU13 is up
[12762.502991] smpboot: Booting Node 0 Processor 14 APIC 0xd
[12762.510471] CPU14 is up
[12762.510513] smpboot: Booting Node 0 Processor 15 APIC 0xf
[12762.516147] CPU15 is up
[12762.517949] ACPI: PM: Waking up from system sleep state S3
[12762.521375] xhci_hcd 0000:03:00.0: xHC error in resume, USBSTS 0x401, Reinit
[12762.521378] usb usb1: root hub lost power or was reset
[12762.521380] usb usb2: root hub lost power or was reset
[12762.522343] serial 00:04: activated
[12762.578957] nvme nvme1: D3 entry latency set to 8 seconds
[12762.595994] nvme nvme1: 16/0/0 default/read/poll queues
[12762.652776] [drm] PCIE GART of 256M enabled (table at 0x000000F400380000).
[12762.745530] usb 3-2.1.3: reset full-speed USB device number 7 using xhci_hcd
[12762.831454] ata5: SATA link down (SStatus 0 SControl 330)
[12762.831714] ata6: SATA link down (SStatus 0 SControl 330)
[12762.831821] ata1: SATA link down (SStatus 0 SControl 330)
[12762.831847] ata2: SATA link down (SStatus 0 SControl 330)
[12762.848690] nvme nvme0: 8/0/0 default/read/poll queues
[12762.851387] nvme nvme0: Ignoring bogus Namespace Identifiers
[12762.952064] amdgpu 0000:26:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring comp_1.2.0 test failed (-110)
[12763.200375] amdgpu 0000:26:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring comp_1.2.1 test failed (-110)
[12763.267334] [drm] UVD and UVD ENC initialized successfully.
[12763.368419] [drm] VCE initialized successfully.
[12767.833833] usb 3-2.2: reset full-speed USB device number 8 using xhci_hcd
[12773.029487] usb 3-2.2: PM: dpm_run_callback(): usb_dev_resume returns -5
[12773.029499] usb 3-2.2: PM: failed to resume async: error -5
[12773.030814] OOM killer enabled.
[12773.030816] Restarting tasks ... done.
[12773.031301] random: crng reseeded on system resumption
[12773.031309] PM: suspend exit
[12773.031631] usb 3-2.2: USB disconnect, device number 8
[12773.032747] Bluetooth: hci0: RTL: examining hci_ver=0a hci_rev=000b lmp_ver=0a lmp_subver=8761
[12773.033751] Bluetooth: hci0: RTL: rom_version status=0 version=1
[12773.033756] Bluetooth: hci0: RTL: loading rtl_bt/rtl8761bu_fw.bin
[12773.034047] Bluetooth: hci0: RTL: loading rtl_bt/rtl8761bu_config.bin
[12773.034111] Bluetooth: hci0: RTL: cfg_sz 6, total sz 30210
[12773.068622] Generic FE-GE Realtek PHY r8169-0-2200:00: attached PHY driver (mii_bus:phy_addr=r8169-0-2200:00, irq=MAC)
[12773.070010] [drm] scheduler comp_1.2.0 is not ready, skipping
[12773.070014] [drm] scheduler comp_1.2.1 is not ready, skipping
[12773.131854] [drm] scheduler comp_1.2.0 is not ready, skipping
[12773.131858] [drm] scheduler comp_1.2.1 is not ready, skipping
[12773.132641] [drm] scheduler comp_1.2.0 is not ready, skipping
[12773.132644] [drm] scheduler comp_1.2.1 is not ready, skipping
[12773.132769] [drm] scheduler comp_1.2.0 is not ready, skipping
[12773.132771] [drm] scheduler comp_1.2.1 is not ready, skipping
[12773.150266] [drm] scheduler comp_1.2.0 is not ready, skipping
[12773.150269] [drm] scheduler comp_1.2.1 is not ready, skipping
[12773.160489] [drm] scheduler comp_1.2.0 is not ready, skipping
[12773.160492] [drm] scheduler comp_1.2.1 is not ready, skipping
[12773.178108] [drm] scheduler comp_1.2.0 is not ready, skipping
[12773.178111] [drm] scheduler comp_1.2.1 is not ready, skipping
[12773.185735] Bluetooth: hci0: RTL: fw version 0xdfc6d922
[12773.245219] usb 3-2.2: new full-speed USB device number 9 using xhci_hcd
[12773.245315] r8169 0000:22:00.0 enp34s0: Link is Down
[12773.252876] Bluetooth: MGMT ver 1.23
[12773.282315] [drm] scheduler comp_1.2.0 is not ready, skipping
[12773.282317] [drm] scheduler comp_1.2.1 is not ready, skipping
[12773.389553] usb 3-2.2: New USB device found, idVendor=15a2, idProduct=0300, bcdDevice= 0.01
[12773.389558] usb 3-2.2: New USB device strings: Mfr=0, Product=2, SerialNumber=3
[12773.389561] usb 3-2.2: Product: USB PnP Audio Device
[12773.389563] usb 3-2.2: SerialNumber: 20200601000001
[12773.393323] [drm] scheduler comp_1.2.0 is not ready, skipping
[12773.393325] [drm] scheduler comp_1.2.1 is not ready, skipping
[12773.439622] input: USB PnP Audio Device as /devices/pci0000:00/0000:00:08.1/0000:28:00.3/usb3/3-2/3-2.2/3-2.2:1.0/0003:15A2:0300.000B/input/input32
[12773.491996] hid-generic 0003:15A2:0300.000B: input,hidraw0: USB HID v2.01 Device [USB PnP Audio Device] on usb-0000:28:00.3-2.2/input0
[12773.497571] [drm] scheduler comp_1.2.0 is not ready, skipping
[12773.497573] [drm] scheduler comp_1.2.1 is not ready, skipping
[12776.940236] r8169 0000:22:00.0 enp34s0: Link is Up - 1Gbps/Full - flow control rx/tx
[12778.475067] [drm] scheduler comp_1.2.0 is not ready, skipping
[12778.475072] [drm] scheduler comp_1.2.1 is not ready, skipping
[12778.475688] [drm] scheduler comp_1.2.0 is not ready, skipping
Last edited by Paul_Chaffey (2025-01-23 08:41:38)
@Paul_Chaffey - If you wrap your log in code tags, it won't span the whole page the way it currently does.
Adding `pci=nommconf` to my kernel parameters solved the issue for me.
Edit: System info:
CPU: AMD 2700X
GPU: AMD RX580
Kernel: 6.12.10
OS: Arch Linux
Last edited by mehdi (2025-01-27 10:07:28)
mehdi wrote: Adding `pci=nommconf` to my kernel parameters solved the issue for me.
Thanks for posting this @mehdi but unfortunately this didn't help me :-(
[dk@ryzen ~]$ cat /proc/cmdline
root=/dev/nvme0n1p5 rw initrd=\initramfs-linux.img module_blacklist=nouveau pci=nommconf
EDIT: and adding that also breaks my GPU passthrough setup for my Windows VM.
Last edited by DeKay (2025-02-05 15:54:16)
mehdi wrote: Adding `pci=nommconf` to my kernel parameters solved the issue for me.
Thanks for posting this @mehdi but unfortunately this didn't help me :-(
Well, I have to retract my words. Everything is back, so `pci=nommconf` had nothing to do with it.
Hi, fresh archlinux install from yesterday here!
I'm getting the same messages after a suspend:
First,
kernel: amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring comp_1.2.1 test failed (-110)
then
kernel: [drm] scheduler comp_1.2.1 is not ready, skipping
spamming.
My setup:
Graphics:
Device-1: Intel Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics
driver: i915 v: kernel
Device-2: Advanced Micro Devices [AMD/ATI] Ellesmere [Radeon Pro WX 5100]
driver: amdgpu v: kernel
Display: wayland server: Xwayland v: 24.1.4 compositor: Sway v: 1.10.1
driver: gpu: amdgpu resolution: 3840x2160~60Hz
API: Vulkan v: 1.4.303 drivers: N/A surfaces: xcb,xlib,wayland
Kernel: 6.12.10-arch1-1 x86_64
Last edited by elgmizik (2025-01-31 00:11:28)
In my case too it happens after resuming from suspend.
Kernel: 6.12.10-arch1-1
GPU: AMD Radeon RX 560
Wayland Native on Sway 1.10.1
elgmizik wrote: My setup:
<snip>
Device-2: Advanced Micro Devices [AMD/ATI] Ellesmere [Radeon Pro WX 5100]
driver: amdgpu v: kernel
Display: wayland server: Xwayland v: 24.1.4 compositor: Sway v: 1.10.1
driver: gpu: amdgpu resolution: 3840x2160~60Hz
API: Vulkan v: 1.4.303 drivers: N/A surfaces: xcb,xlib,wayland
Kernel: 6.12.10-arch1-1 x86_64
WX 5100 is Polaris. That is the common denominator so far.
I'm seeing this issue on my RX 570; another Polaris card.
Possibly relevantly, I had to disable power management and the Display Core driver (by adding the kernel parameters `amdgpu.dpm=0` and `amdgpu.dc=0`) in order to prevent my RX 570 from losing its mind (heavily corrupted graphics), especially when waking from sleep. Is anyone else who's having this issue using `amdgpu` kernel parameters?
I've also been seeing an infrequent display flicker where what looks to be a terminal or console replaces the entire screen for a single frame. I need to do some more testing, but I'm starting to think that they occur at the same time as the log gets spammed with those `[drm] scheduler comp_X.Y.Z is not ready, skipping` messages.
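For anyone comparing notes on `amdgpu` parameters, the ones passed on the kernel command line can be listed as below. This is only a sketch: `list_amdgpu_params` is a hypothetical name, and it will not show options set through modprobe configuration files.

```shell
#!/bin/sh
# Print every amdgpu.* parameter found in a kernel command line read on stdin,
# one per line, in the order they appear.
list_amdgpu_params() {
    tr -s ' \t' '\n\n' | grep '^amdgpu\.'
}

# Possible use on a live system:
#   list_amdgpu_params < /proc/cmdline
```

On a running system the values the driver actually ended up with can usually also be read from the files under `/sys/module/amdgpu/parameters/`.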
Possibly relevantly, I had to disable power management and the Display Core driver (by adding the kernel parameters `amdgpu.dpm=0` and `amdgpu.dc=0`) in order to prevent my RX 570 from losing its mind (heavily corrupted graphics), especially when waking from sleep. Is anyone else who's having this issue using `amdgpu` kernel parameters?
I tried with and without `amdgpu.runpm=0` and it didn't make a difference. I didn't try your particular parameters. The next thing to try is whether the 6.13 kernel that just went into stable makes any difference.
I also sometimes see my display flicker as if it can't get sync when waking from sleep. It can also happen when my monitor has been blanked for a bit from inactivity. I have this script, run from an icon on my desktop, to turn off the screen. When I shake my mouse to turn it back on, it usually re-syncs.
/bin/sleep 1 && /bin/dbus-send --session --print-reply --dest=org.kde.kglobalaccel /component/org_kde_powerdevil org.kde.kglobalaccel.Component.invokeShortcut string:'Turn Off Screen'