
#1 2018-08-18 14:28:16

Lanz
Member
From: Ontario, Canada
Registered: 2015-01-24
Posts: 11

Ryzen 2700X reproducible gaming crashes in kernel > 4.14.x

I am experiencing consistent, reproducible crashes when running games on my Ryzen 2700X with a Radeon R9-285 on an Asus ROG STRIX X470-F GAMING motherboard (latest firmware as of August 15, 2018).

Kernel: 4.18.1-arch1-1-ARCH #1 SMP PREEMPT Wed Aug 15 21:11:55 UTC 2018 x86_64 GNU/Linux

The system is not overclocked.

1. This is not the idle bug (I have Advanced > AMD CBS > Zen Common Options > Power Supply Idle Control set to "Common current idle" instead of "Auto", and I have idle=nomwait in my Grub command line).
2. This is not the C-state bug (I have the C6 C-state disabled in the BIOS and processor.max_cstate=5 rcu_nocbs=0-15 in my Grub command line).
3. This is not the segfault bug (I tested with the ryzen-test/kill-ryzen.sh script and it was negative).
4. This occurs on any kernel from 4.15 to 4.18.
5. This DOES NOT occur on the kernel provided by the Arch linux-lts package.
6. Googling around finds multiple people having a similar issue even on nVidia graphics cards, and no solution in sight. The things I've tried (listed above) are the only suggestions I was able to find in such threads.
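
Since points 1 and 2 depend on the Grub parameters actually reaching the running kernel, a quick check against /proc/cmdline can rule out a typo or a stale grub.cfg. This is only a minimal sketch: the function names are made up for illustration, and the parameter list is copied from the points above.

```python
# Parameters the list above says were added to the Grub command line.
EXPECTED = ["idle=nomwait", "processor.max_cstate=5", "rcu_nocbs=0-15"]


def missing_params(cmdline, expected=EXPECTED):
    """Return the expected parameters that are absent from a kernel command line."""
    present = set(cmdline.split())
    return [p for p in expected if p not in present]


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read().strip()
```

On the affected machine, `missing_params(read_cmdline())` should return an empty list if all three parameters were applied at boot.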

To reproduce the total system freeze, what I do is start Quake 3 Arena, have 2 bots fight each other and spectate one of them, and simply let the machine run until it crashes. It happens in multiple other games but this is the easiest way to reproduce it. Sometimes it will happen immediately, sometimes it can run for 2-3 hours before a crash, but it will always crash eventually. I've been tweaking my BIOS with all sorts of different settings to try to find one that doesn't crash, with no luck.

As it's a full system lock with no way to SSH in, I have no log of the crash. Nothing abnormal appears in dmesg on boot, and nothing abnormal appears in journalctl --since "5 min ago" when I reboot after a hard crash.

Any suggestions would be appreciated.

Last edited by Lanz (2018-08-18 14:29:03)


#2 2018-08-18 14:49:26

progandy
Member
Registered: 2012-05-17
Posts: 5,202

Re: Ryzen 2700X reproducible gaming crashes in kernel > 4.14.x

You can try to create the SSH connection first and immediately start logging /proc/kmsg.
If that doesn't work, then you can try to set up netconsole:
https://wiki.archlinux.org/index.php/Netconsole
https://wiki.ubuntu.com/Kernel/Netconso … Netconsole
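
The first suggestion could be sketched roughly as below, run from a second machine so the log survives the freeze. It assumes the SSH login on the crashing box is allowed to read /proc/kmsg (normally root). The host name in the usage note and both function names are hypothetical.

```python
import subprocess


def save_stream(lines, log_path):
    """Append each line to log_path and flush immediately, so the file
    can be read while the capture is still running."""
    count = 0
    with open(log_path, "a") as log:
        for line in lines:
            log.write(line)
            log.flush()
            count += 1
    return count


def log_remote_kmsg(host, log_path):
    """Run `cat /proc/kmsg` on host over ssh and save its output locally.
    Returns the number of lines captured once the connection ends."""
    proc = subprocess.Popen(
        ["ssh", host, "cat", "/proc/kmsg"],
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        return save_stream(proc.stdout, log_path)
    finally:
        proc.terminate()
        proc.wait()
```

A call such as `log_remote_kmsg("root@ryzen-box", "kmsg.log")` would be started before launching the game. When the machine locks up, the connection stalls and whatever the kernel managed to print is already in kmsg.log.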


| alias CUTF='LANG=en_XX.UTF-8@POSIX ' |


#3 2018-08-18 15:16:24

Lanz
Member
From: Ontario, Canada
Registered: 2015-01-24
Posts: 11

Re: Ryzen 2700X reproducible gaming crashes in kernel > 4.14.x

progandy wrote:

You can try to create the SSH connection first and immediately start logging /proc/kmsg.
If that doesn't work, then you can try to set up netconsole:
https://wiki.archlinux.org/index.php/Netconsole
https://wiki.ubuntu.com/Kernel/Netconso … Netconsole

That was a great idea, progandy. I dunno why I didn't think of it. Here's a dump that I just captured via my Quake 3 method:

<3>[  887.393133] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:43:crtc-0] flip_done timed out
<3>[  897.417205] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [PLANE:41:plane-5] flip_done timed out
<3>[  897.417243] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* amdgpu_dm_commit_planes: acrtc 0, already busy
<4>[  897.417279] WARNING: CPU: 9 PID: 742 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:4134 amdgpu_dm_atomic_commit_tail+0xb7f/0xd70 [amdgpu]
<4>[  897.417279] Modules linked in: uas usb_storage amdkfd amd_iommu_v2 amdgpu fuse snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi edac_mce_amd kvm_amd chash gpu_sched ttm snd_hda_intel kvm input_leds drm_kms_helper snd_hda_codec snd_hda_core irqbypass joydev mousedev drm crct10dif_pclmul snd_hwdep crc32_pclmul ghash_clmulni_intel snd_pcm pcbc igb eeepc_wmi snd_timer asus_wmi sparse_keymap snd rfkill aesni_intel led_class agpgart mxm_wmi wmi_bmof aes_x86_64 crypto_simd syscopyarea i2c_algo_bit cryptd sysfillrect sp5100_tco sysimgblt glue_helper ccp k10temp dca fb_sys_fops pcspkr soundcore i2c_piix4 rng_core rtc_cmos evdev gpio_amdpt pinctrl_amd wmi mac_hid pcc_cpufreq acpi_cpufreq vboxnetflt(O) vboxnetadp(O) vboxpci(O) vboxdrv(O) crypto_user ip_tables x_tables ext4 crc32c_generic crc16
<4>[  897.417306]  mbcache jbd2 fscrypto sd_mod hid_generic usbhid hid ahci xhci_pci libahci xhci_hcd crc32c_intel libata usbcore scsi_mod usb_common
<4>[  897.417313] CPU: 9 PID: 742 Comm: Xorg Tainted: G           O      4.18.1-arch1-1-ARCH #1
<4>[  897.417314] Hardware name: System manufacturer System Product Name/ROG STRIX X470-F GAMING, BIOS 4018 07/12/2018
<4>[  897.417341] RIP: 0010:amdgpu_dm_atomic_commit_tail+0xb7f/0xd70 [amdgpu]
<4>[  897.417341] Code: 0f 87 ae f6 ff ff e9 46 f7 ff ff 41 8b 95 d0 04 00 00 48 c7 c6 90 2d bb c0 48 89 44 24 38 48 c7 c7 ea c0 c0 c0 e8 b1 18 b6 ff <0f> 0b 48 8b 44 24 38 48 8b 0c 24 e9 05 fb ff ff 4c 8b 71 08 4d 85 
<4>[  897.417357] RSP: 0018:ffffab5e031efb30 EFLAGS: 00010046
<4>[  897.417358] RAX: 0000000000000000 RBX: 0000000000000005 RCX: 0000000000000002
<4>[  897.417359] RDX: 0000000000000000 RSI: ffffffff8ee81166 RDI: 00000000ffffffff
<4>[  897.417359] RBP: ffff97a9f65dc500 R08: ffffffff8e4dd530 R09: 000000000000040e
<4>[  897.417360] R10: 0000000000000004 R11: ffffffff8f603f2d R12: ffff97a9f7ca5800
<4>[  897.417360] R13: ffff97a9fba13800 R14: ffff97a9fba13800 R15: ffff97a99e15d000
<4>[  897.417361] FS:  00007fb9339d6e00(0000) GS:ffff97aa1ee40000(0000) knlGS:0000000000000000
<4>[  897.417362] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  897.417362] CR2: 00005651d103d010 CR3: 00000003feea6000 CR4: 00000000003406e0
<4>[  897.417363] Call Trace:
<4>[  897.417370]  commit_tail+0x3d/0x70 [drm_kms_helper]
<4>[  897.417374]  drm_atomic_helper_commit+0x103/0x110 [drm_kms_helper]
<4>[  897.417378]  drm_atomic_helper_legacy_gamma_set+0x136/0x160 [drm_kms_helper]
<4>[  897.417386]  drm_mode_gamma_set_ioctl+0x184/0x1e0 [drm]
<4>[  897.417393]  ? drm_plane_create_color_properties+0x1c0/0x1c0 [drm]
<4>[  897.417398]  drm_ioctl_kernel+0xa7/0xf0 [drm]
<4>[  897.417403]  drm_ioctl+0x30e/0x3c0 [drm]
<4>[  897.417410]  ? drm_plane_create_color_properties+0x1c0/0x1c0 [drm]
<4>[  897.417412]  ? __ia32_sys_epoll_ctl+0x20/0x20
<4>[  897.417414]  ? timerqueue_add+0x52/0x80
<4>[  897.417431]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
<4>[  897.417433]  do_vfs_ioctl+0xa4/0x620
<4>[  897.417436]  ? syscall_slow_exit_work+0x19b/0x1b0
<4>[  897.417437]  ksys_ioctl+0x60/0x90
<4>[  897.417438]  __x64_sys_ioctl+0x16/0x20
<4>[  897.417439]  do_syscall_64+0x5b/0x170
<4>[  897.417441]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
<4>[  897.417442] RIP: 0033:0x7fb937f9f79b
<4>[  897.417443] Code: 0f 1e fa 48 8b 05 c5 b6 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 95 b6 0c 00 f7 d8 64 89 01 48 
<4>[  897.417458] RSP: 002b:00007fff2434a9a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
<4>[  897.417459] RAX: ffffffffffffffda RBX: 000055f563ea7680 RCX: 00007fb937f9f79b
<4>[  897.417459] RDX: 00007fff2434a9e0 RSI: 00000000c02064a5 RDI: 000000000000000d
<4>[  897.417460] RBP: 00007fff2434a9e0 R08: 000055f563ea7d40 R09: 000055f563ea7f40
<4>[  897.417460] R10: 0000000000000100 R11: 0000000000000246 R12: 00000000c02064a5
<4>[  897.417461] R13: 000000000000000d R14: 000055f563ea70b0 R15: 000055f563ea7b40
<4>[  897.417462] ---[ end trace 905d504a4f5a6e74 ]---
<4>[  897.417494] WARNING: CPU: 9 PID: 742 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:3887 prepare_flip_isr+0x5f/0x70 [amdgpu]
<4>[  897.417494] Modules linked in: uas usb_storage amdkfd amd_iommu_v2 amdgpu fuse snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi edac_mce_amd kvm_amd chash gpu_sched ttm snd_hda_intel kvm input_leds drm_kms_helper snd_hda_codec snd_hda_core irqbypass joydev mousedev drm crct10dif_pclmul snd_hwdep crc32_pclmul ghash_clmulni_intel snd_pcm pcbc igb eeepc_wmi snd_timer asus_wmi sparse_keymap snd rfkill aesni_intel led_class agpgart mxm_wmi wmi_bmof aes_x86_64 crypto_simd syscopyarea i2c_algo_bit cryptd sysfillrect sp5100_tco sysimgblt glue_helper ccp k10temp dca fb_sys_fops pcspkr soundcore i2c_piix4 rng_core rtc_cmos evdev gpio_amdpt pinctrl_amd wmi mac_hid pcc_cpufreq acpi_cpufreq vboxnetflt(O) vboxnetadp(O) vboxpci(O) vboxdrv(O) crypto_user ip_tables x_tables ext4 crc32c_generic crc16
<4>[  897.417511]  mbcache jbd2 fscrypto sd_mod hid_generic usbhid hid ahci xhci_pci libahci xhci_hcd crc32c_intel libata usbcore scsi_mod usb_common
<4>[  897.417515] CPU: 9 PID: 742 Comm: Xorg Tainted: G        W  O      4.18.1-arch1-1-ARCH #1
<4>[  897.417516] Hardware name: System manufacturer System Product Name/ROG STRIX X470-F GAMING, BIOS 4018 07/12/2018
<4>[  897.417542] RIP: 0010:prepare_flip_isr+0x5f/0x70 [amdgpu]
<4>[  897.417542] Code: 00 00 02 00 00 00 48 89 97 a8 07 00 00 48 c7 80 20 02 00 00 00 00 00 00 8b 97 d0 04 00 00 bf 02 00 00 00 e9 13 48 b6 ff 0f 0b <0f> 0b eb b9 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 
<4>[  897.417557] RSP: 0018:ffffab5e031efb28 EFLAGS: 00010082
<4>[  897.417557] RAX: 0000000000000001 RBX: 0000000000000286 RCX: 0000000000000001
<4>[  897.417558] RDX: 0000000000000001 RSI: 0000000000000206 RDI: ffff97a9fba13800
<4>[  897.417558] RBP: ffff97a9f65dc500 R08: ffffffff8e4dd530 R09: 000000000000040e
<4>[  897.417559] R10: 0000000000000004 R11: ffff97aa0c657800 R12: 0000000000000000
<4>[  897.417559] R13: ffff97a9fba13800 R14: ffff97a9fba13800 R15: ffff97a99e15d000
<4>[  897.417560] FS:  00007fb9339d6e00(0000) GS:ffff97aa1ee40000(0000) knlGS:0000000000000000
<4>[  897.417561] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  897.417561] CR2: 00005651d103d010 CR3: 00000003feea6000 CR4: 00000000003406e0
<4>[  897.417561] Call Trace:
<4>[  897.417587]  amdgpu_dm_atomic_commit_tail+0x770/0xd70 [amdgpu]
<4>[  897.417592]  commit_tail+0x3d/0x70 [drm_kms_helper]
<4>[  897.417595]  drm_atomic_helper_commit+0x103/0x110 [drm_kms_helper]
<4>[  897.417598]  drm_atomic_helper_legacy_gamma_set+0x136/0x160 [drm_kms_helper]
<4>[  897.417605]  drm_mode_gamma_set_ioctl+0x184/0x1e0 [drm]
<4>[  897.417611]  ? drm_plane_create_color_properties+0x1c0/0x1c0 [drm]
<4>[  897.417616]  drm_ioctl_kernel+0xa7/0xf0 [drm]
<4>[  897.417621]  drm_ioctl+0x30e/0x3c0 [drm]
<4>[  897.417628]  ? drm_plane_create_color_properties+0x1c0/0x1c0 [drm]
<4>[  897.417629]  ? __ia32_sys_epoll_ctl+0x20/0x20
<4>[  897.417630]  ? timerqueue_add+0x52/0x80
<4>[  897.417647]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
<4>[  897.417648]  do_vfs_ioctl+0xa4/0x620
<4>[  897.417650]  ? syscall_slow_exit_work+0x19b/0x1b0
<4>[  897.417651]  ksys_ioctl+0x60/0x90
<4>[  897.417652]  __x64_sys_ioctl+0x16/0x20
<4>[  897.417653]  do_syscall_64+0x5b/0x170
<4>[  897.417654]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
<4>[  897.417655] RIP: 0033:0x7fb937f9f79b
<4>[  897.417655] Code: 0f 1e fa 48 8b 05 c5 b6 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 95 b6 0c 00 f7 d8 64 89 01 48 
<4>[  897.417670] RSP: 002b:00007fff2434a9a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
<4>[  897.417671] RAX: ffffffffffffffda RBX: 000055f563ea7680 RCX: 00007fb937f9f79b
<4>[  897.417672] RDX: 00007fff2434a9e0 RSI: 00000000c02064a5 RDI: 000000000000000d
<4>[  897.417672] RBP: 00007fff2434a9e0 R08: 000055f563ea7d40 R09: 000055f563ea7f40
<4>[  897.417673] R10: 0000000000000100 R11: 0000000000000246 R12: 00000000c02064a5
<4>[  897.417673] R13: 000000000000000d R14: 000055f563ea70b0 R15: 000055f563ea7b40
<4>[  897.417674] ---[ end trace 905d504a4f5a6e75 ]---
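
A dump like the one above is mostly register state and module lists, so it can help to pull out only the lines that open an error or a warning. The sketch below assumes the `<priority>[ seconds] message` layout seen in this capture, where 3 is error level and 4 is warning level. The function names are illustrative only.

```python
import re

# "<priority>[ seconds.micros] message", as in the dump above.
KMSG_LINE = re.compile(r"^<(\d+)>\[\s*(\d+\.\d+)\]\s?(.*)$")


def parse_kmsg(line):
    """Split one /proc/kmsg line into (priority, seconds, message), or None."""
    m = KMSG_LINE.match(line.rstrip("\n"))
    if m is None:
        return None
    return int(m.group(1)), float(m.group(2)), m.group(3)


def headlines(lines):
    """Keep only the lines that open an error or a warning, skipping
    register dumps, module lists and call-trace frames."""
    out = []
    for line in lines:
        parsed = parse_kmsg(line)
        if parsed is None:
            continue
        _, seconds, message = parsed
        if "*ERROR*" in message or message.startswith("WARNING:"):
            out.append((seconds, message))
    return out
```

Run over this capture, `headlines` would keep the two flip_done timeouts, the "already busy" error and the two WARNING lines, which is the part most worth quoting in a bug report.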

Last edited by Lanz (2018-08-19 00:39:05)


#4 2018-08-18 17:36:42

loqs
Member
Registered: 2014-03-06
Posts: 17,436

Re: Ryzen 2700X reproducible gaming crashes in kernel > 4.14.x

Once 4.18.3 makes its way into the core repository, I would try that; the amdgpu module has had a lot of changes in every recent release.


#5 2018-08-18 20:22:34

ewaller
Administrator
From: Pasadena, CA
Registered: 2009-07-13
Posts: 19,804

Re: Ryzen 2700X reproducible gaming crashes in kernel > 4.14.x

Lanz, please edit post #3 and use BBCode code tags around your program output.
The BBCode link is the same one that appears under every message post box in the forums, for your reference.

Thanks.


Nothing is too wonderful to be true, if it be consistent with the laws of nature -- Michael Faraday
Sometimes it is the people no one can imagine anything of who do the things no one can imagine. -- Alan Turing
---
How to Ask Questions the Smart Way

