#1 2021-10-24 22:33:13

helloworld1
Member
Registered: 2010-12-26
Posts: 72

[SOLVED] AMD 5700G iGPU OpenCL keeps freezing and crashing

I am unable to get OpenCL working reliably for darktable or DaVinci Resolve on linux 5.14.14 (with linux-firmware) or linux-mainline 5.15.0.rc6 (with linux-firmware-git), using opencl-amd 21.30.1290604-1. I cannot use opencl-mesa since it supports none of the programs I need to run.
On both the stable and mainline kernels, clinfo and "darktable-cltest" only work some of the time. Then it crashes, resets the GPU, and freezes.

[   51.181768] ------------[ cut here ]------------
[   51.181771] WARNING: CPU: 9 PID: 235 at drivers/gpu/drm/ttm/ttm_bo.c:409 ttm_bo_release+0x2d1/0x300 [ttm]
[   51.181782] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_seq_device cfg80211 uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc mousedev joydev uas usb_storage intel_rapl_msr nct6775 intel_rapl_common hwmon_vid amdgpu snd_hda_codec_realtek edac_mce_amd snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi eeepc_wmi kvm_amd snd_hda_intel gpu_sched asus_wmi nls_iso8859_1 i2c_algo_bit sparse_keymap snd_intel_dspcfg platform_profile drm_ttm_helper snd_intel_sdw_acpi vfat rfkill kvm snd_hda_codec usbhid video wmi_bmof ttm fat irqbypass snd_hda_core crct10dif_pclmul drm_kms_helper crc32_pclmul ghash_clmulni_intel snd_hwdep aesni_intel snd_pcm cec crypto_simd agpgart snd_timer cryptd syscopyarea snd sp5100_tco sysfillrect rapl sysimgblt pcspkr k10temp i2c_piix4 ccp fb_sys_fops soundcore igc wmi tpm_crb mac_hid tpm_tis tpm_tis_core tpm gpio_amdpt pinctrl_amd gpio_generic rng_core acpi_cpufreq sg crypto_user drm fuse bpf_preload ip_tables
[   51.181817]  x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq xhci_pci crc32c_intel xhci_pci_renesas
[   51.181821] CPU: 9 PID: 235 Comm: kworker/9:2 Not tainted 5.15.0-rc6-1-mainline #1 2eb2dce07dbd87701c12affdd03a7e57c707456d
[   51.181823] Hardware name: ASUS System Product Name/ROG STRIX B550-I GAMING, BIOS 2423 08/11/2021
[   51.181824] Workqueue: kfd_process_wq kfd_process_wq_release [amdgpu]
[   51.181905] RIP: 0010:ttm_bo_release+0x2d1/0x300 [ttm]
[   51.181909] Code: 8d b6 b8 fe ff ff e8 7e 12 9a ff 49 8b 76 08 48 89 ef e8 b2 24 00 00 49 8b 7e 98 e9 70 fd ff ff e8 b4 b3 6e d5 e9 aa fd ff ff <0f> 0b e9 50 fd ff ff e8 b3 b1 6e d5 e9 f8 fe ff ff be 03 00 00 00
[   51.181910] RSP: 0018:ffffb02a80d67cc0 EFLAGS: 00010202
[   51.181911] RAX: 0000000000000001 RBX: ffffb02a80d67d08 RCX: 000000008040002d
[   51.181912] RDX: 0000000000000001 RSI: 000000008040002d RDI: ffff9f5ae5001db8
[   51.181913] RBP: ffff9f5ae6b05270 R08: ffff9f5ae5001db8 R09: 0000000000000000
[   51.181913] R10: 0000000000000001 R11: 0000000000000000 R12: ffff9f5b1bb35e30
[   51.181914] R13: ffff9f5ae5001c58 R14: ffff9f5ae5001db8 R15: dead000000000100
[   51.181914] FS:  0000000000000000(0000) GS:ffff9f65bde40000(0000) knlGS:0000000000000000
[   51.181915] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   51.181916] CR2: 000055a15fff4d38 CR3: 00000002bee10000 CR4: 0000000000750ee0
[   51.181916] PKRU: 55555554
[   51.181917] Call Trace:
[   51.181921]  amdgpu_bo_unref+0x1a/0x30 [amdgpu 15265334394386c2d975b46dc2248cfec063d665]
[   51.181977]  amdgpu_gem_object_free+0x30/0x50 [amdgpu 15265334394386c2d975b46dc2248cfec063d665]
[   51.182030]  amdgpu_amdkfd_gpuvm_free_memory_of_gpu+0x359/0x3c0 [amdgpu 15265334394386c2d975b46dc2248cfec063d665]
[   51.182098]  kfd_process_device_free_bos+0x9f/0xf0 [amdgpu 15265334394386c2d975b46dc2248cfec063d665]
[   51.182158]  kfd_process_wq_release+0x20d/0x2e0 [amdgpu 15265334394386c2d975b46dc2248cfec063d665]
[   51.182215]  process_one_work+0x1e8/0x3c0
[   51.182219]  worker_thread+0x50/0x3b0
[   51.182220]  ? process_one_work+0x3c0/0x3c0
[   51.182221]  kthread+0x132/0x160
[   51.182223]  ? set_kthread_struct+0x40/0x40
[   51.182223]  ret_from_fork+0x22/0x30
[   51.182226] ---[ end trace c1ead71d4c485365 ]---
[   62.360928] amdgpu: qcm fence wait loop timeout expired
[   62.360933] amdgpu: The cp might be in an unrecoverable state due to an unsuccessful queues preemption
[   62.360940] amdgpu 0000:07:00.0: amdgpu: GPU reset begin!
[   62.360935] amdgpu: Failed to evict process queues
[   62.360958] amdgpu: Failed to quiesce KFD
[   62.391040] [drm] free PSP TMR buffer
[   62.419108] amdgpu 0000:07:00.0: amdgpu: MODE2 reset
[   62.419684] amdgpu 0000:07:00.0: amdgpu: GPU reset succeeded, trying to resume
[   62.419799] [drm] PCIE GART of 1024M enabled.
[   62.419801] [drm] PTB located at 0x000000F400900000
[   62.420039] [drm] PSP is resuming...
[   62.440067] [drm] reserve 0x400000 from 0xf7ff800000 for PSP TMR
[   62.519663] amdgpu 0000:07:00.0: amdgpu: RAS: optional ras ta ucode is not available
[   62.527921] amdgpu 0000:07:00.0: amdgpu: RAP: optional rap ta ucode is not available
[   62.527923] amdgpu 0000:07:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[   62.527925] amdgpu 0000:07:00.0: amdgpu: SMU is resuming...
[   62.528936] amdgpu 0000:07:00.0: amdgpu: SMU is resumed successfully!
[   62.718941] [drm] kiq ring mec 2 pipe 1 q 0
[   62.720190] [drm] DMUB hardware initialized: version=0x01010019
[   62.779619] [drm] VCN decode and encode initialized successfully(under DPG Mode).
[   62.779665] [drm] JPEG decode initialized successfully.
[   62.779667] amdgpu 0000:07:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0
[   62.779669] amdgpu 0000:07:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[   62.779670] amdgpu 0000:07:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[   62.779671] amdgpu 0000:07:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[   62.779672] amdgpu 0000:07:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[   62.779672] amdgpu 0000:07:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[   62.779673] amdgpu 0000:07:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[   62.779674] amdgpu 0000:07:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[   62.779674] amdgpu 0000:07:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[   62.779675] amdgpu 0000:07:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[   62.779676] amdgpu 0000:07:00.0: amdgpu: ring sdma0 uses VM inv eng 0 on hub 1
[   62.779677] amdgpu 0000:07:00.0: amdgpu: ring vcn_dec uses VM inv eng 1 on hub 1
[   62.779677] amdgpu 0000:07:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 4 on hub 1
[   62.779678] amdgpu 0000:07:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 5 on hub 1
[   62.779678] amdgpu 0000:07:00.0: amdgpu: ring jpeg_dec uses VM inv eng 6 on hub 1
[   62.782386] amdgpu 0000:07:00.0: amdgpu: recover vram bo from shadow start
[   62.782388] amdgpu 0000:07:00.0: amdgpu: recover vram bo from shadow done
[   62.782398] amdgpu 0000:07:00.0: amdgpu: GPU reset(1) succeeded!

The hardware is an Asus Strix B550 board and a 5700G with 64G of RAM and 16G of graphics RAM. OpenCL works fine on Windows. I disabled the RAM XMP profile but still get crashes and freezes.
Sometimes OpenCL programs work for a short while. hashcat -b shows that performance on 5.15 is 10x that of the 5.14 kernel, but it crashes within a minute or so.
Has anybody encountered the same problem and found a resolution? Thank you!
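
For anyone comparing logs, here is a rough Python sketch that picks the relevant lines out of saved dmesg text. The patterns are copied from the output above; the function name scan_dmesg and the signature labels are just names made up for this example, and other kernel versions may word these messages differently.

```python
import re

# Patterns taken from the dmesg output in this thread.
# The dictionary keys are illustrative labels, not kernel terminology.
SIGNATURES = {
    "ttm_bo_release_warning": re.compile(r"WARNING: .*ttm_bo_release"),
    "qcm_fence_timeout": re.compile(r"qcm fence wait loop timeout expired"),
    "gpu_reset_begin": re.compile(r"amdgpu: GPU reset begin!"),
    "gpu_reset_succeeded": re.compile(r"amdgpu: GPU reset\(\d+\) succeeded!"),
}

# dmesg prefixes each line with "[ seconds.microseconds]".
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def scan_dmesg(text):
    """Return (timestamp_seconds, label) pairs for lines matching a signature.

    The timestamp is None when a line has no dmesg-style prefix.
    """
    hits = []
    for line in text.splitlines():
        match = TIMESTAMP.match(line)
        stamp = float(match.group(1)) if match else None
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits.append((stamp, label))
    return hits
```

Save the dmesg output to a file, read it in, and pass the text to scan_dmesg. An empty list means none of these particular messages showed up, which does not by itself prove the GPU is healthy.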

Last edited by helloworld1 (2021-11-08 04:47:25)

Offline

#2 2021-10-25 01:00:12

redshoe
Member
Registered: 2015-12-16
Posts: 212

Re: [SOLVED] AMD 5700G iGPU OpenCL keeps freezing and crashing

I (AMD Ryzen 4705U) had a similar problem. Try using an older version of the AMDGPU-PRO driver, preferably version 20.40 or 20.45.

Last edited by redshoe (2021-10-25 01:00:36)

Offline

#3 2021-10-25 03:30:00

helloworld1
Member
Registered: 2010-12-26
Posts: 72

Re: [SOLVED] AMD 5700G iGPU OpenCL keeps freezing and crashing

Unfortunately, only the 21.x AMDGPU-PRO releases seem to support the new Zen 3 APUs. I have tried all of the 21.20 and 21.30 releases, but they have the same issues. Using 20.30 or 20.45 results in a core dump when running clinfo.

Offline

#4 2021-10-25 03:43:44

redshoe
Member
Registered: 2015-12-16
Posts: 212

Re: [SOLVED] AMD 5700G iGPU OpenCL keeps freezing and crashing

I would try a driver newer than 20.30 or 20.45, but older than 21.20 or 21.30. For mine, it seems that 20.40 is the version that works.

Last edited by redshoe (2021-10-25 03:51:00)

Offline

#5 2021-10-25 04:35:27

helloworld1
Member
Registered: 2010-12-26
Posts: 72

Re: [SOLVED] AMD 5700G iGPU OpenCL keeps freezing and crashing

Thanks for the suggestion. I also tried 20.50 and 21.10. 20.50 core dumps and 21.10 has the same freezing issue.
I filed a kernel bug at https://bugzilla.kernel.org/show_bug.cgi?id=214807 but I doubt it will gain any traction.

Offline

#6 2021-10-25 12:43:42

redshoe
Member
Registered: 2015-12-16
Posts: 212

Re: [SOLVED] AMD 5700G iGPU OpenCL keeps freezing and crashing

One thing you could do is downgrade the kernel to 5.8 and see what happens. The last resort is to just wait it out until they have a working driver. I guess that APU is so new that they don't have any solid support for it yet.

Offline

#7 2021-10-25 21:06:16

helloworld1
Member
Registered: 2010-12-26
Posts: 72

Re: [SOLVED] AMD 5700G iGPU OpenCL keeps freezing and crashing

I tried the LTS kernel, but it makes no difference. The GPU is essentially the same as the ones in the 4000 series APUs. I am quite frustrated with AMD's driver support. It feels like the only option here is to wait for either the kernel or AMD's OpenCL driver to fix it.

Last edited by helloworld1 (2021-10-25 21:07:24)

Offline

#8 2021-10-25 21:08:01

helloworld1
Member
Registered: 2010-12-26
Posts: 72

Re: [SOLVED] AMD 5700G iGPU OpenCL keeps freezing and crashing

BTW, I have also tried ROCm 4.3.1 but it also freezes the same way.

Offline

#9 2021-10-26 00:30:48

redshoe
Member
Registered: 2015-12-16
Posts: 212

Re: [SOLVED] AMD 5700G iGPU OpenCL keeps freezing and crashing

Yeah man. AMD does make great hardware, but they are not really good at software support.

Offline

#10 2021-11-08 04:46:22

helloworld1
Member
Registered: 2010-12-26
Posts: 72

Re: [SOLVED] AMD 5700G iGPU OpenCL keeps freezing and crashing

Unfortunately I just got rid of the 5700G. I bought it for the iGPU but I cannot use it for OpenCL. Marking the thread as solved. The recommendation is to buy a supported GPU or wait forever.

Offline

#11 2021-11-11 02:32:35

ryuanlu
Member
Registered: 2021-11-11
Posts: 1

Re: [SOLVED] AMD 5700G iGPU OpenCL keeps freezing and crashing

I had a similar problem with an AMD Ryzen 5700G (Vega 8) + RX6600XT + GIGABYTE B550I AORUS AX PRO. My LCD monitor is connected to the RX6600XT. The iGPU is only used for OpenCL and VA-API.
OpenCL works well on the RX6600XT but not on the iGPU. It freezes and shows ttm_bo_release call traces in dmesg.

Recently I found that if I connect my LCD to the iGPU once, OpenCL works without freezing or call traces.

Offline

#12 2021-11-11 19:07:50

helloworld1
Member
Registered: 2010-12-26
Posts: 72

Re: [SOLVED] AMD 5700G iGPU OpenCL keeps freezing and crashing

Thanks. I tried connecting a monitor to the 5700G and it still manifests the same issue.
But if I don't connect a monitor, OpenCL on the iGPU is actually pretty stable without crashing, though it performs badly.

In the end it's sad that I still need a dGPU in order to use the iGPU.
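
Since the behavior seems to depend on whether a monitor is attached to the iGPU, here is a small Python sketch for checking what the kernel reports for each display connector. It assumes the usual DRM sysfs layout, where each connector directory under /sys/class/drm (for example card0-HDMI-A-1) contains a status file reading "connected" or "disconnected". The function name connector_status is made up for this example, and which cardN belongs to the iGPU differs between systems, so check that separately.

```python
from pathlib import Path


def connector_status(drm_root="/sys/class/drm"):
    """Map each DRM connector directory name to the contents of its status file.

    drm_root is a parameter only so the function can be pointed at another
    directory; on a normal system the default should be the right place.
    """
    result = {}
    for status_file in sorted(Path(drm_root).glob("card*-*/status")):
        # The parent directory name identifies the card and connector,
        # e.g. "card0-HDMI-A-1".
        result[status_file.parent.name] = status_file.read_text().strip()
    return result
```

Calling connector_status() with no arguments returns a dictionary of connector names and their states, which makes it easy to note the monitor setup alongside each OpenCL test run.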

Offline
