#1 2020-03-20 20:24:33

Apache14
Member
Registered: 2020-03-20
Posts: 4

Kernel crash while using amdgpu pro OpenCL (5.5.10)

While running Folding@home in OpenCL mode on my Vega 64, I get the crash below when using the AMD-provided OpenCL package (from the AUR).

Reverting to the LTS 5.4.26 kernel fixes everything.

I'm pretty sure this is a kernel bug, but I'm not sure what to try next.

[  265.625351] [drm:amdgpu_ttm_backend_bind [amdgpu]] *ERROR* failed to pin userptr
[  265.625358] ------------[ cut here ]------------
[  265.625359] kernel BUG at mm/slub.c:304!
[  265.625362] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[  265.625365] CPU: 21 PID: 3405 Comm: FahCore_21 Not tainted 5.5.10-arch1-1 #1
[  265.625366] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS MASTER/X570 AORUS MASTER, BIOS F7f 10/25/2019
[  265.625368] RIP: 0010:kfree+0x25a/0x270
[  265.625369] Code: 5d 41 5c 41 5d e9 b6 6f fd ff 4d 89 e9 48 89 d9 48 89 da 48 89 ee 5b 4c 89 e7 5d 41 b8 01 00 00 00 41 5c 41 5d e9 46 f9 ff ff <0f> 0b 48 8b 05 ed 58 18 01 e9 d6 fd ff ff 0f 1f 84 00 00 00 00 00
[  265.625370] RSP: 0018:ffff8f04c260bbe8 EFLAGS: 00010246
[  265.625371] RAX: ffff8cd78a0c65d0 RBX: ffff8cd78a0c65d0 RCX: ffff8cd78a0c65d0
[  265.625372] RDX: 000000000009ce15 RSI: ffff8cd78ed72060 RDI: ffff8cd78a0c65d0
[  265.625372] RBP: fffff37890283180 R08: ffff8cd743fefce0 R09: ffff8cd78bda03d0
[  265.625373] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8cd78cc079c0
[  265.625374] R13: ffffffffc0442cf5 R14: ffff8cd782aa4f50 R15: ffff8cd78a5b4000
[  265.625374] FS:  00007f1063b18f80(0000) GS:ffff8cd78ed40000(0000) knlGS:0000000000000000
[  265.625375] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  265.625376] CR2: 00007f1060009000 CR3: 000000037b5d0000 CR4: 0000000000340ee0
[  265.625376] Call Trace:
[  265.625408]  amdgpu_ttm_tt_unpopulate+0x85/0xc0 [amdgpu]
[  265.625413]  ttm_tt_destroy.part.0+0x49/0x50 [ttm]
[  265.625415]  ttm_bo_cleanup_memtype_use+0x32/0x80 [ttm]
[  265.625417]  ttm_bo_put+0x2b3/0x330 [ttm]
[  265.625448]  amdgpu_bo_unref+0x1a/0x30 [amdgpu]
[  265.625478]  amdgpu_gem_object_free+0x30/0x50 [amdgpu]
[  265.625509]  amdgpu_gem_userptr_ioctl+0x1fc/0x250 [amdgpu]
[  265.625545]  ? amdgpu_gem_create_ioctl+0x250/0x250 [amdgpu]
[  265.625554]  drm_ioctl_kernel+0xb2/0x100 [drm]
[  265.625563]  drm_ioctl+0x209/0x360 [drm]
[  265.625593]  ? amdgpu_gem_create_ioctl+0x250/0x250 [amdgpu]
[  265.625623]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[  265.625625]  do_vfs_ioctl+0x4b7/0x730
[  265.625627]  ? __do_munmap+0x2fe/0x4c0
[  265.625628]  ksys_ioctl+0x5e/0x90
[  265.625630]  __x64_sys_ioctl+0x16/0x20
[  265.625631]  do_syscall_64+0x4e/0x150
[  265.625634]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  265.625635] RIP: 0033:0x7f1063c102eb
[  265.625636] Code: 0f 1e fa 48 8b 05 a5 8b 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 8b 0c 00 f7 d8 64 89 01 48
[  265.625637] RSP: 002b:00007ffe03fff648 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  265.625638] RAX: ffffffffffffffda RBX: 00007ffe03fff6a0 RCX: 00007f1063c102eb
[  265.625639] RDX: 00007ffe03fff6a0 RSI: 00000000c0186451 RDI: 0000000000000015
[  265.625639] RBP: 00000000c0186451 R08: 0000000004b983b8 R09: 00007ffe03fff738
[  265.625640] R10: 000000000420d000 R11: 0000000000000246 R12: 0000000004b983b8
[  265.625640] R13: 0000000000000015 R14: 000000000420d000 R15: 00007ffe03fffa00
[  265.625642] Modules linked in: fuse ccm xt_conntrack xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo nft_counter xt_addrtype br_netfilter bridge nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nf_log_ipv6 nf_log_ipv4 nf_log_common nft_log nft_ct overlay nf_tables_set nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip6_tables nft_compat cmac algif_hash nf_tables 8021q algif_skcipher garp it87 af_alg mrp stp bnep llc nfnetlink hwmon_vid joydev nls_iso8859_1 nls_cp437 vfat fat mousedev input_leds edac_mce_amd snd_hda_codec_realtek snd_hda_codec_generic kvm_amd iwlmvm kvm ledtrig_audio snd_hda_codec_hdmi irqbypass mac80211 snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core libarc4 crct10dif_pclmul crc32_pclmul snd_hwdep ghash_clmulni_intel iwlwifi wmi_bmof mxm_wmi btusb snd_pcm btrtl btbcm btintel raid0 aesni_intel snd_timer crypto_simd bluetooth cryptd cfg80211 md_mod igb glue_helper ccp r8169 snd
[  265.625666]  sp5100_tco ecdh_generic pcspkr i2c_piix4 k10temp ecc rng_core soundcore realtek libphy rfkill dca wmi evdev pinctrl_amd mac_hid acpi_cpufreq sg crypto_user ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 hid_generic usbhid hid sd_mod uas usb_storage ahci libahci libata crc32c_intel xhci_pci scsi_mod xhci_hcd radeon amdgpu gpu_sched i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm agpgart
[  265.625679] ---[ end trace b1122b7bea6616be ]---
[  265.625680] RIP: 0010:kfree+0x25a/0x270
[  265.625681] Code: 5d 41 5c 41 5d e9 b6 6f fd ff 4d 89 e9 48 89 d9 48 89 da 48 89 ee 5b 4c 89 e7 5d 41 b8 01 00 00 00 41 5c 41 5d e9 46 f9 ff ff <0f> 0b 48 8b 05 ed 58 18 01 e9 d6 fd ff ff 0f 1f 84 00 00 00 00 00
[  265.625681] RSP: 0018:ffff8f04c260bbe8 EFLAGS: 00010246
[  265.625682] RAX: ffff8cd78a0c65d0 RBX: ffff8cd78a0c65d0 RCX: ffff8cd78a0c65d0
[  265.625683] RDX: 000000000009ce15 RSI: ffff8cd78ed72060 RDI: ffff8cd78a0c65d0
[  265.625683] RBP: fffff37890283180 R08: ffff8cd743fefce0 R09: ffff8cd78bda03d0
[  265.625684] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8cd78cc079c0
[  265.625684] R13: ffffffffc0442cf5 R14: ffff8cd782aa4f50 R15: ffff8cd78a5b4000
[  265.625685] FS:  00007f1063b18f80(0000) GS:ffff8cd78ed40000(0000) knlGS:0000000000000000
[  265.625686] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  265.625686] CR2: 00007f1060009000 CR3: 000000037b5d0000 CR4: 0000000000340ee0

I'm not sure how to investigate this further.

#2 2020-03-21 22:28:31

thorstenhirsch
Member
Registered: 2005-08-03
Posts: 102

Re: Kernel crash while using amdgpu pro OpenCL (5.5.10)

Same here.

#3 2020-03-24 11:57:07

Apache14
Member
Registered: 2020-03-20
Posts: 4

Re: Kernel crash while using amdgpu pro OpenCL (5.5.10)

I just submitted this patch set:

https://lists.freedesktop.org/archives/ … 60116.html

It fixes the OpenCL issues for me.
