
#1 2020-12-02 08:10:58

andrej.podzimek
Member
From: Zürich, Switzerland
Registered: 2005-04-10
Posts: 115

[SOLVED] eGPU won't initialize with an ASRock x570 Creator motherboard

This is not an ArchLinux-specific problem, but since the machine runs ArchLinux, I'm asking (also) here… The TL;DR is that while my Razer Core X Chroma with an NVidia Quadro P5000 works fine with my Linux laptop (Lenovo Carbon X1 v7), it won't initialize [here's a dmesg output] when connected to my desktop (ASRock x570 Creator). This part of dmesg sums it up:

Nov 30 16:44:33 charon kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 234
Nov 30 16:44:33 charon kernel: nvidia 0000:3d:00.0: enabling device (0000 -> 0003)
Nov 30 16:44:33 charon kernel: NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
                               NVRM: BAR0 is 0M @ 0x0 (PCI:0000:3d:00.0)
Nov 30 16:44:33 charon kernel: NVRM: The system BIOS may have misconfigured your GPU.
Nov 30 16:44:33 charon kernel: nvidia: probe of 0000:3d:00.0 failed with error -1
Nov 30 16:44:33 charon kernel: NVRM: The NVIDIA probe routine failed for 1 device(s).
Nov 30 16:44:33 charon kernel: NVRM: None of the NVIDIA devices were initialized.
Nov 30 16:44:33 charon kernel: nvidia-nvlink: Unregistered the Nvlink Core, major device number 234
More details:
  • Motherboard: ASRock x570 Creator with BIOS 3.10

  • CPU: AMD Ryzen 3950X

  • System: ArchLinux with kernel 5.9.11

  • Devices in PCIe slots (some of which might be causing resource conflicts, so I'm listing them):

    • Standard PCIe GPU: AMD Radeon Pro W5700

    • M2 storage: Two M2 M-key PCIe SSDs/NVMs (i.e., both M-key slots occupied)

    • M2 networking: M2 E-key WiFi card (factory default)

  • Device in the eGPU: NVidia Quadro P5000

  • UEFI settings: Above 64b decoding, IOMMU and SR-IOV all enabled; I can see 64b stuff in lspci -v and dmesg shows one 64b root bus resource:

    Nov 30 15:44:31 archlinux kernel: pci_bus 0000:00: root bus resource [mem 0x2050000000-0x7fffffffff window]
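To put that root bus resource in perspective, here is a minimal Python sketch that computes the size of the window from the dmesg line above. The helper name `window_size_bytes` is made up for illustration. The result only says how large the advertised 64-bit window is, not how much of it is free or how it gets split among bridges.

```python
import re

# Root bus resource line copied from the dmesg excerpt above.
LINE = "pci_bus 0000:00: root bus resource [mem 0x2050000000-0x7fffffffff window]"


def window_size_bytes(line: str) -> int:
    """Return the size in bytes of the first [mem START-END ...] range in a line."""
    match = re.search(r"\[mem 0x([0-9a-f]+)-0x([0-9a-f]+)", line)
    if match is None:
        raise ValueError("no memory range found")
    start, end = (int(group, 16) for group in match.groups())
    return end - start + 1  # both ends of the range are inclusive


size = window_size_bytes(LINE)
print(f"{size / 2**30:.2f} GiB")  # prints "382.75 GiB"
```

So the 64-bit window itself is large; whether a hot-plugged bridge gets a big enough slice of it is a separate question.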
What I've tried so far
  • This trick, of course:

    pci=realloc,assign-busses,hpbussize=0x33

    This got me from “No bus number available for hot-added bridge” to (at least) a detection of the NVidia card (i.e., the state described above), but didn't make the NVidia work. (See the dmesg output above.)

  • Asking on the NVidia forums and also on the Razer forums, but got no response there thus far.

  • Other random hacks; in particular, this crash from 2015 still “works”, after all those years, and instantly freezes my machine. There are other similar ideas, also a guaranteed crash in my case.

I was planning to open a support case with the motherboard's manufacturer, ASRock, but I'm no longer sure whether this is the motherboard's fault or a more general problem with the x570 chipset and PCIe resource management.

Any ideas? What else should I try? Is it perhaps impossible to make the eGPU work due to PCIe resource constraints on this system? But Thunderbolt 3 docks with PCIe buses work just fine with this very same setup. (Tried two of those.) So I was hoping that the eGPU could work as well.

Last edited by andrej.podzimek (2020-12-09 19:26:45)

Offline

#2 2020-12-02 13:35:58

Lone_Wolf
Member
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 11,920

Re: [SOLVED] eGPU won't initialize with an ASRock x570 Creator motherboard

Your motherboard should have two USB Type-C ports on the rear I/O panel.
Additionally, there's a Front Panel Type C USB 3.2 Gen2 Header, so you could have more USB-C ports somewhere else on the case.

Have you tried the eGPU in all USB-C ports?

Please post full dmesg and/or journalctl -b output, also lsusb -tv


Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.


(A works at time B)  && (time C > time B ) ≠  (A works at time C)

Offline

#3 2020-12-02 15:54:28

andrej.podzimek
Member
From: Zürich, Switzerland
Registered: 2005-04-10
Posts: 115

Re: [SOLVED] eGPU won't initialize with an ASRock x570 Creator motherboard

Lone_Wolf wrote:

Your motherboard should have two USB Type-C ports on the rear I/O panel.
Additionally, there's a Front Panel Type C USB 3.2 Gen2 Header, so you could have more USB-C ports somewhere else on the case.

Have you tried the eGPU in all USB-C ports?

This is a Thunderbolt eGPU. It just doesn't work in a USB-only USB-C port. (And even if it did work, 10 Gb/s instead of 40 Gb/s just wouldn't be an option.)

The rear USB-C ports on my motherboard are (indeed) Thunderbolts. I've tried both of them, but it fails the same way.

(Of course there are additional USB-C (non-Thunderbolt) ports as well as USB 3.1 type A ports. But those are unrelated to Thunderbolt and eGPUs.)

Please post full dmesg and/or journalctl -b output

Here's a Pastebin link to `journalctl -k -b | grep -v audit`; posting the full output wouldn't be a good idea security-wise.

also lsusb -tv

`lsusb -tv` is utterly irrelevant; what really matters is `boltctl list`:

 ● Razer Core X Chroma
   ├─ type:          peripheral
   ├─ name:          Core X Chroma
   ├─ vendor:        Razer
   ├─ uuid:          00653854-e510-2701-ffff-ffffffffffff
   ├─ generation:    Thunderbolt 3
   ├─ status:        authorized
   │  ├─ domain:     ce010000-0060-6c0e-03b7-b91c46b12223
   │  ├─ rx speed:   40 Gb/s = 2 lanes * 20 Gb/s
   │  ├─ tx speed:   40 Gb/s = 2 lanes * 20 Gb/s
   │  └─ authflags:  secure
   ├─ authorized:    Wed 02 Dec 2020 03:45:57 PM UTC
   ├─ connected:     Wed 02 Dec 2020 03:45:46 PM UTC
   └─ stored:        Mon 30 Nov 2020 02:18:57 PM UTC
      ├─ policy:     auto
      └─ key:        yes

 ● Razer Core X Chroma #2
   ├─ type:          peripheral
   ├─ name:          Core X Chroma
   ├─ vendor:        Razer
   ├─ uuid:          00306925-e510-2701-ffff-ffffffffffff
   ├─ generation:    Thunderbolt 3
   ├─ status:        authorized
   │  ├─ domain:     ce010000-0060-6c0e-03b7-b91c46b12223
   │  ├─ rx speed:   40 Gb/s = 2 lanes * 20 Gb/s
   │  ├─ tx speed:   40 Gb/s = 2 lanes * 20 Gb/s
   │  └─ authflags:  secure
   ├─ authorized:    Wed 02 Dec 2020 03:46:07 PM UTC
   ├─ connected:     Wed 02 Dec 2020 03:45:46 PM UTC
   └─ stored:        Mon 30 Nov 2020 02:19:08 PM UTC
      ├─ policy:     auto
      └─ key:        yes

The additional USB buses in the Core X Chroma (a hub with an ASIX ethernet adapter) won't appear in `lsusb -tv` at all, because the PCIe resources haven't been initialized:

/:  Bus 12.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 10000M
    ID 1d6b:0003 Linux Foundation 3.0 root hub
/:  Bus 11.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 480M
    ID 1d6b:0002 Linux Foundation 2.0 root hub
/:  Bus 10.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 10000M
    ID 1d6b:0003 Linux Foundation 3.0 root hub
    |__ Port 3: Dev 2, If 0, Class=Hub, Driver=hub/5p, 5000M
        ID 0424:5537 Microchip Technology, Inc. (formerly SMSC) 
    |__ Port 4: Dev 3, If 0, Class=Hub, Driver=hub/5p, 5000M
        ID 0424:5537 Microchip Technology, Inc. (formerly SMSC) 
/:  Bus 09.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 480M
    ID 1d6b:0002 Linux Foundation 2.0 root hub
    |__ Port 1: Dev 2, If 0, Class=Human Interface Device, Driver=usbhid, 12M
        ID 1532:0226 Razer USA, Ltd Huntsman Elite
    |__ Port 1: Dev 2, If 1, Class=Human Interface Device, Driver=usbhid, 12M
        ID 1532:0226 Razer USA, Ltd Huntsman Elite
    |__ Port 1: Dev 2, If 2, Class=Human Interface Device, Driver=usbhid, 12M
        ID 1532:0226 Razer USA, Ltd Huntsman Elite
    |__ Port 3: Dev 3, If 0, Class=Hub, Driver=hub/5p, 480M
        ID 0424:2137 Microchip Technology, Inc. (formerly SMSC) 
        |__ Port 5: Dev 5, If 0, Class=Human Interface Device, Driver=usbhid, 12M
            ID 04d8:0b26 Microchip Technology, Inc. 
    |__ Port 4: Dev 4, If 0, Class=Hub, Driver=hub/5p, 480M
        ID 0424:2137 Microchip Technology, Inc. (formerly SMSC) 
        |__ Port 5: Dev 7, If 0, Class=Human Interface Device, Driver=usbhid, 12M
            ID 04d8:0b26 Microchip Technology, Inc. 
/:  Bus 08.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/1p, 10000M
    ID 1d6b:0003 Linux Foundation 3.0 root hub
/:  Bus 07.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/1p, 480M
    ID 1d6b:0002 Linux Foundation 2.0 root hub
    |__ Port 1: Dev 2, If 0, Class=, Driver=, 12M
        ID 0639:7213 Chrontel, Inc. 
/:  Bus 06.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 10000M
    ID 1d6b:0003 Linux Foundation 3.0 root hub
    |__ Port 1: Dev 2, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
        ID 090c:1000 Silicon Motion, Inc. - Taiwan (formerly Feiya Technology Corp.) Flash Drive
    |__ Port 2: Dev 3, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
        ID 090c:1000 Silicon Motion, Inc. - Taiwan (formerly Feiya Technology Corp.) Flash Drive
/:  Bus 05.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 480M
    ID 1d6b:0002 Linux Foundation 2.0 root hub
    |__ Port 4: Dev 2, If 3, Class=Audio, Driver=snd-usb-audio, 12M
        ID 1532:0518 Razer USA, Ltd Nommo Pro
    |__ Port 4: Dev 2, If 1, Class=Human Interface Device, Driver=, 12M
        ID 1532:0518 Razer USA, Ltd Nommo Pro
    |__ Port 4: Dev 2, If 2, Class=Audio, Driver=snd-usb-audio, 12M
        ID 1532:0518 Razer USA, Ltd Nommo Pro
    |__ Port 4: Dev 2, If 0, Class=Human Interface Device, Driver=usbhid, 12M
        ID 1532:0518 Razer USA, Ltd Nommo Pro
    |__ Port 6: Dev 3, If 0, Class=Wireless, Driver=btusb, 12M
        ID 8087:0029 Intel Corp. AX200 Bluetooth
    |__ Port 6: Dev 3, If 1, Class=Wireless, Driver=btusb, 12M
        ID 8087:0029 Intel Corp. AX200 Bluetooth
/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 10000M
    ID 1d6b:0003 Linux Foundation 3.0 root hub
    |__ Port 3: Dev 4, If 0, Class=Mass Storage, Driver=uas, 5000M
        ID 174c:55aa ASMedia Technology Inc. ASM1051E SATA 6Gb/s bridge, ASM1053E SATA 6Gb/s bridge, ASM1153 SATA 3Gb/s bridge, ASM1153E SATA 6Gb/s bridge
    |__ Port 4: Dev 2, If 0, Class=Hub, Driver=hub/4p, 5000M
        ID 0bda:0411 Realtek Semiconductor Corp. Hub
        |__ Port 4: Dev 3, If 0, Class=Vendor Specific Class, Driver=r8152, 5000M
            ID 0bda:8153 Realtek Semiconductor Corp. RTL8153 Gigabit Ethernet Adapter
/:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 480M
    ID 1d6b:0002 Linux Foundation 2.0 root hub
    |__ Port 2: Dev 7, If 0, Class=Hub, Driver=hub/4p, 480M
        ID 0bda:5411 Realtek Semiconductor Corp. RTS5411 Hub
    |__ Port 3: Dev 2, If 0, Class=Human Interface Device, Driver=usbhid, 12M
        ID 1b1c:0c17 Corsair 
    |__ Port 4: Dev 3, If 0, Class=Hub, Driver=hub/7p, 480M
        ID 1a40:0201 Terminus Technology Inc. FE 2.1 7-port Hub
        |__ Port 3: Dev 4, If 3, Class=Audio, Driver=snd-usb-audio, 480M
            ID 1532:0e03 Razer USA, Ltd Gaming Webcam [Kiyo]
        |__ Port 3: Dev 4, If 1, Class=Video, Driver=uvcvideo, 480M
            ID 1532:0e03 Razer USA, Ltd Gaming Webcam [Kiyo]
        |__ Port 3: Dev 4, If 2, Class=Audio, Driver=snd-usb-audio, 480M
            ID 1532:0e03 Razer USA, Ltd Gaming Webcam [Kiyo]
        |__ Port 3: Dev 4, If 0, Class=Video, Driver=uvcvideo, 480M
            ID 1532:0e03 Razer USA, Ltd Gaming Webcam [Kiyo]
        |__ Port 5: Dev 5, If 0, Class=Human Interface Device, Driver=usbhid, 12M
            ID 1b1c:0c1a Corsair 
        |__ Port 6: Dev 6, If 0, Class=Human Interface Device, Driver=usbhid, 12M
            ID 1532:0060 Razer USA, Ltd RZ01-0213 Gaming Mouse [Lancehead Tournament Edition]
        |__ Port 6: Dev 6, If 1, Class=Human Interface Device, Driver=usbhid, 12M
            ID 1532:0060 Razer USA, Ltd RZ01-0213 Gaming Mouse [Lancehead Tournament Edition]
        |__ Port 6: Dev 6, If 2, Class=Human Interface Device, Driver=usbhid, 12M
            ID 1532:0060 Razer USA, Ltd RZ01-0213 Gaming Mouse [Lancehead Tournament Edition]
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 10000M
    ID 1d6b:0003 Linux Foundation 3.0 root hub
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 480M
    ID 1d6b:0002 Linux Foundation 2.0 root hub

(On the laptop where the eGPU works fine, both the PCIe side of things (which includes the NVidia) and all the USB ports and devices get recognized. This is, sadly, not the case on the ASRock x570 Creator; the x570 has a strange PCIe resource management glitch.)

Last edited by andrej.podzimek (2020-12-02 16:11:52)

Offline

#4 2020-12-04 14:56:44

Lone_Wolf
Member
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 11,920

Re: [SOLVED] eGPU won't initialize with an ASRock x570 Creator motherboard

ASRock's motherboard documentation didn't make clear which ports were the Thunderbolt ones; other sources did state that Thunderbolt 3 uses the USB-C connector.
That's why I mentioned all USB-C ports.

Nov 30 15:45:10 charon kernel: pci 0000:07:00.0: BAR 15: assigned [mem 0xb0000000-0xb01fffff 64bit pref]
Nov 30 15:45:10 charon kernel: pci 0000:07:00.0: BAR 14: assigned [mem 0xd8000000-0xd81fffff]
Nov 30 15:45:10 charon kernel: pci 0000:07:00.0: BAR 13: assigned [io  0xd000-0xdfff]
Nov 30 15:45:10 charon kernel: pci 0000:08:01.0: BAR 15: no space for [mem size 0x18000000 64bit pref]
Nov 30 15:45:10 charon kernel: pci 0000:08:01.0: BAR 15: failed to assign [mem size 0x18000000 64bit pref]
Nov 30 15:45:10 charon kernel: pci 0000:08:01.0: BAR 14: no space for [mem size 0x01800000]
Nov 30 15:45:10 charon kernel: pci 0000:08:01.0: BAR 14: failed to assign [mem size 0x01800000]
Nov 30 15:45:10 charon kernel: pci 0000:08:04.0: BAR 14: assigned [mem 0xd8000000-0xd81fffff]
Nov 30 15:45:10 charon kernel: pci 0000:08:04.0: BAR 15: assigned [mem 0xb0000000-0xb01fffff 64bit pref]
Nov 30 15:45:10 charon kernel: pci 0000:08:01.0: BAR 13: assigned [io  0xd000-0xdfff]
Nov 30 15:45:10 charon kernel: pci 0000:08:04.0: BAR 13: no space for [io  size 0x1000]
Nov 30 15:45:10 charon kernel: pci 0000:08:04.0: BAR 13: failed to assign [io  size 0x1000]
Nov 30 15:45:10 charon kernel: pci 0000:08:01.0: BAR 15: no space for [mem size 0x18000000 64bit pref]
Nov 30 15:45:10 charon kernel: pci 0000:08:01.0: BAR 15: failed to assign [mem size 0x18000000 64bit pref]
Nov 30 15:45:10 charon kernel: pci 0000:08:01.0: BAR 14: no space for [mem size 0x01800000]
Nov 30 15:45:10 charon kernel: pci 0000:08:01.0: BAR 14: failed to assign [mem size 0x01800000]
Nov 30 15:45:10 charon kernel: pci 0000:08:04.0: BAR 14: assigned [mem 0xd8000000-0xd80fffff]
Nov 30 15:45:10 charon kernel: pci 0000:08:04.0: BAR 15: assigned [mem 0xb0000000-0xb00fffff 64bit pref]
Nov 30 15:45:10 charon kernel: pci 0000:08:01.0: BAR 13: assigned [io  0xd000-0xdfff]
Nov 30 15:45:10 charon kernel: pci 0000:08:04.0: BAR 13: no space for [io  size 0x1000]
Nov 30 15:45:10 charon kernel: pci 0000:08:04.0: BAR 13: failed to assign [io  size 0x1000]
Nov 30 15:45:10 charon kernel: pci 0000:08:04.0: BAR 14: reassigned [mem 0xd8000000-0xd81fffff] (expanded by 0x100000)
Nov 30 15:45:10 charon kernel: pci 0000:08:04.0: BAR 15: reassigned [mem 0xb0000000-0xb01fffff 64bit pref] (expanded by 0x100000)
Nov 30 15:45:10 charon kernel: pci 0000:09:00.0: BAR 1: no space for [mem size 0x10000000 64bit pref]
Nov 30 15:45:10 charon kernel: pci 0000:09:00.0: BAR 1: failed to assign [mem size 0x10000000 64bit pref]
Nov 30 15:45:10 charon kernel: pci 0000:09:00.0: BAR 3: no space for [mem size 0x02000000 64bit pref]
Nov 30 15:45:10 charon kernel: pci 0000:09:00.0: BAR 3: failed to assign [mem size 0x02000000 64bit pref]
Nov 30 15:45:10 charon kernel: pci 0000:09:00.0: BAR 0: no space for [mem size 0x01000000]
Nov 30 15:45:10 charon kernel: pci 0000:09:00.0: BAR 0: failed to assign [mem size 0x01000000]
Nov 30 15:45:10 charon kernel: pci 0000:09:00.0: BAR 6: no space for [mem size 0x00080000 pref]
Nov 30 15:45:10 charon kernel: pci 0000:09:00.0: BAR 6: failed to assign [mem size 0x00080000 pref]
Nov 30 15:45:10 charon kernel: pci 0000:09:00.1: BAR 0: no space for [mem size 0x00004000]
Nov 30 15:45:10 charon kernel: pci 0000:09:00.1: BAR 0: failed to assign [mem size 0x00004000]
Nov 30 15:45:10 charon kernel: pci 0000:09:00.0: BAR 5: assigned [io  0xd000-0xd07f]

Those messages occur multiple times in the log.
There are also messages about adding and removing stuff from IOMMU groups, and about not being able to ioremap a snd_hda_intel device.
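Since the same failures repeat, it can help to collapse them into one entry per device and BAR. The following is a rough Python sketch, not a polished tool: `failed_mem_bars` and `SAMPLE` are names invented here, the sample lines are a subset of the excerpt above with the timestamp prefix dropped, and the regular expression assumes the kernel keeps this exact message wording.

```python
import re

# A few "failed to assign" lines from the excerpt above (one is a repeat).
SAMPLE = """\
pci 0000:08:01.0: BAR 15: failed to assign [mem size 0x18000000 64bit pref]
pci 0000:08:01.0: BAR 14: failed to assign [mem size 0x01800000]
pci 0000:08:01.0: BAR 15: failed to assign [mem size 0x18000000 64bit pref]
pci 0000:09:00.0: BAR 1: failed to assign [mem size 0x10000000 64bit pref]
pci 0000:09:00.0: BAR 3: failed to assign [mem size 0x02000000 64bit pref]
pci 0000:09:00.0: BAR 0: failed to assign [mem size 0x01000000]
"""

PATTERN = re.compile(
    r"pci (?P<dev>[0-9a-f:.]+): BAR (?P<bar>\d+): failed to assign "
    r"\[mem size 0x(?P<size>[0-9a-f]+)(?P<flags>[^\]]*)\]"
)


def failed_mem_bars(log: str) -> dict:
    """Map (device, BAR) -> (size in bytes, is_prefetchable); repeats collapse."""
    result = {}
    for m in PATTERN.finditer(log):
        key = (m["dev"], int(m["bar"]))
        result[key] = (int(m["size"], 16), "pref" in m["flags"])
    return result


for (dev, bar), (size, pref) in sorted(failed_mem_bars(SAMPLE).items()):
    kind = "prefetchable" if pref else "non-prefetchable"
    print(f"{dev} BAR {bar}: {size // 2**20} MiB {kind}")
```

Because `finditer` searches within each line, the same function should also work on full journal lines that carry the `Nov 30 ... kernel:` prefix. I/O port failures (`[io  size ...]`) are deliberately ignored here.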

Please boot without the eGPU present and post the log to verify whether the eGPU is related to the cause or an innocent victim.



Offline

#5 2020-12-04 16:12:24

andrej.podzimek
Member
From: Zürich, Switzerland
Registered: 2005-04-10
Posts: 115

Re: [SOLVED] eGPU won't initialize with an ASRock x570 Creator motherboard

Lone_Wolf wrote:

Those messages occur multiple times in the log.
There are also messages about adding and removing stuff from IOMMU groups, and about not being able to ioremap a snd_hda_intel device.

Please boot without the eGPU present and post the log to verify whether the eGPU is related to the cause or an innocent victim.

Here's a dmesg without the Core X Chroma eGPU.

This^^^ is still with IOMMU and SR-IOV on in the UEFI setup. I also tried to boot without those options, to see if the eGPU situation would improve, but nope, the outcome was pretty much the same. And as before, a Lenovo Thunderbolt dock worked perfectly fine while the eGPU just didn't work (despite appearing in boltctl list).

Offline

#6 2020-12-04 17:20:22

Lone_Wolf
Member
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 11,920

Re: [SOLVED] eGPU won't initialize with an ASRock x570 Creator motherboard

No sign of those error messages, so the eGPU is related to the cause.

You are using amd-ucode, but ASRock has released a newer firmware version, 3.13 [1]. Could you try with that one?

The kernel could also be related. Have you tried with stock linux, linux-lts or linux-mainline [2]?


[1] https://www.asrock.com/MB/AMD/X570%20Cr … x.asp#BIOS
[2] https://aur.archlinux.org/packages/linux-mainline/



Offline

#7 2020-12-09 19:26:02

andrej.podzimek
Member
From: Zürich, Switzerland
Registered: 2005-04-10
Posts: 115

Re: [SOLVED] eGPU won't initialize with an ASRock x570 Creator motherboard

Alright, I've figured it out, based on this post. The magic is:

pcie_ports=native pci=assign-busses,hpbussize=0x33,realloc,hpmmiosize=128M,hpmmioprefsize=16G

With this^^^ on the kernel command line, I can just plug in the eGPU and it works, no problem at all. The nvidia kernel module loads correctly and I'm calculating Folding@Home on the eGPU right now, so it definitely works.

(My machine won't boot if I add the recommended nocrs to pci=..., because the kernel can't talk to SATA controllers and drives in that mode and freezes forever while trying to do so. But the eGPU works without nocrs just fine, so I'm not messing with that any further.)
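For anyone who wants to double-check that the options actually reached the running kernel, here is a small Python sketch. The function names and the `WANTED` list are just illustrative; the only system-specific assumption is that `/proc/cmdline` holds the booted command line, which is standard on Linux.

```python
def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into {parameter: [comma-separated values]}."""
    options = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")  # split at the first '=' only
        values = value.split(",") if value else []
        options.setdefault(key, []).extend(values)  # tolerate repeated parameters
    return options


def missing_pci_options(cmdline: str, wanted: list) -> list:
    """Return the entries of `wanted` that are absent from the pci= parameter."""
    present = parse_cmdline(cmdline).get("pci", [])
    return [opt for opt in wanted if opt not in present]


# The pci= sub-options from the command line quoted above.
WANTED = ["assign-busses", "hpbussize=0x33", "realloc",
          "hpmmiosize=128M", "hpmmioprefsize=16G"]
EXAMPLE = ("pcie_ports=native "
           "pci=assign-busses,hpbussize=0x33,realloc,hpmmiosize=128M,hpmmioprefsize=16G")

print(missing_pci_options(EXAMPLE, WANTED))  # prints []
# On a live system: missing_pci_options(open("/proc/cmdline").read(), WANTED)
```

This only confirms that the parameters were passed, not that the kernel honoured them; dmesg remains the place to see the resulting bridge windows.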

Offline
