#1 2017-02-23 09:08:40

kaymio
Member
Registered: 2017-01-07
Posts: 27

[SOLVED] AMDGPU IB ring test failed / RX480 constant at 80°C

Since yesterday's update, which I believe took the kernel from 4.9.8 to 4.9.11, the following messages are printed at startup for my RX480. Note especially the IB test *ERROR*s.
AMDGPU-PRO is not installed. AMDGPU is handled by mesa 17.0.0-1.

Feb 23 09:24:57 cryptArch kernel: [drm] Initialized
Feb 23 09:24:57 cryptArch kernel: [drm] amdgpu kernel modesetting enabled.
Feb 23 09:24:57 cryptArch kernel: fb: switching to amdgpudrmfb from EFI VGA
Feb 23 09:24:57 cryptArch kernel: [drm] initializing kernel modesetting (POLARIS10 0x1002:0x67DF 0x148C:0x2372 0xC7).
Feb 23 09:24:57 cryptArch kernel: [drm] register mmio base: 0xFEA00000
Feb 23 09:24:57 cryptArch kernel: [drm] register mmio size: 262144
Feb 23 09:24:57 cryptArch kernel: [drm] doorbell mmio base: 0xD0000000
Feb 23 09:24:57 cryptArch kernel: [drm] doorbell mmio size: 2097152
Feb 23 09:24:57 cryptArch kernel: [drm] probing gen 2 caps for device 1002:5a16 = 31cc82/0
Feb 23 09:24:57 cryptArch kernel: [drm] probing mlw for device 1002:5a16 = 31cc82
Feb 23 09:24:57 cryptArch kernel: [drm] UVD is enabled in VM mode
Feb 23 09:24:57 cryptArch kernel: [drm] VCE enabled in VM mode
Feb 23 09:24:57 cryptArch kernel: [drm] GPU post is not needed
Feb 23 09:24:57 cryptArch kernel: [drm] Detected VRAM RAM=8192M, BAR=256M
Feb 23 09:24:57 cryptArch kernel: [drm] RAM width 256bits GDDR5
Feb 23 09:24:57 cryptArch kernel: [drm] amdgpu: 8192M of VRAM memory ready
Feb 23 09:24:57 cryptArch kernel: [drm] amdgpu: 8007M of GTT memory ready.
Feb 23 09:24:57 cryptArch kernel: [drm] GART: num cpu pages 2050016, num gpu pages 2050016
Feb 23 09:24:57 cryptArch kernel: [drm] PCIE GART of 8007M enabled (table at 0x0000000000040000).
Feb 23 09:24:57 cryptArch kernel: [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
Feb 23 09:24:57 cryptArch kernel: [drm] Driver supports precise vblank timestamp query.
Feb 23 09:24:57 cryptArch kernel: [drm] amdgpu: irq initialized.
Feb 23 09:24:57 cryptArch kernel: [drm] AMDGPU Display Connectors
Feb 23 09:24:57 cryptArch kernel: [drm] Connector 0:
Feb 23 09:24:57 cryptArch kernel: [drm]   DP-1
Feb 23 09:24:57 cryptArch kernel: [drm]   HPD6
Feb 23 09:24:57 cryptArch kernel: [drm]   DDC: 0x4868 0x4868 0x4869 0x4869 0x486a 0x486a 0x486b 0x486b
Feb 23 09:24:57 cryptArch kernel: [drm]   Encoders:
Feb 23 09:24:57 cryptArch kernel: [drm]     DFP1: INTERNAL_UNIPHY2
Feb 23 09:24:57 cryptArch kernel: [drm] Connector 1:
Feb 23 09:24:57 cryptArch kernel: [drm]   DP-2
Feb 23 09:24:57 cryptArch kernel: [drm]   HPD4
Feb 23 09:24:57 cryptArch kernel: [drm]   DDC: 0x4870 0x4870 0x4871 0x4871 0x4872 0x4872 0x4873 0x4873
Feb 23 09:24:57 cryptArch kernel: [drm]   Encoders:
Feb 23 09:24:57 cryptArch kernel: [drm]     DFP2: INTERNAL_UNIPHY2
Feb 23 09:24:57 cryptArch kernel: [drm] Connector 2:
Feb 23 09:24:57 cryptArch kernel: [drm]   DP-3
Feb 23 09:24:57 cryptArch kernel: [drm]   HPD1
Feb 23 09:24:57 cryptArch kernel: [drm]   DDC: 0x486c 0x486c 0x486d 0x486d 0x486e 0x486e 0x486f 0x486f
Feb 23 09:24:57 cryptArch kernel: [drm]   Encoders:
Feb 23 09:24:57 cryptArch kernel: [drm]     DFP3: INTERNAL_UNIPHY1
Feb 23 09:24:57 cryptArch kernel: [drm] Connector 3:
Feb 23 09:24:57 cryptArch kernel: [drm]   HDMI-A-1
Feb 23 09:24:57 cryptArch kernel: [drm]   HPD5
Feb 23 09:24:57 cryptArch kernel: [drm]   DDC: 0x4874 0x4874 0x4875 0x4875 0x4876 0x4876 0x4877 0x4877
Feb 23 09:24:57 cryptArch kernel: [drm]   Encoders:
Feb 23 09:24:57 cryptArch kernel: [drm]     DFP4: INTERNAL_UNIPHY1
Feb 23 09:24:57 cryptArch kernel: [drm] Connector 4:
Feb 23 09:24:57 cryptArch kernel: [drm]   DVI-D-1
Feb 23 09:24:57 cryptArch kernel: [drm]   HPD3
Feb 23 09:24:57 cryptArch kernel: [drm]   DDC: 0x487c 0x487c 0x487d 0x487d 0x487e 0x487e 0x487f 0x487f
Feb 23 09:24:57 cryptArch kernel: [drm]   Encoders:
Feb 23 09:24:57 cryptArch kernel: [drm]     DFP5: INTERNAL_UNIPHY
Feb 23 09:24:57 cryptArch kernel: [drm] Found UVD firmware Version: 1.79 Family ID: 16
Feb 23 09:24:57 cryptArch kernel: [drm] Found VCE firmware Version: 52.4 Binary ID: 3
Feb 23 09:24:58 cryptArch kernel: [drm] ring test on 0 succeeded in 10 usecs
Feb 23 09:24:58 cryptArch kernel: [drm] ring test on 1 succeeded in 25 usecs
Feb 23 09:24:58 cryptArch kernel: [drm] ring test on 2 succeeded in 17 usecs
Feb 23 09:24:58 cryptArch kernel: [drm] ring test on 3 succeeded in 3 usecs
Feb 23 09:24:58 cryptArch kernel: [drm] ring test on 4 succeeded in 2 usecs
Feb 23 09:24:58 cryptArch kernel: [drm] ring test on 5 succeeded in 2 usecs
Feb 23 09:24:58 cryptArch kernel: [drm] ring test on 6 succeeded in 2 usecs
Feb 23 09:24:58 cryptArch kernel: [drm] ring test on 7 succeeded in 2 usecs
Feb 23 09:24:58 cryptArch kernel: [drm] ring test on 8 succeeded in 2 usecs
Feb 23 09:24:58 cryptArch kernel: [drm] ring test on 9 succeeded in 6 usecs
Feb 23 09:24:58 cryptArch kernel: [drm] ring test on 10 succeeded in 5 usecs
Feb 23 09:24:58 cryptArch kernel: [drm] ring test on 11 succeeded in 1 usecs
Feb 23 09:24:58 cryptArch kernel: [drm] UVD initialized successfully.
Feb 23 09:24:58 cryptArch kernel: [drm] ring test on 12 succeeded in 9 usecs
Feb 23 09:24:58 cryptArch kernel: [drm] ring test on 13 succeeded in 5 usecs
Feb 23 09:24:58 cryptArch kernel: [drm] VCE initialized successfully.
Feb 23 09:24:58 cryptArch kernel: [drm] fb mappable at 0xC124A000
Feb 23 09:24:58 cryptArch kernel: [drm] vram apper at 0xC0000000
Feb 23 09:24:58 cryptArch kernel: [drm] size 9216000
Feb 23 09:24:58 cryptArch kernel: [drm] fb depth is 24
Feb 23 09:24:58 cryptArch kernel: [drm]    pitch is 7680
Feb 23 09:24:58 cryptArch kernel: fbcon: amdgpudrmfb (fb0) is primary device
Feb 23 09:24:58 cryptArch kernel: amdgpu 0000:01:00.0: fb0: amdgpudrmfb frame buffer device
Feb 23 09:24:58 cryptArch kernel: [drm] ib test on ring 0 succeeded
Feb 23 09:24:59 cryptArch kernel: [drm:gfx_v8_0_ring_test_ib [amdgpu]] *ERROR* amdgpu: IB test timed out.
Feb 23 09:24:59 cryptArch kernel: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* amdgpu: failed testing IB on ring 1 (-110).
Feb 23 09:25:00 cryptArch kernel: [drm:gfx_v8_0_ring_test_ib [amdgpu]] *ERROR* amdgpu: IB test timed out.
Feb 23 09:25:00 cryptArch kernel: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* amdgpu: failed testing IB on ring 2 (-110).
Feb 23 09:25:01 cryptArch kernel: [drm:gfx_v8_0_ring_test_ib [amdgpu]] *ERROR* amdgpu: IB test timed out.
Feb 23 09:25:01 cryptArch kernel: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* amdgpu: failed testing IB on ring 3 (-110).
Feb 23 09:25:02 cryptArch kernel: [drm:gfx_v8_0_ring_test_ib [amdgpu]] *ERROR* amdgpu: IB test timed out.
Feb 23 09:25:02 cryptArch kernel: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* amdgpu: failed testing IB on ring 4 (-110).
Feb 23 09:25:03 cryptArch kernel: [drm:gfx_v8_0_ring_test_ib [amdgpu]] *ERROR* amdgpu: IB test timed out.
Feb 23 09:25:03 cryptArch kernel: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* amdgpu: failed testing IB on ring 5 (-110).
Feb 23 09:25:04 cryptArch kernel: [drm:gfx_v8_0_ring_test_ib [amdgpu]] *ERROR* amdgpu: IB test timed out.
Feb 23 09:25:04 cryptArch kernel: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* amdgpu: failed testing IB on ring 6 (-110).
Feb 23 09:25:05 cryptArch kernel: [drm:gfx_v8_0_ring_test_ib [amdgpu]] *ERROR* amdgpu: IB test timed out.
Feb 23 09:25:05 cryptArch kernel: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* amdgpu: failed testing IB on ring 7 (-110).
Feb 23 09:25:06 cryptArch kernel: [drm:gfx_v8_0_ring_test_ib [amdgpu]] *ERROR* amdgpu: IB test timed out.
Feb 23 09:25:06 cryptArch kernel: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* amdgpu: failed testing IB on ring 8 (-110).
Feb 23 09:25:06 cryptArch kernel: [drm] ib test on ring 9 succeeded
Feb 23 09:25:06 cryptArch kernel: [drm] ib test on ring 10 succeeded
Feb 23 09:25:06 cryptArch kernel: [drm] ib test on ring 11 succeeded
Feb 23 09:25:06 cryptArch kernel: [drm] ib test on ring 12 succeeded
Feb 23 09:25:06 cryptArch kernel: [drm:amdgpu_device_init [amdgpu]] *ERROR* ib ring test failed (-110).
Feb 23 09:25:06 cryptArch kernel: [drm] Initialized amdgpu 3.8.0 20150101 for 0000:01:00.0 on minor 0
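
As a rough aid for reading a log like the one above, the rings that failed the IB test can be pulled out with a small shell filter. This is only a sketch: `failed_ib_rings` is a hypothetical helper name, and it assumes the message wording shown above ("failed testing IB on ring N"), which may differ between kernel versions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read kernel log text on stdin and print the number of
# every ring whose IB test failed, one per line.
failed_ib_rings() {
    sed -n 's/.*failed testing IB on ring \([0-9][0-9]*\).*/\1/p'
}

# Usage (assumes the kernel log is available through journalctl):
#   journalctl -k -b | failed_ib_rings
```

Fed the log above, this lists rings 1 through 8.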

More concerning to me is that the RX480 reaches up to 80°C without any significant load, while before it hovered between 45°C and 50°C. It also acts as a heat source within my system and pushes my CPU from the usual ~48°C up to 55°C at present.
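
To keep an eye on the temperature in the meantime, the amdgpu driver exposes its sensor through the kernel's hwmon interface in sysfs, where `temp*_input` files report millidegrees Celsius. A minimal sketch, with assumptions: `gpu_temp_c` is a made-up helper name, and the `card0` path in the usage comment may be a different card or hwmon index on other systems.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of a hwmon temperature file in whole
# degrees Celsius. hwmon temp*_input files hold millidegrees Celsius.
gpu_temp_c() {
    local file="$1" milli
    milli=$(cat "$file") || return 1
    echo $(( milli / 1000 ))
}

# Usage (path is an assumption; adjust the card and hwmon index as needed):
#   gpu_temp_c /sys/class/drm/card0/device/hwmon/hwmon*/temp1_input
```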

Last edited by kaymio (2017-02-24 09:05:39)

Offline

#2 2017-02-23 09:16:21

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 21,425

Re: [SOLVED] AMDGPU IB ring test failed / RX480 constant at 80°C

There are a few threads reporting that the latest linux-firmware update seems to have some issues with AMDGPU. You might want to try downgrading it to the previous version (or wait for the next release, or update linux-firmware from testing, since a fix is on its way).

Last edited by V1del (2017-02-23 09:18:10)

Offline

#3 2017-02-23 09:23:46

kaymio
Member
Registered: 2017-01-07
Posts: 27

Re: [SOLVED] AMDGPU IB ring test failed / RX480 constant at 80°C

Thanks for the info. I guess I'll wait, but my BOINC load will have to wait a little until then.

Offline

#4 2017-02-23 14:46:20

ArchArrow
Member
Registered: 2017-02-15
Posts: 26

Re: [SOLVED] AMDGPU IB ring test failed / RX480 constant at 80°C

I have the exact same problems as kaymio.
Will wait for a kernel update.

Offline

#5 2017-02-23 16:01:38

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 21,425

Re: [SOLVED] AMDGPU IB ring test failed / RX480 constant at 80°C

The linux-firmware package is largely independent of the kernel version and the rest of the system, so it really shouldn't hurt at all to downgrade it, or to upgrade to the version in testing, if you need or want to use your graphics card.

Offline

#6 2017-02-23 16:06:00

GourdCaptain
Member
Registered: 2009-04-18
Posts: 121

Re: [SOLVED] AMDGPU IB ring test failed / RX480 constant at 80°C

The linux-firmware version currently in testing, 20170217.12987ca-2, https://www.archlinux.org/packages/test … -firmware/ should fix this issue; it has on my system and on another person's. I'm using the base repo's 4.9.11 kernel (I've also tested a manually compiled 4.10 for this bug and solution). It's also getting fixed upstream. (I've got an RX 460, which has a more severe Xorg GPU crash bug caused by the same problematic firmware.)

EDIT: Didn't see the above posts mentioning this, but I figure my system actually working with it was a decent anecdote to share anyway.

Last edited by GourdCaptain (2017-02-23 16:09:21)

Offline

#7 2017-02-23 16:39:31

ArchArrow
Member
Registered: 2017-02-15
Posts: 26

Re: [SOLVED] AMDGPU IB ring test failed / RX480 constant at 80°C

@GourdCaptain: Thanks for your input. I did a little bit more research and yes it is of course the "linux-firmware" package that I am waiting for, not an updated kernel.

However, I don't fully understand what is actually broken. It's not the kernel and it's not the driver, because then the error would have to be fixed in "xf86-video-amdgpu".
As far as I understand, the package "linux-firmware" includes CPU microcode. How is this relevant for AMD G(!)PUs, and why isn't this microcode in the appropriate "xf86-video-amdgpu" driver package?

Last edited by ArchArrow (2017-02-23 16:43:00)

Offline

#8 2017-02-23 16:41:18

GourdCaptain
Member
Registered: 2009-04-18
Posts: 121

Re: [SOLVED] AMDGPU IB ring test failed / RX480 constant at 80°C

ArchArrow wrote:

@GourdCaptain: Thanks for your input. I did a little bit more research and yes it is of course the "linux-firmware" package that I am waiting for, not an updated kernel.

However, I don't fully understand what is actually broken. It's not the kernel and it's not the driver, because then the error would have to be fixed in "xf86-video-amdgpu".
As far as I understood, the package "linux-firmware" includes CPU microcode. How is this relevant for AMD G(!)PUs, and why isn't this microcode in the appropriate "xf86-video-amdgpu" driver package?

The linux-firmware package ships the firmware that is loaded into the graphics card at boot, similar to CPU microcode. It comes from the "linux-firmware" distribution (part of the Linux kernel project, and split out of the base package because it is not open source) and is not handled by the same project as xf86-video-amdgpu (the X.Org project).

EDIT: Linux-firmware ships a lot of firmware images, used for (non-Intel) CPUs, GPUs, wireless cards, bluetooth, and so on.
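
One way to see this relationship is to ask the kernel module which firmware files it declares and filter for one chip. The following is a sketch under assumptions: `chip_firmware` is a hypothetical helper, and the chip name `polaris10` is taken from the log in the first post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: from a list of firmware file names on stdin, keep only
# those belonging to the given chip (for example "polaris10").
chip_firmware() {
    local chip="$1"
    grep -E "(^|/)${chip}_[^/]*\.bin$"
}

# Usage: modinfo prints the firmware names a module declares, and the
# matching files are the ones linux-firmware has to provide:
#   modinfo -F firmware amdgpu | chip_firmware polaris10
```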

Last edited by GourdCaptain (2017-02-23 16:43:04)

Offline

#9 2017-02-23 16:51:01

ArchArrow
Member
Registered: 2017-02-15
Posts: 26

Re: [SOLVED] AMDGPU IB ring test failed / RX480 constant at 80°C

Hmmm I think I understand what you're trying to say, but it's still a weird concept for me to grasp.

So "linux-firmware" has the AMD GPU firmware for all the different GPUs (and other products like Wi-Fi, Bluetooth, etc.) and loads that into the GPU when it boots up.

That would mean that the GPU couldn't just hold its own firmware, which doesn't seem to make sense, considering the GPU is already showing something BEFORE booting (UEFI/BIOS).

What am I getting wrong?

Offline

#10 2017-02-23 16:54:28

GourdCaptain
Member
Registered: 2009-04-18
Posts: 121

Re: [SOLVED] AMDGPU IB ring test failed / RX480 constant at 80°C

ArchArrow wrote:

Hmmm I think I understand what you're trying to say, but it's still a weird concept for me to grasp.

So "linux-firmware" has the AMD GPU firmwares for all the different GPUs (and other products like WIFI, Bluetooth, etc...)  and loads that into the GPU when it boots up.

That would mean that the GPU couldn't just hold its own firmware, which doesn't seem to make sense, considering the GPU is already showing something BEFORE booting (UEFI/BIOS)

What am I getting wrong?

Without the firmware (which is held in volatile memory on the card), the GPU is limited to the basic output functionality that is permanently built into it. There is no acceleration, just the blurry legacy output mode, probably not at native resolution, that your BIOS uses and that the OS uses before KMS initializes. It just pushes pixels or basic text output to the screen blindly. Once the firmware is loaded, advanced GPU functionality starts working, because the card now has the software it needs internally.

Offline

#11 2017-02-23 23:45:35

ArchArrow
Member
Registered: 2017-02-15
Posts: 26

Re: [SOLVED] AMDGPU IB ring test failed / RX480 constant at 80°C

Thanks for clearing that up. It seems a bit weird that a 250 US Dollar graphics card cannot afford non-volatile storage for its own firmware, but I'm sure there are reasons.

And back to topic: I just installed the updated linux-firmware package from <testing> and everything works as it should again!

Offline

#12 2017-02-24 01:30:56

GourdCaptain
Member
Registered: 2009-04-18
Posts: 121

Re: [SOLVED] AMDGPU IB ring test failed / RX480 constant at 80°C

ArchArrow wrote:

Thanks for clearing that up. It seems a bit weird that a 250 US Dollar graphics card cannot afford non-volatile storage for its own firmware, but I'm sure there are reasons.

And back to topic: I just installed the updated linux-firmware package from <testing> and everything works as it should again!

It's standard practice for some reason. CPUs do the same with microcode (which is why you have to set it up as an initramfs image for early boot on a lot of recent Intel chips), as do Wi-Fi cards with their firmware. I don't know why, although I haven't really looked.

Offline

#13 2017-02-24 09:04:45

kaymio
Member
Registered: 2017-01-07
Posts: 27

Re: [SOLVED] AMDGPU IB ring test failed / RX480 constant at 80°C

The linux-firmware from testing also resolved the problem on my Arch.

Enable the testing repo in pacman.conf
pacman -Syy
pacman -S linux-firmware

Don't forget to disable the testing repo again afterwards, before the next full system update.
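
A quick check for that last step might look like the sketch below. The helper name `testing_enabled` is hypothetical, and it assumes the repository section is literally named `[testing]`, as in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit status 0) when the given pacman.conf
# contains an uncommented [testing] section header.
# The path defaults to /etc/pacman.conf.
testing_enabled() {
    local conf="${1:-/etc/pacman.conf}"
    grep -Eq '^[[:space:]]*\[testing\][[:space:]]*$' "$conf"
}

# Usage:
#   if testing_enabled; then echo "testing repo is still enabled"; fi
```

A commented-out header such as `#[testing]` does not match, so the check only fires while the repo is actually active.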

I consider this thread [SOLVED].

Offline

#14 2017-02-24 11:37:27

Lone_Wolf
Member
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 11,868

Re: [SOLVED] AMDGPU IB ring test failed / RX480 constant at 80°C

ArchArrow wrote:

It seems a bit weird that a 250 US Dollar graphics card cannot afford non-volatile storage for its own firmware, but I'm sure there are reasons.

Correcting firmware errors that are discovered AFTER the cards have been sold is a major reason.

Last edited by Lone_Wolf (2017-02-24 11:37:55)


Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.


(A works at time B)  && (time C > time B ) ≠  (A works at time C)

Offline

#15 2017-02-25 19:30:31

edres
Member
Registered: 2017-02-06
Posts: 7

Re: [SOLVED] AMDGPU IB ring test failed / RX480 constant at 80°C

kaymio wrote:

The linux-firmware from testing also resolved the problem on my Arch.

Enable the testing repo in pacman.conf
pacman -Syy
pacman -S linux-firmware

Don't forget to disable the testing repo again afterwards, before the next full system update.

I consider this thread [SOLVED].

Thanks for the step by step guide ^^

Offline
