#1 2016-08-12 12:45:34

ledbettj
Member
From: Atlanta
Registered: 2012-10-14
Posts: 35

rx480 - loading amdgpu driver turns off display

I recently replaced an older card with an AMD RX 480 based card. Before installing it, I uninstalled all NVIDIA-related packages and installed mesa, xf86-video-amdgpu, kernel 4.7, linux-firmware-git, and libdrm-git.

The card is detected and the system boots fine, but as soon as the amdgpu driver is loaded, the monitor (HDMI) immediately says "no input" and puts itself to sleep. The machine keeps running and is still reachable over ssh. Here's the dmesg log from loading the amdgpu module.


[  208.352310] amdgpu 0000:01:00.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff

According to my googling, this is apparently harmless.

[  208.488937] [AVFS] Something is broken. See log!

I can't find any Google references for this one apart from the source code. I also don't see the code writing to any other log file, so I'm unsure whether this is an issue.

[  208.355719] [drm] Connector 3:
[  208.355720] [drm]   HDMI-A-3
[  208.355721] [drm]   HPD5
[  208.355723] [drm]   DDC: 0x4874 0x4874 0x4875 0x4875 0x4876 0x4876 0x4877 0x4877
[  208.355724] [drm]   Encoders:
[  208.355725] [drm]     DFP4: INTERNAL_UNIPHY1

This is the HDMI port that the monitor is connected to. The monitor works just fine with the onboard Intel graphics and the same HDMI cable, and works fine right up until amdgpu gets loaded :(
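
Since the machine stays reachable over ssh while the screen is black, one way to see what the driver reports for each connector is to read the DRM entries under /sys/class/drm. The following is only a minimal sketch, assuming the usual sysfs layout where each connector directory (for example card0-HDMI-A-1) contains a `status` file. The function name `connector_states` is made up for illustration.

```python
from pathlib import Path


def connector_states(root="/sys/class/drm"):
    """Map each connector directory name to the contents of its 'status' file."""
    states = {}
    # Connector directories look like card0-HDMI-A-1, card0-DP-1, ...
    for status_file in sorted(Path(root).glob("card*-*/status")):
        states[status_file.parent.name] = status_file.read_text().strip()
    return states


if __name__ == "__main__":
    # Typical values are "connected", "disconnected" or "unknown".
    for name, status in connector_states().items():
        print(f"{name}: {status}")
```

If the HDMI connector shows up as connected here while the monitor still reports no input, that would suggest the driver sees the display but mode setting or the link itself is failing, which is a different problem from the card not detecting the monitor at all.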


Does anyone have any clues or suggestions? Do I need mesa-git to even get the driver to give me something besides a black screen on the framebuffer console?

Last edited by ledbettj (2016-08-12 12:46:38)


#2 2016-08-12 18:53:48

Lone_Wolf
Member
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 11,920

Re: rx480 - loading amdgpu driver turns off display

fb: switching to amdgpudrmfb from EFI VGA

ledbettj, are you booting in UEFI mode or legacy mode?
Is your motherboard firmware up to date?
Are you using Intel microcode?

Does it make a difference if you put the amdgpu module in the initramfs?


Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.


(A works at time B)  && (time C > time B ) ≠  (A works at time C)


#3 2016-08-12 19:58:36

ledbettj
Member
From: Atlanta
Registered: 2012-10-14
Posts: 35

Re: rx480 - loading amdgpu driver turns off display

Thanks for the reply. I'm booting in UEFI mode. I have a Z75 Pro3 motherboard that I updated to the latest EFI firmware, which I think included an Intel microcode update, with no luck. I might also have the Intel ucode package installed.

Adding amdgpu to the initramfs just causes the screen to go black sooner, so I can't even unlock the disk.


#4 2016-08-13 14:18:16

Lone_Wolf
Member
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 11,920

Re: rx480 - loading amdgpu driver turns off display

Does this system have an integrated (Intel) video card as well as the discrete AMD card?

Please post the full dmesg.



#5 2016-08-13 17:47:16

rc.conf
Member
From: Germany
Registered: 2016-08-13
Posts: 1

Re: rx480 - loading amdgpu driver turns off display

Hey ledbettj,

I'm not using Arch Linux but Gentoo with kernel 4.7, AMD firmware as of 28 Jun 2016 and Mesa 12.1.0-devel (git-3fb4a9b) with an RX 480. I built all the drivers and firmware into the kernel and the RX 480 works just fine.

ledbettj wrote:
[  208.352310] amdgpu 0000:01:00.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff

According to my googling, this is apparently harmless.

I have exactly the same line in my log.

ledbettj wrote:
[  208.488937] [AVFS] Something is broken. See log!

I can't find any Google references for this one apart from the source code. I also don't see the code writing to any other log file, so I'm unsure whether this is an issue.

Again, I have exactly the same line in my log.

Generally, your amdgpu-related dmesg looks very similar to mine. My Dell monitor is connected via HDMI, and as soon as the amdgpu driver is loaded, the display goes blank and comes back a second later at the monitor's native resolution. Anyway, here's my dmesg output:

[    0.498761] [drm] Initialized drm 1.1.0 20060810
[    0.498794] [drm] amdgpu kernel modesetting enabled.
[    0.498808] checking generic (c0000000 300000) vs hw (c0000000 10000000)
[    0.498809] fb: switching to amdgpudrmfb from EFI VGA
[    0.498828] Console: switching to colour dummy device 80x25
[    0.499127] [drm] initializing kernel modesetting (POLARIS10 0x1002:0x67DF 0x1682:0x9480 0xC7).
[    0.499134] [drm] register mmio base: 0xDFE00000
[    0.499135] [drm] register mmio size: 262144
[    0.499138] [drm] doorbell mmio base: 0xD0000000
[    0.499140] [drm] doorbell mmio size: 2097152
[    0.499149] [drm] probing gen 2 caps for device 8086:1901 = 261ad03/e
[    0.499151] [drm] probing mlw for device 8086:1901 = 261ad03
[    0.499165] amdgpu 0000:01:00.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff
[    0.499190] ATOM BIOS: D00901
[    0.499338] amdgpu 0000:01:00.0: VRAM: 8192M 0x0000000000000000 - 0x00000001FFFFFFFF (8192M used)
[    0.499341] amdgpu 0000:01:00.0: GTT: 8192M 0x0000000200000000 - 0x00000003FFFFFFFF
[    0.499344] [drm] Detected VRAM RAM=8192M, BAR=256M
[    0.499346] [drm] RAM width 256bits GDDR5
[    0.499386] [TTM] Zone  kernel: Available graphics memory: 16446258 kiB
[    0.499389] [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
[    0.499390] [TTM] Initializing pool allocator
[    0.499394] [TTM] Initializing DMA pool allocator
[    0.499405] [drm] amdgpu: 8192M of VRAM memory ready
[    0.499406] [drm] amdgpu: 8192M of GTT memory ready.
[    0.499416] [drm] GART: num cpu pages 2097152, num gpu pages 2097152
[    0.500584] [drm] PCIE GART of 8192M enabled (table at 0x0000000000040000).
[    0.500591] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    0.500593] [drm] Driver supports precise vblank timestamp query.
[    0.500606] amdgpu 0000:01:00.0: amdgpu: using MSI.
[    0.500621] [drm] amdgpu: irq initialized.
[    0.500625] Can't find requested voltage id in vdd_dep_on_sclk table!
[    0.500719] amdgpu: powerplay initialized
[    0.500837] [drm] AMDGPU Display Connectors
[    0.500839] [drm] Connector 0:
[    0.500840] [drm]   DP-1
[    0.500842] [drm]   HPD6
[    0.500843] [drm]   DDC: 0x4868 0x4868 0x4869 0x4869 0x486a 0x486a 0x486b 0x486b
[    0.500846] [drm]   Encoders:
[    0.500848] [drm]     DFP1: INTERNAL_UNIPHY2
[    0.500849] [drm] Connector 1:
[    0.500851] [drm]   DP-2
[    0.500852] [drm]   HPD4
[    0.500854] [drm]   DDC: 0x4870 0x4870 0x4871 0x4871 0x4872 0x4872 0x4873 0x4873
[    0.500856] [drm]   Encoders:
[    0.500858] [drm]     DFP2: INTERNAL_UNIPHY2
[    0.500859] [drm] Connector 2:
[    0.500861] [drm]   DP-3
[    0.500862] [drm]   HPD1
[    0.500864] [drm]   DDC: 0x486c 0x486c 0x486d 0x486d 0x486e 0x486e 0x486f 0x486f
[    0.500866] [drm]   Encoders:
[    0.500868] [drm]     DFP3: INTERNAL_UNIPHY1
[    0.500869] [drm] Connector 3:
[    0.500871] [drm]   HDMI-A-1
[    0.500872] [drm]   HPD5
[    0.500874] [drm]   DDC: 0x4874 0x4874 0x4875 0x4875 0x4876 0x4876 0x4877 0x4877
[    0.500876] [drm]   Encoders:
[    0.500877] [drm]     DFP4: INTERNAL_UNIPHY1
[    0.500974] amdgpu 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000200000008, cpu addr 0xffff880851f26008
[    0.501042] amdgpu 0000:01:00.0: fence driver on ring 1 use gpu addr 0x000000020000001c, cpu addr 0xffff880851f2601c
[    0.501082] amdgpu 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000200000030, cpu addr 0xffff880851f26030
[    0.501128] amdgpu 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000200000044, cpu addr 0xffff880851f26044
[    0.501164] amdgpu 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000200000058, cpu addr 0xffff880851f26058
[    0.501286] amdgpu 0000:01:00.0: fence driver on ring 5 use gpu addr 0x000000020000006c, cpu addr 0xffff880851f2606c
[    0.501356] amdgpu 0000:01:00.0: fence driver on ring 6 use gpu addr 0x0000000200000080, cpu addr 0xffff880851f26080
[    0.501397] amdgpu 0000:01:00.0: fence driver on ring 7 use gpu addr 0x0000000200000094, cpu addr 0xffff880851f26094
[    0.501434] amdgpu 0000:01:00.0: fence driver on ring 8 use gpu addr 0x00000002000000a8, cpu addr 0xffff880851f260a8
[    0.501474] amdgpu 0000:01:00.0: fence driver on ring 9 use gpu addr 0x00000002000000bc, cpu addr 0xffff880851f260bc
[    0.501511] amdgpu 0000:01:00.0: fence driver on ring 10 use gpu addr 0x00000002000000d0, cpu addr 0xffff880851f260d0
[    0.501517] [drm] Found UVD firmware Version: 1.79 Family ID: 16
[    0.501770] amdgpu 0000:01:00.0: fence driver on ring 11 use gpu addr 0x000000000109c420, cpu addr 0xffffc90003c5a420
[    0.501778] [drm] Found VCE firmware Version: 52.4 Binary ID: 3
[    0.501856] amdgpu 0000:01:00.0: fence driver on ring 12 use gpu addr 0x00000002000000f8, cpu addr 0xffff880851f260f8
[    0.501896] amdgpu 0000:01:00.0: fence driver on ring 13 use gpu addr 0x000000020000010c, cpu addr 0xffff880851f2610c
[    0.561483] [AVFS] Something is broken. See log!
[    0.585945] [drm] ring test on 0 succeeded in 11 usecs
[    0.586159] [drm] ring test on 1 succeeded in 27 usecs
[    0.586246] [drm] ring test on 2 succeeded in 16 usecs
[    0.586255] [drm] ring test on 3 succeeded in 1 usecs
[    0.586264] [drm] ring test on 4 succeeded in 1 usecs
[    0.586273] [drm] ring test on 5 succeeded in 1 usecs
[    0.586282] [drm] ring test on 6 succeeded in 1 usecs
[    0.586307] [drm] ring test on 7 succeeded in 1 usecs
[    0.586314] [drm] ring test on 8 succeeded in 1 usecs
[    0.586356] [drm] ring test on 9 succeeded in 5 usecs
[    0.586363] [drm] ring test on 10 succeeded in 5 usecs
[    0.612404] [drm] ring test on 11 succeeded in 2 usecs
[    0.612406] [drm] UVD initialized successfully.
[    0.722453] [drm] ring test on 12 succeeded in 10 usecs
[    0.722465] [drm] ring test on 13 succeeded in 4 usecs
[    0.722466] [drm] VCE initialized successfully.
[    1.303620] [drm] fb mappable at 0xC12A6000
[    1.303622] [drm] vram apper at 0xC0000000
[    1.303624] [drm] size 8294400
[    1.303625] [drm] fb depth is 24
[    1.303627] [drm]    pitch is 7680
[    1.303742] fbcon: amdgpudrmfb (fb0) is primary device
[    1.490361] tsc: Refined TSC clocksource calibration: 3312.028 MHz
[    1.490362] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x2fbdaeaf6e9, max_idle_ns: 440795311350 ns
[    1.933733] Console: switching to colour frame buffer device 240x67
[    1.936890] amdgpu 0000:01:00.0: fb0: amdgpudrmfb frame buffer device
[    1.941575] [drm] ib test on ring 0 succeeded in 0 usecs
[    1.941887] [drm] ib test on ring 1 succeeded in 0 usecs
[    1.942003] [drm] ib test on ring 2 succeeded in 0 usecs
[    1.942103] [drm] ib test on ring 3 succeeded in 0 usecs
[    1.942170] [drm] ib test on ring 4 succeeded in 0 usecs
[    1.942241] [drm] ib test on ring 5 succeeded in 0 usecs
[    1.942353] [drm] ib test on ring 6 succeeded in 0 usecs
[    1.942409] [drm] ib test on ring 7 succeeded in 0 usecs
[    1.942461] [drm] ib test on ring 8 succeeded in 0 usecs
[    1.942505] [drm] ib test on ring 9 succeeded in 0 usecs
[    1.942546] [drm] ib test on ring 10 succeeded in 0 usecs
[    1.943749] [drm] ib test on ring 11 succeeded
[    1.943928] [drm] ib test on ring 12 succeeded
[    1.945617] [drm] Initialized amdgpu 3.2.0 20150101 for 0000:01:00.0 on minor 0

I read on a hackintosh forum that reference RX 480 models have framebuffer support, while non-reference models with a different port configuration have problems. Do you run a reference RX 480?


- rc.conf


#6 2016-08-14 22:13:18

Ownaginatious
Member
Registered: 2010-08-28
Posts: 60

Re: rx480 - loading amdgpu driver turns off display

Having the exact same problem here. OP, are you using an HDMI 2.0 monitor by chance? I asked about the same thing on reddit here: https://www.reddit.com/r/archlinux/comm … ce_amdgpu/

Unfortunately, no solution yet :(


#7 2016-08-14 23:23:23

ledbettj
Member
From: Atlanta
Registered: 2012-10-14
Posts: 35

Re: rx480 - loading amdgpu driver turns off display

Thanks for the advice, guys. I managed to get it working by ordering a DisplayPort cable and using that instead of HDMI. No idea why. My monitor is an LG 29UM68-P, which I don't think is HDMI 2.0.


#8 2016-08-16 10:48:24

libgradev
Member
From: Wandering the Wilds
Registered: 2012-02-23
Posts: 35

Re: rx480 - loading amdgpu driver turns off display

ledbettj wrote:

Thanks for the advice, guys. I managed to get it working by ordering a DisplayPort cable and using that instead of HDMI. No idea why. My monitor is an LG 29UM68-P, which I don't think is HDMI 2.0.

That monitor's HDMI 1.4 according to the specs.


ASRock TRX40 Creator B1.70 | AMD TR3970X | 64GB G.Skill Trident Z | AMD RX 6900XT 16GB / AMD RX 6800XT 16GB (VFIO) | Samsung CRG90 | BENQ 1080p (portrait) | 1x Samsung 850 EVO 1TB | 2x Samsung 960 EVO NVMe | 5x WD Red 4TB (RAID6) | Corsair MP600 Force 500GB  + 8GB Seagate (store) | Sennheiser MOMENTUM 3 | Roccat KoneXTD Optical


#9 2016-08-16 10:50:01

libgradev
Member
From: Wandering the Wilds
Registered: 2012-02-23
Posts: 35

Re: rx480 - loading amdgpu driver turns off display

Ownaginatious wrote:

Having the exact same problem here. OP, are you using an HDMI 2.0 monitor by chance? I asked about the same thing on reddit here: https://www.reddit.com/r/archlinux/comm … ce_amdgpu/

Unfortunately, no solution yet :(

HDMI 2.0 requires the DAL code which isn't set to be mainlined yet...

Your best bet (if you can) is DisplayPort - which is working fine here.



#10 2016-08-16 20:00:13

Ownaginatious
Member
Registered: 2010-08-28
Posts: 60

Re: rx480 - loading amdgpu driver turns off display

libgradev wrote:
Ownaginatious wrote:

Having the exact same problem here. OP, are you using an HDMI 2.0 monitor by chance? I asked about the same thing on reddit here: https://www.reddit.com/r/archlinux/comm … ce_amdgpu/

Unfortunately, no solution yet :(

HDMI 2.0 requires the DAL code which isn't set to be mainlined yet...

Your best bet (if you can) is DisplayPort - which is working fine here.

Unfortunately, the monitor/TV I'm using only has HDMI 2.0 inputs. I guess I'll wait it out until HDMI 2.0 support is added. Hopefully it's sooner rather than later :(


#11 2016-08-26 20:30:18

Ownaginatious
Member
Registered: 2010-08-28
Posts: 60

Re: rx480 - loading amdgpu driver turns off display

Just an update; I was able to fix the issue by using a custom EDID. Turns out Vizio really screwed theirs up, and I guess the Linux video drivers don't tend to cover for all the corner cases like the Windows ones do.
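
The post doesn't say what was wrong with the EDID or how the override was applied, so the following is only a rough sketch of a structural sanity check, not the fix itself. It relies on two properties of the EDID format: the base block starts with the fixed header 00 FF FF FF FF FF FF 00, and the bytes of every 128-byte block sum to 0 modulo 256. The function name `check_edid` and the connector path in the example are assumptions for illustration.

```python
from pathlib import Path

EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])
BLOCK_SIZE = 128


def check_edid(blob: bytes) -> list:
    """Return a list of structural problems found in an EDID dump (empty = none found)."""
    if len(blob) == 0:
        return ["empty EDID (connector reported nothing)"]
    problems = []
    if len(blob) % BLOCK_SIZE != 0:
        problems.append(f"length {len(blob)} is not a multiple of {BLOCK_SIZE}")
    if blob[:8] != EDID_HEADER:
        problems.append("bad header in base block")
    blocks = len(blob) // BLOCK_SIZE
    # Every complete 128-byte block must sum to 0 modulo 256.
    for i in range(blocks):
        block = blob[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
        if sum(block) % 256 != 0:
            problems.append(f"checksum mismatch in block {i}")
    # Byte 126 of the base block announces how many extension blocks follow.
    if blocks >= 1 and blocks != 1 + blob[126]:
        problems.append(f"expected {1 + blob[126]} block(s), found {blocks}")
    return problems


if __name__ == "__main__":
    # Hypothetical connector path; adjust it to the connector actually in use.
    path = Path("/sys/class/drm/card0-HDMI-A-1/edid")
    if path.exists():
        print(check_edid(path.read_bytes()) or "EDID looks structurally valid")
```

A dump that passes this check can still contain timings or capability flags the driver mishandles, so a clean result does not rule out the EDID as the culprit.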


#12 2016-09-07 09:16:49

jnko
Member
Registered: 2015-11-03
Posts: 13

Re: rx480 - loading amdgpu driver turns off display

Ownaginatious wrote:

Just an update; I was able to fix the issue by using a custom EDID. Turns out Vizio really screwed theirs up, and I guess the Linux video drivers don't tend to cover for all the corner cases like the Windows ones do.


Would you be so kind as to share with the rest of us how you did this? Thanks!

I have an RX 470 with amdgpu-pro from the AUR and a couple of similar issues.
I should mention that this also happens with the non-pro drivers.

Sep 07 10:28:44 alaska kernel: [drm] amdgpu kernel modesetting enabled.
Sep 07 10:28:44 alaska kernel: checking generic (e0000000 7f0000) vs hw (e0000000 10000000)
Sep 07 10:28:44 alaska kernel: fb: switching to amdgpudrmfb from EFI VGA
Sep 07 10:28:44 alaska kernel: Console: switching to colour dummy device 80x25
Sep 07 10:28:44 alaska kernel: [drm] initializing kernel modesetting (POLARIS10 0x1002:0x67DF 0x1682:0x9470 0xCF).
Sep 07 10:28:44 alaska kernel: [drm] register mmio base: 0xFBE00000
Sep 07 10:28:44 alaska kernel: [drm] register mmio size: 262144
Sep 07 10:28:44 alaska kernel: [drm] doorbell mmio base: 0xF0000000
Sep 07 10:28:44 alaska kernel: [drm] doorbell mmio size: 2097152
Sep 07 10:28:44 alaska kernel: [drm] probing gen 2 caps for device 8086:6f08 = 77a3103/e
Sep 07 10:28:44 alaska kernel: [drm] probing mlw for device 8086:6f08 = 77a3103
Sep 07 10:28:44 alaska kernel: amdgpu 0000:03:00.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff
Sep 07 10:28:44 alaska kernel: ATOM BIOS: D00003
Sep 07 10:28:44 alaska kernel: amdgpu 0000:03:00.0: VRAM: 4096M 0x0000000000000000 - 0x00000000FFFFFFFF (4096M used)
Sep 07 10:28:44 alaska kernel: amdgpu 0000:03:00.0: GTT: 4096M 0x0000000100000000 - 0x00000001FFFFFFFF
Sep 07 10:28:44 alaska kernel: [drm] Detected VRAM RAM=4096M, BAR=256M
Sep 07 10:28:44 alaska kernel: [drm] RAM width 256bits GDDR5
Sep 07 10:28:44 alaska kernel: [TTM] Zone  kernel: Available graphics memory: 32950394 kiB
Sep 07 10:28:44 alaska kernel: [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
Sep 07 10:28:44 alaska kernel: [TTM] Initializing pool allocator
Sep 07 10:28:44 alaska kernel: [TTM] Initializing DMA pool allocator
Sep 07 10:28:44 alaska kernel: [drm] amdgpu: 4096M of VRAM memory ready
Sep 07 10:28:44 alaska kernel: [drm] amdgpu: 4096M of GTT memory ready.
Sep 07 10:28:44 alaska kernel: [drm] GART: num cpu pages 1048576, num gpu pages 1048576
Sep 07 10:28:44 alaska kernel: [drm] PCIE GART of 4096M enabled (table at 0x0000000000040000).
Sep 07 10:28:44 alaska kernel: amdgpu 0000:03:00.0: amdgpu: using MSI.
Sep 07 10:28:44 alaska kernel: [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
Sep 07 10:28:44 alaska kernel: [drm] Driver supports precise vblank timestamp query.
Sep 07 10:28:44 alaska kernel: [drm] amdgpu: irq initialized.
Sep 07 10:28:44 alaska kernel: Can't find requested voltage id in vdd_dep_on_sclk table!
Sep 07 10:28:44 alaska kernel: amdgpu: powerplay initialized
Sep 07 10:28:44 alaska kernel: [drm] AMDGPU Display Connectors
Sep 07 10:28:44 alaska kernel: [drm] Connector 0:
Sep 07 10:28:44 alaska kernel: [drm]   DP-1
Sep 07 10:28:44 alaska kernel: [drm]   HPD6
Sep 07 10:28:44 alaska kernel: [drm]   DDC: 0x4868 0x4868 0x4869 0x4869 0x486a 0x486a 0x486b 0x486b
Sep 07 10:28:44 alaska kernel: [drm]   Encoders:
Sep 07 10:28:44 alaska kernel: [drm]     DFP1: INTERNAL_UNIPHY2
Sep 07 10:28:44 alaska kernel: [drm] Connector 1:
Sep 07 10:28:44 alaska kernel: [drm]   DP-2
Sep 07 10:28:44 alaska kernel: [drm]   HPD4
Sep 07 10:28:44 alaska kernel: [drm]   DDC: 0x4870 0x4870 0x4871 0x4871 0x4872 0x4872 0x4873 0x4873
Sep 07 10:28:44 alaska kernel: [drm]   Encoders:
Sep 07 10:28:44 alaska kernel: [drm]     DFP2: INTERNAL_UNIPHY2
Sep 07 10:28:44 alaska kernel: [drm] Connector 2:
Sep 07 10:28:44 alaska kernel: [drm]   DP-3
Sep 07 10:28:44 alaska kernel: [drm]   HPD1
Sep 07 10:28:44 alaska kernel: [drm]   DDC: 0x486c 0x486c 0x486d 0x486d 0x486e 0x486e 0x486f 0x486f
Sep 07 10:28:44 alaska kernel: [drm]   Encoders:
Sep 07 10:28:44 alaska kernel: [drm]     DFP3: INTERNAL_UNIPHY1
Sep 07 10:28:44 alaska kernel: [drm] Connector 3:
Sep 07 10:28:44 alaska kernel: [drm]   HDMI-A-1
Sep 07 10:28:44 alaska kernel: [drm]   HPD5
Sep 07 10:28:44 alaska kernel: [drm]   DDC: 0x4874 0x4874 0x4875 0x4875 0x4876 0x4876 0x4877 0x4877
Sep 07 10:28:44 alaska kernel: [drm]   Encoders:
Sep 07 10:28:44 alaska kernel: [drm]     DFP4: INTERNAL_UNIPHY1
Sep 07 10:28:44 alaska kernel: [drm] Connector 4:
Sep 07 10:28:44 alaska kernel: [drm]   DVI-D-1
Sep 07 10:28:44 alaska kernel: [drm]   HPD3
Sep 07 10:28:44 alaska kernel: [drm]   DDC: 0x487c 0x487c 0x487d 0x487d 0x487e 0x487e 0x487f 0x487f
Sep 07 10:28:44 alaska kernel: [drm]   Encoders:
Sep 07 10:28:44 alaska kernel: [drm]     DFP5: INTERNAL_UNIPHY
Sep 07 10:28:44 alaska kernel: amdgpu 0000:03:00.0: fence driver on ring 0 use gpu addr 0x0000000100000008, cpu addr 0xfff
Sep 07 10:28:44 alaska kernel: amdgpu 0000:03:00.0: fence driver on ring 1 use gpu addr 0x000000010000001c, cpu addr 0xfff
Sep 07 10:28:44 alaska kernel: amdgpu 0000:03:00.0: fence driver on ring 2 use gpu addr 0x0000000100000030, cpu addr 0xfff
Sep 07 10:28:44 alaska kernel: amdgpu 0000:03:00.0: fence driver on ring 3 use gpu addr 0x0000000100000044, cpu addr 0xfff
Sep 07 10:28:44 alaska kernel: amdgpu 0000:03:00.0: fence driver on ring 4 use gpu addr 0x0000000100000058, cpu addr 0xfff
Sep 07 10:28:44 alaska kernel: amdgpu 0000:03:00.0: fence driver on ring 5 use gpu addr 0x000000010000006c, cpu addr 0xfff
Sep 07 10:28:44 alaska kernel: amdgpu 0000:03:00.0: fence driver on ring 6 use gpu addr 0x0000000100000080, cpu addr 0xfff
Sep 07 10:28:44 alaska kernel: amdgpu 0000:03:00.0: fence driver on ring 7 use gpu addr 0x0000000100000094, cpu addr 0xfff
Sep 07 10:28:44 alaska kernel: amdgpu 0000:03:00.0: fence driver on ring 8 use gpu addr 0x00000001000000a8, cpu addr 0xfff
Sep 07 10:28:44 alaska kernel: amdgpu 0000:03:00.0: fence driver on ring 9 use gpu addr 0x00000001000000bc, cpu addr 0xfff
Sep 07 10:28:44 alaska kernel: amdgpu 0000:03:00.0: fence driver on ring 10 use gpu addr 0x00000001000000d0, cpu addr 0xff
Sep 07 10:28:44 alaska kernel: [drm] Found UVD firmware Version: 1.79 Family ID: 16
Sep 07 10:28:44 alaska kernel: amdgpu 0000:03:00.0: fence driver on ring 11 use gpu addr 0x000000000089c420, cpu addr 0xff
Sep 07 10:28:44 alaska kernel: [drm] Found VCE firmware Version: 52.4 Binary ID: 3
Sep 07 10:28:44 alaska kernel: amdgpu 0000:03:00.0: fence driver on ring 12 use gpu addr 0x00000001000000f8, cpu addr 0xff
Sep 07 10:28:44 alaska kernel: amdgpu 0000:03:00.0: fence driver on ring 13 use gpu addr 0x000000010000010c, cpu addr 0xff
Sep 07 10:28:44 alaska kernel: [AVFS] Something is broken. See log!VDDCI is larger than max VDDCI in VDDCI Voltage Table!
Sep 07 10:28:44 alaska kernel: [drm] ring test on 0 succeeded in 11 usecs
Sep 07 10:28:44 alaska kernel: [drm] ring test on 1 succeeded in 26 usecs
Sep 07 10:28:44 alaska kernel: [drm] ring test on 2 succeeded in 17 usecs
Sep 07 10:28:44 alaska kernel: [drm] ring test on 3 succeeded in 4 usecs
Sep 07 10:28:44 alaska kernel: [drm] ring test on 4 succeeded in 2 usecs
Sep 07 10:28:44 alaska kernel: [drm] ring test on 5 succeeded in 2 usecs
Sep 07 10:28:44 alaska kernel: [drm] ring test on 6 succeeded in 3 usecs
Sep 07 10:28:44 alaska kernel: [drm] ring test on 7 succeeded in 3 usecs
Sep 07 10:28:44 alaska kernel: [drm] ring test on 8 succeeded in 2 usecs
Sep 07 10:28:44 alaska kernel: [drm] ring test on 9 succeeded in 5 usecs
Sep 07 10:28:44 alaska kernel: [drm] ring test on 10 succeeded in 5 usecs
Sep 07 10:28:44 alaska kernel: [drm] ring test on 11 succeeded in 1 usecs
Sep 07 10:28:44 alaska kernel: [drm] UVD initialized successfully.
Sep 07 10:28:44 alaska kernel: [drm] ring test on 12 succeeded in 10 usecs
Sep 07 10:28:44 alaska kernel: [drm] ring test on 13 succeeded in 5 usecs
Sep 07 10:28:44 alaska kernel: [drm] VCE initialized successfully.
Sep 07 10:28:44 alaska kernel: [drm] fb mappable at 0xE0AA6000
Sep 07 10:28:44 alaska kernel: [drm] vram apper at 0xE0000000
Sep 07 10:28:44 alaska kernel: [drm] size 8294400
Sep 07 10:28:44 alaska kernel: [drm] fb depth is 24
Sep 07 10:28:44 alaska kernel: [drm]    pitch is 7680
Sep 07 10:28:44 alaska kernel: fbcon: amdgpudrmfb (fb0) is primary device
Sep 07 10:28:44 alaska kernel: tsc: Refined TSC clocksource calibration: 3399.996 MHz
Sep 07 10:28:44 alaska kernel: clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x31024b3bec5, max_idle_ns: 44079536
Sep 07 10:28:44 alaska kernel: Console: switching to colour frame buffer device 240x67
Sep 07 10:28:44 alaska kernel: amdgpu 0000:03:00.0: fb0: amdgpudrmfb frame buffer device
Sep 07 10:28:44 alaska kernel: [drm] ib test on ring 0 succeeded in 0 usecs
Sep 07 10:28:44 alaska kernel: [drm] ib test on ring 1 succeeded in 0 usecs
Sep 07 10:28:44 alaska kernel: [drm] ib test on ring 2 succeeded in 0 usecs
Sep 07 10:28:44 alaska kernel: [drm] ib test on ring 3 succeeded in 0 usecs
Sep 07 10:28:44 alaska kernel: [drm] ib test on ring 4 succeeded in 0 usecs
Sep 07 10:28:44 alaska kernel: [drm] ib test on ring 5 succeeded in 0 usecs
Sep 07 10:28:44 alaska kernel: [drm] ib test on ring 6 succeeded in 0 usecs
Sep 07 10:28:44 alaska kernel: [drm] ib test on ring 7 succeeded in 0 usecs
Sep 07 10:28:44 alaska kernel: [drm] ib test on ring 8 succeeded in 0 usecs
Sep 07 10:28:44 alaska kernel: [drm] ib test on ring 9 succeeded in 0 usecs
Sep 07 10:28:44 alaska kernel: [drm] ib test on ring 10 succeeded in 0 usecs
Sep 07 10:28:44 alaska kernel: [drm] ib test on ring 11 succeeded
Sep 07 10:28:44 alaska kernel: [drm] ib test on ring 12 succeeded
Sep 07 10:28:44 alaska kernel: [drm] Initialized amdgpu 3.2.0 20150101 for 0000:03:00.0 on minor 0

All in all this works, but at some point my screen just turns black. The computer keeps running and is still accessible via ssh, but the display doesn't come back on until I restart.
I haven't been able to reproduce it on demand so far. After some time (one to a couple of days) the following happens:

Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0018(Rec
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0:   device [8086:6f08] error status/mask=00000001/00002000
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0:    [ 0] Receiver Error         (First)
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: can't find device of ID0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Uncorrected (Non-Fatal) error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction L
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0:   device [8086:6f08] error status/mask=00004020/00000000
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0:    [ 5] Surprise Down Error
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0:    [14] Completion Timeout     (First)
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: broadcast error_detected message
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Device recovery failed
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: AER: Multiple Uncorrected (Fatal) error received: id=0018
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0:   device [8086:6f08] error status/mask=00004020/00000000
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0:    [ 5] Surprise Down Error
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0:    [14] Completion Timeout     (First)
Sep 06 07:59:15 alaska kernel: pcieport 0000:00:03.0: broadcast error_detected message
Sep 06 07:59:15 alaska kernel: amdgpu 0000:03:00.0: device has no AER-aware driver
Sep 06 07:59:15 alaska kernel: snd_hda_intel 0000:03:00.1: device has no AER-aware driver
Sep 06 07:59:16 alaska kernel: pcieport 0000:00:03.0: Root Port link has been reset
Sep 06 07:59:16 alaska kernel: pcieport 0000:00:03.0: AER: Device recovery failed

If I turn off kernel AER with noaer=1, the messages go away, but the issue stays the same.

So I would be glad to get some ideas, or I'll have to wait for the 4.8 kernel...
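
The AER flood above is easier to compare between incidents once it is collapsed by severity. Below is a small sketch that counts the "error received" lines per severity class, assuming the kernel log has been saved as plain text in the format shown in this post. The function name `summarize_aer` is made up for illustration.

```python
import re
from collections import Counter

# Matches e.g. "AER: Multiple Corrected error received" and
# "AER: Uncorrected (Non-Fatal) error received".
AER_RE = re.compile(
    r"AER: (?:Multiple )?(Corrected|Uncorrected \((?:Non-Fatal|Fatal)\)) error received"
)


def summarize_aer(lines):
    """Count AER 'error received' log lines per severity class."""
    counts = Counter()
    for line in lines:
        match = AER_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: pipe saved kernel log text into the script on stdin.
    for severity, count in summarize_aer(sys.stdin).most_common():
        print(f"{count:6d}  {severity}")
```

This only summarizes what the log already says. It does not explain why the link goes down.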

Last edited by jnko (2016-09-07 09:20:00)
