#1 2019-08-07 14:05:40

kindaro
Member
Registered: 2017-01-16
Posts: 19

Transient DRM or modesetting issue with Ryzen 7 Pro.

So I have this new laptop. It sometimes boots well, but more often, some time after `systemd`
starts but before the switch to high resolution, the screen turns dark (though still
backlit). After several attempts it boots well again; it may take upwards of three
attempts. I have not yet discovered any external factor that decides whether it boots well or not.

What I did is cross-check the journal. You see, two boots of the same configuration should have mostly identical journals,
but that appears not to be the case here.

This is a representative good boot:

% journalctl --output=short-unix --boot=0 | grep -F '[drm]' | cut -d ' ' -f 3-
kernel: [drm] amdgpu kernel modesetting enabled.
kernel: [drm] initializing kernel modesetting (RAVEN 0x1002:0x15DD 0x103C:0x83D5 0xD0).
kernel: [drm] register mmio base: 0xE0700000
kernel: [drm] register mmio size: 524288
kernel: [drm] add ip block number 0 <soc15_common>
kernel: [drm] add ip block number 1 <gmc_v9_0>
kernel: [drm] add ip block number 2 <vega10_ih>
kernel: [drm] add ip block number 3 <psp>
kernel: [drm] add ip block number 4 <gfx_v9_0>
kernel: [drm] add ip block number 5 <sdma_v4_0>
kernel: [drm] add ip block number 6 <powerplay>
kernel: [drm] add ip block number 7 <dm>
kernel: [drm] add ip block number 8 <vcn_v1_0>
kernel: [drm] VCN decode is enabled in VM mode
kernel: [drm] VCN encode is enabled in VM mode
kernel: [drm] VCN jpeg decode is enabled in VM mode
kernel: [drm] BIOS signature incorrect 0 0
kernel: [drm] RAS INFO: ras initialized successfully, hardware ability[0] ras_mask[0]
kernel: [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
kernel: [drm] Detected VRAM RAM=1024M, BAR=1024M
kernel: [drm] RAM width 64bits DDR4
kernel: [drm] amdgpu: 1024M of VRAM memory ready
kernel: [drm] amdgpu: 3072M of GTT memory ready.
kernel: [drm] GART: num cpu pages 262144, num gpu pages 262144
kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000F400900000).
kernel: [drm] use_doorbell being set to: [true]
kernel: [drm] Found VCN firmware Version ENC: 1.9 DEC: 1 VEP: 0 Revision: 28
kernel: [drm] PSP loading VCN firmware
kernel: [drm] reserve 0x400000 from 0xf400c00000 for PSP TMR SIZE
kernel: [drm] DM_PPLIB: values for F clock
kernel: [drm] DM_PPLIB:         400000 in kHz
kernel: [drm] DM_PPLIB:         933000 in kHz
kernel: [drm] DM_PPLIB:         1067000 in kHz
kernel: [drm] DM_PPLIB:         1200000 in kHz
kernel: [drm] DM_PPLIB: values for DCF clock
kernel: [drm] DM_PPLIB:         300000 in kHz
kernel: [drm] DM_PPLIB:         600000 in kHz
kernel: [drm] DM_PPLIB:         626000 in kHz
kernel: [drm] DM_PPLIB:         654000 in kHz
kernel: [drm] Display Core initialized with v3.2.27!
kernel: [drm] SADs count is: -2, don't need to read it
kernel: [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
kernel: [drm] Driver supports precise vblank timestamp query.
kernel: [drm] VCN decode and encode initialized successfully(under SPG Mode).
kernel: [drm] fb mappable at 0x91000000
kernel: [drm] vram apper at 0x90000000
kernel: [drm] size 8294400
kernel: [drm] fb depth is 24
kernel: [drm]    pitch is 7680
kernel: [drm] Initialized amdgpu 3.32.0 20150101 for 0000:04:00.0 on minor 0

A representative failing boot:

% journalctl --output=short-unix --boot=-2 | grep -F '[drm]' | cut -d ' ' -f 3-
kernel: [drm] amdgpu kernel modesetting enabled.
kernel: [drm] initializing kernel modesetting (RAVEN 0x1002:0x15DD 0x103C:0x83D5 0xD0).
kernel: [drm] register mmio base: 0xE0700000
kernel: [drm] register mmio size: 524288
kernel: [drm] add ip block number 0 <soc15_common>
kernel: [drm] add ip block number 1 <gmc_v9_0>
kernel: [drm] add ip block number 2 <vega10_ih>
kernel: [drm] add ip block number 3 <psp>
kernel: [drm] add ip block number 4 <gfx_v9_0>
kernel: [drm] add ip block number 5 <sdma_v4_0>
kernel: [drm] add ip block number 6 <powerplay>
kernel: [drm] add ip block number 7 <dm>
kernel: [drm] add ip block number 8 <vcn_v1_0>
kernel: [drm] VCN decode is enabled in VM mode
kernel: [drm] VCN encode is enabled in VM mode
kernel: [drm] VCN jpeg decode is enabled in VM mode
kernel: [drm] BIOS signature incorrect 0 0
kernel: [drm] BIOS signature incorrect 0 0
kernel: [drm] RAS INFO: ras initialized successfully, hardware ability[0] ras_mask[0]
kernel: [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
kernel: [drm] Detected VRAM RAM=1024M, BAR=1024M
kernel: [drm] RAM width 128bits DDR4
kernel: [drm] amdgpu: 1024M of VRAM memory ready
kernel: [drm] amdgpu: 3072M of GTT memory ready.
kernel: [drm] GART: num cpu pages 262144, num gpu pages 262144
kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000F400900000).
kernel: [drm] use_doorbell being set to: [true]
kernel: [drm] Found VCN firmware Version ENC: 1.9 DEC: 1 VEP: 0 Revision: 28
kernel: [drm] PSP loading VCN firmware
kernel: [drm] reserve 0x400000 from 0xf400c00000 for PSP TMR SIZE
kernel: [drm] DM_PPLIB: values for F clock
kernel: [drm] DM_PPLIB:         400000 in kHz
kernel: [drm] DM_PPLIB:         933000 in kHz
kernel: [drm] DM_PPLIB:         1067000 in kHz
kernel: [drm] DM_PPLIB:         1200000 in kHz
kernel: [drm] DM_PPLIB: values for DCF clock
kernel: [drm] DM_PPLIB:         300000 in kHz
kernel: [drm] DM_PPLIB:         600000 in kHz
kernel: [drm] DM_PPLIB:         626000 in kHz
kernel: [drm] DM_PPLIB:         654000 in kHz
kernel: [drm] Unsupported Connector type:21!
kernel: [drm] Unsupported Connector type:21!
kernel: [drm] Unsupported Connector type:21!
kernel: [drm] Unsupported Connector type:21!
kernel: [drm] Display Core initialized with v3.2.27!
kernel: [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
kernel: [drm] Driver supports precise vblank timestamp query.
kernel: [drm] VCN decode and encode initialized successfully(under SPG Mode).
kernel: [drm] Initialized amdgpu 3.32.0 20150101 for 0000:04:00.0 on minor 0

This is the diff:

% diff -u /tmp/journal.{02,00}
--- /tmp/journal.02	2019-08-07 16:03:21.656343950 +0300
+++ /tmp/journal.00	2019-08-07 16:03:21.616343950 +0300
@@ -15,11 +15,10 @@
 kernel: [drm] VCN encode is enabled in VM mode
 kernel: [drm] VCN jpeg decode is enabled in VM mode
 kernel: [drm] BIOS signature incorrect 0 0
-kernel: [drm] BIOS signature incorrect 0 0
 kernel: [drm] RAS INFO: ras initialized successfully, hardware ability[0] ras_mask[0]
 kernel: [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
 kernel: [drm] Detected VRAM RAM=1024M, BAR=1024M
-kernel: [drm] RAM width 128bits DDR4
+kernel: [drm] RAM width 64bits DDR4
 kernel: [drm] amdgpu: 1024M of VRAM memory ready
 kernel: [drm] amdgpu: 3072M of GTT memory ready.
 kernel: [drm] GART: num cpu pages 262144, num gpu pages 262144
@@ -38,12 +37,14 @@
 kernel: [drm] DM_PPLIB:         600000 in kHz
 kernel: [drm] DM_PPLIB:         626000 in kHz
 kernel: [drm] DM_PPLIB:         654000 in kHz
-kernel: [drm] Unsupported Connector type:21!
-kernel: [drm] Unsupported Connector type:21!
-kernel: [drm] Unsupported Connector type:21!
-kernel: [drm] Unsupported Connector type:21!
 kernel: [drm] Display Core initialized with v3.2.27!
+kernel: [drm] SADs count is: -2, don't need to read it
 kernel: [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
 kernel: [drm] Driver supports precise vblank timestamp query.
 kernel: [drm] VCN decode and encode initialized successfully(under SPG Mode).
+kernel: [drm] fb mappable at 0x91000000
+kernel: [drm] vram apper at 0x90000000
+kernel: [drm] size 8294400
+kernel: [drm] fb depth is 24
+kernel: [drm]    pitch is 7680
 kernel: [drm] Initialized amdgpu 3.32.0 20150101 for 0000:04:00.0 on minor 0
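
For anyone wanting to reproduce the comparison, here is a minimal sketch of how the two files for the diff could be produced, using the same filter as in the listings above. The function names `drm_lines` and `save_boot` are made up for illustration, and the file names simply follow the diff above.

```shell
# Keep only the [drm] kernel lines and drop the timestamp and hostname
# columns, so that two boots can be compared line by line.
drm_lines() {
    grep -F '[drm]' | cut -d ' ' -f 3-
}

# Hypothetical helper: write the filtered journal of one boot to a file.
# $1 is a boot offset as understood by `journalctl --boot`,
# $2 is the path of the output file.
save_boot() {
    journalctl --output=short-unix --boot="$1" | drm_lines > "$2"
}
```

With these defined, `save_boot -2 /tmp/journal.02` and `save_boot 0 /tmp/journal.00` followed by `diff -u /tmp/journal.{02,00}` should give a diff of the kind shown above.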

I checked, and indeed all the successful boots so far reported a RAM width of 64 bits, while all the
failing boots reported 128. I do not know how to proceed from here.
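
The same check can be repeated over every recorded boot instead of reading each journal by hand. This is a rough sketch, assuming the journal is persistent so that `journalctl --list-boots` shows earlier boots; `ram_width` and `ram_width_by_boot` are hypothetical helper names.

```shell
# Extract the RAM width (for example "64bits") from journal text on stdin.
# Only the first matching line is reported.
ram_width() {
    sed -n 's/.*\[drm\] RAM width \([0-9]*bits\).*/\1/p' | head -n 1
}

# Print one line per recorded boot: the boot offset and the RAM width
# that amdgpu reported during that boot ("unknown" if no such line exists).
ram_width_by_boot() {
    # Keep only numeric offsets, so a header line (if any) is skipped.
    journalctl --list-boots | awk '$1 ~ /^-?[0-9]+$/ { print $1 }' |
    while read -r boot; do
        width=$(journalctl --output=short-unix --boot="$boot" | ram_width)
        printf '%s\t%s\n' "$boot" "${width:-unknown}"
    done
}
```

Calling `ram_width_by_boot` prints the offset and the width for each boot. It does not know which boots actually failed, so that part still has to be matched by hand.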

I have the most recent Arch kernel, 5.2.6-arch1-1-ARCH. I even tried building a kernel from
amd-staging-drm-next, but it changed nothing.

Here is the graphics controller:

% sudo lspci -vv -xxx -nn -s 04:00.0
04:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series] [1002:15dd] (rev d0) (prog-if 00 [VGA controller])
	DeviceName: Onboard IGD
	Subsystem: Hewlett-Packard Company Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series] [103c:83d5]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort+ <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 77
	Region 0: Memory at d0000000 (64-bit, prefetchable) [size=256M]
	Region 2: Memory at e0000000 (64-bit, prefetchable) [size=2M]
	Region 4: I/O ports at 2000 [size=256]
	Region 5: Memory at e0700000 (32-bit, non-prefetchable) [size=512K]
	[virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
	Capabilities: [48] Vendor Specific Information: Len=08 <?>
	Capabilities: [50] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [64] Express (v2) Legacy Endpoint, MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
		DevCtl:	CorrErr- NonFatalErr- FatalErr- UnsupReq-
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 8GT/s (ok), Width x16 (ok)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR+, OBFF Not Supported
			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
			 AtomicOpsCtl: ReqEn+
		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+
			 EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
	Capabilities: [a0] MSI: Enable+ Count=1/4 Maskable- 64bit+
		Address: 00000000fee00000  Data: 0000
	Capabilities: [c0] MSI-X: Enable- Count=3 Masked-
		Vector table: BAR=5 offset=00042000
		PBA: BAR=5 offset=00043000
	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [200 v1] Resizable BAR <?>
	Capabilities: [270 v1] Secondary PCI Express <?>
	Capabilities: [2a0 v1] Access Control Services
		ACSCap:	SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
		ACSCtl:	SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
	Capabilities: [2b0 v1] Address Translation Service (ATS)
		ATSCap:	Invalidate Queue Depth: 00
		ATSCtl:	Enable+, Smallest Translation Unit: 00
	Capabilities: [2c0 v1] Page Request Interface (PRI)
		PRICtl: Enable+ Reset-
		PRISta: RF- UPRGI- Stopped+
		Page Request Capacity: 00000020, Page Request Allocation: 00000020
	Capabilities: [2d0 v1] Process Address Space ID (PASID)
		PASIDCap: Exec+ Priv+, Max PASID Width: 10
		PASIDCtl: Enable+ Exec- Priv-
	Capabilities: [320 v1] Latency Tolerance Reporting
		Max snoop latency: 0ns
		Max no snoop latency: 0ns
	Kernel driver in use: amdgpu
	Kernel modules: amdgpu
00: 02 10 dd 15 07 04 10 08 d0 00 00 03 00 00 80 00
10: 0c 00 00 d0 00 00 00 00 0c 00 00 e0 00 00 00 00
20: 01 20 00 00 00 00 70 e0 00 00 00 00 3c 10 d5 83
30: 00 00 00 00 48 00 00 00 00 00 00 00 ff 01 00 00
40: 00 00 00 00 00 00 00 00 09 50 08 00 3c 10 d5 83
50: 01 64 03 f0 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 10 a0 12 00 a1 8f 00 10 10 29 00 00
70: 03 0d 40 00 40 00 03 11 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 08 70 00 40 00 00 00
90: 0e 00 00 00 03 00 1f 00 00 00 00 00 00 00 00 00
a0: 05 c0 85 00 00 00 e0 fe 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 11 00 02 00 05 20 04 00 05 30 04 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Last edited by kindaro (2019-08-07 16:37:18)

#2 2019-08-08 04:37:33

gnox
Member
Registered: 2013-05-18
Posts: 81

Re: Transient DRM or modesetting issue with Ryzen 7 Pro.

Are you using the amd-ucode (microcode update)?

#3 2019-08-08 09:19:00

kindaro
Member
Registered: 2017-01-16
Posts: 19

Re: Transient DRM or modesetting issue with Ryzen 7 Pro.

Last time I checked, there was no microcode update available for my processor. Anyway, I just tried adding the amd-ucode image to the boot process, and it had no noticeable effect.
