#1 2019-06-28 21:44:36

Zorbik
Member
Registered: 2016-08-09
Posts: 42

[SOLVED] AMD GPU graphics freezing without nomodeset

System specs:
CPU: AMD Ryzen 7 2700x
GPU: AMD Vega 64
Kernel:

Linux archiso 5.1.15-arch1-1-ARCH

Drivers:

vulkan-radeon 19.1.1-1
mesa 19.1.1-1
xf86-video-amdgpu 19.0.1-1

Kernel boot cmdline:

\vmlinuz-linux root=UUID=e80d7e54-fa27-4426-ab6d-621dcd80da04 rw add_efi_memmap initrd=amd-ucode.img loglevel=7 rd.udev.log_priority=7 vt.global_cursor_default=1 initrd=initramfs-linux.img

Recently updated packages: June 23rd was the last update before the system stopped booting properly. The June 28th upgrades were my attempted troubleshooting.

cat /var/log/pacman.log

produces (grepped for the important date(s)): https://gist.github.com/ryanseipp/8467e … 54c6a72ca4
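For anyone who wants to pull the same kind of extract from their own log, here is a minimal sketch. The helper name pacman_log_on is made up for illustration, and it assumes every pacman.log entry begins with a bracketed timestamp such as "[2019-06-23 ...".

```shell
# Print pacman.log entries whose timestamp starts with one of the given dates.
# Usage: pacman_log_on LOGFILE DATE [DATE...]   (dates as YYYY-MM-DD)
# pacman_log_on is a hypothetical helper name, not part of pacman.
# The body runs in a subshell "( ... )" so its variables stay local.
pacman_log_on() (
    logfile=$1
    shift
    for day in "$@"; do
        # "^\[" anchors the match to the opening bracket of the timestamp.
        grep -- "^\[$day" "$logfile"
    done
)
```

Called as `pacman_log_on /var/log/pacman.log 2019-06-23 2019-06-28`, it prints the entries for the first date followed by those for the second.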

Journalctl log of the boot process in question: https://gist.github.com/ryanseipp/93039 … 59c85d201c


lspci -nnvv
0c:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 XL/XT [Radeon RX Vega 56/64] [1002:687f] (rev c1) (prog-if 00 [VGA controller])
	Subsystem: Gigabyte Technology Co., Ltd Vega 10 XL/XT [Radeon RX Vega 56/64] [1458:2308]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 11
	Region 0: Memory at e0000000 (64-bit, prefetchable) [size=256M]
	Region 2: Memory at f0000000 (64-bit, prefetchable) [size=2M]
	Region 4: I/O ports at e000 [size=256]
	Region 5: Memory at fd400000 (32-bit, non-prefetchable) [size=512K]
	Expansion ROM at 000c0000 [disabled] [size=128K]
	Capabilities: [48] Vendor Specific Information: Len=08 <?>
	Capabilities: [50] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [64] Express (v2) Legacy Endpoint, MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	CorrErr- NonFatalErr- FatalErr- UnsupReq-
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 256 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 8GT/s (ok), Width x16 (ok)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR+, OBFF Not Supported
			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
			 AtomicOpsCtl: ReqEn-
		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+
			 EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [150 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap:	First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
	Capabilities: [200 v1] Resizable BAR <?>
	Capabilities: [270 v1] Secondary PCI Express <?>
	Capabilities: [2a0 v1] Access Control Services
		ACSCap:	SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
		ACSCtl:	SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
	Capabilities: [2b0 v1] Address Translation Service (ATS)
		ATSCap:	Invalidate Queue Depth: 00
		ATSCtl:	Enable+, Smallest Translation Unit: 00
	Capabilities: [2c0 v1] Page Request Interface (PRI)
		PRICtl: Enable- Reset-
		PRISta: RF- UPRGI- Stopped+
		Page Request Capacity: 00000020, Page Request Allocation: 00000000
	Capabilities: [2d0 v1] Process Address Space ID (PASID)
		PASIDCap: Exec+ Priv+, Max PASID Width: 10
		PASIDCtl: Enable- Exec- Priv-
	Capabilities: [320 v1] Latency Tolerance Reporting
		Max snoop latency: 0ns
		Max no snoop latency: 0ns

I have a Gigabyte X470 Aorus Gaming WiFi motherboard (BIOS ID: 8A06BG0W, which hasn't been updated in just over a year). rEFInd is my bootloader, and it normally boots with the quiet and splash kernel parameters into lightdm with the lightdm-webkit2-greeter. The first error message in the journalctl -b -1 log comes from lightdm. However, I see the same freeze when booting the Arch Linux live USB as well, and I had to set the nomodeset kernel parameter just to gather the info for this post. Because of this, I'm concerned that my graphics card is actually failing, which would be odd because the PC isn't on for very long at a time and I've only had this GPU since around last August. I'd be very grateful if anyone has advice or knows more about possible issues here. Also, please let me know if I missed any info that would be useful. Thanks.
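When comparing boots with and without nomodeset, it can help to confirm which one is actually running. Below is a minimal sketch that checks the kernel command line for an exact parameter. The helper name has_kernel_param is an assumption for illustration; /proc/cmdline is the standard place the running kernel exposes its boot parameters.

```shell
# Succeed if PARAM appears as a whole word on the kernel command line.
# Usage: has_kernel_param PARAM [CMDLINE_FILE]
# has_kernel_param is a hypothetical helper; CMDLINE_FILE defaults to
# /proc/cmdline and is only overridable to make the function easy to try out.
has_kernel_param() (
    param=$1
    file=${2:-/proc/cmdline}
    set -f   # disable globbing so words on the command line are not expanded
    # Split on whitespace and compare each word exactly, so "nomodeset"
    # does not match a longer word that merely contains it.
    for word in $(cat "$file"); do
        [ "$word" = "$param" ] && exit 0
    done
    exit 1
)

if has_kernel_param nomodeset; then
    echo "nomodeset is set: kernel mode setting is disabled for this boot"
fi
```

The exact-word comparison only handles flag-style parameters like nomodeset; a key=value parameter would need the full "key=value" string passed in.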

EDIT: I was able to put the graphics card into a Windows PC (a fully compatible build for this card), and it appears that the DisplayPort controller on the Vega 64 is dead: I only get video output over HDMI. This is odd, however, because I was still able to use the DisplayPort cable with nomodeset and get output after booting. This CPU has no integrated graphics, so the GPU must be doing some work. Could there be some issue with driving full-featured video output through the DisplayPort controller that would cause it to fail outright while the monitor still sees a valid connection? I will test in the Arch PC with the HDMI cable soon.

EDIT 2: The Vega 64 is back in the Arch PC. Tested using HDMI and things work perfectly. I then replugged the DisplayPort cable and nothing worked. However, I remembered that the DisplayPort cable I was using went through a KVM. A direct link from the Vega 64 to my monitor works like a charm, with no issues whatsoever as far as I can tell. I unplugged my KVM and its LEDs were flickering erratically; after unplugging the rest of the cables from the KVM, the LEDs stopped flickering. I guess my (also less than a year old) Kinivo 301BN HDMI KVM went kaput, or the cable that was connected to it did. There is more testing to do, but the root issue of this post has been solved, so I will mark the title as such.
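For anyone chasing a similar cable or KVM problem, a quick way to see what the kernel thinks is plugged in is to read the DRM connector status files in sysfs. This is a minimal sketch: the helper name list_drm_connectors is made up, and connector names such as card0-DP-1 depend on the card and driver. It only shows connectors when a modesetting driver is loaded, so it will likely print nothing on a nomodeset boot.

```shell
# Print "<connector>: <status>" for every DRM connector found in sysfs.
# Usage: list_drm_connectors [DRM_DIR]
# list_drm_connectors is a hypothetical helper; DRM_DIR defaults to
# /sys/class/drm, where each connector directory holds a "status" file
# containing e.g. "connected" or "disconnected".
list_drm_connectors() (
    drm_dir=${1:-/sys/class/drm}
    for status_file in "$drm_dir"/card*-*/status; do
        # If the glob matched nothing it is left unexpanded; skip it.
        [ -f "$status_file" ] || continue
        connector=$(basename "$(dirname "$status_file")")
        printf '%s: %s\n' "$connector" "$(cat "$status_file")"
    done
)

list_drm_connectors
```

Running it once with the cable through the KVM and once with a direct link makes it easy to compare whether the DisplayPort connector is reported as connected in each case.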

Last edited by Zorbik (2019-06-28 22:14:18)
