Hi,
Right after I log in to the graphical session, the fan starts making a lot of noise.
After some troubleshooting I noticed that the amdgpu driver is not working properly:
lspci -v
02:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Lexa PRO [Radeon RX 550/550X] (rev c0)
Subsystem: Lenovo Lexa PRO [Radeon RX 550/550X]
Flags: fast devsel, IRQ 16
Memory at e0000000 (64-bit, prefetchable) [size=256M]
Memory at f0000000 (64-bit, prefetchable) [size=2M]
I/O ports at d000 [size=256]
Memory at f2400000 (32-bit, non-prefetchable) [size=256K]
Expansion ROM at f2440000 [disabled] [size=128K]
Capabilities: [48] Vendor Specific Information: Len=08 <?>
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Legacy Endpoint, MSI 00
Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
Capabilities: [270] Secondary PCI Express <?>
Capabilities: [320] Latency Tolerance Reporting
Capabilities: [370] L1 PM Substates
Kernel modules: amdgpu
There is no "Kernel driver in use" entry for this device,
although the module is loaded:
lsmod |grep amdgpu
amdgpu 3952640 0
chash 16384 1 amdgpu
gpu_sched 36864 1 amdgpu
amd_iommu_v2 20480 1 amdgpu
ttm 114688 1 amdgpu
i2c_algo_bit 16384 2 amdgpu,i915
drm_kms_helper 212992 2 amdgpu,i915
drm 495616 12 gpu_sched,drm_kms_helper,amdgpu,i915,ttm
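The same "is a driver actually bound?" check can be made without lspci by looking at sysfs. A minimal sketch, assuming the usual layout where a bound device has a `driver` symlink under /sys/bus/pci/devices; the function name `bound_driver` is made up for illustration:

```shell
# Print the kernel driver bound to a PCI device, or "none" if nothing is bound.
# $1: PCI address including the domain, e.g. 0000:02:00.0
# $2: sysfs device directory (defaults to the real one; overridable for testing)
bound_driver() {
    dev="$1"
    root="${2:-/sys/bus/pci/devices}"
    if [ -L "$root/$dev/driver" ]; then
        # "driver" is a symlink pointing at the driver's own sysfs directory.
        basename "$(readlink "$root/$dev/driver")"
    else
        echo "none"
    fi
}
```

On the machine described here, `bound_driver 0000:02:00.0` would be expected to print `none` while the module is loaded but initialization has failed.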
interesting dmesg portion:
[ 14.511270] amdgpu: [powerplay]
failed to send message 254 ret is 0
[ 14.652372] input: TPPS/2 Elan TrackPoint as /devices/platform/i8042/serio1/serio2/input/input19
[ 15.057656] amdgpu: [powerplay] SMU load firmware failed
[ 15.057658] amdgpu: [powerplay] fw load failed
[ 15.057659] firmware loading failed
[ 15.057662] amdgpu 0000:02:00.0: amdgpu_device_ip_init failed
[ 15.057664] amdgpu 0000:02:00.0: Fatal error during GPU init
[ 15.057667] [drm] amdgpu: finishing device.
A stack trace is printed just below it.
Full dmesg output: https://pastebin.com/bpiuRsxn
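To pull only the relevant lines out of a long kernel log, something like the following can help. It is a rough sketch: the pattern is an assumption about which keywords matter here, and the function name `amdgpu_log_lines` is hypothetical.

```shell
# Keep only kernel log lines that mention the AMD driver, its power-management
# code, or firmware loading. Reads log text on stdin, so it works with live
# dmesg output as well as a saved file.
amdgpu_log_lines() {
    grep -iE 'amdgpu|powerplay|firmware'
}
```

Typical use would be `dmesg | amdgpu_log_lines` (reading the kernel log may require root on some systems).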
More information:
Laptop: Lenovo ThinkPad E580
BIOS firmware is up to date.
The problem is not new: it started with Fedora 29 (kernel 4.18.16) and still persists. Fedora 28 (kernels 4.16 and 4.17) didn't have this problem, and live images of various distributions with kernel 4.14 fail to boot on my machine. After switching to Arch, the problem persisted through all kernels from 4.19 until now (5.1.5), although I am seeing fewer error messages and a faster boot since kernel 5.1. In particular, this message:
[ 14.511270] amdgpu: [powerplay]
failed to send message 254 ret is 0
used to appear with many different message codes; currently there is only one.
I don't mind switching the discrete card off, but I haven't figured out a way to do it yet; there is no option in the BIOS setup to disable it.
I have tried this: https://wiki.archlinux.org/index.php/AT … e_Graphics but /sys/kernel/debug/vgaswitcheroo/switch does not exist,
and this: https://wiki.archlinux.org/index.php/Hy … _acpi_call , where all of them return "failed".
Offline
The dmesg output suggests you may have low-level hardware/driver issues.
[ 13.707995] ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [TBF3] at bit offset/length 262144/32768 exceeds size of target Buffer (262144 bits) (20190215/dsopcode-203)
[ 13.707999] ACPI Error: Aborting method \_SB.PCI0.GFX0.GETB due to previous error (AE_AML_BUFFER_LIMIT) (20190215/psparse-531)
[ 13.708002] ACPI Error: Aborting method \_SB.PCI0.GFX0.ATRM due to previous error (AE_AML_BUFFER_LIMIT) (20190215/psparse-531)
[ 13.708006] failed to evaluate ATRM got AE_AML_BUFFER_LIMIT
Usually ACPI errors are harmless, but these have to do with graphics.
[ 34.227543] pcieport 0000:00:1d.2: AER: Corrected error received: 0000:00:1d.2
[ 34.227566] pcieport 0000:00:1d.2: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 34.227575] pcieport 0000:00:1d.2: device [8086:9d1a] error status/mask=00000001/00002000
[ 34.227583] pcieport 0000:00:1d.2: [ 0] RxErr
Although these are corrected, they do point to problems in the setup.
8086 is the vendor ID used by Intel, but which device is this?
Please post
lspci -nnk
so we can determine that.
Which card do you have set up as primary GPU in firmware, Intel or AMD?
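For reference, matching a vendor:device pair from an error message against the lspci listing can be done mechanically. A small sketch, with `find_by_id` as a made-up helper name; it only filters text, so it works on pasted output too:

```shell
# Print the lines of `lspci -nn` (or -nnk) output whose [vendor:device] tag
# matches the given ID.
# $1: ID as it appears in the kernel log, e.g. 8086:9d1a
find_by_id() {
    # -F: treat the brackets literally; -i: hex digits may differ in case.
    grep -iF "[$1]"
}
```

Usage would be `lspci -nn | find_by_id 8086:9d1a`. On a live system, `lspci -nn -d 8086:9d1a` should give the same answer directly.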
Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.
clean chroot building not flexible enough ?
Try clean chroot manager by graysky
Offline
Hi Lone_Wolf,
Thanks for the reply, here is the output:
lspci -nnk
00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers [8086:5914] (rev 08)
Subsystem: Lenovo Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers [17aa:5068]
Kernel driver in use: skl_uncore
00:02.0 VGA compatible controller [0300]: Intel Corporation UHD Graphics 620 [8086:5917] (rev 07)
Subsystem: Lenovo UHD Graphics 620 [17aa:5069]
Kernel driver in use: i915
Kernel modules: i915
00:08.0 System peripheral [0880]: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th Gen Core Processor Gaussian Mixture Model [8086:1911]
Subsystem: Lenovo Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th Gen Core Processor Gaussian Mixture Model [17aa:5068]
00:14.0 USB controller [0c03]: Intel Corporation Sunrise Point-LP USB 3.0 xHCI Controller [8086:9d2f] (rev 21)
Subsystem: Lenovo Sunrise Point-LP USB 3.0 xHCI Controller [17aa:5068]
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
00:14.2 Signal processing controller [1180]: Intel Corporation Sunrise Point-LP Thermal subsystem [8086:9d31] (rev 21)
Subsystem: Lenovo Sunrise Point-LP Thermal subsystem [17aa:5068]
Kernel driver in use: intel_pch_thermal
Kernel modules: intel_pch_thermal
00:16.0 Communication controller [0780]: Intel Corporation Sunrise Point-LP CSME HECI #1 [8086:9d3a] (rev 21)
Subsystem: Lenovo Sunrise Point-LP CSME HECI [17aa:5068]
Kernel driver in use: mei_me
Kernel modules: mei_me
00:17.0 SATA controller [0106]: Intel Corporation Sunrise Point-LP SATA Controller [AHCI mode] [8086:9d03] (rev 21)
Subsystem: Lenovo Sunrise Point-LP SATA Controller [AHCI mode] [17aa:5068]
Kernel driver in use: ahci
Kernel modules: ahci
00:1c.0 PCI bridge [0604]: Intel Corporation Sunrise Point-LP PCI Express Root Port #1 [8086:9d10] (rev f1)
Kernel driver in use: pcieport
00:1c.4 PCI bridge [0604]: Intel Corporation Sunrise Point-LP PCI Express Root Port #5 [8086:9d14] (rev f1)
Kernel driver in use: pcieport
00:1d.0 PCI bridge [0604]: Intel Corporation Sunrise Point-LP PCI Express Root Port #9 [8086:9d18] (rev f1)
Kernel driver in use: pcieport
00:1d.2 PCI bridge [0604]: Intel Corporation Sunrise Point-LP PCI Express Root Port #11 [8086:9d1a] (rev f1)
Kernel driver in use: pcieport
00:1d.3 PCI bridge [0604]: Intel Corporation Device [8086:9d1b] (rev f1)
Kernel driver in use: pcieport
00:1f.0 ISA bridge [0601]: Intel Corporation Sunrise Point LPC Controller/eSPI Controller [8086:9d4e] (rev 21)
Subsystem: Lenovo Sunrise Point LPC Controller/eSPI Controller [17aa:5068]
00:1f.2 Memory controller [0580]: Intel Corporation Sunrise Point-LP PMC [8086:9d21] (rev 21)
Subsystem: Lenovo Sunrise Point-LP PMC [17aa:5068]
00:1f.3 Audio device [0403]: Intel Corporation Sunrise Point-LP HD Audio [8086:9d71] (rev 21)
Subsystem: Lenovo Sunrise Point-LP HD Audio [17aa:5068]
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel, snd_soc_skl
00:1f.4 SMBus [0c05]: Intel Corporation Sunrise Point-LP SMBus [8086:9d23] (rev 21)
Subsystem: Lenovo Sunrise Point-LP SMBus [17aa:5068]
Kernel driver in use: i801_smbus
Kernel modules: i2c_i801
02:00.0 Display controller [0380]: Advanced Micro Devices, Inc. [AMD/ATI] Lexa PRO [Radeon RX 550/550X] [1002:699f] (rev c0)
Subsystem: Lenovo Lexa PRO [Radeon RX 550/550X] [17aa:5069]
Kernel modules: amdgpu
03:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 10)
Subsystem: Lenovo RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [17aa:5068]
Kernel driver in use: r8169
Kernel modules: r8169
04:00.0 Non-Volatile memory controller [0108]: Toshiba America Info Systems Device [1179:0113] (rev 01)
Subsystem: Toshiba America Info Systems Device [1179:0001]
Kernel driver in use: nvme
05:00.0 Network controller [0280]: Intel Corporation Dual Band Wireless-AC 3165 Plus Bluetooth [8086:3166] (rev 99)
Subsystem: Intel Corporation Dual Band Wireless-AC 3165 Plus Bluetooth [8086:4210]
Kernel driver in use: iwlwifi
Kernel modules: iwlwifi
06:00.0 SD Host controller [0805]: O2 Micro, Inc. SD/MMC Card Reader Controller [1217:8621] (rev 01)
Subsystem: Lenovo SD/MMC Card Reader Controller [17aa:5068]
Kernel driver in use: sdhci-pci
Kernel modules: sdhci_pci
Offline
Which card do you have set up as primary GPU in firmware, Intel or AMD?
Not sure I understand exactly what you mean, but the Intel is the integrated graphics and the AMD is the discrete card, which I don't use. In fact, the laptop display works normally with amdgpu not loaded/blacklisted, except for the loud fan noise.
Offline
In firmware you can set which card is to be used as primary video and which one is secondary.
Often this option is called "select primary card" but there are firmwares that use other terms like "Boot first" .
Default setting is normally to use the integrated gpu as primary.
The lspci output confirms what I expected.
00:1d.2 PCI bridge [0604]: Intel Corporation Sunrise Point-LP PCI Express Root Port #11 [8086:9d1a] (rev f1)
Kernel driver in use: pcieport
Check the lspci -t output; I'd be very surprised if it shows 9d1a connected to something other than the AMD video card.
You have configured Intel microcode updates and are using the latest firmware.
Try booting with pcie_aspm=off as a kernel parameter.
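After rebooting, it is worth confirming that the parameter really reached the kernel before drawing conclusions. A minimal sketch, assuming the running command line is readable from /proc/cmdline; `has_kernel_param` is a hypothetical helper:

```shell
# Succeed if the given parameter appears as a whole entry on the kernel
# command line.
# $1: parameter to look for, e.g. pcie_aspm=off
# $2: command-line file (defaults to /proc/cmdline; overridable for testing)
has_kernel_param() {
    param="$1"
    file="${2:-/proc/cmdline}"
    # Put each space-separated entry on its own line, then require an
    # exact, literal match of one full line.
    tr ' ' '\n' < "$file" | grep -qxF -- "$param"
}
```

For example: `has_kernel_param pcie_aspm=off && echo "parameter is active"`.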
Offline
lspci -tnnv
-[0000:00]-+-00.0 Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers [8086:5914]
+-02.0 Intel Corporation UHD Graphics 620 [8086:5917]
+-08.0 Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th Gen Core Processor Gaussian Mixture Model [8086:1911]
+-14.0 Intel Corporation Sunrise Point-LP USB 3.0 xHCI Controller [8086:9d2f]
+-14.2 Intel Corporation Sunrise Point-LP Thermal subsystem [8086:9d31]
+-16.0 Intel Corporation Sunrise Point-LP CSME HECI #1 [8086:9d3a]
+-17.0 Intel Corporation Sunrise Point-LP SATA Controller [AHCI mode] [8086:9d03]
+-1c.0-[02]----00.0 Advanced Micro Devices, Inc. [AMD/ATI] Lexa PRO [Radeon RX 550/550X] [1002:699f]
+-1c.4-[03]----00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168]
+-1d.0-[04]----00.0 Toshiba America Info Systems Device [1179:0113]
+-1d.2-[05]----00.0 Intel Corporation Dual Band Wireless-AC 3165 Plus Bluetooth [8086:3166]
+-1d.3-[06]----00.0 O2 Micro, Inc. SD/MMC Card Reader Controller [1217:8621]
+-1f.0 Intel Corporation Sunrise Point LPC Controller/eSPI Controller [8086:9d4e]
+-1f.2 Intel Corporation Sunrise Point-LP PMC [8086:9d21]
+-1f.3 Intel Corporation Sunrise Point-LP HD Audio [8086:9d71]
\-1f.4 Intel Corporation Sunrise Point-LP SMBus [8086:9d23]
It seems that 9d1a is connected to the Wi-Fi/Bluetooth adapter (05:00.0) instead.
I didn't notice any change when booting with pcie_aspm=off.
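The bridge-to-device relationship can also be read from sysfs instead of from the tree drawing. A sketch under the assumption that each entry in /sys/bus/pci/devices is a symlink into the physical device hierarchy, so the parent directory is the upstream port; `upstream_port` is a made-up name:

```shell
# Print the PCI address of the bridge/root port a device sits behind.
# $1: PCI address including the domain, e.g. 0000:05:00.0
# $2: sysfs device directory (defaults to the real one; overridable for testing)
upstream_port() {
    dev="$1"
    root="${2:-/sys/bus/pci/devices}"
    # Resolve the symlink to the real path, then take the parent directory name.
    basename "$(dirname "$(readlink -f "$root/$dev")")"
}
```

With the topology shown above, `upstream_port 0000:05:00.0` should print `0000:00:1d.2`. For a device directly on the root bus the result is the root complex entry (for example `pci0000:00`) rather than a bridge.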
Offline
Try booting with amdgpu.dc=0 as kernel parameter.
Are you booting in
UEFI,
UEFI in legacy/CSM mode,
or BIOS?
Offline
I am booting in UEFI mode.
Neither amdgpu.dc=0 nor amdgpu.dpm=0 seems to change anything.
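When a module option appears to have no effect, one thing to rule out is that it was never applied. A small sketch, assuming the option is exported read-only under /sys/module/<module>/parameters (not every parameter is); `module_param` is a hypothetical helper:

```shell
# Print the current value of a loaded module's parameter, or "unavailable"
# if the module is not loaded or does not export that parameter.
# $1: module name, $2: parameter name
# $3: sysfs module directory (defaults to /sys/module; overridable for testing)
module_param() {
    mod="$1"
    name="$2"
    root="${3:-/sys/module}"
    f="$root/$mod/parameters/$name"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "unavailable"
    fi
}
```

For example, `module_param amdgpu dc` and `module_param amdgpu dpm` after booting with the options set.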
Offline
Hi Kal1lov,
I also have a ThinkPad E580 and I have the same problem.
For me the problem appeared after my laptop came back from service following a crash.
I think the service replaced the motherboard, because once it was back home Windows notified me of a problem and asked me to confirm the license (something like that).
Anyway, since then I also have driver problems in Windows: I get Code 43 no matter which driver version I install, old or new.
On the Linux side I have tried:
- Arch
- Manjaro
- openSUSE
- Gentoo
- Debian
I now run Debian with a 4.19 kernel; this is my dmesg output: https://pastebin.com/fHYZqFQ2
and my output for:
└─ $ ▶ lspci -nnk
00:00.0 Host bridge [0600]: Intel Corporation Device [8086:5914] (rev 08)
Subsystem: Lenovo Device [17aa:5068]
Kernel driver in use: skl_uncore
00:02.0 VGA compatible controller [0300]: Intel Corporation Device [8086:5917] (rev 07)
Subsystem: Lenovo Device [17aa:5069]
Kernel driver in use: i915
Kernel modules: i915
00:08.0 System peripheral [0880]: Intel Corporation Skylake Gaussian Mixture Model [8086:1911]
Subsystem: Lenovo Skylake Gaussian Mixture Model [17aa:5068]
00:14.0 USB controller [0c03]: Intel Corporation Sunrise Point-LP USB 3.0 xHCI Controller [8086:9d2f] (rev 21)
Subsystem: Lenovo Sunrise Point-LP USB 3.0 xHCI Controller [17aa:5068]
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
00:14.2 Signal processing controller [1180]: Intel Corporation Sunrise Point-LP Thermal subsystem [8086:9d31] (rev 21)
Subsystem: Lenovo Sunrise Point-LP Thermal subsystem [17aa:5068]
Kernel driver in use: intel_pch_thermal
Kernel modules: intel_pch_thermal
00:16.0 Communication controller [0780]: Intel Corporation Sunrise Point-LP CSME HECI #1 [8086:9d3a] (rev 21)
Subsystem: Lenovo Sunrise Point-LP CSME HECI [17aa:5068]
Kernel driver in use: mei_me
Kernel modules: mei_me
00:1c.0 PCI bridge [0604]: Intel Corporation Device [8086:9d10] (rev f1)
Kernel driver in use: pcieport
00:1c.4 PCI bridge [0604]: Intel Corporation Sunrise Point-LP PCI Express Root Port #5 [8086:9d14] (rev f1)
Kernel driver in use: pcieport
00:1d.0 PCI bridge [0604]: Intel Corporation Sunrise Point-LP PCI Express Root Port #9 [8086:9d18] (rev f1)
Kernel driver in use: pcieport
00:1d.2 PCI bridge [0604]: Intel Corporation Device [8086:9d1a] (rev f1)
Kernel driver in use: pcieport
00:1d.3 PCI bridge [0604]: Intel Corporation Device [8086:9d1b] (rev f1)
Kernel driver in use: pcieport
00:1f.0 ISA bridge [0601]: Intel Corporation Device [8086:9d4e] (rev 21)
Subsystem: Lenovo Device [17aa:5068]
00:1f.2 Memory controller [0580]: Intel Corporation Sunrise Point-LP PMC [8086:9d21] (rev 21)
Subsystem: Lenovo Sunrise Point-LP PMC [17aa:5068]
00:1f.3 Audio device [0403]: Intel Corporation Device [8086:9d71] (rev 21)
Subsystem: Lenovo Device [17aa:5068]
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel, snd_soc_skl
00:1f.4 SMBus [0c05]: Intel Corporation Sunrise Point-LP SMBus [8086:9d23] (rev 21)
Subsystem: Lenovo Sunrise Point-LP SMBus [17aa:5068]
Kernel driver in use: i801_smbus
Kernel modules: i2c_i801
02:00.0 Display controller [0380]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:699f] (rev c0)
Subsystem: Lenovo Device [17aa:5069]
Kernel modules: amdgpu
03:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 10)
Subsystem: Lenovo RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [17aa:5068]
Kernel driver in use: r8169
Kernel modules: r8169
04:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd Device [144d:a804]
Subsystem: Samsung Electronics Co Ltd Device [144d:a801]
Kernel driver in use: nvme
Kernel modules: nvme
05:00.0 Network controller [0280]: Intel Corporation Intel Dual Band Wireless-AC 3165 Plus Bluetooth [8086:3166] (rev 99)
Subsystem: Intel Corporation Intel Dual Band Wireless-AC 3165 Plus Bluetooth [8086:4210]
Kernel driver in use: iwlwifi
Kernel modules: iwlwifi
06:00.0 SD Host controller [0805]: O2 Micro, Inc. SD/MMC Card Reader Controller [1217:8621] (rev 01)
Subsystem: Lenovo SD/MMC Card Reader Controller [17aa:5068]
Kernel driver in use: sdhci-pci
Kernel modules: sdhci_pci
and
└─ $ ▶ lspci -tnnv
-[0000:00]-+-00.0 Intel Corporation Device [8086:5914]
+-02.0 Intel Corporation Device [8086:5917]
+-08.0 Intel Corporation Skylake Gaussian Mixture Model [8086:1911]
+-14.0 Intel Corporation Sunrise Point-LP USB 3.0 xHCI Controller [8086:9d2f]
+-14.2 Intel Corporation Sunrise Point-LP Thermal subsystem [8086:9d31]
+-16.0 Intel Corporation Sunrise Point-LP CSME HECI #1 [8086:9d3a]
+-1c.0-[02]----00.0 Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:699f]
+-1c.4-[03]----00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168]
+-1d.0-[04]----00.0 Samsung Electronics Co Ltd Device [144d:a804]
+-1d.2-[05]----00.0 Intel Corporation Intel Dual Band Wireless-AC 3165 Plus Bluetooth [8086:3166]
+-1d.3-[06]----00.0 O2 Micro, Inc. SD/MMC Card Reader Controller [1217:8621]
+-1f.0 Intel Corporation Device [8086:9d4e]
+-1f.2 Intel Corporation Sunrise Point-LP PMC [8086:9d21]
+-1f.3 Intel Corporation Device [8086:9d71]
\-1f.4 Intel Corporation Sunrise Point-LP SMBus [8086:9d23]
I also tried the kernel parameters:
amdgpu.dc=0, amdgpu.audio=0
but without success.
If I find a solution I will be back with news.
Last edited by florintanasa (2019-06-28 17:06:56)
Offline
Same problem here... it doesn't work on my E480 either.
Offline
I tried:
amdgpu.fw_load_type=0
amdgpu.fw_load_type=1
amdgpu.fw_load_type=2
None of them work; I get the same error.
I have now also checked with the latest AMD driver on Ubuntu and the result is the same.
Offline
Blacklisting the amdgpu kernel module solved the issue for me:
https://askubuntu.com/questions/1080217 … gpu-driver.
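Blacklisting usually comes down to a one-line modprobe configuration file. A minimal sketch; the file name `blacklist-<module>.conf` and the helper name `blacklist_module` are choices made here for illustration, not something the linked answer prescribes:

```shell
# Write a modprobe configuration file that stops a module from being
# loaded automatically.
# $1: module name, e.g. amdgpu
# $2: target directory (defaults to /etc/modprobe.d, which needs root)
blacklist_module() {
    mod="$1"
    dir="${2:-/etc/modprobe.d}"
    printf 'blacklist %s\n' "$mod" > "$dir/blacklist-$mod.conf"
}
```

Run as root, `blacklist_module amdgpu` creates /etc/modprobe.d/blacklist-amdgpu.conf. If the module is also packed into the initramfs, the image probably has to be regenerated before the change takes effect at early boot.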
Another solution is to turn off the GPU:
echo 1 > /sys/bus/pci/devices/0000:02:00.0/remove
The device address of the GPU, 0000:02:00.0 in the above case, could be different on other machines. It can be obtained from the output of lspci -v (prefixed with the PCI domain, usually 0000), as explained in https://prefetch.net/articles/linuxpci.html
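To avoid hard-coding the address, the sysfs path can be derived from the vendor:device ID. A sketch that expects `lspci -Dnn` output on stdin (the -D option prints the domain as part of the address); the helper name `remove_path_for_id` is hypothetical:

```shell
# Print the sysfs "remove" path for the first device matching a
# vendor:device ID. Prints nothing and fails if there is no match.
# $1: ID such as 1002:699f; input: `lspci -Dnn` output on stdin
remove_path_for_id() {
    # The address is the first space-separated field of the matching line.
    addr=$(grep -iF "[$1]" | head -n 1 | cut -d ' ' -f 1)
    [ -n "$addr" ] && printf '/sys/bus/pci/devices/%s/remove\n' "$addr"
}
```

A possible use is `echo 1 | sudo tee "$(lspci -Dnn | remove_path_for_id 1002:699f)"`; plain `sudo echo 1 > ...` does not work because the redirection is performed by the unprivileged shell. The device comes back after a reboot or a PCI rescan, so the command has to be repeated on each boot.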
Offline
kalilov, please use your original account.
Banned duplicate account.
Nothing is too wonderful to be true, if it be consistent with the laws of nature -- Michael Faraday
The shortest way to ruin a country is to give power to demagogues.— Dionysius of Halicarnassus
---
How to Ask Questions the Smart Way
Offline