Hiyas! A few days ago my computer suddenly crashed, after which I could no longer boot properly. Whenever the screen would normally adjust to the right resolution, it just goes black instead. The same thing happens with the latest version of the Arch installer. I found that using the nomodeset kernel parameter lets you get past that, but sadly it doesn't fix anything. The error message I usually get is
Feb 19 19:04:30 archlinux kernel: mce: [Hardware Error]: Machine check events logged
Feb 19 19:04:30 archlinux kernel: mce: [Hardware Error]: CPU 4: Machine Check: 0 Bank 5: b6a0000001000108
Feb 19 19:04:30 archlinux kernel: mce: [Hardware Error]: TSC 0 ADDR fff80445a52ef6 SYND 4d000000 IPID 500b000000000
Feb 19 19:04:30 archlinux kernel: mce: [Hardware Error]: PROCESSOR 2:a20f10 TIME 1676829720 SOCKET 0 APIC 8 microcode a201009
Trying it again now after a few days also gets me
Feb 20 20:57:36 archlinux kernel: AMD-Vi: Completion-Wait loop timed out
Feb 20 20:57:36 archlinux kernel: AMD-Vi: Completion-Wait loop timed out
Feb 20 20:57:36 archlinux kernel: AMD-Vi: Completion-Wait loop timed out
Feb 20 20:57:36 archlinux kernel: pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
Feb 20 20:57:36 archlinux kernel: AMD-Vi: Extended features (0x58f77ef22294ade, 0x0): PPR X2APIC NX GT IA GA PC GA_vAPIC
Feb 20 20:57:36 archlinux kernel: AMD-Vi: Interrupt remapping enabled
Feb 20 20:57:36 archlinux kernel: AMD-Vi: X2APIC enabled
Feb 20 20:57:36 archlinux kernel: AMD-Vi: Virtual APIC enabled
Feb 20 20:57:36 archlinux kernel: PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
Feb 20 20:57:36 archlinux kernel: software IO TLB: mapped [mem 0x00000000d7147000-0x00000000db147000] (64MB)
Feb 20 20:57:36 archlinux kernel: iommu ivhd0: AMD-Vi: Event logged [IOTLB_INV_TIMEOUT device=0000:04:00.0 address=0x100218650]
Feb 20 20:57:36 archlinux kernel: LVT offset 0 assigned for vector 0x400
Feb 20 20:57:36 archlinux kernel: iommu ivhd0: AMD-Vi: Event logged [IOTLB_INV_TIMEOUT device=0000:04:00.0 address=0x100218670]
Turning off iommu removes those, but sadly doesn't fix anything.
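Both IOTLB_INV_TIMEOUT events in the log above name the same PCI address, which is worth knowing before swapping parts. A minimal Python sketch of tallying such events from a saved log follows. The function name `count_iommu_events` and the regular expression are illustrative assumptions built only from the line format shown above; they are not part of any existing tool.

```python
import re
from collections import Counter

# Matches the event format seen in the log above, e.g.
# "AMD-Vi: Event logged [IOTLB_INV_TIMEOUT device=0000:04:00.0 address=0x...]"
# (pattern is an assumption derived from those two lines only)
EVENT_RE = re.compile(
    r"AMD-Vi: Event logged \[(?P<event>\w+) device=(?P<device>[0-9a-fA-F:.]+)"
)


def count_iommu_events(lines):
    """Count AMD-Vi events per (event type, PCI address) pair."""
    counts = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            counts[(match["event"], match["device"])] += 1
    return counts


# Sample lines copied from the log in this post.
sample = [
    "Feb 20 20:57:36 archlinux kernel: AMD-Vi: Completion-Wait loop timed out",
    "Feb 20 20:57:36 archlinux kernel: iommu ivhd0: AMD-Vi: Event logged [IOTLB_INV_TIMEOUT device=0000:04:00.0 address=0x100218650]",
    "Feb 20 20:57:36 archlinux kernel: iommu ivhd0: AMD-Vi: Event logged [IOTLB_INV_TIMEOUT device=0000:04:00.0 address=0x100218670]",
]

for (event, device), n in count_iommu_events(sample).items():
    print(f"{event} on {device}: {n}x")
```

On the sample lines this prints `IOTLB_INV_TIMEOUT on 0000:04:00.0: 2x`. Running `lspci -s 04:00.0` on the affected machine should then show which device sits at that address.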
I tried changing GPUs from my R9 270 to an old Radeon HD 5750, which got rid of all these problems, making me think it was a GPU issue. However, I managed to get a replacement R9 270, which has the exact same issues, so now I'm not sure the GPU was the cause. I've already run Memtest, and since the working GPU only needs 1 PCIe power cable, I've also tested both individually, and the non-working GPUs with everything non-essential disconnected from the PSU, to no avail.
Is there a way to go about figuring out what the problem might be here? I tried the tools from https://wiki.archlinux.org/title/Machin … _exception , but sadly the errors seem to occur before rasdaemon.service gets started.
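The raw status word in the machine check line (`b6a0000001000108`) can be partly decoded by hand even when rasdaemon never gets to start. The sketch below is only an illustration under stated assumptions: it reads the flag bits 63 to 57, whose layout is shared across x86 machine check banks, and leaves the model-specific fields (including what bank 5 means on this CPU) undecoded. The names `STATUS_FLAGS` and `decode_mci_status` are made up for this example.

```python
# Flag bits in the upper part of an x86 MCi_STATUS register.
# Only the common bits 63..57 are decoded; the vendor- and
# bank-specific fields below them are deliberately left alone.
STATUS_FLAGS = {
    63: "VAL (register holds a valid error)",
    62: "OVER (an earlier error was overwritten)",
    61: "UC (uncorrected error)",
    60: "EN (error reporting enabled)",
    59: "MISCV (MISC register valid)",
    58: "ADDRV (ADDR register valid)",
    57: "PCC (processor context corrupt)",
}


def decode_mci_status(status):
    """Return the set flag descriptions and the low 16-bit error code."""
    flags = [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1]
    error_code = status & 0xFFFF
    return flags, error_code


# Status value copied from the "Machine Check: 0 Bank 5" line in this post.
flags, error_code = decode_mci_status(0xB6A0000001000108)
for flag in flags:
    print(flag)
print(f"error code: {error_code:#06x}")
```

For this value the set flags are VAL, UC, EN, ADDRV and PCC, with 0x0108 in the low 16 bits, so the hardware reported an uncorrected error with a valid ADDR field. That speaks to the severity of the event, not to which component is at fault.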
Last edited by Hello_324 (2023-03-07 12:43:23)
Offline
changing GPUs from my R9 270 to an old Radeon HD 5750, which got rid of all these problems
The R9 270 is Curacao or Pitcairn, which according to Wikipedia (I'm not learning geography) is a Southern Islands chip (while the HD 5750 is much older)
=> Do you use it with the radeon or the amdgpu driver?
https://wiki.archlinux.org/title/AMDGPU … K)_support
And does that have an impact on the situation?
Offline
=> Do you use it with the radeon or the amdgpu driver?
Yes, I already enabled it a long time ago. The GPU used to work just fine until the crash.
Offline
And does it work w/ the radeon module?
Offline
And does it work w/ the radeon module?
Sorry if I misunderstand things, but do you want me to change
options amdgpu si_support=1
options amdgpu cik_support=1
options radeon si_support=0
options radeon cik_support=0
to
options amdgpu si_support=0
options amdgpu cik_support=0
options radeon si_support=1
options radeon cik_support=1
?
The former is what I was using before (and still have enabled), and it worked before. Running
lspci -k | grep -A 3 -E "(VGA|3D)"
with the 5750 gives me
Kernel driver in use: radeon
Kernel modules: radeon, amdgpu
Also sorry I completely forgot, but I'm dualbooting on this system, and Windows 10 also crashes after the Windows logo where it would usually adjust the resolution.
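To check which driver actually bound to the card after changing the `si_support`/`cik_support` options, the `lspci -k` output quoted above can be parsed. A small sketch, assuming output for a single device in the format shown; `driver_summary` is a hypothetical helper name, not an existing utility.

```python
def driver_summary(lspci_text):
    """Extract the bound driver and candidate modules from `lspci -k` text.

    Assumes the text covers a single device; with several devices the
    last matching lines win.
    """
    in_use, modules = None, []
    for line in lspci_text.splitlines():
        # lspci indents these lines, so strip before splitting "key: value".
        key, _, value = line.strip().partition(": ")
        if key == "Kernel driver in use":
            in_use = value
        elif key == "Kernel modules":
            modules = [m.strip() for m in value.split(",")]
    return in_use, modules


# The two lines posted above for the HD 5750.
sample = """\
Kernel driver in use: radeon
Kernel modules: radeon, amdgpu
"""

in_use, modules = driver_summary(sample)
print(f"bound driver: {in_use}; available modules: {', '.join(modules)}")
```

The sample is the output posted for the HD 5750, so the same check would need repeating with the R9 270 installed to see whether radeon or amdgpu claims that card.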
Offline
Sorry if I misunderstand things, but do you want me to…
Yes.
However
I'm dualbooting on this system, and Windows 10 also crashes after the Windows logo
3rd link below. Mandatory.
Disable it (it's NOT the BIOS setting!) and reboot Windows and Linux twice for voodoo reasons.
But this rather suggests the GPU is broken and/or underpowered. Did you change its PCIe slot?
Did you forget to connect a dedicated 6/8-pin power connector?
Offline
Tried it, sadly it didn't work.
I already had fast-start disabled and also tried the 2nd PCIe slot, to no avail. All power connectors are plugged in; plugging in only 1 of the 2 4-pin connectors gives no output at all. I'm going to try again with a new PSU that hopefully arrives in a few days.
Offline
With a new PSU I actually managed to get past the black screen on the Arch install CD on the first try, with the following error/warning at the start:
[drm:uvd_v1_0_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
radeon: 0000:2b:00.0: failed initializing UVD (-1).
When I then tried to boot properly, the old problems came back, even with the Arch install CD. Is there any way to find out what exactly is causing this without swapping hardware one piece at a time? Having had the same issue with another GPU of the same model, and with the GPUs working under nomodeset, I think it might not be a GPU fault either.
Offline
Windows 10 also crashes
changing GPUs from my R9 270 to an old Radeon HD 5750, which got rid of all these problems
Windows crashing too means it's the HW; the change of card means it's directly related to the GPU (model).
PSU (power) or GPU (chip) or board (bus) - I don't really see what else could interfere.
Offline
PSU (power) or GPU (chip) or board (bus) - I don't really see what else could interfere.
I thought so too, but trying both a new PSU and another GPU of the same model gave me the same errors. Maybe I got unlucky with the 2nd GPU. I guess I will try again with a different brand once I can get my hands on one.
Edit:
Sorry, it does indeed seem to have been the GPU. Trying the same GPU (with the same PSU) with a different motherboard + CPU brings about the same problem. Guess I really got unlucky with the 2nd GPU.
Last edited by Hello_324 (2023-02-26 01:45:42)
Offline
And does it work w/ the radeon module?
Edit: sorry, forgot about the windows situation.
Both 270s are a decade old…
Please always remember to mark resolved threads by editing your initial post's subject - so others will know that there's no task left, but maybe a solution to find.
Thanks.
Last edited by seth (2023-02-26 08:00:47)
Offline