#1 2018-12-28 10:08:46

atrus6
Member
Registered: 2011-02-17
Posts: 16

Kernel Panic after installation of RX580

I installed a new RX 580, which works fine in Windows.

However, upon trying to load up Arch, I get a near immediate kernel panic:
https://i.imgur.com/q3MhCfk.png

I found a post that suggested adding the kernel options amdgpu.dc=0 and amdgpu.powerplay=0.

That got rid of the kernel panic, but the boot then stopped at another error saying I should/have to run fsck manually. I didn't take a screenshot of it and ran fsck immediately. In hindsight, I probably shouldn't have.

The system reboots...and I get this error:
https://i.imgur.com/wyf4XOF.png
Which is admittedly an awesome error message to see on someone else's screen. Googling it shows that it's more commonly seen with improperly configured new installs. Unfortunately, this is a 3+ year old install that was working just fine a little while ago.

I restart, because that fixes things, and I get:
https://i.imgur.com/QnDsqzn.png
Rather blurry, unfortunately, but says: mount: /new_root: wrong fs type, bad option, bad superblock on /dev/sda1, missing codepage or helper program, or other error.

And that's where I'm at. Honestly, I'm not even sure installing the 580 is the problem anymore.

Any suggestions?

Update: Booting with the params amdgpu.dc=0 amdgpu.powerplay=0 iommu=0 gets me a lot further in the boot process. After about 10 minutes of waiting I get this far:
https://i.imgur.com/60skKzt.jpg
I still can't do anything. I can't switch to another tty or get any real keyboard response at all.

Mod edit: Replaced oversized images with urls. Please re-read the Code of Conduct and adhere to the image posting guidelines going forward. -- WorMzy

Last edited by atrus6 (2018-12-28 10:49:26)

Offline

#2 2018-12-28 11:24:14

loqs
Member
Registered: 2014-03-06
Posts: 18,931

Re: Kernel Panic after installation of RX580

Do you have installation media you can boot the system from?

Offline

#3 2018-12-28 11:48:35

atrus6
Member
Registered: 2011-02-17
Posts: 16

Re: Kernel Panic after installation of RX580

I can boot a current Arch live USB with the options amdgpu.dc=0 amdgpu.powerplay=0 iommu=0. Trying to boot without them causes the live system to panic as well.

Offline

#4 2018-12-28 11:56:42

loqs
Member
Registered: 2014-03-06
Posts: 18,931

Re: Kernel Panic after installation of RX580

I would suggest you back up any data on the system that you do not already have a backup of before continuing.
Can you mount all of the system's filesystems under /mnt? If so, can you arch-chroot in?
Edit:
If you can chroot in, please post the output of the following command to a pastebin (please see tip box):

pacman -Qkk 2>&1
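
If the output is long, a small filter can help pick out the packages worth looking at first. This is only a sketch: `altered_packages` is a made-up helper name, and it assumes the per-package summary lines look like "pkgname: 123 total files, 4 altered files".

```shell
# List packages for which `pacman -Qkk` reports at least one altered file.
# Reads saved output on stdin. Assumes summary lines shaped like:
#   pkgname: 123 total files, 4 altered files
# (the helper name is hypothetical, not part of pacman)
altered_packages() {
    awk -F': ' '/ total files?, [1-9][0-9]* altered files?$/ { print $1 }'
}
```

For example, `pacman -Qkk 2>&1 | tee qkk.txt | altered_packages` keeps the full output in qkk.txt for the pastebin and prints only the affected package names. An altered file is not necessarily corrupt (a modification time mismatch counts too), so treat the list as a starting point.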

Last edited by loqs (2018-12-28 12:11:54)

Offline

#5 2018-12-30 17:32:47

atrus6
Member
Registered: 2011-02-17
Posts: 16

Re: Kernel Panic after installation of RX580

https://ptpb.pw/gwaJ

Here you are.

Offline

#6 2018-12-30 17:45:33

loqs
Member
Registered: 2014-03-06
Posts: 18,931

Re: Kernel Panic after installation of RX580

There is a lot of corruption.  Please see S.M.A.R.T.#Run_a_test to check the condition of the storage device.
If the device does not support self testing please post the output of `# smartctl -a /dev/<device>`

Offline

#7 2018-12-30 18:21:27

atrus6
Member
Registered: 2011-02-17
Posts: 16

Re: Kernel Panic after installation of RX580

https://ptpb.pw/45id

Unless I'm reading it wrong, it doesn't seem too bad. Halfway through the lifespan.

Offline

#8 2018-12-30 19:19:18

loqs
Member
Registered: 2014-03-06
Posts: 18,931

Re: Kernel Panic after installation of RX580

I would suggest following Pacman#Pacman_crashes_during_an_upgrade
If mkinitcpio, linux, systemd, and util-linux are not updated during step 4, include them in step 6.

Offline

#9 2018-12-31 02:21:34

jamespharvey20
Member
Registered: 2015-06-09
Posts: 129

Re: Kernel Panic after installation of RX580

atrus6 wrote:

...
That got rid of the kernel panic, but the boot then stopped at another error saying I should/have to run fsck manually. I didn't take a screenshot of it and ran fsck immediately. In hindsight, I probably shouldn't have.
...

For the future, yeah, it's always a good idea to consider making an image of a drive before any type of recovery like that, if it's a software corruption issue.  If the drive is physically failing, you wouldn't necessarily want to do this, as it puts more wear on the drive and could reduce the chances of success for a data recovery specialist or a DIY run of something like "ddrescue".


atrus6 wrote:

https://ptpb.pw/45id

Unless I'm reading it wrong, it doesn't seem too bad. Halfway through the lifespan.

With (113) unexpected power losses, I'd cut that down going forward.  That seems like an awful lot to me, and certainly risks filesystem corruption.  Not sure if those are from physically rebooting/turning off the machine, lockups where you had no choice (like what you're experiencing now), or electrical problems.

If I were in your shoes, I'd be considering replacing the drive, or at least keeping extremely up to date with backups.  Others may disagree with considering replacing it.

"Ave_Block-Erase_Count" refers to how many times the SSD's blocks have been erased and rewritten.  "Ave" stands for average, so I think it's giving you the average number of times each block has been written.  Crucial lists your drive as having a "72TB total bytes written" endurance.  An "Ave_Block-Erase_Count" of 1531 * 120GB (size of drive) = 183,720 GB, or about 179.4 TB at 1024 GB per TB.  It's possible I'm misinterpreting something here, but I read this to mean you're well past their expected endurance.  It surprises me that it's listed as 51% lifetime used, but who knows how Crucial internally comes up with that.  I'm of course using an average over the whole drive, and it's possible some areas are aged substantially more than others.
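
To make that arithmetic easy to redo with other numbers, here's a rough sketch. `estimated_writes_tb` is just a name I made up, and the result is only an estimate: erase counts reflect what the flash itself saw, including any write amplification, while an endurance rating is usually quoted in host writes, so the two may not be directly comparable.

```shell
# Rough estimate of flash writes: average erase count x drive capacity.
# Usage: estimated_writes_tb ERASE_COUNT CAPACITY_GB [GB_PER_TB]
# GB_PER_TB defaults to 1024; pass 1000 for decimal units.
# (hypothetical helper, not part of smartmontools)
estimated_writes_tb() {
    awk -v n="$1" -v gb="$2" -v per="${3:-1024}" \
        'BEGIN { printf "%.1f\n", n * gb / per }'
}

estimated_writes_tb 1531 120        # prints 179.4
estimated_writes_tb 1531 120 1000   # prints 183.7
```

Either way it comes out well above 72 TB, so the choice of units doesn't change the conclusion.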

16 "Reallocated Event Count" would cause me to consider replacing the drive as well.  Maybe I am overly cautious, but with platter drives, I replaced any that ever had reallocated sectors.  Granted, the most important thing is how this is changing over time.  If this happened at the beginning of using the drive and has remained at 16, then it wouldn't be anything more than something to watch.  These were corrected by the drive, but if there are periodically more sectors it needs to replace, it increases the risk of running into one that just fails and can't be reallocated.
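
If you want to watch how it changes, something like this could pull the raw value out so you can log it now and then. It's a sketch: `smart_raw_value` is a hypothetical helper, and it assumes the usual ATA attribute table printed by `smartctl -A`, with the attribute name in column 2 and the raw value in column 10.

```shell
# Print the raw value of one SMART attribute from `smartctl -A` output on stdin.
# Assumes the standard ATA attribute table layout:
#   ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
# (hypothetical helper, not part of smartmontools)
smart_raw_value() {
    awk -v name="$1" '$2 == name { print $10 }'
}
```

Run as root, for example `smartctl -A /dev/sda | smart_raw_value Reallocated_Event_Count`, where /dev/sda is a guess based on your mount error. If the number keeps climbing from 16, that's the sign to replace the drive.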

My NVMe isn't usable through smartctl, only "nvme-cli" (AUR), and either the drive or nvme-cli doesn't give these two SMART values to me, so I can't compare to mine.  I receive a "media_errors: 0" but I don't know if that's equivalent to a reallocated event.  EDIT: Looks like smartctl got updated at some point and shows everything nvme-cli does, but the drive still doesn't give out information on average block erase count or reallocated events.

Last edited by jamespharvey20 (2018-12-31 02:33:11)

Offline
