#1 2024-03-02 07:04:02

NinjaCheetah
Member
Registered: 2024-03-02
Posts: 5

[HW Error] SSD having firmware/controller issues

I have a 2012 MacBook Pro that I've had dual-booted with macOS and Arch Linux for a little while. While I was using it the other day, my active application locked up, then the entire GNOME shell locked up, then even my cursor froze, so I waited a minute and did a force reboot. I then discovered that Arch Linux would no longer boot; instead it displayed the following error and threw me into systemd emergency mode:

[ TIME ] Timed out waiting for device /dev/disk/by-uuid/097C-888C.
[DEPEND] Dependency failed for /boot.
[DEPEND] Dependency failed for Local File Systems.
[DEPEND] Dependency failed for File System Check on /dev/disk/by-uuid/097C-888C.
[ TIME ] Timed out waiting for device /dev/disk/by-uuid/77464eaf-1b76-4fd7-94a8-5fa7e12c6439.
[DEPEND] Dependency failed for /dev/disk/by-uuid/77464eaf-1b76-4fd7-94a8-5fa7e12c6439.
[DEPEND] Dependency failed for Swaps.

I originally thought I had just managed to corrupt my filesystem somehow, but then I discovered that macOS had simultaneously stopped booting, and when booting verbose I could see a bunch of drive-related errors there too. Neither OS can read the other's file system in my current setup, so as far as I am aware there should be no way for corruption in one to damage the other. I already tried the usual odd Mac fixes like an NVRAM clear and SMC reset, just in case, because I've had those solve weirder issues, but this does not appear to be related to it being a Mac or to macOS in any way.

My first priority was to recover the file I was working on when it all froze so I booted a live environment and found that it was extremely slow to access the SSD. It took about a full minute to mount the drive, and then an additional three or so minutes to load directories from the root of the drive to about 6 layers deep and copy out a ~2KB file, which was concerning. Once I had that saved, I attempted to run some SMART tests, but smartctl was giving me a lot of trouble. Attempting to run

smartctl -a -d ata /dev/sda

returns:

Read Device Identity failed: Input/output error

If this is a USB connected device, look at the various --device=TYPE variants
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

Permissive doesn't help. "verypermissive" allows it to run, but it is only able to retrieve some basic SMART data, telling me that I have 98% wear leveling still available and essentially nothing else; the rest is just errors saying it couldn't get information from the disk. I ran both a short and a long test (both requiring "verypermissive") and they yielded no errors.
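
For anyone retracing this, here is a rough sketch of stepping through the tolerance levels in one go. The try_smart helper and the SMARTCTL override variable are hypothetical names made up for this illustration; the -T levels (normal, permissive, verypermissive) are real smartctl options. The sketch assumes that a run which gave up prints the "mandatory SMART command failed: exiting" message quoted above, and treats anything else as a run that got through.

```shell
# Hypothetical helper: retry smartctl with increasing tolerance.
# SMARTCTL may point at another binary (or a stub for testing).
try_smart() {
    local dev="$1" level out
    for level in normal permissive verypermissive; do
        out=$("${SMARTCTL:-smartctl}" -a -d ata -T "$level" "$dev" 2>&1)
        # Assumption: smartctl prints this message when it aborts early.
        if ! printf '%s\n' "$out" | grep -q 'mandatory SMART command failed: exiting'; then
            printf '%s\n' "$out"
            echo "smartctl ran to completion with -T $level"
            return 0
        fi
    done
    echo "smartctl aborted at every tolerance level"
    return 1
}
```

Called as "try_smart /dev/sda" from a root shell. Getting through with verypermissive only means smartctl did not abort; the data it prints can still be mostly errors, as it was here.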

Interacting with the drive in any capacity from the live environment will also cause the following logs to be output (or at least something very similar depending on what it's trying to do):

[ 1597.203523] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x10000 action 0x6 frozen
[ 1597.204526] ata1: SError: { PHYRdyChg }
[ 1597.205485] ata1.00: failed command: READ DMA EXT
[ 1597.206434] ata1.00: cmd 25/00:08:00:b1:09/00:00:23:00:00/e0 tag 26 dma 4096 in
[ 1597.206434]                res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 1597.208419] ata1.00: status: { DRDY }
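
To get a count of how often commands fail instead of eyeballing the log, the "failed command:" lines can be tallied. This is only a small sketch: ata_failures is a made-up helper name, and it assumes nothing beyond the line format shown in the excerpt above.

```shell
# Hypothetical helper: count failed ATA commands by command name.
# Reads kernel log text on stdin, e.g. from `journalctl -k -b`.
ata_failures() {
    awk -F'failed command: ' '
        /ata[0-9.]+: failed command: / { count[$2]++ }
        END { for (cmd in count) printf "%d %s\n", count[cmd], cmd }
    ' | sort -rn
}
```

Usage would be something like "journalctl -k -b | ata_failures". A count that keeps climbing while the drive is only being read suggests the problem is not tied to particular files.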

I'm thinking that at this point, I should assume the drive dead, save everything I can, and replace it, but I was wondering if there's any chance that it may still be okay, and if there are more tests that I am missing, or even maybe an explanation for why I'm having trouble using SMART with the drive at all. Thanks!

Last edited by NinjaCheetah (2024-03-09 20:12:44)

#2 2024-03-02 12:56:55

xerxes_
Member
Registered: 2018-04-29
Posts: 1,056

What model is the SSD?
Maybe it would be easier to check SMART and the data from another PC, not an Apple machine?

#3 2024-03-02 15:18:32

NinjaCheetah
Member
Registered: 2024-03-02
Posts: 5

SSD is a Samsung 860 EVO, probably just under three years old. I know that it does have full typical SMART support because I have a slightly older one in a different laptop that I had run SMART tests on thinking something was wrong (it had nothing to do with the SSD, it was fine, so it’s not that this drive seems to fail easily or anything).

It’s possible that I would get better results from a different machine, and I can try that, though I’m not really sure, since smartctl should theoretically work fine on this hardware, especially given that there is a macOS version of it as well.

Also thought I’d note that the data recovery was done under a Fedora live USB because it’s what I had immediately, hence the loading time for folders because it was graphical. The rest of my testing was from Arch media.

#4 2024-03-02 15:58:28

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 75,317

Please post your complete system journal from the live system after trying to use the drive a bit

sudo journalctl -b | curl -F 'file=@-' 0x0.st

There's most certainly a hardware problem, but if you're lucky, it's the connection, not the drive.

#5 2024-03-02 23:48:11

NinjaCheetah
Member
Registered: 2024-03-02
Posts: 5

Mounted the drive and did a few things with it so that I saw lots of ata1 errors appear, then exported the journal.

Link: http://0x0.st/H78z.txt

#6 2024-03-03 07:28:47

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 75,317

The device fails on early response right away, w/o even trying to access the data.
It's likely the controller/firmware.
You're pretty much here, https://bbs.archlinux.org/viewtopic.php … 8#p2063098 - and from that example it looks bleak :(

Did you run smartctl from the installed system or some live distro?
https://superuser.com/questions/1166750 … d-shutdown

If you're super-lucky, it's just ncq or alpm, https://wiki.archlinux.org/title/Solid_ … NCQ_errors
The problem with that theory, however, is the same behavior on macOS :/
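
If you want to rule those two out anyway, here is a rough sketch of what to look at. show_alpm is a hypothetical helper name; link_power_management_policy is the sysfs attribute for SATA link power management and libata.force=noncq is the kernel parameter for turning NCQ off. The host0 in the comments is only an example, so check which host the drive actually sits on.

```shell
# Hypothetical helper: print the SATA link power management
# policy of every host found under the given sysfs directory.
show_alpm() {
    local root="${1:-/sys/class/scsi_host}" f
    for f in "$root"/host*/link_power_management_policy; do
        [ -r "$f" ] || continue
        printf '%s: %s\n' "${f%/link_power_management_policy}" "$(cat "$f")"
    done
    return 0
}

# To disable link power management for one host until reboot (as root):
#   echo max_performance > /sys/class/scsi_host/host0/link_power_management_policy
# To disable NCQ, add this to the kernel command line and reboot:
#   libata.force=noncq
```

If the ata1 errors persist with max_performance and noncq, neither of the two is the cause.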

#7 2024-03-09 06:59:23

NinjaCheetah
Member
Registered: 2024-03-02
Posts: 5

Sorry for the late response, I was waiting on a Thunderbolt 2 cable to arrive so that I could mount my macOS partition from a different computer and copy some files out, so I couldn't try anything that involved erasing.

I've since tried using sedutil to do a full secure erase of the SSD, but it appears this SSD does not support OPAL, or at least is reporting that it does not. Honestly, I'm not sure I trust anything it reports to the OS at this point, so if it were working properly it likely would support it, given that the 870 EVO in the linked thread does.

sedutil-cli

just tells me "Invalid or unsupported disk /dev/sda".
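
One thing that may be worth ruling out before reading too much into that message: as far as I know, sedutil on Linux needs the kernel booted with libata.allow_tpm=1 to talk to SATA drives, and can report them as unsupported otherwise. Below is a small sketch for checking the current setting. opal_precheck is a hypothetical helper name, and the sysfs path is an assumption based on how libata normally exposes its module parameters.

```shell
# Hypothetical helper: report whether libata.allow_tpm is enabled.
# The default path is an assumption; pass another file to override it.
opal_precheck() {
    local param="${1:-/sys/module/libata/parameters/allow_tpm}"
    if [ "$(cat "$param" 2>/dev/null)" = "1" ]; then
        echo "allow_tpm enabled"
    else
        echo "allow_tpm disabled or unreadable"
    fi
}

# With allow_tpm enabled, list drives and their reported OPAL support:
#   sedutil-cli --scan
```

If this reports the parameter as disabled, the "unsupported disk" result says nothing about the drive itself yet.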

In the process of doing data recovery, it seems like the condition has worsened. Before I tried sedutil (and then a normal reformatting after), macOS had stopped appearing as a bootable option whatsoever and the partition could not be mounted from anywhere, and Arch Linux was mountable but trying to copy any files off of it resulted in lots of the same ata1 error I posted previously, with no actual copies happening.

It seems like the fate of this drive is sealed at this point and that it will not be savable, unfortunately, unless some miracle solution turns up. :(

#8 2024-03-09 07:05:53

NinjaCheetah
Member
Registered: 2024-03-02
Posts: 5

Quick addendum to answer a question I missed and add a detail:
I ran smartctl from an Arch Linux live USB, since nothing is booting at all off the SSD.

I also have begun to notice some I/O errors mixed in with the ata1 messages about the drive not responding that seem to correlate with the drive having an increasing number of issues reading.

#9 2024-03-09 08:08:35

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 75,317

:(

Please always remember to mark resolved threads by editing your initial post's subject - so others will know that there's no task left, but maybe a solution to find.
Thanks.
(eg. "[HW error]" or "[my SSD died]")
