#1 2016-06-11 21:58:09

Inxsible
Forum Fellow
From: Chicago
Registered: 2008-06-09
Posts: 9,183

ata1: COMRESET failed (errno= -16)

Hello all, it's been a while since I have been here.


My headless server got knocked out a week ago and I was trying to look into it.

When I boot my server, it just boots to a rapidly flashing cursor. So I plugged in a USB device with the latest Arch on it, and even that cannot start up correctly. After choosing the Boot x86_64 entry, all I get is:

ata1: COMRESET failed (errno=-16)
ata1: COMRESET failed (errno=-16)
ata1: COMRESET failed (errno=-16)
ata1: COMRESET failed (errno=-16)
ata1: COMRESET failed (errno=-16)
Mounting '/dev/disk/by-label/Arch_201606' to '/run/archiso/bootmnt'
Waiting 30 seconds for device /dev/disk/by-label/ARCH_201606 ...
ata1: COMRESET failed (errno=-16)
ERROR: '/dev/disk/by-label/ARCH_201606' device did not show up after 30 seconds...
Falling back to interactive prompt
You can try to fix the problem manually, log out when you are finished
sh: can't access tty; job control turned off
[rootfs]# ata1: COMRESET failed (errno=-16)
[rootfs]#

I have been pretty good with the updates on both my machines. I normally ssh into the server and update it whenever I update my desktop.

Need a bit of help fixing this.

Last edited by Inxsible (2016-06-12 18:21:11)


There's no such thing as a stupid question, but there sure are a lot of inquisitive idiots !


#2 2016-06-11 23:39:03

R00KIE
Forum Fellow
From: Between a computer and a chair
Registered: 2008-09-14
Posts: 4,734

Re: ata1: COMRESET failed (errno= -16)

That "ata1: COMRESET failed (errno=-16)" spells trouble, but I suppose you shouldn't be seeing it when booting from a USB drive. I would start by making sure the USB drive boots fine in a known-good computer to take it out of the equation, then I would run memtest on the "bad" machine and go from there.

If possible, you could also connect the disk from the "bad" machine to a "good" machine and check whether the disk is failing (check the output of smartctl). While you are at it, run fsck in case something is wrong with the filesystem, and take the chance to do a backup. I would also debug the "bad" machine without any internal disk connected, as I suspect you might have to do some forced shutdowns if the machine hangs while debugging.

Check for loose cables, and check that the fans are working, that the heatsinks are not clogged with dust/lint, and that the heatsinks are properly seated.
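
As a rough sketch of the disk checks suggested above: this assumes the suspect disk shows up as a block device on the known-good machine and that smartmontools and e2fsprogs are installed there. The function name `check_disk` and the device paths in the usage comment are placeholders, not values from this thread.

```shell
# check_disk DISK [FILESYSTEM]
# DISK is the whole disk (placeholder: /dev/sdX). FILESYSTEM is an optional
# ext2/3/4 partition or logical volume, which should be unmounted.
check_disk() {
    disk="$1"
    fs="$2"
    if [ ! -b "$disk" ]; then
        echo "check_disk: $disk is not a block device" >&2
        return 1
    fi
    # Overall SMART verdict, then the full attribute table.
    smartctl -H "$disk"
    smartctl -A "$disk"
    # -n opens the filesystem read-only and answers "no" to every prompt,
    # so nothing is repaired before a backup exists.
    if [ -n "$fs" ]; then
        e2fsck -n "$fs"
    fi
}

# Usage (run as root; paths are hypothetical):
#   check_disk /dev/sdX /dev/sdX1
```

The read-only fsck is a deliberate choice here: on a disk that may be failing, taking the backup first and repairing later is usually the safer order.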


R00KIE
Tm90aGluZyB0byBzZWUgaGVyZSwgbW92ZSBhbG9uZy4K


#3 2016-06-12 18:20:54

Inxsible
Forum Fellow
From: Chicago
Registered: 2008-06-09
Posts: 9,183

Re: ata1: COMRESET failed (errno= -16)

It sat overnight as is, and in the morning I exited the rootfs prompt and got into the Arch Linux install root.

Weird. Maybe my drive is on its last legs. I have 2 drives in the server, a 500 GB and a 1 TB. I will have to check which one might be dying.



#4 2016-06-18 23:32:40

Inxsible
Forum Fellow
From: Chicago
Registered: 2008-06-09
Posts: 9,183

Re: ata1: COMRESET failed (errno= -16)

OK. I booted the Arch Linux USB on the server, and here are the results of the following commands:

lvdisplay:

WARNING: Device for PV Ee0Cr6-xxxx-xxxx-xxxx-xxxx-xxxx-xxxxxx not found or rejected by a filter.
--- Logical volume ---
LV Path        /dev/vgrp/lvroot
LV Name      lvroot
...
...
...

--- Logical volume ---
LV Path      /dev/vgrp/lvhome
LV Name     lvhome
...
...
...

--- Logical volume ---
LV Path     /dev/vgrp/lvdata
LV Name   lvdata
...
...
... 

I am concerned about the warning at the very start.

Trying to fsck the LVM volumes gives me

fsck.ext4 /dev/vgrp/lvroot
e2fsck 1.42.13 (17-May-2015)
fsck.ext4: No such file or directory while trying to open /dev/vgrp/lvroot
Possibly non-existent device?

smartctl gives me

smartctl -H /dev/sdb
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result : PASSED

What next?



#5 2016-06-18 23:55:13

Inxsible
Forum Fellow
From: Chicago
Registered: 2008-06-09
Posts: 9,183

Re: ata1: COMRESET failed (errno= -16)

vgdisplay gives me

vgdisplay
WARNING: Device for PV Ee0Cr6-xxxx-xxxx-xxxx-xxxx-xxxx-xxxxxx not found or rejected by a filter.
--- Volume group ---
VG Name      vgrp
...
...
VG Size        1.36 TiB

and pvdisplay gives me

pvdisplay
WARNING: Device for PV Ee0Cr6-xxxx-xxxx-xxxx-xxxx-xxxx-xxxxxx not found or rejected by a filter.
--- Physical volume ---
PV Name      [unknown]
VG Name      vgrp
PV Size        465.26 GiB/not usable 3.01 MiB
...
...
PV UUID      Ee0Cr6-xxxx-xxxx-xxxx-xxxx-xxxx-xxxxxx

--- Physical volume ---
PV Name    /dev/sdb1
VG Name   vgrp
PV Size     931.51 GiB/ not usable 4.69 MiB
...
...
...
PV UUID    3mmitA-xxxx-xxxx-xxxx-xxxx-xxxx-xxxxxx

and finally fdisk -l gives

fdisk -l 
Disk /dev/sda: 904MiB
...
...
Disk /dev/sdb: 931.56 GiB
...
...
Disk /dev/loop0: 323.1 MiB
...
...

Why does fdisk not see the 500 GB disk? Is that the one that has possibly failed? If so, how do I recover the data from the bad drive as well as the other 1 TB drive, given that they were both part of the LVM volume group?

Thanks in advance.



#6 2016-06-19 04:13:02

Inxsible
Forum Fellow
From: Chicago
Registered: 2008-06-09
Posts: 9,183

Re: ata1: COMRESET failed (errno= -16)

Finally pvscan gives me

pvscan
WARNING: Device for PV Ee0Cr6-xxxx-xxxx-xxxx-xxxx-xxxx-xxxxxx not found or rejected by a filter.
PV [unknown]    VG vgrp    lvm2 [465.26 GiB / 0    free]
PV /dev/sdb1    VG vgrp    lvm2 [931.51 GiB / 0    free]
Total: 2 [1.36 TiB] / in use: 2 [1.36 TiB] / in no VG: 0 [0   ]

OK, how do I recover from this?

Last edited by Inxsible (2016-06-19 04:13:48)



#7 2016-06-19 08:32:24

mich41
Member
Registered: 2012-06-22
Posts: 796

Re: ata1: COMRESET failed (errno= -16)

Well, run dmesg and see what happens with this suspicious ata1 port.
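
One way to narrow the kernel log down to that port is a small filter like the one below. This is only a sketch: the function name `ata_port_log` is made up for illustration, and it assumes the usual `ataN:` and `ataN.MM:` prefixes in kernel messages.

```shell
# ata_port_log PORT
# Reads kernel log text on stdin and prints only the lines for the given
# ATA port, e.g. "ata1:" or "ata1.00:", without also matching "ata10:".
ata_port_log() {
    port="$1"
    grep -E "(^|[^[:alnum:]])${port}(\.[0-9]+)?:"
}

# Usage on the live system:
#   dmesg | ata_port_log ata1
```

Link speed negotiation, reset attempts, and any "link down" messages for the port should then appear together, which makes it easier to tell a dead disk from a flaky link.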


#8 2016-06-19 11:56:10

R00KIE
Forum Fellow
From: Between a computer and a chair
Registered: 2008-09-14
Posts: 4,734

Re: ata1: COMRESET failed (errno= -16)

I would look not only at the health status reported by smartctl but also at the attributes, as something might be starting to go wrong and some attributes will never trigger the failing status. I would also look at both drives; so far you seem to have checked only sdb.
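
A sketch of how the attribute table could be narrowed to the entries that most often hint at a dying disk or a bad cable. The attribute names are the ones smartctl commonly prints for ATA drives, but vendors differ, so treat both the list and the function name `smart_watch` as assumptions.

```shell
# smart_watch
# Reads `smartctl -A` output on stdin and prints only a few attributes
# worth a closer look. A non-zero raw value (last column) deserves attention.
smart_watch() {
    grep -E 'Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable|UDMA_CRC_Error_Count'
}

# Usage (as root), for each disk the kernel still detects:
#   smartctl -A /dev/sdb | smart_watch
```

A rising CRC error count tends to point at the cable or port rather than the platters, while reallocated or pending sectors point at the disk itself.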

I have never had to deal with an LVM failure, so I can't be of much help, but I would suggest checking your /etc/lvm/lvm.conf to see whether you have defined any filter, since that is what the message implies. Another thing to check is the files in /etc/lvm/backup. As far as I can tell, they hold backups of the LV configuration and of which PVs the VG uses, so there might be a mismatch in the IDs that makes LVM complain.
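
To see whether a filter is actually set, something like the following could be used. It is a sketch only: `lvm_filters` is a hypothetical helper that simply prints the uncommented `filter` and `global_filter` lines from the config file.

```shell
# lvm_filters [CONFIG]
# Prints active (uncommented) filter/global_filter settings.
# No output (grep exit status 1) means no filter is set in that file.
lvm_filters() {
    conf="${1:-/etc/lvm/lvm.conf}"
    grep -E '^[[:space:]]*(global_)?filter[[:space:]]*=' "$conf"
}

# Usage:
#   lvm_filters                 # check the default config
#   ls -l /etc/lvm/backup       # list the saved VG metadata backups
```

If nothing is printed, the "rejected by a filter" half of the warning can probably be ruled out, which leaves "not found".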



#9 2016-06-20 00:57:56

Inxsible
Forum Fellow
From: Chicago
Registered: 2008-06-09
Posts: 9,183

Re: ata1: COMRESET failed (errno= -16)

Yes, but I don't think the drive is being seen by anything other than the LVM commands for some reason. So how do I run smartctl or fsck on the drive? There is no /dev/sdc, and /dev/sda is the USB with Arch live on it.



#10 2016-06-20 06:45:27

mich41
Member
Registered: 2012-06-22
Posts: 796

Re: ata1: COMRESET failed (errno= -16)

So the disk kicked the bucket. Or was it the SATA port or some cable? Try other ports and cables (both data and power).
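
After each cable or port swap, a quick way to see whether the missing disk has come back is to list what the kernel detects. A minimal sketch, assuming SATA and USB disks appear as `sd*` under `/sys/block`; `list_disks` is an illustrative name, and the optional argument exists only so the function can be pointed at a different directory.

```shell
# list_disks [SYSFS_DIR]
# Prints one line per detected sd* disk with its size in decimal GB.
list_disks() {
    sys="${1:-/sys/block}"
    for d in "$sys"/sd*; do
        [ -r "$d/size" ] || continue
        # The sysfs size file counts 512-byte sectors.
        sectors=$(cat "$d/size")
        echo "$(basename "$d") $((sectors * 512 / 1000000000)) GB"
    done
}

# Usage:
#   list_disks
```

In this thread, a disk of roughly 500 GB showing up next to the 1 TB one would mean the drive itself is still alive.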

Last edited by mich41 (2016-06-20 06:46:52)


#11 2016-06-20 15:50:23

Inxsible
Forum Fellow
From: Chicago
Registered: 2008-06-09
Posts: 9,183

Re: ata1: COMRESET failed (errno= -16)

mich41 wrote:

So the disk kicked the bucket. Or was it the SATA port or some cable? Try other ports and cables (both data and power).

Looks like it, but now the question is: can I recover the data from the fried (WD) drive and the good (Samsung) drive, given that they were both part of the same LVM volume group? If so, how do I recover the data?

PhotoRec? TestDisk? Do they work on LVM disks?



#12 2016-06-20 21:30:52

R00KIE
Forum Fellow
From: Between a computer and a chair
Registered: 2008-09-14
Posts: 4,734

Re: ata1: COMRESET failed (errno= -16)

Have you confirmed that it doesn't work in another machine? Short of using a data recovery service, I'm not sure how to get data off an unresponsive disk. Having LVM in the mix might not make recovery easy.
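
If the failed disk stays unreachable, one option that is sometimes used is to activate the volume group in partial mode and copy off whatever lives on the surviving PV. The sketch below only prints the commands for review rather than running them. It takes the volume group name `vgrp` from the earlier output; the helper name `plan_partial_recovery` and the mount point are made up, and any logical volume with extents on the missing PV should be expected to return I/O errors for those regions.

```shell
# plan_partial_recovery VG LV MOUNTPOINT
# Prints (does not run) a possible command sequence for reading data from
# a volume group that has lost one physical volume.
plan_partial_recovery() {
    vg="$1"; lv="$2"; mnt="$3"
    # Show which devices back each logical volume.
    echo "lvs -o lv_name,lv_size,devices $vg"
    # Activate the VG despite the missing physical volume.
    echo "vgchange -ay --partial $vg"
    # Mount read-only while copying data off.
    echo "mount -o ro /dev/$vg/$lv $mnt"
}

# Usage (names other than vgrp and lvdata are hypothetical):
#   plan_partial_recovery vgrp lvdata /mnt/recovery
```

Printing the plan first is a deliberate choice: with one PV gone, it is worth checking the `lvs` output to see which volumes sit entirely on /dev/sdb1 before activating anything.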



#13 2016-06-22 13:48:07

Inxsible
Forum Fellow
From: Chicago
Registered: 2008-06-09
Posts: 9,183

Re: ata1: COMRESET failed (errno= -16)

I will check the disk. It probably won't boot up because of LVM (my desktop can only accommodate 1 disk), but if it shows up somehow, I can see whether data can be recovered before putting it out to pasture.

