Hello
Some days ago, a SATA power connector burned out (there were four "pop" sounds and a burning smell). The burnt SATA connector was not connected to any drive. If you want photos:
I replaced the power supply and powered on the PC again. The computer works (I'm writing this on it), but I have found that a couple of hard drives report the wrong capacity.
In sdb (WDC WD20EARS-00MVWB0), when trying to mount the last partition I get this in dmesg:
[ 2256.299657] EXT4-fs (sdb7): bad geometry: block count 52170752 exceeds size of device (52170510 blocks)
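To put a size on that error, here is a minimal sketch that turns the two block counts from the dmesg line into bytes and sectors. The 4096-byte block size and 512-byte logical sector size are assumptions (the ext4 default and the usual logical sector size); the real block size can be confirmed with dumpe2fs -h on the partition.

```python
# Figures copied from the dmesg line above.
FS_BLOCKS = 52170752      # block count recorded in the ext4 superblock
DEVICE_BLOCKS = 52170510  # blocks the kernel says the partition can hold

BLOCK_SIZE = 4096   # assumption: ext4 default block size
SECTOR_SIZE = 512   # assumption: logical sector size of the drive


def shortfall(fs_blocks, device_blocks,
              block_size=BLOCK_SIZE, sector_size=SECTOR_SIZE):
    """Return (missing blocks, missing bytes, missing sectors)."""
    missing_blocks = fs_blocks - device_blocks
    missing_bytes = missing_blocks * block_size
    # Round up so the partition is never left one sector short.
    missing_sectors = -(-missing_bytes // sector_size)
    return missing_blocks, missing_bytes, missing_sectors


if __name__ == "__main__":
    blocks, nbytes, sectors = shortfall(FS_BLOCKS, DEVICE_BLOCKS)
    print(f"partition is short by {blocks} blocks "
          f"({nbytes} bytes, {sectors} sectors)")
```

Under those assumptions the partition is a little under 1 MB too small for the filesystem it holds.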
In sdc (WDC WD30EZRX-00DC0B0), no partitions are recognized, while there should be one. In fact, SMART now reports a different capacity:
2013: User Capacity: 3.000.592.982.016 bytes [3,00 TB]
Now: User Capacity: 3.000.591.900.160 bytes [3,00 TB]
Both drives pass the SMART short and conveyance (edit: and long) tests without error.
Do you have any idea on how to recover the original drive capacity, so the filesystems can be mounted again?
Edit: I link the SMART output
Edit2: Host Protected Area was enabled on sdc. I disabled it, the partition was recognized, and the disk has its original size again.
I also resized the partition (not the filesystem) in sdb7, using a bit of free space after it. Now the filesystem mounts.
Of course, there is a risk of ~1.5 MB of corrupted data on the drives.
Last edited by noalwin (2015-11-28 12:33:50)
Noalwin,
It sounds like either your SATA controller or the drive electronics were damaged.
Can you connect the drives to another system to verify they are working correctly?
Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.
clean chroot building not flexible enough ?
Try clean chroot manager by graysky
I'd say that if you have important data on those 2 drives that is not backed up, then hand those drives over to a data recovery company. For the drives to be reporting a different user capacity, something went seriously wrong. That doesn't surprise me much: from the second link, the drive on the left seems to have been hit quite badly. Even if it looks OK after cleaning, that doesn't mean it didn't suffer damage from heating before the magic smoke came out of your machine.
R00KIE
Tm90aGluZyB0byBzZWUgaGVyZSwgbW92ZSBhbG9uZy4K
I get the impression that the hardware is fine. Are the drives on the same make and model controller they had been prior to the melt down?
Also, out of curiosity, how did you clean the drive? I was going to suggest anhydrous isopropanol. Not much worry about getting on the plastic, eh?
Nothing is too wonderful to be true, if it be consistent with the laws of nature -- Michael Faraday
Sometimes it is the people no one can imagine anything of who do the things no one can imagine. -- Alan Turing
---
How to Ask Questions the Smart Way
Yeah. I suspect that the drives are OK. A short in a PSU power plug should only trigger overload protection on the PSU or just fry it.
The SMART status from earlier is from 2013? Could SMART be reporting only the usable space on the disk, with damaged sectors left out of the total size? I'm not very familiar with SMART, so we need someone to confirm this.
Both drives pass the SMART short and conveyance tests without error.
This only strengthens my suspicion of the drives being OK.
K.i.s.s. <3
I added the smartctl -a output of the drives to the original post, and also the output of the 3tb drive from 2013 (when I bought it)
Noalwin,
It sounds like either your SATA controller or the drive electronics were damaged.
Can you connect the drives to another system to verify they are working correctly?
I don't have another computer available to test right now.
I'd say that if you have important data on those 2 drives that is not backed up, then hand those drives over to a data recovery company. For the drives to be reporting a different user capacity, something went seriously wrong. That doesn't surprise me much: from the second link, the drive on the left seems to have been hit quite badly. Even if it looks OK after cleaning, that doesn't mean it didn't suffer damage from heating before the magic smoke came out of your machine.
The data in those partitions was not important. I just wanted to avoid repartitioning and reformatting the drive to match the "new" capacity.
I get the impression that the hardware is fine. Are the drives on the same make and model controller they had been prior to the melt down?
Also, out of curiosity, how did you clean the drive? I was going to suggest anhydrous isopropanol. Not much worry about getting on the plastic, eh?
A piece of paper slightly wet with water. Probably not the best method, but I didn't connect it until some time after.
Yeah. I suspect that the drives are OK. A short in a PSU power plug should only trigger overload protection on the PSU or just fry it.
The SMART status from earlier is from 2013? Could SMART be reporting only the usable space on the disk, with damaged sectors left out of the total size? I'm not very familiar with SMART, so we need someone to confirm this.
noalwin wrote: Both drives pass the SMART short and conveyance tests without error.
This only strengthens my suspicion of the drives being OK.
If there are damaged sectors, I would expect a reallocated sector count bigger than 0:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
But I see that the "VALUE" of Offline_Uncorrectable changed from 100 to 200 on the 3 TB drive:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0   <- 2013
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0   <- 2015
Last edited by noalwin (2015-11-22 02:00:55)
Offline_Uncorrectable sectors will not reduce the reported user capacity, and I'd say that even a few reallocated sectors would not change it either. Drives have spare space hidden from the user exactly for that purpose; otherwise even a single bad sector would require resizing partitions to fit the new size. Of course, if too many sectors are reallocated the drive capacity might start to decrease, but that doesn't seem to be the case here.
Regarding the SMART tests, if you want to be really sure, do the long test. The short and conveyance tests can miss problems; after all, they are short tests that run in a couple of minutes. If you want to be even more sure the drives are OK before putting them into service again, run badblocks on them, let it finish all the write+readback passes, and see if you get errors.
R00KIE
Tm90aGluZyB0byBzZWUgaGVyZSwgbW92ZSBhbG9uZy4K
Anyway, the appearance of Offline_Uncorrectable sectors usually means that something really bad is happening to the drive.
As for me, a drive with UNC certainly has to be replaced ASAP.
— love is the law, love under wheel, — said aleister crowley and typed in his terminal:
usermod -a -G wheel love
The disks passed the long SMART test without error.
About the "Offline_Uncorrectable" attribute, I have always been under the impression that the real value is the RAW one, which is 0 in this case.
As for the badblocks write+readback, it would mean destroying the data on the drives. While the data is not critical, I would like to keep it, at least on the 3 TB drive. (Edit: I forgot about the non-destructive write mode)
I was (wishfully) thinking that maybe the firmware was somehow damaged and that reflashing the drives would solve it. But I have not been able to find out how to reflash/update the firmware on Western Digital drives.
By the way, the capacity difference on the 3 TB drive is 1081856 bytes. That is 2113 logical sectors of 512 bytes, or 264.125 physical sectors of 4096 bytes. Since the drive uses Advanced Format with 4K physical sectors, a fractional value seems very strange to me. Also, the original capacity (3000592982016) is divisible by 4096, while the new one (3000591900160) is not.
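The arithmetic above can be re-checked with a few lines of Python. This is only a sketch using the two SMART capacities quoted in this thread; the function name and the dictionary keys are made up for illustration.

```python
CAPACITY_2013 = 3000592982016  # bytes, from the 2013 smartctl output
CAPACITY_NOW = 3000591900160   # bytes, from the current smartctl output


def describe_loss(before, after, logical=512, physical=4096):
    """Express a capacity drop in logical and physical sectors."""
    lost = before - after
    return {
        "lost_bytes": lost,
        "logical_sectors": lost / logical,
        "physical_sectors": lost / physical,
        # A 4K-sector drive should expose a multiple of 4096 bytes.
        "before_aligned": before % physical == 0,
        "after_aligned": after % physical == 0,
    }


if __name__ == "__main__":
    for key, value in describe_loss(CAPACITY_2013, CAPACITY_NOW).items():
        print(f"{key}: {value}")
```

A loss that is a whole number of 512-byte sectors but not of 4096-byte sectors suggests the limit was set in logical sectors by software or firmware rather than caused by physical sectors going missing.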
I'm analyzing the first sectors of the 3 TB drive. The first 512 bytes (the MBR) are zeroed, the primary GPT header begins right after them, and the secondary GPT header sits 512 bytes before the end of the drive. The two headers differ in the CRC, the header and backup LBA (they are swapped) and the start of the partition array (I guess each one points to its own copy)... So the drive has the first and the last blocks in the proper places, but according to SMART the drive is 1 MB smaller...
Though I remember an application complaining that the GPT was incorrect and that it had to use the backup one... But it should not have overwritten the "original one"...
And by the way, the ext4 filesystem seems to start at the proper offset marked in the partition table, while the end offset is outside the size of the drive. But the offset, plus the size of the GPT gives the original size (3000592982016 bytes)
sfdisk -V /dev/sdc
/dev/sdc:
Partition 1 is too big for the disk.
1 error detected.
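The GPT check described above can be sketched like this: decode the 92-byte GPT header and work out how large the disk must be for the backup header to sit on the last sector. The field layout follows the GPT header format in the UEFI specification; the function names, the returned dictionary and the 512-byte sector size are assumptions for illustration, and reading a real device needs root.

```python
import struct

# GPT header: signature, revision, header size, header CRC32, reserved,
# current LBA, backup LBA, first usable LBA, last usable LBA, disk GUID,
# partition array LBA, entry count, entry size, partition array CRC32.
GPT_HEADER = struct.Struct("<8sIIIIQQQQ16sQIII")  # 92 bytes, little-endian


def parse_gpt_header(raw):
    """Decode the LBA fields of a GPT header from raw sector bytes."""
    fields = GPT_HEADER.unpack(raw[:GPT_HEADER.size])
    if fields[0] != b"EFI PART":
        raise ValueError("no GPT signature found")
    return {
        "current_lba": fields[5],
        "backup_lba": fields[6],
        "first_usable_lba": fields[7],
        "last_usable_lba": fields[8],
    }


def disk_bytes_expected_by_gpt(header, sector_size=512):
    """Size the disk had when the GPT was written.

    One of the two headers lives on the last sector, so the larger of
    the two LBAs plus one is the sector count. This works for both the
    primary and the backup header.
    """
    last_lba = max(header["current_lba"], header["backup_lba"])
    return (last_lba + 1) * sector_size


def read_primary_header(path, sector_size=512):
    """Read the primary header from LBA 1 of a device or image file."""
    with open(path, "rb") as disk:
        disk.seek(sector_size)
        return parse_gpt_header(disk.read(GPT_HEADER.size))
```

If the value returned by disk_bytes_expected_by_gpt is larger than the capacity the drive reports now, the partition table was written when the drive exposed more sectors, which matches the sfdisk complaint.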
I'm tired, I'll go to sleep
Last edited by noalwin (2015-11-22 23:59:02)
Well, in the end it seems that sdc (the 3 TB drive) had a Host Protected Area enabled. After I disabled it (with hdparm -N), sdc1 became visible. Of course, there is a risk that ~1.5 MB of data is corrupt, but I'm not worried about it.
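For anyone checking their own drive: hdparm -N /dev/sdX prints a "max sectors = visible/native" line, and the gap between the two numbers is what the HPA hides. Below is a small sketch that parses such a line. The sample string is not real output from this drive; it is reconstructed from the capacities quoted earlier in the thread, and the exact output format is an assumption that may differ between hdparm versions.

```python
import re

# Matches the "max sectors = <visible>/<native>" part of hdparm -N output.
MAX_SECTORS_RE = re.compile(r"max sectors\s*=\s*(\d+)/(\d+)")

# Hypothetical sample line built from the SMART capacities in this thread.
SAMPLE = " max sectors   = 5860531055/5860533168, HPA is enabled"


def hidden_by_hpa(hdparm_output, sector_size=512):
    """Return (visible sectors, native sectors, hidden bytes)."""
    match = MAX_SECTORS_RE.search(hdparm_output)
    if match is None:
        raise ValueError("no 'max sectors' line found")
    visible, native = int(match.group(1)), int(match.group(2))
    return visible, native, (native - visible) * sector_size


if __name__ == "__main__":
    visible, native, hidden = hidden_by_hpa(SAMPLE)
    print(f"visible={visible} native={native} hidden={hidden} bytes")
```

Restoring the full size means passing the native sector count back to hdparm -N (a "p" prefix on the number makes the change permanent). Check the hdparm man page before doing that on a drive with data you care about.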
In the case of sdb7, there was no Host Protected Area enabled... but there was free space after the partition, so I resized the partition without resizing the filesystem. I'll try to check the integrity of the data there...
I haven't read through all of the posts but if in fact the HPA you linked was to blame, I recommend that you change the title of your first post to something more accurate to help others when searching. In other words, if the hardware failure was not the cause of the problem.
CPU-optimized Linux-ck packages @ Repo-ck • AUR packages • Zsh and other configs
I haven't read through all of the posts but if in fact the HPA you linked was to blame, I recommend that you change the title of your first post to something more accurate to help others when searching. In other words, if the hardware failure was not the cause of the problem.
HPA was the cause in sdc, but in sdb it was disabled... I don't know what caused it... Anyway I will update the title