hello,
before anything: i'm dual booting arch with windows 7, and i can boot into windows 7 normally and explore my archlinux drive from there.
today i was under pressure and had to leave with my laptop. it already had an issue where it can't enter sleep or hibernate: it either turns back on as soon as i hit sleep/hibernate or just hangs on the desktop where i can't do anything, so i could only shut down, restart and power on.
this time it hung when i tried to sleep/hibernate, so i turned it off with the power button and took the laptop with me. when i turned it on again, i was asked to run fsck manually, so i booted an archlinux image from usb. so far i've never had problems running fsck to fix this kind of issue, but this time i'm getting this when i run
fsck /dev/sda4
and it's just hanging on force rewrite, idk if it's supposed to take a while or not...
i have so much software and so many projects on there that i need, and i can't afford to lose everything i built and configured over such a long time. what do i need to do?
i'd really appreciate the help
Last edited by aksilarch (2021-11-10 20:31:45)
archlinux cures my depression.
Offline
<moderator mode>
Please update your thread title. "Urgent" is superfluous.
You may want to read the article linked in my signature.
</moderator mode>
What type of drive is it? What is the file system?
Nothing is too wonderful to be true, if it be consistent with the laws of nature -- Michael Faraday
The shortest way to ruin a country is to give power to demagogues.— Dionysius of Halicarnassus
---
How to Ask Questions the Smart Way
Offline
thanks for the notice, i updated the title.
what do you mean by type of drive? it's an old laptop (HP-G62 notebook) with an HDD and the filesystem is ext4.
Offline
Please run a S.M.A.R.T. self test on the drive and post the results.
Do you have a backup of the data on the drive?
Offline
i will try it and post the results, but which one is needed, short or extended?
Offline
Start with short and post the results from that.
Offline
here's the short:
smartctl -H /dev/sda
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.13.13-1-MANJARO] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Please note the following marginal Attributes:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
190 Airflow_Temperature_Cel 0x0022   068   041   045    Old_age   Always   In_the_past 32 (Min/Max 23/37)
fsck still does nothing, it just hangs on force rewrite for the /dev/sda4.
Offline
or is it this one?
smartctl -a /dev/sda
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.13.13-1-MANJARO] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Toshiba 2.5" HDD MK..56GSY
Device Model: TOSHIBA MK3256GSY
Serial Number: 50T1FASWS
LU WWN Device Id: 5 000039 2867896a4
Firmware Version: LH013C
User Capacity: 320,072,933,376 bytes [320 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm
Form Factor: 2.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS (minor revision not indicated)
SATA Version is: SATA 2.6, 3.0 Gb/s
Local Time is: Tue Nov 9 02:30:22 2021 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 121) The previous self-test completed having
the read element of the test failed.
Total time to complete Offline
data collection: ( 120) seconds.
Offline data collection
capabilities: (0x51) SMART execute Offline immediate.
No Auto Offline data collection support.
Suspend Offline collection upon new
command.
No Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 80) minutes.
SCT capabilities: (0x0033) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   050    Pre-fail  Always   -           0
  2 Throughput_Performance  0x0007   100   100   050    Pre-fail  Always   -           0
  3 Spin_Up_Time            0x0003   100   100   002    Pre-fail  Always   -           2164
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always   -           5227
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always   -           9
  7 Seek_Error_Rate         0x000f   100   100   050    Pre-fail  Always   -           0
  8 Seek_Time_Performance   0x0005   100   100   050    Pre-fail  Offline  -           0
  9 Power_On_Minutes        0x0032   082   082   000    Old_age   Always   -           123h+20m
 10 Spin_Retry_Count        0x0013   204   100   030    Pre-fail  Always   -           0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always   -           5218
183 Runtime_Bad_Block       0x0022   100   100   001    Old_age   Always   -           9
184 End-to-End_Error        0x0033   100   100   097    Pre-fail  Always   -           0
185 Unknown_Attribute       0x0032   100   100   001    Old_age   Always   -           65535
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always   -           782
188 Command_Timeout         0x0032   100   100   000    Old_age   Always   -           0
189 High_Fly_Writes         0x003a   100   100   001    Old_age   Always   -           0
190 Airflow_Temperature_Cel 0x0022   069   041   045    Old_age   Always   In_the_past 31 (Min/Max 23/37)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always   -           1098
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always   -           14614751
193 Load_Cycle_Count        0x0032   093   093   000    Old_age   Always   -           75257
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always   -           8
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always   -           3
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always   -           0
SMART Error Log Version: 1
ATA Error Count: 1670 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 1670 occurred at disk power-on lifetime: 7400 hours (308 days + 8 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 41 42 70 30 c4 40 Error: UNC at LBA = 0x00c43070 = 12857456
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 08 40 70 30 c4 40 00 01:28:14.362 READ FPDMA QUEUED
ef 10 02 00 00 00 a0 00 01:28:14.362 SET FEATURES [Enable SATA feature]
ec 00 00 00 00 00 a0 00 01:28:14.361 IDENTIFY DEVICE
ef 03 45 00 00 00 a0 00 01:28:14.361 SET FEATURES [Set transfer mode]
ef 10 02 00 00 00 a0 00 01:28:14.361 SET FEATURES [Enable SATA feature]
Error 1669 occurred at disk power-on lifetime: 7400 hours (308 days + 8 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 41 aa 70 30 c4 40 Error: UNC at LBA = 0x00c43070 = 12857456
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 08 a8 70 30 c4 40 00 01:28:11.275 READ FPDMA QUEUED
60 08 a0 68 30 c4 40 00 01:28:11.274 READ FPDMA QUEUED
60 08 98 60 30 c4 40 00 01:28:11.274 READ FPDMA QUEUED
60 08 90 58 30 c4 40 00 01:28:11.274 READ FPDMA QUEUED
60 08 88 50 30 c4 40 00 01:28:11.274 READ FPDMA QUEUED
Error 1668 occurred at disk power-on lifetime: 7400 hours (308 days + 8 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 41 12 70 30 c4 40 Error: UNC at LBA = 0x00c43070 = 12857456
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 20 f0 c8 5d 02 40 00 01:28:08.043 READ FPDMA QUEUED
60 18 08 a0 5d 02 40 00 01:28:08.043 READ FPDMA QUEUED
60 08 00 48 5d 02 40 00 01:28:08.043 READ FPDMA QUEUED
60 30 f8 08 5c 02 40 00 01:28:08.043 READ FPDMA QUEUED
60 10 e8 f0 5b 02 40 00 01:28:08.042 READ FPDMA QUEUED
Error 1667 occurred at disk power-on lifetime: 7393 hours (308 days + 1 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 41 02 70 30 c4 40 Error: UNC at LBA = 0x00c43070 = 12857456
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 08 00 70 30 c4 40 00 00:02:39.405 READ FPDMA QUEUED
ef 10 02 00 00 00 a0 00 00:02:39.402 SET FEATURES [Enable SATA feature]
ec 00 00 00 00 00 a0 00 00:02:39.401 IDENTIFY DEVICE
ef 03 45 00 00 00 a0 00 00:02:39.401 SET FEATURES [Set transfer mode]
ef 10 02 00 00 00 a0 00 00:02:39.401 SET FEATURES [Enable SATA feature]
Error 1666 occurred at disk power-on lifetime: 7393 hours (308 days + 1 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 41 7a 70 30 c4 40 Error: UNC at LBA = 0x00c43070 = 12857456
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 08 78 70 30 c4 40 00 00:02:36.315 READ FPDMA QUEUED
60 08 70 68 30 c4 40 00 00:02:36.315 READ FPDMA QUEUED
60 08 68 60 30 c4 40 00 00:02:36.314 READ FPDMA QUEUED
60 08 60 58 30 c4 40 00 00:02:36.314 READ FPDMA QUEUED
60 08 58 50 30 c4 40 00 00:02:36.313 READ FPDMA QUEUED
SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure  90%        7400             341050017
# 2  Short offline       Completed: read failure  90%        7400             341050017
# 3  Short offline       Completed: read failure  90%        7391             341050017
# 4  Short offline       Completed without error  00%        4                -
SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
   1        0        0  Not_testing
   2        0        0  Not_testing
   3        0        0  Not_testing
   4        0        0  Not_testing
   5        0        0  Not_testing
Offline
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 9
...
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 3
Pending and reallocated sectors are not a good sign.
Do you have another drive onto which you can image the one with the corrupted file system?
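For anyone who wants to script that check, here is a minimal Python sketch, not something posted in the thread. It pulls the raw values out of smartctl -A style rows and flags non-zero reallocated or pending counts. The function names parse_attributes and media_warnings and the choice of watched IDs (5, 196, 197) are assumptions made for illustration; the sample rows are copied from the output above.

```python
# Attribute IDs treated as "media is degrading" when their raw value is non-zero.
# This selection is an assumption for the sketch, not a smartmontools rule.
WATCHED = {
    5: "Reallocated_Sector_Ct",
    196: "Reallocated_Event_Count",
    197: "Current_Pending_Sector",
}


def parse_attributes(text):
    """Return {attribute_id: raw_value} for rows of a smartctl -A style table."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # A data row has 10+ columns: ID, name, flag, value, worst, thresh,
        # type, updated, when_failed, raw. Rows with a non-numeric raw value
        # (e.g. "123h+20m") are skipped.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attrs[int(fields[0])] = int(fields[9])
    return attrs


def media_warnings(attrs):
    """List watched attributes whose raw value is above zero."""
    return [f"{WATCHED[i]} = {attrs[i]}" for i in sorted(WATCHED) if attrs.get(i, 0) > 0]


SAMPLE = """\
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 9
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 8
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 3
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
"""

for warning in media_warnings(parse_attributes(SAMPLE)):
    print("WARNING:", warning)
```

With the rows from this drive, all three watched attributes are reported, which is why the drive should not be trusted with the only copy of the data.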
Offline
i have a usb 3.1 drive with 32gb capacity, what can i do? i was thinking of copying my whole system to it, formatting the partition where the system is located, and then reinstalling by restoring the copied image (to keep all my old configuration and everything i built), without losing the dual boot, or at least grub. but how do i do all of that? i'm kind of stuck and don't know how to proceed because i don't have many options.
Last edited by aksilarch (2021-11-09 01:50:01)
Offline
My guess, looking at the screenshot you posted, is that there is a bad block in the area used by sda4. fsck tried to write to that block and the drive could not do it. At this point the system will hang waiting for the write operation to complete or time out.
The drive firmware should be retrying the write until it gives up, marks the block as bad and uses a reserve block. For some reason this does not seem to have happened, or something else failed in the process.
If the above is correct then you need a replacement drive, or you could try to avoid that block (Badblocks#During_filesystem_check) and hope more bad blocks do not appear.
Whatever was stored in that block is unrecoverable.
Offline
booting into windows and using diskinternals, i could access my files (well, everything i needed). so now i'm trying what you just suggested. i ran:
fsck -vcck /dev/sda4
i'll follow up when it's done, so far it's at 0.50% with errors (1/0/0)
Offline
Sorry for not getting back earlier. Yes, that answers all my questions: I wanted to know if it was a spinning disk, and ext4 is what I was looking for as to the file system.
It sounds like you have a hard failure. Let's see what happens with the fsck. If we can recover the data, great. Either way, this drive is toast and cannot be trusted.
This is why $DEITY created backups.
Offline
I'd suggest not running fsck or SMART tests on that drive.
dd_rescue the partition onto a drive that is not broken and fsck the image.
Every attempt to write onto a broken disk has the potential to make the situation much worse (e.g. turn the superblock from readable into garbage).
Because of the bad block, some data will be lost - no matter what.
Offline
here's the output of
fsck -vcck /dev/sda4
fsck from util-linux 2.37.2
e2fsck 1.46.4 (18-Aug-2021)
Checking for bad blocks (non-destructive read-write test)
Testing with random pattern: done
/dev/sda4: Updating bad block inode.
Pass 1: Checking inodes, blocks, and sizes
Running additional passes to resolve blocks claimed by more than one inode...
Pass 1B: Rescanning for multiply-claimed blocks
Multiply-claimed block(s) in inode 274367: 1072654
Multiply-claimed block(s) in inode 5505417: 153684
Pass 1C: Scanning directories for inodes with multiply-claimed blocks
then it's asking me to ignore and force write again
Last edited by aksilarch (2021-11-09 09:01:30)
Offline
well, if i do that (i've never used ddrescue), can i restore that image later on and use it, with all my packages and configurations that is?
Offline
That depends on how much could be rescued and how much damage there was.
I'd do a clean reinstall and restore configs / data separately if possible.
Don't run fsck or even badblocks on a known bad drive! Even though it says non-destructive, it's destructive if it can't write the original data back after overwriting it. If you still need the data off such a drive, ddrescue is the only way forward.
Offline
that's the thing, i'm still new to this. how can i restore my package data and config? is there documentation?
Offline
Your immediate concern is to secure the data.
Whether and what can be restored from that afterwards is something you'll figure out *afterwards*, because it depends drastically on what's actually still readable.
Offline
okay gotcha, will do
Offline
so i ran ddrescue, what do you guys make of this?
ddrescue -d -r3 /dev/sda4 myarch.img myarch.logfile
GNU ddrescue 1.25
Press Ctrl-C to interrupt
ipos: 4393 MB, non-trimmed: 0 B, current rate: 0 B/s
opos: 4393 MB, non-scraped: 0 B, average rate: 20483 kB/s
non-tried: 0 B, bad-sector: 1024 B, error rate: 85 B/s
rescued: 146084 MB, bad areas: 1, run time: 1h 58m 52s
pct rescued: 99.99%, read errors: 11, remaining time: n/a
time since last successful read: 24s
Finished
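The logfile (mapfile) written by ddrescue records which byte ranges could not be read. As a rough sketch that was not part of the original exchange, the Python below lists the ranges marked as bad sectors ("-"). The function name bad_areas is hypothetical, and the sample mapfile is invented for illustration: its offsets are not the real ones from this drive, only the 1024-byte size matches the "bad-sector: 1024 B" figure above.

```python
def bad_areas(mapfile_text):
    """Return [(offset, size)] in bytes for every mapfile block marked '-'."""
    rows = [
        line.split()
        for line in mapfile_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    areas = []
    # rows[0] is the "current_pos current_status [current_pass]" line; skip it.
    for fields in rows[1:]:
        pos, size, status = fields[0], fields[1], fields[2]
        if status == "-":  # '-' means a bad-sector block
            areas.append((int(pos, 16), int(size, 16)))
    return areas


# Invented sample, NOT the real mapfile from this thread.
SAMPLE_MAPFILE = """\
# Mapfile. Created by GNU ddrescue version 1.25
# current_pos  current_status  current_pass
0x10000400     +               1
#      pos        size  status
0x00000000  0x10000000  +
0x10000000  0x00000400  -
0x10000400  0x01000000  +
"""

for offset, size in bad_areas(SAMPLE_MAPFILE):
    print(f"bad area at byte {offset}: {size} bytes ({size // 512} sectors)")
```

Run against the real myarch.logfile, the same loop would show where inside the image the unreadable 1024 bytes sit.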
Offline
You've lost only the one bad block.
You can now check from the live distro which file(s) is (are) affected by the block and judge how bad that is (could be a file that you can simply re-install, could be a cat meme, could be your master's thesis…)
=> https://wiki.archlinux.org/title/Identi … ,_and_ext4
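Mapping a failing LBA to a filesystem block is plain arithmetic, sketched below in Python as an illustration only. The LBA 341050017 comes from the self-test log posted earlier and the 512-byte sector size from the smartctl output; the 4096-byte block size is the usual ext4 default and should be confirmed with tune2fs. The partition start sector is NOT stated anywhere in the thread: 339820544 is an assumed value, and the real one has to be read from fdisk -l or similar. The function name lba_to_fs_block is hypothetical.

```python
SECTOR_SIZE = 512   # logical sector size reported by smartctl for this drive
BLOCK_SIZE = 4096   # ext4 block size; confirm with tune2fs -l


def lba_to_fs_block(lba, part_start, block_size=BLOCK_SIZE, sector_size=SECTOR_SIZE):
    """Convert an absolute disk LBA to a block number inside the partition."""
    if lba < part_start:
        raise ValueError("LBA lies before the start of the partition")
    return (lba - part_start) * sector_size // block_size


# ASSUMPTION: start sector of /dev/sda4, not given in the thread.
ASSUMED_PART_START = 339820544

print(lba_to_fs_block(341050017, ASSUMED_PART_START))  # prints 153684
```

If the partition really starts at that sector, the result coincides with block 153684, one of the multiply-claimed blocks e2fsck reported above. The resulting block number is what debugfs icheck expects.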
You must then choose what to do w/ the broken disk.
You can either replace it, recreate the partition table on the new disk and just dd the rescued image onto the appropriate partition (it should have the original size and must not be smaller; you can still grow the filesystem later if you made the partition bigger), install windows, whatever, and delete/overwrite the damaged file you identified before.
Or you can hope (errhemm…) that there are no further issues w/ the drive and just continue to use it.
The SMART data you posted doesn't lend itself to support that hope, so you would at least want to badblocks torture the drive a bit in the non-destructive mode
=> https://wiki.archlinux.org/title/Badblo … structive)
You also want to check the SMART status in between to see whether the pending and re-allocated sectors creep up (which would be a sure confirmation that you wasted your time and should replace the drive)
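One way to watch for that creep is to compare the raw counters from two smartctl runs. The Python below is only a sketch under stated assumptions: the function name creeping is hypothetical, the "before" numbers are the raw values from the output posted earlier in the thread, and the "after" numbers are invented to show what a worsening drive would look like.

```python
def creeping(before, after, watched=(5, 196, 197)):
    """Return {id: (old, new)} for watched attributes whose raw value grew."""
    return {
        attr_id: (before.get(attr_id, 0), after.get(attr_id, 0))
        for attr_id in watched
        if after.get(attr_id, 0) > before.get(attr_id, 0)
    }


# Raw values taken from the smartctl output posted in this thread.
before = {5: 9, 196: 8, 197: 3}
# INVENTED later reading, for illustration only.
after = {5: 11, 196: 10, 197: 3}

for attr_id, (old, new) in creeping(before, after).items():
    print(f"attribute {attr_id} grew from {old} to {new}")
```

Any growth between runs means the drive is still deteriorating and should be replaced rather than tested further.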
Offline
here's an output:
tune2fs -l /dev/sda4 | grep Block
Block count: 35665238
Block size: 4096
Blocks per group: 32768
[manjaro archsda4]# debugfs
debugfs 1.46.4 (18-Aug-2021)
debugfs: open /dev/sda4
debugfs: testb 1072654
Block 1072654 marked in use
debugfs: icheck 1072654
Block Inode number
1072654 274367
debugfs: ncheck 274367
Inode Pathname
274367 /usr/lib/oracle/admin/XE/adump
debugfs: icheck 153684
Block Inode number
153684 5505417
debugfs: ncheck 5505417
Inode Pathname
ncheck: Input/output error while calling ext2_dir_iterate
5505417 /var/log/journal/d791f2fb296940189b0e7d03e3a5756c/system@00000000000000000000000000000000-0000000000000000-0000000000000000.journal
debugfs: testb 5505417
Block 5505417 marked in use
Offline
marking this thread as solved.
here's what i did (as a temporary fix until i install a new drive):
# ddrescue -d -r3 /dev/sda4 myarch.img myarch.logfile
where /dev/sda4 is the source partition to image (the filesystem containing archlinux) and myarch.img is the name of the resulting image.
(i ran this command from a directory on an external drive)
after that i deleted the /dev/sda4 partition w/ gparted (and recreated it larger to have more storage space).
then i ran:
# ddrescue -f myarch.img /dev/sda4 restore.logfile
to restore the myarch.img image from the external drive onto /dev/sda4. when the cloning completed, i had to run
fsck /dev/sda4
which fixed the multiply-claimed blocks, and now everything is working fine on the machine with the same "FAULTY" drive. it has to be replaced soon though.
thanks to all
Last edited by aksilarch (2021-11-10 20:41:22)
Offline
You could have skipped the reverse dd(rescue), but you now want to make sure to check for bad blocks during the filesystem check (https://wiki.archlinux.org/title/Badblo … stem_check), and since you apparently grew the partition, you also want to grow the FS w/ resize2fs: https://wiki.archlinux.org/title/Parted … partitions
Don't trust the disk and make sure to store (copies of) any important data on external drives.
Good luck.
Offline