
#1 2012-08-23 14:42:19

68flag
Member
Registered: 2012-08-23
Posts: 4

DRDY ERR I/O Errors only occur with KDE. HDD or kernel related?

This isn't too urgent since I'm not having this problem anymore, but I would like to know what caused it in the first place.

A few months ago (I think it was in May), the desktop wouldn't load when I booted into KDE. After logging in, it went to a black screen with a cursor. I checked my tty consoles and they were all spammed with DRDY and I/O errors.
My log files don't go back that far anymore, but I apparently wrote the errors down:

[timestamp] ata1.00: irq_stat 0x40000001
[timestamp] ata1: SError: { CommWake }
[timestamp] ata1.00: failed command: READ FPDMA QUEUED
[timestamp] ata1.00: cmd 60/08:00:79:c5:e9/00:00:38:00:00/40 tag 0 ncq 4096 in
[timestamp]                res 41/40:08:7c:c5:e9/00:00:38:00:00/60 Emask 0x409 (media error) <F>
[timestamp] ata1.00: status {DRDY ERR }
[timestamp] error: { UNC }
[timestamp] end_request: I/O error, dev sda, sector 9540445540   

It repeated this over and over again on all 7 consoles. I believe the hex numbers and sector number changed slightly each time.
I restarted the computer and it did the same thing. Then one day it booted into the desktop after sitting at a black screen with a cursor for 10 minutes.
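
For anyone trying to read those log lines: the cmd and res entries are dumps of the ATA taskfile registers, and the failing sector can be worked out from them. Below is a rough Python sketch, not anything taken from the kernel. It assumes the twelve hex bytes are printed in the order libata uses (command or status, feature or error, sector count, the three low LBA bytes, five high-order "hob" bytes, then device). The names decode_taskfile, lba48 and flag_names are made up for this illustration, and only a few well-known status and error bits are listed.

```python
import re

# Assumed field order of a libata "cmd"/"res" taskfile dump (12 hex bytes).
FIELD_NAMES = [
    "command_or_status", "feature_or_error", "nsect", "lbal", "lbam", "lbah",
    "hob_feature", "hob_nsect", "hob_lbal", "hob_lbam", "hob_lbah", "device",
]

# A few standard ATA status and error register bits (not a complete list).
STATUS_BITS = {0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x08: "DRQ", 0x01: "ERR"}
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}


def decode_taskfile(line):
    """Split the hex bytes after 'cmd' or 'res' into named integer fields."""
    match = re.search(r"\b(cmd|res)\s+([0-9a-fA-F/:]+)", line)
    if match is None:
        raise ValueError("no cmd/res taskfile found in line")
    raw = re.split(r"[/:]", match.group(2))
    if len(raw) != len(FIELD_NAMES):
        raise ValueError("expected 12 hex bytes in the taskfile")
    return dict(zip(FIELD_NAMES, (int(byte, 16) for byte in raw)))


def lba48(fields):
    """Combine the six LBA bytes into one 48-bit sector number."""
    return (
        fields["hob_lbah"] << 40 | fields["hob_lbam"] << 32 | fields["hob_lbal"] << 24
        | fields["lbah"] << 16 | fields["lbam"] << 8 | fields["lbal"]
    )


def flag_names(value, table):
    """List the names of the bits set in a status or error byte."""
    return [name for bit, name in table.items() if value & bit]


res_line = "res 41/40:08:7c:c5:e9/00:00:38:00:00/60 Emask 0x409 (media error)"
fields = decode_taskfile(res_line)
print(lba48(fields), hex(lba48(fields)))
print(flag_names(fields["command_or_status"], STATUS_BITS))
print(flag_names(fields["feature_or_error"], ERROR_BITS))
```

Fed the res line above, this gives LBA 954844540 (0x38e9c57c), status DRDY ERR and error UNC, which agrees with the decoded status and error lines the kernel printed.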

When I saw "media error" and "I/O error", I immediately thought my hard drive was failing (which sucks, because I've had this computer for less than a year). But the weird thing was that once it finally booted into a desktop environment, everything worked fine. I could access any file without errors, I could boot into my Windows partition perfectly, and everything ran as fast as it did before. I ran fsck and chkdsk several times, and both passed with 0 bad sectors. SMART tests came out perfect.

The strangest thing about this is that the errors only came up when I booted into KDE. If I booted into any other environment, I did not get them. Completely reinstalling KDE did not fix it. I don't remember updating or changing anything before these errors started appearing. I did do a full system update once I was actually able to get into my KDE session, but it didn't change anything. I eventually worked around it by using another desktop environment, which was Xfce at the time and is now Cinnamon. I haven't gotten those errors since.

I posted something similar on KDE's forum, and no one could figure it out. One of the admins told me it's very unusual for something like this to happen only with KDE, and that the READ FPDMA/DRDY ERRs "are sourced from the kernel - and are disk/kernel related according to https://bugs.launchpad.net/ubuntu/+sour ... bug/550559."
As of now I can install KDE perfectly without any sort of issue, but I know several kernel updates have occurred since then. So could this be an issue with the kernel? Or is my HDD really starting to fail? I've backed up the computer already, but I record music constantly and don't "aggressively" run backups (i.e., not every day, but every month or so). Should I start doing so?

Thanks


#2 2012-08-27 07:00:35

lspci
Member
From: Virginia, USA
Registered: 2012-06-09
Posts: 242

Re: DRDY ERR I/O Errors only occur with KDE. HDD or kernel related?

I'm afraid to say that I think it's the HDD. I've had a similar problem. Or wait, I dunno... now that I've actually read your post I'm not so sure. That's a good thing, though! :D It may mean that there's still hope for my poor Compaq laptop. I only encountered the error(s) one time after my re-installation of Arch Linux, which I did just to make sure it wasn't my installation. (And 'cause I wanted to use the new GRUB and didn't feel like upgrading.)
https://bbs.archlinux.org/viewtopic.php?id=147189

Yeah, I would back up religiously... well, sort of. Just do a backup every time you do something important to you, like record a song. Do you already have a backup script or some kind of setup in place? :P

Personally, I'm due for a new backup within a week or two simply because it's been about a month or so since my last one. So I guess as long as you do a backup once a month, you should... be fine. I'm not saying that more regular backups would necessarily be a bad thing, although they could prove to be quite a pain.
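
If you don't have a script yet, something along these lines could serve as a starting point. It's only a sketch in Python: backup_changed_files is a made-up name, the paths in the usage comment are placeholders, and it only copies files that are missing from the backup or newer in the source. It never deletes anything from the backup, and it is no substitute for a proper tool such as rsync.

```python
import shutil
from pathlib import Path


def backup_changed_files(source, destination):
    """Copy files that are missing from destination or newer in source.

    The destination should live outside the source tree (ideally on
    another disk), otherwise the backup would start copying itself.
    """
    source, destination = Path(source), Path(destination)
    copied = []
    for src in source.rglob("*"):
        if not src.is_file():
            continue
        dst = destination / src.relative_to(source)
        if not dst.exists() or src.stat().st_mtime > dst.stat().st_mtime:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also carries over the modification time
            copied.append(dst)
    return copied


# Example call with placeholder paths:
# backup_changed_files("/home/me/recordings", "/mnt/backup/recordings")
```

Running it right after a recording session only copies what changed since the last run, which keeps the "back up whenever you do something important" habit cheap.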

See if you find any of these useful. 

http://www.linuxquestions.org/questions … ing-838499

http://mikeys-ranting.blogspot.com/2010 … olved.html

https://answers.launchpad.net/ubuntu/+question/122588

Here's a bit from a comment on a post @ http://superuser.com/questions/121391/s … -icrc-abrt

DRDY ERR messages actually seem to be reported as a kernel bug on a lot of systems, mostly Ubuntu and, to a lesser extent, Debian. I am investigating this because it has started happening to me recently. I would recommend the following (you will need a bootable CD for some of these steps, and possibly for all of them if the disk is misbehaving; the Ubuntu desktop install CD works well without making you install anything):

Put "options libata noacpi=1" in /etc/modprobe.d/options.conf.
Run "e2fsck -f -c -v /dev/sda1", replacing /dev/sda1 with each partition causing the error. As far as I know, e2fsck needs a partition with a file system, so it probably won't work on the whole disk. Even if it does work on the whole disk, you still need to run it on the partitions anyway. You need a bootable CD for this.
Edit /boot/grub/menu.lst and add "noapic" to the end of the line that starts with "# kopt". The # at the start is important and does not act like a comment. Do not remove it.
This does not affect the disk, but if you change "splash" to "nosplash" and remove the word "quiet" on the line of /boot/grub/menu.lst that starts with "# defoptions", Ubuntu will boot without an image and give you more verbose output instead.
On Ubuntu, after you change anything in /boot/grub/menu.lst you must run /usr/sbin/update-grub.

Personally, I'm too new here and really don't know enough about the cryptic codes of the SATA stuff so here's what I found as a comment to one of the above links. 

The error is related to SATA Native Command Queueing (NCQ). FPDMA = First Party DMA. This is a newish performance feature on SATA drives.
I'd recommend a couple checks:
1) Update to the latest driver if you haven't already (several Ubuntu users seem to have seen this as well - https://bugs.launchpad.net/ubuntu/+bug/550559)
2) If you're using new drives with an older-generation motherboard, there might be a SATA spec compatibility issue. You can sometimes jumper the SATA drives to legacy mode (sometimes called SATA2 or 1.5). You may also be able to set your hardware (in the BIOS) to a legacy mode. This should fix the above error, but might impose a perf penalty. There may also be a BIOS update to better support SATA.
3) Check whether there are any driver options to disable NCQ support. Though the queueing provides a perf boost, it's not the end of the world to go without it.
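
On Linux, one place to check point 3 is the queue depth that the kernel exposes through sysfs; a depth of 1 effectively means commands are not being queued. Here is a small Python sketch under a few assumptions: the drive is sda, the file /sys/block/sda/device/queue_depth exists on your system, and read_queue_depth and ncq_in_use are helper names invented for this example.

```python
from pathlib import Path


def read_queue_depth(device="sda", sysfs_root="/sys/block"):
    """Read the queue depth the kernel reports for a block device."""
    path = Path(sysfs_root) / device / "device" / "queue_depth"
    return int(path.read_text().strip())


def ncq_in_use(queue_depth):
    """Treat any depth above 1 as 'commands are being queued'."""
    return queue_depth > 1


if __name__ == "__main__":
    try:
        depth = read_queue_depth("sda")
    except (OSError, ValueError) as exc:
        print(f"could not read queue depth: {exc}")
    else:
        print(f"queue_depth={depth}, NCQ in use: {ncq_in_use(depth)}")
```

To actually switch queueing off for a test, the usual approaches are writing 1 to that same file as root or booting with the libata.force=noncq kernel parameter; treat both as things to try rather than a guaranteed fix.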

Last edited by lspci (2012-08-27 07:24:04)




#3 2012-08-27 13:17:19

bart_b
Member
Registered: 2012-06-16
Posts: 20

Re: DRDY ERR I/O Errors only occur with KDE. HDD or kernel related?

Some of these issues are related to interference from Spread Spectrum Clocking (SSC); SSC is a way of reducing interference. Some BIOSes let you tweak this setting, and some hard disks, such as WD's, have a jumper to enable or disable spread spectrum clocking. But since it is more of an RF issue, rearranging your SATA cables or replacing them with better-quality ones might do the trick to reduce interference. If you have overclocked your PC, try to re-tweak it to a more sane level.

Hope this helps


#4 2012-08-28 05:24:01

68flag
Member
Registered: 2012-08-23
Posts: 4

Re: DRDY ERR I/O Errors only occur with KDE. HDD or kernel related?

@lspci
Those links you sent me were VERY helpful! I had only used the online SMART test in SpeedFan on Windows; I never thought of using smartctl --all /dev/sda.

Here's what it gave me:

smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.4.9-1-ARCH] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Toshiba 2.5" HDD MK..76GSX
Device Model:     TOSHIBA MK5076GSX
Serial Number:    91OHT43CT
LU WWN Device Id: 5 000039 3818077b6
Firmware Version: GS002D
User Capacity:    500,107,862,016 bytes [500 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Tue Aug 28 00:22:51 2012 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(  120) seconds.
Offline data collection
capabilities: 			 (0x5b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 163) minutes.
SCT capabilities: 	       (0x003d)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 128
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   100   100   001    Pre-fail  Always       -       1756
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       0
  9 Power_On_Minutes        0x0032   092   092   000    Old_age   Always       -       3277h+14m
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       1072
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       2191
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       133
193 Load_Cycle_Count        0x0032   094   094   000    Old_age   Always       -       61036
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       45 (Min/Max 15/53)
199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       25758112
200 Multi_Zone_Error_Rate   0x0032   100   100   000    Old_age   Always       -       56684070
240 Head_Flying_Hours       0x0032   094   094   000    Old_age   Always       -       153507
241 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       -       5471510246
242 Total_LBAs_Read         0x0032   100   100   000    Old_age   Always       -       7806093714
254 Free_Fall_Sensor        0x0032   100   100   000    Old_age   Always       -       0

SMART Error Log Version: 1
ATA Error Count: 558 (device log contains only the most recent five errors)
	CR = Command Register [HEX]
	FR = Features Register [HEX]
	SC = Sector Count Register [HEX]
	SN = Sector Number Register [HEX]
	CL = Cylinder Low Register [HEX]
	CH = Cylinder High Register [HEX]
	DH = Device/Head Register [HEX]
	DC = Device Command Register [HEX]
	ER = Error register [HEX]
	ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 558 occurred at disk power-on lifetime: 2148 hours (89 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 41 02 7c c5 e9 68  Error: UNC at LBA = 0x08e9c57c = 149538172

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 79 c5 e9 40 00      00:06:08.673  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:06:08.673  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:06:08.673  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:06:08.672  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 00      00:06:08.672  SET FEATURES [Set transfer mode]

Error 557 occurred at disk power-on lifetime: 2148 hours (89 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 41 02 7c c5 e9 68  Error: UNC at LBA = 0x08e9c57c = 149538172

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 79 c5 e9 40 00      00:06:04.673  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:06:04.673  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:06:04.673  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:06:04.672  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 00      00:06:04.672  SET FEATURES [Set transfer mode]

Error 556 occurred at disk power-on lifetime: 2148 hours (89 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 41 02 7c c5 e9 68  Error: UNC at LBA = 0x08e9c57c = 149538172

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 79 c5 e9 40 00      00:06:00.673  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:06:00.673  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:06:00.673  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:06:00.672  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 00      00:06:00.672  SET FEATURES [Set transfer mode]

Error 555 occurred at disk power-on lifetime: 2148 hours (89 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 41 02 7c c5 e9 68  Error: UNC at LBA = 0x08e9c57c = 149538172

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 79 c5 e9 40 00      00:05:56.673  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:05:56.673  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:05:56.672  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:05:56.672  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 00      00:05:56.672  SET FEATURES [Set transfer mode]

Error 554 occurred at disk power-on lifetime: 2148 hours (89 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 41 02 7c c5 e9 68  Error: WP at LBA = 0x08e9c57c = 149538172

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 08 18 d1 d8 e7 40 00      00:05:52.624  WRITE FPDMA QUEUED
  61 08 10 89 d8 e7 40 00      00:05:52.624  WRITE FPDMA QUEUED
  61 08 40 51 d8 e7 40 00      00:05:52.624  WRITE FPDMA QUEUED
  61 10 38 01 d8 e7 40 00      00:05:52.623  WRITE FPDMA QUEUED
  61 08 30 e1 d7 e7 40 00      00:05:52.623  WRITE FPDMA QUEUED

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      2320         -
# 2  Extended offline    Aborted by host               90%      2075         -
# 3  Short offline       Completed without error       00%      1900         -
# 4  Short offline       Completed without error       00%      1574         -
# 5  Short offline       Completed without error       00%      1415         -
# 6  Short offline       Completed without error       00%      1081         -
# 7  Short offline       Completed without error       00%       637         -
# 8  Short offline       Completed without error       00%       248         -
# 9  Short offline       Completed without error       00%         0         -
#10  Short offline       Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

The READ FPDMA QUEUED errors show up in SMART's log, so the problems I had must be HDD related, not kernel related. Other than that, I really don't know what I'm looking at here. I don't like the numbers for UDMA_CRC_Error_Count or Multi_Zone_Error_Rate, but I don't know what they mean. Until someone explains what I'm dealing with, I'm going to assume it is an HDD problem and start furiously backing stuff up.
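
For picking individual values out of that wall of text, here is a rough Python sketch that parses the attribute table into name/raw-value pairs and lists the watched attributes whose raw value is not zero. The helper names and the watch list are made up for this example. How to read the raw numbers is an assumption as well: raw values are vendor-specific, and UDMA_CRC_Error_Count is usually described as counting CRC errors on the link between drive and controller, so a large number there does not by itself prove the platters are bad.

```python
def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} from a smartctl attribute table."""
    attributes = {}
    for line in text.splitlines():
        parts = line.split(None, 9)
        # Attribute rows have 10 columns: a numeric ID, the name, a 0x... flag,
        # and so on, with the raw value last (it may contain spaces).
        if len(parts) == 10 and parts[0].isdigit() and parts[2].startswith("0x"):
            attributes[parts[1]] = parts[9].strip()
    return attributes


def nonzero_raw(attributes, names):
    """List the watched attributes whose raw value starts with a nonzero integer."""
    flagged = []
    for name in names:
        first_token = attributes.get(name, "0").split()[0]
        if first_token.isdigit() and int(first_token) != 0:
            flagged.append(name)
    return flagged


sample = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       45 (Min/Max 15/53)
199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       25758112
200 Multi_Zone_Error_Rate   0x0032   100   100   000    Old_age   Always       -       56684070
"""

# Hypothetical watch list; adjust it to whatever you care about.
WATCHED = ["Reallocated_Sector_Ct", "UDMA_CRC_Error_Count", "Multi_Zone_Error_Rate"]
print(nonzero_raw(parse_smart_attributes(sample), WATCHED))
```

On the rows above this flags UDMA_CRC_Error_Count and Multi_Zone_Error_Rate but not Reallocated_Sector_Ct, whose raw value is 0.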

bart_b wrote:

Some of these issues are related to interference from Spread Spectrum Clocking (SSC); SSC is a way of reducing interference. Some BIOSes let you tweak this setting, and some hard disks, such as WD's, have a jumper to enable or disable spread spectrum clocking. But since it is more of an RF issue, rearranging your SATA cables or replacing them with better-quality ones might do the trick to reduce interference. If you have overclocked your PC, try to re-tweak it to a more sane level.

I forgot to mention that my computer is a Dell Inspiron N5110. It's a piece of garbage; it's the only computer I am aware of that has programs that PREVENT you from connecting to the internet when you don't buy their stupid extended warranty. I'm saving up for a ThinkPad T420, but that's another story.

Whoa. Radio signals interfering with data passing through the SATA cables? I've had poorly shielded guitar cables that pick up odd AM stations, but SATA cables? That's bizarre. I don't own a PC yet, but I'm going to build one for my brothers once the parts come in, and I'll definitely make a note of that. I didn't even know you could overclock a CPU to the point where it GENERATES RF signals powerful enough to interfere with the HDD. That's really interesting.

Another thing I should mention is that the motherboard on my Dell died maybe a month or two after the I/O errors. Gotta love Dell. Every single Dell computer I've owned died of mobo failure without any warning. Even their printers. Good god.
Anyway, could the mobo have been a factor? The tech who replaced the mobo told me that the LCD cable wasn't seated correctly, so could the SATA cables have been bad as well? I really doubt it, but who knows.

---
Edited for embarrassing spelling mistakes

Last edited by 68flag (2012-08-28 05:31:15)


#5 2012-08-28 05:36:33

lspci
Member
From: Virginia, USA
Registered: 2012-06-09
Posts: 242

Re: DRDY ERR I/O Errors only occur with KDE. HDD or kernel related?

68flag wrote:

@lspci
Those links you sent me were VERY helpful! I had only used the online SMART test in SpeedFan on Windows; I never thought of using smartctl --all /dev/sda.

Here's what it gave me:

smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.4.9-1-ARCH] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Toshiba 2.5" HDD MK..76GSX
Device Model:     TOSHIBA MK5076GSX
Serial Number:    91OHT43CT
LU WWN Device Id: 5 000039 3818077b6
Firmware Version: GS002D
User Capacity:    500,107,862,016 bytes [500 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Tue Aug 28 00:22:51 2012 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(  120) seconds.
Offline data collection
capabilities: 			 (0x5b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 163) minutes.
SCT capabilities: 	       (0x003d)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 128
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   100   100   001    Pre-fail  Always       -       1756
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       0
  9 Power_On_Minutes        0x0032   092   092   000    Old_age   Always       -       3277h+14m
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       1072
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       2191
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       133
193 Load_Cycle_Count        0x0032   094   094   000    Old_age   Always       -       61036
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       45 (Min/Max 15/53)
199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       25758112
200 Multi_Zone_Error_Rate   0x0032   100   100   000    Old_age   Always       -       56684070
240 Head_Flying_Hours       0x0032   094   094   000    Old_age   Always       -       153507
241 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       -       5471510246
242 Total_LBAs_Read         0x0032   100   100   000    Old_age   Always       -       7806093714
254 Free_Fall_Sensor        0x0032   100   100   000    Old_age   Always       -       0

SMART Error Log Version: 1
ATA Error Count: 558 (device log contains only the most recent five errors)
	CR = Command Register [HEX]
	FR = Features Register [HEX]
	SC = Sector Count Register [HEX]
	SN = Sector Number Register [HEX]
	CL = Cylinder Low Register [HEX]
	CH = Cylinder High Register [HEX]
	DH = Device/Head Register [HEX]
	DC = Device Command Register [HEX]
	ER = Error register [HEX]
	ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 558 occurred at disk power-on lifetime: 2148 hours (89 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 41 02 7c c5 e9 68  Error: UNC at LBA = 0x08e9c57c = 149538172

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 79 c5 e9 40 00      00:06:08.673  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:06:08.673  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:06:08.673  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:06:08.672  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 00      00:06:08.672  SET FEATURES [Set transfer mode]

Error 557 occurred at disk power-on lifetime: 2148 hours (89 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 41 02 7c c5 e9 68  Error: UNC at LBA = 0x08e9c57c = 149538172

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 79 c5 e9 40 00      00:06:04.673  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:06:04.673  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:06:04.673  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:06:04.672  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 00      00:06:04.672  SET FEATURES [Set transfer mode]

Error 556 occurred at disk power-on lifetime: 2148 hours (89 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 41 02 7c c5 e9 68  Error: UNC at LBA = 0x08e9c57c = 149538172

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 79 c5 e9 40 00      00:06:00.673  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:06:00.673  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:06:00.673  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:06:00.672  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 00      00:06:00.672  SET FEATURES [Set transfer mode]

Error 555 occurred at disk power-on lifetime: 2148 hours (89 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 41 02 7c c5 e9 68  Error: UNC at LBA = 0x08e9c57c = 149538172

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 79 c5 e9 40 00      00:05:56.673  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:05:56.673  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:05:56.672  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:05:56.672  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 00      00:05:56.672  SET FEATURES [Set transfer mode]

Error 554 occurred at disk power-on lifetime: 2148 hours (89 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 41 02 7c c5 e9 68  Error: WP at LBA = 0x08e9c57c = 149538172

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 08 18 d1 d8 e7 40 00      00:05:52.624  WRITE FPDMA QUEUED
  61 08 10 89 d8 e7 40 00      00:05:52.624  WRITE FPDMA QUEUED
  61 08 40 51 d8 e7 40 00      00:05:52.624  WRITE FPDMA QUEUED
  61 10 38 01 d8 e7 40 00      00:05:52.623  WRITE FPDMA QUEUED
  61 08 30 e1 d7 e7 40 00      00:05:52.623  WRITE FPDMA QUEUED

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      2320         -
# 2  Extended offline    Aborted by host               90%      2075         -
# 3  Short offline       Completed without error       00%      1900         -
# 4  Short offline       Completed without error       00%      1574         -
# 5  Short offline       Completed without error       00%      1415         -
# 6  Short offline       Completed without error       00%      1081         -
# 7  Short offline       Completed without error       00%       637         -
# 8  Short offline       Completed without error       00%       248         -
# 9  Short offline       Completed without error       00%         0         -
#10  Short offline       Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

The READ FPDMA QUEUED errors show up in the drive's SMART log, so the problems I had must be HDD related, not kernel related. Other than that I really don't know what I'm looking at here. I don't like the numbers for UDMA_CRC_Error_Count or Multi_Zone_Error_Rate, but I don't know what those mean. Until someone explains what I'm dealing with, I'm going to assume it is an HDD problem and start furiously backing stuff up.
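
For reference, the LBA that smartctl prints in each error entry can be recomputed from the register dump above it. This is a minimal sketch, assuming the usual 28-bit layout (SN, CL and CH carry the low 24 bits, and the low nibble of DH carries the top 4); the function name is made up for illustration. Because this error log only holds 28-bit addresses, on a drive larger than 128 GiB the value may be just the low bits of the sector the kernel reported.

```python
def lba28_from_registers(sn: int, cl: int, ch: int, dh: int) -> int:
    """Rebuild a 28-bit LBA from ATA taskfile registers.

    Assumed layout: SN = bits 0-7, CL = bits 8-15, CH = bits 16-23,
    low nibble of DH = bits 24-27 (the high nibble holds flag bits).
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Register values copied from the error entries: SN=7c CL=c5 CH=e9 DH=68
lba = lba28_from_registers(0x7C, 0xC5, 0xE9, 0x68)
print(f"0x{lba:08x} = {lba}")  # matches the log: 0x08e9c57c = 149538172
```

The same LBA appearing in every entry suggests the drive keeps failing on one spot rather than on scattered sectors.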

bart_b wrote:

Some of these issues are related to interference and Spread Spectrum Clocking (SSC); SSC is a way of reducing interference. Some BIOSes let you tweak this setting, and some hard disks, such as WD drives, have a jumper to enable or disable spread spectrum clocking. But since it is more of an RF issue, rearranging your SATA cables or replacing them with better-quality ones might do the trick to reduce interference. If you have overclocked your PC, try dialing it back to a saner level.

I forgot to mention that my computer is a Dell Inspiron N5110. It's a piece of garbage; it's the only computer I'm aware of that has programs that PREVENT you from connecting to the internet when you don't buy their stupid extended warranty. I'm saving up for a ThinkPad T420, but that's another story.

Whoa. Radio signals interfering with data passing through the SATA cables? I've had poorly shielded guitar cables that picked up odd AM stations, but SATA cables? That's bizarre. I don't own a PC yet, but I'm going to build one for my brothers once the parts come in, and I'll definitely make a note of that. I didn't even know you could overclock a CPU to the point where it GENERATES RF signals powerful enough to interfere with the HDD. That's really interesting.

Another thing I should mention is that the motherboard on my Dell died maybe a month or two after the I/O errors. Gotta love Dell. Every single Dell computer I've owned died of mobo failure without any warning. Even their printers. Good god.
Anyway, could the mobo have been a factor? The tech who replaced the mobo told me that the LCD cable wasn't seated correctly, so could the SATA cables have been bad as well? I really doubt it, but who knows.

---
Edited for embarrassing spelling mistakes

I think it's rather interesting that both you and I have Toshiba hard drives.  I've heard that they're supposed to be as delicate as glass, but mine's held up all right, until these weird errors and stuff. 

As for motherboards: if the motherboard is messed up, everything else will seem messed up too, whether or not it actually is.


Please don't be a help vampire. | Bitbucket

Give a little more for a little less today.  :)

#6 2012-08-28 08:21:24

bart_b
Member
Registered: 2012-06-16
Posts: 20

Re: DRDY ERR I/O Errors only occur with KDE. HDD or kernel related?

SSC is intended as a solution to this kind of failure, not the cause of it. It's not that you have big radio towers in your PC; think of it more as an echo bouncing around on your SATA cable. Modern SATA drivers in the Linux kernel are geared towards high-speed SATA 600 (6.0 Gbps), and if your machine does not cope with that speed you can try limiting your drive to SATA-150 (1.5 Gbps).
If your Toshiba drive has a jumper to force it to SATA-150, try that for the moment.
Another possible solution is to install "sdparm" and set up your device with the --flexible option; see man sdparm.

Last edited by bart_b (2012-08-28 08:59:34)

#7 2012-08-28 15:30:32

68flag
Member
Registered: 2012-08-23
Posts: 4

Re: DRDY ERR I/O Errors only occur with KDE. HDD or kernel related?

@bart
I would rather not mess around with the HDD itself; that usually creates problems rather than solving them. And this computer's HDD was not designed to be user-serviceable; I would have to rip the whole computer apart to gain access to it.
How would I know if my computer uses SATA 600? Laptops usually don't have SATA cables; the HDD plugs directly into the mobo.
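
One way to answer the SATA 600 question without opening the laptop is to look at the link speed the kernel negotiated at boot, which libata prints to the kernel log. This is a minimal sketch, assuming the usual "SATA link up ... Gbps" message format; the function name and the sample line are illustrative and not taken from this machine.

```python
import re

# Matches lines such as "ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)"
LINK_UP = re.compile(r"(ata\d+): SATA link up ([\d.]+) Gbps")


def negotiated_speeds(kernel_log: str) -> dict:
    """Map each ATA port to the last link speed (in Gbps) seen in the log."""
    speeds = {}
    for port, gbps in LINK_UP.findall(kernel_log):
        speeds[port] = float(gbps)
    return speeds


# Illustrative input only; on a real system feed in the output of `dmesg`.
sample = "[    1.234567] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)"
print(negotiated_speeds(sample))
```

A reported 6.0 Gbps corresponds to SATA 600, 3.0 Gbps to SATA 300 and 1.5 Gbps to SATA-150.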

SDparm appears to be for SCSI drives only. When I run sdparm --f I get this:

Read write error recovery mode page:
  AWRE        1
  ARRE        0
  PER         0
Caching (SBC) mode page:
  WCE         1
  RCD         0
Control mode page:
  SWP         0

It looks to me like sdparm/hdparm are performance tuners, which I don't think I need because I doubt that the problem is performance related. If it were, I would have problems with Windows and other Linux desktop environments as well. I just want to know if the errors were signs that the drive will be failing soon.

I just realized that KDE was the only environment I used that had a paging utility running constantly. Could those errors be caused by a problem with the page files?
