
#1 2006-12-11 21:28:18

kozaki
Member
From: London >. < Paris
Registered: 2005-06-13
Posts: 673
Website

SATA HDD randomly freeze for ~60" on arch32/64 kernels

On arch32 (kernel 2.6.19) and arch64 (kernel 2.6.18), with a SATA II HDD plugged into a SATA II connector, the HDD freezes completely and is then remounted read-only.

A few freezes occurred this month that I first thought were X related.
But if I wait long enough, ~1 minute, everything goes back to normal.
You'll find system specs below.

When this occurs there is _no disk I/O_ at all, so I can do nothing (from switching to the next graphical desktop to logging in, etc.). Torrent UP/DL drops to 0 and disk I/O (such as a dd if=dvd of=image.iso) stops; both resume when the freeze ends.
Most of the time when I get these errors the system recovers after ~2 to 4 minutes of unresponsiveness (no disk I/O).

On arch64 / kernel 2.6.18, the logs are quite readable about this issue:
# cat /var/log/kernel.log |grep frozen

Dec  3 22:59:24 llewellyn ata1.00: exception Emask 0x0 SAct 0x3f SErr 0x0 action 0x2 frozen
Dec  4 02:25:58 llewellyn ata1.00: exception Emask 0x0 SAct 0x1f SErr 0x0 action 0x2 frozen
Dec  4 09:06:56 llewellyn ata1.00: exception Emask 0x0 SAct 0x1ffff SErr 0x0 action 0x2 frozen
Dec  4 09:09:37 llewellyn ata1.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x2 frozen
Dec  4 09:35:24 llewellyn ata1.00: exception Emask 0x0 SAct 0x3ff SErr 0x0 action 0x2 frozen
Dec  5 12:56:13 llewellyn ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
Dec  5 12:56:44 llewellyn ata1.00: exception Emask 0x0 SAct 0x1fff SErr 0x0 action 0x2 frozen
Dec  5 13:09:40 llewellyn ata1.00: exception Emask 0x0 SAct 0x3fffff SErr 0x0 action 0x2 frozen
Dec  5 16:27:12 llewellyn ata1.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x2 frozen
Dec  5 18:12:09 llewellyn ata1.00: exception Emask 0x0 SAct 0x7fff SErr 0x0 action 0x2 frozen
Dec 10 22:32:51 llewellyn ata1.00: exception Emask 0x0 SAct 0x7ff SErr 0x0 action 0x2 frozen
Dec 10 22:33:36 llewellyn ata1.00: exception Emask 0x0 SAct 0x7fff SErr 0x0 action 0x2 frozen
Dec 10 22:34:56 llewellyn ata1.00: exception Emask 0x0 SAct 0xfffffff SErr 0x0 action 0x2 frozen
Dec 10 22:38:57 llewellyn ata1.00: exception Emask 0x0 SAct 0xffff SErr 0x0 action 0x2 frozen
Dec 10 22:58:07 llewellyn ata1.00: exception Emask 0x0 SAct 0x3fc SErr 0x0 action 0x2 frozen
Dec 10 22:58:39 llewellyn ata1.00: exception Emask 0x0 SAct 0x7ff SErr 0x0 action 0x2 frozen
Dec 10 22:59:24 llewellyn ata1.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x2 frozen
Dec 10 23:34:34 llewellyn ata1.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x2 frozen
Dec 10 23:36:41 llewellyn ata1.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x2 frozen
Dec 10 23:37:17 llewellyn ata1.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x2 frozen
Dec 10 23:37:48 llewellyn ata1.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x2 frozen
Dec 10 23:38:18 llewellyn ata1.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x2 frozen
Dec 10 23:38:49 llewellyn ata1.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x2 frozen
Dec 10 23:39:19 llewellyn ata1.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x2 frozen
Dec 10 23:39:50 llewellyn ata1.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x2 frozen
Dec 10 23:40:20 llewellyn ata1.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x2 frozen
Dec 11 21:26:44 llewellyn ata1.00: exception Emask 0x0 SAct 0x7f SErr 0x0 action 0x2 frozen
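
For anyone trying to read the SAct column above: as far as I understand, SAct is the SATA SActive register, a bitmask with one bit per NCQ tag that was still outstanding when the port froze. Here is a minimal Python sketch under that assumption (the function names are my own, not anything from libata):

```python
import re
from typing import List, Optional, Tuple

# Field layout taken from the "exception ... frozen" lines quoted above.
EXCEPTION_RE = re.compile(
    r"exception Emask (0x[0-9a-f]+) SAct (0x[0-9a-f]+) SErr (0x[0-9a-f]+)"
)


def active_tags(sact: int) -> List[int]:
    """Return the tag numbers whose bit is set in an SActive value.

    Assumption: bit N set means NCQ tag N was still in flight.
    """
    return [bit for bit in range(32) if (sact >> bit) & 1]


def summarize(line: str) -> Optional[Tuple[int, int]]:
    """Return (SAct value, number of outstanding tags) for one log line."""
    match = EXCEPTION_RE.search(line)
    if match is None:
        return None
    sact = int(match.group(2), 16)
    return sact, len(active_tags(sact))


if __name__ == "__main__":
    sample = ("Dec 10 23:34:34 llewellyn ata1.00: exception Emask 0x0 "
              "SAct 0x7fffffff SErr 0x0 action 0x2 frozen")
    print(summarize(sample))
```

Read this way, 0x7fffffff means 31 commands (tags 0 to 30) were queued when the freeze hit, while 0x1 means a single command was enough to trigger it.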

So this occurred before as well as after I moved my system's HDD (Hitachi SATA II) to the SATA II connector in "Sata mode".

Research suggests this might be libata related:

Andrew Paprocki wrote:

[PATCH] Added S.M.A.R.T. command decoding to libata error reporting.
    * Date: Mon, 30 Oct 2006 01:09:20 -0500
       Added S.M.A.R.T. command decoding to libata error reporting.

This is useful because if a user program attempts to send an
invalid smart command, the standard reporting only indicates
that cmd 0xb0 (now ATA_CMD_SMART) failed. This code prints out
a readable string indicating which smart cmd was attempted as
encoded in tf.feature per the ATAPI spec.

Example with patch applied:
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata1.00: tag 0 cmd 0xb0 Emask 0x2 stat 0x50 err 0x0 (HSM violation)
ata1.00: smart cmd 0xd2 (enable/disable attribute autosave)

source : http://www.spinics.net/lists/linux-ide/msg05458.html

Here are the system specs
Linux llewellyn 2.6.18-ARCH #1 SMP PREEMPT Fri Dec 1 15:35:16 UTC 2006 x86_64 AMD Athlon(tm) 64 Processor 3200+ AuthenticAMD GNU/Linux
# hwdetect --show-sata
SATA   : libata
# hwdetect --show-scsi
SCSI   : ahci sd_mod
# hwdetect --show-ide
IDE    : ide-cd ide-core ide-disk alim15x3 generic

Any idea where to start or what to look at (logs, modules to load or check, another (custom) kernel to test)?
The HDD also freezes on arch32 with kernel 2.6.19-ARCH, but I haven't seen anything in the logs there (please see below).


Seeded last month: Arch 50 gig, derivatives 1 gig
Desktop @3.3GHz 8 gig RAM, linux-ck
laptop #1 Atom 2 gig RAM, Arch linux stock i686 (6H w/ 6yrs old battery smile) #2: ARM Tegra K1, 4 gig RAM, ChrOS
Atom Z520 2 gig RAM, OMV (Debian 7) kernel 3.16 bpo on SDHC | PGP Key: 0xFF0157D9

Offline

#2 2006-12-12 05:58:03

AndyRTR
Developer
From: Magdeburg/Germany
Registered: 2005-10-07
Posts: 1,641

Re: SATA HDD randomly freeze for ~60" on arch32/64 kernels

Check your hard disk (Drive Fitness Test from hgst.com).
Check if your problem persists with kernel 2.6.19.
Disable SMART support in the BIOS.

Offline

#3 2006-12-13 00:21:55

kozaki
Member
From: London >. < Paris
Registered: 2005-06-13
Posts: 673
Website

Re: SATA HDD randomly freeze for ~60" on arch32/64 kernels

Thank you.
I'll do that ASAP (currently pretty booked for a couple more days).



Offline

#4 2006-12-13 23:29:07

kozaki
Member
From: London >. < Paris
Registered: 2005-06-13
Posts: 673
Website

Re: SATA HDD randomly freeze for ~60" on arch32/64 kernels

The same thing occurs on arch32 with kernel 2.6.19 right after a successful boot: after ~2' the disk is remounted read-only, then no command is accessible and I have to use the magic SysRq keys to reboot.

But there is no "frozen" string in kernel.log!? (unlike on arch64 with kernel 2.6.18-ARCH):

# lspci
00:12.0 IDE interface: ALi Corporation M5229 IDE (rev c7)
00:12.1 IDE interface: ALi Corporation ULi 5289 SATA (rev 10)
03:00.0 SATA controller: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller

# lshwd -cc

IDE interface
  alim15x3        : Acer Laboratories Inc. [ALi]|M5229 IDE
  sata_uli        : ALi Corporation|ALi M5289 Serial ATA / RAID Host Controller
SATA controller
  unknown         : ALi Corporation|ALi M5289 Serial ATA / RAID Host Controller

# lsmod

jfs                   168656  1
vfat                   12672  1
fat                    48432  1 vfat
ext2                   61968  2
ide_disk               14464  2
sata_uli                6660  0
ide_cd                 39584  0
cdrom                  37160  1 ide_cd
ext3                  130832  3
mbcache                 8584  2 ext2,ext3
jbd                    61680  1 ext3
ahci                   16004  7
libata                 93088  2 sata_uli,ahci
generic                 6660  0 [permanent]
alim15x3               11544  0 [permanent]
ide_core              134784  5 usb_storage,ide_disk,ide_cd,generic,alim15x3
sd_mod                 18176  8

mkinitcpio.conf:

MODULES="sd_mod alim15x3 generic uli526x ahci libata jbd ext3"
HOOKS="base keymap

I wanted to compile a custom kernel with abs and then try kernel 2.6.19 on arch64, but every kernel26 I tried to compile (from standard to morph & beyond) quickly stops with this error:

HOSTCC  scripts/basic/fixdep
/bin/sh: scripts/basic/fixdep: Permission denied
make[2]: *** [scripts/basic/fixdep] Error 1
make[1]: *** [scripts_basic] Error 2
make: *** No rule to make target `include/config/auto.conf', needed by `include/config/kernel.release'.  Stop
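
A note on the fixdep error: one common cause of "Permission denied" when make runs a helper it has just built is a build directory sitting on a filesystem mounted with noexec. That is only a guess for this case; nothing above confirms it. A minimal Python sketch to check, assuming the /proc/mounts text format (device, mount point, type, options) and using a hypothetical build path:

```python
import os
from typing import List, Tuple


def mount_options(path: str, mounts_text: str) -> Tuple[str, List[str]]:
    """Return (mount point, options) of the filesystem that holds `path`.

    `mounts_text` is the content of /proc/mounts. The longest mount point
    that is a prefix of `path` wins; later entries override earlier ones.
    """
    path = os.path.abspath(path)
    best = ("", [])  # type: Tuple[str, List[str]]
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        mount_point, options = fields[1], fields[3].split(",")
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best[0]):
            best = (mount_point, options)
    return best


def is_noexec(path: str, mounts_text: str) -> bool:
    """True if the filesystem holding `path` is mounted with noexec."""
    return "noexec" in mount_options(path, mounts_text)[1]


# Usage on a live system (the build path is hypothetical):
#     with open("/proc/mounts") as handle:
#         print(is_noexec("/var/abs/local/kernel26", handle.read()))
```

If this reports noexec, building somewhere else or remounting with exec would be the thing to try; if not, the execute bit on scripts/basic/fixdep itself is worth a look.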

That's a lot for the free time I have this week, so I'll stay with arch64/kernel 2.6.18 until I get more spare time (I haven't found anything relevant on the bbs here or by googling).



Offline

#5 2006-12-16 10:46:09

kozaki
Member
From: London >. < Paris
Registered: 2005-06-13
Posts: 673
Website

Re: SATA HDD randomly freeze for ~60" on arch32/64 kernels

Does anyone understand how this kernel 2.6.19 patch should be read or used by someone running Arch with kernel 2.6.19 and a SATA II HDD connected to a JMicron SATA II controller?
[PATCH] non-libata driver for Jmicron devices

Alan Cox wrote:

Less functional than libata this just uses the merged interface provided for dumb legacy OS's.  This is basically a bridge for people not yet ready to use libata for some reason or another.

Port visibility is entirely dependant on the BIOS setup.

What I have is an ASRock Dual SATA II motherboard with the OSes on a Hitachi 80G SATA II HDD connected to the JMicron connector:
# lspci | grep SATA

03:00.0 SATA controller: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller

# lshwd -cc

SATA controller
  unknown         : Acer Laboratories Inc. [ALi]|M5229 IDE



Offline

#6 2006-12-16 12:04:09

kozaki
Member
From: London >. < Paris
Registered: 2005-06-13
Posts: 673
Website

Re: SATA HDD randomly freeze for ~60" on arch32/64 kernels

AndyRTR wrote:

Check your hard disk (Drive Fitness Test from hgst.com).
Check if your problem persists with kernel 2.6.19.
Disable SMART support in the BIOS.

- I haven't tested HDD fitness yet. I will, but in my experience most HDD failures occur right after purchase or after >3 years. This Hitachi is 1 year old.
- Same freeze with and without SMART (with only the SATA II HDD connected, and with both HDDs).
- Same with the (BIOS) JMicron SATA II mode set to SATA STRONG and to NORMAL.
- I also disabled the light overclock I usually apply to the CPU: same freeze.
- Same with and without the torrent client running (but I was compiling qemu when it occurred, so there was disk I/O at that time).

kernel.log with SMART and the overclock disabled:

Dec 16 12:53:45 llewellyn ata1.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x2 frozen
Dec 16 12:53:45 llewellyn ata1.00: tag 0 cmd 0x61 Emask 0x4 stat 0x40 err 0x0 (timeout)
Dec 16 12:53:45 llewellyn ata1.00: tag 1 cmd 0x61 Emask 0x4 stat 0x40 err 0x0 (timeout)
...
Dec 16 12:53:45 llewellyn ata1.00: tag 29 cmd 0x61 Emask 0x4 stat 0x40 err 0x0 (timeout)
Dec 16 12:53:45 llewellyn ata1.00: tag 30 cmd 0x61 Emask 0x4 stat 0x40 err 0x0 (timeout)
Dec 16 12:53:45 llewellyn ata1: soft resetting port
Dec 16 12:53:45 llewellyn ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Dec 16 12:53:45 llewellyn ata1.00: configured for UDMA/133
Dec 16 12:53:45 llewellyn ata1: EH complete
Dec 16 12:53:45 llewellyn SCSI device sda: 160836480 512-byte hdwr sectors (82348 MB)
Dec 16 12:53:45 llewellyn sda: Write Protect is off
Dec 16 12:53:45 llewellyn sda: Mode Sense: 00 3a 00 00
Dec 16 12:53:45 llewellyn SCSI device sda: drive cache: write back
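
To make the per-tag lines and the link status easier to read, here is a minimal Python sketch. It assumes the standard ATA opcode names (0x61 is WRITE FPDMA QUEUED, i.e. an NCQ write) and the usual SStatus layout (DET in bits 0-3, SPD in bits 4-7, IPM in bits 8-11), with the SStatus value printed in hex. The helper names are my own:

```python
import re

# Only the opcodes that show up in this thread, plus the NCQ read.
ATA_COMMANDS = {
    0x60: "READ FPDMA QUEUED",
    0x61: "WRITE FPDMA QUEUED",
    0xB0: "SMART",
}

# SPD field of SStatus: negotiated link speed.
LINK_SPEEDS = {1: "1.5 Gbps", 2: "3.0 Gbps"}

TAG_RE = re.compile(
    r"tag (\d+) cmd (0x[0-9a-f]+) Emask (0x[0-9a-f]+) "
    r"stat (0x[0-9a-f]+) err (0x[0-9a-f]+) \((.+)\)"
)


def parse_tag_line(line):
    """Return tag number, command name and failure reason for one log line."""
    match = TAG_RE.search(line)
    if match is None:
        return None
    cmd = int(match.group(2), 16)
    return {
        "tag": int(match.group(1)),
        "cmd": ATA_COMMANDS.get(cmd, hex(cmd)),
        "reason": match.group(6),
    }


def decode_sstatus(value):
    """Split an SStatus value into its (DET, SPD, IPM) fields."""
    return value & 0xF, (value >> 4) & 0xF, (value >> 8) & 0xF


if __name__ == "__main__":
    line = ("Dec 16 12:53:45 llewellyn ata1.00: tag 0 cmd 0x61 "
            "Emask 0x4 stat 0x40 err 0x0 (timeout)")
    print(parse_tag_line(line))
    det, spd, ipm = decode_sstatus(0x113)
    print(det, LINK_SPEEDS.get(spd, "unknown"), ipm)
```

Read this way, every queued command that timed out was an NCQ write, and SStatus 113 says the link came back at 1.5 Gbps even though both the drive and the connector are SATA II.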



Offline
