#1 2016-01-26 15:42:29

jcarneal
Member
Registered: 2016-01-25
Posts: 10

Samsung 850 EVO SSD Freezing - Kernel Issues?

Hello everyone,

First time caller, long time listener :)

This is a great forum, and my apologies that I haven't introduced myself yet. I've been a Linux user for 20+ years, but more recently have been using Macs for my desktop. Those days are now behind me, and I'm in the process of converting my "desktop" daily driver to an ASUS laptop with Arch Linux, dwm, etc. Thanks to a lot of the helpful documentation I've found here, I've had very good luck with that until the following very annoying issue with my new SSD reared up in the last week or two.

MACHINE DETAILS:
- ASUS 551JQ w/ 16GB RAM
- Samsung 850 EVO SSD 2TB
- Kernel 4.3.3-3-ARCH
- Everything else recently updated via pacman

PROBLEM: 
Under heavy write loads the machine will freeze temporarily, then remount the root partition as read-only.  It then remains running but becomes unusable thereafter.  Every command I give it after this happens is met with "input/output error" from the commandline.  Eventually I have to hard reset the machine as I cannot even get 'reboot' or 'shutdown' to run.

Read loads don't seem to bother it.  The easiest way to reproduce the error is to run an rsync copying a large directory of files from another local network machine to this machine.  It will lock up as I have described above 100% of the time doing this.

I've tried getting the journal logs off the machine but can't find a way to do so after the fact. I tried scp from a ramdisk, a USB thumb drive, etc. Nothing has worked, but I'm open to other ideas. In lieu of actual logs, I've included some pics from my phone that I took after the most recent issue.

The problem always begins with something like this:

ata5.00:  exception Emask 0x0 SAct 0x7fffffef SErr 0x0 action 0x6
ata5.00:  failed command: WRITE FPDMA QUEUED

More here:

http://tinyurl.com/hq674be
http://tinyurl.com/h8hwayw
http://tinyurl.com/zntyoyl


Things I've tried so far:

- Confirmed via kernel source and dmesg that the drive is on the NCQ TRIM blacklist (it is)
- libata.force=5:noncq 
- pci=nomsi
- libata.force=1.5Gbps
- remount root as rw after the fact (can't, says it's write-protected)
- filling up the remainder of the drive using dd if=/dev/zero to see if it locked on a certain sector (it ran fine)

None of these made any difference and the problem remains but an rsync away.

I'm aware that Samsung SSDs have a checkered past with the Linux kernel, and I'm wondering now if perhaps I'm better off converting to a standard SATA drive until another manufacturer makes an SSD in this size (2TB). That's a big PITA though, so I'm all ears if anyone here has some additional ideas on things to try first.

Thanks in advance!

Jeff

Moderator edit: Converted oversized image to URL link https://wiki.archlinux.org/index.php/Fo … s_and_code

Last edited by jcarneal (2016-01-30 12:45:11)

#2 2016-01-26 21:10:38

R00KIE
Forum Fellow
From: Between a computer and a chair
Registered: 2008-09-14
Posts: 4,734

Re: Samsung 850 EVO SSD Freezing - Kernel Issues?

I would try libata.force=noncq. Don't be picky and disable NCQ only for that drive (1); you are troubleshooting, so it's better to be blunt about it.

I would also check if there are any firmware updates for your SSD, as the problem may have been solved in a newer firmware version. As for getting logs out of the machine, given that your machine boots just fine, you may have luck with a serial port or USB-serial adapter, provided of course that you have access to an extra computer with a serial port.

Regarding the error messages, check here [1] and see what they mean, it might provide you some clues to what's causing the problem.

(1) On my laptop the bus number for disks (at least for IDE disks) changes depending on whether I'm doing a cold boot or a reboot. I'm not saying that is the case with your motherboard, but why wonder when you can disable NCQ globally?

[1] https://ata.wiki.kernel.org/index.php/L … r_messages


R00KIE
Tm90aGluZyB0byBzZWUgaGVyZSwgbW92ZSBhbG9uZy4K

#3 2016-01-27 22:55:09

mich41
Member
Registered: 2012-06-22
Posts: 796

Re: Samsung 850 EVO SSD Freezing - Kernel Issues?

Those timeouts followed by failed reset attempts suggest that the SSD firmware crashes and becomes unresponsive. I second the recommendation to update it.

Also, WRITE FPDMA QUEUED is an NCQ command. If you took those screenshots after booting with noncq, then clearly you did something wrong and NCQ was still enabled. You should be getting "WRITE DMA EXT" with noncq.
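
For anyone wanting to run the same check over a saved log, here is a minimal Python sketch that flags each "failed command" line as NCQ or not. The NCQ_COMMANDS set and the function name are assumptions made for illustration, not an exhaustive list of NCQ commands, and the sample lines are copied from this thread.

```python
import re

# Command names treated as NCQ (queued) transfers.
# Assumption for illustration: only the two FPDMA read/write commands are listed.
NCQ_COMMANDS = {"READ FPDMA QUEUED", "WRITE FPDMA QUEUED"}

# Matches e.g. "ata5.00: failed command: WRITE DMA EXT", with or without a syslog prefix.
FAILED_RE = re.compile(r"(ata\d+(?:\.\d+)?):\s+failed command:\s+(.+?)\s*$")


def classify_failed_commands(lines):
    """Return a (device, command, is_ncq) tuple for each 'failed command' line."""
    results = []
    for line in lines:
        match = FAILED_RE.search(line)
        if match:
            device, command = match.groups()
            results.append((device, command, command in NCQ_COMMANDS))
    return results


if __name__ == "__main__":
    sample = [
        "ata5.00:  failed command: WRITE FPDMA QUEUED",
        "Jan 29 12:23:37 ivey kernel: ata5.00: failed command: WRITE DMA EXT",
    ]
    for device, command, is_ncq in classify_failed_commands(sample):
        print(device, command, "NCQ" if is_ncq else "not NCQ")
```

If every failure after booting with noncq still shows up as NCQ, the parameter most likely never took effect.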

#4 2016-01-27 23:04:29

Mathuin
Member
Registered: 2014-04-11
Posts: 16

Re: Samsung 850 EVO SSD Freezing - Kernel Issues?

Ah fuck. I wanted to buy that SSD. With that and the 840 evo problems, I'm not too keen on buying a Samsung SSD.

#5 2016-01-28 02:42:45

jcarneal
Member
Registered: 2016-01-25
Posts: 10

Re: Samsung 850 EVO SSD Freezing - Kernel Issues?

Thanks for the replies everyone.

So it looks like there may be a newer firmware version out for the drive.  Problem is apparently Samsung has bricked drives in the past with firmware updates, so I'm going to run a backup tonight and try the update tomorrow once that's done.

mich41, I don't dispute what you say, and perhaps I did do something wrong, but I don't see what it could be. I checked /proc/cmdline after all the changes I made above and they registered there. Further, I confirmed with the 4.3.3 kernel source that all the Samsung EVO drives are still blacklisted from NCQ TRIM. If that's correct, then the noncq would be redundant anyway, right? Any ideas on what else to check in this regard?
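
The /proc/cmdline check can be made a little more systematic. Below is a rough Python sketch, assuming the documented libata.force=[ID:]VAL[,[ID:]VAL...] syntax where ID is PORT[.DEVICE]. The function names and the sample command line are hypothetical, and the sketch only inspects the string; it does not prove that the kernel applied the setting to the drive.

```python
def libata_force_values(cmdline):
    """Collect the comma-separated values of every libata.force= token.

    How the kernel treats repeated libata.force= occurrences is not
    modeled here; all of them are simply collected.
    """
    values = []
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "libata.force" and sep:
            values.extend(v for v in value.split(",") if v)
    return values


def ncq_disabled_for(values, port):
    """True if 'noncq' is requested globally or for the given port (5 means ata5)."""
    for value in values:
        ident, sep, setting = value.rpartition(":")
        if setting != "noncq":
            continue
        if not sep:
            # No ID prefix, so the setting is meant for all ports.
            return True
        if ident.split(".")[0] == str(port):
            return True
    return False


if __name__ == "__main__":
    # Hypothetical command line; on a real system, read /proc/cmdline instead.
    sample = "root=/dev/sda2 rw libata.force=5:noncq pci=nomsi"
    values = libata_force_values(sample)
    print(values, ncq_disabled_for(values, 5))
```

Note that a port-specific entry such as 5:noncq only helps if the drive really sits on ata5 at every boot, which is one reason to prefer the global form while troubleshooting.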

I will also try R00KIE's suggestion to noncq all ata via kernel command line.

Thanks again,

Jeff

Last edited by jcarneal (2016-01-28 02:47:26)

#6 2016-01-28 02:49:49

sevendogs
Member
From: Texas
Registered: 2016-01-24
Posts: 201

Re: Samsung 850 EVO SSD Freezing - Kernel Issues?

Interesting - running a Samsung 850 Evo Pro on this Arch box with no issues. Box has only been "live" for a week or so though. I too have heard about bricking issues by updating firmware so I did not do it when I got the drive. Doing nothing special config wise other than I changed "relatime" to "noatime" in fstab based on advice I read about SSDs. This is my first SSD so really have no experience with them.


"Give a man a truth and he will think for a day. Teach a man to reason and he will think for a lifetime"

#7 2016-02-01 05:37:16

jcarneal
Member
Registered: 2016-01-25
Posts: 10

Re: Samsung 850 EVO SSD Freezing - Kernel Issues?

Well, I updated the firmware to the latest version and tried the "libata.force=noncq".  Unfortunately neither fixed the issue.  Or at least, didn't completely cure my problems with this drive under linux.

I did manage to get the latest log off the machine after a crash and have included an abbreviated version below.  Since I'm out of both ideas and time on this, I've replaced the drive with a conventional SATA hdd and have reinstalled my system on it.  So far no issues whatsoever, and surprisingly, the limited benchmarks I've done so far don't indicate any significant degradation in speed, even on read-intensive tasks.

If someone else can figure out what's going on with these drives, my hat's off to you.  For everyone else, I'd consider letting this thread serve as a cautionary tale against this particular model (Samsung 850 EVO 2TB) until a fix is definitively determined.

Jeff

--

Jan 29 12:22:37 ivey kernel: ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jan 29 12:23:37 ivey kernel: ata5.00: failed command: WRITE DMA EXT
Jan 29 12:23:37 ivey kernel: ata5.00: cmd 35/00:b8:80:6a:03/00:2f:2f:00:00/e0 tag 7 dma 6254592 out
                                      res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 29 12:23:37 ivey kernel: ata5.00: status: { DRDY }
Jan 29 12:23:37 ivey kernel: ata5: hard resetting link
Jan 29 12:23:37 ivey kernel: ata5: link is slow to respond, please be patient (ready=0)
Jan 29 12:23:37 ivey kernel: ata5: COMRESET failed (errno=-16)
Jan 29 12:23:37 ivey kernel: ata5: hard resetting link
Jan 29 12:23:37 ivey kernel: ata5: link is slow to respond, please be patient (ready=0)
Jan 29 12:23:37 ivey kernel: ata5: COMRESET failed (errno=-16)
Jan 29 12:23:37 ivey kernel: ata5: hard resetting link
Jan 29 12:23:37 ivey kernel: ata5: link is slow to respond, please be patient (ready=0)
Jan 29 12:23:37 ivey kernel: ata5: COMRESET failed (errno=-16)
Jan 29 12:23:37 ivey kernel: ata5: limiting SATA link speed to 3.0 Gbps
Jan 29 12:23:37 ivey kernel: ata5: hard resetting link
Jan 29 12:23:37 ivey kernel: ata5: COMRESET failed (errno=-16)
Jan 29 12:23:37 ivey kernel: ata5: reset failed, giving up
Jan 29 12:23:37 ivey kernel: ata5.00: disabled
Jan 29 12:23:37 ivey kernel: ata5.00: device reported invalid CHS sector 0
Jan 29 12:23:37 ivey kernel: ata5: EH complete
Jan 29 12:23:37 ivey kernel: sd 4:0:0:0: [sda] tag#9 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Jan 29 12:23:37 ivey kernel: sd 4:0:0:0: [sda] tag#9 CDB: opcode=0x2a 2a 00 2f 03 6a 80 00 2f b8 00
Jan 29 12:23:37 ivey kernel: blk_update_request: I/O error, dev sda, sector 788753024
Jan 29 12:23:37 ivey kernel: EXT4-fs warning (device sda2): ext4_end_bio:329: I/O error -5 writing to inode 20451673 (offset 0 size 6254592 starting block 98594384)
Jan 29 12:23:37 ivey kernel: Buffer I/O error on device sda2, logical block 98462800
Jan 29 12:23:37 ivey kernel: Buffer I/O error on device sda2, logical block 98462801
Jan 29 12:23:37 ivey kernel: Buffer I/O error on device sda2, logical block 98462802
Jan 29 12:23:37 ivey kernel: Buffer I/O error on device sda2, logical block 98462803
Jan 29 12:23:37 ivey kernel: Buffer I/O error on device sda2, logical block 98462804
Jan 29 12:23:37 ivey kernel: Buffer I/O error on device sda2, logical block 98462805
Jan 29 12:23:37 ivey kernel: Buffer I/O error on device sda2, logical block 98462806
Jan 29 12:23:37 ivey kernel: Buffer I/O error on device sda2, logical block 98462807
Jan 29 12:23:37 ivey kernel: Buffer I/O error on device sda2, logical block 98462808
Jan 29 12:23:37 ivey kernel: Buffer I/O error on device sda2, logical block 98462809
Jan 29 12:23:37 ivey kernel: EXT4-fs warning (device sda2): ext4_end_bio:329: I/O error -5 writing to inode 20451673 (offset 0 size 6254592 starting block 98594640)
Jan 29 12:23:37 ivey kernel: EXT4-fs warning (device sda2): ext4_end_bio:329: I/O error -5 writing to inode 20451673 (offset 0 size 6254592 starting block 98594896)
Jan 29 12:23:37 ivey kernel: EXT4-fs warning (device sda2): ext4_end_bio:329: I/O error -5 writing to inode 20451673 (offset 0 size 6254592 starting block 98595152)
Jan 29 12:23:37 ivey kernel: EXT4-fs warning (device sda2): ext4_end_bio:329: I/O error -5 writing to inode 20451673 (offset 0 size 6254592 starting block 98595408)
Jan 29 12:23:37 ivey kernel: EXT4-fs warning (device sda2): ext4_end_bio:329: I/O error -5 writing to inode 20451673 (offset 0 size 6254592 starting block 98595655)
Jan 29 12:23:37 ivey kernel: sd 4:0:0:0: [sda] tag#10 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Jan 29 12:23:37 ivey kernel: sd 4:0:0:0: [sda] tag#10 CDB: opcode=0x28 28 00 a0 d1 11 f8 00 00 08 00
Jan 29 12:23:37 ivey kernel: blk_update_request: I/O error, dev sda, sector 2698056184
Jan 29 12:23:37 ivey kernel: EXT4-fs error (device sda2): ext4_find_entry:1451: inode #84279594: comm rsync: reading directory lblock 0
Jan 29 12:23:37 ivey kernel: sd 4:0:0:0: [sda] tag#11 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Jan 29 12:23:37 ivey kernel: sd 4:0:0:0: [sda] tag#11 CDB: opcode=0x2a 2a 00 74 54 16 10 00 05 40 00
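
The CDB lines in the log above can be decoded to see which request failed. This is a minimal Python sketch, assuming the standard 10-byte READ(10)/WRITE(10) layout (opcode, flags, 4-byte big-endian LBA, group, 2-byte big-endian transfer length, control) and 512-byte logical blocks. The function name and the opcode table are hypothetical and cover only the two opcodes seen in this log.

```python
# Opcodes seen in the log above; anything else is reported as a hex string.
OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}

# Assumption: the drive exposes 512-byte logical blocks.
BLOCK_SIZE = 512


def decode_cdb10(hex_bytes):
    """Decode a 10-byte CDB given as space-separated hex, e.g. '2a 00 ...'.

    Returns (command name, starting LBA, number of blocks).
    """
    raw = bytes.fromhex(hex_bytes)
    if len(raw) != 10:
        raise ValueError("expected a 10-byte CDB")
    name = OPCODES.get(raw[0], hex(raw[0]))
    lba = int.from_bytes(raw[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(raw[7:9], "big")  # bytes 7-8: transfer length
    return name, lba, blocks


if __name__ == "__main__":
    for cdb in ("2a 00 2f 03 6a 80 00 2f b8 00",
                "28 00 a0 d1 11 f8 00 00 08 00"):
        name, lba, blocks = decode_cdb10(cdb)
        print(name, "LBA", lba, "blocks", blocks, "bytes", blocks * BLOCK_SIZE)
```

The first CDB decodes to a WRITE(10) at LBA 788753024 for 12216 blocks, which matches the sector in the blk_update_request line and, at 512 bytes per block, the 6254592-byte transfer in the ata5.00 cmd line. The second decodes to a READ(10) at LBA 2698056184, the sector reported just before the ext4_find_entry error.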

Last edited by jcarneal (2016-02-01 12:12:17)

#8 2016-02-01 09:09:03

R00KIE
Forum Fellow
From: Between a computer and a chair
Registered: 2008-09-14
Posts: 4,734

Re: Samsung 850 EVO SSD Freezing - Kernel Issues?

@jcarneal
Please use code tags [1] when pasting code or text output from programs.

@sevendogs
I can't comment on Samsung specifically, but I've updated my SSD's firmware a couple of times with no problems so far, though I always take a full disk image first just in case. Also, you may want to read about the lazytime mount option.

[1] https://wiki.archlinux.org/index.php/Fo … s_and_code


R00KIE
Tm90aGluZyB0byBzZWUgaGVyZSwgbW92ZSBhbG9uZy4K
