
#1 2016-03-13 14:54:32

piknik
Member
Registered: 2016-03-13
Posts: 8

[SOLVED] Intel Haswell i7 hung tasks

I have a new system, built earlier this year, with the following specs:

- Intel i7-4790
- NVIDIA 970
- 16GB RAM
- 2 SSDs (with f2fs)
- 3 HDDs (a couple are 4+ years old; smartd is monitoring them)
- Latest updates (4.4.3-1 stock kernel)

I don't use the 970 within Arch; I have it assigned to a Windows 10 guest under qemu/libvirtd with UEFI boot, so I assume it has nothing to do with my issue.

My issue is that sometimes, maybe once a week, X stops receiving input entirely. I can still switch TTYs with Ctrl+Alt+F1/F2/F3 and log in there, but after a successful login I don't see the regular Arch message and no prompt appears. After a while I get a screenful of messages like this:

INFO: task xxxx:#### blocked for more than 120 seconds.

Nothing else happens, so I'm forced to restart. Since I run this computer 24/7 as a server, I put in some settings from this site to make it restart automatically.

I've found some related posts about this issue for Haswell specifically, but all that I've seen were reported a couple of years ago. I've installed the microcode updates as indicated, but nothing changed with regard to this issue. I'd like to know exactly why the Linux kernel hangs, but I'm not sure where to find the log for this, if it exists. One important point I should mention is that the last time this occurred was exactly when I tried running the Windows VM in libvirt-manager.

Where would I find the logs for this sort of issue? Would the logs even exist? And is there a fix for my issue? I'd appreciate any help.
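
On a systemd machine, kernel messages from the previous boot can usually be read with `journalctl -k -b -1`, provided the journal is stored persistently and the hang did not prevent it from being written. Below is a minimal Python sketch, assuming that output has been saved as text, for picking out the hung-task warnings. The function name and the sample line are made up for illustration.

```python
import re

# Matches the kernel's hung-task warning, for example:
# "INFO: task kworker/0:1:1234 blocked for more than 120 seconds."
# The task name may itself contain colons, so the greedy match on the
# name leaves only the final ":<digits>" to be read as the PID.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return (task name, pid, seconds) for each hung-task warning in log_text."""
    hits = []
    for line in log_text.splitlines():
        match = HUNG_TASK_RE.search(line)
        if match:
            hits.append(
                (match.group("comm"), int(match.group("pid")), int(match.group("secs")))
            )
    return hits


if __name__ == "__main__":
    # Hypothetical sample line, not taken from the affected machine.
    sample = (
        "Mar 13 10:00:00 host kernel: INFO: task kworker/0:1:1234 "
        "blocked for more than 120 seconds.\n"
    )
    for comm, pid, secs in find_hung_tasks(sample):
        print(f"{comm} (pid {pid}) blocked for more than {secs}s")
```

If the previous boot has no entries at all, the journal is probably volatile, and the messages on the TTY may be the only record of the hang.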

EDIT: I have a laptop with an i7-2720QM, and in the 4 years I've had it nothing like this has ever happened, not even recently.

EDIT: Maybe another important point: this issue has never occurred while I was away from the server; it has always happened while I was using the desktop.

Last edited by piknik (2016-03-20 22:02:11)

Offline

#2 2016-03-13 15:30:00

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,597
Website

Re: [SOLVED] Intel Haswell i7 hung tasks

Does this happen if you switch to the linux-lts kernel (supported), or if you build 4.3.6 yourself?  I have the same CPU and a similar problem with the monitor not waking up from sleep under the 4.4.x series and with 4.5-rc7.  I have another machine (Broadwell) that prints out the same 'blocked for more than 120 seconds' error that you're describing, but only on shutdown.  Again, the solution is to use the 4.3.6 kernel.


CPU-optimized Linux-ck packages @ Repo-ck  • AUR packages  • Zsh and other configs

Offline

#3 2016-03-13 15:37:37

piknik
Member
Registered: 2016-03-13
Posts: 8

Re: [SOLVED] Intel Haswell i7 hung tasks

graysky wrote:

Does this happen if you switch to the linux-lts kernel (supported), or if you build 4.3.6 yourself?  I have the same CPU and a similar problem with the monitor not waking up from sleep under the 4.4.x series and with 4.5-rc7.  I have another machine (Broadwell) that prints out the same 'blocked for more than 120 seconds' error that you're describing, but only on shutdown.  Again, the solution is to use the 4.3.6 kernel.

I'll try using linux-lts. I'll report back within a week on whether the issue persists. Thanks for the suggestion!

Offline

#4 2016-03-17 19:14:23

piknik
Member
Registered: 2016-03-13
Posts: 8

Re: [SOLVED] Intel Haswell i7 hung tasks

Back again. This time a lot of executables became corrupted for no apparent reason, and eventually I couldn't even log in: I got "Login incorrect" with no password prompt. Then I found that /bin/sh segfaulted when I tried to chroot from the install media to fix things. At that point I decided to update my flaky BIOS (it behaved oddly when accessed). I reformatted the SSD partitions from f2fs to ext4, installed linux-lts, and did all the usual setup.

No recurring issues at this point. I'm not sure exactly what caused all these problems.

Offline

#5 2016-03-17 19:16:19

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,597
Website

Re: [SOLVED] Intel Haswell i7 hung tasks

So it's running under linux-lts.  Please try the standard linux package and report back.  Bugs should be reproducible.



Offline

#6 2016-03-17 21:14:55

piknik
Member
Registered: 2016-03-13
Posts: 8

Re: [SOLVED] Intel Haswell i7 hung tasks

Alright, the standard linux package has been re-installed and is in use now. I'll report back when/if there are issues.

Offline

#7 2016-03-18 20:51:23

piknik
Member
Registered: 2016-03-13
Posts: 8

Re: [SOLVED] Intel Haswell i7 hung tasks

I'm experiencing problems again, but this time I think I know where they're coming from.

Using the standard kernel, I sometimes get micro-stutters with my mouse. I exacerbated the problem by viewing .avi files produced by motion in VLC. The AVIs aren't perfect and stutter at times, but I noticed that my mouse stuttered exactly along with the videos. Some of the clips were pretty bad and froze everything for a few seconds. I ran a qemu VM, and while I was navigating the GNOME 3 menu, everything froze completely (no HDD activity, no panic, no TTY switching) for over a minute, which prompted me to restart. This time I restarted with linux-lts.

Under linux-lts, I tried viewing the stuttering AVI files, but when they stuttered my mouse didn't, so I think I've narrowed the problem down to the interaction between the Intel graphics driver/decoder and the kernel. I'm not sure where to start logging this kind of thing to report a bug, so some help would be nice again.

EDIT: After viewing the AVI files some more, the display has now started to micro-stutter as well; it even froze for more than a second and then crashed Firefox as I was writing this edit.

EDIT2: Viewed the log and got out of memory errors, maybe related to intel GPU settings:

Mar 18 16:54:58 PIKNIK-SRV kernel: Purging GPU memory, 167936 bytes freed, 25513984 bytes still pinned.
Mar 18 16:54:58 PIKNIK-SRV kernel: Purging GPU memory, 32768 bytes freed, 25513984 bytes still pinned.
Mar 18 16:54:58 PIKNIK-SRV kernel: vlc invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Mar 18 16:54:58 PIKNIK-SRV kernel: vlc cpuset=/ mems_allowed=0
Mar 18 16:54:58 PIKNIK-SRV kernel: CPU: 0 PID: 2203 Comm: vlc Not tainted 4.1.19-1-lts #1
Mar 18 16:54:58 PIKNIK-SRV kernel: Hardware name: Gigabyte Technology Co., Ltd. Z97X-SLI/Z97X-SLI-CF, BIOS F9 09/18/2015
Mar 18 16:54:58 PIKNIK-SRV kernel:  0000000000000286 000000009e063d93 ffff88025e027968 ffffffff81582692
Mar 18 16:54:58 PIKNIK-SRV kernel:  ffff8800591309f0 ffff88041c66c590 ffff88025e0279c8 ffffffff81581c72
Mar 18 16:54:58 PIKNIK-SRV kernel:  0000000000000000 0000000000000000 0000000000000000 000000009e063d93
Mar 18 16:54:58 PIKNIK-SRV kernel: Call Trace:
Mar 18 16:54:58 PIKNIK-SRV kernel:  [<ffffffff81582692>] dump_stack+0x63/0x81
Mar 18 16:54:58 PIKNIK-SRV kernel:  [<ffffffff81581c72>] dump_header+0x88/0x1d4
Mar 18 16:54:58 PIKNIK-SRV kernel:  [<ffffffff8116529e>] oom_kill_process+0x34e/0x3b0
Mar 18 16:54:58 PIKNIK-SRV kernel:  [<ffffffff81165626>] __out_of_memory+0x326/0x560
Mar 18 16:54:58 PIKNIK-SRV kernel:  [<ffffffff81165a4b>] out_of_memory+0x5b/0x80
Mar 18 16:54:58 PIKNIK-SRV kernel:  [<ffffffff8116b543>] __alloc_pages_nodemask+0x943/0x9f0
Mar 18 16:54:58 PIKNIK-SRV kernel:  [<ffffffff811b1ba1>] alloc_pages_current+0x91/0x110
Mar 18 16:54:58 PIKNIK-SRV kernel:  [<ffffffff81161d07>] __page_cache_alloc+0xa7/0xd0
Mar 18 16:54:58 PIKNIK-SRV kernel:  [<ffffffff81163b2c>] filemap_fault+0x14c/0x410
Mar 18 16:54:58 PIKNIK-SRV kernel:  [<ffffffff8118e31c>] __do_fault+0x4c/0xe0
Mar 18 16:54:58 PIKNIK-SRV kernel:  [<ffffffff811938ad>] handle_mm_fault+0xf2d/0x1890
Mar 18 16:54:58 PIKNIK-SRV kernel:  [<ffffffff8106643e>] __do_page_fault+0x15e/0x480
Mar 18 16:54:58 PIKNIK-SRV kernel:  [<ffffffff81066782>] do_page_fault+0x22/0x30
Mar 18 16:54:58 PIKNIK-SRV kernel:  [<ffffffff815899d8>] page_fault+0x28/0x30
Mar 18 16:54:58 PIKNIK-SRV kernel: Mem-Info:
Mar 18 16:54:58 PIKNIK-SRV kernel: active_anon:357805 inactive_anon:2522661 isolated_anon:0
                                    active_file:428 inactive_file:604 isolated_file:0
                                    unevictable:8 dirty:0 writeback:0 unstable:0
                                    slab_reclaimable:20119 slab_unreclaimable:11252
                                    mapped:15242 shmem:2535683 pagetables:15688 bounce:0
                                    free:20692 free_pcp:783 free_cma:0
Mar 18 16:54:58 PIKNIK-SRV kernel: Node 0 DMA free:15896kB min:12kB low:12kB high:16kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15988kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB 
Mar 18 16:54:58 PIKNIK-SRV kernel: lowmem_reserve[]: 0 2923 15723 15723
Mar 18 16:54:58 PIKNIK-SRV kernel: Node 0 DMA32 free:54064kB min:2972kB low:3712kB high:4456kB active_anon:367280kB inactive_anon:1123348kB active_file:348kB inactive_file:540kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3005192kB managed:2995396kB mlocked:0kB dirty:0kB writeback:0kB mapped:11960kB 
Mar 18 16:54:58 PIKNIK-SRV kernel: lowmem_reserve[]: 0 0 12800 12800
Mar 18 16:54:58 PIKNIK-SRV kernel: Node 0 Normal free:12808kB min:13012kB low:16264kB high:19516kB active_anon:1063940kB inactive_anon:8967296kB active_file:1364kB inactive_file:1876kB unevictable:32kB isolated(anon):0kB isolated(file):0kB present:13367296kB managed:13107804kB mlocked:32kB dirty:0kB writeback:0kB mapp
Mar 18 16:54:58 PIKNIK-SRV kernel: lowmem_reserve[]: 0 0 0 0
Mar 18 16:54:58 PIKNIK-SRV kernel: Node 0 DMA: 2*4kB (U) 2*8kB (U) 2*16kB (U) 1*32kB (U) 3*64kB (U) 2*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15896kB
Mar 18 16:54:58 PIKNIK-SRV kernel: Node 0 DMA32: 2947*4kB (UEM) 1885*8kB (UEM) 817*16kB (UEM) 283*32kB (UEM) 49*64kB (UEM) 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB (R) 0*4096kB = 54180kB
Mar 18 16:54:58 PIKNIK-SRV kernel: Node 0 Normal: 2350*4kB (UEM) 247*8kB (EMR) 7*16kB (E) 1*32kB (R) 1*64kB (R) 1*128kB (R) 1*256kB (R) 1*512kB (R) 0*1024kB 0*2048kB 0*4096kB = 12480kB
Mar 18 16:54:58 PIKNIK-SRV kernel: Node 0 hugepages_total=2060 hugepages_free=2060 hugepages_surp=0 hugepages_size=2048kB
Mar 18 16:54:58 PIKNIK-SRV kernel: 2539193 total pagecache pages
Mar 18 16:54:58 PIKNIK-SRV kernel: 2237 pages in swap cache
Mar 18 16:54:58 PIKNIK-SRV kernel: Swap cache stats: add 2362197, delete 2359960, find 1738/1990
Mar 18 16:54:58 PIKNIK-SRV kernel: Free swap  = 0kB
Mar 18 16:54:58 PIKNIK-SRV kernel: Total swap = 9432788kB
Mar 18 16:54:58 PIKNIK-SRV kernel: 4097119 pages RAM
Mar 18 16:54:58 PIKNIK-SRV kernel: 0 pages HighMem/MovableOnly
Mar 18 16:54:58 PIKNIK-SRV kernel: 67345 pages reserved
Mar 18 16:54:58 PIKNIK-SRV kernel: 0 pages hwpoisoned
Mar 18 16:54:58 PIKNIK-SRV kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
Mar 18 16:54:58 PIKNIK-SRV kernel: [  213]     0   213    15845       99      32       3        0             0 systemd-journal
Mar 18 16:54:58 PIKNIK-SRV kernel: [  244]     0   244     8999      359      20       3        0         -1000 systemd-udevd
Mar 18 16:54:58 PIKNIK-SRV kernel: [  467]    89   467   161455    29728     107       4        0             0 mysqld *****Lots of these*****
Mar 18 16:54:58 PIKNIK-SRV kernel: Out of memory: Kill process 1106 (firefox) score 11 or sacrifice child
Mar 18 16:54:58 PIKNIK-SRV kernel: Killed process 1106 (firefox) total-vm:1448408kB, anon-rss:295552kB, file-rss:0kB
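
The Mem-Info counters above are given in pages, so they are easier to judge once converted. Below is a rough Python sketch, assuming the usual 4 KiB page size on x86_64 (the log itself does not state it). The figures are copied from the log; the helper names are only illustrative.

```python
PAGE_KIB = 4  # assumption: 4 KiB pages, the x86_64 default


def pages_to_gib(pages, page_kib=PAGE_KIB):
    """Convert a page count to GiB."""
    return pages * page_kib / (1024 * 1024)


def kib_to_gib(kib):
    """Convert a kB figure (as printed by the kernel) to GiB."""
    return kib / (1024 * 1024)


# Values taken from the OOM report above.
figures = {
    "total RAM (4097119 pages)": pages_to_gib(4097119),
    "shmem (2535683 pages)": pages_to_gib(2535683),
    "free (20692 pages)": pages_to_gib(20692),
    "hugepages pool (2060 x 2048kB)": kib_to_gib(2060 * 2048),
    "total swap (9432788kB)": kib_to_gib(9432788),
    "free swap (0kB)": kib_to_gib(0),
}

for name, gib in figures.items():
    print(f"{name:>32}: {gib:6.2f} GiB")
```

Under that assumption, shmem comes to roughly 9.7 GiB and the completely unused hugepage pool to about 4 GiB on a machine with about 15.6 GiB of RAM, and swap is fully used. That is at least consistent with the OOM killer firing, but it does not say by itself which process owned the shared memory.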

I'm going to try some of the configuration from the Intel Graphics article.

EDIT3: I've tried every config listed in that article, and none of them solved my problem. To reiterate, the micro-stuttering is very apparent when viewing videos in VLC, and it continues even after closing VLC. I now think it has something to do with memory, since that's what the logs have been showing, and that's where I'm stuck. I tested my memory with memtest86 just before reformatting and it found no problems, so now I don't know.

EDIT4: I've found out that vlc leaks memory when repeating video files. I'm going to file a bug report with them. Hopefully that was the whole problem, though it doesn't explain why my system froze completely while running a VM.
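
One way to check for a leak like this is to sample the process's resident memory over time. Below is a minimal Python sketch, assuming a Linux /proc filesystem. The helper names are made up, and VmRSS may not include GPU buffers held on the process's behalf, so a flat reading does not rule a leak out.

```python
import re
import time


def parse_vmrss_kib(status_text):
    """Extract VmRSS in kB from the text of /proc/<pid>/status, or None if absent."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def sample_rss(pid, samples=5, interval=1.0):
    """Read VmRSS for the given PID several times, pausing between reads."""
    readings = []
    for _ in range(samples):
        with open(f"/proc/{pid}/status") as status_file:
            readings.append(parse_vmrss_kib(status_file.read()))
        time.sleep(interval)
    return readings
```

Calling `sample_rss(pid)` with the player's PID while a clip loops gives a short series of readings; figures that keep growing from one loop to the next would support the leak theory.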

EDIT5: No problems so far. I assume the problem is solved: just don't use vlc.

Last edited by piknik (2016-03-20 22:01:58)

Offline
