Hello,
Today I had a task that hung forever. Here is what happened.
The kernel logged a few messages:
[40802.952619] INFO: task plasma-desktop:13028 blocked for more than 120 seconds.
[40802.952621] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[40802.952623] plasma-desktop D c5207f2c 0 13028 1 0x00000000
[40802.952626] c5207f3c 00000086 00000002 c5207f2c 00000000 00000101 00000024 00000000
[40802.952630] c111167e c5206000 00000000 c5207ecc c11ac11d f5406380 c14c4380 f2c988a0
[40802.952634] f2c98a64 c11ac164 c5207f60 c14c4380 f5406380 f2c988a0 c142df60 bfb60c68
[40802.952638] Call Trace:
[40802.952643] [<c111167e>] ? do_filp_open+0x14e/0x6a0
[40802.952647] [<c11ac11d>] ? __copy_to_user_ll+0x5d/0x70
[40802.952649] [<c11ac164>] ? copy_to_user+0x34/0x50
[40802.952652] [<c131b885>] rwsem_down_failed_common+0x95/0xe0
[40802.952655] [<c131b8e2>] rwsem_down_write_failed+0x12/0x20
[40802.952657] [<c131b94a>] call_rwsem_down_write_failed+0x6/0x8
[40802.952660] [<c131b0f5>] ? down_write+0x15/0x17
[40802.952663] [<c10e12e0>] sys_mmap_pgoff+0xd0/0x1c0
[40802.952665] [<c10037df>] sysenter_do_call+0x12/0x28
[40802.952668] [<c1310000>] ? amd_64_threshold_cpu_callback+0xbc/0x23c
This task eventually finished after some time.
iotop showed me two other tasks that were completely hung:
Total DISK READ: 0.00 B/s | Total DISK WRITE: 0.00 B/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
26 be/7 root 0.00 B/s 0.00 B/s 0.00 % 99.99 % [khugepaged]
13199 be/4 hamelg 0.00 B/s 0.00 B/s 0.00 % 96.27 % convert -de~ar-pluie.gif
$ ps -lp 13199
F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
0 D 2000 13199 13196 0 80 0 - 24709 conges ? 00:00:00 convert
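For anyone who wants to spot such tasks without iotop: the `D` in the `S` column above means uninterruptible sleep, and WCHAN is the kernel function the task is waiting in. Below is a minimal sketch that keeps only the D-state rows from a ps listing. The helper name `filter_dstate` is made up for illustration, and it assumes the process state is in the second column.

```shell
# filter_dstate: hypothetical helper, not part of any tool.
# Reads a ps-style table on stdin and keeps the header line plus every row
# whose second column (process state) starts with D (uninterruptible sleep).
filter_dstate() {
    awk 'NR == 1 || $2 ~ /^D/'
}
```

With procps it could be used as `ps -eo pid,stat,wchan:32,comm | filter_dstate`, which should list the hung tasks together with their wait channel.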
I had to reboot.
This is the first time I have seen this.
My kernel is:
$ yaourt -Q kernel26
core/kernel26 2.6.38.6-2 (base)
$ cat /proc/version
Linux version 2.6.38-ARCH (tobias@T-POWA-LX) (gcc version 4.6.0 20110429 (prerelease) (GCC) ) #1 SMP PREEMPT Fri May 13 07:54:18 UTC 2011
Last edited by hamelg (2011-05-31 21:39:27)
The bug occurred again yesterday.
It looks like this one:
https://lkml.org/lkml/2011/4/20/435
Last edited by hamelg (2011-05-31 21:34:59)
Same behavior with 2.6.39.1.
There seem to be some severe issues with the mm subsystem since 2.6.38:
http://www.mentby.com/Group/linux-kerne … -to-0.html
https://bugzilla.kernel.org/show_bug.cgi?id=35512
Here is a typical stack trace of a hung task:
[<c10e47b7>] congestion_wait+0x57/0xf0
[<c1064030>] ? abort_exclusive_wait+0x80/0x80
[<c10ff6c1>] compact_zone+0x731/0x760
[<c10ff76e>] compact_zone_order+0x7e/0xa0
[<c10ff835>] try_to_compact_pages+0xa5/0xe0
[<c10d32c3>] __alloc_pages_direct_compact+0x83/0x170
[<c10d3791>] __alloc_pages_nodemask+0x3e1/0x7b0
[<c110d128>] ? __mem_cgroup_commit_charge+0x78/0x110
[<c110a56f>] do_huge_pmd_anonymous_page+0x10f/0x2f0
[<c101ea7b>] ? lapic_next_event+0x1b/0x20
[<c10ec035>] handle_mm_fault+0x175/0x210
[<c1028a30>] ? vmalloc_sync_all+0x120/0x120
[<c1028b43>] do_page_fault+0x113/0x420
[<c10687e1>] ? hrtimer_interrupt+0x141/0x260
[<c104cedc>] ? irq_exit+0x3c/0x90
[<c134c84b>] ? smp_apic_timer_interrupt+0x5b/0x8a
[<c1028a30>] ? vmalloc_sync_all+0x120/0x120
[<c134be93>] error_code+0x67/0x6c
A possible workaround is to disable transparent hugepages, which also stops the khugepaged kernel thread:
echo never >/sys/kernel/mm/transparent_hugepage/enabled
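To check whether the setting took effect, read the same `enabled` file back: the kernel prints all possible values and puts the active one in square brackets, for example `always madvise [never]`. Below is a small sketch that extracts the bracketed value. The helper name `thp_active` is made up for illustration.

```shell
# thp_active: hypothetical helper, not part of any tool.
# Reads a line such as "[always] madvise never" on stdin and prints only
# the value inside the square brackets, i.e. the currently active mode.
thp_active() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}
```

Usage would be `thp_active < /sys/kernel/mm/transparent_hugepage/enabled`. Note that the write has to be done as root and, as far as I know, does not survive a reboot, so it has to be repeated at boot (or the `transparent_hugepage=never` kernel parameter can be used instead).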
Am I the only one hitting this bug?
I am waiting for the submitted patches to reach mainline...
I have the same problem, but I don't know enough about the kernel to solve it.