Recently, almost every program has been failing intermittently with "Illegal instruction (core dumped)" or some variant thereof. For some periods of use the system is unusable, and the crashes leave program artifacts all over the place.
coredumpctl list is full of such entries:
Fri 2014-11-14 14:00:02 CET 18640 1001 1001 4 * /usr/bin/dbus-send
Fri 2014-11-14 14:05:16 CET 19623 1000 1000 4 * /usr/bin/geary
Fri 2014-11-14 14:05:33 CET 19729 1000 1000 4 * /usr/bin/evince
Fri 2014-11-14 14:13:26 CET 21237 1000 1000 4 * /usr/bin/evince
Fri 2014-11-14 14:48:56 CET 28441 1000 1000 4 * /usr/bin/make
Fri 2014-11-14 14:52:14 CET 29146 1000 1000 4 * /usr/bin/make
Fri 2014-11-14 14:53:03 CET 29400 1001 1001 4 * /usr/lib/chromium/chromium
Fri 2014-11-14 15:04:36 CET 31770 1001 1001 4 * /usr/lib/chromium/chromium
Fri 2014-11-14 15:09:19 CET 628 0 0 4 * /usr/bin/pacman
Fri 2014-11-14 15:15:43 CET 1943 1000 1000 4 * /usr/lib/git-core/git
Fri 2014-11-14 15:15:46 CET 1935 1000 1000 4 * /usr/lib/gnupg/scdaemon
Checking out the coredumps shows the illegal instructions seem to be in libpthread (which would explain why they're intermittent):
Core was generated by `/usr/lib/chromium/chromium --type=gpu-process --channel=5684.0.906543699 --user'.
Program terminated with signal SIGILL, Illegal instruction.
#0 0x00007f2a20de1b63 in __lll_lock_elision () from /usr/lib/libpthread.so.0
but in various places:
Core was generated by `gdb /usr/bin/gdb /var/tmp/coredump-bwRUQZ'.
Program terminated with signal SIGILL, Illegal instruction.
#0 0x00007fb4135f5aea in pthread_rwlock_rdlock () from /usr/lib/libpthread.so.0
Any suggestions on what to do next? The CPU is a quad-core Haswell i5-4670T.
Offline
Is your system up to date?
Do you have custom packages (e.g. glibc)? Did you modify makepkg.conf and add your own CFLAGS there?
Try reinstalling glibc from the official repos.
Last edited by anatolik (2014-11-14 16:14:28)
Offline
Check the information about microcode updates on the front page.
Offline
Yes, everything is up to date. I have the Intel microcode update installed in gummiboot:
title Arch Linux
linux /vmlinuz-linux
initrd /intel-ucode.img
initrd /initramfs-linux.img
options root=/dev/sda2 rw
I found a seemingly related issue on the Fedora forum.
(Edit: path corrected)
Last edited by ohwg (2014-11-14 16:45:02)
Offline
Welcome to Arch Linux.
You have to tell us a bit more about your system. For example, what processor(s) does it use?
Did you install 32 or 64 bit Arch Linux?
What kernel are you using? (Post the output of uname -a.)
Did you see the news about the Intel processor microcode updates, and that the bootloader needs to be massaged by hand? If you did, did the problems start afterwards? If you didn't, is that about when the problems started?
Edit: We were posting at the same time. What journal entries exist that relate to microcode?
Edit 2: And I missed that you had told us what processor it is. Sorry.
Last edited by ewaller (2014-11-14 16:21:28)
Offline
Post the output of
dmesg | grep micro
Offline
Thanks, but I've been on Arch since around 2007, and plenty of Linux before then. Haven't needed the forum until now.
I have these logs in dmesg:
[ 0.000000] CPU0 microcode updated early to revision 0x1c, date = 2014-07-03
[ 0.090370] CPU1 microcode updated early to revision 0x1c, date = 2014-07-03
[ 0.111211] CPU2 microcode updated early to revision 0x1c, date = 2014-07-03
[ 0.132040] CPU3 microcode updated early to revision 0x1c, date = 2014-07-03
[ 0.319374] microcode: CPU0 sig=0x306c3, pf=0x2, revision=0x1c
[ 0.319380] microcode: CPU1 sig=0x306c3, pf=0x2, revision=0x1c
[ 0.319386] microcode: CPU2 sig=0x306c3, pf=0x2, revision=0x1c
[ 0.319389] microcode: CPU3 sig=0x306c3, pf=0x2, revision=0x1c
[ 0.319418] microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
Offline
Hmm, that output shows all cores get updated, so this shouldn't happen. Does stuff crash right after boot, or only after suspend? If it's after suspend, it's a known bug: during resume, CPU0 does not get updated while the other cores do. More info here: https://bugs.archlinux.org/task/42689
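A quick way to check for that mismatch after a resume is to compare the microcode revision each core reports. The sketch below is only an illustration, not part of the bug report: it assumes an x86 kernel whose /proc/cpuinfo exposes `processor` and `microcode` fields, and the function names are made up for this example.

```python
def microcode_by_cpu(cpuinfo_text):
    """Map logical CPU number -> microcode revision string.

    Expects text in the /proc/cpuinfo format, where each CPU block
    contains a 'processor : N' line and a 'microcode : 0x..' line.
    """
    revisions = {}
    current = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between CPU blocks
        key, value = key.strip(), value.strip()
        if key == "processor":
            current = int(value)
        elif key == "microcode" and current is not None:
            revisions[current] = value
    return revisions


def mismatched(revisions):
    """Return True if the CPUs do not all report the same revision."""
    return len(set(revisions.values())) > 1


def report(path="/proc/cpuinfo"):
    """Print the revision per CPU and warn if they differ."""
    with open(path) as f:
        revisions = microcode_by_cpu(f.read())
    for cpu, rev in sorted(revisions.items()):
        print(f"CPU{cpu}: {rev}")
    if mismatched(revisions):
        print("Microcode revisions differ between cores")
```

Calling `report()` once after a fresh boot and once after a suspend/resume cycle should show whether one core fell back to an older revision.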
Offline
@Gusar thanks, that seems to be it exactly. I use suspend a lot and hadn't considered that could be a factor. I guess no more suspend for me until the next kernel version...
Offline
Check whether the BIOS has an option to disable TSX altogether.
Alternatively, you can patch the libpthread binary.
Run objdump -d /usr/lib/libpthread.so.0 and find elision_init. If it looks like this...
0000000000011ab0 <elision_init>:
11ab0: 48 83 ec 08 sub $0x8,%rsp
11ab4: e8 d7 ff ff ff callq 11a90 <__get_cpu_features>
11ab9: 48 8b 0d c8 54 20 00 mov 0x2054c8(%rip),%rcx # 216f88 <_DYNAMIC+0x258>
11ac0: 8b 40 1c mov 0x1c(%rax),%eax
11ac3: 31 d2 xor %edx,%edx
11ac5: 8b 09 mov (%rcx),%ecx
11ac7: 25 00 08 00 00 and $0x800,%eax
11acc: 0f 95 c2 setne %dl
11acf: 89 15 b3 99 20 00 mov %edx,0x2099b3(%rip) # 21b488 <__elision_available>
11ad5: 85 c9 test %ecx,%ecx
11ad7: b9 00 00 00 00 mov $0x0,%ecx
11adc: 0f 45 d1 cmovne %ecx,%edx
11adf: 85 c0 test %eax,%eax
11ae1: 89 15 9d 99 20 00 mov %edx,0x20999d(%rip) # 21b484 <__pthread_force_elision>
11ae7: 75 0a jne 11af3 <elision_init+0x43>
11ae9: c7 05 c5 57 20 00 00 movl $0x0,0x2057c5(%rip) # 2172b8 <__elision_aconf+0x8>
11af0: 00 00 00
11af3: 48 83 c4 08 add $0x8,%rsp
11af7: c3 retq
11af8: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
11aff: 00
then open it in a hex editor, go to byte 11acc and change 0f 95 c2 to 90 90 90. That should be enough. The modified binary doesn't crash on my AMD box, but make a backup just in case.
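If you'd rather not do the edit by hand, the same byte swap can be scripted. This is a rough sketch under a few assumptions: the address objdump prints is also the file offset (check this for your build), the offset and bytes match the listing above (they will differ for other glibc builds), and the helper names and output path are hypothetical. It refuses to write anything unless the expected bytes are found, and it writes to a new file so the original stays as the backup.

```python
def patch_bytes(data, offset, expected, replacement):
    """Return a copy of data with `expected` at `offset` swapped for `replacement`.

    Raises ValueError if the bytes at `offset` are not what we expect,
    so a different libpthread build is never patched blindly.
    """
    if len(expected) != len(replacement):
        raise ValueError("replacement must be the same length as the original bytes")
    end = offset + len(expected)
    found = data[offset:end]
    if found != expected:
        raise ValueError(f"unexpected bytes at {offset:#x}: {found.hex()}")
    return data[:offset] + replacement + data[end:]


def patch_file(src, dst, offset, expected, replacement):
    """Read src, patch it in memory, and write the result to dst (src is untouched)."""
    with open(src, "rb") as f:
        data = f.read()
    with open(dst, "wb") as f:
        f.write(patch_bytes(data, offset, expected, replacement))


# Example call with the values from the listing above (setne %dl -> three NOPs).
# The destination path is a placeholder; nothing is run automatically here.
# patch_file("/usr/lib/libpthread.so.0", "/tmp/libpthread-patched.so.0",
#            0x11ACC, bytes.fromhex("0f95c2"), bytes.fromhex("909090"))
```

Writing to a separate file first also lets you run objdump on the patched copy and compare it with the original before swapping it in.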
Offline
The linux-3.17.3-1 package currently in [testing] contains a fix to properly update the microcode on resume, so no need to hack binaries.
Offline