
#1 2015-12-01 00:32:05

letni69
Member
Registered: 2015-09-21
Posts: 17

Expr Illegal Instruction (gmpn_mul_1, mulx)

Hey :),

So what I have been trying to do is simply compile the Linux kernel, as it didn't look any harder than typing a command into bash.
The long compilation ended with an error like this (sorry, my box doesn't have internet yet, lmao; actually it had, but that's another story lol):

kernel/sys.c:1009:28: error: expected expression before ‘>>’ token
v = ((LINUX_VERSION_CODE >> 8) & 0xff) + 60;

I traced the issue to the Makefile, which generates the LINUX_VERSION_CODE macro, but the macro's value was empty O.o.
After banging my head several times I finally found the cause of this conspiracy: "expr", which the Makefile uses to evaluate the version expression, died with SIGILL, so the macro came out empty.
So I checked coredumpctl for recent occurrences and, bingo, there were tons of them. Then I ran gdb to find the instruction that failed:

Dump of assembler code for function __gmpn_mul_1_coreibwl:
   0x00007f380b666b40 <+0>:	mov    %rcx,%r10
   0x00007f380b666b43 <+3>:	mov    %rdx,%rcx
   0x00007f380b666b46 <+6>:	mov    %edx,%r8d
   0x00007f380b666b49 <+9>:	shr    $0x3,%rcx
   0x00007f380b666b4d <+13>:	and    $0x7,%r8d
   0x00007f380b666b51 <+17>:	mov    %r10,%rdx
   0x00007f380b666b54 <+20>:	lea    0x25694d(%rip),%r10        # 0x7f380b8bd4a8
   0x00007f380b666b5b <+27>:	movslq (%r10,%r8,4),%r8
   0x00007f380b666b5f <+31>:	lea    (%r8,%r10,1),%r10
   0x00007f380b666b63 <+35>:	jmpq   *%r10
   0x00007f380b666b66 <+38>:	mulx   (%rsi),%r10,%r8
   0x00007f380b666b6b <+43>:	lea    0x38(%rsi),%rsi
   0x00007f380b666b6f <+47>:	lea    -0x8(%rdi),%rdi
   0x00007f380b666b73 <+51>:	jmpq   0x7f380b666c17 <__gmpn_mul_1_coreibwl+215>
   0x00007f380b666b78 <+56>:	mulx   (%rsi),%r9,%rax
   0x00007f380b666b7d <+61>:	lea    0x10(%rsi),%rsi
   0x00007f380b666b81 <+65>:	lea    0x10(%rdi),%rdi
   0x00007f380b666b85 <+69>:	inc    %rcx
   0x00007f380b666b88 <+72>:	jmpq   0x7f380b666c5c <__gmpn_mul_1_coreibwl+284>
   0x00007f380b666b8d <+77>:	mulx   (%rsi),%r10,%r8
   0x00007f380b666b92 <+82>:	lea    0x18(%rsi),%rsi
   0x00007f380b666b96 <+86>:	lea    0x18(%rdi),%rdi
   0x00007f380b666b9a <+90>:	inc    %rcx
   0x00007f380b666b9d <+93>:	jmpq   0x7f380b666c4f <__gmpn_mul_1_coreibwl+271>
   0x00007f380b666ba2 <+98>:	mulx   (%rsi),%r9,%rax
   0x00007f380b666ba7 <+103>:	lea    0x20(%rsi),%rsi
   0x00007f380b666bab <+107>:	lea    0x20(%rdi),%rdi
   0x00007f380b666baf <+111>:	inc    %rcx
   0x00007f380b666bb2 <+114>:	jmpq   0x7f380b666c42 <__gmpn_mul_1_coreibwl+258>
   0x00007f380b666bb7 <+119>:	mulx   (%rsi),%r10,%r8
   0x00007f380b666bbc <+124>:	lea    0x28(%rsi),%rsi
   0x00007f380b666bc0 <+128>:	lea    0x28(%rdi),%rdi
   0x00007f380b666bc4 <+132>:	inc    %rcx
   0x00007f380b666bc7 <+135>:	jmp    0x7f380b666c35 <__gmpn_mul_1_coreibwl+245>
   0x00007f380b666bc9 <+137>:	mulx   (%rsi),%r9,%rax
   0x00007f380b666bce <+142>:	lea    0x30(%rsi),%rsi
   0x00007f380b666bd2 <+146>:	lea    0x30(%rdi),%rdi
   0x00007f380b666bd6 <+150>:	inc    %rcx
   0x00007f380b666bd9 <+153>:	jmp    0x7f380b666c28 <__gmpn_mul_1_coreibwl+232>
=> 0x00007f380b666bdb <+155>:	mulx   (%rsi),%r9,%rax
   0x00007f380b666be0 <+160>:	test   %rcx,%rcx
   0x00007f380b666be3 <+163>:	jne    0x7f380b666c07 <__gmpn_mul_1_coreibwl+199>
   0x00007f380b666be5 <+165>:	mov    %r9,(%rdi)
   0x00007f380b666be8 <+168>:	retq   
   0x00007f380b666be9 <+169>:	mulx   (%rsi),%r10,%r8
   0x00007f380b666bee <+174>:	lea    0x8(%rsi),%rsi

/proc/cpuinfo:

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 94
model name	: Intel(R) Pentium(R) CPU G4400 @ 3.30GHz
stepping	: 3
microcode	: 0x39
cpu MHz		: 759.000
cache size	: 3072 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave rdrand lahf_lm abm 3dnowprefetch arat epb pln pts dtherm hwp hwp_noitfy hwp_act_window hwp_epp intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust erms invpcid rdseed smap clflushopt xsaveopt xsavec xgetbv1
bugs		:
bogomips	: 6623.95
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 94
model name	: Intel(R) Pentium(R) CPU G4400 @ 3.30GHz
stepping	: 3
cpu MHz		: 759.000
cache size	: 3072 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
apicid		: 2
initial apicid	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave rdrand lahf_lm abm 3dnowprefetch arat epb pln pts dtherm hwp hwp_noitfy hwp_act_window hwp_epp intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 bmi2 erms invpcid rdseed smap clflushopt xsaveopt xsavec xgetbv1
bugs		:
bogomips	: 6623.95
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

Any idea what is going on here?

Notes:
I have already installed the intel-ucode package with no success.
The Makefile expression which failed:

$(shell expr $(VERSION) \* 65536 + 0$(PATCHLEVEL) \* 256 + 0$(SUBLEVEL))
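
The same value can also be computed by the shell's built-in arithmetic, which never calls expr. A minimal sketch, assuming a POSIX sh; strip_zeros and version_code are made-up helper names (not part of the kernel Makefile), and the 4 2 5 arguments are placeholder version numbers. Leading zeros are stripped first because shell arithmetic would otherwise treat something like 08 as invalid octal:

```sh
#!/bin/sh
# Hypothetical helpers, not part of the kernel Makefile.

# Drop leading zeros so that e.g. "08" is not parsed as (invalid) octal.
# An empty or all-zero argument yields 0.
strip_zeros() {
    n=$1
    while [ "${n#0}" != "$n" ]; do
        n=${n#0}
    done
    echo "${n:-0}"
}

# VERSION * 65536 + PATCHLEVEL * 256 + SUBLEVEL, computed by the shell itself.
version_code() {
    v=$(strip_zeros "$1")
    p=$(strip_zeros "$2")
    s=$(strip_zeros "$3")
    echo $(( v * 65536 + p * 256 + s ))
}

version_code 4 2 5   # prints 262661
```

Inside a Makefile the dollar signs would have to be doubled, as in $(shell echo $$((...))).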

Last edited by letni69 (2015-12-01 00:32:36)


#2 2015-12-01 00:49:52

ewaller
Administrator
From: Pasadena, CA
Registered: 2009-07-13
Posts: 20,257

Re: Expr Illegal Instruction (gmpn_mul_1, mulx)

How are you doing this?  Specifically.




#3 2015-12-01 01:02:19

letni69
Member
Registered: 2015-09-21
Posts: 17

Re: Expr Illegal Instruction (gmpn_mul_1, mulx)

ewaller wrote:

How are you doing this?  Specifically.

git clone https://aur.archlinux.org/linux-ck.git/
cd ./linux-ck
makepkg -sr

The build fails when it reaches a file that uses the Linux version macro. I traced the issue back to the $(shell expr ...) command, ran that command manually, and bash reported "Illegal instruction", so I looked into the core dump.
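
For anyone who wants to reproduce this: a shell reports a process killed by a signal as exit status 128 plus the signal number, and SIGILL is signal 4 on Linux, so 132 means illegal instruction. A rough sketch, where died_of_sigill is a made-up helper name and the numbers passed to expr are placeholder version values:

```sh
#!/bin/sh
# Hypothetical helper: run a command and report whether it died from SIGILL.
died_of_sigill() {
    "$@" >/dev/null 2>&1
    status=$?
    # Shells report 128 + signal number; SIGILL is signal 4 on Linux.
    [ "$status" -eq 132 ]
}

# Same shape as the Makefile expression, with placeholder version numbers.
if died_of_sigill expr 4 \* 65536 + 02 \* 256 + 05; then
    echo "expr was killed by SIGILL"
else
    echo "expr ran without SIGILL"
fi
```

On a machine where expr works, this takes the second branch.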

Last edited by letni69 (2015-12-01 01:02:34)


#4 2015-12-01 08:47:58

mich41
Member
Registered: 2012-06-22
Posts: 796

Re: Expr Illegal Instruction (gmpn_mul_1, mulx)

Well, mulx is a relatively new instruction (Haswell) so it seems that gmplib has buggy CPU detection logic and tries to use mulx on some machines which don't understand it. Report this to GMP, they should know what to do.
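
mulx is part of the BMI2 extension, so one quick check is whether the kernel lists bmi2 in the flags line of each logical CPU. A minimal sketch, assuming a Linux /proc/cpuinfo in the format posted in #1; has_bmi2 is a hypothetical helper name:

```sh
#!/bin/sh
# Hypothetical helper: for every logical CPU, print whether its "flags"
# line contains bmi2. Reads /proc/cpuinfo by default, or the file in $1.
has_bmi2() {
    awk -F: '
        /^processor/ { cpu = $2; gsub(/[ \t]/, "", cpu) }
        /^flags/     { print "cpu" cpu ": " (($2 ~ /(^| )bmi2( |$)/) ? "bmi2" : "no bmi2") }
    ' "${1:-/proc/cpuinfo}"
}

# Only meaningful on Linux; skip quietly elsewhere.
if [ -r /proc/cpuinfo ]; then
    has_bmi2
fi
```

Run against the output posted in #1, this would report "no bmi2" for processor 0 and "bmi2" for processor 1, since only the second flags line contains it.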

FWIW, my AMD box seems to be doing fine with the latest gmp-6.1.0-1.

Last edited by mich41 (2015-12-01 08:48:15)


#5 2015-12-01 17:29:24

letni69
Member
Registered: 2015-09-21
Posts: 17

Re: Expr Illegal Instruction (gmpn_mul_1, mulx)

mich41 wrote:

Well, mulx is a relatively new instruction (Haswell) so it seems that gmplib has buggy CPU detection logic and tries to use mulx on some machines which don't understand it. Report this to GMP, they should know what to do.

FWIW, my AMD box seems to be doing fine with the latest gmp-6.1.0-1.

Well, the Pentium G4400 is a Skylake one and it doesn't support that? O.o
I modified the version macro to something like $(shell echo $$((<arithmetic here>))), basically avoiding expr, and it worked as expected: the kernel began to build. Unfortunately, the same issue then occurred again!
Yep, this time gcc (cc1) failed with something like:
gcc: internal compiler error
My first guess was that gcc also uses the gmp lib with the same mulx issue, so I traced it with gdb (with set follow-fork-mode child):

Starting program: /usr/bin/bash ../../test.sh
[New process 11534]
process 11534 is executing new program: /usr/bin/gcc
[New process 11536]
process 11536 is executing new program: /usr/lib/gcc/x86_64-unknown-linux-gnu/5.2.0/cc1

Program received signal SIGILL, Illegal instruction.
[Switching to process 11536]
0x00007ffff7708009 in __gmpn_sqr_basecase_coreibwl () from /usr/lib/libgmp.so.10
Dump of assembler code for function __gmpn_sqr_basecase_coreibwl:
   0x00007ffff7708000 <+0>:	cmp    $0x2,%rdx
   0x00007ffff7708004 <+4>:	jae    0x7ffff7708016 <__gmpn_sqr_basecase_coreibwl+22>
   0x00007ffff7708006 <+6>:	mov    (%rsi),%rdx
=> 0x00007ffff7708009 <+9>:	mulx   %rdx,%rax,%rdx
   0x00007ffff770800e <+14>:	mov    %rax,(%rdi)
   0x00007ffff7708011 <+17>:	mov    %rdx,0x8(%rdi)
   0x00007ffff7708015 <+21>:	retq   
   0x00007ffff7708016 <+22>:	jne    0x7ffff7708055 <__gmpn_sqr_basecase_coreibwl+85>
   0x00007ffff7708018 <+24>:	mov    (%rsi),%rdx
   0x00007ffff770801b <+27>:	mov    0x8(%rsi),%rcx
   0x00007ffff770801f <+31>:	mulx   %rcx,%r9,%r10
   0x00007ffff7708024 <+36>:	mulx   %rdx,%rax,%r8
   0x00007ffff7708029 <+41>:	mov    %rcx,%rdx
   0x00007ffff770802c <+44>:	mulx   %rdx,%r11,%rdx
   0x00007ffff7708031 <+49>:	add    %r9,%r9
   0x00007ffff7708034 <+52>:	adc    %r10,%r10
   0x00007ffff7708037 <+55>:	adc    $0x0,%rdx
   0x00007ffff770803b <+59>:	add    %r9,%r8
   0x00007ffff770803e <+62>:	adc    %r11,%r10
   0x00007ffff7708041 <+65>:	adc    $0x0,%rdx
   0x00007ffff7708045 <+69>:	mov    %rax,(%rdi)
   0x00007ffff7708048 <+72>:	mov    %r8,0x8(%rdi)
   0x00007ffff770804c <+76>:	mov    %r10,0x10(%rdi)
   0x00007ffff7708050 <+80>:	mov    %rdx,0x18(%rdi)
   0x00007ffff7708054 <+84>:	retq   
   0x00007ffff7708055 <+85>:	cmp    $0x4,%rdx
   0x00007ffff7708059 <+89>:	jae    0x7ffff77080fa <__gmpn_sqr_basecase_coreibwl+250>
   0x00007ffff770805f <+95>:	push   %rbx
   0x00007ffff7708060 <+96>:	mov    (%rsi),%rdx
   0x00007ffff7708063 <+99>:	mulx   0x8(%rsi),%r10,%r11
   0x00007ffff7708069 <+105>:	mulx   0x10(%rsi),%r8,%r9
   0x00007ffff770806f <+111>:	add    %r11,%r8
   0x00007ffff7708072 <+114>:	mov    0x8(%rsi),%rdx
   0x00007ffff7708076 <+118>:	mulx   0x10(%rsi),%rax,%r11
   0x00007ffff770807c <+124>:	adc    %rax,%r9
   0x00007ffff770807f <+127>:	adc    $0x0,%r11
   0x00007ffff7708083 <+131>:	test   %ebx,%ebx
   0x00007ffff7708085 <+133>:	mov    (%rsi),%rdx
   0x00007ffff7708088 <+136>:	mulx   %rdx,%rbx,%rcx
   0x00007ffff770808d <+141>:	mov    %rbx,(%rdi)
   0x00007ffff7708090 <+144>:	mov    0x8(%rsi),%rdx
   0x00007ffff7708094 <+148>:	mulx   %rdx,%rax,%rbx
   0x00007ffff7708099 <+153>:	mov    0x10(%rsi),%rdx
   0x00007ffff770809d <+157>:	mulx   %rdx,%rsi,%rdx
   0x00007ffff77080a2 <+162>:	adcx   %r10,%r10
   0x00007ffff77080a8 <+168>:	adcx   %r8,%r8
Quit
gcc -Wp,-MD,drivers/gpu/drm/i2c/.ch7006_mode.o.d -v -nostdinc -isystem /usr/lib/gcc/x86_64-unknown-linux-gnu/5.2.0/include -I./arch/x86/include -Iarch/x86/include/generated/uapi -Iarch/x86/include/generated  -Iinclude -I./arch/x86/include/uapi -Iarch/x86/include/generated/uapi -I./include/uapi -Iinclude/generated/uapi -include ./include/linux/kconfig.h -D__KERNEL__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Wno-format-security -std=gnu89 -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -m64 -falign-jumps=1 -falign-loops=1 -mno-80387 -mno-fp-ret-in-387 -mpreferred-stack-boundary=3 -mskip-rax-setup -mtune=generic -mno-red-zone -mcmodel=kernel -funit-at-a-time -maccumulate-outgoing-args -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_SSSE3=1 -DCONFIG_AS_CRC32=1 -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -pipe -Wno-sign-compare -fno-asynchronous-unwind-tables -fno-delete-null-pointer-checks -O2 --param=allow-store-data-races=0 -Wframe-larger-than=2048 -fstack-protector-strong -Wno-unused-but-set-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls -fno-var-tracking-assignments -pg -mfentry -DCC_USING_FENTRY -Wdeclaration-after-statement -Wno-pointer-sign -fno-strict-overflow -fconserve-stack -Werror=implicit-int -Werror=strict-prototypes -Werror=date-time -DCC_HAVE_ASM_GOTO -Iinclude/drm  -DMODULE  -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ch7006_mode)"  -D"KBUILD_MODNAME=KBUILD_STR(ch7006)" -c -o drivers/gpu/drm/i2c/.tmp_ch7006_mode.o drivers/gpu/drm/i2c/ch7006_mode.c

Can anyone give me a hint where I should report this issue ?

