
#1 2019-10-17 20:27:00

rbiswas143
Member
Registered: 2018-08-17
Posts: 6

[SOLVED] System crash: Kernel panic invalid opcode 0000 PREEMPT SMP PTI

Context (maybe relevant)

  • I have been using Arch on my laptop for 2 years without any major issues

  • The laptop recently started failing to boot at times, crashed frequently, and the charging light sometimes did not come on

  • Technicians said that the power IC had shorted, and they replaced/repaired it

  • That problem went away, but Arch did not boot and the laptop always entered the BIOS menu

  • I used a bootable USB to reinstall Arch. However, I found that my partitions and data were intact, so I reused the same partitioning without formatting

  • I reinstalled GRUB and got my system back to exactly how it was. So the bootloader files were probably corrupted by the repeated hard reboots

Problem

  • A new problem has emerged: the kernel panics and the system hangs

  • I have to hard reboot every time, and the panic messages are not persisted, hence the photo below

  • The first time I use the laptop every day, there are no crashes for a long time (5-10 hours)

  • After the first crash, subsequent sessions crash within the first 10 minutes

  • This strongly suggests a hardware problem, most likely overheating. But I monitored the CPU temperature, which was fine throughout (~53C), and the laptop didn't seem unusually hot either

  • At this stage, I am clueless. Is it actually a hardware problem? If so, how do I narrow it down to the faulty component?
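
For reference, one way to watch temperatures over a long session is to log the kernel's thermal zones to a file. This is only a minimal sketch: the function names `read_temp_c` and `list_zone_temps` and the log path are made up for illustration, and which sensors show up under /sys/class/thermal depends on the hardware. In particular, these zones do not necessarily cover a discrete GPU, so a normal CPU reading does not rule out heat problems elsewhere.

```shell
#!/bin/sh
# Convert a sysfs thermal reading (millidegrees Celsius) to whole degrees.
# $1: path to a file such as /sys/class/thermal/thermal_zone0/temp
read_temp_c() {
    raw=$(cat "$1") || return 1
    echo $((raw / 1000))
}

# Print one "<zone type> <temperature>C" line per thermal zone.
# $1: base directory (defaults to /sys/class/thermal)
list_zone_temps() {
    base=${1:-/sys/class/thermal}
    for zone in "$base"/thermal_zone*; do
        # Skip the unexpanded glob or zones without a readable temperature.
        [ -r "$zone/temp" ] || continue
        printf '%s %sC\n' \
            "$(cat "$zone/type" 2>/dev/null || echo unknown)" \
            "$(read_temp_c "$zone/temp")"
    done
}

# Example (hypothetical log path): append a timestamped reading every
# 10 seconds until interrupted.
# while true; do date '+%F %T'; list_zone_temps; sleep 10; done >> "$HOME/temps.log"
```

Comparing the last entries in the log after a crash with the readings from a stable session may show whether any exposed sensor climbs before the hang.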

[Image: j6l4DWt.jpg, photo of the kernel panic screen]

Other details

[~]$ uname -a
Linux rhinoMSi 5.2.11-arch1-1-ARCH #1 SMP PREEMPT Thu Aug 29 08:09:36 UTC 2019 x86_64 GNU/Linux
[~]$ sudo lshw -short
H/W path       Device  Class          Description
=================================================
                       system         PE62 7RE (16J9.3)
/0                     bus            MS-16J9
/0/1                   memory         64KiB BIOS
/0/3e                  memory         8GiB System Memory
/0/3e/0                memory         8GiB SODIMM DDR4 Synchronous 2400 MHz (0.4 ns)
/0/3e/1                memory         [empty]
/0/42                  memory         256KiB L1 cache
/0/43                  memory         1MiB L2 cache
/0/44                  memory         6MiB L3 cache
/0/45                  processor      Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
/0/100                 bridge         Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers
/0/100/1               bridge         Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16)
/0/100/1/0             display        GP107M [GeForce GTX 1050 Ti Mobile]
/0/100/2               display        HD Graphics 630
/0/100/14              bus            100 Series/C230 Series Chipset Family USB 3.0 xHCI Controller
/0/100/14/0    usb1    bus            xHCI Host Controller
/0/100/14/0/7          input          MSI EPF USB
/0/100/14/0/a          communication  Bluetooth wireless interface
/0/100/14/0/b          multimedia     BisonCam, NB Pro
/0/100/14/0/c          generic        USB2.0-CRW
/0/100/14/1    usb2    bus            xHCI Host Controller
/0/100/14.2            generic        100 Series/C230 Series Chipset Family Thermal Subsystem
/0/100/16              communication  100 Series/C230 Series Chipset Family MEI Controller #1
/0/100/17              storage        HM170/QM170 Chipset SATA Controller [AHCI Mode]
/0/100/1c              bridge         100 Series/C230 Series Chipset Family PCI Express Root Port #1
/0/100/1c/0    wlp2s0  network        Dual Band Wireless-AC 3168NGW [Stone Peak]
/0/100/1c.3            bridge         100 Series/C230 Series Chipset Family PCI Express Root Port #4
/0/100/1c.3/0  enp3s0  network        QCA8171 Gigabit Ethernet
/0/100/1f              bridge         HM175 Chipset LPC/eSPI Controller
/0/100/1f.2            memory         Memory controller
/0/100/1f.3            multimedia     CM238 HD Audio Controller
/0/100/1f.4            bus            100 Series/C230 Series Chipset Family SMBus
/1                     power          To Be Filled By O.E.M.
[~]$ lscpu 
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   39 bits physical, 48 bits virtual
CPU(s):                          8
On-line CPU(s) list:             0-7
Thread(s) per core:              2
Core(s) per socket:              4
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           158
Model name:                      Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
Stepping:                        9
CPU MHz:                         1000.385
CPU max MHz:                     2800.0000
CPU min MHz:                     800.0000
BogoMIPS:                        5618.00
Virtualization:                  VT-x
L1d cache:                       128 KiB
L1i cache:                       128 KiB
L2 cache:                        1 MiB
L3 cache:                        6 MiB
NUMA node0 CPU(s):               0-7
Vulnerability L1tf:              Mitigation; PTE Inversion
Vulnerability Mds:               Mitigation; Clear CPU buffers; SMT vulnerable
Vulnerability Meltdown:          Mitigation; PTI
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Full generic retpoline, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d

Let me know if I need to post any other details. I have never dealt with kernel panics or hardware faults before. Any help is greatly appreciated.

Update: Faulty GPU. See comment below.

Last edited by rbiswas143 (2019-10-22 04:40:38)


I'm new to this forum and have read the rules. I'd appreciate your feedback with regard to my adherence to the norms.


#2 2019-10-18 08:53:21

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 21,410

Re: [SOLVED] System crash: Kernel panic invalid opcode 0000 PREEMPT SMP PTI

Ensure your microcode is set up and generally update your system. You might also want to disable laptop-mode-tools to ensure it isn't triggering some faulty power saving option.

You will likely also want to test your RAM for a day or so.
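
For the microcode and update suggestions, a rough sketch of what that could look like on this machine (Intel CPU, GRUB) follows. The helper name `microcode_revision` is made up for illustration, and the GRUB config path and the `laptop-mode.service` unit name are assumptions based on a default Arch setup, so adjust them if yours differs.

```shell
#!/bin/sh
# Print the microcode revision of the first CPU listed in a cpuinfo-style file.
# $1: file to read (defaults to /proc/cpuinfo)
microcode_revision() {
    awk -F': ' '/^microcode/ { print $2; exit }' "${1:-/proc/cpuinfo}"
}

# Suggested sequence, to be run manually as root:
#   pacman -Syu                                  # full system update
#   pacman -S intel-ucode                        # Intel microcode package
#   grub-mkconfig -o /boot/grub/grub.cfg         # regenerate so GRUB picks up the microcode image
#   systemctl disable --now laptop-mode.service  # only if laptop-mode-tools is installed
#
# Note the revision before the update, reboot, then compare:
#   microcode_revision
```

An unchanged revision after the reboot does not necessarily mean the setup failed; the firmware may already ship the same revision.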

Last edited by V1del (2019-10-18 09:40:05)


#3 2019-10-18 15:36:50

seth
Member
Registered: 2012-09-03
Posts: 49,951

Re: [SOLVED] System crash: Kernel panic invalid opcode 0000 PREEMPT SMP PTI

The scenario sounds as if maybe the case warms up, deforms, causes tension and boom: you got a loose connection.
You could check whether you can restore the 5-10h uptime capacity by aggressively cooling it down, e.g. putting it in the fridge for some time™ (where you keep the butter, NOT where you keep the ice cream. And power it off before doing so, esp. if you've got a spinning HDD)

Edit: and make sure the humidity in the fridge isn't too high (ie. if everything has drops of water on it, you should defrost it before putting electronics there)

Last edited by seth (2019-10-18 15:45:11)


#4 2019-10-22 04:35:39

rbiswas143
Member
Registered: 2018-08-17
Posts: 6

Re: [SOLVED] System crash: Kernel panic invalid opcode 0000 PREEMPT SMP PTI

Thank you for your support. I tried everything you guys suggested, without much luck. But I kind of managed to fix the problem anyway.

  • The laptop soon went back to its initial state, where it would not boot for a long time

  • Booted without SSD, HDD, battery, etc (one at a time) but the problem persisted

  • Screen distortions showed up at times and the screen sometimes went completely green

  • Finally uninstalled/blacklisted GPU drivers, and that worked like a charm
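
For anyone wanting to reproduce the blacklisting step, a minimal sketch follows. The post does not say which driver was in use, so `nouveau` here is an assumption (with the proprietary driver the module names would differ), and the helper name `blacklist_lines` and the file name `blacklist-gpu.conf` are made up for illustration.

```shell
#!/bin/sh
# Emit modprobe.d lines that keep the given kernel modules from loading.
# "blacklist" stops alias-based autoloading; the "install ... /bin/false"
# line also blocks loading as a dependency or by an explicit modprobe.
blacklist_lines() {
    for mod in "$@"; do
        printf 'blacklist %s\ninstall %s /bin/false\n' "$mod" "$mod"
    done
}

# Example (as root), assuming the nouveau driver was in use:
#   blacklist_lines nouveau > /etc/modprobe.d/blacklist-gpu.conf
#   mkinitcpio -P    # rebuild initramfs images in case the module is included there
#
# After a reboot, this should print nothing if the module stayed out:
#   lsmod | grep -E '^(nouveau|nvidia)'
```

The integrated Intel graphics keep driving the display, so the desktop should still come up with the discrete GPU driver blocked.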

My diagnosis is that I have a faulty GPU, but it starts misbehaving only when it heats up. The GPU is built into the motherboard, and I'd rather get a new laptop than replace the motherboard. Till then I'll be using a GPU-less laptop which, sadly, defeats the purpose of me going with MSI in the first place. I wonder whether I could use the GPU at times, until it heats up, without damaging anything else. Probably not. Marking this SOLVED.

