#1 2021-03-26 12:42:27

8472
Member
From: Slovakia
Registered: 2010-05-15
Posts: 83

[SOLVED] kernel panic - fatal exception in interrupt

Hello, I am regularly getting a "kernel panic - fatal exception in interrupt" when booting my laptop, a Lenovo ThinkPad L560.
Sometimes it boots properly, even several times in a row. At other times it hits this panic and cannot get into the system, again for several reboots in a row, until it suddenly starts booting properly again.
The problem started quite a while ago, with older kernel versions as well.
Both the LTS and the non-LTS kernels are affected.
I have run memtest and all the other built-in hardware tests provided by the BIOS (CPU, motherboard, S.M.A.R.T., etc.); everything passed and no hardware problem was detected.


BUG: kernel NULL pointer dereference, address: ...
BUG: unable to handle page fault for address: ...
...
Call Trace:
<IRQ>
run_timer_softirq+0x19/0x30
__do_softirq+0xca/0x288
asm_call_irq_on_stack+0x12/0x20
</IRQ>
do_softirq_own_stack+0x37/0x40
irq_exit_rcu+0x9c/0xd0
sysvec_apic_timer_interrupt+0x36/0x80
? asm_sysvec_apic_timer_interrupt+0xa/0x20
asm_sysvec_apic_timer_interrupt+0x12/0x20

https://imgur.com/a/VhqXnt5

# uname -a
Linux - 5.10.26-1-lts #1 SMP Thu, 25 Mar 2021 08:56:04 +0000 x86_64 GNU/Linux
# lspci
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers (rev 08)
00:02.0 VGA compatible controller: Intel Corporation Skylake GT2 [HD Graphics 520] (rev 07)
00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
00:14.0 USB controller: Intel Corporation Sunrise Point-LP USB 3.0 xHCI Controller (rev 21)
00:14.2 Signal processing controller: Intel Corporation Sunrise Point-LP Thermal subsystem (rev 21)
00:16.0 Communication controller: Intel Corporation Sunrise Point-LP CSME HECI #1 (rev 21)
00:17.0 SATA controller: Intel Corporation Sunrise Point-LP SATA Controller [AHCI mode] (rev 21)
00:1c.0 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port #4 (rev f1)
00:1c.4 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port #5 (rev f1)
00:1c.5 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port #6 (rev f1)
00:1f.0 ISA bridge: Intel Corporation Sunrise Point-LP LPC Controller (rev 21)
00:1f.2 Memory controller: Intel Corporation Sunrise Point-LP PMC (rev 21)
00:1f.3 Audio device: Intel Corporation Sunrise Point-LP HD Audio (rev 21)
00:1f.4 SMBus: Intel Corporation Sunrise Point-LP SMBus (rev 21)
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection I219-V (rev 21)
02:00.0 Network controller: Intel Corporation Wireless 8260 (rev 3a)
05:00.0 SD Host controller: O2 Micro, Inc. SD/MMC Card Reader Controller (rev 01)

Any ideas, please?

Last edited by 8472 (2021-04-04 08:42:25)


Logic clearly dictates that the needs of the many outweigh the needs of the few.


#2 2021-03-31 20:37:07

paulkerry
Member
From: Sheffield, UK
Registered: 2014-10-02
Posts: 611

Re: [SOLVED] kernel panic - fatal exception in interrupt

Looking at your screenshot, it seems to be failing fairly early in the boot process.
Have you installed microcode? https://wiki.archlinux.org/index.php/Microcode
Have you tried booting the installer via usb stick, and if so, can you investigate the journal from there?
Have you tried booting a live linux usb stick to check if it's a problem with your installation of the O/S or a hardware issue?

Cheers
Paul.


#3 2021-04-01 06:43:40

8472
Member
From: Slovakia
Registered: 2010-05-15
Posts: 83

Re: [SOLVED] kernel panic - fatal exception in interrupt

Thank you for your reply.
Yes, microcode is installed and applied. I have even tried downgrading it and disabling it at boot, but it makes no difference.
Live Linux USB sticks (using Ventoy) all boot fine, no matter which ISO/distribution I choose.
As I said earlier, the hardware tests (CPU, memory, motherboard, PCI Express) run via the BIOS vendor's self-test utility (with the BIOS updated to the latest version) found no hardware problems. But yes, I still don't rule anything out, neither the OS installation nor a possible hardware issue.
I booted the installer USB stick and checked journalctl --file system.journal on the mounted partition.
After one boot failure, I noticed the following at the end of the log:

...
kernel: kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
kernel: BUG: unable to handle page fault for address: ffffe12b44283100

However, after several more boot failures, booting the installer USB stick again shows the log ending like this, without the failure/error reported above:

...
systemd-journald[199]: Time spent on flushing to ...

and nothing in the log afterwards.



#4 2021-04-01 20:07:47

paulkerry
Member
From: Sheffield, UK
Registered: 2014-10-02
Posts: 611

Re: [SOLVED] kernel panic - fatal exception in interrupt

Very strange. Did your system ever work properly at all? I'm just wondering if an update has caused or made your problem worse?
linux firmware perhaps? - https://archlinux.org/packages/core/any/linux-firmware/

Did you happen to notice which kernel versions were used on the live usb stick that worked OK?
Just for testing purposes to see if it's some kind of regression and to see if you get a working system, you could consider trying the old 4.19 lts kernels.
You could try 4.19.101, which was the last main arch built kernel and is on the ALA - https://archive.archlinux.org/packages/l/linux-lts/
or if you have another machine, build the newer (as of writing 4.19.183) https://aur.archlinux.org/packages/linux-lts419/
Remember if you try to use these, then you have to follow the rule on the front page...
"If, for any reason, you are using a kernel version prior to 5.9, make sure to change mkinitcpio.conf COMPRESSION to use one of the compressors supported, like gzip, otherwise you will not be able to boot images generated by mkinitcpio."

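For reference, the compressor setting named in that rule is a single line in /etc/mkinitcpio.conf. A minimal sketch, assuming gzip is the compressor chosen (the quoted rule only names it as one supported option):

```shell
# /etc/mkinitcpio.conf (fragment; the file uses shell variable syntax)
# gzip is one compressor that kernels older than 5.9 can unpack
COMPRESSION="gzip"
```

After changing it, the initramfs images need to be regenerated (for example with mkinitcpio -P) before rebooting into the older kernel.
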

#5 2021-04-01 20:09:14

xerxes_
Member
Registered: 2018-04-29
Posts: 662

Re: [SOLVED] kernel panic - fatal exception in interrupt

Add 'ignore_loglevel' or 'loglevel=7' to your boot command line and remove 'quiet' if it is there. That should produce a more detailed boot log.
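
The change described above can be sketched as a plain string edit. This is only an illustration: the helper name verbose_cmdline and the sample command line are made up, and where the command line is actually stored depends on the boot loader, which this thread does not specify.

```shell
# Hypothetical helper: drop 'quiet' from a kernel command line
# and append 'ignore_loglevel' exactly once.
verbose_cmdline() {
    out=""
    for word in $1; do
        case "$word" in
            # skip 'quiet'; skip an existing 'ignore_loglevel' to avoid a duplicate
            quiet|ignore_loglevel) continue ;;
        esac
        out="${out:+$out }$word"
    done
    printf '%s\n' "${out:+$out }ignore_loglevel"
}

# Example with a made-up command line
verbose_cmdline "root=/dev/sda2 rw quiet"
# prints: root=/dev/sda2 rw ignore_loglevel
```

The result still has to be placed into the boot loader's own configuration by hand, or typed in at the boot menu for a one-off test.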


#6 2021-04-03 11:06:25

8472
Member
From: Slovakia
Registered: 2010-05-15
Posts: 83

Re: [SOLVED] kernel panic - fatal exception in interrupt

paulkerry wrote:

Very strange. Did your system ever work properly at all? I'm just wondering if an update has caused or made your problem worse?
linux firmware perhaps? - https://archlinux.org/packages/core/any/linux-firmware/

Did you happen to notice which kernel versions were used on the live usb stick that worked OK?
Just for testing purposes to see if it's some kind of regression and to see if you get a working system, you could consider trying the old 4.19 lts kernels.
You could try 4.19.101, which was the last main arch built kernel and is on the ALA - https://archive.archlinux.org/packages/l/linux-lts/
or if you have another machine, build the newer (as of writing 4.19.183) https://aur.archlinux.org/packages/linux-lts419/
Remember if you try to use these, then you have to follow the rule on the front page...
"If, for any reason, you are using a kernel version prior to 5.9, make sure to change mkinitcpio.conf COMPRESSION to use one of the compressors supported, like gzip, otherwise you will not be able to boot images generated by mkinitcpio."

The system had worked without this problem since 2016.
Well yes, it's possible that some update is causing this; I still do not rule anything out.
Even the most recent Arch ISOs (tested 202102 and 202103, with some 5.11 kernel version) boot fine from the USB stick.
If I remember correctly, even some earlier 5.x kernels worked fine, until this issue started a few months ago.


xerxes_ wrote:

Add 'ignore_loglevel' or 'loglevel=7' to your boot command line and remove 'quiet' if it is there. That should produce a more detailed boot log.

I added "ignore_loglevel" to the boot line. Both of the following come from one and the same boot failure:
display: https://imgur.com/a/vY8rcdO
journalctl log: https://pastebin.com/raw/v6i3knJq

Last edited by 8472 (2021-04-03 11:08:08)



#7 2021-04-03 12:48:55

xerxes_
Member
Registered: 2018-04-29
Posts: 662

Re: [SOLVED] kernel panic - fatal exception in interrupt

This log is either not from the boot that crashed or it is incomplete. Try adding 'rcutree.use_softirq=0' to your boot command line.
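
To confirm after a reboot that the parameter actually reached the kernel, the running command line can be inspected. A minimal sketch: the helper name has_boot_param is made up for illustration, and it only does an exact word match against /proc/cmdline.

```shell
# Hypothetical helper: succeed if command line $1 contains the exact word $2
has_boot_param() {
    for word in $1; do
        [ "$word" = "$2" ] && return 0
    done
    return 1
}

# On a running Linux system the active command line is in /proc/cmdline
if [ -r /proc/cmdline ] && has_boot_param "$(cat /proc/cmdline)" "rcutree.use_softirq=0"; then
    echo "rcutree.use_softirq=0 is active"
else
    echo "rcutree.use_softirq=0 is not on the current command line"
fi
```

According to the kernel's documented parameter list, setting rcutree.use_softirq to zero moves RCU_SOFTIRQ processing out of softirq context and into per-CPU rcuc kthreads.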


#8 2021-04-04 08:42:13

8472
Member
From: Slovakia
Registered: 2010-05-15
Posts: 83

Re: [SOLVED] kernel panic - fatal exception in interrupt

xerxes_ wrote:

This log is either not from the boot that crashed or it is incomplete. Try adding 'rcutree.use_softirq=0' to your boot command line.

That log was from that particular crash.
I don't know why it was not complete, but that was all the command "journalctl --file system.journal" gave me.


Wow, that "rcutree.use_softirq=0" seems to have done it.
I have rebooted about 20 times with it and there was not a single crash, whereas without it the system would have crashed by now.
I also tested two more reboots without it, and it crashed on the second attempt.
I've tried to read about this parameter, but what exactly it does is beyond my understanding.

Thank you very much, to all of you!



#9 2021-04-04 09:05:29

paulkerry
Member
From: Sheffield, UK
Registered: 2014-10-02
Posts: 611

Re: [SOLVED] kernel panic - fatal exception in interrupt

Glad you are sorted!
Here's the commit for "rcutree.use_softirq"...
https://git.kernel.org/pub/scm/linux/ke … 84926212fe
which, as is typical for some kernel commits, has text that is only really understood by those who write them :-)
Cheers
Paul

