Two days ago, I upgraded from linux 5.14.16 to 5.15.2.
Tonight, my system semi-hung and I had to press the reset button.
In the journal, I see many kernel errors during the semi-hang.
Has anyone else seen this issue?
kernel: general protection fault, probably for non-canonical address 0xf7ff9bdac3bf8f38: 0000 [#1] PREEMPT SM>
kernel: CPU: 8 PID: 5454 Comm: DOM Worker Tainted: P OE 5.15.2-arch1-1 #1 e3bfbeb633edc604ba956>
kernel: Hardware name: Micro-Star International Co., Ltd MS-7B86/B450 GAMING PLUS (MS-7B86), BIOS 1.E0 06/11/>
kernel: RIP: 0010:kmem_cache_alloc+0x10a/0x320
kernel: Code: 5e 48 8b 51 08 48 8b 01 48 83 79 10 00 48 89 04 24 0f 84 b2 01 00 00 48 85 c0 0f 84 a9 01 00 00>
kernel: RSP: 0018:ffffa70002ebfbe8 EFLAGS: 00010282
kernel: RAX: f7ff9bdac3bf8f08 RBX: 0000000000000068 RCX: f7ff9bdac3bf8f38
kernel: RDX: 0000000007791408 RSI: 0000000000000000 RDI: 0000000000030b60
kernel: RBP: ffff9bd940206000 R08: 0000000000000000 R09: 0000000000000000
kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff9bdb50fb9900
kernel: R13: 0000000000408d40 R14: ffffffffa11813fb R15: 0000000000408d40
kernel: FS: 00007f90f41ff640(0000) GS:ffff9bdc4ea00000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00007f90f50fc000 CR3: 000000034d074000 CR4: 00000000003506e0
kernel: Call Trace:
kernel: alloc_buffer_head+0x1b/0x80
kernel: alloc_page_buffers+0x9e/0x150
kernel: create_empty_buffers+0x19/0x110
kernel: ext4_block_write_begin+0x34c/0x430 [ext4 d8e45ae5e5e63c17f79829492c61819e8ba40ecf]
kernel: ? ext4_da_release_space+0x120/0x120 [ext4 d8e45ae5e5e63c17f79829492c61819e8ba40ecf]
kernel: ext4_da_write_begin+0x119/0x2f0 [ext4 d8e45ae5e5e63c17f79829492c61819e8ba40ecf]
kernel: generic_perform_write+0xd0/0x220
kernel: ext4_buffered_write_iter+0xa7/0x190 [ext4 d8e45ae5e5e63c17f79829492c61819e8ba40ecf]
kernel: new_sync_write+0x15c/0x200
kernel: vfs_write+0x203/0x2a0
kernel: ksys_write+0x67/0xf0
kernel: do_syscall_64+0x5c/0x90
kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
kernel: RIP: 0033:0x7f913aa986ff
kernel: Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 69 fd ff ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b>
kernel: RSP: 002b:00007f90f41fa9f0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
kernel: RAX: ffffffffffffffda RBX: 00007f913aa986b0 RCX: 00007f913aa986ff
kernel: RDX: 00000000003310bb RSI: 00007f8fc6400000 RDI: 000000000000009d
kernel: RBP: 00007f90f41faa20 R08: 0000000000000000 R09: 00007f90fb93c000
kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 00007f90ec105f68
kernel: R13: 0000000000000003 R14: 00007f90f41fa960 R15: 0000000000000000
kernel: Modules linked in: rpcsec_gss_krb5 tun nf_conntrack_netlink nfnetlink ipt_REJECT nf_reject_ipv4 rpcrd>
kernel: cryptd usbhid pcspkr mc i2c_piix4 k10temp rng_core rapl soundcore libphy parport_pc parport gpio_amd>
kernel: ---[ end trace dea28a3638fe36a7 ]---
kernel: RIP: 0010:kmem_cache_alloc+0x10a/0x320
kernel: Code: 5e 48 8b 51 08 48 8b 01 48 83 79 10 00 48 89 04 24 0f 84 b2 01 00 00 48 85 c0 0f 84 a9 01 00 00>
kernel: RSP: 0018:ffffa70002ebfbe8 EFLAGS: 00010282
kernel: RAX: f7ff9bdac3bf8f08 RBX: 0000000000000068 RCX: f7ff9bdac3bf8f38
kernel: RDX: 0000000007791408 RSI: 0000000000000000 RDI: 0000000000030b60
kernel: RBP: ffff9bd940206000 R08: 0000000000000000 R09: 0000000000000000
kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff9bdb50fb9900
kernel: R13: 0000000000408d40 R14: ffffffffa11813fb R15: 0000000000408d40
kernel: FS: 00007f90f41ff640(0000) GS:ffff9bdc4ea00000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00007f90f50fc000 CR3: 000000034d074000 CR4: 00000000003506e0
Last edited by hamelg (2021-11-20 16:19:10)
many kernel errors during the semi-hang
So why did you post only one?
This is a memory allocation error when ext4 tries to sync a mount, typically because it's unmounted for the shutdown.
The problem is *likely* ahead of this.
Please post the entire journal for a boot with a hanging shutdown (and please don't copy it out of the pager; redirect it into a file or a pastebin service).
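As a rough sketch of the "redirect it into a file" advice, the snippet below writes one boot's journal to a file without going through the pager, so long lines are not cut off with `>`. It assumes the boot of interest is the previous one (`--boot=-1`); the function names and the output path are made up for illustration.

```python
import subprocess

def journal_command(boot_offset=-1, kernel_only=False):
    """Build a journalctl command line for one boot, bypassing the pager."""
    cmd = ["journalctl", "--no-pager", f"--boot={boot_offset}"]
    if kernel_only:
        cmd.append("-k")  # kernel messages only
    return cmd

def save_journal(path, boot_offset=-1, kernel_only=False):
    """Write the selected boot's journal to `path` (hypothetical file name)."""
    with open(path, "w") as out:
        subprocess.run(journal_command(boot_offset, kernel_only),
                       stdout=out, check=True)

# Example call (not run here): save_journal("journal-previous-boot.txt")
```

The resulting file can then be attached or uploaded to a pastebin service as is.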
There's no context (no preceding errors); for 25 minutes you keep getting crashes in (predominantly) kmem_cache_alloc on attempted file access.
Reproducible? Possibly OOM? You don't have swap, do you?
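One way to rule OOM in or out, sketched under assumptions: scan a saved journal for the messages the kernel normally prints around an OOM kill. The pattern below covers the usual "invoked oom-killer" and "Out of memory" wording; the function names and the file path are hypothetical.

```python
import re

# Substrings the kernel typically logs when the OOM killer runs.
OOM_PATTERN = re.compile(r"invoked oom-killer|Out of memory", re.IGNORECASE)

def find_oom_lines(lines):
    """Return the log lines that mention an OOM event."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]

def scan_file(path):
    """Scan a saved journal file (path chosen by the caller) for OOM events."""
    with open(path, errors="replace") as fh:
        return find_oom_lines(fh)

# Example call (not run here): scan_file("journal-previous-boot.txt")
```

An empty result only means the log holds no OOM-killer messages; it does not say anything else about memory pressure.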
Yes, there is no swap (I don't need it) and no specific context: no errors or OOM conditions precede the kernel errors.
Just before rebooting, I ran the free command to check memory and I didn't see anything alarming.
If this issue happens again, what can I check?
I meant that there's no context to the crashes in the logs (FS access errors, devices falling off the bus, etc.).
Do you use "non-standard" parameters for https://wiki.archlinux.org/title/Zswap ?
You could also run https://wiki.archlinux.org/title/Stress … ing_memory overnight to ensure the RAM is intact.
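To answer the zswap question, the current parameters can be read from sysfs. This is a minimal sketch, assuming the kernel exposes them under /sys/module/zswap/parameters; the function name is made up, and the helper returns an empty dict when the directory is missing.

```python
from pathlib import Path

def read_module_parameters(param_dir="/sys/module/zswap/parameters"):
    """Return {parameter name: value} for every file in param_dir."""
    root = Path(param_dir)
    if not root.is_dir():
        return {}  # module not present or parameters not exposed
    params = {}
    for entry in sorted(root.iterdir()):
        try:
            params[entry.name] = entry.read_text().strip()
        except OSError:
            params[entry.name] = None  # unreadable parameter
    return params

if __name__ == "__main__":
    for name, value in read_module_parameters().items():
        print(f"{name} = {value}")
```

Comparing the printed values with the defaults described on the wiki page shows whether anything "non-standard" is in use.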
You could also run https://wiki.archlinux.org/title/Stress … ing_memory overnight to ensure the RAM is intact.
Well spotted!
One of my memory modules is faulty.
Thanks very much.
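In hindsight, the fault address in the first trace fits a RAM error. Here is a small sketch, assuming 48-bit virtual addresses (4-level paging), that checks whether an x86-64 address is canonical and lists which single-bit flips would make it so. The function names are made up for illustration.

```python
def is_canonical(addr, va_bits=48):
    """x86-64 canonical form: bits 63 down to va_bits-1 are all equal."""
    top = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones

def single_bit_repairs(addr, va_bits=48):
    """Bit positions whose flip alone turns addr into a canonical address."""
    return [bit for bit in range(64)
            if is_canonical(addr ^ (1 << bit), va_bits)]

fault = 0xf7ff9bdac3bf8f38              # address from the GPF line
print(is_canonical(fault))              # False
print(single_bit_repairs(fault))        # [59]
print(hex(fault ^ (1 << 59)))           # 0xffff9bdac3bf8f38
```

Flipping bit 59 gives 0xffff9bdac3bf8f38, which sits in the same range as the other kernel pointers in the register dump (RBP, R12). That is consistent with a single flipped bit in memory, although it is not proof on its own.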