I began experiencing random freezes 8 or 10 days ago. The whole system is completely unresponsive (I can't even ssh from another computer) and the only way out is turning off the computer with the power button.
I have no idea if it is a hardware or software problem. I ran memtest86+ and it didn't report any errors; beyond that, I don't really know what to do.
I ran journalctl -b-1 after the last freeze. The output is way over my head, but there are some obvious error messages at the end (pasted below).
Is there any useful information there? What else can I do to diagnose the problem?
jun 10 18:15:40 acme7 kernel: general protection fault, probably for non-canonical address 0x91e1574a2f57ee24: 0000 [#1] PREEMPT SMP PTI
jun 10 18:15:40 acme7 kernel: CPU: 8 PID: 621 Comm: Xorg Tainted: G OE 6.3.6-arch1-1 #1 a07497485287c74e7a472f42ded4b2ddcf7a6fd7
jun 10 18:15:40 acme7 kernel: Hardware name: ASUS All Series/X99-A, BIOS 3701 03/31/2017
jun 10 18:15:40 acme7 kernel: RIP: 0010:__kmem_cache_alloc_node+0x1c3/0x310
jun 10 18:15:40 acme7 kernel: Code: 2b 14 25 28 00 00 00 0f 85 5e 01 00 00 48 83 c4 18 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 41 8b 47 28 4d 8b 07 48 01 f8 <48> 8b 18 48 89 c1 49 33 9f b8 00 00 00 48 89 f8 48 0f c9 48 31 cb
jun 10 18:15:40 acme7 kernel: RSP: 0018:ffffb32dc0ecb6d8 EFLAGS: 00010286
jun 10 18:15:40 acme7 kernel: RAX: 91e1574a2f57ee24 RBX: 0000000000000dc0 RCX: 0000000000000048
jun 10 18:15:40 acme7 kernel: RDX: 000000002e18f408 RSI: 0000000000000dc0 RDI: 91e1574a2f57edf4
jun 10 18:15:40 acme7 kernel: RBP: 0000000000000dc0 R08: 00000000000390c0 R09: 0000000000000018
jun 10 18:15:40 acme7 kernel: R10: 00000000000000c8 R11: 0000000000000000 R12: 0000000000000000
jun 10 18:15:40 acme7 kernel: R13: 00000000ffffffff R14: 0000000000000048 R15: ffff9af980042600
jun 10 18:15:40 acme7 kernel: FS: 00007fdff45e1480(0000) GS:ffff9afcafc00000(0000) knlGS:0000000000000000
jun 10 18:15:40 acme7 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
jun 10 18:15:40 acme7 kernel: CR2: 00007fdfe5656000 CR3: 0000000122642003 CR4: 00000000001706e0
jun 10 18:15:40 acme7 kernel: Call Trace:
jun 10 18:15:40 acme7 kernel: <TASK>
jun 10 18:15:40 acme7 kernel: ? die_addr+0x36/0x90
jun 10 18:15:40 acme7 kernel: ? exc_general_protection+0x1be/0x420
jun 10 18:15:40 acme7 kernel: ? asm_exc_general_protection+0x26/0x30
jun 10 18:15:40 acme7 kernel: ? __kmem_cache_alloc_node+0x1c3/0x310
jun 10 18:15:40 acme7 kernel: ? __kmem_cache_alloc_node+0x27f/0x310
jun 10 18:15:40 acme7 kernel: ? nvkm_mem_new_type+0xd8/0x2d0 [nouveau 4f81216cf0863fd9ab380a064410cbee31eb8eee]
jun 10 18:15:40 acme7 kernel: kmalloc_trace+0x2a/0xa0
jun 10 18:15:40 acme7 kernel: nvkm_mem_new_type+0xd8/0x2d0 [nouveau 4f81216cf0863fd9ab380a064410cbee31eb8eee]
jun 10 18:15:40 acme7 kernel: nvkm_umem_new+0x113/0x200 [nouveau 4f81216cf0863fd9ab380a064410cbee31eb8eee]
jun 10 18:15:40 acme7 kernel: nvkm_ioctl_new+0x150/0x290 [nouveau 4f81216cf0863fd9ab380a064410cbee31eb8eee]
jun 10 18:15:40 acme7 kernel: ? __pfx_nvkm_umem_new+0x10/0x10 [nouveau 4f81216cf0863fd9ab380a064410cbee31eb8eee]
jun 10 18:15:40 acme7 kernel: nvkm_ioctl+0x10e/0x250 [nouveau 4f81216cf0863fd9ab380a064410cbee31eb8eee]
jun 10 18:15:40 acme7 kernel: nvif_object_ctor+0x112/0x190 [nouveau 4f81216cf0863fd9ab380a064410cbee31eb8eee]
jun 10 18:15:40 acme7 kernel: nvif_mem_ctor_type+0xdc/0x1b0 [nouveau 4f81216cf0863fd9ab380a064410cbee31eb8eee]
jun 10 18:15:40 acme7 kernel: nouveau_mem_host+0xfe/0x1a0 [nouveau 4f81216cf0863fd9ab380a064410cbee31eb8eee]
jun 10 18:15:40 acme7 kernel: nouveau_sgdma_bind+0x30/0x90 [nouveau 4f81216cf0863fd9ab380a064410cbee31eb8eee]
jun 10 18:15:40 acme7 kernel: nouveau_bo_move+0x43e/0x820 [nouveau 4f81216cf0863fd9ab380a064410cbee31eb8eee]
jun 10 18:15:40 acme7 kernel: ? __kmalloc_node+0x50/0x150
jun 10 18:15:40 acme7 kernel: ttm_bo_handle_move_mem+0xb8/0x170 [ttm d22ef42d111a276a196fdf41370d456e0356e63d]
jun 10 18:15:40 acme7 kernel: ttm_bo_validate+0xf0/0x160 [ttm d22ef42d111a276a196fdf41370d456e0356e63d]
jun 10 18:15:40 acme7 kernel: ttm_bo_init_reserved+0x14e/0x1c0 [ttm d22ef42d111a276a196fdf41370d456e0356e63d]
jun 10 18:15:40 acme7 kernel: ttm_bo_init_validate+0x5a/0xe0 [ttm d22ef42d111a276a196fdf41370d456e0356e63d]
jun 10 18:15:40 acme7 kernel: ? __pfx_nouveau_bo_del_ttm+0x10/0x10 [nouveau 4f81216cf0863fd9ab380a064410cbee31eb8eee]
jun 10 18:15:40 acme7 kernel: nouveau_bo_init+0x6b/0x80 [nouveau 4f81216cf0863fd9ab380a064410cbee31eb8eee]
jun 10 18:15:40 acme7 kernel: ? __pfx_nouveau_bo_del_ttm+0x10/0x10 [nouveau 4f81216cf0863fd9ab380a064410cbee31eb8eee]
jun 10 18:15:40 acme7 kernel: nouveau_gem_new+0x84/0xe0 [nouveau 4f81216cf0863fd9ab380a064410cbee31eb8eee]
jun 10 18:15:40 acme7 kernel: nouveau_gem_ioctl_new+0x59/0x100 [nouveau 4f81216cf0863fd9ab380a064410cbee31eb8eee]
jun 10 18:15:40 acme7 kernel: ? __pfx_nouveau_gem_ioctl_new+0x10/0x10 [nouveau 4f81216cf0863fd9ab380a064410cbee31eb8eee]
jun 10 18:15:40 acme7 kernel: drm_ioctl_kernel+0xcd/0x170
jun 10 18:15:40 acme7 kernel: drm_ioctl+0x26d/0x4b0
jun 10 18:15:40 acme7 kernel: ? __pfx_nouveau_gem_ioctl_new+0x10/0x10 [nouveau 4f81216cf0863fd9ab380a064410cbee31eb8eee]
jun 10 18:15:40 acme7 kernel: nouveau_drm_ioctl+0x5a/0xb0 [nouveau 4f81216cf0863fd9ab380a064410cbee31eb8eee]
jun 10 18:15:40 acme7 kernel: __x64_sys_ioctl+0x94/0xd0
jun 10 18:15:40 acme7 kernel: do_syscall_64+0x60/0x90
jun 10 18:15:40 acme7 kernel: ? do_syscall_64+0x6c/0x90
jun 10 18:15:40 acme7 kernel: ? __irq_exit_rcu+0x4b/0xf0
jun 10 18:15:40 acme7 kernel: entry_SYSCALL_64_after_hwframe+0x72/0xdc
jun 10 18:15:40 acme7 kernel: RIP: 0033:0x7fdff4fbd76f
jun 10 18:15:40 acme7 kernel: Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
jun 10 18:15:40 acme7 kernel: RSP: 002b:00007ffc005ca070 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
jun 10 18:15:40 acme7 kernel: RAX: ffffffffffffffda RBX: 000055f8fc17fdd0 RCX: 00007fdff4fbd76f
jun 10 18:15:40 acme7 kernel: RDX: 00007ffc005ca120 RSI: 00000000c0306480 RDI: 000000000000000f
jun 10 18:15:40 acme7 kernel: RBP: 00007ffc005ca120 R08: 000000055f8fc17f R09: 00007fdff5099bd0
jun 10 18:15:40 acme7 kernel: R10: 00000000000000c0 R11: 0000000000000246 R12: 00000000c0306480
jun 10 18:15:40 acme7 kernel: R13: 000000000000000f R14: 0000000000000000 R15: 0000000000002224
jun 10 18:15:40 acme7 kernel: </TASK>
jun 10 18:15:40 acme7 kernel: Modules linked in: snd_seq_dummy snd_usb_audio snd_usbmidi_lib vfat mc fat intel_rapl_msr intel_rapl_common joydev mousedev x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic gf128mul ghash_clmulni_intel snd_intel_dspcfg sha512_ssse3 snd_intel_sdw_acpi aesni_intel snd_hda_codec eeepc_wmi asus_wmi iTCO_wdt crypto_simd snd_hda_core cryptd ledtrig_audio mei_me intel_pmc_bxt rapl sparse_keymap snd_hwdep cfg80211 iTCO_vendor_support platform_profile intel_cstate i2c_i801 e1000e intel_wmi_thunderbolt wmi_bmof pcspkr mei snd_pcm i2c_smbus lpc_ich rfkill intel_uncore mac_hid vboxnetflt(OE) vboxnetadp(OE) vboxdrv(OE) snd_virmidi snd_seq_virmidi snd_seq_midi_event snd_seq snd_rawmidi snd_seq_device snd_timer snd soundcore sg fuse dm_mod loop bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 usbhid nouveau drm_ttm_helper ttm video i2c_algo_bit mxm_wmi sr_mod crc32c_intel drm_display_helper cdrom xhci_pci
jun 10 18:15:40 acme7 kernel: xhci_pci_renesas cec wmi
jun 10 18:15:40 acme7 kernel: ---[ end trace 0000000000000000 ]---
jun 10 18:15:40 acme7 kernel: RIP: 0010:__kmem_cache_alloc_node+0x1c3/0x310
jun 10 18:15:40 acme7 kernel: Code: 2b 14 25 28 00 00 00 0f 85 5e 01 00 00 48 83 c4 18 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 41 8b 47 28 4d 8b 07 48 01 f8 <48> 8b 18 48 89 c1 49 33 9f b8 00 00 00 48 89 f8 48 0f c9 48 31 cb
jun 10 18:15:40 acme7 kernel: RSP: 0018:ffffb32dc0ecb6d8 EFLAGS: 00010286
jun 10 18:15:40 acme7 kernel: RAX: 91e1574a2f57ee24 RBX: 0000000000000dc0 RCX: 0000000000000048
jun 10 18:15:40 acme7 kernel: RDX: 000000002e18f408 RSI: 0000000000000dc0 RDI: 91e1574a2f57edf4
jun 10 18:15:40 acme7 kernel: RBP: 0000000000000dc0 R08: 00000000000390c0 R09: 0000000000000018
jun 10 18:15:40 acme7 kernel: R10: 00000000000000c8 R11: 0000000000000000 R12: 0000000000000000
jun 10 18:15:40 acme7 kernel: R13: 00000000ffffffff R14: 0000000000000048 R15: ffff9af980042600
jun 10 18:15:40 acme7 kernel: FS: 00007fdff45e1480(0000) GS:ffff9afcafc00000(0000) knlGS:0000000000000000
jun 10 18:15:40 acme7 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
jun 10 18:15:40 acme7 kernel: CR2: 00007fdfe5656000 CR3: 0000000122642003 CR4: 00000000001706e0
jun 10 18:17:56 acme7 kernel: general protection fault, probably for non-canonical address 0x91e1574a2f57ee24: 0000 [#2] PREEMPT SMP PTI
jun 10 18:17:56 acme7 kernel: CPU: 8 PID: 2050 Comm: threaded-ml Tainted: G D OE 6.3.6-arch1-1 #1 a07497485287c74e7a472f42ded4b2ddcf7a6fd7
jun 10 18:17:56 acme7 kernel: Hardware name: ASUS All Series/X99-A, BIOS 3701 03/31/2017
jun 10 18:17:56 acme7 kernel: RIP: 0010:__kmem_cache_alloc_node+0x1c3/0x310
jun 10 18:17:56 acme7 kernel: Code: 2b 14 25 28 00 00 00 0f 85 5e 01 00 00 48 83 c4 18 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 41 8b 47 28 4d 8b 07 48 01 f8 <48> 8b 18 48 89 c1 49 33 9f b8 00 00 00 48 89 f8 48 0f c9 48 31 cb
jun 10 18:17:56 acme7 kernel: RSP: 0018:ffffb32dca8d3ca8 EFLAGS: 00010286
jun 10 18:17:56 acme7 kernel: RAX: 91e1574a2f57ee24 RBX: 0000000000000dc0 RCX: 0000000000000058
jun 10 18:17:56 acme7 kernel: RDX: 000000002e18f808 RSI: 0000000000000dc0 RDI: 91e1574a2f57edf4
jun 10 18:17:56 acme7 kernel: RBP: 0000000000000dc0 R08: 00000000000390c0 R09: 0000000000000000
jun 10 18:17:56 acme7 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
jun 10 18:17:56 acme7 kernel: R13: 00000000ffffffff R14: 0000000000000058 R15: ffff9af980042600
jun 10 18:17:56 acme7 kernel: FS: 00007f0ffe3ff6c0(0000) GS:ffff9afcafc00000(0000) knlGS:0000000000000000
jun 10 18:17:56 acme7 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
jun 10 18:17:56 acme7 kernel: CR2: 00007f879c880000 CR3: 00000001d2642005 CR4: 00000000001706e0
jun 10 18:17:56 acme7 kernel: Call Trace:
jun 10 18:17:56 acme7 kernel: <TASK>
jun 10 18:17:56 acme7 kernel: ? die_addr+0x36/0x90
jun 10 18:17:56 acme7 kernel: ? exc_general_protection+0x1be/0x420
jun 10 18:17:56 acme7 kernel: ? asm_exc_general_protection+0x26/0x30
jun 10 18:17:56 acme7 kernel: ? __kmem_cache_alloc_node+0x1c3/0x310
jun 10 18:17:56 acme7 kernel: ? __kmem_cache_alloc_node+0x27f/0x310
jun 10 18:17:56 acme7 kernel: ? refill_pi_state_cache+0x3b/0x90
jun 10 18:17:56 acme7 kernel: kmalloc_trace+0x2a/0xa0
jun 10 18:17:56 acme7 kernel: refill_pi_state_cache+0x3b/0x90
jun 10 18:17:56 acme7 kernel: futex_lock_pi+0x137/0x470
jun 10 18:17:56 acme7 kernel: do_futex+0x52/0x190
jun 10 18:17:56 acme7 kernel: __x64_sys_futex+0x129/0x1e0
jun 10 18:17:56 acme7 kernel: ? syscall_exit_to_user_mode+0x1b/0x40
jun 10 18:17:56 acme7 kernel: ? do_syscall_64+0x6c/0x90
jun 10 18:17:56 acme7 kernel: do_syscall_64+0x60/0x90
jun 10 18:17:56 acme7 kernel: ? do_syscall_64+0x6c/0x90
jun 10 18:17:56 acme7 kernel: ? do_syscall_64+0x6c/0x90
jun 10 18:17:56 acme7 kernel: ? do_syscall_64+0x6c/0x90
jun 10 18:17:56 acme7 kernel: ? do_syscall_64+0x6c/0x90
jun 10 18:17:56 acme7 kernel: entry_SYSCALL_64_after_hwframe+0x72/0xdc
jun 10 18:17:56 acme7 kernel: RIP: 0033:0x7f1025499fb5
jun 10 18:17:56 acme7 kernel: Code: d1 fe ff ff 90 f3 0f 1e fa 89 f0 89 ce 49 89 d2 40 80 f6 86 85 c0 74 09 80 f1 8d 48 85 d2 0f 45 f1 31 d2 b8 ca 00 00 00 0f 05 <83> f8 da 74 26 83 f8 92 74 18 8d 50 23 83 fa 23 77 1f 48 b9 01 20
jun 10 18:17:56 acme7 kernel: RSP: 002b:00007f0ffe3fea28 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
jun 10 18:17:56 acme7 kernel: RAX: ffffffffffffffda RBX: 0000562527b4fe10 RCX: 00007f1025499fb5
jun 10 18:17:56 acme7 kernel: RDX: 0000000000000000 RSI: 0000000000000086 RDI: 0000562527b4fe10
jun 10 18:17:56 acme7 kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000562527b4fe10
jun 10 18:17:56 acme7 kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
jun 10 18:17:56 acme7 kernel: R13: 00000000000004da R14: 0000562527b4fe20 R15: 00007f0ffdbff000
jun 10 18:17:56 acme7 kernel: </TASK>
jun 10 18:17:56 acme7 kernel: Modules linked in: snd_seq_dummy snd_usb_audio snd_usbmidi_lib vfat mc fat intel_rapl_msr intel_rapl_common joydev mousedev x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic gf128mul ghash_clmulni_intel snd_intel_dspcfg sha512_ssse3 snd_intel_sdw_acpi aesni_intel snd_hda_codec eeepc_wmi asus_wmi iTCO_wdt crypto_simd snd_hda_core cryptd ledtrig_audio mei_me intel_pmc_bxt rapl sparse_keymap snd_hwdep cfg80211 iTCO_vendor_support platform_profile intel_cstate i2c_i801 e1000e intel_wmi_thunderbolt wmi_bmof pcspkr mei snd_pcm i2c_smbus lpc_ich rfkill intel_uncore mac_hid vboxnetflt(OE) vboxnetadp(OE) vboxdrv(OE) snd_virmidi snd_seq_virmidi snd_seq_midi_event snd_seq snd_rawmidi snd_seq_device snd_timer snd soundcore sg fuse dm_mod loop bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 usbhid nouveau drm_ttm_helper ttm video i2c_algo_bit mxm_wmi sr_mod crc32c_intel drm_display_helper cdrom xhci_pci
jun 10 18:17:56 acme7 kernel: xhci_pci_renesas cec wmi
jun 10 18:17:56 acme7 kernel: ---[ end trace 0000000000000000 ]---
Unfortunately
jun 10 09:24:01 acme7 kernel: pci 0000:01:00.0: [10de:10c3] type 00 class 0x030000
is a GT218, so the binary nvidia driver is not really an option to test.
Is the LTS kernel affected?
Also check
pacman -Qs nouveau
and, if in doubt, remove xf86-video-nouveau.
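One way to compare the two faults in the log is to pull the process name, the faulting kernel function and the bad address out of each "general protection fault" entry. What follows is only a minimal sketch in Python: summarize_gpf and SAMPLE are hypothetical names introduced here, and SAMPLE holds a few lines copied from the journal output above.

```python
import re

# Header of each fault, e.g.
# "general protection fault, probably for non-canonical address 0x...: 0000 [#1] ..."
FAULT_RE = re.compile(r"general protection fault.*?address (0x[0-9a-f]+): \d+ \[#(\d+)\]")
# Process name from the "CPU: ... Comm: <name> Tainted: ..." line.
COMM_RE = re.compile(r"Comm: (\S+)")
# Kernel-mode instruction pointer (code segment 0010); the user-space RIP (0033) is ignored.
RIP_RE = re.compile(r"RIP: 0010:(\S+)")

# A few lines copied from the journal output in this thread (illustration only).
SAMPLE = """\
jun 10 18:15:40 acme7 kernel: general protection fault, probably for non-canonical address 0x91e1574a2f57ee24: 0000 [#1] PREEMPT SMP PTI
jun 10 18:15:40 acme7 kernel: CPU: 8 PID: 621 Comm: Xorg Tainted: G OE 6.3.6-arch1-1 #1 a07497485287c74e7a472f42ded4b2ddcf7a6fd7
jun 10 18:15:40 acme7 kernel: RIP: 0010:__kmem_cache_alloc_node+0x1c3/0x310
jun 10 18:17:56 acme7 kernel: general protection fault, probably for non-canonical address 0x91e1574a2f57ee24: 0000 [#2] PREEMPT SMP PTI
jun 10 18:17:56 acme7 kernel: CPU: 8 PID: 2050 Comm: threaded-ml Tainted: G D OE 6.3.6-arch1-1 #1 a07497485287c74e7a472f42ded4b2ddcf7a6fd7
jun 10 18:17:56 acme7 kernel: RIP: 0010:__kmem_cache_alloc_node+0x1c3/0x310
"""


def summarize_gpf(journal_text):
    """Return one dict per general protection fault found in journal_text."""
    faults = []
    current = None
    for line in journal_text.splitlines():
        header = FAULT_RE.search(line)
        if header:
            # Start a new record for this fault.
            current = {
                "n": int(header.group(2)),
                "address": header.group(1),
                "comm": None,
                "rip": None,
            }
            faults.append(current)
            continue
        if current is None:
            continue  # nothing to attach details to yet
        comm = COMM_RE.search(line)
        if comm and current["comm"] is None:
            current["comm"] = comm.group(1)
        rip = RIP_RE.search(line)
        if rip and current["rip"] is None:
            # Keep only the first kernel RIP; it is printed again after the trace.
            current["rip"] = rip.group(1)
    return faults


for fault in summarize_gpf(SAMPLE):
    print(fault["n"], fault["comm"], fault["rip"], fault["address"])
```

For a whole boot, the output of journalctl -k -b -1 could be saved to a file and its contents passed to the function instead of SAMPLE. On the sample lines it reports the same faulting function and the same address for two different processes (Xorg and threaded-ml), which suggests the two crashes are related rather than independent.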
Thanks for your prompt and helpful reply.
I'm not sure if I should conclude that the problem is the (extremely old) video card or the drivers, or both.
I don't have xf86-video-nouveau installed; perhaps I should try it?
For now, I installed the LTS kernel and I am using it right now. Let's see if it survives the day without freezes.