#1 2025-06-17 12:43:48

michelesr
Member
Registered: 2016-02-04
Posts: 73

Page fault on DMA fence release after sway update

I recently updated the kernel and the nvidia driver:

[2025-06-13T21:55:50+0100] [ALPM] upgraded linux (6.14.10.arch1-1 -> 6.15.2.arch1-1)
[2025-06-13T21:55:55+0100] [ALPM] upgraded nvidia (575.57.08-2 -> 575.57.08-5)

Sometimes, after loading the nvidia module (especially after a resume from suspend) and trying vkcube, the screen freezes and I get:

Jun 17 13:09:20 jason kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 510
Jun 17 13:09:21 jason kernel: 
Jun 17 13:09:21 jason kernel: nvidia 0000:01:00.0: enabling device (0000 -> 0003)
Jun 17 13:09:21 jason kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  575.57.08  Sat May 24 07:21:16 UTC 2025
Jun 17 13:09:21 jason kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  575.57.08  Sat May 24 06:52:56 UTC 2025
Jun 17 13:09:21 jason kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
Jun 17 13:09:22 jason kernel: [drm] Initialized nvidia-drm 0.0.0 for 0000:01:00.0 on minor 0
Jun 17 13:09:22 jason kernel: nvidia 0000:01:00.0: [drm] No compatible format found
Jun 17 13:09:22 jason kernel: nvidia 0000:01:00.0: [drm] Cannot find any crtc or sizes
Jun 17 13:09:22 jason kernel: BUG: unable to handle page fault for address: ffffffffc1f89210
Jun 17 13:09:22 jason kernel: #PF: supervisor read access in kernel mode
Jun 17 13:09:22 jason kernel: #PF: error_code(0x0000) - not-present page
Jun 17 13:09:22 jason kernel: PGD 247a29067 P4D 247a29067 PUD 247a2b067 PMD 11a02c067 PTE 0
Jun 17 13:09:22 jason kernel: Oops: Oops: 0000 [#1] SMP PTI
Jun 17 13:09:22 jason kernel: CPU: 8 UID: 1000 PID: 1022 Comm: sway Tainted: P S         OE       6.15.2-arch1-1 #1 PREEMPT(full)  806378c57c3c21a60e39b7d20019ada706b7af8b
Jun 17 13:09:22 jason kernel: Tainted: [P]=PROPRIETARY_MODULE, [ S ]=CPU_OUT_OF_SPEC, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Jun 17 13:09:22 jason kernel: Hardware name: Dell Inc. XPS 15 9570/02MJVY, BIOS 1.27.0 08/12/2022
Jun 17 13:09:22 jason kernel: RIP: 0010:dma_fence_release+0x33/0x160
Jun 17 13:09:22 jason kernel: Code: 41 56 55 48 8d 6f c8 53 48 89 fb 48 83 ec 18 66 90 48 8b 43 d8 4c 8d 4b d8 49 39 c1 74 08 48 8b 43 f8 a8 01 74 62 48 8b 43 d0 <48> 8b 40 30 48 85 c0 0f 84 ec 00 00 00 48 83 c4 18 48 89 ef 5b 5d
Jun 17 13:09:22 jason kernel: RSP: 0018:ffffce47c04d7c50 EFLAGS: 00010202
Jun 17 13:09:22 jason kernel: RAX: ffffffffc1f891e0 RBX: ffff894af44ed998 RCX: ffff894bef175c40
Jun 17 13:09:22 jason kernel: RDX: 0000000000000001 RSI: ffff894bef175c60 RDI: ffff894af44ed998
Jun 17 13:09:22 jason kernel: RBP: ffff894af44ed960 R08: ffff894a4b6f4a20 R09: ffff894af44ed970
Jun 17 13:09:22 jason kernel: R10: ffffce47c04d7c80 R11: ffffce47c04d7c78 R12: ffff894a4bfa0000
Jun 17 13:09:22 jason kernel: R13: ffffffffb882c750 R14: 000000000000001c R15: 00000000000000c0
Jun 17 13:09:22 jason kernel: FS:  00007f26605cee40(0000) GS:ffff894df192f000(0000) knlGS:0000000000000000
Jun 17 13:09:22 jason kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 17 13:09:22 jason kernel: CR2: ffffffffc1f89210 CR3: 000000010c55a001 CR4: 00000000003726f0
Jun 17 13:09:22 jason kernel: Call Trace:
Jun 17 13:09:22 jason kernel:  <TASK>
Jun 17 13:09:22 jason kernel:  dma_fence_chain_release+0xcc/0x100
Jun 17 13:09:22 jason kernel:  drm_syncobj_destroy_ioctl+0x91/0xe0
Jun 17 13:09:22 jason kernel:  drm_ioctl_kernel+0xab/0x100
Jun 17 13:09:22 jason kernel:  drm_ioctl+0x2a0/0x530
Jun 17 13:09:22 jason kernel:  ? __pfx_drm_syncobj_destroy_ioctl+0x10/0x10
Jun 17 13:09:22 jason kernel:  __x64_sys_ioctl+0x94/0xc0
Jun 17 13:09:22 jason kernel:  do_syscall_64+0x7b/0x810
Jun 17 13:09:22 jason kernel:  ? __x64_sys_ioctl+0x56/0xc0
Jun 17 13:09:22 jason kernel:  ? syscall_exit_to_user_mode+0x37/0x1c0
Jun 17 13:09:22 jason kernel:  ? do_syscall_64+0x87/0x810
Jun 17 13:09:22 jason kernel:  ? do_syscall_64+0x87/0x810
Jun 17 13:09:22 jason kernel:  ? handle_mm_fault+0x1d2/0x2d0
Jun 17 13:09:22 jason kernel:  ? syscall_exit_to_user_mode+0x37/0x1c0
Jun 17 13:09:22 jason kernel:  ? do_syscall_64+0x87/0x810
Jun 17 13:09:22 jason kernel:  ? irqentry_exit_to_user_mode+0x2c/0x1b0
Jun 17 13:09:22 jason kernel:  entry_SYSCALL_64_after_hwframe+0x76/0x7e
Jun 17 13:09:22 jason kernel: RIP: 0033:0x7f2661237ecd
Jun 17 13:09:22 jason kernel: Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00
Jun 17 13:09:22 jason kernel: RSP: 002b:00007ffe58388010 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Jun 17 13:09:22 jason kernel: RAX: ffffffffffffffda RBX: 00007ffe583880c0 RCX: 00007f2661237ecd
Jun 17 13:09:22 jason kernel: RDX: 00007ffe583880c0 RSI: 00000000c00864c0 RDI: 000000000000000f
Jun 17 13:09:22 jason kernel: RBP: 00007ffe58388060 R08: 0000000000000010 R09: 0000000000000001
Jun 17 13:09:22 jason kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000f
Jun 17 13:09:22 jason kernel: R13: 0000000000000008 R14: 000056286e676840 R15: 000056286e68cb98
Jun 17 13:09:22 jason kernel:  </TASK>
Jun 17 13:09:22 jason kernel: Modules linked in: nvidia_drm(POE) nvidia_modeset(POE) drm_ttm_helper nvidia(POE) hid_sony ff_memless rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device uhid ccm algif_aead crypto_null des3_ede_x86_64 des_generic libdes algif_skcipher cmac md4 algif_hash af_alg bnep hid_multitouch snd_sof_pci_intel_cnl snd_sof_intel_hda_generic soundwire_intel snd_sof_intel_hda_sdw_bpt snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_intel_hda_mlink snd_sof_intel_hda snd_hda_codec_hdmi soundwire_cadence snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_soc_acpi_intel_match snd_soc_acpi_intel_sdca_quirks intel_uncore_frequency soundwire_generic_allocation dell_pc snd_soc_acpi intel_uncore_frequency_common platform_profile soundwire_bus snd_soc_sdca crc8 snd_soc_avs snd_soc_hda_codec snd_ctl_led snd_hda_ext_core snd_hda_codec_realtek x86_pkg_temp_thermal snd_soc_core snd_hda_codec_generic intel_powerclamp coretemp snd_hda_scodec_component snd_compress ac97_bus snd_pcm_dmaengine iwlmvm kvm_intel snd_hda_intel
Jun 17 13:09:22 jason kernel:  snd_intel_dspcfg snd_intel_sdw_acpi kvm mac80211 snd_hda_codec dell_laptop irqbypass snd_hda_core btusb iTCO_wdt dell_wmi libarc4 rapl intel_pmc_bxt btrtl snd_hwdep ee1004 ptp dell_smbios intel_cstate iTCO_vendor_support processor_thermal_device_pci_legacy pps_core mei_wdt mei_hdcp mei_pxp intel_rapl_msr dell_smm_hwmon iwlwifi btintel dcdbas dell_wmi_sysman processor_thermal_device intel_uncore snd_pcm psmouse spi_nor processor_thermal_wt_hint pcspkr btbcm i2c_i801 btmtk snd_timer processor_thermal_rfim cdc_acm wmi_bmof dell_wmi_descriptor firmware_attributes_class intel_wmi_thunderbolt mxm_wmi mtd cfg80211 bluetooth processor_thermal_rapl snd i2c_smbus ucsi_acpi mei_me i2c_mux soundcore intel_rapl_common intel_lpss_pci typec_ucsi intel_lpss mei processor_thermal_wt_req rfkill idma64 typec processor_thermal_power_floor processor_thermal_mbox intel_pch_thermal roles intel_soc_dts_iosf intel_pmc_core i2c_hid_acpi int3403_thermal i2c_hid pmt_telemetry int340x_thermal_zone dell_lis3lv02d pmt_class
Jun 17 13:09:22 jason kernel:  dell_smo8800 intel_vsec int3400_thermal pinctrl_cannonlake intel_hid acpi_thermal_rel acpi_pad sparse_keymap mousedev joydev ip6t_REJECT nf_reject_ipv6 mac_hid xt_hl ip6t_rt ipt_REJECT nf_reject_ipv4 xt_LOG nf_log_syslog nft_limit xt_limit xt_addrtype xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables pkcs8_key_parser i2c_dev sg crypto_user loop nfnetlink ip_tables x_tables dm_crypt encrypted_keys trusted asn1_encoder tee dm_mod hid_logitech_hidpp hid_logitech_dj i915 polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 rtsx_pci_sdmmc mmc_core sha256_ssse3 nvme sha1_ssse3 i2c_algo_bit drm_buddy aesni_intel nvme_core ttm crypto_simd intel_gtt cryptd spi_intel_pci nvme_keyring serio_raw drm_display_helper spi_intel rtsx_pci nvme_auth video cec wmi
Jun 17 13:09:22 jason kernel: Unloaded tainted modules: nvidia(POE):2 nvidia_uvm(POE):2 nvidia_modeset(POE):2 nvidia_drm(POE):2 [last unloaded: nvidia(POE)]
Jun 17 13:09:22 jason kernel: CR2: ffffffffc1f89210
Jun 17 13:09:22 jason kernel: ---[ end trace 0000000000000000 ]---
Jun 17 13:09:22 jason kernel: RIP: 0010:dma_fence_release+0x33/0x160
Jun 17 13:09:22 jason kernel: Code: 41 56 55 48 8d 6f c8 53 48 89 fb 48 83 ec 18 66 90 48 8b 43 d8 4c 8d 4b d8 49 39 c1 74 08 48 8b 43 f8 a8 01 74 62 48 8b 43 d0 <48> 8b 40 30 48 85 c0 0f 84 ec 00 00 00 48 83 c4 18 48 89 ef 5b 5d
Jun 17 13:09:22 jason kernel: RSP: 0018:ffffce47c04d7c50 EFLAGS: 00010202
Jun 17 13:09:22 jason kernel: RAX: ffffffffc1f891e0 RBX: ffff894af44ed998 RCX: ffff894bef175c40
Jun 17 13:09:22 jason kernel: RDX: 0000000000000001 RSI: ffff894bef175c60 RDI: ffff894af44ed998
Jun 17 13:09:22 jason kernel: RBP: ffff894af44ed960 R08: ffff894a4b6f4a20 R09: ffff894af44ed970
Jun 17 13:09:22 jason kernel: R10: ffffce47c04d7c80 R11: ffffce47c04d7c78 R12: ffff894a4bfa0000
Jun 17 13:09:22 jason kernel: R13: ffffffffb882c750 R14: 000000000000001c R15: 00000000000000c0
Jun 17 13:09:22 jason kernel: FS:  00007f26605cee40(0000) GS:ffff894df192f000(0000) knlGS:0000000000000000
Jun 17 13:09:22 jason kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 17 13:09:22 jason kernel: CR2: ffffffffc1f89210 CR3: 000000010c55a001 CR4: 00000000003726f0
Jun 17 13:09:22 jason kernel: note: sway[1022] exited with irqs disabled
Jun 17 13:09:42 jason kernel: sysrq: This sysrq operation is disabled.
Jun 17 13:09:42 jason kernel: sysrq: This sysrq operation is disabled.
Jun 17 13:09:43 jason kernel: sysrq: This sysrq operation is disabled.
Jun 17 13:09:44 jason kernel: sysrq: Emergency Sync
Jun 17 13:09:44 jason kernel: Emergency Sync complete
Jun 17 13:09:45 jason kernel: sysrq: This sysrq operation is disabled.
Jun 17 13:09:45 jason kernel: sysrq: This sysrq operation is disabled.
Jun 17 13:09:48 jason kernel: wlan0: deauthenticating from 48:d3:43:95:b5:ef by local choice (Reason: 3=DEAUTH_LEAVING)
Jun 17 13:09:49 jason kernel: EXT4-fs (nvme0n1p8): unmounting filesystem 1fc797e0-cf58-489d-a2f5-707a93e388c2.
Jun 17 13:09:50 jason systemd-shutdown[1]: Syncing filesystems and block devices.
Jun 17 13:09:50 jason systemd-shutdown[1]: Sending SIGTERM to remaining processes...
Jun 17 13:09:50 jason systemd-journald[400]: Received SIGTERM from PID 1 (systemd-shutdow).

To recover, I need to shut down the system with the power button (which thankfully triggers a clean shutdown).

This has happened twice since the update. On another occasion I got a kernel panic (no log, unfortunately; the Caps Lock LED was blinking) while running loginctl terminate-session to close Sway. That time I had just unloaded the nvidia module before running the loginctl command, as I have hybrid graphics and run Sway on the Intel iGPU.

Hardware:

Dell XPS 15 9570
NVIDIA GeForce GTX 1050 Ti
Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
Intel CoffeeLake-H GT2 UHD Graphics 630

I'm still looking for a reliable way to reproduce it, but as mentioned it has happened twice, both times after loading the nvidia module and running vkcube after resuming from suspend (S3).

Full upgrade log:

[2025-06-13T21:54:34+0100] [PACMAN] Running 'pacman --sync -y -u --'
[2025-06-13T21:54:34+0100] [PACMAN] synchronizing package lists
[2025-06-13T21:54:36+0100] [PACMAN] starting full system upgrade
[2025-06-13T21:55:32+0100] [ALPM] running '60-mkinitcpio-remove.hook'...
[2025-06-13T21:55:32+0100] [ALPM] running '71-dkms-remove.hook'...
[2025-06-13T21:55:33+0100] [ALPM] transaction started
[2025-06-13T21:55:33+0100] [ALPM] upgraded linux-api-headers (6.14-1 -> 6.15-1)
[2025-06-13T21:55:33+0100] [ALPM] upgraded protobuf (31.0-2 -> 31.1-1)
[2025-06-13T21:55:33+0100] [ALPM] upgraded sqlite (3.50.0-1 -> 3.50.1-1)
[2025-06-13T21:55:33+0100] [ALPM] upgraded libffi (3.4.8-1 -> 3.5.0-1)
[2025-06-13T21:55:33+0100] [ALPM] upgraded android-tools (35.0.2-16 -> 35.0.2-17)
[2025-06-13T21:55:33+0100] [ALPM] upgraded apparmor (4.1.0-4 -> 4.1.1-1)
[2025-06-13T21:55:33+0100] [ALPM] upgraded mpg123 (1.32.10-1 -> 1.33.0-1)
[2025-06-13T21:55:33+0100] [ALPM] upgraded gstreamer (1.26.2-1 -> 1.26.2-2)
[2025-06-13T21:55:33+0100] [ALPM] upgraded libdrm (2.4.124-1 -> 2.4.125-1)
[2025-06-13T21:55:34+0100] [ALPM] upgraded llvm-libs (19.1.7-2 -> 20.1.6-3)
[2025-06-13T21:55:36+0100] [ALPM] upgraded nvidia-utils (575.57.08-1 -> 575.57.08-3)
[2025-06-13T21:55:37+0100] [ALPM] upgraded mesa (1:25.1.2-1 -> 1:25.1.3-3)
[2025-06-13T21:55:37+0100] [ALPM] upgraded gst-plugins-base-libs (1.26.2-1 -> 1.26.2-2)
[2025-06-13T21:55:37+0100] [ALPM] upgraded xkeyboard-config (2.44-1 -> 2.45-1)
[2025-06-13T21:55:37+0100] [ALPM] upgraded gst-plugins-bad-libs (1.26.2-1 -> 1.26.2-2)
[2025-06-13T21:55:37+0100] [ALPM] upgraded wxwidgets-common (3.2.8.1-1 -> 3.2.8.1-2)
[2025-06-13T21:55:37+0100] [ALPM] upgraded pixman (0.46.0-1 -> 0.46.2-1)
[2025-06-13T21:55:37+0100] [ALPM] upgraded gtk-update-icon-cache (1:4.18.5-2 -> 1:4.18.6-1)
[2025-06-13T21:55:37+0100] [ALPM] upgraded libcloudproviders (0.3.6-1 -> 0.3.6-2)
[2025-06-13T21:55:37+0100] [ALPM] upgraded wxwidgets-gtk3 (3.2.8.1-1 -> 3.2.8.1-2)
[2025-06-13T21:55:37+0100] [ALPM] upgraded audacity (1:3.7.3-2 -> 1:3.7.4-1)
[2025-06-13T21:55:38+0100] [ALPM] upgraded chromium (137.0.7151.68-1 -> 137.0.7151.103-1)
[2025-06-13T21:55:39+0100] [ALPM] upgraded cmake (4.0.2-1 -> 4.0.3-1)
[2025-06-13T21:55:39+0100] [ALPM] upgraded cpupower (6.14-1 -> 6.15-1)
[2025-06-13T21:55:40+0100] [ALPM] upgraded docbook-xml (4.5-10 -> 4.5-11)
[2025-06-13T21:55:40+0100] [ALPM] upgraded x265 (4.0-1 -> 4.1-1)
[2025-06-13T21:55:40+0100] [ALPM] upgraded libbpf (1.5.0-1 -> 1.5.1-1)
[2025-06-13T21:55:40+0100] [ALPM] upgraded ffmpeg (2:7.1.1-3 -> 2:7.1.1-4)
[2025-06-13T21:55:40+0100] [ALPM] upgraded ffmpeg4.4 (4.4.5-5 -> 4.4.6-1)
[2025-06-13T21:55:41+0100] [ALPM] upgraded firefox (139.0.1-1 -> 139.0.4-1)
[2025-06-13T21:55:41+0100] [ALPM] upgraded firejail (0.9.74-1 -> 0.9.74-2)
[2025-06-13T21:55:41+0100] [ALPM] upgraded libheif (1.19.8-1 -> 1.19.8-3)
[2025-06-13T21:55:41+0100] [ALPM] upgraded openexr (3.3.3-1 -> 3.3.4-1)
[2025-06-13T21:55:41+0100] [ALPM] upgraded poppler (25.05.0-2 -> 25.06.0-1)
[2025-06-13T21:55:41+0100] [ALPM] upgraded poppler-glib (25.05.0-2 -> 25.06.0-1)
[2025-06-13T21:55:42+0100] [ALPM] upgraded gimp (3.0.4-2 -> 3.0.4-3)
[2025-06-13T21:55:43+0100] [ALPM] upgraded gst-libav (1.26.2-1 -> 1.26.2-2)
[2025-06-13T21:55:43+0100] [ALPM] upgraded gst-plugins-bad (1.26.2-1 -> 1.26.2-2)
[2025-06-13T21:55:43+0100] [ALPM] upgraded gst-plugins-base (1.26.2-1 -> 1.26.2-2)
[2025-06-13T21:55:43+0100] [ALPM] upgraded gst-plugins-good (1.26.2-1 -> 1.26.2-2)
[2025-06-13T21:55:43+0100] [ALPM] upgraded gst-plugins-ugly (1.26.2-1 -> 1.26.2-2)
[2025-06-13T21:55:43+0100] [ALPM] upgraded gst-python (1.26.2-1 -> 1.26.2-2)
[2025-06-13T21:55:43+0100] [ALPM] upgraded gtk4 (1:4.18.5-2 -> 1:4.18.6-1)
[2025-06-13T21:55:43+0100] [ALPM] upgraded gtk4-demos (1:4.18.5-2 -> 1:4.18.6-1)
[2025-06-13T21:55:43+0100] [ALPM] upgraded lib32-libffi (3.4.8-1 -> 3.5.0-1)
[2025-06-13T21:55:43+0100] [ALPM] upgraded libpcap (1.10.5-2 -> 1.10.5-3)
[2025-06-13T21:55:43+0100] [ALPM] upgraded lib32-libpcap (1.10.5-2 -> 1.10.5-3)
[2025-06-13T21:55:44+0100] [ALPM] upgraded lib32-llvm-libs (1:19.1.7-2 -> 1:20.1.6-1)
[2025-06-13T21:55:44+0100] [ALPM] upgraded lib32-mesa (1:25.1.2-1 -> 1:25.1.3-3)
[2025-06-13T21:55:44+0100] [ALPM] upgraded vulkan-intel (1:25.1.2-1 -> 1:25.1.3-3)
[2025-06-13T21:55:44+0100] [ALPM] upgraded lib32-vulkan-intel (1:25.1.2-1 -> 1:25.1.3-3)
[2025-06-13T21:55:44+0100] [ALPM] upgraded libcamera-ipa (0.5.0-2 -> 0.5.1-2)
[2025-06-13T21:55:44+0100] [ALPM] upgraded libcamera (0.5.0-2 -> 0.5.1-2)
[2025-06-13T21:55:45+0100] [ALPM] upgraded libclc (19.1.7-1 -> 20.1.6-1)
[2025-06-13T21:55:45+0100] [ALPM] upgraded libei (1.4.0-1 -> 1.4.1-1)
[2025-06-13T21:55:45+0100] [ALPM] upgraded libgit2 (1:1.9.0-2 -> 1:1.9.1-1)
[2025-06-13T21:55:45+0100] [ALPM] upgraded libngtcp2 (1.12.0-1 -> 1.13.0-1)
[2025-06-13T21:55:48+0100] [ALPM] upgraded libreoffice-fresh (25.2.4-1 -> 25.2.4-2)
[2025-06-13T21:55:50+0100] [ALPM] upgraded linux (6.14.10.arch1-1 -> 6.15.2.arch1-1)
[2025-06-13T21:55:50+0100] [ALPM] upgraded pahole (1:1.30-1 -> 1:1.30-2)
[2025-06-13T21:55:54+0100] [ALPM] upgraded linux-headers (6.14.10.arch1-1 -> 6.15.2.arch1-1)
[2025-06-13T21:55:54+0100] [ALPM] upgraded llvm (19.1.7-2 -> 20.1.6-3)
[2025-06-13T21:55:54+0100] [ALPM] upgraded meson (1.8.1-1 -> 1.8.2-1)
[2025-06-13T21:55:54+0100] [ALPM] upgraded mosh (1.4.0-23 -> 1.4.0-24)
[2025-06-13T21:55:54+0100] [ALPM] upgraded simdjson (1:3.12.3-1 -> 1:3.13.0-1)
[2025-06-13T21:55:55+0100] [ALPM] upgraded nodejs (24.1.0-1 -> 24.2.0-1)
[2025-06-13T21:55:55+0100] [ALPM] upgraded nvidia (575.57.08-2 -> 575.57.08-5)
[2025-06-13T21:55:55+0100] [ALPM] upgraded openmpi (5.0.7-5 -> 5.0.8-1)
[2025-06-13T21:55:55+0100] [ALPM] upgraded passt (2025_05_12.8ec1341-1 -> 2025_06_11.0293c6f-1)
[2025-06-13T21:55:55+0100] [ALPM] upgraded pnpm (10.11.1-1 -> 10.12.1-1)
[2025-06-13T21:55:55+0100] [ALPM] upgraded protobuf-c (1.5.2-3 -> 1.5.2-4)
[2025-06-13T21:55:55+0100] [ALPM] upgraded python-protobuf (31.0-2 -> 31.1-1)
[2025-06-13T21:55:55+0100] [ALPM] upgraded podman (5.5.0-2 -> 5.5.1-2)
[2025-06-13T21:55:56+0100] [ALPM] upgraded python-cryptography (45.0.3-1 -> 45.0.4-1)
[2025-06-13T21:55:56+0100] [ALPM] upgraded python-isodate (0.7.2-1 -> 0.7.2-2)
[2025-06-13T21:55:56+0100] [ALPM] upgraded python-pbs-installer (2025.05.29-1 -> 2025.06.10-1)
[2025-06-13T21:55:56+0100] [ALPM] upgraded python-pydantic (2.11.5-1 -> 2.11.6-1)
[2025-06-13T21:55:56+0100] [ALPM] upgraded python-pyopenssl (25.0.0-1 -> 25.1.0-1)
[2025-06-13T21:55:56+0100] [ALPM] upgraded python-requests (2.32.3-4 -> 2.32.4-1)
[2025-06-13T21:55:56+0100] [ALPM] upgraded python-typeguard (4.4.2-1 -> 4.4.3-1)
[2025-06-13T21:55:56+0100] [ALPM] upgraded qt5-base (5.15.17+kde+r122-1 -> 5.15.17+kde+r123-1)
[2025-06-13T21:55:56+0100] [ALPM] upgraded qt5-tools (5.15.17+kde+r3-1 -> 5.15.17+kde+r3-2)
[2025-06-13T21:55:56+0100] [ALPM] upgraded qt6-tools (6.9.1-1 -> 6.9.1-2)
[2025-06-13T21:55:56+0100] [ALPM] installed wlroots0.19 (0.19.0-1)
[2025-06-13T21:55:57+0100] [ALPM] upgraded sway (1:1.10.1-3 -> 1:1.11-1)
[2025-06-13T21:55:58+0100] [ALPM] upgraded thunderbird (139.0.1-1 -> 139.0.2-1)
[2025-06-13T21:55:58+0100] [ALPM] upgraded unrar (1:7.1.6-1 -> 1:7.1.7-1)
[2025-06-13T21:55:58+0100] [ALPM] upgraded x86_energy_perf_policy (6.14-1 -> 6.15-1)
[2025-06-13T21:55:58+0100] [ALPM] upgraded yt-dlp (2025.05.22-1 -> 2025.06.09-1)
[2025-06-13T21:55:58+0100] [ALPM] transaction completed
[2025-06-13T21:56:02+0100] [ALPM] running '20-systemd-sysusers.hook'...
[2025-06-13T21:56:02+0100] [ALPM] running '30-systemd-daemon-reload-system.hook'...
[2025-06-13T21:56:03+0100] [ALPM] running '30-systemd-daemon-reload-user.hook'...
[2025-06-13T21:56:03+0100] [ALPM] running '30-systemd-restart-marked.hook'...
[2025-06-13T21:56:03+0100] [ALPM] running '30-systemd-tmpfiles.hook'...
[2025-06-13T21:56:04+0100] [ALPM] running '30-systemd-udev-reload.hook'...
[2025-06-13T21:56:04+0100] [ALPM] running '30-systemd-update.hook'...
[2025-06-13T21:56:04+0100] [ALPM] running '30-update-mime-database.hook'...
[2025-06-13T21:56:05+0100] [ALPM] running '60-depmod.hook'...
[2025-06-13T21:56:08+0100] [ALPM] running '70-dkms-install.hook'...
[2025-06-13T21:56:08+0100] [ALPM] running '90-mkinitcpio-install.hook'...
[2025-06-13T21:56:08+0100] [ALPM-SCRIPTLET] ==> Building image from preset: /etc/mkinitcpio.d/linux.preset: 'default'
[2025-06-13T21:56:08+0100] [ALPM-SCRIPTLET] ==> Using default configuration file: '/etc/mkinitcpio.conf'
[2025-06-13T21:56:08+0100] [ALPM-SCRIPTLET]   -> -k /boot/vmlinuz-linux -g /boot/initramfs-linux.img
[2025-06-13T21:56:08+0100] [ALPM-SCRIPTLET] ==> Starting build: '6.15.2-arch1-1'
[2025-06-13T21:56:09+0100] [ALPM-SCRIPTLET]   -> Running build hook: [base]
[2025-06-13T21:56:09+0100] [ALPM-SCRIPTLET]   -> Running build hook: [udev]
[2025-06-13T21:56:09+0100] [ALPM-SCRIPTLET]   -> Running build hook: [autodetect]
[2025-06-13T21:56:09+0100] [ALPM-SCRIPTLET]   -> Running build hook: [microcode]
[2025-06-13T21:56:09+0100] [ALPM-SCRIPTLET]   -> Running build hook: [modconf]
[2025-06-13T21:56:09+0100] [ALPM-SCRIPTLET]   -> Running build hook: [kms]
[2025-06-13T21:56:11+0100] [ALPM-SCRIPTLET]   -> Running build hook: [keyboard]
[2025-06-13T21:56:11+0100] [ALPM-SCRIPTLET]   -> Running build hook: [keymap]
[2025-06-13T21:56:11+0100] [ALPM-SCRIPTLET]   -> Running build hook: [consolefont]
[2025-06-13T21:56:11+0100] [ALPM-SCRIPTLET] ==> WARNING: consolefont: no font found in configuration
[2025-06-13T21:56:11+0100] [ALPM-SCRIPTLET]   -> Running build hook: [block]
[2025-06-13T21:56:12+0100] [ALPM-SCRIPTLET]   -> Running build hook: [encrypt]
[2025-06-13T21:56:14+0100] [ALPM-SCRIPTLET]   -> Running build hook: [filesystems]
[2025-06-13T21:56:14+0100] [ALPM-SCRIPTLET]   -> Running build hook: [fsck]
[2025-06-13T21:56:14+0100] [ALPM-SCRIPTLET] ==> Generating module dependencies
[2025-06-13T21:56:14+0100] [ALPM-SCRIPTLET] ==> Creating zstd-compressed initcpio image: '/boot/initramfs-linux.img'
[2025-06-13T21:56:15+0100] [ALPM-SCRIPTLET]   -> Early uncompressed CPIO image generation successful
[2025-06-13T21:56:15+0100] [ALPM-SCRIPTLET] ==> Initcpio image generation successful
[2025-06-13T21:56:15+0100] [ALPM-SCRIPTLET] ==> Building image from preset: /etc/mkinitcpio.d/linux.preset: 'fallback'
[2025-06-13T21:56:15+0100] [ALPM-SCRIPTLET] ==> Using default configuration file: '/etc/mkinitcpio.conf'
[2025-06-13T21:56:15+0100] [ALPM-SCRIPTLET]   -> -k /boot/vmlinuz-linux -g /boot/initramfs-linux-fallback.img -S autodetect
[2025-06-13T21:56:15+0100] [ALPM-SCRIPTLET] ==> Starting build: '6.15.2-arch1-1'
[2025-06-13T21:56:15+0100] [ALPM-SCRIPTLET]   -> Running build hook: [base]
[2025-06-13T21:56:15+0100] [ALPM-SCRIPTLET]   -> Running build hook: [udev]
[2025-06-13T21:56:16+0100] [ALPM-SCRIPTLET]   -> Running build hook: [microcode]
[2025-06-13T21:56:16+0100] [ALPM-SCRIPTLET]   -> Running build hook: [modconf]
[2025-06-13T21:56:16+0100] [ALPM-SCRIPTLET]   -> Running build hook: [kms]
[2025-06-13T21:56:16+0100] [ALPM-SCRIPTLET] ==> WARNING: Possibly missing firmware for module: 'ast'
[2025-06-13T21:56:26+0100] [ALPM-SCRIPTLET]   -> Running build hook: [keyboard]
[2025-06-13T21:56:26+0100] [ALPM-SCRIPTLET] ==> WARNING: Possibly missing firmware for module: 'xhci_pci_renesas'
[2025-06-13T21:56:27+0100] [ALPM-SCRIPTLET]   -> Running build hook: [keymap]
[2025-06-13T21:56:27+0100] [ALPM-SCRIPTLET]   -> Running build hook: [consolefont]
[2025-06-13T21:56:27+0100] [ALPM-SCRIPTLET] ==> WARNING: consolefont: no font found in configuration
[2025-06-13T21:56:27+0100] [ALPM-SCRIPTLET]   -> Running build hook: [block]
[2025-06-13T21:56:28+0100] [ALPM-SCRIPTLET] ==> WARNING: Possibly missing firmware for module: 'qla1280'
[2025-06-13T21:56:28+0100] [ALPM-SCRIPTLET] ==> WARNING: Possibly missing firmware for module: 'bfa'
[2025-06-13T21:56:28+0100] [ALPM-SCRIPTLET] ==> WARNING: Possibly missing firmware for module: 'qla2xxx'
[2025-06-13T21:56:28+0100] [ALPM-SCRIPTLET] ==> WARNING: Possibly missing firmware for module: 'qed'
[2025-06-13T21:56:28+0100] [ALPM-SCRIPTLET] ==> WARNING: Possibly missing firmware for module: 'wd719x'
[2025-06-13T21:56:28+0100] [ALPM-SCRIPTLET] ==> WARNING: Possibly missing firmware for module: 'aic94xx'
[2025-06-13T21:56:32+0100] [ALPM-SCRIPTLET]   -> Running build hook: [encrypt]
[2025-06-13T21:56:33+0100] [ALPM-SCRIPTLET]   -> Running build hook: [filesystems]
[2025-06-13T21:56:34+0100] [ALPM-SCRIPTLET]   -> Running build hook: [fsck]
[2025-06-13T21:56:37+0100] [ALPM-SCRIPTLET] ==> Generating module dependencies
[2025-06-13T21:56:37+0100] [ALPM-SCRIPTLET] ==> Creating zstd-compressed initcpio image: '/boot/initramfs-linux-fallback.img'
[2025-06-13T21:56:39+0100] [ALPM-SCRIPTLET]   -> Early uncompressed CPIO image generation successful
[2025-06-13T21:56:39+0100] [ALPM-SCRIPTLET] ==> Initcpio image generation successful
[2025-06-13T21:56:39+0100] [ALPM] running 'dbus-reload.hook'...
[2025-06-13T21:56:40+0100] [ALPM] running 'detect-old-perl-modules.hook'...
[2025-06-13T21:56:40+0100] [ALPM] running 'firejail-permissions.hook'...
[2025-06-13T21:56:40+0100] [ALPM] running 'gdk-pixbuf-query-loaders.hook'...
[2025-06-13T21:56:40+0100] [ALPM] running 'glib-compile-schemas.hook'...
[2025-06-13T21:56:40+0100] [ALPM] running 'gtk-update-icon-cache.hook'...
[2025-06-13T21:56:41+0100] [ALPM] running 'gtk4-querymodules.hook'...
[2025-06-13T21:56:41+0100] [ALPM] running 'texinfo-install.hook'...
[2025-06-13T21:56:41+0100] [ALPM] running 'update-desktop-database.hook'...
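
The top of this post only quotes the kernel and nvidia upgrades, but the full log contains other graphics-related changes too. A minimal sketch for filtering a pacman log down to those packages follows; the package set in GRAPHICS_STACK is my own guess at what is relevant, not an official list, and the sample lines are copied from the log above.

```python
import re

# Matches lines such as:
# [2025-06-13T21:55:57+0100] [ALPM] upgraded sway (1:1.10.1-3 -> 1:1.11-1)
ALPM_CHANGE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[ALPM\] (?P<action>upgraded|installed) "
    r"(?P<pkg>\S+) \((?P<versions>[^)]*)\)$"
)

# Assumption: a hand-picked set of packages that touch the graphics stack.
GRAPHICS_STACK = {
    "linux", "nvidia", "nvidia-utils", "mesa", "libdrm", "sway", "wlroots0.19",
}


def graphics_changes(lines):
    """Return (package, versions) pairs for graphics-related ALPM changes."""
    changes = []
    for line in lines:
        match = ALPM_CHANGE.match(line.strip())
        if match and match.group("pkg") in GRAPHICS_STACK:
            changes.append((match.group("pkg"), match.group("versions")))
    return changes


# Sample lines copied from the upgrade log above.
SAMPLE = """\
[2025-06-13T21:55:33+0100] [ALPM] upgraded libdrm (2.4.124-1 -> 2.4.125-1)
[2025-06-13T21:55:41+0100] [ALPM] upgraded firefox (139.0.1-1 -> 139.0.4-1)
[2025-06-13T21:55:50+0100] [ALPM] upgraded linux (6.14.10.arch1-1 -> 6.15.2.arch1-1)
[2025-06-13T21:55:56+0100] [ALPM] installed wlroots0.19 (0.19.0-1)
[2025-06-13T21:55:57+0100] [ALPM] upgraded sway (1:1.10.1-3 -> 1:1.11-1)
""".splitlines()

for pkg, versions in graphics_changes(SAMPLE):
    print(f"{pkg}: {versions}")
```

To run it against the real log, pass the lines of /var/log/pacman.log instead of SAMPLE.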

Additional info:

- script to enable nvidia card and driver: https://gist.github.com/michelesr/dfa86 … 4912565b7f
- script to disable nvidia driver and card: https://gist.github.com/michelesr/6f67c … 8eeb7cb45a

I've been using these scripts for several years without issues: by hiding the nvidia card from the system, the "nvidia" module doesn't get loaded automatically, which would otherwise power on the card and spin up the fans (which are really annoying).

EDIT: turns out the problem is sway, see comment below.

Last edited by michelesr (2025-06-17 21:29:38)

#2 2025-06-17 13:20:36

michelesr
Member
Registered: 2016-02-04
Posts: 73

Re: Page fault on DMA fence release after sway update

I believe I've found reliable steps to reproduce it (and it has nothing to do with suspend/resume):

- run the enable script https://gist.github.com/michelesr/dfa86 … 4912565b7f
- run vkcube
- run the disable script https://gist.github.com/michelesr/6f67c … 8eeb7cb45a
- run vkcube

Expected result:

- vkcube runs on the iGPU

Actual result:

- page fault
- desktop freeze
- need to press the power button to halt the system
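
The four reproduction steps above can be sketched as a small driver. This is only an illustration: the two script paths are hypothetical placeholders for local copies of the gists, and the default is a dry run that just prints the commands, because a real run is expected to freeze the desktop.

```python
import subprocess

# Hypothetical local copies of the enable/disable gists; adjust to your paths.
ENABLE_SCRIPT = "./nvidia-enable.sh"
DISABLE_SCRIPT = "./nvidia-disable.sh"


def repro_steps(enable=ENABLE_SCRIPT, disable=DISABLE_SCRIPT):
    """Return the reproduction steps as a list of argv lists, in order."""
    return [[enable], ["vkcube"], [disable], ["vkcube"]]


def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in steps:
        print("+", " ".join(argv))
        if not dry_run:
            # vkcube blocks until its window is closed.
            subprocess.run(argv, check=True)


# Dry run only: shows the sequence without touching the GPU.
run_steps(repro_steps())
```

Calling run_steps(repro_steps(), dry_run=False) would actually execute the sequence, so save your work first.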

My impression is that the nvidia driver does something in the kernel that leaves it in a bad state even after the module is unloaded, so the next time the DMA fence release function is called, it fails.
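
The register dump in the first post can be checked against this impression with a little arithmetic. The sketch below assumes my reading of the marked instruction bytes (<48> 8b 40 30) as mov 0x30(%rax),%rax, and assumes the module mapping range documented for x86_64 with 4-level paging; neither is confirmed for this exact kernel build.

```python
# Values copied from the oops in the first post.
RAX = 0xFFFFFFFFC1F891E0  # pointer being dereferenced
RBX = 0xFFFF894AF44ED998  # a heap (direct-map) pointer, for comparison
CR2 = 0xFFFFFFFFC1F89210  # faulting address

# Assumption: module mapping space as listed in the kernel's x86_64
# memory-map documentation (4-level paging).
MODULES_START = 0xFFFFFFFFA0000000
MODULES_END = 0xFFFFFFFFFEFFFFFF


def in_module_space(addr):
    """True if addr lies in the assumed kernel module mapping range."""
    return MODULES_START <= addr <= MODULES_END


# The fault is a read at a fixed offset from RAX.
print(hex(CR2 - RAX))
print(in_module_space(RAX), in_module_space(RBX))
```

If the assumptions hold, the faulting read is at offset 0x30 from a pointer into module memory whose page is no longer present, which would be consistent with a fence still referring to data from a module that has since been unloaded.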

#3 2025-06-17 21:28:12

michelesr
Member
Registered: 2016-02-04
Posts: 73

Re: Page fault on DMA fence release after sway update

It looks like the issue is caused by this sway commit: https://github.com/swaywm/sway/commit/0 … aab63b740e, so it's not a kernel issue. Something must go wrong after the card is removed from the system.
