#1 2024-06-10 12:57:21

kelloco2
Member
Registered: 2012-02-13
Posts: 133

Kernel bug when copying files using mc to LUKS encrypted disk?

Hi! I'm using ArchLinux on my personal computers and home servers. Recently, I noticed that my NAS crashes completely when copying a large number of files (about 30GB, an Android backup) from a USB pendrive.
This issue is repeatable, so I disabled all additional features like ZRAM on the machine and began investigating it. At first, I suspected corrupted RAM or HDD, but both passed memtest86+ and SMART status checks.
To be thorough, I replaced the hard drive, but the same crash occurred. I then tried a different ArchLinux installation, but it also crashed.
Suspecting it might be related to my LVM on LUKS setup, I changed it to LUKS on LVM with typical config, but it still crashed. However, when I switched to an unencrypted EXT4 setup, the copying process was successful.

I began to think the issue might be related to the kernel or the Midnight Commander (mc) software when the disk is encrypted. When I used "cp -r" instead of mc to copy the files, it completed successfully.
Next, I decided to try the reverse: I kept mc and changed the kernel version. I installed the older and possibly more stable linux-lts available in the ArchLinux repository and tried copying the files using mc.
It completed successfully. This suggests that the issue is specific to the latest kernel version and can be triggered by Midnight Commander.

In summary, this issue occurs with the latest kernel 6.9 when copying a large number of files using mc to an LVM on LUKS or LUKS on LVM hard drive. What concerns me is that this error can cause a kernel panic and crash the entire system even when mc is run as a regular user (not as root). For now, I have switched to linux-lts to have a stable NAS.

Do you know how to fix this error or what steps I should take next? Should I report this bug on something like the ArchLinux or Linux kernel bug tracker? Maybe a bug has already been reported for this?
The message in dmesg often differs from one crash to the next.

[4 attached images]


[wto 28 maj 00:04:35 2024] ------------[ cut here ]------------
[wto 28 maj 00:04:35 2024] WARNING: CPU: 1 PID: 1230 at lib/iov_iter.c:467 copy_page_from_iter_atomic+0x24c/0x6e0
[wto 28 maj 00:04:35 2024] Modules linked in: exfat ccm vfat fat joydev mousedev rmi_smbus rmi_core iwlmvm mac80211 libarc4 ptp pps_core intel_soc_dts_thermal intel_soc_dts_iosf intel_powerclamp ledtrig_netdev snd_hda_codec_hdmi r8169 snd_ctl_led realtek psmouse spi_nor snd_hda_codec_realtek int3401_thermal processor_thermal_device processor_thermal_wt_hint processor_thermal_rfim snd_hda_codec_generic processor_thermal_rapl intel_rapl_msr intel_rapl_common uvcvideo snd_hda_scodec_component videobuf2_vmalloc mdio_devres uvc videobuf2_memops videobuf2_v4l2 coretemp videodev kvm_intel videobuf2_common mtd mc snd_hda_intel kvm snd_intel_dspcfg snd_intel_sdw_acpi libphy snd_hda_codec snd_hda_core snd_hwdep btusb snd_pcm snd_timer processor_thermal_wt_req processor_thermal_power_floor processor_thermal_mbox btmtk at24 intel_cstate mei_pxp mei_hdcp pcspkr spi_intel_platform spi_intel iTCO_wdt i2c_i801 intel_pmc_bxt iTCO_vendor_support i2c_smbus iwlwifi mei_txe cfg80211 ledtrig_audio int3400_thermal mei acpi_thermal_rel int340x_thermal_zone
[wto 28 maj 00:04:35 2024]  lpc_ich hci_uart btqca btrtl btintel think_lmi firmware_attributes_class btbcm wmi_bmof bluetooth thinkpad_acpi platform_profile snd ecdh_generic soundcore rfkill_gpio mac_hid rfkill pwm_lpss_platform pwm_lpss loop nfnetlink zram ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 xts dm_crypt crypto_simd cbc encrypted_keys trusted asn1_encoder tee dm_mod uas usb_storage crct10dif_pclmul crc32_pclmul crc32c_intel polyval_generic serio_raw gf128mul atkbd ghash_clmulni_intel libps2 cryptd vivaldi_fmap sha512_ssse3 sha256_ssse3 sha1_ssse3 xhci_pci xhci_pci_renesas i8042 serio i915 i2c_algo_bit drm_buddy video wmi ttm intel_gtt drm_display_helper cec
[wto 28 maj 00:04:35 2024] CPU: 1 PID: 1230 Comm: mc Tainted: G     U             6.9.2-zen1-1-zen #1 14404c7ad8f24e086d7631b72c6d01926c31bde3
[wto 28 maj 00:04:35 2024] Hardware name: LENOVO 20DAS0U900/Intel powered classmate PC, BIOS N15ET81W (1.41) 04/29/2021
[wto 28 maj 00:04:35 2024] RIP: 0010:copy_page_from_iter_atomic+0x24c/0x6e0
[wto 28 maj 00:04:35 2024] Code: 72 77 49 83 c3 10 48 85 db 74 71 45 31 c9 49 8b 43 08 48 89 c2 4c 29 ca 48 39 da 48 0f 47 d3 48 85 d2 75 8e 49 83 c3 10 eb e1 <0f> 0b 45 31 ed e9 cb fe ff ff 45 31 ed e9 ae fe ff ff e8 8d 91 87
[wto 28 maj 00:04:35 2024] RSP: 0018:ffffb7b0418ff9c8 EFLAGS: 00010246
[wto 28 maj 00:04:35 2024] RAX: 0000000000000000 RBX: 0000000000001000 RCX: ffffb7b0418ffaf7
[wto 28 maj 00:04:35 2024] RDX: 0000000000001000 RSI: 0000000000001000 RDI: fffff8e0c5dfd640
[wto 28 maj 00:04:35 2024] RBP: fffff8e0c5dfd640 R08: fffff8e0c5dfd640 R09: 0000000000000000
[wto 28 maj 00:04:35 2024] R10: 0000000000000000 R11: 0000000000000000 R12: ffffb7b0418ffaf7
[wto 28 maj 00:04:35 2024] R13: ffff98a7d6201098 R14: 0000000000000000 R15: ffffffffc0da2ca0
[wto 28 maj 00:04:35 2024] FS:  00007a4646f6b100(0000) GS:ffff98a7fbc80000(0000) knlGS:0000000000000000
[wto 28 maj 00:04:35 2024] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[wto 28 maj 00:04:35 2024] CR2: 00007d6bad165000 CR3: 00000001375ce000 CR4: 00000000001006f0
[wto 28 maj 00:04:35 2024] Call Trace:
[wto 28 maj 00:04:35 2024]  <TASK>
[wto 28 maj 00:04:35 2024]  ? copy_page_from_iter_atomic+0x24c/0x6e0
[wto 28 maj 00:04:35 2024]  ? __warn.cold+0x8e/0xf3
[wto 28 maj 00:04:35 2024]  ? copy_page_from_iter_atomic+0x24c/0x6e0
[wto 28 maj 00:04:35 2024]  ? report_bug+0xe7/0x200
[wto 28 maj 00:04:35 2024]  ? handle_bug+0x3c/0x80
[wto 28 maj 00:04:35 2024]  ? exc_invalid_op+0x19/0xc0
[wto 28 maj 00:04:35 2024]  ? asm_exc_invalid_op+0x1a/0x20
[wto 28 maj 00:04:35 2024]  ? copy_page_from_iter_atomic+0x24c/0x6e0
[wto 28 maj 00:04:35 2024]  ? ext4_da_write_begin+0x1a2/0x2f0 [ext4 80220ce8580633dce32b37b2270bbe5ecc582b32]
[wto 28 maj 00:04:35 2024]  ? ext4_da_write_end+0xae/0x370 [ext4 80220ce8580633dce32b37b2270bbe5ecc582b32]
[wto 28 maj 00:04:35 2024]  generic_perform_write+0xf1/0x230
[wto 28 maj 00:04:35 2024]  ext4_buffered_write_iter+0xa2/0x180 [ext4 80220ce8580633dce32b37b2270bbe5ecc582b32]
[wto 28 maj 00:04:35 2024]  vfs_write+0x2c6/0x4a0
[wto 28 maj 00:04:35 2024]  __x64_sys_write+0x72/0xf0
[wto 28 maj 00:04:35 2024]  do_syscall_64+0x83/0x190
[wto 28 maj 00:04:35 2024]  ? touch_atime+0xdf/0x320
[wto 28 maj 00:04:35 2024]  ? filemap_read+0x336/0x360
[wto 28 maj 00:04:35 2024]  ? vfs_read+0x2cf/0x460
[wto 28 maj 00:04:35 2024]  ? __rseq_handle_notify_resume+0x23f/0x4e0
[wto 28 maj 00:04:35 2024]  ? switch_fpu_return+0x4e/0xd0
[wto 28 maj 00:04:35 2024]  ? syscall_exit_to_user_mode+0x75/0x1f0
[wto 28 maj 00:04:35 2024]  ? do_syscall_64+0x8f/0x190
[wto 28 maj 00:04:35 2024]  ? syscall_exit_to_user_mode+0x75/0x1f0
[wto 28 maj 00:04:35 2024]  ? do_syscall_64+0x8f/0x190
[wto 28 maj 00:04:35 2024]  ? do_syscall_64+0x8f/0x190
[wto 28 maj 00:04:35 2024]  ? do_syscall_64+0x8f/0x190
[wto 28 maj 00:04:35 2024]  ? syscall_exit_to_user_mode+0x75/0x1f0
[wto 28 maj 00:04:35 2024]  ? do_syscall_64+0x8f/0x190
[wto 28 maj 00:04:35 2024]  ? irq_exit_rcu+0x53/0xc0
[wto 28 maj 00:04:35 2024]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[wto 28 maj 00:04:35 2024] RIP: 0033:0x7a46469cf504
[wto 28 maj 00:04:35 2024] Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d 45 0b 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89
[wto 28 maj 00:04:35 2024] RSP: 002b:00007ffe57926e88 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[wto 28 maj 00:04:35 2024] RAX: ffffffffffffffda RBX: 000062a4f28054b0 RCX: 00007a46469cf504
[wto 28 maj 00:04:35 2024] RDX: 0000000000020000 RSI: 000062a4f28054b0 RDI: 0000000000000014
[wto 28 maj 00:04:35 2024] RBP: 0000000000020000 R08: 00000000002801dd R09: 0000000000000000
[wto 28 maj 00:04:35 2024] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000014
[wto 28 maj 00:04:35 2024] R13: 0000000000020000 R14: 000062a4f28054b0 R15: 000062a4f27063a0
[wto 28 maj 00:04:35 2024]  </TASK>
[wto 28 maj 00:04:35 2024] ---[ end trace 0000000000000000 ]---
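
Since the dmesg message differs between crashes, it may help to compare only the headline lines of each saved log. The following is a minimal sketch, assuming the kernel messages were saved to a text file first (for example with `dmesg -T > dmesg.txt`). The function name `kernel_headlines` is made up for illustration and is not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: list the headline line of every kernel warning, BUG,
# Oops or panic found in a saved kernel log, with a count per distinct message.
# $1: path to a text file holding kernel messages (e.g. output of `dmesg -T`).
kernel_headlines() {
    # Keep only headline lines, strip the leading "[timestamp]" so identical
    # messages group together, then count and sort by frequency.
    grep -E 'WARNING: |BUG: |Oops: |Kernel panic' "$1" \
        | sed -E 's/^\[[^]]*\] *//' \
        | sort | uniq -c | sort -rn
}
```

Running `kernel_headlines dmesg.txt` on logs from several crashes would show whether the warning at `lib/iov_iter.c:467` is always the first one or whether other messages come before it.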

Best Regards

#2 2024-06-10 14:20:13

loqs
Member
Registered: 2014-03-06
Posts: 18,872

Re: Kernel bug when copying files using mc to LUKS encrypted disk?

Please post the full system journal or at least all the kernel messages from a boot with the issue.  Would you be willing to bisect the issue to determine the cause?
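
For anyone following along, both requests can be sketched roughly as below. This assumes the journal is stored persistently so that messages from the crashed boot survive the reboot, and that a Linux kernel git checkout is available for bisecting. Neither is stated in the thread, and the function names are made up for illustration.

```shell
#!/bin/sh
# Save the kernel messages of one boot from the systemd journal.
# $1: boot offset as understood by `journalctl -b`
#     (0 = current boot, -1 = previous boot)
# $2: output file
save_kernel_log() {
    journalctl -k -b "$1" --no-pager > "$2"
}

# Start a bisect inside a kernel git checkout.
# $1: last known good revision, $2: first known bad revision.
# `git bisect start` takes the bad revision first, then the good one.
start_kernel_bisect() {
    git bisect start "$2" "$1"
}
```

After a crash and reboot, `save_kernel_log -1 crash-boot.txt` would capture the previous boot. A bisect could begin with something like `start_kernel_bisect v6.6 v6.9`, where both tags are placeholders because the thread does not say which linux-lts version worked. Each step is then built, booted, tested with the mc copy, and marked with `git bisect good` or `git bisect bad`.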

#3 2024-06-10 17:32:30

xerxes_
Member
Registered: 2018-04-29
Posts: 1,056

Re: Kernel bug when copying files using mc to LUKS encrypted disk?

The current kernel version is 6.9.3. Update to the newest version if you want to report a bug. Maybe the bug has already been fixed?
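
A quick way to check whether the running kernel is already at that version is sketched below. It assumes GNU `sort` with version ordering (`-V`); the function name `version_at_least` is made up for illustration.

```shell
#!/bin/sh
# Succeeds if version $1 is the same as or newer than version $2,
# using the version ordering of GNU `sort -V`.
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}
```

For example, `version_at_least "$(uname -r)" 6.9.3 && echo "kernel is 6.9.3 or newer"` compares the running kernel release against 6.9.3. The 6.9.2-zen1-1-zen kernel shown in the trace would not pass this check.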

#4 2024-06-10 23:35:24

kelloco2
Member
Registered: 2012-02-13
Posts: 133

Re: Kernel bug when copying files using mc to LUKS encrypted disk?

@loqs @xerxes_

Thanks for your replies. I will try to post more complete logs within a few days. First, I will check whether I can reproduce it on another machine (so as not to break the NAS, which holds important data). If I can't, I will update the NAS to the latest kernel and check there. No risk, no fun.
