
#1 Yesterday 13:28:07

Heretic12
Member
Registered: Yesterday
Posts: 7

Constant Input/Output Error

So, a bit of background

I'm new to Arch and Linux in general, so I may have done something stupid... Anyway, my setup goes like this: I have 2 NVMe SSDs (identical, Samsung EVO 970 1TB) on my motherboard plus 2 SATA SSDs. The first NVMe has physical EFI and boot partitions, and the rest of the drive is a partition holding an LVM physical volume (PV). The second NVMe is one big partition with a PV. Those 2 PVs are in the same volume group, which in turn is divided into lv_root and lv_home for the / and /home directories. For the desktop I use KDE Plasma on Wayland, and I recently installed Hyprland to tinker with it.

Pretext

So the problem started about a week ago. Around that time I did a few things:

  • first and foremost I did pacman -Syu;

  • also connected two more drives (HDDs) to my system;

  • installed KZones for KWin (and also was playing with plasmoids);

  • installed Ratbag/Piper and configured my Logitech mouse.

The problem itself

So it all started with Stellaris. I started getting random crashes with GLib-GObject-CRITICAL **: time: g_object_unref: assertion 'G_IS_OBJECT (object)' failed, after which the whole system collapsed: any attempt to run any command ended with Input/Output error, and all apps that were running during the crash became unresponsive, sometimes showing just a black screen with a cursor. Then it got worse. First, launching Brave browser also started to cause this collapse, and as of yesterday even launching KDE ends up with just a black screen and a cursor. The problem is also present in Hyprland and X11 Plasma sessions. The only way out of this state is to reboot the PC with the button, which also somehow corrupts my lv_home and makes me run fsck from a bootable ISO every time.

Things I've already tried:
  • removed all plasmoids and KWin scripts

  • removed the recently added HDDs

  • reinstalled all the packages with pacman -S $(pacman -Qnq) - which includes glib

  • checked my NVMes with SMART and with the Gigabyte motherboard tool - no errors

  • checked lv_home with time -p dd if=/dev/[My VG]/lv_home of=/dev/null bs=4M - nothing

  • added the kernel parameter nvme_core.default_ps_max_latency_us=0, which probably made things even worse, although it was already very bad, so that might be just my impression

So the question is - WTF?? What could it possibly be? And are there any fixes?

Last edited by Heretic12 (Yesterday 13:32:35)


#2 Yesterday 14:12:05

seth
Member
Registered: 2012-09-03
Posts: 58,170

Re: Constant Input/Output Error

Start by posting a system journal that covers at least some of the issues, eg.

sudo journalctl -b -1 | curl -F 'file=@-' 0x0.st

for the previous ("-1") boot.
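Since a full journal is long, it may help to narrow it down to storage-related lines before reading it. The following is only a sketch: the function name storage_errors and the pattern list are assumptions made up for illustration, not an exhaustive set of kernel messages.

```shell
# Hypothetical helper: keep only lines that look like storage trouble.
# Reads journal text on stdin, so it works on a pipe or a saved file.
storage_errors() {
    grep -E -i 'nvme|EXT4-fs (error|warning)|I/O error|Input/output error'
}
```

It could be used as sudo journalctl -b -1 | storage_errors, or fed a downloaded journal with storage_errors < journal.txt.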
Avoid rebooting w/ the power button, set up and use https://wiki.archlinux.org/title/Keyboa … el_(SysRq) instead.
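For reference, whether a given SysRq function is permitted depends on the value in /proc/sys/kernel/sysrq: 0 disables everything, 1 enables everything, and any other value is a bitmask (16 = sync, 32 = remount read-only, 128 = reboot/poweroff). A minimal sketch for checking it follows; the name sysrq_allows is made up for illustration, and the optional second argument exists only so the logic can be tried without reading the real system value.

```shell
# Hypothetical helper: succeed if the SysRq function with the given bit is enabled.
# $1 = bit to test (e.g. 16 sync, 32 remount read-only, 128 reboot/poweroff)
# $2 = mask to test against (assumption: defaults to the running kernel's value)
sysrq_allows() {
    bit="$1"
    mask="${2:-$(cat /proc/sys/kernel/sysrq)}"
    # 1 means "all functions enabled"; otherwise test the individual bit.
    [ "$mask" -eq 1 ] || [ $(( mask & bit )) -ne 0 ]
}
```

For example, sysrq_allows 128 || echo "reboot via SysRq is not enabled". If it is not, sudo sysctl kernel.sysrq=1 enables all functions until the next boot.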

"nvme_core.default_ps_max_latency_us=0" and "iommu=soft" are strong contenders, but rn. it's not even remotely clear what your problem is, beyond some vague "Input/Output error … crash … unresponsive… black screen" (the latter likely being the compositor).


#3 Yesterday 14:30:38

Heretic12
Member
Registered: Yesterday
Posts: 7

Re: Constant Input/Output Error

Hi, thanks for the reply, and for SysRq advice!

So the journal for the previous boot is here - https://0x0.st/XDZo.txt

UPD
I think I'll launch Stellaris right now to trigger that error once again, then reboot to capture it in its full glory

Last edited by Heretic12 (Yesterday 14:37:00)


#4 Yesterday 14:38:56

seth
Member
Registered: 2012-09-03
Posts: 58,170

Re: Constant Input/Output Error

That journal wasn't sync'd to disk after the root switch (initramfs phase)

Because of the SSDs, https://wiki.archlinux.org/title/Power_ … Management has been more often an issue because several drives were over-optimistically "upgraded" to med_power_with_dipm - check your value and set it to max_performance (or eventually medium_power, but check max_performance first)
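Checking the current value could look roughly like this. It is a sketch that assumes the usual sysfs location /sys/class/scsi_host/host*/link_power_management_policy; the function name show_lpm_policy is hypothetical, and the base directory is a parameter only so it can be tried on a copy.

```shell
# Hypothetical helper: print the link power management policy of every SCSI/SATA host.
# $1 = sysfs base directory (assumption: defaults to /sys/class/scsi_host)
show_lpm_policy() {
    base="${1:-/sys/class/scsi_host}"
    for f in "$base"/host*/link_power_management_policy; do
        # Hosts without this attribute (or an unmatched glob) are skipped.
        [ -r "$f" ] || continue
        printf '%s: %s\n' "$(basename "$(dirname "$f")")" "$(cat "$f")"
    done
}
```

Run without arguments it reads the live values. For a quick, non-persistent test the policy can be written as root, e.g. echo max_performance | sudo tee /sys/class/scsi_host/host*/link_power_management_policy.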


#5 Yesterday 14:54:13

Heretic12
Member
Registered: Yesterday
Posts: 7

Re: Constant Input/Output Error

So the problem was with SATA drives? And should I change that value for all hosts?


#6 Yesterday 14:56:09

seth
Member
Registered: 2012-09-03
Posts: 58,170

Re: Constant Input/Output Error

I do not know what the problem was/is - it's a guess based on "things that were problems for other people"
The journal doesn't record any issues because that boot ended w/ a hard reset.


#7 Yesterday 16:50:12

Heretic12
Member
Registered: Yesterday
Posts: 7

Re: Constant Input/Output Error

OK, I had to battle with my system for a bit, so here's the log:

The system started fine, and I caught the Input/Output error with Stellaris. Btw, Steam now refuses to even launch, so I used the .sh file. Then, after rebooting with SysRq, I ran into the system being unable to mount drives, so I had to hard reboot 3 times and fixed the issue with fsck /dev/Array/lv_home from a bootable USB. So here are the journals for the last 5 boots. The 4th is most likely the one that contains the boot with the I/O error - at least it reads so, although I'm not experienced enough yet to decipher its content.

UPD
It seems that even though I ran fsck, there are still a lot of corrupted inodes. And, even more interesting, there are corrupted inodes in /boot/efi. At least that was my understanding of those logs...

Last edited by Heretic12 (Yesterday 17:22:34)


#8 Yesterday 20:33:44

seth
Member
Registered: 2012-09-03
Posts: 58,170

Re: Constant Input/Output Error

4 of them end early

Nov 06 17:16:31 archlinux systemd-journald[469]: Time spent on flushing to /var/log/journal/689dd85299b14c96aac4b7be644764be is 4.566ms for 1091 entries.

https://0x0.st/XDNY.txt contains some actual runtime journal.


Nov 06 17:18:01 archlinux mount[787]: WARNING: blksize option is ignored because ntfs-3g must calculate it.
Nov 06 17:18:01 archlinux mount[791]: WARNING: blksize option is ignored because ntfs-3g must calculate it.
Nov 06 17:18:01 archlinux kernel: EXT4-fs (nvme0n1p2): recovery complete
Nov 06 17:18:01 archlinux kernel: EXT4-fs (nvme0n1p2): mounted filesystem a8d527f2-71bc-4760-a7ea-b01c99d9ee5c r/w with ordered data mode. Quota mode: none.
Nov 06 17:18:01 archlinux systemd[1]: Mounted /boot.
Nov 06 17:18:01 archlinux systemd[1]: Mounting /boot/efi...
Nov 06 17:18:01 archlinux kernel: EXT4-fs error (device dm-1): ext4_orphan_get:1421: comm mount: bad orphan inode 88891646
Nov 06 17:18:01 archlinux kernel: ext4_test_bit(bit=253, block=355467283) = 0

There's an fsck because of the unclean previous shutdown - and an ntfs partition.

==> IS THERE A PARALLEL WINDOWS INSTALLATION?
=> 3rd link below. Mandatory.
Disable it (it's NOT the BIOS setting!) and reboot Windows and Linux twice for voodoo reasons.

Later on

Nov 06 17:18:05 archlinux kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy:1217: group 10945, block bitmap and bg descriptor inconsistent: 313 vs 307 free clusters
Nov 06 17:18:05 archlinux kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy:1217: group 10929, block bitmap and bg descriptor inconsistent: 8828 vs 8773 free clusters

there're still some ext4 inconsistencies.

Nov 06 17:18:20 192.168.1.17 kernel: EXT4-fs error (device dm-1): ext4_lookup:1815: inode #89269276: comm kwin_wayland: deleted inode referenced: 89275064
Nov 06 17:18:21 192.168.1.17 kernel: usb 1-6: reset high-speed USB device number 5 using xhci_hcd
Nov 06 17:18:21 192.168.1.17 kcminit_startup[1179]: Initializing  "/usr/lib/qt6/plugins/plasma/kcms/systemsettings/kcm_mouse.so"
Nov 06 17:18:21 192.168.1.17 kcminit_startup[1179]: Initializing  "/usr/lib/qt6/plugins/plasma/kcms/systemsettings/kcm_style.so"
Nov 06 17:18:21 192.168.1.17 kernel: EXT4-fs error (device dm-1): ext4_lookup:1815: inode #89269276: comm ksplashqml: deleted inode referenced: 89275064
Nov 06 17:18:21 192.168.1.17 kernel: EXT4-fs error (device dm-1): ext4_lookup:1815: inode #89269276: comm Xwayland: deleted inode referenced: 89275064

and more

And finally =======================================================

Nov 06 17:19:20 192.168.1.17 kernel: nvme nvme1: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x10
Nov 06 17:19:20 192.168.1.17 kernel: nvme nvme1: Does your device have a faulty power saving mode enabled?
Nov 06 17:19:20 192.168.1.17 kernel: nvme nvme1: Try "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off" and report a bug
Nov 06 17:19:20 192.168.1.17 udisksd[1184]: Error probing device: NVMe Identify Controller command error: Interrupted system call (g-bd-nvme-error-quark, 1)
Nov 06 17:19:20 192.168.1.17 kernel: nvme 0000:03:00.0: enabling device (0000 -> 0002)
Nov 06 17:19:20 192.168.1.17 kernel: nvme nvme1: Disabling device after reset failure: -19
Nov 06 17:19:20 192.168.1.17 udisksd[1184]: Error probing device: NVMe Identify Namespace command error: Input/output error (g-bd-nvme-error-quark, 1)
Nov 06 17:19:21 192.168.1.17 kernel: nvme nvme1: Identify namespace failed (-5)

You already have "nvme_core.default_ps_max_latency_us=0" so add the others and also "iommu=soft" and run a complete fsck.
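Whether the parameters actually reached the running kernel can be checked against /proc/cmdline. A small sketch, assuming the four parameters named in this thread; missing_params is a hypothetical helper name, and the optional file argument is only there so it can be tried on a sample file.

```shell
# Hypothetical helper: report which of the suggested parameters are absent.
# $1 = file holding a kernel command line (assumption: defaults to /proc/cmdline)
missing_params() {
    cmdline_file="${1:-/proc/cmdline}"
    for p in nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off iommu=soft; do
        # One parameter per line, then require an exact whole-line match,
        # so a misspelled parameter name counts as missing.
        tr ' ' '\n' < "$cmdline_file" | grep -q -x -F -- "$p" || printf 'missing: %s\n' "$p"
    done
}
```

No output means all four are present on the command line that was checked.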


#9 Yesterday 21:24:45

Heretic12
Member
Registered: Yesterday
Posts: 7

Re: Constant Input/Output Error

==> IS THERE A PARALLEL WINDOWS INSTALLATION?
No, I have 2 SATA SSDs that contain files from my previous setup which was Windows, and I've decided to leave them on NTFS and not to do any additional manipulations. Is it OK to have NTFS partitions in my system? Because at least one of them needs to be available to Windows and Mac as it is a data storage and could be used in different scenarios

So, I've created a file /etc/udev/rules.d/hd_power_save.rules with

ACTION=="add", SUBSYSTEM=="scsi_host", KERNEL=="host*", ATTR{link_power_management_policy}="max_performance"

And also added "iommu=soft" to both GRUB_CMDLINE_LINUX_DEFAULT and GRUB_CMDLINE_LINUX

Now going to reboot and do a complete fsck from a bootable usb. Hope it will fix the issue and I didn't miss anything

Last edited by Heretic12 (Yesterday 21:27:59)


#10 Yesterday 21:34:29

seth
Member
Registered: 2012-09-03
Posts: 58,170

Re: Constant Input/Output Error

And also added "iommu=soft" to both GRUB_CMDLINE_LINUX_DEFAULT and GRUB_CMDLINE_LINUX

1. editing /etc/default/grub doesn't do anything, you also have to run grub-mkconfig
2. don't forget "pcie_aspm=off pcie_port_pm=off"
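A rough way to notice a forgotten grub-mkconfig run is to compare modification times. This is only a sketch: grub_cfg_stale is a made-up name, /boot/grub/grub.cfg is assumed to be the output path (the usual one on Arch, but installs differ), and the -nt test is not strictly POSIX although bash and dash support it.

```shell
# Hypothetical helper: succeed if the defaults file is newer than the generated config.
# $1 = defaults file (assumption: /etc/default/grub)
# $2 = generated config (assumption: /boot/grub/grub.cfg)
grub_cfg_stale() {
    defaults="${1:-/etc/default/grub}"
    cfg="${2:-/boot/grub/grub.cfg}"
    [ "$defaults" -nt "$cfg" ]
}
```

For example: grub_cfg_stale && sudo grub-mkconfig -o /boot/grub/grub.cfg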


#11 Yesterday 21:43:45

Heretic12
Member
Registered: Yesterday
Posts: 7

Re: Constant Input/Output Error

Good thing I waited for your reply before rebooting! I know about generating the config and I did it, but I totally forgot "pcie_aspm=off pcie_port_pm=off". So if this solves the issue, would it mean that power management was to blame all along? And btw, would this affect graphics card since it adjusts all pcie?
So the total list of added kernel parameters would be:

GRUB_CMDLINE_LINUX_DEFAULT="loglevel=3 quiet nvidia-drm.modeset=1 nvme_core.default_ps_max_latency_us=0 iommu=soft pcie_aspm=off pcie_port_pm=off"
GRUB_CMDLINE_LINUX="nvidia-drm.modeset=1 nvme_core.default_ps_max_latency_us=0 iommu=soft pcie_aspm=off pcie_port_pm=off"
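A side note on the two variables: grub-mkconfig adds GRUB_CMDLINE_LINUX to every generated entry and GRUB_CMDLINE_LINUX_DEFAULT only to the normal (non-recovery) ones, so parameters listed in both end up on the command line twice. That should be harmless, but it is easy to spot after a reboot. A sketch, with dup_params as a hypothetical helper name and the file argument there only for testing:

```shell
# Hypothetical helper: list parameters that occur more than once.
# $1 = file holding a kernel command line (assumption: defaults to /proc/cmdline)
dup_params() {
    # Split on spaces (squeezing repeats), then print each repeated entry once.
    tr -s ' ' '\n' < "${1:-/proc/cmdline}" | sort | uniq -d
}
```

Empty output means nothing is repeated.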

Last edited by Heretic12 (Yesterday 21:45:39)


#12 Yesterday 21:57:38

seth
Member
Registered: 2012-09-03
Posts: 58,170

Re: Constant Input/Output Error

And btw, would this affect graphics card since it adjusts all pcie?

"pcie_aspm=off" prevents ASPM on the entire bus, but since you've only one GPU, it will likely not have any effect on that.

Also nb. that the general strategy is to first see whether you can stabilize the system at all.
If yes, you'd try to narrow down on the critical parameters to limit the impact as much as possible.


#13 Yesterday 23:10:50

Heretic12
Member
Registered: Yesterday
Posts: 7

Re: Constant Input/Output Error

Unfortunately it didn't help. Well... at least Stellaris managed to work a few minutes before the crash. Again, journal entries:

  • https://0x0.st/XDAs.txt - the boot where I managed to play a few minutes of Stellaris

  • https://0x0.st/XDT7.txt - and here I couldn't even enter the Plasma session after the reboot from the previous crash - it just hung, with the greeting animation completely frozen and no way to switch to a TTY

Last edited by Heretic12 (Today 01:57:35)
