[SOLVED] Linux 6.4.4 introduces high IO-wait CPU usage on one core
Hello, I just updated to Linux 6.4.4 and noticed something very weird: one core of my CPU is stuck at 100% in IO-wait.
I tested this on my PC running Xorg with KDE Plasma, and on my ThinkPad running Wayland with KDE Plasma.
Both of them use Btrfs file systems with ZSTD compression enabled and the nodiscard mount option (to avoid constant discard/TRIM traffic to the SSD).
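For reference, the relevant fstab line looks roughly like this (the UUID is just a placeholder, not copied from my machines):
# Btrfs root with zstd compression and discard disabled
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /  btrfs  rw,compress=zstd,nodiscard  0 0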
htop: (screenshot not included; it showed one core pegged at 100% in IO-wait)
top:
%Cpu(s): 10.6 us, 1.1 sy, 0.0 ni, 65.0 id, 22.8 wa, 0.3 hi, 0.2 si, 0.0 st
The 22.8 wa figure is the IO-wait percentage.
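(The same counter top reads is the fifth field of the cpu lines in /proc/stat, if you want to look at it directly:)
# fields after the cpu label: user nice system idle iowait irq softirq steal guest guest_nice
head -2 /proc/stat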
To avoid this, stay on Linux 6.4.3 or the LTS kernel.
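On Arch that could look roughly like this (adjust to your bootloader setup; IgnorePkg lives in /etc/pacman.conf):
# switch to the LTS kernel
sudo pacman -S linux-lts linux-lts-headers
# and/or hold back the regular kernel for now by adding to /etc/pacman.conf:
# IgnorePkg = linux linux-headers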
Last edited by Fijxu (2023-08-04 06:10:41)
I have this behaviour on ext4 as well:
12:07:36 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
12:07:36 all 4.02 0.00 1.04 12.01 0.25 0.18 0.00 0.00 0.00 82.51
12:07:36 0 4.17 0.00 1.07 0.21 0.20 0.14 0.00 0.00 0.00 94.21
12:07:36 1 3.79 0.00 1.02 0.20 0.22 0.17 0.00 0.00 0.00 94.61
12:07:36 2 4.56 0.00 1.05 0.19 0.20 0.15 0.00 0.00 0.00 93.85
12:07:36 3 4.12 0.00 1.08 0.24 0.19 0.10 0.00 0.00 0.00 94.27
12:07:36 4 4.07 0.00 1.00 94.61 0.18 0.09 0.00 0.00 0.00 0.05
12:07:36 5 4.23 0.00 1.04 0.20 0.34 0.56 0.00 0.00 0.00 93.63
12:07:36 6 3.19 0.00 1.01 0.21 0.44 0.09 0.00 0.00 0.00 95.07
12:07:36 7 4.02 0.00 1.02 0.18 0.21 0.13 0.00 0.00 0.00 94.43
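(That per-CPU breakdown is mpstat output from the sysstat package; something roughly like the following reproduces it, the 5-second interval and single count being arbitrary:)
mpstat -P ALL 5 1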
Last edited by svp (2023-07-20 09:50:00)
Please consider bisecting the regression and reporting the causal commit upstream. Let me know if you need any help with the bisection.
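The rough shape of that bisection, using the stable tree, would be something like:
git clone https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
cd linux
git bisect start
git bisect bad v6.4.4
git bisect good v6.4.3
# build and test the revision git checks out, then mark it:
#   git bisect good    (or: git bisect bad)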
After some hours of compiling linux-mainline and modifying files with no luck, I just started killing processes to see which one was causing the high CPU usage.
That's how I found the process generating all the IO-wait on the CPU: MariaDB (11.0.2).
I use Kalendar as my calendar program since I use KDE Plasma, and it relies on Akonadi, which stores its data in MariaDB.
I tried mainline 6.5-rc1 and 6.5-rc2; both of them also have this high IO-wait CPU usage, and as I said, 6.4.3 and earlier don't have this problem.
So is this problem related to MariaDB, or to the >=6.4.4 kernels?
EDIT: Reinstalling Kalendar from scratch and deleting all the database and Akonadi-related folders produces the same problem.
Last edited by Fijxu (2023-07-20 19:15:12)
As a kernel update triggered the issue, I would be inclined to think the kernel is the cause of the problem. Does the kernel linked below, which is the midpoint between 6.4.3 and 6.4.4, have the issue?
$ git bisect start
status: waiting for both good and bad commits
$ git bisect bad v6.4.4
status: waiting for good commit(s), bad commit known
$ git bisect good v6.4.3
Bisecting: 400 revisions left to test after this (roughly 9 steps)
[ca47d0dc00968358c136a1847cfed550cedfd1b5] drm/msm/dp: Free resources after unregistering them
https://drive.google.com/file/d/1Oscu7E … sp=sharing linux-6.4.3.r400.gca47d0dc0096-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1YoOcAU … sp=sharing linux-headers-6.4.3.r400.gca47d0dc0096-1-x86_64.pkg.tar.zst
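If anyone prefers to build the bisect checkouts themselves instead of using these packages, a rough outline (assuming the running kernel exposes its config via /proc/config.gz; creating the bootloader entry is left out):
zcat /proc/config.gz > .config
make olddefconfig
make -j"$(nproc)"
sudo make modules_install
sudo make install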
Does the kernel linked below which is the midpoint between 6.4.3 and 6.4.4 have the issue?
No, there are no IO-wait operations on the CPU, just normal activity as usual.
$ git bisect good
Bisecting: 200 revisions left to test after this (roughly 8 steps)
[7a34dc6dc1d04510fe0be2746548981d59492cdd] power: supply: rt9467: Make charger-enable control as logic level
https://drive.google.com/file/d/1RGvyX0 … sp=sharing linux-6.4.3.r600.g7a34dc6dc1d0-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1nnaRo2 … sp=sharing linux-headers-6.4.3.r600.g7a34dc6dc1d0-1-x86_64.pkg.tar.zst
linux-6.4.3.r600.g7a34dc6dc1d0-1-x86_64.pkg.tar.zst
linux-headers-6.4.3.r600.g7a34dc6dc1d0-1-x86_64.pkg.tar.zst
Those don't have the problem either; CPU usage is as usual and MariaDB is running without problems...
$ git bisect good
Bisecting: 100 revisions left to test after this (roughly 7 steps)
[af20ce74201d5b61f8dbb248a9901856550b7fc3] net: dsa: sja1105: always enable the send_meta options
https://drive.google.com/file/d/1EKY4-d … sp=sharing linux-6.4.3.r700.gaf20ce74201d-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1IR_UFw … sp=sharing linux-headers-6.4.3.r700.gaf20ce74201d-1-x86_64.pkg.tar.zst
Yup, no issues in r700 either. Just to be sure, here is my current uname -a:
Linux Navi 6.4.3-1-00700-gaf20ce74201d #1 SMP PREEMPT_DYNAMIC Thu, 20 Jul 2023 20:35:40 +0000 x86_64 GNU/Linux
This is 6.4.4 without Arch's locally applied patches
https://drive.google.com/file/d/1iVijbg … sp=sharing linux-6.4.4-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1U1UpMP … sp=sharing linux-headers-6.4.4-1-x86_64.pkg.tar.zst
Edit:
$ git bisect good
Bisecting: 50 revisions left to test after this (roughly 6 steps)
[1ea89213cdfec04758a57e2d7d0bfab79936aef2] btrfs: fix dirty_metadata_bytes for redirtied buffers
https://drive.google.com/file/d/1YGEDKs … sp=sharing linux-6.4.3.r750.g1ea89213cdfe-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1WvexDd … sp=sharing linux-headers-6.4.3.r750.g1ea89213cdfe-1-x86_64.pkg.tar.zst
Last edited by loqs (2023-07-20 21:23:52)
Installed and booted, and the issue persists: now one core of my CPU is at 100% doing IO-wait operations while MariaDB is running... If I kill MariaDB with
killall mysqld
the CPU usage is back to normal.
Linux Navi 6.4.4-1 #1 SMP PREEMPT_DYNAMIC Thu, 20 Jul 2023 20:53:01 +0000 x86_64 GNU/Linux
What a silly bug...
Please see my edit to post #11, which has the next bisection kernel.
No issues in r750. CPU usage normal and MariaDB running.
Linux Navi 6.4.3-1-00750-g1ea89213cdfe #1 SMP PREEMPT_DYNAMIC Thu, 20 Jul 2023 21:06:28 +0000 x86_64 GNU/Linux
At this point we're going to reach the last commit before 6.4.4 was released.
From skimming the changelog, I suspect bd4f737b145d85c7183ec879ce46b57ce64362e1, i.e. https://lore.kernel.org/all/20230707162 … arazel.de/
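(For anyone wanting to skim the same area, the io_uring changes between the two tags can be listed with something like:)
git log --oneline v6.4.3..v6.4.4 -- io_uring/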
At this point we're going to reach the last commit before 6.4.4 was released.
One possibility is only seven commits before 6.4.4: https://git.kernel.org/pub/scm/linux/ke … 7ce64362e1
$ git bisect good
Bisecting: 25 revisions left to test after this (roughly 5 steps)
[103d3437b3c69096802dc657f3befffd2cbdf124] kbuild: Add CLANG_FLAGS to as-instr
https://drive.google.com/file/d/1swAAWq … sp=sharing linux-6.4.3.r775.g103d3437b3c6-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1WmeaEa … sp=sharing linux-headers-6.4.3.r775.g103d3437b3c6-1-x86_64.pkg.tar.zst
CPU usage is normal and MariaDB is running.
Linux Navi 6.4.3-1-00775-g103d3437b3c6 #1 SMP PREEMPT_DYNAMIC Thu, 20 Jul 2023 21:36:39 +0000 x86_64 GNU/Linux
6.4.4 with bd4f737b145d85c7183ec879ce46b57ce64362e1 reverted
https://drive.google.com/file/d/10VBoS0 … sp=sharing linux-6.4.4-1.1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1cXSX5A … sp=sharing linux-headers-6.4.4-1.1-x86_64.pkg.tar.zst
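For anyone rebuilding this themselves, the revert is roughly:
git checkout v6.4.4
git revert --no-edit bd4f737b145d85c7183ec879ce46b57ce64362e1
# then rebuild and package the kernel as before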
Whoops, sorry for the late response. That did the trick: CPU usage is back to normal as usual.
Linux Navi 6.4.4-1.1 #1 SMP PREEMPT_DYNAMIC Thu, 20 Jul 2023 22:24:53 +0000 x86_64 GNU/Linux
All the other commits that are left are extremely unlikely to be the cause, so this must be the offender.
You'll want to report this to the kernel bugzilla as a regression
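If you report it by mail to the lists instead, the kernel's regression bot should be able to track it from a line like this in the report (using the stable commit the bisection pointed at):
#regzbot introduced: bd4f737b145d85c7183ec879ce46b57ce64362e1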
I was also experiencing this issue after upgrading to 6.4.4, and after building my own kernel with bd4f737b145d85c7183ec879ce46b57ce64362e1 reverted I can confirm that the issue has disappeared.
I'd file an upstream bug myself but I'm not sure which product and component to put it in. I'd guess the IO/Storage product and the Other component?
I would suggest Product: IO/Storage, Component: Block Layer, which comes under Jens Axboe, who applied the causal commit.
$ perl scripts/get_maintainer.pl io_uring/io_uring.c
Jens Axboe <axboe@kernel.dk> (maintainer:IO_URING)
Pavel Begunkov <asml.silence@gmail.com> (reviewer:IO_URING)
io-uring@vger.kernel.org (open list:IO_URING)
linux-kernel@vger.kernel.org (open list)
Last edited by loqs (2023-07-21 10:57:04)
Just here to report my experience to gather more data for the report.
I have a Ryzen 5 1400 with 8 cores, and two of them were pinned at 98-100% iowait as soon as I logged in to Plasma. The root drive is an NVMe SSD formatted with ext4.
I tried htop, iotop, atop, iostat, ps -auxf and other tools that sites suggest for debugging high I/O wait, but no process seemed to be the culprit, even when running those tools as root in case it was an elevated process.
The Kalendar stuff seems to be the issue. I use KMail, and even though I disabled it and other apps from autostart, the problem persisted. Killing kalendarac cleared the iowait on one core, but the other remained.
Funnily enough, stopping mariadb via systemd did nothing, but killing mysqld did stop the other rogue core (presumably because Akonadi spawns its own per-user mysqld instance, separate from the system mariadb.service).
I made a vow not to perpetuate the meme, so I don't RTFM people, and I don't even say the name of the distro unless I'm asked directly.
Hello, I also seem to be affected; posting this in case it helps. My environment matches in these ways:
- very new install
- running KDE Plasma with Wayland
- running on an NVMe drive with EXT4 formatting (also LUKS on LVM, if it's related)
- running on a Thinkpad (but not a super recent one, it's a T490, Whiskey Lake era)
- no process shows up in diagnostic tools (htop I/O tab, iotop)
- also running kernel 6.4.4 (full uname: Linux redjoker 6.4.4-arch1-1 #1 SMP PREEMPT_DYNAMIC Wed, 19 Jul 2023 19:19:38 +0000 x86_64 GNU/Linux)
Killing mysqld did solve the iowait issue (stopping mariadb did not).
(This thread came up by googling "arch single core iowait", I didn't expect the issue to be this recent)
Last edited by execthts (2023-07-22 15:22:50)
Please report the bug upstream on the kernel bugzilla or mailing lists rather than this thread so the kernel developers become aware of the issue and can fix it.