Hi,
I frequently (every few seconds) see high CPU usage from two systemd processes: one owned by root (PID 1) and one owned by the user.
This seems to cause a lag in "ls -l" ("ls" itself works fine) and also when "ssh"ing into the computer.
"strace" of the user process during a laggy "ls -l" shows
strace: Process 963 attached
newfstatat(AT_FDCWD, "/dev/disk/by-diskseq/32985", {st_mode=S_IFBLK|0660, st_rdev=makedev(0x8, 0x20), ...}, 0) = 0
newfstatat(AT_FDCWD, "/dev/disk/by-id/usb-Generic-_SD_MMC_MS_PRO_20120926571200000-0:0", {st_mode=S_IFBLK|0660, st_rdev=makedev(0x8, 0x20), ...}, 0) = 0
newfstatat(AT_FDCWD, "/dev/disk/by-path/pci-0000:00:14.0-usb-0:6.3:1.0-scsi-0:0:0:0", {st_mode=S_IFBLK|0660, st_rdev=makedev(0x8, 0x20), ...}, 0) = 0
newfstatat(AT_FDCWD, "/home/user/.config/systemd/user", 0x7ffef23b6af0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/etc/xdg/systemd/user", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
newfstatat(AT_FDCWD, "/etc/systemd/user", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
newfstatat(AT_FDCWD, "/run/user/1000/systemd/user", 0x7ffef23b6af0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/run/systemd/user", 0x7ffef23b6af0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/home/user/.local/share/systemd/user", 0x7ffef23b6af0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/usr/local/share/systemd/user", 0x7ffef23b6af0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/usr/share/systemd/user", 0x7ffef23b6af0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/usr/local/lib/systemd/user", 0x7ffef23b6af0, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/usr/lib/systemd/user", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
getrandom("\x38\x33\xb3\x9c\x60\x7a\x80\x4a\xb1\x94\xf8\xd5\x55\xfb\xe0\x53", 16, GRND_INSECURE) = 16
gettid() = 963
epoll_wait(4, [], 67, 0) = 0
close(22) = 0
close(23) = 0
close(24) = 0
close(25) = 0
gettid() = 963
epoll_wait(4,
And then, once "ls -l" has completed, it continues with
[{events=EPOLLIN, data={u32=3509636864, u64=94119127994112}}], 67, -1) = 1
recvmsg(15, {msg_name={sa_family=AF_NETLINK, nl_pid=567, nl_groups=0x000002}, msg_namelen=128 => 12, msg_iov=[{iov_base=[{prefix="libudev", magic=htonl(0xfeedcafe), header_size=40, properties_off=40, properties_len=959, filter_subsystem_hash=htonl(0xf0031db7), filter_devtype_hash=htonl(0x7bcbc5ee), filter_tag_bloom_hi=htonl(0x2000400), filter_tag_bloom_lo=htonl(0x10800000)}, "UDEV_DATABASE_VERSION=1\0ACTION=c"...], iov_len=8192}], msg_iovlen=1, msg_control=[{cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS, cmsg_data={pid=2125880, uid=0, gid=0}}], msg_controllen=32, msg_flags=0}, 0) = 999
getrandom("\xf2\xe0\xec\x3a\x03\xeb\x34\xd8\x1e\x38\xad\xd1\xde\x78\xef\x20", 16, GRND_INSECURE) = 16
getrandom("\x59\xb6\xe9\x55\x76\x5f\x60\x3b\x43\x24\xf2\xe1\x0e\x85\x2c\xaf", 16, GRND_INSECURE) = 16
"strace" on the root systemd process during a "ls -l" leads to a lot of output like
sendmsg(58, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\4\1\1a\0\0\0\32\201\303az\0\0\0\1\1o\0\31\0\0\0/org/fre"..., iov_len=144}, {iov_base="\"\213n '\0\0\0/org/freedesktop/systemd"..., iov_len=97}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 241
sendmsg(58, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\4\1\1a\0\0\0\33\201\303az\0\0\0\1\1o\0\31\0\0\0/org/fre"..., iov_len=144}, {iov_base="!\213n '\0\0\0/org/freedesktop/systemd"..., iov_len=97}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 241
sendmsg(58, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\4\1\1a\0\0\0\34\201\303az\0\0\0\1\1o\0\31\0\0\0/org/fre"..., iov_len=144}, {iov_base=" \213n '\0\0\0/org/freedesktop/systemd"..., iov_len=97}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 241
sendmsg(58, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\4\1\1a\0\0\0\35\201\303az\0\0\0\1\1o\0\31\0\0\0/org/fre"..., iov_len=144}, {iov_base="\37\213n '\0\0\0/org/freedesktop/systemd"..., iov_len=97}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 241
sendmsg(58, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\4\1\1a\0\0\0\36\201\303az\0\0\0\1\1o\0\31\0\0\0/org/fre"..., iov_len=144}, {iov_base="\36\213n '\0\0\0/org/freedesktop/systemd"..., iov_len=97}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 241
"journalctl" does not show any suspicious outlook during the "ls -l" process.
My only clue so far is that this is related to some disk problem. The "ls -l" runs in a directory on an SSD. In addition, there are two HDDs combined as RAID 1 via a hardware RAID controller. The RAID volume is mounted with the options "noauto,x-systemd.automount,x-systemd.idle-timeout=1min,rw,relatime,stripe=16".
How can I debug this further?
Step 1 - verify you have a complete backup of all your important data. If not, create one immediately. [Remember, there are two kinds of people: those who plan to take backups and those that take backups.]
Once you have a verified, complete backup, first check the disk(s) for hardware faults.
"journalctl" does not show any suspicious output during the "ls -l" process.
"sudo journalctl" or "dmesg", are there any IO errors? "sudo journalctl -b | grep DRDY"?
The "ls -l" runs in a directory on an SSD.
Does it matter which directory?
Does the SSD matter?
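For the journal/dmesg check above, a rough sketch of a filter that keeps only lines that look like disk trouble. The helper name `filter_io_errors` and the pattern list are assumptions (typical kernel messages for failing disks), not an exhaustive set; it just filters whatever log text is piped in.

```shell
# Hypothetical helper: keep only log lines that look like disk/I-O trouble.
# The pattern list is an assumption and certainly not exhaustive.
filter_io_errors() {
  grep -Ei 'DRDY|I/O error|medium error|failed command|blk_update_request'
}

# Intended use (kernel messages of the current boot, needs root):
#   sudo journalctl -b -k | filter_io_errors
#   sudo dmesg | filter_io_errors
```

No output means none of these patterns matched, which does not prove the disks are healthy.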
a lot of output like
looks like a dbus issue, but the bus sockets should™ be on some tmpfs (/run/somewhere)
echo $DBUS_SESSION_BUS_ADDRESS
Maybe the bus is jammed, but idk how the "ls -l" (and most likely the file stat'ing) would play into that.
You can monitor the dbus activity w/
"dbus-monitor --session" and "dbus-monitor --system"
Lastly, let's just blame lennart (because it's the most fun thing in the world to do):
If you don't use the systemd automount and, more importantly (iirc there was another thread complaining that this didn't work anymore at all), don't idle-timeout *anything*, does the behavior remain?
(Cause that would explain how disk access could cause activity on the session bus, and also it's always good to blame lennart.)
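For that test, the mount options from the first post would lose the two x-systemd entries. A small sketch of that edit, assuming you keep "noauto" and mount the RAID by hand for the duration of the test (the helper name `strip_automount_opts` is made up):

```shell
# Hypothetical helper: drop the systemd automount and idle-timeout options
# from a comma-separated mount option string, leaving the rest untouched.
strip_automount_opts() {
  printf '%s\n' "$1" \
    | tr ',' '\n' \
    | grep -Ev '^x-systemd\.(automount|idle-timeout=.*)$' \
    | paste -sd, -
}

strip_automount_opts 'noauto,x-systemd.automount,x-systemd.idle-timeout=1min,rw,relatime,stripe=16'
# prints: noauto,rw,relatime,stripe=16
```

After changing /etc/fstab, "systemctl daemon-reload" is needed so that systemd regenerates its mount units.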
Thanks, twelveeighty and seth!
I think I found the issue: an empty USB HDD docking station I had been using previously. After turning it off and on again(TM), I no longer experience the issue.
If it comes back, I will check what you suggested. I would still like to understand why it kept two systemd processes at 100% CPU usage.
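One way to check whether the dock is flooding udev with events (the strace of the user instance above shows a libudev netlink message arriving right as the stall ends) would be to count events per device for a short while. A sketch, assuming the usual "udevadm monitor" line layout (`UDEV  [timestamp] action devpath (subsystem)`); `count_udev_events` is a made-up helper name:

```shell
# Hypothetical helper: count udev events per "action devpath" pair.
# Assumes udevadm monitor's default one-line-per-event output.
count_udev_events() {
  awk '
    /^(KERNEL|UDEV)/ {
      for (i = 1; i < NF; i++) {
        if ($i ~ /\]$/) {            # timestamp field, e.g. "[1234.567]"
          count[$(i + 1) " " $(i + 2)]++
          break
        }
      }
    }
    END { for (k in count) print count[k], k }
  ' | sort -rn
}

# Intended use: watch block-device events for 30 seconds.
#   timeout 30 udevadm monitor --udev --subsystem-match=block | count_udev_events
```

A device that shows up with dozens of "change" events in half a minute would be a good suspect, since systemd has to process each of them.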