[SOLVED] Waiting for process fsck.ext4 on reboot

javex · 2023-07-31 11:59:16

I recently noticed something odd when rebooting my server. The machine runs headless, so the issue may have existed for a while, but I doubt it, because I would have noticed the delay on reboot. When I run "reboot", the following is the last line displayed:

[1399.YYYYYYY] systemd-shutdown[1]: Waiting for process: XXX (fsck.ext4)

where "XXX" is a three-digit PID, usually in the 500s to 600s (i.e. started relatively early). The YYYYYYY digits are presumably a timestamp.

When checking the running system before reboot I can find this PID:

% ps -ef | grep fsck
root         633       1  0 21:08 ?        00:00:00 fsck.ext4 -a -C0 /dev/mapper/system-root

As you can tell, this is my root partition. I don't know why fsck would still be running on it.

My /etc/fstab:

#
# /etc/fstab: static file system information
#
# <file system> <dir>   <type>  <options>       <dump>  <pass>
# /dev/mapper/system-root
UUID=444258dc-fc7e-4762-90dc-4707715d5b70       /               ext4            rw,relatime,data=ordered        0 1

# /dev/sda1
UUID=49FA-B368                                  /boot           vfat            rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,utf8,errors=remount-ro   0 2

# /dev/mapper/system-home
UUID=2fd16afc-ade9-4b6f-8e9f-8d2807edba11       /home           ext4            rw,relatime,data=ordered        0 2

# /dev/mapper/system-var
UUID=59ba5ef0-5cdd-4df5-9d53-0729d8c04d1f       /var            ext4            rw,relatime,data=ordered        0 2

# /dev/mapper/system-swap
UUID=e073331f-1faf-4b9c-beae-6b1c4506ce4b       none            swap            defaults        0 0

# ramdisk
none    /mnt/ramdisk    ramfs   defaults        0       0

Running "dmesg | grep -i fsck":

[   65.162777] systemd[1]: Created slice Slice /system/systemd-fsck.

Running "journalctl -b-1 | grep -i fsck":

Jul 31 21:03:01 media-server.home.xevaj.eu systemd[1]: Created slice Slice /system/systemd-fsck.
Jul 31 21:03:01 media-server.home.xevaj.eu systemd-fsck[769]: /dev/mapper/system-var: clean, 288136/1966080 files, 6109410/7864320 blocks
Jul 31 21:03:01 media-server.home.xevaj.eu systemd-fsck[783]: /dev/mapper/system-home: clean, 126557/1310720 files, 2551433/5242880 blocks

I specifically chose the previous boot ("-b -1") so that I'd have the shutdown logs as well; the current boot ("-b 0") looks the same. To be thorough, here's my /etc/mkinitcpio.conf (I regenerated the images with "mkinitcpio -P" to be sure), with comments removed to keep it short.

# vim:set ft=sh
MODULES=""

BINARIES=""

FILES=""

HOOKS="base udev autodetect keyboard keymap consolefont modconf block mdadm_udev netconf dropbear encryptssh lvm2 filesystems fsck"

I have also booted a live system and manually run fsck on all partitions (/dev/sda1 as /boot and the three LVM partitions), and everything comes back clean.

So at this point I'm a bit lost... fsck is clean and should be run during init. Because the parent PID is 1 and I can't find anything that tells me what started it, I don't know how to even track it down. Any tips?
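When the parent PID is 1, a few entries under /proc can still hint at where a process came from. A minimal sketch, assuming a Linux /proc and the procps "ps"; "inspect_pid" is a hypothetical helper name, not an existing tool:

```shell
# Print basic provenance information for a PID (hypothetical helper).
inspect_pid() {
    pid="$1"
    [ -d "/proc/$pid" ] || { echo "no such process: $pid" >&2; return 1; }
    # Command line (arguments are NUL-separated in /proc)
    printf 'cmdline: '; tr '\0' ' ' < "/proc/$pid/cmdline"; echo
    # Parent PID as the kernel reports it
    grep '^PPid:' "/proc/$pid/status"
    # Control group: a process started by a systemd unit shows that unit here
    printf 'cgroup:  '; cat "/proc/$pid/cgroup" 2>/dev/null || echo unknown
    # Wall-clock start time, if procps ps is available
    printf 'started: '; ps -o lstart= -p "$pid" 2>/dev/null || echo unknown
}
```

Calling "inspect_pid 633" with the PID from the ps output above prints its command line, parent, cgroup and start time. A start time within seconds of boot, together with a cgroup that belongs to no service unit, might suggest the process was started before systemd took over.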

Last edited by javex (2023-08-01 13:10:23)

Lone_Wolf · 2023-08-01 10:13:32

some other process could be stuck causing the wait.
https://wiki.archlinux.org/title/System … ribly_long should be useful

javex · 2023-08-01 13:09:56

Thank you for the pointers Lone_Wolf! I looked through the steps and other articles linked from there, but there was simply nothing in the logs. I was considering adding logging to the boot process, but decided to do a bit more investigating first.

I realised that the first PID that appears in the logs is higher than the fsck.ext4 PID, so I figured the process must be started very early, during init. Upon reading the code for "/usr/lib/initcpio/install/fsck" I saw that it had a section like this:

                if [[ -e /etc/e2fsck.conf ]]; then
                    add_file /etc/e2fsck.conf
                fi

This is actually a change I made recently: I added that file and put some logging options in it:

[options]
         log_dir = /var/log/e2fsck
         log_filename = e2fsck-%N.%h.INFO.%D-%T
         log_dir_wait = true

I ran "lsinitcpio -x initramfs-linux-lts.img" in a temporary directory and sure enough my config file is in there. Now "man 5 e2fsck.conf" actually says the following:

       log_dir_wait
              If this boolean relation is true, then if the directories specified by log_dir or log_dir_fallback are not available or are not yet writable, e2fsck will save the output in a memory buffer, and a child process will periodically test to see if the log directory
              has become available after the boot sequence has mounted the requested file system for reading/writing.  This implements the functionality provided by logsave(8) for e2fsck log files.

So that should have been fine, but I decided to rename the file and re-run "mkinitcpio -P" and reboot.

And what do you know, the process has disappeared! I don't know if it's related to one of the options in there, but I don't need the configuration anyway so I've removed it. This also explains why it only suddenly appeared, since the config file was new.
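The rename step can be sketched as a small helper. This is only an illustration: "disable_e2fsck_conf" and the ".disabled" suffix are assumptions of this sketch, and the path is an argument so it can be tried on a scratch copy first. Rebuilding with "mkinitcpio -P" is left as a manual step because it needs root.

```shell
# Rename an e2fsck config so the mkinitcpio fsck hook no longer finds it.
# Hypothetical helper; defaults to /etc/e2fsck.conf when no path is given.
disable_e2fsck_conf() {
    conf="${1:-/etc/e2fsck.conf}"
    if [ ! -e "$conf" ]; then
        echo "nothing to do: $conf does not exist"
        return 0
    fi
    # Keep the old file around under a different name instead of deleting it
    mv -- "$conf" "$conf.disabled" || return 1
    echo "renamed $conf to $conf.disabled"
    echo "next: run 'mkinitcpio -P' as root, then reboot"
}
```

Keeping the renamed file makes it easy to restore the logging options later and test them one at a time.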

I hope this is useful for someone in the future that comes across this issue. I don't know if this is a bug or I have a bad config or a missing piece of config, so I don't know if there's a "fix" for this.
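For anyone retracing this, it may help to check whether an image still carries the file without extracting it. A minimal sketch, assuming "lsinitcpio" prints one path per line when given an image; "image_has_e2fsck_conf" is a hypothetical helper and the /boot path in the usage comment is an assumption:

```shell
# Read an initramfs file listing on stdin and succeed if it contains
# etc/e2fsck.conf (with or without a leading "/" or "./").
image_has_e2fsck_conf() {
    grep -Eq '(^|/)etc/e2fsck\.conf$'
}

# Usage on an Arch system (image path is an assumption):
#   lsinitcpio /boot/initramfs-linux-lts.img | image_has_e2fsck_conf \
#       && echo "e2fsck.conf is still in the image"
```

Reading the listing from stdin keeps the check independent of how the listing was produced, so it also works on the output of any other archive lister.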

Arch Linux
