
#1 2015-12-19 05:32:53

MikeDacre
Member
From: San Francisco Bay Area
Registered: 2013-01-18
Posts: 51

[SOLVED] Create Volatile Files and Directories causes boot hang

I run a cluster of 20 machines, and I just did an upgrade on them. I can only do an upgrade about once every three months because of the need to maintain continuous uptime, so my last update before this one was in September.

I upgraded all software on the head node with pacman without any problems. However, all 20 of the slave nodes now fail to boot. They hang permanently with the error:

   

A start job is running for Create Volatile Files and Directories.

This error never goes away. Here is a screenshot from 2 hours in:

http://i.imgur.com/NnKd4o7.jpg

I have read and tried all of the solutions in the following three posts:
- https://bbs.archlinux.org/viewtopic.php?id=196018 — I do not have btrfs
- https://bbs.archlinux.org/viewtopic.php?id=196341 — The root drives are ext4 and they have ACLs enabled
- https://bbs.archlinux.org/viewtopic.php?id=199940 — I removed /usr/lib/tmpfiles.d/journal-nocow.conf

Also, what I found from looking at other forums:
- http://www.overclock.net/t/1584449/fail … ctories/10 — I don't have a radeon card

What I have tried so far:
Mounting the root drive with arch-chroot on a different machine and:
- Confirming that the drive does have ACLs enabled
- Confirming that /etc/fstab is correct
- Removing /var/log/journal
- Running `pacman -Syu` to make sure everything is installed
- Deleting all .conf files in /etc/systemd and then reinstalling systemd
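
For anyone repeating the ACL check, here is a minimal sketch of one way to do it. It is an assumption, not necessarily how the check above was done: it reads `tune2fs -l` output for an ext4 partition and looks for `acl` among the default mount options. The helper name `has_default_acl` and the device `/dev/sdXN` are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: reads `tune2fs -l` output on stdin and succeeds
# if "acl" is listed among the ext4 default mount options.
# ACLs can also be enabled with the "acl" option in /etc/fstab, which
# this check does not look at.
has_default_acl() {
    grep '^Default mount options:' | grep -qw 'acl'
}

# Example (run as root; /dev/sdXN is a placeholder for the root partition):
#   tune2fs -l /dev/sdXN | has_default_acl && echo "acl is a default mount option"
```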

I have tried to boot the machine with the fallback kernel, also with no effect.

journalctl -D /mnt/node/var/log/journal.bak -p 3 -x

produces no messages related to the boot. The last message is prior to reboot.

At this point I have absolutely no idea what to do and it looks like I am going to have to reinstall all of the nodes from scratch, which is at least a whole day of work right when I am supposed to be going on holiday.

Any help will be extremely appreciated.

Last edited by MikeDacre (2015-12-30 19:58:07)


#2 2015-12-19 23:22:07

MikeDacre
Member
From: San Francisco Bay Area
Registered: 2013-01-18
Posts: 51

Re: [SOLVED] Create Volatile Files and Directories causes boot hang

Just a quick follow-up. I tried switching to a diskless configuration today. I got it working and booting fine, but after I ran pacman to install all of the software I need, the same error reported above came back. The pacman command was:

pacman -Syy bash-completion blas boost boost-libs bowtie bowtie2 cblas clustalx cronie ctags dhclient dhcp dos2unix eigen emacs emacs-python-mode fcgene hwloc ifplugd ipython ipython2 java-runtime-common java-environment-common jdk6 jdk7-openjdk lapack less-mouse lrzip mercurial mlocate mprime mrbayes muscle nicstat ntp openmpi openssh p7zip parallel pbget pbzip2 picard-tools pigz plink r samtools-git sqlite sshfs strace stunnel sudo tcsh tmux tophat-bin ucsc_genome_tools unrar unzip vcftools vim-python3 vim-bufexplorer vim-systemd wget xz zip zsh

A full package list on the old machines is here: https://www.dropbox.com/s/fnjugr8eht03g … s-old?dl=0

All of that software, with no exceptions, is also installed on the login node, which has different hardware. It boots fine. None of the nodes boot once I have my software on them though. I am not sure which package is causing the problem yet.


#3 2015-12-20 02:38:08

MikeDacre
Member
From: San Francisco Bay Area
Registered: 2013-01-18
Posts: 51

Re: [SOLVED] Create Volatile Files and Directories causes boot hang

Figured it out by using a diskless boot and going through the install process one step at a time, rebooting every time until I figured out what did it.

The error is caused by having x-systemd.automount in the fstab on my nodes. Here is a sample of one of the nfs mount lines from fstab:

192.168.0.2:/home                           /home                   nfs         defaults,vers=3,bg,hard,intr,noauto,x-systemd.automount  0 0

Removing the x-systemd.automount entry from the options fixed the problem for me. I can confirm that this entry alone is enough to cause this error. The vers=3 entry actually caused the mount to fail (and be backgrounded) in this case. Removing vers=3 allowed the NFS share to mount easily, but the x-systemd.automount entry still caused the error.
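
As a sketch of the workaround above, the option can be stripped from every fstab entry with sed. This is an illustration under assumptions: the function name `strip_automount` is made up, it only prints the result rather than editing the file, and it expects the option to sit in a comma-separated list alongside other options, as in the line above.

```shell
#!/bin/sh
# Hypothetical helper: print the given fstab file with x-systemd.automount
# removed from comma-separated option lists. The file itself is not changed.
strip_automount() {
    sed -e 's/,x-systemd\.automount//g' -e 's/x-systemd\.automount,//g' "$1"
}

# Example: review the result before replacing the real file.
#   strip_automount /etc/fstab > /tmp/fstab.new
#   diff /etc/fstab /tmp/fstab.new
```

Note that `noauto` stays in the option list here, so the share will not be mounted at boot unless something else mounts it. Whether to drop `noauto` as well depends on the setup.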

I tried putting the same entry into the fstabs of my other Arch Linux machines and was unable to reproduce the error on those machines. I do not have the time to continue investigating exactly how to reproduce this error.

My systemd version is 228-3.

Last edited by MikeDacre (2015-12-20 02:39:08)


#4 2015-12-30 19:59:14

MikeDacre
Member
From: San Francisco Bay Area
Registered: 2013-01-18
Posts: 51

Re: [SOLVED] Create Volatile Files and Directories causes boot hang

I marked this as solved even though I believe this is a bug in the new version of systemd. I don't have time to submit and maintain a bug report, and this workaround does solve the problem. I hope this helps someone else.


#5 2016-08-15 19:19:52

kozaki
Member
From: London >. < Paris
Registered: 2005-06-13
Posts: 670

Re: [SOLVED] Create Volatile Files and Directories causes boot hang

They hang permanently with the error:
   

A start job is running for Create Volatile Files and Directories.

Thanks for pointing out x-systemd.automount.
I had a related error:

Failed to start Create Volatile Files and Directories

on a Mageia VM with systemd-230 after a kernel upgrade (4.7rc → 4.7), whichever 4.7.0 kernel I chose. I was dropped into the Mageia rescue shell (nice one).

I could finish booting manually with:

mount -o remount,acl / && systemctl default

Cause in this case: the root fs was commented out in fstab, and both the `rootflags` and `rw` boot options were missing from the kernel boot line.

Correcting both with `rw rootfstype=ext4 rootflags=<options>,acl` allowed `systemd-remount-fs.service` to handle the root fs properly (wiki). Alternatively, uncommenting the root fs line in fstab also fixes it.
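
A quick way to spot the missing boot options described above is to look at the kernel command line. This is a minimal sketch, assuming a POSIX shell. The function name `missing_root_opts` is made up for illustration, and it only checks that `rw` and `rootflags=` are present, not what values they carry.

```shell
#!/bin/sh
# Hypothetical helper: given a kernel command line string, print which of
# "rw" and "rootflags" are absent. Prints an empty line if both are present.
missing_root_opts() {
    cmdline=$1
    missing=""
    # Pad with spaces so each option is matched as a whole word.
    case " $cmdline " in
        *" rw "*) ;;
        *) missing="$missing rw" ;;
    esac
    case " $cmdline " in
        *" rootflags="*) ;;
        *) missing="$missing rootflags" ;;
    esac
    echo "${missing# }"
}

# Example on a running system:
#   missing_root_opts "$(cat /proc/cmdline)"
```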



