#1 2018-07-21 02:42:07

kode54
Member
Registered: 2013-10-21
Posts: 20

Boot stuck at 'Starting version 239'

After both the 4.17.6 and 4.17.8 kernel updates, my system tends to lock up during boot, right after 'Starting version 239'.

System info:

System:
  Host: umaro Kernel: 4.17.8-1-ARCH x86_64 bits: 64 Console: tty 2 
  Distro: Arch Linux 
Machine:
  Type: Desktop Mobo: Gigabyte model: EP45-UD3R serial: <root required> 
  BIOS: Award v: F12 date: 01/25/2010 
CPU:
  Topology: Quad Core model: Intel Core2 Quad Q9650 bits: 64 type: MCP 
  L2 cache: 6144 KiB 
  Speed: 2666 MHz min/max: 2000/3000 MHz Core speeds (MHz): 1: 2730 2: 2535 
  3: 2764 4: 2571 
Graphics:
  Card-1: AMD Curacao XT / Trinidad XT [Radeon R7 370 / R9 270X/370X] 
  driver: radeon v: kernel 
  Display: server: No display server data found. Headless machine? 
  tty: 80x24 
  Message: Unable to show advanced data. Required tool glxinfo missing. 
Audio:
  Card-1: Intel 82801JI HD Audio driver: snd_hda_intel 
  Card-2: AMD Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series] 
  driver: snd_hda_intel 
  Card-3: Creative Labs EMU20k1 [Sound Blaster X-Fi Series] 
  driver: snd_ctxfi 
  Sound Server: ALSA v: k4.17.8-1-ARCH 
Network:
  Card-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet 
  driver: r8169 
  IF: enp4s0 state: up speed: 1000 Mbps duplex: full mac: 00:1f:d0:d4:3e:ae 
  IF-ID-1: br-416fd823133a state: up speed: N/A duplex: N/A 
  mac: 02:42:97:a5:71:8c 
  IF-ID-2: br-426d78a02f57 state: up speed: N/A duplex: N/A 
  mac: 02:42:fd:98:c5:8b 
  IF-ID-3: docker0 state: down mac: 02:42:aa:91:eb:a5 
  IF-ID-4: veth0f8bd52 state: up speed: 10000 Mbps duplex: full 
  mac: 7e:fc:50:c0:f2:eb 
  IF-ID-5: veth4218766 state: up speed: 10000 Mbps duplex: full 
  mac: 5a:8f:99:e5:fa:6d 
  IF-ID-6: veth513e6a2 state: up speed: 10000 Mbps duplex: full 
  mac: 1e:c1:0c:4c:d4:e3 
  IF-ID-7: veth52b8206 state: up speed: 10000 Mbps duplex: full 
  mac: 7a:76:60:cc:e6:61 
  IF-ID-8: vethb9efcc5 state: up speed: 10000 Mbps duplex: full 
  mac: 3e:1f:c5:94:25:7e 
  IF-ID-9: vethe00393c state: up speed: 10000 Mbps duplex: full 
  mac: 7e:62:29:68:5f:15 
  IF-ID-10: vethe309c34 state: up speed: 10000 Mbps duplex: full 
  mac: c6:7c:ca:6d:03:50 
Drives:
  Local Storage: total: 18.43 TiB used: 5.22 TiB (28.3%) 
  ID-1: /dev/sda vendor: Samsung model: SSD 850 PRO 256GB size: 238.47 GiB 
  ID-2: /dev/sdb vendor: Toshiba model: HDWE160 size: 5.46 TiB 
  ID-3: /dev/sdc vendor: Toshiba model: HDWE160 size: 5.46 TiB 
  ID-4: /dev/sdd vendor: Western Digital model: WD40EFRX-68WT0N0 
  size: 3.64 TiB 
  ID-5: /dev/sde vendor: Western Digital model: WD40EFRX-68WT0N0 
  size: 3.64 TiB 
RAID:
  Device-1: storage type: zfs status: ONLINE size: 9.06 TiB free: 3.85 TiB 
  array-1: mirror status: ONLINE size: 3.62 TiB free: 1.44 TiB Components: 
  online: N/A 
  array-2: mirror status: ONLINE size: 5.44 TiB free: 2.41 TiB Components: 
  online: N/A 
Partition:
  ID-1: / size: 62.75 GiB used: 11.67 GiB (18.6%) fs: ext4 dev: /dev/sda1 
  ID-2: /home size: 162.86 GiB used: 1.81 GiB (1.1%) fs: ext4 dev: /dev/sda3 
  ID-3: swap-1 size: 8.00 GiB used: 0 KiB (0.0%) fs: swap dev: /dev/sda2 
Sensors:
  System Temperatures: cpu: 45.0 C mobo: 35.0 C gpu: radeon temp: 40 C 
  Fan Speeds (RPM): cpu: 1503 fan-2: 0 fan-3: 0 
Info:
  Processes: 253 Uptime: 26m Memory: 7.79 GiB used: 6.27 GiB (80.5%) 
  Init: systemd Shell: bash inxi: 3.0.18

I managed to make it boot this time by removing the intel-ucode image from the initrd line. The CPU is currently running microcode revision 0xa07, and the image appears to provide revision 0xa0b, while Intel's own charts list 0xa0e as the latest. Intel's update guidance wrongly lists this CPU as ID 0x10677 with microcode 0x70d, but its actual signature is 0x1067a, which their chart says has microcode up to 0xa0e.

Microcode check with iucode_tool shows the following:

$ bsdtar -Oxf /boot/intel-ucode.img | iucode_tool -tb -lS -
iucode_tool: system has processor(s) with signature 0x0001067a
microcode bundle 1: (stdin)
selected microcodes:
  001/112: sig 0x00010676, pf_mask 0x80, 2010-09-29, rev 0x060f, size 4096
  001/113: sig 0x00010676, pf_mask 0x40, 2010-09-29, rev 0x060f, size 4096
  001/114: sig 0x00010676, pf_mask 0x10, 2010-09-29, rev 0x060f, size 4096
  001/115: sig 0x00010676, pf_mask 0x04, 2010-09-29, rev 0x060f, size 4096
  001/116: sig 0x00010676, pf_mask 0x01, 2010-09-29, rev 0x060f, size 4096
  001/117: sig 0x00010677, pf_mask 0x10, 2010-09-29, rev 0x070a, size 4096
  001/118: sig 0x0001067a, pf_mask 0xa0, 2010-09-28, rev 0x0a0b, size 8192
  001/119: sig 0x0001067a, pf_mask 0x44, 2010-09-28, rev 0x0a0b, size 8192
  001/120: sig 0x0001067a, pf_mask 0x11, 2010-09-28, rev 0x0a0b, size 8192

I have no idea which of those pf_masks applies to my CPU.
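
One way to narrow this down, as a rough sketch: on Linux the microcode driver exposes the CPU's platform flag in /sys/devices/system/cpu/cpu0/microcode/processor_flags, and a microcode entry applies when that flag shares a bit with the entry's pf_mask. The helper name pf_matches is made up for illustration, and the three masks are the ones listed for signature 0x0001067a above.

```shell
#!/bin/sh
# pf_matches CPU_PF PF_MASK (hypothetical helper, not part of iucode_tool):
# succeeds when the CPU's platform flag shares at least one bit with PF_MASK.
pf_matches() {
  [ $(( $1 & $2 )) -ne 0 ]
}

# sysfs file provided by the kernel microcode driver; it may be absent on some systems.
pf_file=/sys/devices/system/cpu/cpu0/microcode/processor_flags

if [ -r "$pf_file" ]; then
  cpu_pf=$(cat "$pf_file")
  # pf_mask values of the three 0x0001067a entries in the listing above.
  for mask in 0xa0 0x44 0x11; do
    if pf_matches "$cpu_pf" "$mask"; then
      echo "pf_mask $mask matches CPU processor_flags $cpu_pf"
    fi
  done
else
  echo "processor_flags not available; is the microcode driver loaded?"
fi
```

For example, a hypothetical processor_flags value of 0x10 would match only the entry with pf_mask 0x11.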

For now, I am uninstalling the intel-ucode package until I can determine whether it is safe to use on a machine this old.

Last edited by kode54 (2018-08-22 02:52:12)

#2 2018-08-22 02:51:52

kode54
Member
Registered: 2013-10-21
Posts: 20

Re: Boot stuck at 'Starting version 239'

It looks like 4.18.3 locks up the same way whether or not the intel-ucode package is installed. Now, after a minute or two, it also prints a message about rcu_preempt detecting a stall in some CPU tasks. I can't dump a log since the machine is locked up, so I would have to photograph the monitor and transcribe it by hand.

#3 2018-08-22 03:30:13

the-bird-is-the-word
Member
Registered: 2014-08-26
Posts: 4

Re: Boot stuck at 'Starting version 239'

This forum topic covers the issue:
https://bbs.archlinux.org/viewtopic.php?id=239672
It has more information, but no long-term solution.

#4 2019-02-03 15:06:32

pthoem
Member
Registered: 2019-02-03
Posts: 2

Re: Boot stuck at 'Starting version 239'

I've had a similar problem, and I posted my solution here...

https://github.com/systemd/systemd/issues/9529

#5 2019-02-03 15:20:22

pthoem
Member
Registered: 2019-02-03
Posts: 2

Re: Boot stuck at 'Starting version 239'

Quoting from my GitHub post...

I've had the same problem for months with Arch Linux on kernel 4.18.16 (it was driving me mad), but I finally solved it after a lot of testing and debugging. The problem somehow comes from the `systemd-fsck@dev-sdxy.service` units (xy = a1, a2, ..., b1, b2, ...). So I did the following:

1.) I disabled checking my root file system at boot time by adding 'fsck.mode=skip' to the bootloader's kernel command line...

   # vi /boot/grub/grub.cfg
   ...
   menuentry "Linux, ..." {
     linux   /boot/vmlinuz-linux root=... fsck.mode=skip

2.) I disabled boot-time fsck for the data partitions by setting the pass field (6th column) in /etc/fstab to 0 (the root entry keeps 1, but the check is already skipped by step 1)...

   # vi /etc/fstab
   ...
   /dev/sda1  /              ext4   rw,relatime,data=ordered  0 1
   /dev/sda2  none           swap   defaults                  0 0
   /dev/sda3  /data0         ext4   defaults                  0 0
   /dev/sdb1  /data1         ext4   defaults                  0 0
   /dev/sdc1  /data2         ext4   defaults                  0 0

3.) I no longer start X (startx) automatically; I boot into the text console only.

4.) I updated /etc/sudoers and ~/.bashrc so that the main data disks are checked after I log in...

   # vi /etc/sudoers (SHIFT + G, ESC + I)
   ...
   Cmnd_Alias CMDS1 = /usr/bin/mount,/usr/bin/umount,/usr/bin/fsck
   myusername ALL = (root) NOPASSWD: CMDS1

   # vi /home/myusername/.bashrc   (SHIFT + G, ESC + I)
   ...
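   # Only run on the text console, i.e. when no X display is set.
   if [ -z "$DISPLAY" ]
   then
     echo " "
     # Unmount each data disk and check it, then remount everything from fstab.
     sudo umount /dev/sdb1
     sudo fsck -y /dev/sdb1
     sudo umount /dev/sdc1
     sudo fsck -y /dev/sdc1
     sudo mount -a
     echo " "
   fi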

That's it!

So now, after I log in to the text console, I first check that all messages are OK, then I type startx to get into my desktop environment (XFCE4). When I have finished my work, I log out from there (not shut down). Then I type shutdown now on the text console to shut down the system.
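
To confirm after a reboot that steps 1 and 2 actually took effect, something like the following sketch could be used. The function names cmdline_has and list_fsck_pass are made up for illustration; the sketch assumes a POSIX shell with awk, and relies on fstab(5) treating a missing sixth field as 0.

```shell
#!/bin/sh
# cmdline_has PARAM [FILE]: hypothetical helper; succeeds when PARAM appears as a
# whole word on the kernel command line (default: /proc/cmdline).
cmdline_has() {
  tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# list_fsck_pass [FILE]: hypothetical helper; prints "<pass> <device> <mount point>"
# for each fstab entry, skipping blank lines and comments.
list_fsck_pass() {
  awk 'NF == 0 || $1 ~ /^#/ { next }
       { pass = (NF >= 6) ? $6 : 0; print pass, $1, $2 }' "${1:-/etc/fstab}"
}

if cmdline_has fsck.mode=skip; then
  echo "fsck.mode=skip is active for this boot"
fi

if [ -r /etc/fstab ]; then
  list_fsck_pass
fi
```

Edits made directly in /boot/grub/grub.cfg are lost whenever the file is regenerated with grub-mkconfig, so the command-line check is worth repeating after bootloader updates.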

#6 2019-02-03 19:11:43

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 21,427

Re: Boot stuck at 'Starting version 239'

This really doesn't sound like a good idea or suggestion in general, nor is it really related to the original issue. Boot-time fscks can be quite important; you should find the root cause. Did you mess with your mkinitcpio config?

That said, going by the last post, this thread concerned a well-known timing issue with early 4.18 kernels that is unrelated to the issue you are currently seeing. If you'd like to follow this further, please open your own thread.

Closing.
