#1 2014-04-30 02:51:53

mac57
Member
From: St. Somewhere
Registered: 2006-01-06
Posts: 302

[SOLVED] Arch Linux Duke (2007) Fails to Boot

Folks, I have a unique and challenging problem that has exhausted my Arch Linux skills, and so I am now turning to you.

I have a vintage Pentium Pro 200 system (that’s 200 MHz folks! – 200 MHz 686 architecture – the original 686!), two CPUs, running a dual boot between Windows NT 4.0 and Arch Linux Duke (2007). It has 512 MB of RAM and a 120 GB hard drive, partitioned up between Windows NT and Linux. I built this system new in 2007, hence the dated version of Arch.  It has run like a charm all these years, granted not getting that much use. After about a year of no use at all, I fired the system up last week to help with a little research for a blog post I was writing on networking Windows NT 4.0 and Mac OS 8.6. Windows NT 4.0 fired right up with no issue, and after I was done testing what needed to be tested I tried to boot over to Arch.

After a year of disuse, Arch unexpectedly and stubbornly refused to boot. The boot process started up just fine, but towards the end, it declared that it could not mount the root file system on the root device and took a kernel panic and stopped. My Arch skills have gotten a bit rusty in the last few years, but I dusted them off and went to work. My guess was a file system or superblock error. Arch wouldn’t boot, but I dragged out my trusty RIPLinux 2.9 Rescue Live CD and fired it up. It came right up and ran, and I was able to mount the Arch partition and view all the files… everything seemed to be there; it just wouldn’t boot. Windows NT 4.0 AND RIPLinux both boot and run on the machine, so the hardware is fine as well.

A little information on the disk layout. Windows NT 4.0 is in the first partition on the hard drive. The extended partition has a second Windows NT 4.0 partition (sort of a /home partition for Windows NT 4.0), followed by the main Arch partition (the one I am trying to boot), followed by a swap partition and then the largest partition, which I use to share data between Arch and Windows NT 4.0 (I have loaded an ext2/3 driver into Windows NT 4.0 and it happily accesses the Linux partitions on the box).

RIPLinux’s e2fsck did find some issues with the Arch partition and I had it repair them all. I checked again afterwards that all the files were still there, and they were. With the partition now known to be clean, and the superblock repaired from one of the backups, all should have been well. However, Arch still wouldn’t (and still won’t) boot.
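For anyone retracing this step, the backup superblocks sit at predictable offsets. Below is a rough sketch of the arithmetic, assuming the filesystem was created with the sparse_super feature (backups only in group 1 and in groups that are powers of 3, 5 and 7) and with 8 x block-size blocks per group. The function names are made up for illustration. On a real partition, `dumpe2fs /dev/XXX | grep -i superblock` reports the actual locations, and that output should be trusted over this calculation.

```shell
#!/bin/sh
# is_power N BASE: succeeds if N is BASE raised to some power >= 1.
is_power() {
    n=$1
    while [ "$n" -gt 1 ] && [ $((n % $2)) -eq 0 ]; do
        n=$((n / $2))
    done
    [ "$n" -eq 1 ] && [ "$1" -gt 1 ]
}

# backup_superblocks BLOCK_SIZE GROUP_COUNT
# Prints the block number of each backup superblock, one per line.
# Assumes sparse_super and (8 * block size) blocks per group.
backup_superblocks() {
    bs=$1
    groups=$2
    per_group=$((bs * 8))
    # With 1 KiB blocks the superblock sits in block 1, otherwise in block 0.
    if [ "$bs" -eq 1024 ]; then first=1; else first=0; fi
    g=1
    while [ "$g" -lt "$groups" ]; do
        if [ "$g" -eq 1 ] || is_power "$g" 3 || is_power "$g" 5 || is_power "$g" 7; then
            echo $((g * per_group + first))
        fi
        g=$((g + 1))
    done
}

# Example: a filesystem with 1 KiB blocks and 10 block groups.
backup_superblocks 1024 10
```

With 1 KiB blocks the first value printed is 8193, the same number e2fsck suggests in its hint; with 4 KiB blocks the first backup is at 32768 instead.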

RIPLinux has a kind of chain loader function, so I had it attempt to start up Arch for me. However, this was hampered by the fact that Arch addresses all my hard drive partitions as /dev/sdax while RIPLinux addresses them as /dev/hdax. Without a common naming scheme, it was hard to get the one to start the other. Still, using this function, I have been able to get a crippled version of Arch running on the machine again. No modules are loaded, so it can do almost nothing, but there it is: Arch Linux Duke, at the CLI level. From there, I can see all the files and move freely in and out of my user account and the root account, but I can’t make the thing actually boot properly.

If you have read this far, you are a trooper. Summarizing what I know: the hardware is good, the file system is clean, the superblock is good, I can mount it cleanly from a live CD and I can chain load a crippled version of Arch. Here is the boot process blow-by-blow. When I try to do a normal boot, the Windows NT 4.0 loader passes control to the Lilo boot sector I have placed on hda1 (sda1 in Duke’s parlance). Lilo takes over, presents a menu and, when I select Duke, takes off. Arch Linux Duke starts to boot. It gets a good long way along, all the way to:

:: Loading udev events                [Pass]
:: Mount root Read-only
:: Checking file systems

This is where it stops.

The next thing I see is:

/dev/sda6
The superblock could not be read or does not describe a correct ext2 filesystem. If the device is valid and it really contains an ext2 filesystem (and not swap or ufs or something else) then the superblock is corrupt and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>

I then get a sort of character based splash screen that says

**********FILE SYSTEM CHECK FAILED ****************************
*
*   Please repair manually and reboot. Note that the root file system
*   is currently mounted read-only. To remount it read-write, type:
*   mount -n -o remount,rw /.  When you exit the maintenance
*   shell, the system will reboot automatically
*
*****************************************************************************
Give root password for maintenance

At this point, I give the root password and enter the maintenance shell as root. I typed in “mount” and the first entry I got back is

/dev/sda6 on / type ext3 (rw)

This is exactly the root partition that the start up complains about. It is clearly there.  I can see it, I can walk around it… it is clearly there. Why won’t it boot? Despite the message, the superblock is fine – it passes every test e2fsck can throw at it.

At this point, I ran “e2fsck /dev/hda6” (which is how RIPLinux would have passed it into Arch) and it reported the file system as “clean”. I suspect that the superblock message appears because Arch sees root as sda6, while RIP passed it in as hda6...

Deciding to see what Arch would be seeing as it tried to set things up in the boot sequence, I tried the following next:

# mknod /dev/root2 b 3 6

(“3” because RIPLinux refers to my hard drive as IDE, while Arch refers to it by major number “8”, which is SCSI. By the way, it IS an IDE drive – not sure why Arch insists on using the sdx nomenclature instead of hdx)
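The numbers passed to mknod follow the classic static device numbering: disks on the first IDE channel use major 3 with 64 minors per disk, and SCSI-style disks use major 8 with 16 minors per disk. Here is a small sketch of that mapping, covering only hda/hdb and sda through sdp; the function name is invented for illustration.

```shell
#!/bin/sh
# devnum NAME: print "major minor" for a classic block device name such as
# hda6 or sdc6. Only hda, hdb and sda..sdp are handled in this sketch.
devnum() {
    name=$1
    disk=${name%%[0-9]*}     # strip the partition number, e.g. "sdc"
    part=${name#"$disk"}     # what is left, e.g. "6" (empty for a whole disk)
    case $disk in
        hd[ab]|sd[a-p]) ;;
        *) echo "unsupported device: $name" >&2; return 1 ;;
    esac
    letter=${disk#??}        # the drive letter, e.g. "c"
    # Position of the drive letter in the alphabet, counting from 0.
    index=$(( $(printf '%d' "'$letter") - 97 ))
    case $disk in
        hd?) echo "3 $(( index * 64 + ${part:-0} ))" ;;
        sd?) echo "8 $(( index * 16 + ${part:-0} ))" ;;
    esac
}

devnum hda6
devnum sda6
devnum sdc6
```

So `mknod /dev/root2 b 3 6` creates a node for what the IDE driver calls hda6, while the same partition seen through the SCSI layer as sda6 would be `b 8 6`.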

Then I entered “mount /dev/root2 /mnt/hda6” and “ls /mnt/hda6”

All was well. I can make the node, I can mount it, and I can see the contents. All is clearly well, but something is clearly wrong enough that Arch can’t boot.

I am totally out of ideas. I have tried every trick I know and am out of tricks. I would welcome any insights as to what I could try to get this venerable Arch installation back on its legs.

By the way, the key section of the /etc/lilo.conf file (lest anyone want to know) is:

#
image = /boot/vmlinuz26
   root = /dev/sda6
   label = ArchLinux-Duke
   initrd = /boot/kernel26.img
   read-only
#

I am stumped. Thanks in advance for any and all pointers you may be able to offer.

Last edited by mac57 (2014-06-02 17:42:21)


Cast off the Microsoft shackles Jan 2005

#2 2014-04-30 08:59:54

Rexilion
Member
Registered: 2013-12-23
Posts: 784

Re: [SOLVED] Arch Linux Duke (2007) Fails to Boot

You missed out a lot lol. Linux now uses the SCSI subsystem (via libata) as a generic way to talk to IDE drives. Hence, they appear as /dev/sd* nodes even though they are not SCSI devices. This change was made quite a while ago.

My best guess is this: you have a rather old Arch installation, and maybe you once (or twice) booted a newer live CD which mounted your root partition. It is known that when an ext3 filesystem is mounted by a newer driver, new feature flags can be set, which breaks backward compatibility with older drivers.

I suggest you chroot into the Arch installation and update the ext utilities (e2fsprogs). It could be that fsck is stumbling over these new flags.

A less intrusive alternative would be to just disable the fsck altogether. I cannot see that you tried that.


fs/super.c : "Self-destruct in 5 seconds.  Have a nice day...\n",

#3 2014-04-30 16:21:50

karol
Archivist
Registered: 2009-05-06
Posts: 25,440

Re: [SOLVED] Arch Linux Duke (2007) Fails to Boot

If I'm reading it right, you haven't been updating your Arch installation at all, right?

#4 2014-04-30 17:48:18

mac57
Member
From: St. Somewhere
Registered: 2006-01-06
Posts: 302

Re: [SOLVED] Arch Linux Duke (2007) Fails to Boot

Rexilion wrote:

A less intrusive alternative would be to just disable the fsck altogether. I cannot see that you tried that.

Thanks, this sounds like the most efficient test at this point. I am quite comfortable tromping about startup/config files, but I am not sure where the file system check is initiated. Would you be able to advise the name and location of the startup file that does this?


#5 2014-04-30 17:56:27

mac57
Member
From: St. Somewhere
Registered: 2006-01-06
Posts: 302

Re: [SOLVED] Arch Linux Duke (2007) Fails to Boot

karol wrote:

If I'm reading it right, you haven't been updating your Arch installation at all, right?

That is correct. After I got Arch installed, configured and stable, I didn't continue with ongoing updates. Given that the machine is a 200 MHz Pentium Pro, which is a pretty low spec environment to be running a modern OS in, I didn't want to take the ongoing risk of reducing performance and/or breaking it outright as a result of a recent update. Essentially, I froze the system once I had it stable, to keep it stable.

The main reason for putting Arch on the system in the first place was to gain USB and Firewire access to the system (I built in a USB/Firewire combo card for this purpose). Windows NT 4.0 does not support either USB or Firewire due to its early release date, and hence another OS was needed. Arch appeared to be able to run in this very "primitive" hardware environment and so I loaded it, as my enabler for USB and Firewire access. It met this need perfectly, and with astonishingly good performance, and keeping it unconditionally stable then became the paramount concern. Hence after a few weeks of updates, I ceased updating.


#6 2014-04-30 18:09:03

alphaniner
Member
From: Ancapistan
Registered: 2010-07-12
Posts: 2,810

Re: [SOLVED] Arch Linux Duke (2007) Fails to Boot

The superblock could not be read or does not describe a correct ext2 filesystem.

Is that a typo, or is it normal for older fsck to treat ext3 filesystems as ext2?


But whether the Constitution really be one thing, or another, this much is certain - that it has either authorized such a government as we have had, or has been powerless to prevent it. In either case, it is unfit to exist.
-Lysander Spooner

#7 2014-04-30 19:31:16

brain0
Developer
From: Aachen - Germany
Registered: 2005-01-03
Posts: 1,382

Re: [SOLVED] Arch Linux Duke (2007) Fails to Boot

What you are doing here is pointless. Nobody will be able to solve issues with a 7 year old system. Besides, a current Arch Linux will install and work just fine on 512MB RAM (just don't try to run KDE or GNOME on it).

#8 2014-04-30 20:27:47

Rexilion
Member
Registered: 2013-12-23
Posts: 784

Re: [SOLVED] Arch Linux Duke (2007) Fails to Boot

brain0 wrote:

What you are doing here is pointless. Nobody will be able to solve issues with a 7 year old system. Besides, a current Arch Linux will install and work just fine on 512MB RAM (just don't try to run KDE or GNOME on it).

The thread starter indicated that part of the reason for the freeze was to keep things stable. Yes, I would also update a 7 year old system (I'm writing this from an updated system that is about 10.5 years old).

But if others decide not to, that does not void any warranty (there isn't one). And the forum rules do not forbid anyone from asking for help with an aged system.

mac57 wrote:
Rexilion wrote:

A less intrusive alternative would be to just disable the fsck altogether. I cannot see that you tried that.

Thanks, this sounds like the most efficient test at this point. I am quite comfortable tromping about startup/config files, but I am not sure where the file system check is initiated. Would you be able to advise the name and location of the startup file that does this?

There is a binary in PATH called fsck which calls all the others, one for each separate (and known) fs. Mine is in /usr/bin/fsck, but I think yours should be in /sbin/fsck. I suggest you relocate that fsck and create a new fsck as a symlink to the true binary:

ln -s $(which true) /sbin/fsck

Something like that should work, given that I remembered the old location of fsck.
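A slightly more cautious version of the same idea keeps the original binary around so the change can be undone. This is a sketch only: it assumes fsck really is at /sbin/fsck on the old system (unverified, as noted above) and that it is run as root. The function names and the `.real` suffix are arbitrary choices for illustration.

```shell
#!/bin/sh
# true_path: print the location of the true(1) binary, if one is found.
true_path() {
    for t in /bin/true /usr/bin/true; do
        [ -x "$t" ] && { echo "$t"; return 0; }
    done
    return 1
}

# stub_fsck [DIR]: move DIR/fsck aside as DIR/fsck.real and put a symlink
# to true(1) in its place. DIR defaults to /sbin.
stub_fsck() {
    dir=${1:-/sbin}
    if [ -e "$dir/fsck.real" ]; then
        echo "$dir/fsck.real already exists, not touching anything" >&2
        return 1
    fi
    target=$(true_path) || { echo "no true binary found" >&2; return 1; }
    mv "$dir/fsck" "$dir/fsck.real" && ln -s "$target" "$dir/fsck"
}

# restore_fsck [DIR]: undo stub_fsck by putting the saved binary back.
restore_fsck() {
    dir=${1:-/sbin}
    [ -e "$dir/fsck.real" ] || { echo "nothing to restore" >&2; return 1; }
    rm -f "$dir/fsck" && mv "$dir/fsck.real" "$dir/fsck"
}

# Usage (as root), once the real location of fsck is confirmed:
#   stub_fsck /sbin
#   restore_fsck /sbin
```

Keeping the original as fsck.real means a manual check is still possible from the maintenance shell while the boot-time check is disabled.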


#9 2014-04-30 20:35:31

mac57
Member
From: St. Somewhere
Registered: 2006-01-06
Posts: 302

Re: [SOLVED] Arch Linux Duke (2007) Fails to Boot

mac57 wrote:
Rexilion wrote:

A less intrusive alternative would be to just disable the fsck altogether. I cannot see that you tried that.

Thanks, this sounds like the most efficient test at this point. I am quite comfortable tromping about startup/config files, but I am not sure where the file system check is initiated. Would you be able to advise the name and location of the startup file that does this?

There is a binary in PATH called fsck which calls all the others, one for each separate (and known) fs. Mine is in /usr/bin/fsck, but I think yours should be in /sbin/fsck. I suggest you relocate that fsck and create a new fsck as a symlink to the true binary:

ln -s $(which true) /sbin/fsck

Something like that should work, given that I remembered the old location of fsck.

Thanks Rexilion. Actually, I think the idea of disabling the fsck entirely is the easiest place to start. I am quite certain that I could not update just the e2fsprogs package without updating half the rest of the system, due to dependencies.

Does anyone know which startup script launches the fsck? I could simply comment that line out and see what happens. Thanks.


#10 2014-05-01 07:04:17

Rexilion
Member
Registered: 2013-12-23
Posts: 784

Re: [SOLVED] Arch Linux Duke (2007) Fails to Boot

I'm using systemd right now. So I have no idea where the old sysinit scripts are located.

I think that replacing fsck with true is the easiest thing to do in order to disable fsck.

Another option would be (if this is honoured):

man fstab

       The sixth field (fs_passno).
              This field is used by the fsck(8) program to determine the order
              in which filesystem checks are done at reboot  time.   The  root
              filesystem  should be specified with a fs_passno of 1, and other
              filesystems should have a fs_passno of 2.  Filesystems within  a
              drive will be checked sequentially, but filesystems on different
              drives will be checked at the same time to  utilize  parallelism
              available in the hardware.  If the sixth field is not present or
              zero, a value of zero is returned and fsck will assume that  the
              filesystem does not need to be checked.
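
As a rough illustration of that last sentence, the sixth field can be zeroed for the root entry with awk. This sketch prints the modified file rather than editing it in place and uses a made-up sample fstab, not the one from this machine. It assumes every entry has at least five fields, and the function name is hypothetical. Whether the old init scripts honour the field is exactly the open question here.

```shell
#!/bin/sh
# disable_root_fsck FILE: print FILE with the sixth field (fs_passno) of the
# root ("/") entry set to 0. Comments, blank lines and other entries pass
# through unchanged. Note that awk rewrites the root line with single spaces.
disable_root_fsck() {
    awk '
        /^[ \t]*#/ || NF < 2 { print; next }   # comments and blank lines
        $2 == "/" { $6 = 0 }                   # root entry: no boot-time check
        { print }
    ' "$1"
}

# Example with a hypothetical fstab:
sample=$(mktemp)
cat > "$sample" <<'EOF'
# <file system> <dir> <type> <options> <dump> <pass>
/dev/sda6 / ext3 defaults 0 1
/dev/sda7 swap swap defaults 0 0
EOF
disable_root_fsck "$sample"
rm -f "$sample"
```

Redirecting the output to a new file and comparing it with the original before copying it over /etc/fstab keeps the change easy to review and revert.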

Last edited by Rexilion (2014-05-01 07:04:40)


#11 2014-05-01 17:35:08

mac57
Member
From: St. Somewhere
Registered: 2006-01-06
Posts: 302

Re: [SOLVED] Arch Linux Duke (2007) Fails to Boot

Rexilion wrote:

I'm using systemd right now. So I have no idea where the old sysinit scripts are located.

I think that replacing fsck with true is the easiest thing to do in order to disable fsck.

Another option would be (if this is honoured):

man fstab

       The sixth field (fs_passno).
              This field is used by the fsck(8) program to determine the order
              in which filesystem checks are done at reboot  time.   The  root
              filesystem  should be specified with a fs_passno of 1, and other
              filesystems should have a fs_passno of 2.  Filesystems within  a
              drive will be checked sequentially, but filesystems on different
              drives will be checked at the same time to  utilize  parallelism
              available in the hardware.  If the sixth field is not present or
              zero, a value of zero is returned and fsck will assume that  the
              filesystem does not need to be checked.

Very interesting Rexilion - thanks. I will give this a whirl.


#12 2014-06-01 15:37:09

mac57
Member
From: St. Somewhere
Registered: 2006-01-06
Posts: 302

Re: [SOLVED] Arch Linux Duke (2007) Fails to Boot

Folks, thanks for all your helpful comments. I wanted to report back that I finally overcame the issue, and ArchLinux-Duke (2007) is once again running flawlessly on my old Pentium Pro 200 system. I won't bother reporting here all the blind alleys I went down as I tried to figure out what was wrong. In the end, literally moments before I was about to give up and overwrite my Arch installation with a new Linux variant (antiX seemed well suited for such old and low power hardware), my attention was drawn to a note I had made in my files back in 2007 about a problem with similar symptoms. In that case, I had just deleted ZenWalk Linux from the hard drive (both Arch and Zen had been on the drive), and merged several partitions to make use of the newly free space. This had changed Arch's view of the drive lettering, and what had been its /dev/sddx root device was now /dev/sdcx. Arch failed to boot, throwing off the same errors I was seeing now. I wish I had recalled that note a month or so ago! It would have saved me a lot of work and a lot of frustration.

At any rate, as a last step, and testing the idea that maybe the drive lettering had changed for some reason, I repeatedly manually booted Arch, specifying root=/dev/sda6, then /dev/sdb6, then /dev/sdd6, and finally, /dev/sdc6. Eureka! Arch now considered itself to be on /dev/sdc6 whereas previously it had been on /dev/sda6. This got me part way there, but the boot failed at the filesystem check stage and threw me into root. I disabled the file system check in /etc/rc.sysinit and got farther. Then I cleaned up /etc/fstab to agree with the new sdc naming, and I was back on the air fully.
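
The fstab clean-up amounts to a search and replace of the old disk name. Here is a sketch with sed, run against a hypothetical sample file rather than the real /etc/fstab. The function name is illustrative, and the sample entries are not the ones from this machine.

```shell
#!/bin/sh
# rename_disk OLD NEW FILE: print FILE with every /dev/OLD<n> rewritten to
# /dev/NEW<n>. Comment lines are left alone.
rename_disk() {
    sed "/^[[:space:]]*#/!s|/dev/$1\([0-9]\)|/dev/$2\1|g" "$3"
}

# Example with a hypothetical fstab:
sample=$(mktemp)
cat > "$sample" <<'EOF'
# root used to be /dev/sda6
/dev/sda6 / ext3 defaults 0 1
/dev/sda7 swap swap defaults 0 0
EOF
rename_disk sda sdc "$sample"
rm -f "$sample"
```

The function only prints the result, so the output can be checked before it replaces the real file.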

So, what had happened was that Arch had changed its view of the drive it was on from sda6 to sdc6. While I could not understand why this "sudden" change had occurred, at least I had a solution, and had Arch back up and running.

Trolling through the rest of my notes, I found the answer. In 2012, the Tekram SCSI card in the machine failed, and I ultimately replaced it with an Adaptec card. The Tekram card did not have a BIOS segment on it. The Adaptec card did. My guess is that this caused the two internal SCSI devices I have built into the system (Iomega ZIP and Jaz respectively) to be enumerated first, claiming the "sda" and "sdb" device names. That left "sdc" for the root device, and that is where Arch went next. This is my guess anyway.
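
One way to check a guess like this is to ask the kernel which hardware sits behind each sd name. Below is a sketch that reads the vendor and model strings from sysfs. It assumes a 2.6-or-later kernel with /sys mounted; the function name and the optional root argument are illustrative additions, not part of any existing tool.

```shell
#!/bin/sh
# list_sd_disks [SYSROOT]: print "name vendor model" for every sd* disk found
# under SYSROOT/block. SYSROOT defaults to /sys.
list_sd_disks() {
    root=${1:-/sys}
    for d in "$root"/block/sd*; do
        [ -d "$d" ] || continue     # no match: the glob stays literal, skip it
        vendor=$(cat "$d/device/vendor" 2>/dev/null)
        model=$(cat "$d/device/model" 2>/dev/null)
        # Left unquoted on purpose so the padding in these strings is squeezed.
        echo "${d##*/}" $vendor $model
    done
}

list_sd_disks
```

If the removable drives show up as sda and sdb and the hard disk as sdc, that would be consistent with the enumeration order described above.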

I should have caught this issue back in 2012, at the time, but from my notes, I can see that I tested the new card thoroughly using the  Windows NT 4.0 side of the machine, but never thought to bring up Arch as well. Hence, this problem lay dormant for two years, before I attempted to fire up Arch last month and blundered right into it.

It has not all been bad. I have learned more about the ext2 and ext3 file systems and superblocks in the intervening time than I will ever need to use. I have learned how to manually boot Linux on a machine whose BIOS is so old that it cannot address the disk cylinder that the kernel is on and I have completely refreshed the many general Linux skills that used to just flow from my finger tips. It has been a frustrating experience, but ultimately a successful and useful one.

Just wanted to let everyone know that this is now [SOLVED]. I would mark the post as such, but I don't see any obvious way to do that. Thanks again everyone.


#13 2014-06-01 15:50:15

Head_on_a_Stick
Member
From: London
Registered: 2014-02-20
Posts: 7,769

Re: [SOLVED] Arch Linux Duke (2007) Fails to Boot

mac57 wrote:

Just wanted to let everyone know that this is now [SOLVED]. I would mark the post as such, but I don't see any obvious way to do that. Thanks again everyone.

Edit the title of your first post and put "[SOLVED]" at the beginning ;)
Fascinating story BTW --- I might try this method on my dad's old computer...

#14 2014-06-02 17:43:33

mac57
Member
From: St. Somewhere
Registered: 2006-01-06
Posts: 302

Re: [SOLVED] Arch Linux Duke (2007) Fails to Boot

Thanks for the pointer Head_on_a_Stick. I have updated the title to reflect the [SOLVED] status.

