
#1 2019-04-06 07:13:46

krork
Member
Registered: 2013-06-08
Posts: 29

Intermittent initial ramdisk loading failure

Hi!

About every two weeks, my Dell Latitude E6320 laptop hangs on boot. It usually boots again after a hard reboot with the power button, but today it took two hard reboots. After GRUB 2 starts loading my vanilla Arch kernel, it prints "Loading Linux linux.." and then hangs at "Loading initial ramdisk.."

The system resides on an SSD; the smartctl output is at the end of this post. (Perhaps it's an SSD health issue?)

The entry in my fstab for my boot partition is:

UUID=D72E-DAC1      	/boot     	vfat      rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,utf8,errors=remount-ro	0 2 
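To make the entry above easier to read, here is a minimal sketch that splits it into its six fields. The field names follow fstab(5); `parse_fstab_line` is a hypothetical helper, and it assumes all six fields are present, as they are here.

```python
# Field names as used in fstab(5); the helper itself is hypothetical.
FIELDS = ("fs_spec", "fs_file", "fs_vfstype", "fs_mntops", "fs_freq", "fs_passno")

def parse_fstab_line(line):
    """Split one fstab entry into a dict, with the mount options as a list."""
    parts = line.split()  # fields are separated by any run of spaces or tabs
    if len(parts) != len(FIELDS):
        raise ValueError("expected 6 whitespace-separated fields, got %d" % len(parts))
    entry = dict(zip(FIELDS, parts))
    entry["fs_mntops"] = entry["fs_mntops"].split(",")
    return entry

boot_line = (
    "UUID=D72E-DAC1 /boot vfat "
    "rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,"
    "shortname=mixed,utf8,errors=remount-ro 0 2"
)
boot_entry = parse_fstab_line(boot_line)
print(boot_entry["fs_file"], boot_entry["fs_vfstype"], boot_entry["fs_passno"])
```

This only restates what the entry already says: /boot is a vfat partition that is mounted read-write and checked by fsck after the root filesystem.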

The only entry that is not empty in my mkinitcpio.conf is:

HOOKS=(base udev autodetect keyboard modconf block encrypt lvm2 filesystems keyboard fsck)
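Note that the HOOKS line above lists keyboard twice. Whether that has anything to do with the hang is not established. A minimal sketch for spotting repeated hooks, assuming the array is written on a single line as above; `duplicate_hooks` is a hypothetical helper, not part of mkinitcpio:

```python
import re
from collections import Counter

def duplicate_hooks(conf_text):
    """Return the hook names that appear more than once in a HOOKS=(...) line."""
    match = re.search(r"^HOOKS=\(([^)]*)\)", conf_text, flags=re.MULTILINE)
    if match is None:
        return []  # no uncommented single-line HOOKS array found
    counts = Counter(match.group(1).split())
    return [name for name, n in counts.items() if n > 1]

hooks_line = "HOOKS=(base udev autodetect keyboard modconf block encrypt lvm2 filesystems keyboard fsck)"
print(duplicate_hooks(hooks_line))
```

In practice the text would be read from /etc/mkinitcpio.conf rather than from a string literal.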

I have encrypted root and home partitions in a logical volume, which are decrypted during boot, but only some time after the stage where it hangs.

Possibly unusual entries in /etc/default/grub (since the error occurs while loading the initrd, I don't think they matter, though):

GRUB_GFXMODE=1280x1024x24,auto

# Uncomment to allow the kernel use the same resolution used by grub
GRUB_GFXPAYLOAD_LINUX=keep

The error occurs too early for any logs to be written; at least I haven't found any yet.

Any ideas on what could be causing this, or how to debug/fix it?

The output of smartctl -a for the disk is:

smartctl 7.0 2018-12-30 r4883 [x86_64-linux-5.0.2-arch1-1-ARCH] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     LITEONIT LCM-256M3S 2.5" 7mm 256GB
Serial Number:    TW0WKJR2550852CI0098
Add. Product Id:  WRDA
Firmware Version: WRDA
User Capacity:    256,060,514,304 bytes [256 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ATA8-ACS, ATA/ATAPI-7 T13/1532D revision 4a
SATA Version is:  SATA 3.0, 6.0 Gb/s
Local Time is:    Sat Apr  6 08:47:25 2019 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)    Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:         (   10) seconds.
Offline data collection
capabilities:              (0x15) SMART execute Offline immediate.
                    No Auto Offline data collection support.
                    Abort Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    No Selective Self-test supported.
SMART capabilities:            (0x0002)    Does not save SMART data before
                    entering power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x00)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:      (   1) minutes.
Extended self-test routine
recommended polling time:      (  10) minutes.
SCT capabilities:            (0x003d)    SCT Status supported.
                    SCT Error Recovery Control supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0003   100   100   000    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0003   100   100   000    Pre-fail  Always       -       2049
175 Program_Fail_Count_Chip 0x0003   100   100   000    Pre-fail  Always       -       0
176 Erase_Fail_Count_Chip   0x0003   100   100   000    Pre-fail  Always       -       0
177 Wear_Leveling_Count     0x0003   100   100   000    Pre-fail  Always       -       92280
178 Used_Rsvd_Blk_Cnt_Chip  0x0003   100   100   000    Pre-fail  Always       -       0
179 Used_Rsvd_Blk_Cnt_Tot   0x0003   100   100   000    Pre-fail  Always       -       0
180 Unused_Rsvd_Blk_Cnt_Tot 0x0003   100   100   005    Pre-fail  Always       -       1888
181 Program_Fail_Cnt_Total  0x0003   100   100   000    Pre-fail  Always       -       0
182 Erase_Fail_Count_Total  0x0003   100   100   000    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0003   100   100   000    Pre-fail  Always       -       0
195 Hardware_ECC_Recovered  0x0003   100   100   000    Pre-fail  Always       -       0
241 Total_LBAs_Written      0x0003   100   100   000    Pre-fail  Always       -       226375
242 Total_LBAs_Read         0x0003   100   100   000    Pre-fail  Always       -       323248

SMART Error Log Version: 0
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     22787         -
# 2  Short offline       Completed without error       00%       256         -
# 3  Short offline       Completed without error       00%         0         -
# 4  Short offline       Completed without error       00%         0         -
# 5  Short offline       Completed without error       00%         0         -

Selective Self-tests/Logging not supported
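For reading the attribute table above, here is a rough sketch that parses the rows and lists any attribute whose normalized VALUE is at or below its THRESH. That comparison is assumed here to be the condition behind the WHEN_FAILED column. The regular expression and both function names are illustrative and are not part of smartmontools.

```python
import re

# One attribute row of `smartctl -a` output:
# ID, name, flag, VALUE, WORST, THRESH, type, updated, when_failed, raw value.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+\S+\s+\S+\s+\S+\s+(\d+)"
)

def parse_attributes(text):
    """Return one dict per attribute row; all other lines are ignored."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append({
                "id": int(m.group(1)),
                "name": m.group(2),
                "value": int(m.group(3)),
                "worst": int(m.group(4)),
                "thresh": int(m.group(5)),
                "raw": int(m.group(6)),
            })
    return rows

def at_or_below_threshold(rows):
    """Names of attributes whose normalized value has reached the threshold."""
    return [r["name"] for r in rows if r["value"] <= r["thresh"]]

sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0003   100   100   000    Pre-fail  Always       -       0
180 Unused_Rsvd_Blk_Cnt_Tot 0x0003   100   100   005    Pre-fail  Always       -       1888
"""
rows = parse_attributes(sample)
print(at_or_below_threshold(rows))
```

Every attribute in the table above reports a VALUE of 100 against a THRESH of 000 or 005, so none of them is flagged, which matches the PASSED self-assessment.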

Last edited by krork (2019-04-06 07:37:24)


#2 2019-04-06 07:16:42

jasonwryan
Anarchist
From: .nz
Registered: 2009-05-09
Posts: 30,424

Re: Intermittent initial ramdisk loading failure

Please use code tags when pasting to the boards: https://wiki.archlinux.org/index.php/Co … s_and_code


Arch + dwm   •   Mercurial repos  •   Surfraw

Registered Linux User #482438
