
#1 2015-02-14 06:12:22

bicyclingrevolution
Member
Registered: 2010-10-18
Posts: 71

Drives disappear between UEFI and Linux

This problem has had me stumped for a month, so I'm hoping someone can shed some light on it.
Originally, I had an MSI motherboard with BIOS only, GRUB2 booting Arch and Windows 7, no problems whatsoever. I upgraded to an Asus motherboard with UEFI, and at first Arch booted fine (Windows just blue screened). Subsequent boots would randomly fail, sometimes telling me Arch couldn't find the root drive, sometimes the home drive, and other times it would boot successfully. There was no telling how many times I would need to reboot before Arch would find all the drives. I returned the Asus board and replaced it with a Gigabyte Z97-D3H, but the problem continued.
Finally this Tuesday I reinstalled Arch in UEFI mode, and instead of a bootloader, UEFI boots Linux directly, as shown in the "Using UEFI directly" section of EFISTUB. Still, Arch randomly fails to find my drives.
Here is my drive layout when Arch boots properly (when it doesn't, lsblk also can't see the missing drive):

┌── kiba ⟶ Paradise   ~
└───── lsblk -o name,fstype,size,label,mountpoint
NAME   FSTYPE   SIZE LABEL  MOUNTPOINT
sda            59.6G        
├─sda1 vfat     512M SHAMAN /boot
├─sda2 ext4    51.1G Hige   /
└─sda3 swap       8G swap   [SWAP]
sdb           119.2G        
└─sdb1 ext4   119.2G Kiba   /home
sdc           465.8G        
├─sdc1 ntfs     128G Tsume  
└─sdc2 ext4   337.8G Toboe  /home/kiba/Toboe

And /etc/fstab:

# <file system> <dir>   <type>  <options>       <dump>  <pass>
# /dev/sda2 UUID=67de5f82-861e-4934-94ba-9fdde2225bb1
LABEL=Hige              /               ext4            rw,noatime,discard,data=ordered 0 1

# /dev/sda1 UUID=6A35-D4DE
LABEL=SHAMAN            /boot           vfat            rw,noatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro 0 2

# /dev/sdb1 UUID=c89186ab-c005-4598-a5b9-2be4d0d6202c
LABEL=Kiba              /home           ext4            rw,noatime,discard,data=ordered 0 2

# /dev/sda3 UUID=0ee5665b-13e3-4592-bdf1-5701636697f3
LABEL=swap              none            swap            defaults        0 0

LABEL=Toboe             /home/kiba/Toboe ext4           defaults        0 0
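
Since every entry above is mounted by LABEL, one quick way to see which of them has no matching device after a failed boot might be the sketch below. It assumes udev populates /dev/disk/by-label and that the labels contain no whitespace; the function name check_fstab_labels is made up for illustration and is not part of any package.

```shell
#!/bin/sh
# Hypothetical helper: list every LABEL= source in an fstab-style file and
# report whether a matching entry exists in the by-label directory.
# Usage: check_fstab_labels [FSTAB] [LABEL_DIR]
check_fstab_labels() {
    fstab=${1:-/etc/fstab}
    label_dir=${2:-/dev/disk/by-label}
    missing=0
    # Commented lines never match because their first field starts with '#'.
    for src in $(awk '$1 ~ /^LABEL=/ { print $1 }' "$fstab"); do
        label=${src#LABEL=}
        if [ -e "$label_dir/$label" ]; then
            echo "ok      $label"
        else
            echo "MISSING $label"
            missing=1
        fi
    done
    # Non-zero exit status when at least one label has no device.
    return "$missing"
}
```

Run from the emergency shell as `check_fstab_labels` with no arguments, it would read /etc/fstab and print one line per label, which is quicker to scan than the full lsblk output.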


#2 2015-02-14 12:08:06

Head_on_a_Stick
Member
From: The Wirral
Registered: 2014-02-20
Posts: 9,003

Re: Drives disappear between UEFI and Linux

Maybe a hardware problem with the troublesome drive?

You could try a smartctl test on it.


Jin, Jîyan, Azadî


#3 2015-02-14 20:07:36

bicyclingrevolution
Member
Registered: 2010-10-18
Posts: 71

Re: Drives disappear between UEFI and Linux

Oops, I forgot to mention that. As far as I can tell, the SMART status is fine on all the drives, but then I could be missing something.
Here's the output for the root drive:

┌── kiba ⟶ Paradise   ~                                      11:49:50
└───── s smartctl -t long /dev/sda && sleep 61 && s smartctl -a /dev/sda
smartctl 6.3 2014-07-26 r3976 [x86_64-linux-3.18.6-1-ARCH] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 1 minutes for test to complete.
Test will complete after Sat Feb 14 11:52:27 2015

Use smartctl -X to abort test.
smartctl 6.3 2014-07-26 r3976 [x86_64-linux-3.18.6-1-ARCH] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     JMicron based SSDs
Device Model:     KINGSTON SV100S264G
Serial Number:    08AAA0003175
Firmware Version: D100811a
User Capacity:    64,023,257,088 bytes [64.0 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Sat Feb 14 11:52:28 2015 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline                                                
data collection:                (   30) seconds.                              
Offline data collection                                                       
capabilities:                    (0x1b) SMART execute Offline immediate.      
                                        Auto Offline data collection on/off support.                                                                        
                                        Suspend Offline collection upon new   
                                        command.                              
                                        Offline surface scan supported.       
                                        Self-test supported.                  
                                        No Conveyance Self-test supported.    
                                        No Selective Self-test supported.     
SMART capabilities:            (0x0003) Saves SMART data before entering      
                                        power-saving mode.                    
                                        Supports SMART auto save timer.       
Error logging capability:        (0x01) Error logging supported.              
                                        General Purpose Logging supported.    
Short self-test routine                                                       
recommended polling time:        (   1) minutes.                              
Extended self-test routine                                                    
recommended polling time:        (   1) minutes.                              
                                                                              
SMART Attributes Data Structure revision number: 16                           
Vendor Specific SMART Attributes with Thresholds:                             
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE                                                            
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0                                                                    
  2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0                                                                    
  3 Unknown_Attribute       0x0007   100   100   050    Pre-fail  Always       -       0                                                                    
  5 Reallocated_Sector_Ct   0x0013   100   100   050    Pre-fail  Always       -       0                                                                    
  7 Unknown_Attribute       0x000b   100   100   050    Pre-fail  Always       -       0                                                                    
  8 Unknown_Attribute       0x0005   100   100   050    Pre-fail  Offline      -       0                                                                    
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       13005                                                                
 10 Unknown_Attribute       0x0013   100   100   050    Pre-fail  Always       -       0                                                                    
 12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       3565                                                                 
168 SATA_Phy_Error_Count    0x0012   100   100   000    Old_age   Always       -       15                                                                   
175 Bad_Cluster_Table_Count 0x0003   100   100   010    Pre-fail  Always       -       0
192 Unexpect_Power_Loss_Ct  0x0012   100   100   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   031   100   020    Old_age   Always       -       31 (Min/Max 23/40)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
240 Unknown_Attribute       0x0013   100   100   050    Pre-fail  Always       -       0
170 Bad_Block_Count         0x0003   100   100   010    Pre-fail  Always       -       0 89 0
173 Erase_Count             0x0012   100   100   000    Old_age   Always       -       5 9441 7415

SMART Error Log Version: 1
ATA Error Count: 15 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 15 occurred at disk power-on lifetime: 12999 hours (541 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 00 00 00 a0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ec 00 00 00 00 00 a0 08      00:11:20.100  IDENTIFY DEVICE
  b1 c1 00 00 00 00 00 ff      00:11:19.800  DEVICE CONFIGURATION FREEZE LOCK [OBS-ACS-3]
  f5 00 00 00 00 00 00 ff      00:11:19.800  SECURITY FREEZE LOCK
  ec 00 00 00 00 00 00 00      00:11:17.600  IDENTIFY DEVICE
  ec 00 00 00 00 00 00 00      00:11:17.600  IDENTIFY DEVICE

Error 14 occurred at disk power-on lifetime: 12999 hours (541 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 00 00 00 a0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ec 00 00 00 00 00 a0 08      00:15:36.300  IDENTIFY DEVICE
  ec 00 00 00 00 00 00 ff      00:15:35.900  IDENTIFY DEVICE
  ec 00 00 00 00 00 00 00      00:15:33.700  IDENTIFY DEVICE
  ec 00 00 00 00 00 00 00      00:15:33.700  IDENTIFY DEVICE
  ec 00 00 00 00 00 00 00      00:15:33.700  IDENTIFY DEVICE

Error 13 occurred at disk power-on lifetime: 12986 hours (541 days + 2 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 00 00 00 00

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ec 00 00 00 00 00 00 00      00:03:08.200  IDENTIFY DEVICE
  ec 00 00 00 00 00 00 ff      00:03:02.700  IDENTIFY DEVICE
  ec 00 00 00 00 00 00 ff      00:02:46.000  IDENTIFY DEVICE
  ec 00 00 00 00 00 00 ff      00:02:41.400  IDENTIFY DEVICE
  ec 00 00 00 00 00 00 ff      00:02:24.700  IDENTIFY DEVICE

Error 12 occurred at disk power-on lifetime: 12763 hours (531 days + 19 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 00 00 00 00

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ec 00 00 00 00 00 00 00      00:00:35.900  IDENTIFY DEVICE
  00 00 00 00 00 00 00 ff      00:00:29.400  NOP [Abort queued commands]
  00 00 00 00 00 00 00 ff      00:00:12.400  NOP [Abort queued commands]
  00 00 00 00 00 00 00 ff      00:00:00.000  NOP [Abort queued commands]

Error 11 occurred at disk power-on lifetime: 12706 hours (529 days + 10 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 00 00 00 00

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ec 00 00 00 00 00 00 00      00:02:30.800  IDENTIFY DEVICE
  e7 00 00 00 00 00 a0 ff      00:02:11.900  FLUSH CACHE
  e7 00 00 00 00 00 a0 08      00:01:44.500  FLUSH CACHE
  e7 00 00 00 00 00 a0 08      00:01:43.300  FLUSH CACHE
  e7 00 00 00 00 00 a0 08      00:01:40.300  FLUSH CACHE

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     13005         -
# 2  Extended offline    Completed without error       00%     13005         -

Selective Self-tests/Logging not supported

And for the home drive:

┌── kiba ⟶ Paradise   ~                                      11:52:28
└───── s smartctl -t long /dev/sdb && sleep 61 && s smartctl -a /dev/sdb 
smartctl 6.3 2014-07-26 r3976 [x86_64-linux-3.18.6-1-ARCH] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 1 minutes for test to complete.
Test will complete after Sat Feb 14 11:54:25 2015

Use smartctl -X to abort test.
smartctl 6.3 2014-07-26 r3976 [x86_64-linux-3.18.6-1-ARCH] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     JMicron based SSDs
Device Model:     KINGSTON SV100S2128G
Serial Number:    08BB20039237
Firmware Version: D110225a
User Capacity:    128,035,676,160 bytes [128 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Sat Feb 14 11:54:26 2015 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (   30) seconds.
Offline data collection
capabilities:                    (0x1b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (   1) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0
  3 Unknown_Attribute       0x0007   100   100   050    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0013   100   100   050    Pre-fail  Always       -       0
  7 Unknown_Attribute       0x000b   100   100   050    Pre-fail  Always       -       0
  8 Unknown_Attribute       0x0005   100   100   050    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       9899
 10 Unknown_Attribute       0x0013   100   100   050    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       3541
168 SATA_Phy_Error_Count    0x0012   100   100   000    Old_age   Always       -       13
175 Bad_Cluster_Table_Count 0x0003   100   100   010    Pre-fail  Always       -       0
192 Unexpect_Power_Loss_Ct  0x0012   100   100   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   034   100   020    Old_age   Always       -       34 (Min/Max 23/40)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
240 Unknown_Attribute       0x0013   100   100   050    Pre-fail  Always       -       0
170 Bad_Block_Count         0x0003   100   100   010    Pre-fail  Always       -       0 135 0
173 Erase_Count             0x0012   100   100   000    Old_age   Always       -       2 16503 12094

SMART Error Log Version: 1
ATA Error Count: 37 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 37 occurred at disk power-on lifetime: 9893 hours (412 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 00 00 00 a0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ec 00 00 00 00 00 a0 08      00:00:09.200  IDENTIFY DEVICE
  ec 00 00 00 00 00 00 ff      00:00:08.800  IDENTIFY DEVICE
  ec 00 00 00 00 00 00 ff      00:00:08.800  IDENTIFY DEVICE
  ec 00 00 00 00 00 00 00      00:00:06.600  IDENTIFY DEVICE
  ec 00 00 00 00 00 00 00      00:00:06.600  IDENTIFY DEVICE

Error 36 occurred at disk power-on lifetime: 9880 hours (411 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 00 00 00 00

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ec 00 00 00 00 00 00 00      00:09:55.000  IDENTIFY DEVICE
  ec 00 00 00 00 00 a0 ff      00:09:48.800  IDENTIFY DEVICE
  e7 00 00 00 00 00 a0 08      00:09:47.900  FLUSH CACHE
  ec 00 01 00 00 00 00 08      00:04:00.000  IDENTIFY DEVICE
  ec 00 01 00 00 00 00 08      00:03:56.100  IDENTIFY DEVICE

Error 35 occurred at disk power-on lifetime: 9855 hours (410 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 00 00 00 00

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ec 00 00 00 00 00 00 00      00:03:35.400  IDENTIFY DEVICE
  ec 00 00 00 00 00 00 ff      00:03:29.200  IDENTIFY DEVICE
  ec 00 00 00 00 00 00 ff      00:03:24.600  IDENTIFY DEVICE
  ec 00 00 00 00 00 00 00      00:01:00.000  IDENTIFY DEVICE
  ec 00 00 00 00 00 00 00      00:01:00.000  IDENTIFY DEVICE

Error 34 occurred at disk power-on lifetime: 9168 hours (382 days + 0 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 00 00 00 00

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ec 00 00 00 00 00 00 00      00:00:21.800  IDENTIFY DEVICE
  00 00 00 00 00 00 00 ff      00:00:21.800  NOP [Abort queued commands]
  00 00 00 00 00 00 00 ff      00:00:00.000  NOP [Abort queued commands]

Error 33 occurred at disk power-on lifetime: 9165 hours (381 days + 21 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 00 00 00 a0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ec 00 00 00 00 00 a0 00      00:18:36.900  IDENTIFY DEVICE
  a1 00 00 00 00 00 a0 00      00:18:36.900  IDENTIFY PACKET DEVICE
  ec 00 00 00 00 00 a0 ff      00:18:36.900  IDENTIFY DEVICE
  ec 00 00 00 00 00 a0 ff      00:18:00.100  IDENTIFY DEVICE
  ec 00 00 00 00 00 a0 ff      00:17:49.100  IDENTIFY DEVICE

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      9899         -
# 2  Extended offline    Completed without error       00%      9894         -

Selective Self-tests/Logging not supported

They are both SSDs, while the extra drive is an HDD. I don't think I've ever had an issue with the HDD showing up, so maybe this is an SSD-specific problem?
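
The two reports above are long, and the overall result is PASSED on both drives, so it may help to pull out only the counters that relate to the SATA link rather than to the flash itself. The sketch below does that for `smartctl -a` output read from standard input. The function name smart_link_summary is made up for illustration, and the field positions are assumed from the layout of the output pasted above; other drives or smartctl versions may format these lines differently.

```shell
#!/bin/sh
# Hypothetical helper: read `smartctl -a` output on stdin and print only
# the link-related counters seen in the reports above.
smart_link_summary() {
    awk '
        # Attribute table row: the raw value is the 10th field.
        $2 == "SATA_Phy_Error_Count" { print "phy_errors=" $10 }
        # Error log header, e.g. "ATA Error Count: 15 (device log ...)".
        /^ATA Error Count:/          { print "ata_errors=" $4 }
    '
}
```

Usage would be something like `sudo smartctl -a /dev/sdb | smart_link_summary`. Comparing the numbers before and after a failed boot could show whether the counters grow each time a drive goes missing.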


#4 2015-02-14 20:12:18

Head_on_a_Stick
Member
From: The Wirral
Registered: 2014-02-20
Posts: 9,003

Re: Drives disappear between UEFI and Linux

Have you checked the connectors and cables?


Jin, Jîyan, Azadî


#5 2015-02-15 02:20:16

bicyclingrevolution
Member
Registered: 2010-10-18
Posts: 71

Re: Drives disappear between UEFI and Linux

I just did; everything is connected firmly and the cables aren't damaged in any way. I booted up and Arch failed to find the home drive, so I actually unplugged it and plugged it back in (possibly a bad idea...). lsblk then showed it as mounted on /home! I simply ran "systemctl default" and got to the desktop. Here's journalctl's output of that:

Feb 14 17:19:55 Paradise systemd[1]: Job dev-disk-by\x2dlabel-Kiba.device/start timed out.
Feb 14 17:19:55 Paradise systemd[1]: Timed out waiting for device dev-disk-by\x2dlabel-Kiba.device.
Feb 14 17:19:55 Paradise systemd[1]: Dependency failed for File System Check on /dev/disk/by-label/Kiba.
Feb 14 17:19:55 Paradise systemd[1]: Dependency failed for /home.
Feb 14 17:19:55 Paradise systemd[1]: Dependency failed for /home/kiba/Toboe.
Feb 14 17:19:55 Paradise systemd[1]: Dependency failed for Local File Systems.
Feb 14 17:19:55 Paradise systemd[1]: Job local-fs.target/start failed with result 'dependency'.
Feb 14 17:19:55 Paradise systemd[1]: Triggering OnFailure= dependencies of local-fs.target.
Feb 14 17:19:55 Paradise systemd[1]: Job home-kiba-Toboe.mount/start failed with result 'dependency'.
Feb 14 17:19:55 Paradise systemd[1]: Job home.mount/start failed with result 'dependency'.
Feb 14 17:19:55 Paradise systemd[1]: Job systemd-fsck@dev-disk-by\x2dlabel-Kiba.service/start failed with result 'dependency'.
Feb 14 17:19:55 Paradise systemd[1]: Job dev-disk-by\x2dlabel-Kiba.device/start failed with result 'timeout'.
Feb 14 17:19:55 Paradise systemd[1]: Stopped Run reflector weekly.
Feb 14 17:19:55 Paradise systemd[1]: Stopped Bluetooth service.
Feb 14 17:19:55 Paradise systemd[1]: Starting Bluetooth.
Feb 14 17:19:55 Paradise systemd[1]: Reached target Bluetooth.
Feb 14 17:19:55 Paradise systemd[1]: Stopped target Graphical Interface.
Feb 14 17:19:55 Paradise systemd[1]: Stopped target Multi-User System.
Feb 14 17:19:55 Paradise systemd[1]: Stopped dhcpcd on eno1.
Feb 14 17:19:55 Paradise systemd[1]: Starting Network.
Feb 14 17:19:55 Paradise systemd[1]: Reached target Network.
Feb 14 17:19:55 Paradise systemd[1]: Stopped Daily rotation of log files.
Feb 14 17:19:55 Paradise systemd[1]: Stopped Login Service.
Feb 14 17:19:55 Paradise systemd[1]: Stopped 32-bit chroot.
Feb 14 17:19:55 Paradise systemd[1]: Stopped Daily man-db cache update.
Feb 14 17:19:55 Paradise systemd[1]: Stopped D-Bus System Message Bus.
Feb 14 17:19:55 Paradise systemd[1]: Closed D-Bus System Message Bus Socket.
Feb 14 17:19:55 Paradise systemd[1]: Stopped Simple Desktop Display Manager.
Feb 14 17:19:55 Paradise systemd[1]: Stopped Permit User Sessions.
Feb 14 17:19:55 Paradise systemd[1]: Stopped target Basic System.
Feb 14 17:19:55 Paradise systemd[1]: Starting Sockets.
Feb 14 17:19:55 Paradise systemd[1]: Reached target Sockets.
Feb 14 17:19:55 Paradise systemd[1]: Stopped Daily verification of password and group files.
Feb 14 17:19:55 Paradise systemd[1]: Stopped Daily Cleanup of Temporary Directories.
Feb 14 17:19:55 Paradise systemd[1]: Stopped target System Initialization.
Feb 14 17:19:55 Paradise systemd[1]: Started Manage Sound Card State (restore and store).
Feb 14 17:19:55 Paradise systemd[1]: Starting Restore Sound Card State...
Feb 14 17:19:55 Paradise systemd[1]: Started Rebuild Journal Catalog.
Feb 14 17:19:55 Paradise systemd[1]: Started Update is Completed.
Feb 14 17:19:55 Paradise systemd[1]: Started Commit a transient machine-id on disk.
Feb 14 17:19:55 Paradise systemd[1]: Starting Create Volatile Files and Directories...
Feb 14 17:19:55 Paradise systemd[1]: Starting Timers.
Feb 14 17:19:55 Paradise systemd[1]: Reached target Timers.
Feb 14 17:19:55 Paradise systemd[1]: Starting Emergency Shell...
Feb 14 17:19:55 Paradise systemd[1]: Started Emergency Shell.
Feb 14 17:19:55 Paradise systemd[1]: Starting Emergency Mode.
Feb 14 17:19:55 Paradise systemd[1]: Reached target Emergency Mode.
Feb 14 17:19:55 Paradise systemd[1]: Started Restore Sound Card State.
Feb 14 17:19:55 Paradise systemd[1]: Started Create Volatile Files and Directories.
Feb 14 17:19:55 Paradise systemd[1]: Starting Update UTMP about System Boot/Shutdown...
Feb 14 17:19:55 Paradise systemd[1]: Started Update UTMP about System Boot/Shutdown.
Feb 14 17:19:55 Paradise systemd[1]: Startup finished in 3.227s (kernel) + 1min 30.486s (userspace) = 1min 33.714s.
Feb 14 17:19:55 Paradise systemd[401]: Failed at step EXEC spawning /bin/plymouth: No such file or directory
######  This is when I unplugged the home drive and plugged it back in, the following three lines appeared in the console.  ######
Feb 14 17:20:15 Paradise kernel: ata3: exception Emask 0x10 SAct 0x0 SErr 0x4040000 action 0xe frozen
Feb 14 17:20:15 Paradise kernel: ata3: irq_stat 0x00000040, connection status changed
Feb 14 17:20:15 Paradise kernel: ata3: SError: { CommWake DevExch }
Feb 14 17:20:15 Paradise kernel: ata3: hard resetting link
Feb 14 17:20:15 Paradise kernel: ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Feb 14 17:20:15 Paradise kernel: ata3.00: ATA-8: KINGSTON SV100S2128G, D110225a, max UDMA/100
Feb 14 17:20:15 Paradise kernel: ata3.00: 250069680 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
Feb 14 17:20:15 Paradise kernel: ata3.00: configured for UDMA/100
Feb 14 17:20:15 Paradise kernel: ata3: EH complete
Feb 14 17:20:15 Paradise kernel: scsi 2:0:0:0: Direct-Access     ATA      KINGSTON SV100S2 225a PQ: 0 ANSI: 5
Feb 14 17:20:15 Paradise kernel: sd 2:0:0:0: [sdg] 250069680 512-byte logical blocks: (128 GB/119 GiB)
Feb 14 17:20:15 Paradise kernel: sd 2:0:0:0: [sdg] Write Protect is off
Feb 14 17:20:15 Paradise kernel: sd 2:0:0:0: [sdg] Mode Sense: 00 3a 00 00
Feb 14 17:20:15 Paradise kernel: sd 2:0:0:0: [sdg] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Feb 14 17:20:15 Paradise kernel:  sdg: sdg1
Feb 14 17:20:15 Paradise kernel: sd 2:0:0:0: [sdg] Attached SCSI disk
Feb 14 17:20:15 Paradise systemd[1]: Found device KINGSTON_SV100S2128G Kiba.
Feb 14 17:20:15 Paradise systemd[1]: Found device KINGSTON_SV100S2128G Kiba.
Feb 14 17:20:15 Paradise systemd[1]: Found device KINGSTON_SV100S2128G Kiba.
Feb 14 17:20:15 Paradise systemd[1]: Starting File System Check on /dev/disk/by-label/Kiba...
Feb 14 17:20:15 Paradise systemd-fsck[421]: Kiba: clean, 216140/7815168 files, 15219188/31258368 blocks
Feb 14 17:20:15 Paradise systemd[1]: Started File System Check on /dev/disk/by-label/Kiba.
Feb 14 17:20:15 Paradise systemd[1]: Mounting /home...
Feb 14 17:20:15 Paradise systemd[1]: Mounted /home.
Feb 14 17:20:15 Paradise kernel: EXT4-fs (sdg1): mounted filesystem with ordered data mode. Opts: discard,data=ordered

Then I rebooted, and this time the root drive wasn't found. Unplugging and replugging it didn't work; I had to power cycle the computer. Then it booted up without seeing the home drive, and even after I replugged it three times, Arch didn't see it. I power cycled again, and it booted properly.
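
To compare a good boot with a bad one, it might help to reduce the kernel log to just the SATA port events, like the ata3 lines in the journal above. The sketch below is one way to do that; the function name ata_link_events is made up for illustration, and the pattern only covers the message types that appear in the log above, so other libata messages would be missed.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and keep the ones
# that report a libata port exception, SError, link reset or link speed.
ata_link_events() {
    grep -E 'ata[0-9]+(\.[0-9]+)?: (exception|SError|hard resetting link|SATA link)'
}
```

Usage would be something like `journalctl -k -b | ata_link_events` for the current boot. If the port of a missing drive shows no "SATA link up" line at all on a bad boot, that would suggest the link never came up in the first place rather than dropping later.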


#6 2015-02-15 13:44:11

firekage
Member
From: Eastern Europe, Poland
Registered: 2013-06-30
Posts: 624

Re: Drives disappear between UEFI and Linux

I have the same Z97 HD3 mainboard. Have you checked the SATA cables? I had the same problem with my root SSD not being detected. After a reset, and sometimes after a hard freeze (power off), my SSD disappeared and I could not boot my machine. Yesterday I swapped the SATA cable for a new one and the problem went away: I did many resets and it works fine.


You said that you checked the cables and that they are fine. In my opinion SATA cables are quite fragile (the cord itself is rock solid but the connectors are not; I think with the old PATA cables it was the other way around, the cord was more fragile but the connectors were rock solid). I would try a new cable on the root drive.

In my opinion you have a problem with either the drives or the cables, because your output shows that a dependency failed for the disk drives. BTW, are these normal hard drives, not SSDs? Do you have a good-quality power supply? Hard drives are sensitive to the voltage supplied to them (12V).

Last edited by firekage (2015-02-15 13:48:55)

Offline

#7 2015-02-16 06:15:04

bicyclingrevolution
Member
Registered: 2010-10-18
Posts: 71

Re: Drives disappear between UEFI and Linux

I unplugged all the drives except root, and then tried booting with each of the five SATA cables I have, starting with the cable that my HDD was using, because my HDD always shows up. No matter which cable I used, bootup would fail about half of the time.
At this point all I can think of doing is installing Windows with the same drive setup (does Windows even allow putting users' directories on another drive?). That should show whether the SSDs are failing.

Offline

#8 2015-03-01 10:15:33

bicyclingrevolution
Member
Registered: 2010-10-18
Posts: 71

Re: Drives disappear between UEFI and Linux

I have installed Windows on both SSDs, and it looks like my 64 GB one is dying, as Windows wanted to run a filesystem check on it at every boot. The 128 GB one didn't have any problems, so I wiped it and installed Arch, but more often than not it would fail to boot with this output:

:: running early hook [udev]
starting version 218
:: running hook [udev]
:: Triggering uevents...
error: /dev/sda: No medium found
error: /dev/sdc: No medium found
error: /dev/sdb: No medium found
error: /dev/sdd: No medium found
error: /dev/sda: No medium found
error: /dev/sdc: No medium found
error: /dev/sdb: No medium found
error: /dev/sdd: No medium found
:: running hook [resume]
ERROR: resume: no device specified for hibernation
Waiting 10 seconds for device /dev/sda2 ...
ERROR: device '/dev/sda2' not found. Skipping fsck.
ERROR: Unable to find root device '/dev/sda2'.
You are being dropped to a recovery shell
    Type 'exit' to try and continue booting
sh: can't access tty: job control turned off
[rootfs /]# _

I have now installed Arch on my hard disk drive, and there are no boot issues. I reinstalled Windows on the 128 GB SSD, and it is continuing to work fine. However, Arch is completely unaware that the drive exists, so os-prober can't add a Windows entry to grub (I have to use the UEFI boot menu to boot Windows).

Offline

#9 2015-03-01 13:44:03

clfarron4
Member
From: London, UK
Registered: 2013-06-28
Posts: 2,175
Website

Re: Drives disappear between UEFI and Linux

Have you tried using the partition UUID/GUIDs instead of labels and block device paths?
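
To illustrate what switching from labels to UUIDs means for /etc/fstab, here is a minimal Python sketch. The helper name fstab_label_to_uuid and the LABEL_TO_UUID mapping are made up for this example; the label and UUID values are copied from the fstab comments in the first post. On a real system the UUIDs would come from blkid or lsblk -f, and genfstab -U writes UUID entries directly, so treat this only as a picture of the change, not as a tool to run against a live fstab.

```python
# Hypothetical mapping, transcribed from the fstab comments in post #1.
LABEL_TO_UUID = {
    "Hige": "67de5f82-861e-4934-94ba-9fdde2225bb1",
    "SHAMAN": "6A35-D4DE",
}


def fstab_label_to_uuid(line, label_to_uuid):
    """Rewrite the first field of an fstab entry from LABEL=... to UUID=...

    Comments, blank lines, entries that do not use LABEL=, and labels
    missing from the mapping are returned unchanged. A rewritten entry
    has its trailing whitespace stripped.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return line
    fields = line.split(None, 1)  # device spec, then the rest of the entry
    spec = fields[0]
    if not spec.startswith("LABEL="):
        return line
    uuid = label_to_uuid.get(spec[len("LABEL="):])
    if uuid is None:
        return line
    rest = fields[1] if len(fields) > 1 else ""
    return f"UUID={uuid} {rest}".rstrip()


entry = "LABEL=Hige / ext4 rw,noatime,discard,data=ordered 0 1"
print(fstab_label_to_uuid(entry, LABEL_TO_UUID))
# prints: UUID=67de5f82-861e-4934-94ba-9fdde2225bb1 / ext4 rw,noatime,discard,data=ordered 0 1
```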


Claire is fine.
Problems? I have dysgraphia, so clear and concise please.
My public GPG key for package signing
My x86_64 package repository

Offline

#10 2015-03-02 08:50:07

bicyclingrevolution
Member
Registered: 2010-10-18
Posts: 71

Re: Drives disappear between UEFI and Linux

I had not actually tried that, so I just reinstalled Arch on the 128 GB SSD (replacing Windows entirely). I used genfstab -U and installed grub following its wiki page for UEFI systems, and it does generate grub.cfg with UUIDs in the kernel arguments. Then I exited the live USB, booted the new system, and was greeted with:

starting version 218
error: /dev/sdc: No medium found
error: /dev/sdd: No medium found
error: /dev/sde: No medium found
error: /dev/sdb: No medium found
error: /dev/sdc: No medium found
error: /dev/sdd: No medium found
error: /dev/sde: No medium found
error: /dev/sdb: No medium found
ERROR: device 'UUID=a9c1500b-9c94-4996-89f6-2f863c84027d' not found. Skipping fsck.
ERROR: Unable to find root device 'UUID=a9c1500b-9c94-4996-89f6-2f863c84027d'.
You are being dropped to a recovery shell
   Type 'exit' to try and continue booting
sh: can't access tty: job control turned off
[rootfs /]# _
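
A small Python sketch for summarising which device nodes report "No medium found" in output like the above. The function name devices_without_medium is made up for this example, and the input is just the error lines transcribed from this boot. Kernel names such as /dev/sdb are not guaranteed to stay the same from one boot to the next, so comparing the reported set across boots only shows how the names moved, not which physical drive is affected.

```python
import re

# Error lines transcribed from the failed boot shown above.
boot_output = """\
error: /dev/sdc: No medium found
error: /dev/sdd: No medium found
error: /dev/sde: No medium found
error: /dev/sdb: No medium found
error: /dev/sdc: No medium found
error: /dev/sdd: No medium found
error: /dev/sde: No medium found
error: /dev/sdb: No medium found
"""


def devices_without_medium(text):
    """Return the sorted, de-duplicated device paths named in
    'No medium found' error lines."""
    pattern = re.compile(r"^error: (/dev/\S+): No medium found$", re.MULTILINE)
    return sorted(set(pattern.findall(text)))


print(devices_without_medium(boot_output))
# prints: ['/dev/sdb', '/dev/sdc', '/dev/sdd', '/dev/sde']
```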

Second boot was the same, but the third actually worked with no errors I could see. The fourth failed, and at this point, I'm going to put this in a table for readability:

Boot  Result
1     Failed
2     Failed
3     Worked
4     Failed
5     Failed
6     Worked
7     Worked
8     Failed
9     Failed
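
For reference, that works out to 3 successful boots out of 9, with never more than two failures in a row. A quick Python sketch of the tally (the list simply transcribes the table; the variable names are made up for illustration):

```python
# Boot results 1..9, transcribed from the table above.
results = ["Failed", "Failed", "Worked", "Failed", "Failed",
           "Worked", "Worked", "Failed", "Failed"]

worked = results.count("Worked")
total = len(results)
success_rate = worked / total

# Longest run of consecutive failed boots.
longest_fail_run = 0
current_run = 0
for result in results:
    current_run = current_run + 1 if result == "Failed" else 0
    longest_fail_run = max(longest_fail_run, current_run)

print(f"{worked}/{total} boots worked ({success_rate:.0%})")
print(f"longest run of failures: {longest_fail_run}")
# prints:
# 3/9 boots worked (33%)
# longest run of failures: 2
```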

I also noticed that often the UEFI screen would seem to take a while, then go blank, then appear again, as if it had rebooted itself. However, it did this before both successful and failed boots, so I can't see any correlation. After the 9th reboot, I powered off the computer for a few seconds, then started it again, and it is now telling me "Reboot and Select proper Boot device." I don't see any grub entries in the boot menu, and this is when I realised that I overwrote the grub entry for my other Arch install because I didn't think to use a different bootloader-id. I have no idea why the one for the new install disappeared, although it happened a few times before when using EFISTUB. The boot menu just has entries for my two drives that are named "UEFI: (drive brand name)". I feel like I was using those entries to boot Arch on my hard disk drive, but now it just gives me "Reboot and Select proper Boot device." A quick chroot into that system and reinstalling grub fixed it, but these problems are just baffling me.

Offline
