#1 2022-01-18 08:45:13

dictionary
Member
Registered: 2021-02-19
Posts: 19

[Solved] Corrupted system files after hard shutdown

Yesterday I updated my system with "pacman -Syu". A few seconds later I opened Firefox and the system froze completely, with the Caps Lock light blinking. I still don't know the cause, but I'm posting in this section because a blinking light might indicate a hardware error, even though my HP laptop is only 1 year old.

After rebooting the machine, the system stops at photo 1.

I then plugged in the Arch USB ISO, arch-chrooted into /mnt and tried the system update again, but it shows the error in photo 2, last line.

I searched the forum and can't understand if it's related to this or this (my laptop has AMD, not Nvidia).

EDIT: Ah, I forgot: pacman was updating some packages, including the linux kernel and kernel headers, but I use linux-hardened. I don't know if that's useful information.

Last edited by dictionary (2022-01-19 20:43:26)

#2 2022-01-18 08:53:00

Dennis
Member
Registered: 2014-11-04
Posts: 56

Re: [Solved] Corrupted system files after hard shutdown

That "file too short" message suggests the file was incompletely written during the first system update, rendering it unloadable. Maybe the hard drive/SSD is failing? Have you tried checking the filesystem for errors?

#3 2022-01-18 10:10:23

seth
Member
Registered: 2012-09-03
Posts: 49,981

Re: [Solved] Corrupted system files after hard shutdown

Don't chroot, use "pacman --sysroot /mnt" from the iso.
Then "pacman --sysroot /mnt -Qkk | grep -v ', 0 altered files'" to see how bad it is.

This could be because of inappropriate trimming or the drive falling apart.
https://wiki.archlinux.org/title/Solid_state_drive#TRIM
https://wiki.archlinux.org/title/SMART
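
A minimal sketch of how the "-Qkk" filter above could be reduced to a plain list of package names. It assumes the summary lines look like "name: N total files, M altered files"; that format and the helper name altered_packages are assumptions for illustration, not pacman features.

```shell
# Hypothetical helper: reads "pacman -Qkk" summary lines on stdin and prints
# the names of packages that report at least one altered file.
# Assumed line format: "<name>: <N> total files, <M> altered files"
altered_packages() {
    grep ' altered files$' \
        | grep -v ', 0 altered files$' \
        | sed 's/: [0-9][0-9]* total files, .*$//'
}

# Intended use from the live ISO (not run here):
#   pacman --sysroot /mnt -Qkk | altered_packages
```

The per-file warnings that explain what changed are a separate stream of messages, so this only helps to see which packages are affected, not why.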

#4 2022-01-18 18:25:27

dictionary
Member
Registered: 2021-02-19
Posts: 19

Re: [Solved] Corrupted system files after hard shutdown

This is part of what I get with that pacman command.
Smartctl doesn't show any errors.
If I try "pacman --sysroot /mnt -Syu", after answering Yes it says:
Could not open file /etc/mtab: No such file
Could not determine filesystem mount points
(nevermind, ignore this)

Last edited by dictionary (2022-01-18 18:43:29)

#5 2022-01-18 20:08:47

dictionary
Member
Registered: 2021-02-19
Posts: 19

Re: [Solved] Corrupted system files after hard shutdown

After some more research and reading the pacman wiki, I found that many files have a size of 0, including linux-hardened. I then tried to reinstall it, but errors followed. You can see what I did here. The top part is the end of the kernel install. How could pacman break all of this??
Should I reinstall the system? Any other solution?

Edit: and if you tell me to reinstall the system, do I need to follow the installation wiki again like before or is there something to just repair the files other than /home?
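
For reference, a rough sketch of one way to look for zero-sized files on the mounted system. The helper name list_empty_files and the example paths are assumptions; the thread does not show how the empty files were actually found.

```shell
# Hypothetical helper: list regular files of size zero below a directory.
# -xdev keeps find on a single filesystem; -size 0 matches empty files only.
list_empty_files() {
    find "$1" -xdev -type f -size 0 | sort
}

# Intended use from the live ISO, with the root filesystem mounted at /mnt:
#   list_empty_files /mnt/usr/lib
#   list_empty_files /mnt/boot
```

Some packages ship files that are legitimately empty, so the output is a list of candidates to compare against "pacman -Qkk", not proof of corruption.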

Last edited by dictionary (2022-01-18 20:11:39)

#6 2022-01-18 21:43:39

seth
Member
Registered: 2012-09-03
Posts: 49,981

Re: [Solved] Corrupted system files after hard shutdown

https://wiki.archlinux.org/title/Pacman … dependency?
But since the pacman database looks damaged:
https://wiki.archlinux.org/title/Pacman … l_database

In either case, don't forget to mount the /boot partition.

This is rather not something that pacman has done - it's typically either a problem w/ the disk (you might want to share the smart data) or the filesystem (btrfs?) or unqualified trimming for an unsuited SSD
It can in theory happen for an unclean shutdown (power loss, hard reboot) during or right after an update (where not all data had been synced to disk)

#7 2022-01-19 20:42:01

dictionary
Member
Registered: 2021-02-19
Posts: 19

Re: [Solved] Corrupted system files after hard shutdown

After a LOT of pain, adapting those solutions to my situation and with things not working along the way, the system is now fixed! Your first link was my last step, but with "--overwrite" added. I can finally rest. Thanks a lot for your precious support, seth and Dennis!
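
The exact command used is not shown in the thread. As a hedged illustration only, a reinstall with "--overwrite" against a system mounted at /mnt might be assembled like this. The helper print_reinstall_cmd is hypothetical, and it only prints the command so it can be reviewed before anything is run.

```shell
# Hypothetical dry-run helper: print (do not run) a reinstall command for the
# given packages against a system mounted at /mnt.
# --overwrite takes a glob; '*' lets pacman replace any conflicting file.
print_reinstall_cmd() {
    printf "pacman --sysroot /mnt -S --overwrite '*'"
    for pkg in "$@"; do
        printf ' %s' "$pkg"
    done
    printf '\n'
}

# Example: print_reinstall_cmd linux-hardened
```

Overwriting with a catch-all glob is a blunt tool; it is shown here only because the damaged files could not be trusted anyway.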


seth wrote:

(you might want to share the smart data)

Now that I can post results more easily, I retried, and it looks like my SSD is not fully compatible with SMART? It doesn't show anything about availability, as the wiki says it should:

$ sudo smartctl -i /dev/nvme0n1
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.15-hardened1-1-hardened] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       INTEL SSDPEKNW512G8H
Serial Number:                      BTNH04350X8U512A
Firmware Version:                   HPS1
PCI Vendor/Subsystem ID:            0x8086
IEEE OUI Identifier:                0x5cd2e4
Controller ID:                      1
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size/Capacity:          512.110.190.592 [512 GB]
Namespace 1 Formatted LBA Size:     512
Local Time is:                      Wed Jan 19 21:21:37 2022 CET

$ sudo smartctl -c /dev/nvme0n1
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.15-hardened1-1-hardened] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Firmware Updates (0x14):            2 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x0f):         S/H_per_NS Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size:         32 Pages
Warning  Comp. Temp. Threshold:     77 Celsius
Critical Comp. Temp. Threshold:     80 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     3.50W       -        -    0  0  0  0        0       0
 1 +     2.70W       -        -    1  1  1  1        0       0
 2 +     2.00W       -        -    2  2  2  2        0       0
 3 -   0.0250W       -        -    3  3  3  3     5000    5000
 4 -   0.0040W       -        -    4  4  4  4     5000    9000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

$ sudo smartctl -t conveyance /dev/nvme0n1
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.15-hardened1-1-hardened] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

NVMe device successfully opened

Use 'smartctl -a' (or '-x') to print SMART (and more) information

$ sudo smartctl -H /dev/nvme0n1
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.15-hardened1-1-hardened] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

seth wrote:

or the filesystem (btrfs?)

It's ext4.

seth wrote:

or unqualified trimming for an unsuited SSD

It looked like it's supported, so I enabled fstrim.timer back then when I first installed the system.

$ lsblk --discard
NAME        DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
nvme0n1            0      512B       2T         0
├─nvme0n1p1        0      512B       2T         0
├─nvme0n1p2        0      512B       2T         0
├─nvme0n1p3        0      512B       2T         0
└─nvme0n1p4        0      512B       2T         0
  └─arch           0      512B       2T         0

Should I disable trimming to be safer?
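
A small sketch of how the "lsblk --discard" output above can be read programmatically. Non-zero DISC-GRAN and DISC-MAX values indicate that the device advertises discard (TRIM) support. The helper name discard_capable is an assumption, and it relies on the column order shown above.

```shell
# Hypothetical helper: reads "lsblk --discard" output on stdin and prints the
# NAME column of every row whose DISC-GRAN and DISC-MAX are both non-zero.
# Assumed columns: NAME DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
discard_capable() {
    awk 'NR > 1 && $3 !~ /^0B?$/ && $4 !~ /^0B?$/ { print $1 }'
}

# Intended use (not run here):
#   lsblk --discard | discard_capable
```

Names of partitions keep the tree-drawing prefix that lsblk prints. Advertised support only says the drive accepts discards, not that its firmware handles them correctly.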

seth wrote:

It can in theory happen for an unclean shutdown (power loss, hard reboot)

It was surely my hard shutdown after the system freeze. I will need to diagnose the blinking light next time it happens. I have edited the thread title to better describe my problem.

Last edited by dictionary (2022-01-19 20:45:21)

#8 2022-01-19 21:18:15

seth
Member
Registered: 2012-09-03
Posts: 49,981

Re: [Solved] Corrupted system files after hard shutdown

smartctl wrote:

Use 'smartctl -a' (or '-x') to print SMART (and more) information

Google doesn't yell trimming issues w/ the device at me and since the premise of the thread is a hard shutdown, that's the most likely cause.
You want to look at https://wiki.archlinux.org/title/Keyboa … el_(SysRq) in order to deal w/ kernel panics in a more robust way.
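
As a rough aid for checking what SysRq is allowed to do, the sketch below interprets the value of /proc/sys/kernel/sysrq. The bit meanings follow the kernel documentation; the helper name sysrq_allows is an assumption.

```shell
# Hypothetical helper: succeed if a kernel.sysrq value permits the function
# identified by the given bit.
# Per the kernel documentation: 0 disables SysRq, 1 enables every function,
# and larger values are a bitmask (16 = sync, 32 = remount read-only,
# 128 = reboot/poweroff).
sysrq_allows() {
    value=$1
    bit=$2
    if [ "$value" -eq 1 ]; then
        return 0
    fi
    [ $(( value & bit )) -ne 0 ]
}

# Intended use on the running system (not run here):
#   sysrq_allows "$(cat /proc/sys/kernel/sysrq)" 16 && echo "sync is allowed"
```

If sync (16) and remount read-only (32) are permitted, a frozen desktop can often still be shut down without losing unwritten data, provided the kernel itself is still responding.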

#9 2022-01-19 22:06:25

dictionary
Member
Registered: 2021-02-19
Posts: 19

Re: [Solved] Corrupted system files after hard shutdown

$ sudo smartctl -a /dev/nvme0n1
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.15-hardened1-1-hardened] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       INTEL SSDPEKNW512G8H
Serial Number:                      BTNH04350X8U512A
Firmware Version:                   HPS1
PCI Vendor/Subsystem ID:            0x8086
IEEE OUI Identifier:                0x5cd2e4
Controller ID:                      1
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size/Capacity:          512.110.190.592 [512 GB]
Namespace 1 Formatted LBA Size:     512
Local Time is:                      Wed Jan 19 22:58:38 2022 CET
Firmware Updates (0x14):            2 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x0f):         S/H_per_NS Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size:         32 Pages
Warning  Comp. Temp. Threshold:     77 Celsius
Critical Comp. Temp. Threshold:     80 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     3.50W       -        -    0  0  0  0        0       0
 1 +     2.70W       -        -    1  1  1  1        0       0
 2 +     2.00W       -        -    2  2  2  2        0       0
 3 -   0.0250W       -        -    3  3  3  3     5000    5000
 4 -   0.0040W       -        -    4  4  4  4     5000    9000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        25 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    2.489.113 [1,27 TB]
Data Units Written:                 2.314.079 [1,18 TB]
Host Read Commands:                 23.332.536
Host Write Commands:                30.250.349
Controller Busy Time:               640
Power Cycles:                       447
Power On Hours:                     997
Unsafe Shutdowns:                   18
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0

Error Information (NVMe Log 0x01, 16 of 256 entries)
No Errors Logged
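
A minimal sketch for pulling single values out of the "smartctl -a" output above, for example to watch "Unsafe Shutdowns" or "Media and Data Integrity Errors" over time. The helper name smart_field is an assumption, and it expects the "Label: value" layout shown above.

```shell
# Hypothetical helper: reads "smartctl -a" output on stdin and prints the
# value of the "Label:   value" line whose label matches the first argument,
# with leading whitespace removed.
smart_field() {
    awk -F: -v label="$1" '$1 == label { sub(/^[ \t]+/, "", $2); print $2 }'
}

# Intended use (not run here):
#   sudo smartctl -a /dev/nvme0n1 | smart_field 'Unsafe Shutdowns'
```

Only the text up to the next colon is printed, so this suits the numeric health fields rather than lines such as "Local Time is".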

Oh yes, SysRq was already enabled and I tried it when the system froze, but it did nothing, so the kernel was gone too.

#10 2022-01-20 07:40:11

seth
Member
Registered: 2012-09-03
Posts: 49,981

Re: [Solved] Corrupted system files after hard shutdown

The disk itself is fine, but if the system blocks beyond the sysrq, there's not much you can do for an orderly shutdown.
If this happens more often or you can trigger this, you may want to look at https://wiki.archlinux.org/title/Kdump to debug the kernel crash.

#11 2022-01-20 08:36:18

dictionary
Member
Registered: 2021-02-19
Posts: 19

Re: [Solved] Corrupted system files after hard shutdown

Got it, thanks again!
