Yesterday I updated my system with "pacman -Syu". A few seconds later I opened Firefox and the system froze completely, with the Caps Lock light blinking. I still don't know the reason for that, but I'm posting in this section because a blinking light might indicate a hardware error, even though my HP laptop is just 1 year old.
After rebooting the machine, the boot stops at the point shown in photo 1.
I then plugged in the Arch USB ISO, arch-chrooted into /mnt and tried a system update again, but it shows the error in photo 2, last line.
I searched the forum and can't understand if it's related to this or this (my laptop has AMD, not Nvidia).
EDIT: Ah, I forgot: pacman was updating some packages, including the linux kernel and kernel headers, but I use linux-hardened. I don't know if that's useful info.
Last edited by dictionary (2022-01-19 20:43:26)
That "file too short" message suggests the file was incompletely written during the first system update, rendering it unloadable. Maybe the hard drive/SSD is failing? Have you tried checking the filesystem for errors?
Don't chroot, use "pacman --sysroot /mnt" from the iso.
Then "pacman --sysroot /mnt -Qkk | grep -v ', 0 altered files'" to see how bad it is.
This could be because of inappropriate trimming or the drive falling apart.
https://wiki.archlinux.org/title/Solid_state_drive#TRIM
https://wiki.archlinux.org/title/SMART
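To turn the "-Qkk" check above into a plain list of package names worth reinstalling, something like the following could work. It is only a sketch: the helper name damaged_packages is made up, and it assumes pacman's summary lines look like "name: N total files, M altered files".

```shell
#!/bin/sh
# Hypothetical helper: reads `pacman -Qkk` output on stdin and prints the
# names of packages whose summary line reports at least one altered file.
# Assumes summary lines of the form "name: N total files, M altered files".
damaged_packages() {
    grep -E ' total files?, [0-9]+ altered files?$' \
        | grep -v ', 0 altered files$' \
        | cut -d: -f1
}

# Usage from the live ISO (root partition assumed to be mounted at /mnt):
#   pacman --sysroot /mnt -Qkk | damaged_packages
```

Lines that are not package summaries (for example per-file warnings) are dropped by the first grep, so only package names reach the output.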
This is part of what I get with that pacman command.
Smartctl doesn't show any errors. If I try "pacman --sysroot /mnt -Syu", it says after answering Yes: (never mind, ignore this)
Could not open file /etc/mtab: No such file
Could not determine filesystem mount points
Last edited by dictionary (2022-01-18 18:43:29)
After some more research and reading the pacman wiki page, I found that many files have size 0, including those of linux-hardened. I then tried to reinstall it, but errors followed. You can see what I did here. The top part is the end of the kernel install. How could pacman break all of this??
Should I reinstall the system? Any other solution?
Edit: and if you tell me to reinstall the system, do I need to follow the installation guide again like before, or is there a way to repair just the files outside /home?
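A quick way to see how many files ended up with size 0 is a plain find over the mounted system. This is a minimal sketch; the function name empty_files and the /mnt/usr path in the usage line are only illustrative. Some packages legitimately ship empty files, so the result is a hint, not proof of damage.

```shell
#!/bin/sh
# Hypothetical helper: list regular files of size 0 below the given
# directory, without crossing into other mounted filesystems.
empty_files() {
    find "$1" -xdev -type f -size 0
}

# Usage from the live ISO (root partition assumed to be mounted at /mnt):
#   empty_files /mnt/usr | head
```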
Last edited by dictionary (2022-01-18 20:11:39)
https://wiki.archlinux.org/title/Pacman … dependency?
But since the pacman database looks damaged:
https://wiki.archlinux.org/title/Pacman … l_database
In either case, don't forget to mount the /boot partition.
This is probably not something that pacman has done - it's typically a problem w/ the disk (you might want to share the smart data), the filesystem (btrfs?) or unqualified trimming on an unsuited SSD.
It can in theory also happen after an unclean shutdown (power loss, hard reboot) during or right after an update (when not all data had been synced to disk).
After a LOT of pain, adapting those solutions to my situation and with things not working at first, the system is now fixed! Your first link was my last step, but with "--overwrite" added. I can finally rest, thanks a lot for your precious support, seth and Dennis!
(you might want to share the smart data)
Now that I can post results more easily, I retried, and it looks like my SSD is not fully compatible with SMART? It doesn't show anything about availability, like the wiki says:
$ sudo smartctl -i /dev/nvme0n1
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.15-hardened1-1-hardened] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Number: INTEL SSDPEKNW512G8H
Serial Number: BTNH04350X8U512A
Firmware Version: HPS1
PCI Vendor/Subsystem ID: 0x8086
IEEE OUI Identifier: 0x5cd2e4
Controller ID: 1
NVMe Version: 1.3
Number of Namespaces: 1
Namespace 1 Size/Capacity: 512.110.190.592 [512 GB]
Namespace 1 Formatted LBA Size: 512
Local Time is: Wed Jan 19 21:21:37 2022 CET
$ sudo smartctl -c /dev/nvme0n1
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.15-hardened1-1-hardened] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Firmware Updates (0x14): 2 Slots, no Reset required
Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005f): Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x0f): S/H_per_NS Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size: 32 Pages
Warning Comp. Temp. Threshold: 77 Celsius
Critical Comp. Temp. Threshold: 80 Celsius
Supported Power States
St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat
0 + 3.50W - - 0 0 0 0 0 0
1 + 2.70W - - 1 1 1 1 0 0
2 + 2.00W - - 2 2 2 2 0 0
3 - 0.0250W - - 3 3 3 3 5000 5000
4 - 0.0040W - - 4 4 4 4 5000 9000
Supported LBA Sizes (NSID 0x1)
Id Fmt Data Metadt Rel_Perf
0 + 512 0 0
$ sudo smartctl -t conveyance /dev/nvme0n1
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.15-hardened1-1-hardened] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
NVMe device successfully opened
Use 'smartctl -a' (or '-x') to print SMART (and more) information
$ sudo smartctl -H /dev/nvme0n1
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.15-hardened1-1-hardened] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
or the filesystem (btrfs?)
It's ext4.
or unqualified trimming for an unsuited SSD
It looked like it was supported, so I enabled fstrim.timer back when I first installed the system.
$ lsblk --discard
NAME DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
nvme0n1 0 512B 2T 0
├─nvme0n1p1 0 512B 2T 0
├─nvme0n1p2 0 512B 2T 0
├─nvme0n1p3 0 512B 2T 0
└─nvme0n1p4 0 512B 2T 0
└─arch 0 512B 2T 0
Should I disable trimming to be safer?
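The "lsblk --discard" table can also be read mechanically: as the Arch wiki's SSD page describes it, non-zero DISC-GRAN and DISC-MAX values indicate that the device advertises TRIM support. A small sketch, with a made-up function name, that assumes the column order shown above:

```shell
#!/bin/sh
# Hypothetical helper: reads `lsblk --discard` output on stdin and prints
# the devices whose DISC-GRAN (column 3) and DISC-MAX (column 4) are
# both non-zero, i.e. devices that advertise discard/TRIM support.
discard_capable() {
    awk 'NR > 1 && $3 !~ /^0B?$/ && $4 !~ /^0B?$/ {
        gsub(/^[^A-Za-z0-9]+/, "", $1)   # strip tree-drawing characters
        print $1
    }'
}

# Usage:
#   lsblk --discard | discard_capable
```

This only reports what the device claims; it says nothing about whether a particular firmware handles TRIM correctly.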
It can in theory happen for an unclean shutdown (power loss, hard reboot)
It was surely my hard shutdown after the system freeze. I will need to diagnose the blinking light next time it happens. I've edited the thread title to better describe my problem.
Last edited by dictionary (2022-01-19 20:45:21)
Use 'smartctl -a' (or '-x') to print SMART (and more) information
Google doesn't yell trimming issues w/ the device at me and since the premise of the thread is a hard shutdown, that's the most likely cause.
You want to look at https://wiki.archlinux.org/title/Keyboa … el_(SysRq) in order to deal w/ kernel panics in a more robust way.
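Whether a given SysRq function is permitted depends on the bitmask in /proc/sys/kernel/sysrq: 1 enables everything, otherwise individual bits apply (16 for sync, 32 for remount read-only, 128 for reboot/poweroff, per the kernel documentation). A minimal sketch for checking a value; the function name sysrq_allows is made up:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the kernel.sysrq value in $1 permits
# the function whose bit is given in $2 (e.g. 16 = sync, 128 = reboot).
# A value of exactly 1 means "all functions enabled".
sysrq_allows() {
    value=$1
    bit=$2
    [ "$value" -eq 1 ] || [ $(( value & bit )) -ne 0 ]
}

# Usage:
#   v=$(cat /proc/sys/kernel/sysrq)
#   sysrq_allows "$v" 16  && echo "sync allowed"
#   sysrq_allows "$v" 128 && echo "reboot allowed"
```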
$ sudo smartctl -a /dev/nvme0n1
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.15-hardened1-1-hardened] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Number: INTEL SSDPEKNW512G8H
Serial Number: BTNH04350X8U512A
Firmware Version: HPS1
PCI Vendor/Subsystem ID: 0x8086
IEEE OUI Identifier: 0x5cd2e4
Controller ID: 1
NVMe Version: 1.3
Number of Namespaces: 1
Namespace 1 Size/Capacity: 512.110.190.592 [512 GB]
Namespace 1 Formatted LBA Size: 512
Local Time is: Wed Jan 19 22:58:38 2022 CET
Firmware Updates (0x14): 2 Slots, no Reset required
Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005f): Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x0f): S/H_per_NS Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size: 32 Pages
Warning Comp. Temp. Threshold: 77 Celsius
Critical Comp. Temp. Threshold: 80 Celsius
Supported Power States
St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat
0 + 3.50W - - 0 0 0 0 0 0
1 + 2.70W - - 1 1 1 1 0 0
2 + 2.00W - - 2 2 2 2 0 0
3 - 0.0250W - - 3 3 3 3 5000 5000
4 - 0.0040W - - 4 4 4 4 5000 9000
Supported LBA Sizes (NSID 0x1)
Id Fmt Data Metadt Rel_Perf
0 + 512 0 0
=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 25 Celsius
Available Spare: 100%
Available Spare Threshold: 10%
Percentage Used: 0%
Data Units Read: 2.489.113 [1,27 TB]
Data Units Written: 2.314.079 [1,18 TB]
Host Read Commands: 23.332.536
Host Write Commands: 30.250.349
Controller Busy Time: 640
Power Cycles: 447
Power On Hours: 997
Unsafe Shutdowns: 18
Media and Data Integrity Errors: 0
Error Information Log Entries: 0
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Error Information (NVMe Log 0x01, 16 of 256 entries)
No Errors Logged
Oh yes, SysRq was already enabled and I tried it when the system froze, but it did nothing, so the kernel was gone too.
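To keep an eye on single counters from the report above, such as "Unsafe Shutdowns", the value can be pulled out with awk. A sketch with a made-up function name; it assumes the "Name: value" layout that smartctl prints and only suits simple counters, since values that themselves contain a colon (like the local time) would be cut off.

```shell
#!/bin/sh
# Hypothetical helper: reads `smartctl -a` output on stdin and prints the
# value of the field named in $1, with leading whitespace removed.
smart_field() {
    awk -F: -v key="$1" '$1 == key { sub(/^[ \t]+/, "", $2); print $2 }'
}

# Usage:
#   sudo smartctl -a /dev/nvme0n1 | smart_field "Unsafe Shutdowns"
```

The counter only shows how many unclean power-offs the drive has recorded over its lifetime, not when they happened.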
The disk itself is fine, but if the system blocks beyond the sysrq, there's not much you can do for an orderly shutdown.
If this happens more often or you can trigger this, you may want to look at https://wiki.archlinux.org/title/Kdump to debug the kernel crash.
Got it, thanks again!