I was setting up a dual-GPU passthrough VM with two AMD GPUs. Everything worked correctly, but I had some problems with virt-manager.
I decided to reboot, and now I can't boot: my PC automatically reboots after a few seconds.
I reverted my GRUB, mkconfig and mkinitcpio changes using arch-chroot. I also ran pacman -Syu linux-zen, but it changed nothing.
This is what is printed when I try to boot:
```
dev/sda1: recovering journal
dev/sda1 primary superblock features different from backup, check forced.
dev/sda1: Feature orphan_present is set but orphan file is clean.
CLEARED.
dev/sda1: 1537957/12591104 files (0.3% non contiguous), 22278788/50358784 blocks
FAILED: failed to start Simple Desktop Display Manager
```
I can't use sudo anymore:
```
sudo: error in /etc/sudo.conf, line 0 while loading plugin "sudoers_policy"
sudo: unable to load /usr/lib/sudo/sudoers.so: /usr/lib/libldap.so.2: file too short
sudo: fatal error, unable to load plugins
```
I tried to use timeshift:
```
timeshift: error while loading shared libraries: /usr/lib/libjpeg.so.8: file too short
```
At this point I don't think the problem comes from vfio.
Last edited by Hyderman (2023-08-08 23:46:09)
Offline
The "file too short" errors look serious; you may have a bunch of corrupted 'so' files. Step 1 would be to take a full offline backup of your data. I would not even try to boot from that disk; use the live CD or a rescue OS and back up the data from the drive/partition.
To restore the so files, you should be able to overwrite them using pacman from your cache, but that could mean *every* package you installed.
Offline
Afterwards:
https://wiki.archlinux.org/title/SMART
Post the output of "smartctl -a /dev/sda" if in doubt. You want to ensure the disk isn't falling apart before trying to fix this, which is going to require you to re-install all packages offline: https://wiki.archlinux.org/title/Pacman … an_upgrade
Offline
Thanks for your answers.
My smart output:
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.2.13-arch1-1] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: SATA3 1TB SSD
Serial Number: 2020092300018
Firmware Version: T0707B0
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Form Factor: 2.5 inches
TRIM Command: Available, deterministic
Device is: Not in smartctl database 7.3/5319
ATA Version is: ACS-2 T13/2015-D revision 3
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Tue Aug 8 07:38:43 2023 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 120) seconds.
Offline data collection
capabilities: (0x11) SMART execute Offline immediate.
No Auto Offline data collection support.
Suspend Offline collection upon new
command.
No Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
No Selective Self-test supported.
SMART capabilities: (0x0002) Does not save SMART data before
entering power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 10) minutes.
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x0032 100 100 050 Old_age Always - 0
5 Reallocated_Sector_Ct 0x0032 100 100 050 Old_age Always - 0
9 Power_On_Hours 0x0032 100 100 050 Old_age Always - 9037
12 Power_Cycle_Count 0x0032 100 100 050 Old_age Always - 2440
160 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 0
161 Unknown_Attribute 0x0033 100 100 050 Pre-fail Always - 100
163 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 68
164 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 43143
165 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 32
166 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 5
167 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 22
168 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 5050
169 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 100
175 Program_Fail_Count_Chip 0x0032 100 100 050 Old_age Always - 0
176 Erase_Fail_Count_Chip 0x0032 100 100 050 Old_age Always - 0
177 Wear_Leveling_Count 0x0032 100 100 050 Old_age Always - 0
178 Used_Rsvd_Blk_Cnt_Chip 0x0032 100 100 050 Old_age Always - 0
181 Program_Fail_Cnt_Total 0x0032 100 100 050 Old_age Always - 0
182 Erase_Fail_Count_Total 0x0032 100 100 050 Old_age Always - 0
192 Power-Off_Retract_Count 0x0032 100 100 050 Old_age Always - 220
194 Temperature_Celsius 0x0022 100 100 050 Old_age Always - 40
195 Hardware_ECC_Recovered 0x0032 100 100 050 Old_age Always - 0
196 Reallocated_Event_Count 0x0032 100 100 050 Old_age Always - 0
197 Current_Pending_Sector 0x0032 100 100 050 Old_age Always - 0
198 Offline_Uncorrectable 0x0032 100 100 050 Old_age Always - 0
199 UDMA_CRC_Error_Count 0x0032 100 100 050 Old_age Always - 0
232 Available_Reservd_Space 0x0032 100 100 050 Old_age Always - 100
241 Total_LBAs_Written 0x0030 100 100 050 Old_age Offline - 213820
242 Total_LBAs_Read 0x0030 100 100 050 Old_age Offline - 2382702
245 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 92484
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]
Selective Self-tests/Logging not supported
I'm going to try the pacman reinstall and will report back.
Offline
When I run:
find /mnt/usr/lib -size 0
I get dozens of empty libraries.
I tried to reinstall the affected packages manually, but libraries that sit directly in /usr/lib (not in a subfolder), like /usr/lib/libjpeg.so.8, prevent the package from installing.
I tried to install the libjpeg-turbo package, but I get a conflict because the files already exist.
I don't know whether I should use --overwrite, because the command
pacman --sysroot /mnt -Qo /usr/lib/libjpeg.so.8
says no package owns the lib.
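For anyone following along, the empty-file check can be wrapped in a small helper so the list is sorted and limited to regular files. This is only a sketch: the function name list_empty_libs is made up for illustration, and /mnt/usr/lib assumes the broken root is mounted at /mnt as in this thread.

```shell
#!/bin/sh
# Sketch: list zero-length regular files below a directory.
# list_empty_libs is a hypothetical helper name, not part of any package.
list_empty_libs() {
    # -type f skips symlinks such as libfoo.so.N, so only the real
    # (truncated) files are reported; -size 0 matches empty files.
    find "$1" -type f -size 0 | sort
}

# Example, run as root from the live medium (path is an assumption):
#   list_empty_libs /mnt/usr/lib
```

Each reported path can then be checked for its owning package. If the local database no longer knows the file, the sync files database (pacman -Fy, then pacman -F with the path) may still be able to name the package, assuming it can be refreshed from the live environment.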
Offline
The pacman databases probably got nuked as well.
Have you backed up your actually relevant data?
You can then re-install all packages with "--dbonly" first to (hopefully) sanitize the databases before re-installing the actual packages.
Any idea what happened that has left you in this state? (power outage, forceful reboot, stuff like that)
Offline
I was doing things in virt-manager and decided to reboot manually. I don't remember anything else.
pacman --sysroot /mnt -Syu --dbonly hasn't changed anything.
I have my home directory saved on another drive, plus my timeshift snapshot in case I can reuse it.
Should I reinstall arch?
Offline
pacman --sysroot /mnt -Syu --dbonly hasn't changed anything.
Not with a system update; most packages will already be up to date. You have to re-install all packages (along with that parameter).
Offline
It's always the same libs that are empty:
ldconfig: file /usr/lib/x is empty, not checked.
I get the same error for every package I tried.
Offline
What produces that error?
You probably cannot use --sysroot and will have to use --root and --cachedir instead, because the chroot will fail while the system is in shambles.
Offline
I feel like the packages are OK and the problem is those libs.
Last edited by Hyderman (2023-08-08 13:28:50)
Offline
You're just updating sudo, that's not gonna cut it.
pacman --root /mnt -Qnq | wc -l
Please replace the oversized image w/ a link.
Offline
This command prints 1022.
Offline
So why do you only end up updating sudo?
Read this again
Offline
Should I try something like pacman --root /mnt -S $(pacman --root /mnt -Qnq)?
Offline
--cachedir <dir>
Specify an alternative package cache location (the default is /var/cache/pacman/pkg).
Multiple cache directories can be specified, and they are tried in the order they are
passed to pacman. NOTE: This is an absolute path, and the root path is not
automatically prepended.
re-install all packages "--dbonly" first
But otherwise yes, that's how you'd re-install all packages.
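Put together, the two-pass reinstall discussed here could look roughly like the sketch below. The names run and reinstall_all and the variables ROOT, CACHE and DRY_RUN are made up for illustration; the cache path assumes the broken install's package cache is still intact under /mnt. By default the script only prints the pacman commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch: reinstall every native package of an install mounted at $ROOT.
# ROOT, CACHE and DRY_RUN are assumptions for this example.
ROOT=${ROOT:-/mnt}
# --cachedir is an absolute path; the root is not prepended automatically.
CACHE=${CACHE:-$ROOT/var/cache/pacman/pkg}
DRY_RUN=${DRY_RUN:-1}

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Arguments: package names, e.g. from: pacman --root "$ROOT" -Qnq
reinstall_all() {
    [ "$#" -gt 0 ] || { echo "no packages given" >&2; return 1; }
    # Pass 1: rewrite only the local database entries.
    run pacman --root "$ROOT" --cachedir "$CACHE" -S --dbonly "$@" || return
    # Pass 2: reinstall the actual files.
    run pacman --root "$ROOT" --cachedir "$CACHE" -S "$@"
}

# Example (review the dry-run output before setting DRY_RUN=0):
#   reinstall_all $(pacman --root "$ROOT" -Qnq)
```

Running the database-only pass first is meant to make pacman recognise the empty files as owned again, so the second pass can overwrite them instead of stopping on file conflicts.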
Offline
I manually reinstalled each package that owned an empty lib, and it worked: I can boot, but I still get an automatic reboot after a few seconds.
Offline
If you have 'quiet' on your kernel parameters line, remove it and watch what comes across the screen, perhaps a fatal error is shown there. Does the reboot happen if you do not attempt to log in, or does it stay at the login screen?
Offline
Can you boot/start the
- rescue.target
- multi-user.target
(2nd link below) w/o a reboot?
Do you have a journal for a boot that resulted in an immediate reboot?
Offline
In rescue.target I don't see anything, but the output is pretty big, so maybe I missed it. In multi-user.target I don't really have time to look because of the reboot.
I don't see anything after removing quiet from my GRUB config either.
Offline
What is pretty big, and how is there a time issue?
Add "systemd.unit=rescue.target" to the kernel parameters in the bootloader.
Offline
When I am in rescue.target I have about 2000 lines of journal. I searched for the word "reboot" but found nothing.
In multi-user.target I also have the reboot problem, so I only get about 3 seconds before the reboot.
Offline
Chroot in from the live media and post the full system journal for a boot with the issue from there to a pastebin.
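As an alternative to a full chroot, journalctl can also read the installed system's journal directory directly from the live environment. A minimal sketch, assuming the root is mounted at /mnt and the journal is persistent (stored under /var/log/journal); the function names and the JOURNALCTL override variable are made up for this example.

```shell
#!/bin/sh
# Sketch: read the journal of an install mounted at a given root.
# JOURNALCTL is a hypothetical override hook, handy for dry runs.
JOURNALCTL=${JOURNALCTL:-journalctl}

# List the boots recorded in the mounted system's journal.
list_boots() {
    "$JOURNALCTL" -D "$1/var/log/journal" --list-boots --no-pager
}

# Print the journal of one boot; offset 0 is the last recorded boot,
# -1 the one before it.
dump_boot_journal() {
    "$JOURNALCTL" -D "$1/var/log/journal" -b "${2:-0}" --no-pager
}

# Example (paths and file name are assumptions):
#   list_boots /mnt
#   dump_boot_journal /mnt -1 > boot.txt
```

Checking list_boots first helps pick the boot that actually ended in the reboot, since later boots into rescue.target also show up in the list.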
Offline
After booting the Arch live medium, I get this error:
amdgpu: Fail to disable thermal alert!
[drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -22
I'm going to send the journal in a few minutes.
Offline
Here is the journal: http://0x0.st/H_1j.txt
Offline