Hello everyone,
new to the forum and the Arch community in general.
I'm facing a problem which I can't seem to pinpoint, and I thought some of the gurus here might be able to help.
So, the behavior I see is this:
Suddenly, after 5 minutes of work or 50 minutes of work (there's no consistency), the system will break.
I can still switch workspaces (I use i3), and I can still open new terminals.
However:
- Some terminal commands don't work / are unresponsive:
  - ls works
  - ps doesn't work
  - exit doesn't work
- I cannot open any other program (browser etc)
- Browser freezes (brave or surf)
- If I try to switch to tty2, the whole system freezes up and I cannot do anything
Something new I tried today: I typed kill -9 -1 in the terminal and pressed enter. The prompt hadn't even come up when I opened the terminal, but the command ran somehow and I got logged off.
However, when I tried to log back in, nothing happened. Correct username, correct password, but I just got prompted to log in again, as if I had never entered my credentials.
I also tried to log in as root at that point. Got "Login invalid" for some reason, even with the correct credentials.
I suspect this is a hardware issue. I bought this used ThinkPad X200 Tablet and replaced the drive with an SSD, which is thinner.
Therefore it might be that it disconnects and then connects again? And while that happens I'm only working off the RAM?
Just spitballing, no clue why this is happening.
I tried adding some paper as padding for the drive (don't judge me) and that seemed to help, but now it happens every 3 minutes after I turn on the computer, and it's becoming really annoying since I'm trying to work on this laptop.
I will try replacing the RAM stick with another one I have lying around to see if the RAM is faulty or something. I have no clue.
Any help would be appreciated. I understand that this might be 100% a hardware problem and have nothing to do with Arch.
I just thought someone might have faced something similar and can help me out.
Thanks!
Last edited by filkaris (2019-04-07 12:43:31)
Offline
What your intuition tells you about your drive is certainly a possibility. I made a bad mistake once (I ran a bad dd command against a drive that was part of my working directory) and my computer started to exhibit symptoms similar to yours. I've got an X220, and like you I bought it second-hand and swapped the HDD for an SSD. I remember fitting an adapter around the drive so it sits snugly in the drive compartment, but I can't remember where I got it. I think it came with one of the drives, but it's worth searching the internet for one. It is meant to do just that: hold the drive firmly in place. This is a laptop and gets knocked around a bit.
If you can log in, you can check the usual places: dmesg and journalctl.
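As a rough sketch of what to look for there: the helper below keeps only the kernel log lines that tend to show up when a drive drops off the bus. The function name filter_disk_errors is made up for this example, and the pattern list is my guess at common message fragments, not an official or complete list.

```shell
#!/bin/sh
# Sketch: keep only kernel log lines that usually accompany a drive dropping out.
# filter_disk_errors is a hypothetical helper; the patterns are assumptions
# about typical ATA / block-layer / filesystem error messages.
filter_disk_errors() {
    grep -E -i 'ata[0-9]+(\.[0-9]+)?:|I/O error|blk_update_request|EXT4-fs error'
}

# Typical use (reads the log on stdin):
#   dmesg | filter_disk_errors
#   journalctl -k -b -1 --no-pager | filter_disk_errors   # kernel log of the previous boot
```

If the drive itself disappears, the journal may not be able to record the error, so an empty result does not rule the drive out.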
If you still have the old drive, you can put it in and see if it exhibits the same symptoms. Be careful when trying another stick of RAM that it is compatible with your X200.
Also I would check if the keyboard works as it should. I kept having problems with certain commands once only to discover that there was a bad piece of dirt under one of the keys preventing the key from functioning properly.
Last edited by d_fajardo (2019-04-04 18:49:04)
Offline
Oh my god, you are right... I searched eBay and it turns out they only gave me PART of the drive caddy...
I will go back to the store and install this correctly and hope it fixes it. Will check in in a couple of days. Thanks man.
Offline
I had the same issues with i3 while using VMware. It was caused by automatic mouse cursor focus, and not even REISUB could reboot my PC. Have you tried another desktop environment?
Offline
Got the tray, installed it, but I'm still having the same problem...
1) Didn't know about REISUB, that's cool, will try it next time
2) I will try switching from i3, but I don't know how it could be causing this. I also have mouse cursor auto focus disabled in the config
3) journalctl -b -1 does not show anything. It stops around the time the problem happens. I don't really know how systemd works. Is it possible that the error occurs but isn't logged?
For now, I'll just keep trying to debug with journalctl the next time it happens.
I'm getting this all the time in journalctl, but I don't think it's related
Apr 06 16:40:55 archpad dbus-daemon[299]: [system] Activating via systemd: service name='org.freedesktop.resolve1' unit='dbus-org.freedesktop.resolve1.service' requested by ':1.1' (uid=0 pid=300 comm="/usr/bin/NetworkManager --no-daemon ")
Apr 06 16:40:55 archpad dbus-daemon[299]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.resolve1.service': Unit dbus-org.freedesktop.resolve1.service not found.
Last edited by filkaris (2019-04-06 14:03:56)
Offline
OK, so we now have a new in-between state.
Terminal is responsive, commands work, I can create new terminals.
BUT the browser no longer works, and I cannot even open dmenu to launch an application.
Every time I try to open surf, I get a new I/O error like the ones at the bottom of the output
This is the output of dmesg
https://i.imgur.com/mR4gzr7.jpg
While I'm typing this on my mobile phone, the system finally became completely unresponsive and now requires a hard reset
Mod edit: converted oversized image to link -- V1del
Last edited by V1del (2019-04-06 19:26:31)
Offline

Please post only links to images or, even better, the actual text output: https://wiki.archlinux.org/index.php/Co … s_and_code
That said, this really doesn't look good for your drive. There's a small chance that the error is simply due to some power-saving mode, though more likely it's due to actual movement of the drive. Try setting the power-saving policy to max_performance. However, before doing so you might want to boot a live disk, run a long SMART test, and come back with the output of smartctl -a once the test has finished, within [ code ] tags.
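A minimal sketch of that SMART sequence, assuming smartmontools is available on the live disk. The names smart_long_test, run and DRY_RUN are invented for this example, /dev/sdX is a placeholder for the actual device, and the wait time has to come from whatever duration smartctl -t long reports for the drive.

```shell
#!/bin/sh
# Sketch: start an extended SMART self-test, wait, then print the full report.
# smart_long_test, run and DRY_RUN are hypothetical names for this example.
run() {
    # With DRY_RUN set, print the command instead of executing it.
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

smart_long_test() {
    dev=$1       # e.g. /dev/sdX (placeholder)
    wait_s=$2    # seconds to wait; use the duration smartctl -t long prints
    run smartctl -t long "$dev"   # starts the test in the background on the drive
    run sleep "$wait_s"
    run smartctl -a "$dev"        # full report, including the self-test log
}

# Typical use, as root:
#   smart_long_test /dev/sdX 1800
```

Setting DRY_RUN=1 first prints the three commands without touching the drive, which is a cheap way to check the device name before running it for real.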
Offline
You're right, I'm sorry about the embedded image. I couldn't get the code output because the computer was not responding, that's why I took a picture.
Thanks for taking a look, here's the smartctl -a output
smartctl 7.0 2018-12-30 r4883 [x86_64-linux-4.20.6-arch1-1-ARCH] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model:     Patriot Burst
Serial Number:    E473079219EC07025054
Firmware Version: SBFM61.3
User Capacity:    120,034,123,776 bytes [120 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-4 (minor revision not indicated)
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sat Apr  6 21:16:39 2019 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (65535) seconds.
Offline data collection
capabilities:                    (0x79) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  30) minutes.
Conveyance self-test routine
recommended polling time:        (   6) minutes.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       21
 12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       65
168 Unknown_Attribute       0x0012   100   100   000    Old_age   Always       -       0
170 Unknown_Attribute       0x0003   062   062   000    Pre-fail  Always       -       319
173 Unknown_Attribute       0x0012   100   100   000    Old_age   Always       -       1
192 Power-Off_Retract_Count 0x0012   100   100   000    Old_age   Always       -       33
194 Temperature_Celsius     0x0023   067   067   000    Pre-fail  Always       -       33 (Min/Max 33/33)
218 Unknown_Attribute       0x000b   100   100   050    Pre-fail  Always       -       0
231 Temperature_Celsius     0x0013   100   100   000    Pre-fail  Always       -       100
241 Total_LBAs_Written      0x0012   100   100   000    Old_age   Always       -       19
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%        21         -
SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Last edited by filkaris (2019-04-06 21:24:53)
Offline
Ok, so the "max_performance" setting is going pretty well so far. Tomorrow I'm going to try and leave the PC alone on purpose for quite some time to see if this happens again.
It makes sense though... Each time, I had only worked in the terminal for a while and hadn't launched any programs from the disk. Then, when the time came to launch something, it broke because the power-saving setting had done something weird with the drive, and access could not be regained once it was lost.
V1del you're my savior man.
Will test this more to make sure it's ok, and then will post again to mark as solved
Offline

The SMART test looks alright, as far as the drive is detected, though it must be a really niche, off-brand drive to have that many unknowns in the SMART output. However, it's likely that it doesn't properly support DIPM, so that might indeed be the fix.
Offline
Oh, so it's the hard drive's fault?? Who would've guessed that neither Arch nor the 10-year-old notebook is the cause, but the brand new SSD.
In any case great catch man, helps a lot!
For anyone else having trouble, follow the instructions in V1del's link https://wiki.archlinux.org/index.php/Po … Management
and set link_power_management_policy to max_performance
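For reference, a minimal sketch of how that setting can be applied by hand through sysfs. The function names set_lpm_policy and show_lpm_policy, and the optional alternate-root argument, are made up for this example. A value written this way only lasts until the next reboot, so check the wiki page for a way to make it persistent.

```shell
#!/bin/sh
# Sketch: write one link power management policy to every SATA host that
# exposes the knob. set_lpm_policy / show_lpm_policy and the optional sysfs
# root argument (useful for trying this on a scratch directory) are hypothetical.
set_lpm_policy() {
    policy=$1
    root=${2:-/sys/class/scsi_host}
    for f in "$root"/host*/link_power_management_policy; do
        # Skip hosts without a writable file (also covers an unmatched glob).
        [ -w "$f" ] || continue
        printf '%s\n' "$policy" > "$f"
    done
}

# Print the current policy of each host.
show_lpm_policy() {
    root=${1:-/sys/class/scsi_host}
    for f in "$root"/host*/link_power_management_policy; do
        [ -r "$f" ] || continue
        printf '%s: %s\n' "$f" "$(cat "$f")"
    done
}

# Typical use, as root:
#   set_lpm_policy max_performance
#   show_lpm_policy
```

Run without root, set_lpm_policy silently skips every host because the files are not writable, so show_lpm_policy is worth running afterwards to confirm the change took effect.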
Offline