#1 2023-01-25 21:08:35

hully
Member
Registered: 2022-11-14
Posts: 164

System gets stuck

Sometimes my system gets stuck and I have to REISUB it.

Here is the journal of the last boot: [REMOVED]

Last edited by hully (2023-01-26 10:23:34)

Offline

#2 2023-01-25 22:39:11

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 30,473
Website

Re: System gets stuck

Do you know when in that log the system was "stuck"? I don't see any notable gap in the timestamps, and the log appears to show a successful boot and a clean shutdown with nothing notable in between except for kdeconnectd segfaulting. That is definitely a problem in itself, but it didn't appear to hinder any other processes from continuing.

I don't know what a REISUB would look like in the journal, but I suspect it would look quite different from a normal clean shutdown, and unless I'm missing it (quite possible in a 3K-line log), I'm not seeing signs of it in there.

Last edited by Trilby (2023-01-25 22:40:50)


"UNIX is simple and coherent" - Dennis Ritchie; "GNU's Not Unix" - Richard Stallman

Offline

#3 2023-01-25 23:38:48

hully
Member
Registered: 2022-11-14
Posts: 164

Re: System gets stuck

I don't know, maybe I got it wrong.

I don't think it contains all it should.

I also get this when I try to show the journal:

Journal file /var/log/journal/69e8b7f93d494b2dbd77c5fb13bb45a4/system@0005f2b5792a4bf7-3e4ee0067ca4c654.journal~ is truncated, ignoring file.

Offline

#4 2023-01-26 01:15:14

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 30,473
Website

Re: System gets stuck

Well that is relevant as that would be the journal from the boot that was "stuck".  That's what we need to see.
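
A minimal sketch of how the journal of a specific boot could be pulled, assuming systemd's `journalctl`. The helper name `boot_journal` is made up for illustration:

```shell
# boot_journal is a hypothetical helper, not part of systemd.
# $1 = boot offset: 0 is the current boot, -1 the previous one, -2 the one before that.
boot_journal() {
    journalctl --no-pager -b "${1:-0}"
}
```

`journalctl --list-boots` shows which offsets exist. Reading the system journal may require root, so something like `sudo journalctl -b -1 > journal.txt` is the likely form for the boot that got stuck.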


"UNIX is simple and coherent" - Dennis Ritchie; "GNU's Not Unix" - Richard Stallman

Offline

#5 2023-01-26 06:13:10

hully
Member
Registered: 2022-11-14
Posts: 164

Re: System gets stuck

It looks like it's a binary file.

How can I prevent journal files from getting truncated?

Offline

#6 2023-01-26 07:44:59

d.ALT
Member
Registered: 2019-05-10
Posts: 959

Re: System gets stuck

hully wrote:

How can I prevent journal files from getting truncated?

sudo journalctl -b | curl -F 'file=@-' 0x0.st

(thanks, @seth!)


<49,17,III,I>    Fama di loro il mondo esser non lassa;
<50,17,III,I>    misericordia e giustizia li sdegna:
<51,17,III,I>    non ragioniam di lor, ma guarda e passa.

Offline

#7 2023-01-26 07:49:54

Head_on_a_Stick
Member
From: The Wirral
Registered: 2014-02-20
Posts: 9,003
Website

Re: System gets stuck

hully wrote:

It looks like it's a binary file.

That can be read with

strings /var/log/journal/69e8b7f93d494b2dbd77c5fb13bb45a4/system@0005f2b5792a4bf7-3e4ee0067ca4c654.journal~ | grep -i message

Jin, Jîyan, Azadî

Offline

#8 2023-01-26 09:21:57

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,393

Re: System gets stuck

@d.ALT, that posts the journal of the running boot, but we need the journal of the crashing boot.
The OP might have posted that, though it looks cut short.
At Jan 25 21:57:14 the system woke from S3, and 2 and 4 seconds later we have

Jan 25 21:57:16 raffarch kernel: sysrq: Keyboard mode set to system default
Jan 25 21:57:18 raffarch kernel: sysrq: Terminate All Tasks

which somewhat fits their description.

@hully, don't rush through the REISUB sequence; give each key 3 seconds or so before you move on to the next.
ESPECIALLY wait for S & U.

Is this somewhat related to https://bbs.archlinux.org/viewtopic.php?id=282965 ?
I ask because shortly before you REISUB there is

Jan 25 21:57:14 raffarch NetworkManager[1790]: <info>  [1674680234.3559] manager: NetworkManager state is now CONNECTED_GLOBAL

Are you sure that you *need* to REISUB?
You cannot switch to a different VT or ssh into the system?

Offline

#9 2023-01-26 10:22:23

hully
Member
Registered: 2022-11-14
Posts: 164

Re: System gets stuck

So I'll answer each of you and then post the new logs.

@d.ALT: That is the command I had run, except I used `-b -1` to get the previous boot. It says that the journal file is truncated, so it doesn't prevent the truncation.

@Head_on_a_Stick: that works. thanks!

@seth

No, it isn't related to that thread.

seth wrote:

Are you sure that you *need* to REISUB?
You cannot switch to a different VT or ssh into the system?

VT means "Virtual Terminal", i.e. CTRL+ALT+F2/F3/F4/...?

Sometimes it allows me to switch to a virtual terminal, but I still cannot reboot, as it says another restart is already in progress.

I don't know what "ssh into the system" means. I have a laptop with no SSH server running.

----

I was unable to reboot. After I issued the "reboot" command, a black screen with a blinking cursor (and no prompt) appeared.

I REISUB-bed (didn't try to get into a VT).

I issued:

❯ sudo journalctl -b -1 > journal.txt
Journal file /var/log/journal/69e8b7f93d494b2dbd77c5fb13bb45a4/system@0005f2b5792a4bf7-3e4ee0067ca4c654.journal~ is truncated, ignoring file.

Here is journal.txt: https://raw.githubusercontent.com/875d/ … ournal.txt

Then I issued:

❯ strings /var/log/journal/69e8b7f93d494b2dbd77c5fb13bb45a4/system@0005f2b5792a4bf7-3e4ee0067ca4c654.journal~ | grep -i message > truncated.txt

Here is truncated.txt: https://raw.githubusercontent.com/875d/ … ncated.txt

It says "Timeout waiting for RPC from GSP"
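
To see how often such messages occur and with which Xid code, a small filter like the following might help. It is a sketch that assumes the lines contain `NVRM: Xid (<bus>): <code>, ...`; the function name `nvidia_xids` is made up for illustration.

```shell
# Hypothetical helper: print the Xid code of every "NVRM: Xid" line read from stdin.
nvidia_xids() {
    sed -n 's/.*NVRM: Xid ([^)]*): \([0-9][0-9]*\),.*/\1/p'
}
```

Possible usage: `nvidia_xids < truncated.txt | sort | uniq -c`.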

Offline

#10 2023-01-26 10:33:19

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,393

Re: System gets stuck

Jan 26 10:32:32 raffarch kernel: NVRM: Xid (PCI:0000:01:00): 119, pid=3704, name=Hyprland:sh7, Timeout waiting for RPC from GSP! Expected function 10 (FREE) (0xfade0002 0x0).
Jan 26 10:32:32 raffarch kernel: NVRM _issueRpcAndWait: rpcRecvPoll timedout for fn 10!
Jan 26 10:32:32 raffarch kernel: NVRM nvAssertFailedNoLog: Assertion failed: 0 @ rpc.c:205
Jan 26 10:32:32 raffarch kernel: NVRM rpcRmApiFree_GSP: GspRmFree failed: hClient=0xc1d00002; hObject=0xfade0002; paramsStatus=0x00000000; status=0x00000065
Jan 26 10:32:32 raffarch kernel: NVRM nvAssertFailedNoLog: Assertion failed: NV_OK == status @ vaspace_api.c:553
Jan 26 10:32:38 raffarch kernel: NVRM: Xid (PCI:0000:01:00): 119, pid=3704, name=Hyprland:sh7, Timeout waiting for RPC from GSP! Expected function 10 (FREE) (0xfade0001 0x0).
Jan 26 10:32:38 raffarch kernel: NVRM _issueRpcAndWait: rpcRecvPoll timedout for fn 10!
Jan 26 10:32:38 raffarch kernel: NVRM nvAssertFailedNoLog: Assertion failed: 0 @ rpc.c:205
Jan 26 10:32:38 raffarch kernel: NVRM rpcRmApiFree_GSP: GspRmFree failed: hClient=0xc1d00002; hObject=0xfade0001; paramsStatus=0x00000000; status=0x00000065

Wayland compositor and NVIDIA GPU: is this nvidia-open?

Edit: the journal has

Loading NVIDIA UNIX Open Kernel Mode Setting Driver

so "yes". Try the regular (proprietary) driver.

Last edited by seth (2023-01-26 10:37:30)
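
A rough way to repeat that check on any journal dump, as a sketch only: it keys on the "NVIDIA UNIX Open Kernel" banner quoted in this post and reports everything else as unknown rather than guessing. The name `nvidia_flavour` is made up.

```shell
# Hypothetical helper: classify the NVIDIA kernel module flavour from journal text on stdin.
# Only the open-module banner seen in this thread is recognised; no other wording is assumed.
nvidia_flavour() {
    if grep -q 'NVIDIA UNIX Open Kernel'; then
        echo open
    else
        echo unknown
    fi
}
```

Possible usage: `journalctl -b -k | nvidia_flavour`.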

Offline

#11 2023-01-26 14:38:57

hully
Member
Registered: 2022-11-14
Posts: 164

Re: System gets stuck

I cannot use the proprietary ones.

When I install `nvidia-dkms`, the system takes forever to boot, during boot it shows red messages like "Kernel module failed to load", and once it finally boots Hyprland doesn't start up.

The Hyprland wiki suggests using `nvidia-open-dkms`, although I still don't understand why `nvidia-open` and `nvidia` wouldn't suffice, since I'm not compiling my own kernel.

It seems to be a driver bug, though.

Offline

#12 2023-01-26 14:41:44

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,393

Re: System gets stuck

hully wrote:

I cannot use the proprietary ones.

Bullshit.

hully wrote:

When I install `nvidia-dkms`, the system takes forever to boot, during boot it shows red messages like "Kernel module failed to load", and once it finally boots Hyprland doesn't start up.

You're running the default kernel, no need for dkms.
Also https://wiki.archlinux.org/title/NVIDIA#Installation - blue note, "ibt=off"
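
Whether the parameter actually reached the running kernel can be checked against `/proc/cmdline`. The helper below is a sketch with a made-up name (`has_kernel_param`); how the parameter gets added in the first place depends on the boot loader in use.

```shell
# Hypothetical helper: succeed if a parameter appears as a whole word on a kernel command line.
# $1 = parameter (e.g. ibt=off), $2 = command line (defaults to the running kernel's).
has_kernel_param() {
    cmdline="${2:-$(cat /proc/cmdline)}"
    case " $cmdline " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Possible usage: `has_kernel_param ibt=off && echo "ibt=off is active"`.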

Offline

#13 2023-01-26 15:12:33

d.ALT
Member
Registered: 2019-05-10
Posts: 959

Re: System gets stuck

As seth correctly pointed out, your GPU (which is 10de:25a2 = GA107M [GeForce RTX 3050 Mobile]) can perfectly well be driven by NVIDIA's proprietary driver.
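
For anyone wanting to look up their own ID: it is the bracketed `vendor:device` pair in `lspci -nn` output. A sketch for pulling it out follows; the function name `pci_ids` is made up, and the exact line layout of `lspci` is an assumption.

```shell
# Hypothetical helper: print the vendor:device PCI IDs found in `lspci -nn` output on stdin.
# Bracketed class codes such as [0300] are skipped because they contain no colon.
pci_ids() {
    grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' | sed 's/[][]//g'
}
```

Possible usage: `lspci -nn -d 10de: | pci_ids` (10de being NVIDIA's vendor ID).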


<49,17,III,I>    Fama di loro il mondo esser non lassa;
<50,17,III,I>    misericordia e giustizia li sdegna:
<51,17,III,I>    non ragioniam di lor, ma guarda e passa.

Offline

#14 2023-01-26 17:00:39

hully
Member
Registered: 2022-11-14
Posts: 164

Re: System gets stuck

seth wrote:

I cannot use the proprietary ones.

Bullshit.

When I install `nvidia-dkms`, the system takes forever to boot, during boot it shows red messages like "Kernel module failed to load", and once it finally boots Hyprland doesn't start up.

You're running the default kernel, no need for dkms.
Also https://wiki.archlinux.org/title/NVIDIA#Installation - blue note, "ibt=off"

It booted after I put "ibt=off".

How was I supposed to know I had to add this kernel parameter?

It used to work. Then I installed the open-source drivers, as the Hyprland wiki suggested, and forgot about it.

Now, after this bug, I tried to install the proprietary ones and it didn't boot.

What did I do wrong? How was I supposed to know which kernel parameter would make it boot?

Offline

#15 2023-01-26 17:01:41

hully
Member
Registered: 2022-11-14
Posts: 164

Re: System gets stuck

Also, when is DKMS needed?

When I use kernels other than "linux"? For instance "linux-zen"? "linux-lts"?

Or when I compile my kernel from source?

Offline

#16 2023-01-26 17:05:10

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,393

Re: System gets stuck

hully wrote:

How was I supposed to know I had to add this kernel parameter?

It's in the wiki and the module leaves a googleable error in the journal.
Also, you can ask instead of assuming. The IBT situation is somewhat recent (4 or 5 months or so?)

You need dkms when no precompiled module for your kernel is available.
There're nvidia and nvidia-lts, so for those you won't need dkms.
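
That rule, written out as a sketch. The function name `nvidia_pkg_for_kernel` is made up, the package names are the ones current at the time of this thread, and the `-headers` naming is an assumption for kernels outside the official repos.

```shell
# Hypothetical helper: which NVIDIA driver package matches a given Arch kernel package.
# Kernels without a precompiled module fall back to nvidia-dkms plus that kernel's headers.
nvidia_pkg_for_kernel() {
    case "$1" in
        linux)     echo nvidia ;;
        linux-lts) echo nvidia-lts ;;
        *)         echo "nvidia-dkms $1-headers" ;;
    esac
}
```

For example, `nvidia_pkg_for_kernel linux-zen` prints `nvidia-dkms linux-zen-headers`.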

Back on topic: does the system still "get stuck"?

Offline

#17 2023-01-26 17:18:56

d.ALT
Member
Registered: 2019-05-10
Posts: 959

Re: System gets stuck

hully wrote:

Also, when is DKMS needed?

If a user installs or uses multiple kernels, they can opt for nvidia-dkms alone (plus the headers for each kernel!).
DKMS will take care of (automagically) rebuilding the needed modules (the NVIDIA ones, in this case) every time there's a kernel update, thanks to its Pacman hook.

Another Pacman hook automates rebuilding the initramfs whenever the NVIDIA drivers are updated.

All of this applies to Arch Linux's official kernel packages: linux, linux-lts, or linux-zen (and their respective headers!).


<49,17,III,I>    Fama di loro il mondo esser non lassa;
<50,17,III,I>    misericordia e giustizia li sdegna:
<51,17,III,I>    non ragioniam di lor, ma guarda e passa.

Offline

#18 2023-01-26 17:20:39

hully
Member
Registered: 2022-11-14
Posts: 164

Re: System gets stuck

seth wrote:

It's in the wiki and the module leaves a googleable error in the journal.

So once it booted, after several minutes, I should have checked the logs and googled the error?

seth wrote:

You need dkms when no precompiled module for your kernel is available.
There're nvidia and nvidia-lts, so for those you won't need dkms.

So "nvidia" is the precompiled kernel module for "linux", "nvidia-lts" is the precompiled kernel module for "linux-lts", and so on?

And what is "nvidia-dkms"? A module that gets compiled from source?

How do you reconcile the fact that "nvidia-dkms" gets compiled from source with the fact that Arch is a binary distro where sources are found only in the AUR?

seth wrote:

Back on topic: does the system still "get stuck"?

Not for now.

Last edited by hully (2023-01-26 17:21:42)

Offline

#19 2023-01-26 17:23:57

d.ALT
Member
Registered: 2019-05-10
Posts: 959

Re: System gets stuck

hully wrote:

"nvidia-dkms" gets compiled from source

Who said that?


<49,17,III,I>    Fama di loro il mondo esser non lassa;
<50,17,III,I>    misericordia e giustizia li sdegna:
<51,17,III,I>    non ragioniam di lor, ma guarda e passa.

Offline

#20 2023-01-26 17:25:53

hully
Member
Registered: 2022-11-14
Posts: 164

Re: System gets stuck

d.ALT wrote:
hully wrote:

"nvidia-dkms" gets compiled from source

Who said that?

seth wrote:

You need dkms when no precompiled module for your kernel is available.

nvidia-dkms description wrote:

NVIDIA drivers - module sources

Last edited by hully (2023-01-26 17:26:49)

Offline

#21 2023-01-26 17:32:31

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,393

Re: System gets stuck

hully wrote:

So once it booted, after several minutes, I should have checked the logs and googled the error?

No, you would have inspected the journal of the failing boot after it failed to see why it failed and fix the failure.
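
One way that inspection could look, as a sketch: restrict the failed boot's journal to error-level messages so the module failure stands out. `boot_errors` is a made-up name; `-b` and `-p` are regular `journalctl` options.

```shell
# Hypothetical helper: show only messages of priority "err" or worse for a given boot.
# $1 = boot offset, defaulting to -1 (the previous boot).
boot_errors() {
    journalctl --no-pager -b "${1:--1}" -p err
}
```

Run it right after rebooting out of the failed boot (with root privileges if needed), while that boot is still offset -1.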

DKMS is explained in https://wiki.archlinux.org/title/Dynami … le_Support which d.ALT linked before.

Arch is a binary distro, but that doesn't preclude you from e.g. compiling your own kernel, at which point you'll need DKMS.
Arguably there could be more out-of-tree modules for -zen, -rt and -hardened, but so far no TU seems to care.

FTR, nvidia-dkms installs sources; it does not somehow get compiled when you install it. DKMS will, however, then use those sources to compile modules for your kernel.
Read the wiki.

Offline

#22 2023-01-26 18:25:00

hully
Member
Registered: 2022-11-14
Posts: 164

Re: System gets stuck

It got stuck again, but this time I was able to quit the compositor and return to a working VT which allowed me to restart the compositor again. Which is MUCH better.

Here are the logs: https://raw.githubusercontent.com/875d/ … ournal.txt

It looks like it is due to kdeconnect crashing?

Head_on_a_Stick wrote:
hully wrote:

It looks like it's a binary file.

That can be read with

strings /var/log/journal/69e8b7f93d494b2dbd77c5fb13bb45a4/system@0005f2b5792a4bf7-3e4ee0067ca4c654.journal~ | grep -i message

How can I get timestamps for those messages? I still see the old error messages (timeout waiting for GSP), but I don't know whether they refer to the old situation.

Offline

#23 2023-01-26 20:06:22

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,393

Re: System gets stuck

The journal doesn't seem to cover the hyprland restart?

Jan 26 17:17:26 raffarch dbus-daemon[4693]: [session uid=1000 pid=4693] Activating via systemd: service name='org.freedesktop.impl.portal.desktop.hyprland' unit='xdg-desktop-portal-hyprland.service' requested by ':1.15' (uid=1000 pid=4943 comm="/usr/lib/xdg-desktop-portal")
Jan 26 17:17:26 raffarch systemd[4676]: Starting Portal service (Hyprland implementation)...
Jan 26 17:17:26 raffarch dbus-daemon[4693]: [session uid=1000 pid=4693] Successfully activated service 'org.freedesktop.impl.portal.desktop.hyprland'
Jan 26 17:17:26 raffarch systemd[4676]: Started Portal service (Hyprland implementation).
                                          'HYPRLAND_CMD': 'Hyprland',
                                          'HYPRLAND_INSTANCE_SIGNATURE': 'fc89e70a1fb74429ad0f772d399325f69e65b357_1674749841',
                                          'XDG_CURRENT_DESKTOP': 'Hyprland',

nor any GPU-related issues, nor a console login.

Offline

#24 2023-01-31 08:58:59

hully
Member
Registered: 2022-11-14
Posts: 164

Re: System gets stuck

It still gets stuck from time to time.

For example, it happened just now, following a system upgrade.

But now I'm able to get to a TTY and restart the compositor, which is MUCH better than before.

Here is the system log: https://github.com/875d/journals/blob/main/journal.txt
Here is the truncated file: https://github.com/875d/journals/blob/m … ncated.txt

Question again: How can I get the timestamps from the messages in the truncated file?

Offline

#25 2023-01-31 09:39:31

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,393

Re: System gets stuck

Does the incident fall into this timeframe?

Jan 31 09:42:02 raffarch sudo[618494]: raffaele : TTY=pts/0 ; PWD=/home/raffaele ; USER=root ; COMMAND=/usr/bin/pacman -U --noconfirm --config /etc/pacman.conf -- /home/raffaele/.cache/yay/waybar-hyprland-git/waybar-hyprland-git-0.9.17.r46.gc93811b1-1-x86_64.pkg.tar.zst
Jan 31 09:45:37 raffarch systemd[1]: Started Getty on tty2.
Jan 31 09:45:47 raffarch systemd-logind[1799]: New session 3 of user raffaele.
Jan 31 09:45:52 raffarch fuzzel[620568]: wayland: failed to read events from the Wayland socket: Broken pipe

---------
FTR:

Jan 31 00:15:35 raffarch kernel: audit: type=1701 audit(1675120535.170:1914): auid=1000 uid=1000 gid=1000 ses=1 pid=3852 comm="Hyprland" exe="/usr/bin/Hyprland" sig=6 res=1
Jan 31 00:15:35 raffarch audit[3852]: ANOM_ABEND auid=1000 uid=1000 gid=1000 ses=1 pid=3852 comm="Hyprland" exe="/usr/bin/Hyprland" sig=6 res=1
Jan 31 00:15:35 raffarch audit: BPF prog-id=69 op=LOAD
Jan 31 00:15:35 raffarch audit: BPF prog-id=70 op=LOAD
Jan 31 00:15:35 raffarch audit: BPF prog-id=71 op=LOAD
Jan 31 00:15:35 raffarch kernel: audit: type=1334 audit(1675120535.177:1915): prog-id=69 op=LOAD
Jan 31 00:15:35 raffarch kernel: audit: type=1334 audit(1675120535.177:1916): prog-id=70 op=LOAD
Jan 31 00:15:35 raffarch kernel: audit: type=1334 audit(1675120535.177:1917): prog-id=71 op=LOAD
Jan 31 00:15:35 raffarch systemd[1]: Started Process Core Dump (PID 449252/UID 0).
Jan 31 00:15:35 raffarch audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@1-449252-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jan 31 00:15:35 raffarch kernel: audit: type=1130 audit(1675120535.233:1918): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@1-449252-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jan 31 00:15:35 raffarch systemd-coredump[449253]: Process 3852 (Hyprland) of user 1000 dumped core.
                                                   
                                                   Stack trace of thread 3852:
                                                   #0  0x00007f3f1e3b964c n/a (libc.so.6 + 0x8864c)
                                                   #1  0x00007f3f1e369938 raise (libc.so.6 + 0x38938)
                                                   #2  0x00007f3f1e35353d abort (libc.so.6 + 0x2253d)
                                                   #3  0x00007f3f1e3ad7ee n/a (libc.so.6 + 0x7c7ee)
                                                   #4  0x00007f3f1e3c33dc n/a (libc.so.6 + 0x923dc)
                                                   #5  0x00007f3f1e3c523c n/a (libc.so.6 + 0x9423c)
                                                   #6  0x00007f3f1e3c7ba3 __libc_free (libc.so.6 + 0x96ba3)
                                                   #7  0x0000560aae36a09c _ZN14CConfigManagerD2Ev (Hyprland + 0x5609c)
                                                   #8  0x0000560aae36ad81 _ZNSt10unique_ptrI14CConfigManagerSt14default_deleteIS0_EED1Ev (Hyprland + 0x56d81)
                                                   #9  0x00007f3f1e36bf85 n/a (libc.so.6 + 0x3af85)
                                                   #10 0x00007f3f1e36c100 exit (libc.so.6 + 0x3b100)
                                                   #11 0x00007f3f1e354297 n/a (libc.so.6 + 0x23297)
                                                   #12 0x00007f3f1e35434a __libc_start_main (libc.so.6 + 0x2334a)
                                                   #13 0x0000560aae352945 _start (Hyprland + 0x3e945)
                                                   
                                                   Stack trace of thread 4029:
                                                   #0  0x00007f3f1e43ab6f accept (libc.so.6 + 0x109b6f)
                                                   #1  0x0000560aae3fce33 _ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJZN13CEventManager11startThreadEvEUlvE_EEEEE6_M_runEv (Hyprland + 0xe8e33)
                                                   #2  0x00007f3f1e6d7283 execute_native_thread_routine (libstdc++.so.6 + 0xd7283)
                                                   #3  0x00007f3f1e3b78fd n/a (libc.so.6 + 0x868fd)
                                                   #4  0x00007f3f1e439d20 n/a (libc.so.6 + 0x108d20)
                                                   ELF object binary architecture: AMD x86-64
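
On the timestamp question: the journal stores each entry's time in binary form, so `strings` will not show it. Some messages do carry their own epoch stamp in the text, though, such as the `audit(1675120535.170:...)` lines above or NetworkManager's `[1674680234.3559]`. The following is only a partial workaround sketched under that assumption; `embedded_times` is a made-up name and GNU `date` is required.

```shell
# Hypothetical helper: find "seconds since the epoch" values embedded in
# audit(...) stamps or NetworkManager-style [seconds.fraction] stamps on stdin
# and print each one as a UTC time. Requires GNU date (for -d @SECONDS).
embedded_times() {
    grep -oE '(audit\(|\[)[0-9]{10}\.[0-9]+' |
        sed -E 's/^[^0-9]*([0-9]{10}).*/\1/' |
        while read -r secs; do
            date -u -d "@$secs" '+%Y-%m-%d %H:%M:%S UTC'
        done
}
```

Possible usage: `embedded_times < truncated.txt`. The output is in UTC, while the journal lines quoted in this thread appear to be in local time one hour ahead of it.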

Offline
