Hi,
I have this weird problem that I can't analyze myself to find out the real cause.
When I use systemd-networkd, for example, it doesn't happen. But when I use NetworkManager, my ssh connection suddenly terminates by itself. At home I connect to sshd on my local server, which is also my router and runs Arch.
At first I thought it was a client or station problem, but it happens from other devices and clients as well. It happens at different time intervals ... sometimes it's fine for a few tens of minutes, sometimes less.
There is nothing in the log / journal.
Have you ever faced this problem?
Thanks
[root@home ~]# uname -a
Linux home 6.1.1-arch1-1 #1 SMP PREEMPT_DYNAMIC Wed, 21 Dec 2022 22:27:55 +0000 x86_64 GNU/Linux
dbus.service loaded active running D-Bus System Message Bus
frr.service loaded active running FRRouting
getty@tty1.service loaded active running Getty on tty1
kea-dhcp4.service loaded active running ISC Kea IPv4 DHCP daemon
kea-dhcp6.service loaded active running ISC Kea IPv6 DHCP daemon
kmod-static-nodes.service loaded active exited Create List of Static Device Nodes
ldconfig.service loaded active exited Rebuild Dynamic Linker Cache
NetworkManager-wait-online.service loaded active exited Network Manager Wait Online
NetworkManager.service loaded active running Network Manager
nmb.service loaded active running Samba NMB Daemon
serial-getty@ttyS0.service loaded active running Serial Getty on ttyS0
smb.service loaded active running Samba SMB Daemon
snmpd.service loaded active running Simple Network Management Protocol (SNMP) Daemon
sshd.service loaded active running OpenSSH Daemon
systemd-fsck@dev-disk-by\x2duuid-46c0c8ab\x2dd1ea\x2d48bf\x2db2d7\x2d08d4b8a6fc19.service loaded active exited File System Check on /dev/disk/by-uuid/46c0c8ab-d1ea-48bf-b2d7-08d4b8a6fc19
systemd-journal-catalog-update.service loaded active exited Rebuild Journal Catalog
systemd-journal-flush.service loaded active exited Flush Journal to Persistent Storage
systemd-journald.service loaded active running Journal Service
systemd-logind.service loaded active running User Login Management
systemd-random-seed.service loaded active exited Load/Save Random Seed
systemd-remount-fs.service loaded active exited Remount Root and Kernel File Systems
systemd-sysctl.service loaded active exited Apply Kernel Variables
systemd-sysusers.service loaded active exited Create System Users
systemd-timesyncd.service loaded active running Network Time Synchronization
systemd-tmpfiles-setup-dev.service loaded active exited Create Static Device Nodes in /dev
systemd-tmpfiles-setup.service loaded active exited Create Volatile Files and Directories
systemd-udev-trigger.service loaded active exited Coldplug All udev Devices
systemd-udevd.service loaded active running Rule-based Manager for Device Events and Files
systemd-update-done.service loaded active exited Update is Completed
systemd-update-utmp.service loaded active exited Record System Boot/Shutdown in UTMP
systemd-user-sessions.service loaded active exited Permit User Sessions
user-runtime-dir@0.service loaded active exited User Runtime Directory /run/user/0
user@0.service loaded active running User Manager for UID 0
Last edited by vecino (2023-01-01 12:38:11)
Check/post the output of
journalctl -b
from when the issue happens. Even if you say "there's nothing", how confident are you in identifying the potentially relevant lines?
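One way to narrow the journal down to the moments around a drop is to export it as JSON and keep only the entries logged close to the disconnect time. This is a minimal sketch, not a finished tool: it assumes the export was made with journalctl -b -o json (one JSON object per line), and the function name entries_near and the 30-second default window are made up for illustration.

```python
import json
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def entries_near(json_lines, when, window_s=30):
    """Return (timestamp, identifier, message) tuples for journal entries
    logged within window_s seconds of `when` (a timezone-aware datetime).

    json_lines: iterable of lines as written by `journalctl -b -o json`.
    """
    lo = when - timedelta(seconds=window_s)
    hi = when + timedelta(seconds=window_s)
    hits = []
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        # __REALTIME_TIMESTAMP is microseconds since the epoch, stored as a string.
        ts = EPOCH + timedelta(microseconds=int(entry["__REALTIME_TIMESTAMP"]))
        if lo <= ts <= hi:
            msg = entry.get("MESSAGE")
            # Non-text messages are exported as a byte array; just show their repr.
            if not isinstance(msg, str):
                msg = repr(msg)
            hits.append((ts, entry.get("SYSLOG_IDENTIFIER", "?"), msg))
    return hits
```

Usage would be roughly: run journalctl -b -o json > boot.json on the server, then pass open("boot.json") together with the time the client reported the drop, and read every returned line rather than only the ones from sshd.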
Last edited by V1del (2022-12-30 12:34:18)
@V1del Thanks for your response.
I'm used to working with journal and I really didn't find anything wrong there.
I turned on DEBUG logging in NetworkManager and found out that the problem is probably caused by "Checking connectivity" ... I don't have static routes set up via NetworkManager because it's a router that runs frrouting (OSPF) and I use dynamic routing. That's probably why the connectivity check was failing and NetworkManager was dropping the interface (down / up). Now I'm trying to see if that's really it.
NetworkManager - 4.4 Checking connectivity
/etc/NetworkManager/conf.d/20-connectivity.conf
[connectivity]
enabled=false
This is not related, but sometimes there is this line in the journal that I don't understand - is it ok or should I edit something? Thank you
pam_warn(systemd-user:setcred): function=[pam_sm_setcred] flags=0x8002 service=[systemd-user] terminal=[] user=[root] ruser=[<unknown>] rhost=[<unknown>]
The connectivity check does nothing other than show whether the wider internet is reachable and allow NetworkManager desktop clients to show an indicator that you're technically connected but can't reach the internet. It does not lead to any "corrective" action on NM's part.
That line is most likely not related, please post the entire journal, otherwise this is just random guesses.
So I rejoiced prematurely. It doesn't happen that often, but it still does.
Here is my journal: https://pastebin.com/aYGBhkRQ
If you find any explanation or cause, I will be glad. Thank you
Last edited by vecino (2022-12-31 12:37:05)
And since then it's quiet and the ssh connection doesn't end by itself.
Is the journal supposed to cover such connection/ssh session loss?
Because there's no global connection reached (so the connectivity check is probably disabled) and there are ~4 ssh logins (as root…) - 3 of them close after some minutes w/o any obvious interference.
During this time, the ssh connection was terminated at least once by itself.
Dec 31 13:01:32 home systemd[1]: Started User Manager for UID 0.
Dec 31 13:01:32 home systemd[1]: Started Session 8 of User root.
Dec 31 13:01:33 home sshd[3338]: Accepted publickey for root from 2a01:***:****:0:c86a:9a13:ba9c:e746 port 49793 ssh2: RSA SHA256:HxSGjQ4gimNur0kcoUDPpEKTR7M+njXaMY1qZlWLKPc
Dec 31 13:01:33 home sshd[3338]: pam_unix(sshd:session): session opened for user root(uid=0) by (uid=0)
Dec 31 13:01:33 home systemd-logind[301]: New session 10 of user root.
Dec 31 13:01:33 home systemd[1]: Started Session 10 of User root.
Dec 31 13:24:50 home sshd[3461]: Accepted publickey for root from 2a01:***:****:0:c86a:9a13:ba9c:e746 port 49988 ssh2: RSA SHA256:HxSGjQ4gimNur0kcoUDPpEKTR7M+njXaMY1qZlWLKPc
Dec 31 13:24:50 home sshd[3461]: pam_unix(sshd:session): session opened for user root(uid=0) by (uid=0)
Dec 31 13:24:50 home systemd-logind[301]: New session 11 of user root.
Dec 31 13:24:51 home systemd[1]: Started Session 11 of User root.
Dec 31 13:24:51 home sshd[3466]: Accepted publickey for root from 2a01:***:****:0:c86a:9a13:ba9c:e746 port 49989 ssh2: RSA SHA256:HxSGjQ4gimNur0kcoUDPpEKTR7M+njXaMY1qZlWLKPc
Dec 31 13:24:51 home sshd[3466]: pam_unix(sshd:session): session opened for user root(uid=0) by (uid=0)
Dec 31 13:24:51 home systemd-logind[301]: New session 12 of user root.
Dec 31 13:24:51 home systemd[1]: Started Session 12 of User root.
But the journal doesn't show any reason why this is happening.
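To see at a glance which ssh sessions in such an excerpt were closed and which never logged a close, the pam_unix lines can be paired by sshd PID. The following is only a rough sketch under assumptions: the input is in the default short journal format shown above, the close message has the usual pam_unix form "session closed for user root", and the name ssh_sessions is invented here. PIDs can be reused over a long boot, so it is only meant for short excerpts.

```python
import re

# Matches the pam_unix session lines that sshd logs in short journal format.
LINE = re.compile(
    r"^(?P<ts>\w{3} +\d+ \d\d:\d\d:\d\d) \S+ sshd\[(?P<pid>\d+)\]: "
    r"pam_unix\(sshd:session\): session (?P<what>opened|closed) "
    r"for user (?P<user>[^\s(]+)"
)


def ssh_sessions(lines):
    """Map sshd PID -> {'user', 'opened', 'closed'}; a missing event stays None."""
    sessions = {}
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue  # not a pam_unix open/close line from sshd
        rec = sessions.setdefault(
            int(m["pid"]), {"user": m["user"], "opened": None, "closed": None}
        )
        rec[m["what"]] = m["ts"]
    return sessions
```

A session whose "closed" entry stays None either is still open or ended without sshd noticing a clean close at that point, which would fit a drop happening somewhere on the network path rather than on the server.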
Have you checked the clients journal?
Also that's not in the journal you posted.
I am using Windows with the Xshell 7 client (the last message after the disconnect is "Socket error Event: 32 Error: 10053"). I found that the problem is probably related to ipv6 ... if I connect via ipv4 it doesn't happen. I'm checking the ipv6 settings now, but I've been using it the same way for years.
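A quick way to compare the two address families from any client with Python is a small TCP probe against the ssh port. This is a hedged sketch, not a diagnosis: the helper name tcp_reachable is hypothetical, the port and timeout are placeholder defaults, and a successful connect only shows that the path worked at that moment.

```python
import socket


def tcp_reachable(host, port=22, family=socket.AF_INET6, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds using `family`
    (socket.AF_INET for ipv4, socket.AF_INET6 for ipv6)."""
    try:
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except socket.gaierror:
        return False  # no address of that family for this host
    for af, socktype, proto, _canon, addr in infos:
        try:
            with socket.socket(af, socktype, proto) as s:
                s.settimeout(timeout)
                s.connect(addr)
                return True
        except OSError:
            continue  # refused, unreachable or timed out; try the next address
    return False
```

Calling it in a loop once per second for both families and printing a timestamp whenever the result changes would show whether ipv6 reachability drops while ipv4 stays up.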
Can you test a different client? Live distro?
The problem is probably somewhere in my network. I tried restarting my vlan switch. It looks good now. My deepest apologies to NetworkManager for suspecting that he is causing the problem. For a while I also suspected stations with Windows 11 because from time to time when pinging an ipv6 address "Ping Transmit Failed General Failure" was displayed ... but not on stations with Linux. I still don't know the real cause.
I solved the pam_warn messages like this:
[root@home ~]# cat /etc/pam.d/other
#%PAM-1.0
auth required pam_deny.so
#auth required pam_warn.so
account required pam_deny.so
#account required pam_warn.so
password required pam_deny.so
#password required pam_warn.so
session required pam_deny.so
#session required pam_warn.so
Thank you very much for your advice here. Great community. @V1del @seth