Edit: This also has not worked, I am still getting dropped connections and kernel iwlwifi errors. Not sure what there is left to do other than switch to wpa_supplicant for now...
Edit 2: iwd is now at version 1.25, and I got a new router and access points for my house; the issue is now gone. Maybe it was my network setup after all, though it still seems odd to me that the issue wasn't present with wpa_supplicant on the old setup. Thanks to everyone who tried to help, I'm marking this as solved.
]]>[IPv6AcceptRA]
RouteMetric=512
this still led to occasional disconnection events. However, checking the logs for potentially more relevant lines, I did note that disconnections were preceded by log lines like:
Jan 27 15:09:49 lenovo2 kernel: wlan0: AP xx:xx:xx:xx:xx:xx changed bandwidth, new config is 2437.000 MHz, width 2 (2427.000/0 MHz)
So I was trying to understand whether the bandwidth switching initiated by the router might lead to loss of a stable wireless connection, and I did find past postings about access points switching between 20MHz and 40MHz on 2.4GHz causing stability problems. So I changed the settings on the router so that the bandwidth for 2.4GHz wireless connections is now fixed at 20MHz only. In addition, I set the 5GHz bandwidth to 80MHz rather than allowing all bandwidth options. The 2.4GHz SSID is also now set to N only, and the 5GHz SSID to AC only on one router (a second router allows different modes on 5GHz), instead of allowing multiple modes as before.
The laptop I am using only has 2.4GHz capability, and after making that change I have had no loss-of-connection events at all. I will run a soak test over the next few days on a different laptop that has both 2.4 and 5GHz, to see if these changes finally resolve the issue. The iwd update to 1.22 brought no improvement either (and 1.23 is in [testing] at the moment too). I will report back after a long period connected on whether fixing the bandwidth setting on the router has resolved the issue for me.
Edit: I have now also been testing the 5GHz connection and it too is now very stable.
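For anyone wanting to check the same correlation in their own journal, here is a minimal sketch. The function name bandwidth_before_drop is made up for illustration, and it only pairs each "Lost carrier" line with the most recent "changed bandwidth" line; it does not check how far apart in time the two events are.

```shell
# Sketch: list "changed bandwidth" kernel lines that precede a carrier loss.
# Reads journal text on stdin. Hypothetical helper, not part of iwd or systemd.
bandwidth_before_drop() {
    awk '
        /changed bandwidth/ { pending = $0; next }   # remember the latest change
        /Lost carrier/ && pending != "" { print pending; pending = "" }
    '
}

# Usage (not run here):
#   journalctl -b | bandwidth_before_drop
```

If this prints nothing, the drops in that boot were not preceded by a bandwidth change from the AP, and the router setting is probably not the cause.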
]]>I wonder if there are any parameters for iwd itself that might help - there is a good list of options in the output of 'man iwd.config', including some concerning roaming thresholds, such as 'RoamThreshold'.
You can play with RoamThreshold[5G] to make roaming initiated by IWD less sensitive. The problem is that this won't affect roaming if the AP requests it. Currently IWD has no switches to turn this off or alter the behavior. IWD assumes that if the AP wants us to roam, it's for a good reason (high load, low signal, AP going down, etc.).
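As a rough sketch of what that could look like: the helper below just prints a fragment to review and merge into /etc/iwd/main.conf by hand. The function name and the example values are hypothetical, and I am assuming both keys belong in the [General] section and that your iwd version supports RoamThreshold5G; check 'man iwd.config' before using it.

```shell
# Print a candidate [General] fragment for /etc/iwd/main.conf.
# Arguments: 2.4GHz threshold, 5GHz threshold (RSSI in dBm, negative numbers).
# Hypothetical helper; it only prints text and does not touch any file.
roam_fragment() {
    printf '[General]\nRoamThreshold=%s\nRoamThreshold5G=%s\n' "$1" "$2"
}

# Usage (example values only, not a recommendation):
#   roam_fragment -80 -85
```

A more negative threshold means the signal has to get weaker before IWD starts looking for another BSS, so roaming triggers less often. Roams requested by the AP are not affected.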
]]>Thank you for the logs, but I don't see any deauths for IWD at least in those logs... There was a bunch of roaming which all was fine (some failed but this is due to no better BSS candidates, totally normal). Nothing really seems wrong from what I can tell.
One thing to note is wpa_supplicant never roamed in those logs you posted so that could be the differentiating factor as far as how both supplicants are interacting with the card. The fact you were seeing those strange "Unhandled Alg" prints tells me the firmware is not behaving right. No idea if roaming could have something to do with it or not. I would at least let linux-wireless know about this behavior with iwlwifi and maybe someone on the driver/FW team could give some insight.
If you happen to get iwmon logs when IWD deauths I'm happy to take another look.
]]>Jan 10 10:02:16 lenovo2 kernel: wlan0: AP xx:xx:xx:xx:xx:xx changed bandwidth, new config is 2437.000 MHz, width 2 (2427.000/0 MHz)
Jan 10 10:03:06 lenovo2 systemd-networkd[701]: wlan0: Lost carrier
Jan 10 10:03:06 lenovo2 systemd-networkd[701]: wlan0: Failed to send DHCP RELEASE, ignoring: Operation not permitted
Jan 10 10:03:06 lenovo2 systemd-networkd[701]: wlan0: DHCP lease lost
Jan 10 10:03:07 lenovo2 iwd[700]: Received Deauthentication event, reason: 4, from_ap: false
Jan 10 10:03:07 lenovo2 dbus-daemon[664]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service' requested by ':1.3' (uid=193 pid=701 comm="/usr/lib/systemd/systemd-networkd " label="kernel")
Jan 10 10:03:07 lenovo2 audit: BPF prog-id=51 op=LOAD
Jan 10 10:03:07 lenovo2 audit: BPF prog-id=52 op=LOAD
Jan 10 10:03:07 lenovo2 audit: BPF prog-id=53 op=LOAD
Jan 10 10:03:07 lenovo2 kernel: audit: type=1334 audit(1641808987.022:234): prog-id=51 op=LOAD
Jan 10 10:03:07 lenovo2 kernel: audit: type=1334 audit(1641808987.022:235): prog-id=52 op=LOAD
Jan 10 10:03:07 lenovo2 kernel: audit: type=1334 audit(1641808987.022:236): prog-id=53 op=LOAD
Jan 10 10:03:07 lenovo2 systemd[1]: Starting Hostname Service...
Jan 10 10:03:07 lenovo2 systemd-networkd[701]: wlan0: DHCPv6 lease lost
Jan 10 10:03:07 lenovo2 systemd-timesyncd[509]: No network connectivity, watching for changes.
Jan 10 10:03:07 lenovo2 dbus-daemon[664]: [system] Successfully activated service 'org.freedesktop.hostname1'
Jan 10 10:03:07 lenovo2 systemd[1]: Started Hostname Service.
Jan 10 10:03:07 lenovo2 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jan 10 10:03:07 lenovo2 kernel: audit: type=1130 audit(1641808987.064:237): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jan 10 10:03:07 lenovo2 systemd-hostnamed[3032]: Hostname set to <lenovo2> (static)
Jan 10 10:03:07 lenovo2 kernel: wlan0: authenticate with xx:xx:xx:xx:xx:xx
Jan 10 10:03:07 lenovo2 kernel: wlan0: send auth to xx:xx:xx:xx:xx:xx (try 1/3)
Jan 10 10:03:07 lenovo2 kernel: wlan0: authenticated
Jan 10 10:03:07 lenovo2 kernel: wlan0: associate with xx:xx:xx:xx:xx:xx (try 1/3)
Jan 10 10:03:07 lenovo2 kernel: wlan0: RX AssocResp from 1xx:xx:xx:xx:xx:xx (capab=0x1431 status=0 aid=7)
Jan 10 10:03:07 lenovo2 systemd-networkd[701]: wlan0: Connected WiFi access point: mike-guest (xx:xx:xx:xx:xx:xx)
Jan 10 10:03:07 lenovo2 kernel: wlan0: associated
Jan 10 10:03:07 lenovo2 systemd-networkd[701]: wlan0: Gained carrier
https://www.cisco.com/assets/sol/sb/WAP … odes2.html
So deauthenticated due to inactivity by local choice? would a ping every few minutes fix that?
Possibly yes - and needs testing further.
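A minimal sketch of such a keepalive, in case someone wants to test the idea. Both function names are made up for illustration, and it assumes the gateway answers ICMP echo. Whether this actually prevents the reason 4 deauthentication is untested.

```shell
# Print the first default gateway found in the output of "ip route" (stdin).
default_gateway() {
    awk '$1 == "default" && $2 == "via" { print $3; exit }'
}

# Send a single ping to the default gateway. Intended to be called every
# few minutes from cron or a systemd timer; hypothetical helper.
keepalive_once() {
    gw=$(ip route | default_gateway)
    [ -n "$gw" ] && ping -c 1 -W 2 "$gw" > /dev/null
}
```

Running keepalive_once on a timer for a day or two, then checking the journal for new Deauthentication events, should show whether inactivity is really the trigger.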
]]>So deauthenticated due to inactivity by local choice? would a ping every few minutes fix that?
]]>$ sudo journalctl -b | grep -e 'deauth\|Deauth'
Jan 08 15:37:03 ryzen1 iwd[644]: Received Deauthentication event, reason: 4, from_ap: false
So I have reinstated the [scan] line in iwd's main.conf and will see whether these events continue. Also, this Deauth event has reason 4, which is different from what I had seen previously, so it is not clear whether these events are due to the several changes to the packages/config or not. I will continue to test.
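To make it easier to see whether the reason code changes between tests, the grep above can be extended into a small summary. This is only a sketch: the function name deauth_summary is made up, and it assumes the iwd message format shown in the log line above.

```shell
# Count iwd deauthentication events per reason code and direction.
# Reads journal text on stdin; hypothetical helper, not part of iwd.
deauth_summary() {
    sed -n 's/.*Received Deauthentication event, reason: \([0-9][0-9]*\), from_ap: \([a-z][a-z]*\).*/reason=\1 from_ap=\2/p' |
    sort | uniq -c | awk '{ print $2, $3, "count=" $1 }'
}

# Usage (not run here):
#   sudo journalctl -b | deauth_summary
```

Each output line has the form "reason=N from_ap=true|false count=K", so a change from one reason code to another after a config change stands out immediately.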
]]>@jprestwo here is the output you asked for. The files are large so I put them on github.
I will update this comment with the iwd output as soon as I have enough data.
These were the errors I got with iwd 1.20 today.
Jan 08 14:27:09 host kernel: iwlwifi 0000:02:00.0: Unhandled alg: 0xc040071b
Jan 08 15:22:28 host kernel: iwlwifi 0000:02:00.0: Unhandled alg: 0xc0400707
Jan 08 15:30:57 host kernel: iwlwifi 0000:02:00.0: Not associated and the time event is over already...
(Note: this was done with iwd 1.20; I only just got the update via pacman and will start testing.)
Update: I'm getting disconnects and the same kernel errors with 1.21 too.
@Slithery
$ find /etc/systemd -type l -exec test -f {} \; -print | awk -F'/' '{ printf ("%-40s | %s\n", $(NF-0), $(NF-1)) }' | sort -f
accounts-daemon.service | graphical.target.wants
bluetooth.service | bluetooth.target.wants
cronie.service | multi-user.target.wants
cups.socket | sockets.target.wants
dbus-org.bluez.service | system
dbus-org.fedoraproject.FirewallD1.service | system
dbus-org.freedesktop.network1.service | system
dbus-org.freedesktop.timesync1.service | system
dirmngr.socket | sockets.target.wants
display-manager.service | system
firewalld.service | multi-user.target.wants
fstrim.timer | timers.target.wants
gcr-ssh-agent.socket | sockets.target.wants
getty@tty1.service | getty.target.wants
gpg-agent-browser.socket | sockets.target.wants
gpg-agent-extra.socket | sockets.target.wants
gpg-agent.socket | sockets.target.wants
gpg-agent-ssh.socket | sockets.target.wants
intel-undervolt.service | hibernate.target.wants
intel-undervolt.service | hybrid-sleep.target.wants
intel-undervolt.service | multi-user.target.wants
intel-undervolt.service | suspend.target.wants
iwd.service | multi-user.target.wants
lm_sensors.service | multi-user.target.wants
logrotate.timer | timers.target.wants
p11-kit-server.socket | sockets.target.wants
pipewire-media-session.service | pipewire.service.wants
pipewire-pulse.socket | sockets.target.wants
pipewire-session-manager.service | user
pipewire.socket | sockets.target.wants
plocate-updatedb.timer | timers.target.wants
power-profiles-daemon.service | graphical.target.wants
reflector.timer | timers.target.wants
remote-fs.target | multi-user.target.wants
systemd-networkd.service | multi-user.target.wants
systemd-networkd.socket | sockets.target.wants
systemd-networkd-wait-online.service | network-online.target.wants
systemd-timesyncd.service | sysinit.target.wants
unbound.service | multi-user.target.wants
xdg-user-dirs-update.service | default.target.wants
$ find /etc/systemd -type l -exec test -f {} \; -print | awk -F'/' '{ printf ("%-40s | %s\n", $(NF-0), $(NF-1)) }' | sort -f
cups.socket | sockets.target.wants
dbus-org.freedesktop.network1.service | system
dbus-org.freedesktop.timesync1.service | system
dirmngr.socket | sockets.target.wants
display-manager.service | system
dovecot.service | multi-user.target.wants
fstrim.timer | timers.target.wants
gcr-ssh-agent.socket | sockets.target.wants
getty@tty1.service | getty.target.wants
gpg-agent-browser.socket | sockets.target.wants
gpg-agent-extra.socket | sockets.target.wants
gpg-agent-ssh.socket | sockets.target.wants
gpg-agent.socket | sockets.target.wants
iwd.service | multi-user.target.wants
lm_sensors.service | multi-user.target.wants
nftables.service | multi-user.target.wants
p11-kit-server.socket | sockets.target.wants
pipewire-media-session.service | pipewire.service.wants
pipewire-pulse.socket | sockets.target.wants
pipewire-session-manager.service | user
pipewire.socket | sockets.target.wants
remote-fs.target | multi-user.target.wants
sshd.service | multi-user.target.wants
systemd-networkd.service | multi-user.target.wants
systemd-networkd.socket | sockets.target.wants
systemd-timesyncd.service | sysinit.target.wants
unbound.service | multi-user.target.wants
xdg-user-dirs-update.service | default.target.wants
find /etc/systemd -type l -exec test -f {} \; -print | awk -F'/' '{ printf ("%-40s | %s\n", $(NF-0), $(NF-1)) }' | sort -f
Roaming by clients in general is very common and expected with both enterprise and home APs. In this case the *access point* requested the client to roam, which is an 802.11 feature, BSS Transition Management. This is far less common, and it is what made me want to know more about what wpa_supplicant is doing in this case. I still doubt this has anything to do with the disconnect (since the roam completed successfully), but you never know.
At least AVM (probably still the market leader for consumer IADs and routers in Germany) seems to use it in their newer(?) mesh products: https://en.avm.de/service/knowledge-bas … s-it-work/
]]>