
#1 2023-05-08 09:21:57

Moviuro
Member
Registered: 2012-06-03
Posts: 74

WiFi interface in bond receiving packets when primary is connected

Hi,

In addition to my previous issue with bonds, my two Arch Linux machines show the same unexpected behavior: packet loss when pinging them from my router.

https://wiki.linuxfoundation.org/networking/bonding wrote:

mode -> active-backup or 1

Active-backup policy: Only one slave in the bond is active. A different slave becomes active if, and only if, the active slave fails. The bond's MAC address is externally visible on only one port (network adapter) to avoid confusing the switch. [...]

root@toxoplasmosis / # ip a
4: enp4s0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP group default qlen 1000
    link/ether 8a:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff permaddr 70:...
5: wlp5s0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP group default qlen 1000
    link/ether 8a:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff permaddr 34:...
7: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 8a:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.111/24 metric 1024 brd 192.168.1.255 scope global dynamic bond0
       valid_lft 25777sec preferred_lft 25777sec
    [ipv6 setup... NYOB]

I unplug my Ethernet cable, plug it back in, and wait a few minutes:

Ethernet Channel Bonding Driver: v6.2.11-1-clear

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: enp4s0 (primary_reselect always)
Currently Active Slave: enp4s0
MII Status: up  
MII Polling Interval (ms): 100 
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

[...]
root@toxoplasmosis / # tcpdump -nei wlp5s0 icmp # should be empty, right?
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on wlp5s0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
11:10:02.926061 98:xx:xx:xx:xx:xx > 8a:yy:yy:yy:yy:yy, ethertype IPv4 (0x0800), length 66: 185.XX.XX.XX.443 > 192.168.1.111.39328: Flags [.], ack 9215, win 503, options [nop,nop,TS val 3124802758 ecr 4247132757], length 0 # HTTPS traffic I guess?
11:10:02.926480 98:xx:xx:xx:xx:xx > 8a:yy:yy:yy:yy:yy, ethertype IPv4 (0x0800), length 570: AA.AA.AA.AA.51820 > 192.168.1.111.60927: UDP, length 528 # Wireguard VPN
11:10:04.635794 98:xx:xx:xx:xx:xx > 8a:yy:yy:yy:yy:yy, ethertype IPv4 (0x0800), length 330: AA.AA.AA.AA.51820 > 192.168.1.111.60927: UDP, length 288
11:10:10.946487 d0:xx:xx:xx:xx:xx > 8a:yy:yy:yy:yy:yy, ethertype IPv4 (0x0800), length 70: 192.168.1.21.42123 > 192.168.1.111.3478: UDP, length 28 # Ubiquiti devices at home connecting to the aur/unifi software
[...]
11:14:05.489611 98:xx:xx:xx:xx > 8a:yy:yy:yy:yy, ethertype IPv4 (0x0800), length 98: 192.168.1.1 > 192.168.1.111: ICMP echo request, id 15601, seq 923, length 64 # lost (?) ping from router to my machine
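
When reproducing this, it can help to confirm from a script that the primary slave really is the active one at the moment the stray packets show up on the WiFi interface. The following is only a minimal sketch: it assumes the status text has the same "Key: value" layout as the excerpt above (the format the bonding driver exposes under /proc/net/bonding/), and the function names are made up for illustration.

```python
# Sketch: parse bonding status text and check that the primary slave is active.
# SAMPLE copies the excerpt from the post; on a live system the text would
# typically be read from /proc/net/bonding/bond0 (path assumed, not shown in the post).

SAMPLE = """\
Ethernet Channel Bonding Driver: v6.2.11-1-clear

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: enp4s0 (primary_reselect always)
Currently Active Slave: enp4s0
MII Status: up
MII Polling Interval (ms): 100
"""


def parse_bond_status(text: str) -> dict:
    """Collect 'Key: value' pairs; the first occurrence of a key wins,
    so bond-level fields are not overwritten by per-slave sections."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key not in fields:
            fields[key] = value.strip()
    return fields


def primary_is_active(fields: dict) -> bool:
    """True if the configured primary slave is the currently active slave."""
    # "enp4s0 (primary_reselect always)" -> "enp4s0"
    primary = fields.get("Primary Slave", "").split(" ", 1)[0]
    active = fields.get("Currently Active Slave", "")
    return bool(primary) and primary == active


if __name__ == "__main__":
    status = parse_bond_status(SAMPLE)
    print(status["Bonding Mode"], primary_is_active(status))
```

If this reports the primary as active while tcpdump on the backup interface still shows unicast traffic for the bond's MAC, the misdelivery is happening upstream of the host.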

My home network works as follows:

192.168.1/24
router .1 --- USW-24-PoE .30 --- U6 Lite .21 (WAP)
                   \ 1Gbps eth        / WiFi6 chan40
                    ` my machine .111 

What would cause this issue and how can I fix it?


bspwm, BTRFS over LUKS
Archlinux a lot, FreeBSD more and more...
Murphy's rule: The day you need a backup, you tell yourself you should have created some.

Offline

#2 2023-07-28 15:27:14

Moviuro
Member
Registered: 2012-06-03
Posts: 74

Re: WiFi interface in bond receiving packets when primary is connected

I'm bumping this thread as it has received no answers, either here or on the netdev mailing list :( The issue keeps happening on the LTS, zen and regular Linux kernels.

See also https://lore.kernel.org/netdev/ZGzGngNh … 49d74c1e90



Offline

#3 2023-08-01 07:06:35

-thc
Member
Registered: 2017-03-15
Posts: 578

Re: WiFi interface in bond receiving packets when primary is connected

I have read your posts but I fail to understand the reason behind that bond. WiFi as a backup for a failed Ethernet port?

Offline

#4 2023-08-01 08:28:07

Moviuro
Member
Registered: 2012-06-03
Posts: 74

Re: WiFi interface in bond receiving packets when primary is connected

Yes, WiFi as a backup, and as a common setup across all my machines (desktops, and laptops that get (un)plugged randomly).



Offline

#5 2023-08-01 11:02:56

-thc
Member
Registered: 2017-03-15
Posts: 578

Re: WiFi interface in bond receiving packets when primary is connected

So this is my version of what I think happens:

The bond with its MAC is registered via Ethernet on the pSwitch. You disconnect the Ethernet cable.

The pSwitch immediately realizes the port is no longer in use and stops advertising its address via ARP.

The bond switches to WiFi and registers its MAC with the AP. All ARP traffic regarding your IP (.111) should now arrive at the AP and be answered from there.

You plug the Ethernet cable back in. The bond switches back to the Ethernet port and the pSwitch should now serve ARP requests for .111 again.

The AP has no way of "knowing" that your WiFi link should be treated as disconnected and has been substituted by other means. From its point of view, you may simply be out of reach for a short while. The AP still answers ARP requests for .111.

This ARP collision is AFAIK intrinsic to this setup and I can't think of a simple solution. In theory the firmware in both upstream devices has to be "mixed-bond"-aware and should somehow communicate the hand-over.

Last edited by -thc (2023-08-01 11:03:34)

Offline

#6 2024-06-27 20:37:46

westportjack
Member
Registered: 2024-06-27
Posts: 1

Re: WiFi interface in bond receiving packets when primary is connected

Hi, I've experienced the exact same issue on my Arch Linux machine. In my setup I'm using NetworkManager, but that is not relevant here.

The problem is not the Linux network bonding configuration. The problem is in the network, in particular on the Unifi access point. If you look in detail at what is going on, you will notice LLC frames going out of the wired interface of your access point. The frames have the source MAC of the wireless client (my Linux machine).

accesspoint# tcpdump -i eth0 -Q out -nne ether host xx:xx:xx:xx:xx:64 # MAC of my Linux machine
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes

10:06:44.361870 xx:xx:xx:xx:xx:64 > ff:ff:ff:ff:ff:ff, 802.3, length 6: LLC, dsap Null (0x00) Individual, ssap Null (0x00) Response, ctrl 0xaf: Unnumbered, xid, Flags [Response], length 46: 01 02 

Wait! Frames from the MAC of the Linux machine are not supposed to arrive from the access point while the wireless interface is in the bonding backup state. But they do, and that is exactly what causes the problem: the switch sees a MAC flap. This is also why you see incoming traffic on your wireless interface for a short period.

Switch log:

Jun 27 2024 10:06:44 switch %%01L2IFPPI/4/MAC_MOVE_WARN(l)[0]:MAC move detected. (Original-Port=<wired-port>, Flapping-Port=<ap-port>)
Jun 27 2024 10:06:44 switch %%01L2IFPPI/4/MAC_MOVE_WARN(l)[1]:MAC move detected. (Original-Port=<ap-port>, Flapping-Port=<wired-port>)
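
The two log lines above can be reproduced with a toy model of a learning switch. This is a sketch under simplifying assumptions, not the behavior of any particular switch firmware: the class, the MAC address and the frame sequence are invented for illustration, and only the port names are taken from the log placeholders.

```python
# Toy model of switch MAC learning, showing how one stray frame from the AP
# port redirects traffic for the bond's MAC until the wired port is seen again.

class LearningSwitch:
    def __init__(self):
        self.table = {}   # MAC -> port where that MAC was last seen as a source
        self.moves = []   # (mac, original_port, flapping_port)

    def learn(self, src_mac, port):
        """Record the ingress port for src_mac; note a move if the port changed."""
        old = self.table.get(src_mac)
        if old is not None and old != port:
            self.moves.append((src_mac, old, port))
        self.table[src_mac] = port

    def egress_port(self, dst_mac):
        """Port a unicast frame for dst_mac is forwarded to (None means flood)."""
        return self.table.get(dst_mac)


BOND_MAC = "8a:00:00:00:00:01"   # hypothetical address shared by both bond slaves

sw = LearningSwitch()
sw.learn(BOND_MAC, "wired-port")          # normal traffic from the active slave
before = sw.egress_port(BOND_MAC)

sw.learn(BOND_MAC, "ap-port")             # LLC frame with the bond MAC leaves the AP
during = sw.egress_port(BOND_MAC)         # router pings now go out via the AP

sw.learn(BOND_MAC, "wired-port")          # next frame the machine sends over Ethernet
after = sw.egress_port(BOND_MAC)

if __name__ == "__main__":
    print(before, during, after)
    for mac, old, new in sw.moves:
        print(f"MAC move detected. (Original-Port={old}, Flapping-Port={new})")
```

Between the second and third `learn` calls, unicast frames for the bond's MAC are forwarded to the AP and arrive on the backup WiFi slave, where an active-backup bond would normally not accept them. That window would account for the lost pings.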

Now for a short investigation into this behavior. On the wireless interface of my Linux machine, I can't find any LLC frames going out. So the Linux machine should be OK.

linux-machine ~ # tcpdump -i wlan0 -nne -Q out 
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on wlan0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
  
<no packets>

On the access point it looks like the frames arrive on the wireless interface:

accesspoint# tcpdump -i rai0 -Q in -nne ether host xx:xx:xx:xx:xx:64 # rai0 = wireless interface of the access point
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on rai0, link-type EN10MB (Ethernet), capture size 262144 bytes

10:08:52 xx:xx:xx:xx:xx:64 > ff:ff:ff:ff:ff:ff, 802.3, length 6: LLC, dsap Null (0x00) Individual, ssap Null (0x00) Response, ctrl 0xaf: Unnumbered, xid, Flags [Response], length 46: 01 02

In my opinion the frames are not sent by the wireless client. It seems that the access point somehow generates this type of LLC frame by itself, maybe in the network driver on certain wireless events? I couldn't find any useful information or documentation about these exact LLC frames, nor any settings on the access point related to LLC.

Wireshark expands to:

+ Frame 1: 60 bytes on wire (480 bits), 60 bytes captured (480 bits) on interface unknown, id 0
+ IEEE 802.3 Ethernet 
- Logical-Link Control
    DSAP: NULL LSAP (0x00)
    SSAP: NULL LSAP (0x01)
    Control field: U, func=XID (0xAF)
- Logical-Link Control Basic Format XID
    XID Format: LLC basic format (0x81)
    LLC Types/Classes: Type 1 LLC (Class I LLC) (0x01)
    Receive Window Size: 1 
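
To search a capture for these frames, the decode above can be turned into a small parser. This is a sketch only: the sample frame is rebuilt by hand from the tcpdump and Wireshark output (its source MAC is made up, and the padding is assumed to be zeros), and treating the upper seven bits of the last XID byte as the receive window is an assumption that happens to be consistent with the payload `01 02` being shown as a window size of 1.

```python
import struct

# 60-byte frame reconstructed from the decode above:
# broadcast destination, hypothetical source MAC, 802.3 length 6,
# LLC header 00 01 af, XID info 81 01 02, then padding (assumed zero).
SAMPLE_FRAME = bytes.fromhex(
    "ffffffffffff"      # destination: broadcast
    "020000000064"      # source: made-up MAC ending in :64
    "0006"              # 802.3 length field
    "0001af"            # DSAP, SSAP (response bit set), control = XID
    "810102"            # XID format, LLC types/classes, window byte
) + bytes(40)


def decode_llc_xid(frame: bytes) -> dict:
    """Decode an 802.3 frame carrying an LLC XID PDU in basic format."""
    dst, src, length = struct.unpack("!6s6sH", frame[:14])
    if length > 1500:
        raise ValueError("EtherType frame, not an 802.3 length-encoded frame")
    dsap, ssap, control = frame[14:17]
    xid_format, classes, window_byte = frame[17:20]
    return {
        "src": ":".join(f"{b:02x}" for b in src),
        "broadcast": dst == b"\xff" * 6,
        "dsap": dsap,
        "ssap_address": ssap & 0xFE,       # low bit is the command/response flag
        "response": bool(ssap & 0x01),
        "is_xid": (control & 0xEF) == 0xAF,  # ignore the poll/final bit (0x10)
        "xid_format": xid_format,
        "llc_classes": classes,
        "window": window_byte >> 1,        # assumed encoding, see lead-in
    }


if __name__ == "__main__":
    for key, value in decode_llc_xid(SAMPLE_FRAME).items():
        print(f"{key}: {value}")
```

Running the same function over frames captured on the AP's wired interface, and filtering on `is_xid` plus a broadcast destination, would be one way to count how often the access point emits them.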

So, a quick note on your original question:
You get incoming traffic on your wireless interface. Where incoming traffic is sent is decided by the "network", or in this case by the switch. Something needs to trigger a MAC flap event. In my case it is the access point, generating or relaying LLC frames. So this issue could well be related to your problem too.

Can you check whether you see such LLC frames on your access point too?

So far, I have found no solution other than using another access point.

If anyone can explain this particular use of LLC, it would be very much appreciated.

Last edited by westportjack (2024-06-27 20:43:38)

Offline
