I have a PC with multiple VLANs over two Ethernet cards (1 Gbit/s each), bonded with LACP and connected to two ports of a DGS-1210-28 smart switch (router-on-a-stick configuration). The PC runs ntop-ng, so I can monitor traffic. The setup works, but ALL traffic goes through a single port, and disconnecting that port breaks the connection, which shouldn't happen with LACP trunking. Unfortunately, the switch runs old firmware that doesn't show link-aggregation status, and I can't find how to check LACP status (port states, connected device info, etc.) in networkd. How do I diagnose such problems in Arch?
networkd configs:
enp-any.network
[Match]
Name=enp*
[Network]
Bond=Trunk0
Trunk0.netdev
[NetDev]
Name=Trunk0
Kind=bond
[Bond]
Mode=802.3ad
TransmitHashPolicy=layer3+4
MIIMonitorSec=1s
LACPTransmitRate=slow
Trunk0.network
[Match]
Name=Trunk0
[Network]
VLAN=***
VLAN=***
VLAN=***
LinkLocalAddressing=no
BindCarrier=enp2s0 enp7s0
networkctl shows the setup status "configuring" for the carrier interfaces, which is weird:
networkctl
IDX LINK   TYPE     OPERATIONAL SETUP
  1 lo     loopback carrier     unmanaged
  2 enp2s0 ether    carrier     configuring
  3 enp7s0 ether    carrier     configuring
  4 Trunk0 bond     carrier     configured
  5 ***    vlan     routable    configured
  6 ***    vlan     routable    configured
  7 ***    vlan     routable    configured
It appears that the bond state can be checked with
cat /proc/net/bonding/*
which shows the problem:
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer3+4 (1)
MII Status: up
MII Polling Interval (ms): 1000
Up Delay (ms): 0
Down Delay (ms): 0
802.3ad info
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: **:**:**:**:**:ad
Active Aggregator Info:
Aggregator ID: 2
Number of ports: 1
Actor Key: 9
Partner Key: 3
Partner Mac Address: **:**:**:**:**:6e
Slave Interface: enp7s0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: **:**:**:**:**:67
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: churned
Partner Churn State: churned
Actor Churned Count: 1
Partner Churned Count: 1
details actor lacp pdu:
system priority: 65535
system mac address: **:**:**:**:**:ad
port key: 0
port priority: 255
port number: 1
port state: 69
details partner lacp pdu:
system priority: 65535
system mac address: 00:00:00:00:00:00
oper key: 1
port priority: 255
port number: 1
port state: 1
Slave Interface: enp2s0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: **:**:**:**:**:ef
Slave queue ID: 0
Aggregator ID: 2
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
system priority: 65535
system mac address: **:**:**:**:**:ad
port key: 9
port priority: 255
port number: 2
port state: 61
details partner lacp pdu:
system priority: 32768
system mac address: **:**:**:**:**:6e
oper key: 3
port priority: 128
port number: 15
port state: 61
One of the slave NICs is in the "churned" state. This seems to be the result of that NIC having an "Aggregator ID" different from the bond's active aggregator, which breaks aggregation for this NIC. I don't understand why this is happening.
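The "port state" numbers in the dump are bit fields, so decoding them helps when reading it. Below is a minimal sketch, assuming the standard LACP state bit layout from IEEE 802.3ad / 802.1AX (bit 0 is activity, bit 7 is expired). The name decode_port_state is made up for illustration and is not part of any tool.

```python
# LACP port-state flags, listed from bit 0 upward.
# Assumption: /proc/net/bonding prints the raw state byte in the
# standard IEEE 802.3ad / 802.1AX layout.
LACP_STATE_BITS = (
    "activity",         # 0x01
    "timeout",          # 0x02
    "aggregation",      # 0x04
    "synchronization",  # 0x08
    "collecting",       # 0x10
    "distributing",     # 0x20
    "defaulted",        # 0x40
    "expired",          # 0x80
)


def decode_port_state(value):
    """Return the names of the flags set in a LACP port-state byte."""
    return [name for bit, name in enumerate(LACP_STATE_BITS) if value & (1 << bit)]


if __name__ == "__main__":
    # The three values that appear in the dump above.
    for state in (69, 61, 1):
        print(state, decode_port_state(state))
```

Read this way, state 61 on enp2s0 has synchronization, collecting and distributing set, while state 69 on enp7s0 has none of them and has "defaulted" set instead. As far as I understand the standard, "defaulted" means the port is running on default partner information rather than on data from a received LACPDU, which would match the all-zero partner MAC shown for that slave.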
Last edited by avi9526 (2019-01-27 02:28:33)
I noticed that this problem happens if I restart the machine with kexec (LTS kernel). If the machine gets a full restart with a complete power-off, everything is OK: all interfaces have aggregator ID 1. Setting the churned link down before kexec does not help.
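To check quickly after each kind of restart whether every slave ended up in the active aggregator, something like the following could be used. It is only a sketch: it assumes the /proc/net/bonding text layout shown earlier in this thread, where the first "Aggregator ID" line belongs to the active aggregator and each later one follows a "Slave Interface" line. The names parse_bond_status and stray_slaves are hypothetical.

```python
def parse_bond_status(text):
    """Return (active_aggregator_id, {slave_name: aggregator_id})."""
    active_id = None
    slaves = {}
    current = None  # name of the slave section being read, if any
    for raw in text.splitlines():
        key, sep, value = raw.strip().partition(":")
        if not sep:
            continue  # lines without "key: value", e.g. "802.3ad info"
        value = value.strip()
        if key == "Slave Interface":
            current = value
        elif key == "Aggregator ID":
            if current is None:
                # Seen before any slave section: the active aggregator.
                active_id = int(value)
            else:
                slaves[current] = int(value)
    return active_id, slaves


def stray_slaves(text):
    """Return the sorted names of slaves outside the active aggregator."""
    active_id, slaves = parse_bond_status(text)
    return sorted(name for name, agg in slaves.items() if agg != active_id)
```

For example, stray_slaves(open("/proc/net/bonding/Trunk0").read()) should return ["enp7s0"] for the broken state in the dump above, and an empty list after a full power cycle.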
Last edited by avi9526 (2019-03-02 22:00:55)