
#1 2022-01-25 23:32:58

vieri
Member
Registered: 2018-12-03
Posts: 14

Linux router: reply packets sent to wrong network interface

Hi,

I'm witnessing some very odd network behavior. I don't know if it's a misconfiguration on my part or a bug.

I use my Linux server as a router.
I have several NICs and vlans on this router.

For instance, I have an LACP-bonded interface named "lan" with 2 NICs.
I have hosts in vlan 1 (lan.1 interface) that can ping hosts in vlan 18 (lan.18 interface).

Here's what a tcpdump would show when successfully pinging a host in vlan 18 from a host in vlan 1:

9c:7b:ef:b7:7a:a1 > b8:59:9f:cc:bb:5c, ethertype 802.1Q (0x8100), length 78: vlan 1, p 0, ethertype IPv4 (0x0800), 10.215.111.210 > 10.215.144.129: ICMP echo request, id 1, seq 4050, length 40
b8:59:9f:cc:bb:5c > 94:40:c9:26:e2:d2, ethertype 802.1Q (0x8100), length 78: vlan 18, p 0, ethertype IPv4 (0x0800), 10.215.111.210 > 10.215.144.129: ICMP echo request, id 1, seq 4050, length 40
94:40:c9:26:e2:d2 > b8:59:9f:cc:bb:5c, ethertype 802.1Q (0x8100), length 78: vlan 18, p 0, ethertype IPv4 (0x0800), 10.215.144.129 > 10.215.111.210: ICMP echo reply, id 1, seq 4050, length 40
b8:59:9f:cc:bb:5c > 9c:7b:ef:b7:7a:a1, ethertype 802.1Q (0x8100), length 78: vlan 1, p 0, ethertype IPv4 (0x0800), 10.215.144.129 > 10.215.111.210: ICMP echo reply, id 1, seq 4050, length 40

All's fine.

The MAC addr. b8:59:9f:cc:bb:5c is the "lan" interface on the Linux router (where lan.1 and lan.18 are defined).

However, for some strange reason the same host in vlan 1 CANNOT successfully ping another host in vlan 18 (although it should, as there are no blocking iptables rules for these hosts).

Here's what a tcpdump shows me:

9c:7b:ef:b7:7a:a1 > b8:59:9f:cc:bb:5c, ethertype 802.1Q (0x8100), length 78: vlan 1, p 0, ethertype IPv4 (0x0800), 10.215.111.210 > 10.215.144.251: ICMP echo request, id 1, seq 3845, length 40
b8:59:9f:cc:bb:5c > 94:40:c9:26:dc:80, ethertype 802.1Q (0x8100), length 78: vlan 18, p 0, ethertype IPv4 (0x0800), 10.215.111.210 > 10.215.144.251: ICMP echo request, id 1, seq 3845, length 40
94:40:c9:26:dc:80 > ac:1f:6b:f5:b7:1b, ethertype 802.1Q (0x8100), length 78: vlan 18, p 0, ethertype IPv4 (0x0800), 10.215.144.251 > 10.215.111.210: ICMP echo reply, id 1, seq 3845, length 40

As you can see the difference here is that the reply is sent to MAC addr. ac:1f:6b:f5:b7:1b which is also on the Linux router (a different NIC), but that interface is DOWN:

# ip a s lanixa
5: lanixa: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether ac:1f:6b:f5:b7:1b brd ff:ff:ff:ff:ff:ff
    altname enp1s0f1

The ping obviously does not complete successfully in this case.
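
In a longer capture, frames like this can be spotted by comparing each destination MAC against the set of MACs that are expected on the segment. The following is only a minimal Python sketch, assuming `tcpdump -e` output in the format shown above; the function name `unexpected_destinations` and the `known` set are made up for illustration, and the capture lines are trimmed after the frame length.

```python
import re

# Matches the link-level header printed by `tcpdump -e`: "src > dst, ethertype ..."
FRAME_RE = re.compile(
    r"^(?P<src>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}) > "
    r"(?P<dst>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}),",
    re.IGNORECASE,
)

def unexpected_destinations(lines, known):
    """Return (src, dst) pairs whose destination MAC is not in `known`."""
    known = {mac.lower() for mac in known}
    hits = []
    for line in lines:
        match = FRAME_RE.match(line.strip())
        if match and match.group("dst").lower() not in known:
            hits.append((match.group("src").lower(), match.group("dst").lower()))
    return hits

# The three frames of the failing ping, trimmed after the frame length.
capture = [
    "9c:7b:ef:b7:7a:a1 > b8:59:9f:cc:bb:5c, ethertype 802.1Q (0x8100), length 78:",
    "b8:59:9f:cc:bb:5c > 94:40:c9:26:dc:80, ethertype 802.1Q (0x8100), length 78:",
    "94:40:c9:26:dc:80 > ac:1f:6b:f5:b7:1b, ethertype 802.1Q (0x8100), length 78:",
]

# MACs that are expected to receive traffic: client, router "lan", pinged host.
known = {"9c:7b:ef:b7:7a:a1", "b8:59:9f:cc:bb:5c", "94:40:c9:26:dc:80"}

print(unexpected_destinations(capture, known))
```

With the sample above, only the echo reply addressed to ac:1f:6b:f5:b7:1b is reported.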

Why in the world would a packet be sent to an interface which is down?
Is it an ARP glitch?
What can I try to further debug this?

Regards,

Vieri

Last edited by vieri (2022-01-27 00:50:58)


#2 2022-01-26 07:00:01

-thc
Member
Registered: 2017-03-15
Posts: 775

Re: Linux router: reply packets sent to wrong network interface

The purpose of tagged VLANs (IEEE 802.1Q) is network separation on a logical (packet) level.

No host in VLAN A should be able to communicate with hosts in VLAN B.

What is the purpose of your VLAN setup? What are you trying to achieve?


#3 2022-01-26 09:52:03

vieri
Member
Registered: 2018-12-03
Posts: 14

Re: Linux router: reply packets sent to wrong network interface

-thc wrote:

The purpose of tagged VLANs (IEEE 802.1Q) is network separation on a logical (packet) level.

No host in VLAN A should be able to communicate with hosts in VLAN B.

What is the purpose of your VLAN setup? What are you trying to achieve?

Of course, the networks are separated. However, as I said in the first post, traffic between the specified hosts is allowed (just between those hosts). I believe the problem in "scenario 2" goes beyond vlans and iptables rules. As I noted, the replies are going to the WRONG MAC address. They are going to the MAC addr. of an interface that is down. This sounds more like an ARP issue (or bug).

This is the "offending line" in the example I'm giving in the first post:

94:40:c9:26:dc:80 > ac:1f:6b:f5:b7:1b, ethertype 802.1Q (0x8100), length 78: vlan 18, p 0, ethertype IPv4 (0x0800), 10.215.144.251 > 10.215.111.210: ICMP echo reply, id 1, seq 3845, length 40

where ac:1f:6b:f5:b7:1b is the MAC addr. of a NIC on the Linux router that is DOWN.

Why should the Linux router decide to use that MAC addr.?

I was expecting this instead:

94:40:c9:26:dc:80 > b8:59:9f:cc:bb:5c, ethertype 802.1Q (0x8100), length 78: vlan 18, p 0, ethertype IPv4 (0x0800), 10.215.144.251 > 10.215.111.210: ICMP echo reply, id 1, seq 3845, length 40

which makes sense because b8:59:9f:cc:bb:5c is the MAC addr. of the right NIC.

However, I am not seeing this. I don't understand why the OS or the ARP protocol is "sending" the packets to the MAC addr. of a NIC that is down.
Even if it were up, there would be no reason to send it there either, because neither vlan 1 nor vlan 18 is on it.

Last edited by vieri (2022-01-27 00:51:53)


#4 2022-01-26 12:49:24

-thc
Member
Registered: 2017-03-15
Posts: 775

Re: Linux router: reply packets sent to wrong network interface

Again: Tagged VLANs are used to separate hosts into different, completely separate networks.
Hosts on different VLANs should be unable to communicate by design.

From your first posting:

I have hosts in vlan 1 (lan.1 interface) that can ping hosts in vlan 18 (lan.18 interface)

This should not be possible.


#5 2022-01-26 12:59:39

vieri
Member
Registered: 2018-12-03
Posts: 14

Re: Linux router: reply packets sent to wrong network interface

-thc wrote:

Hosts on different VLANs should be unable to communicate by design.
This should not be possible.

Why?

Seriously, even in a basic scenario it makes perfect sense to allow specific traffic between vlans. Take, for example: vlan "Human Resources" cannot communicate with vlan "IT". Neither of those can communicate with vlan "servers", but you want to allow vlan "Human Resources" to access one or more hosts in vlan "servers" only on port 443. You might also want to grant more access for vlan "IT". What's "wrong" with that?

In any case, my question isn't about design either. It's about networking behavior and probably ARP. My question, simply stated:
why is the OS sending a packet to the MAC addr. of an interface which is DOWN, has no IP settings, etc.?

I don't have in-depth knowledge of ARP (if what I'm seeing is actually due to that protocol).


#6 2022-01-26 14:09:46

seth
Member
Registered: 2012-09-03
Posts: 60,922

Re: Linux router: reply packets sent to wrong network interface

Please edit your posts and use code tags for readability, thanks.
https://bbs.archlinux.org/help.php#bbcode

On topic:
If you want to bridge different segments you'll need, guess what, a bridge.
For VLANs that would be a forward rule in iptables - do you forward the traffic there? How exactly?
Afaiu your OP you cannot actually ping across VLANs but you can ping the (common) gateway.

The packets are sent according to your routing table - in particular that of the host w/ the MAC 94:40:c9:26:dc:80
Also the ARP cache of 94:40:c9:26:dc:80 might have some stale entries - or you might simply re-use IPs in the VLANs.
Check "ip neigh" on 94:40:c9:26:dc:80 - it'll likely assign 10.215.111.210 to ac:1f:6b:f5:b7:1b

sudo ip neigh flush all

will flush the ARP cache on that host.
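
For a longer neighbour table it can help to group the entries by MAC instead of reading them by eye. What follows is a rough Python sketch, assuming the usual `ADDRESS dev IFACE lladdr MAC STATE` line format of `ip neigh`. The sample text is hypothetical (it is not output from the host in question, and `eth0` is a placeholder device name), as is the function name `neighbours_by_mac`.

```python
from collections import defaultdict

def neighbours_by_mac(ip_neigh_output):
    """Map each link-layer address to the (ip, device) pairs that resolve to it."""
    table = defaultdict(list)
    for line in ip_neigh_output.splitlines():
        fields = line.split()
        # Entries without a resolved MAC (e.g. FAILED, INCOMPLETE) have no "lladdr".
        if "lladdr" not in fields or "dev" not in fields:
            continue
        mac = fields[fields.index("lladdr") + 1].lower()
        dev = fields[fields.index("dev") + 1]
        table[mac].append((fields[0], dev))
    return dict(table)

# HYPOTHETICAL sample: a stale entry that maps the gateway IP to the MAC of the
# NIC that is down, which is the kind of entry one would be looking for.
sample = """\
10.215.144.91 dev eth0 lladdr ac:1f:6b:f5:b7:1b STALE
10.215.144.129 dev eth0 lladdr 94:40:c9:26:e2:d2 REACHABLE
10.215.144.7 dev eth0 FAILED
"""

for mac, entries in neighbours_by_mac(sample).items():
    print(mac, entries)
```

Any IP listed under ac:1f:6b:f5:b7:1b would then be the entry worth flushing or investigating.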

"Why" any of this happens can't be told from the mere fact of "that"; we'll need the exact and complete setup of your VLANs, the lease strategies and ARP cache on each host, as well as the involved iptables rules.

And yes, this is about "design": You're trying to do something™ but you're clearly not doing the something™ that you want to do, and without knowing what something™ is, we can't tell you what somewhatelsebetter™ would be.
If you unconditionally want to forward traffic among VLANs that would raise the question why you're using them in the first place.


#7 2022-01-27 00:48:49

vieri
Member
Registered: 2018-12-03
Posts: 14

Re: Linux router: reply packets sent to wrong network interface

seth wrote:

If you want to bridge different segments you'll need, guess what, a bridge.

No, I'm routing.


seth wrote:

For VLANs that would be a forward rule in iptables - do you forward the traffic there? How exactly?

iptables rules are not the issue here (see below).


seth wrote:

Afaiu your OP you cannot actually ping across VLANs but you can ping the (common) gateway.

No, I can ping just fine. In my OP I show 2 ping examples. The first one is a successful round-trip ICMP request/reply. Works fine. The second one is where the ICMP reply packets are sent to a wrong MAC addr so the host who made the request is not actually receiving the reply (but as you can see from the tcpdump, the Linux router is seeing it). The routing rules and accesses are fine because they are the same as with the first "successful" example.

Please note that the "common gateway" here is my Linux router with NICs with MAC addr. b8:59:9f:cc:bb:5c and ac:1f:6b:f5:b7:1b.

On the Linux router I see this:

# ip neigh | grep 10.215.144.251
10.215.144.251 dev lan.18 lladdr 94:40:c9:26:dc:80 REACHABLE

# ip neigh | grep 94:40:c9:26:dc:80
10.215.144.251 dev lan.18 lladdr 94:40:c9:26:dc:80 REACHABLE

# ip neigh | grep ac:1f:6b:f5:b7:1b

# ip neigh | grep 10.215.111.210
10.215.111.210 dev lan.1 lladdr 9c:7b:ef:b7:7a:a1 REACHABLE

seth wrote:

If you unconditionally want to forward traffic among VLANs that would raise the question why you're using them in the first place.

I did not state that I wanted to unconditionally forward traffic among VLANs. The hosts in my OP - only those hosts - are "allowed" to ping.


#8 2022-01-27 05:56:00

-thc
Member
Registered: 2017-03-15
Posts: 775

Re: Linux router: reply packets sent to wrong network interface

vieri wrote:

Seriously, even in a basic scenario it makes perfect sense to allow specific traffic between vlans. Take, for example: vlan "Human Resources" cannot communicate with vlan "IT". Neither of those can communicate with vlan "servers", but you want to allow vlan "Human Resources" to access one or more hosts in vlan "servers" only on port 443. You might also want to grant more access for vlan "IT". What's "wrong" with that?

You have a fundamental misunderstanding of what VLANs are designed to do.


#9 2022-01-27 08:18:25

seth
Member
Registered: 2012-09-03
Posts: 60,922

Re: Linux router: reply packets sent to wrong network interface

On the Linux router I see this

You're interested in the ARP cache on the host that sends the reply in the wrong direction - and I suggest not grepping the output.

Edit, you might find this helpful: https://security.stackexchange.com/ques … p-spoofing

Last edited by seth (2022-01-27 21:02:20)


#10 2022-01-27 22:40:45

Tarqi
Member
From: Ixtlan
Registered: 2012-11-27
Posts: 179

Re: Linux router: reply packets sent to wrong network interface

-thc wrote:

You have a fundamental misunderstanding of what VLANs are designed to do.

You might rethink that, maybe with the help of some reading.

Last edited by Tarqi (2022-01-27 22:58:51)


Knowing others is wisdom, knowing yourself is enlightenment. ~Lao Tse


#11 2022-01-28 07:51:59

-thc
Member
Registered: 2017-03-15
Posts: 775

Re: Linux router: reply packets sent to wrong network interface

After practically being called a dimwit I looked for ways of connecting VLANs - something alien and self-contradictory to me. In the past I only had to realize single virtual devices that needed to be accessible in multiple VLANs.

So vieri's Arch box is a router between VLANs with separate physical interfaces for each VLAN.

My mistake.


#12 2022-01-28 22:13:01

Koatao
Member
Registered: 2018-08-30
Posts: 98

Re: Linux router: reply packets sent to wrong network interface

-thc wrote:

So vieri's Arch box is a router between VLANs with separate physical interfaces for each VLAN.

No, it isn't. But the exact network configuration is unclear.

vieri wrote:

For instance, I have an LACP-bonded interface named "lan" with 2 NICs.
I have hosts in vlan 1 (lan.1 interface) that can ping hosts in vlan 18 (lan.18 interface).

What is "for instance" supposed to mean? It would be best to simply post the output of

ip a

and give a brief description of the network topology (what are those bonded physical interfaces connected to?).
The ARP cache poisoning of the host could easily come from a misconfiguration (capturing ARP traffic might help too).


#13 2022-02-01 11:19:49

vieri
Member
Registered: 2018-12-03
Posts: 14

Re: Linux router: reply packets sent to wrong network interface

I do not have full access to the hosts in vlan 18, so I can only do so much. I do have one new piece of information, though. Below are the tcpdump traces while pinging two different hosts (say, hosts A and B).

Host A:

[new info] It is a vmware ESXi management interface with IP addr. 10.215.144.129/23, default gateway 10.215.144.91.
The ping trace is OK when pinging from a host in vlan 1 (10.215.111.210), and I can see that the MAC addr. of the Linux router is as expected (b8:59:9f:cc:bb:5c is on lan LACP aggregate):

# tcpdump -vv -n -e -i lan.1 host 10.215.144.129 and icmp
9c:7b:ef:b7:7a:a1 > b8:59:9f:cc:bb:5c, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 128, id 48597, offset 0, flags [none], proto ICMP (1), length 60)
    10.215.111.210 > 10.215.144.129: ICMP echo request, id 2, seq 3794, length 40
b8:59:9f:cc:bb:5c > 9c:7b:ef:b7:7a:a1, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 63, id 39108, offset 0, flags [none], proto ICMP (1), length 60)
    10.215.144.129 > 10.215.111.210: ICMP echo reply, id 2, seq 3794, length 40

# tcpdump -vv -n -e -i lan.18 host 10.215.144.129 and icmp
b8:59:9f:cc:bb:5c > 94:40:c9:26:e2:d2, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 127, id 48713, offset 0, flags [none], proto ICMP (1), length 60)
    10.215.111.210 > 10.215.144.129: ICMP echo request, id 2, seq 3944, length 40
94:40:c9:26:e2:d2 > b8:59:9f:cc:bb:5c, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 57790, offset 0, flags [none], proto ICMP (1), length 60)
    10.215.144.129 > 10.215.111.210: ICMP echo reply, id 2, seq 3944, length 40

All's well for host A.

Host B:

[new info] It is a vmware ESXi iLO interface with IP addr. 10.215.144.251/23, default gateway 10.215.144.91.
The requests are going out fine:

# tcpdump -vv -n -e -i lan.1 host 10.215.144.251 and icmp
9c:7b:ef:b7:7a:a1 > b8:59:9f:cc:bb:5c, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 128, id 38533, offset 0, flags [none], proto ICMP (1), length 60)
    10.215.111.210 > 10.215.144.251: ICMP echo request, id 2, seq 3906, length 40
9c:7b:ef:b7:7a:a1 > b8:59:9f:cc:bb:5c, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 128, id 38534, offset 0, flags [none], proto ICMP (1), length 60)
    10.215.111.210 > 10.215.144.251: ICMP echo request, id 2, seq 3907, length 40

However, I can see that the replies are headed for MAC addr. ac:1f:6b:f5:b7:1b which is not OK.

# tcpdump -vv -n -e -i lan.18 host 10.215.144.251 and icmp
b8:59:9f:cc:bb:5c > 94:40:c9:26:dc:80, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 127, id 38538, offset 0, flags [none], proto ICMP (1), length 60)
    10.215.111.210 > 10.215.144.251: ICMP echo request, id 2, seq 3911, length 40
94:40:c9:26:dc:80 > ac:1f:6b:f5:b7:1b, ethertype IPv4 (0x0800), length 74: (tos 0xe0, ttl 255, id 16211, offset 0, flags [none], proto ICMP (1), length 60)
    10.215.144.251 > 10.215.111.210: ICMP echo reply, id 2, seq 3911, length 40
b8:59:9f:cc:bb:5c > 94:40:c9:26:dc:80, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 127, id 38539, offset 0, flags [none], proto ICMP (1), length 60)
    10.215.111.210 > 10.215.144.251: ICMP echo request, id 2, seq 3912, length 40
94:40:c9:26:dc:80 > ac:1f:6b:f5:b7:1b, ethertype IPv4 (0x0800), length 74: (tos 0xe0, ttl 255, id 16213, offset 0, flags [none], proto ICMP (1), length 60)
    10.215.144.251 > 10.215.111.210: ICMP echo reply, id 2, seq 3912, length 40

So, ping to host B fails.
   
I managed to get access to one of the three iLO interfaces I have in vlan 18 (3 different servers). It's not easy because the VMs have to be moved to another host in the cluster, then I have to reboot into BIOS/UEFI, check/edit network settings, reboot...
I decided to change the netmask, so I left host C with the settings listed below.

Host C:

[new info]
10.215.144.253/22
default gw 10.215.144.91

So, netmask CIDR went from 23 to 22.

Now, ping to host C works fine according to the following trace.

# tcpdump -vv -n -e -i lan.1 host 10.215.144.253 and icmp
9c:7b:ef:b7:7a:a1 > b8:59:9f:cc:bb:5c, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 128, id 55592, offset 0, flags [none], proto ICMP (1), length 60)
    10.215.111.210 > 10.215.144.253: ICMP echo request, id 2, seq 5066, length 40
b8:59:9f:cc:bb:5c > 9c:7b:ef:b7:7a:a1, ethertype IPv4 (0x0800), length 74: (tos 0xe0, ttl 254, id 45286, offset 0, flags [none], proto ICMP (1), length 60)
    10.215.144.253 > 10.215.111.210: ICMP echo reply, id 2, seq 5066, length 40

# tcpdump -vv -n -e -i lan.18 host 10.215.144.253 and icmp
b8:59:9f:cc:bb:5c > 94:40:c9:26:fa:aa, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 127, id 55599, offset 0, flags [none], proto ICMP (1), length 60)
    10.215.111.210 > 10.215.144.253: ICMP echo request, id 2, seq 5073, length 40
94:40:c9:26:fa:aa > b8:59:9f:cc:bb:5c, ethertype IPv4 (0x0800), length 74: (tos 0xe0, ttl 255, id 45294, offset 0, flags [none], proto ICMP (1), length 60)
    10.215.144.253 > 10.215.111.210: ICMP echo reply, id 2, seq 5073, length 40

So I'm a bit puzzled as to why the netmask change made the difference.
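
One way to reason about the netmask is to check which addresses each host would treat as on-link under its configured prefix. This is a small Python sketch using the standard `ipaddress` module; it assumes the hosts make an ordinary subnet-based decision (on-link means ARP for the destination, off-link means use the default gateway), which nothing in this thread confirms for the iLO firmware. The function name `is_on_link` is made up for illustration.

```python
import ipaddress

def is_on_link(host_cidr, destination):
    """True if `destination` falls inside the subnet configured on the host."""
    network = ipaddress.ip_interface(host_cidr).network
    return ipaddress.ip_address(destination) in network

# Host B keeps its /23, host C was changed to /22.
hosts = ("10.215.144.251/23", "10.215.144.253/22")
# Ping source in vlan 1, router address on lan.18, configured default gateway.
destinations = ("10.215.111.210", "192.168.240.1", "10.215.144.91")

for host_cidr in hosts:
    for destination in destinations:
        print(host_cidr, destination, is_on_link(host_cidr, destination))
```

Under that assumption both prefix lengths give the same answer for the addresses involved: the ping source and 192.168.240.1 are off-link, and the default gateway 10.215.144.91 is on-link. So the prefix length by itself would not be expected to change which next hop the reply uses, which points more towards cached state on the host than towards the netmask.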

Below you will find the network settings of the interfaces involved in the Linux router.

	
# ip a s lanixa
5: lanixa: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether ac:1f:6b:f5:b7:1b brd ff:ff:ff:ff:ff:ff
    altname enp1s0f1
	
# ip a s lan
15: lan: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether b8:59:9f:cc:bb:5c brd ff:ff:ff:ff:ff:ff
    inet 192.168.215.65/28 brd 192.168.215.79 scope global lan
       valid_lft forever preferred_lft forever

# ip a s lan.18
24: lan.18@lan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether b8:59:9f:cc:bb:5c brd ff:ff:ff:ff:ff:ff
    inet 192.168.240.1/24 brd 192.168.240.255 scope global lan.18
       valid_lft forever preferred_lft forever

# ip a s lan.1
16: lan.1@lan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc prio state UP group default qlen 1000
    link/ether b8:59:9f:cc:bb:5c brd ff:ff:ff:ff:ff:ff
    inet 10.215.144.91/22 brd 10.215.147.255 scope global lan.1
       valid_lft forever preferred_lft forever
    inet 10.215.246.91/23 brd 10.215.247.255 scope global lan.1
       valid_lft forever preferred_lft forever
    inet 10.215.248.91/24 brd 10.215.248.255 scope global lan.1
       valid_lft forever preferred_lft forever
    inet 10.215.111.254/22 brd 10.215.111.255 scope global lan.1
       valid_lft forever preferred_lft forever
    inet 192.168.144.91/24 brd 192.168.144.255 scope global lan.1
       valid_lft forever preferred_lft forever
    inet 10.215.145.241/23 brd 10.215.145.255 scope global lan.1
       valid_lft forever preferred_lft forever
    inet 192.168.246.1/23 brd 192.168.247.255 scope global lan.1
       valid_lft forever preferred_lft forever
    inet 192.168.136.91/22 brd 192.168.139.255 scope global lan.1
       valid_lft forever preferred_lft forever
    inet 10.215.144.6/22 brd 10.215.147.255 scope global secondary lan.1
       valid_lft forever preferred_lft forever
    inet 10.215.145.242/23 brd 10.215.145.255 scope global secondary lan.1
       valid_lft forever preferred_lft forever
    inet 10.215.145.81/23 brd 10.215.145.255 scope global secondary lan.1
       valid_lft forever preferred_lft forever

Please note that 10.215.144.91/22 will be changed to 10.215.144.91/23 as soon as I migrate some hosts to the right subnets. The /22 is there for historical reasons.
Also note that the vlans are not cleanly segmented into their own subnets. That is also for historical reasons, and it cannot be fixed immediately. So, while the vlans are being rearranged the Linux router is in charge of setting up routing rules or entries in the route tables (and acting as proxy for ARP) such as in the following listing (as far as the hosts involved in this example are concerned).

# ip route show table all | grep 10.215.144.129
10.215.144.129 dev lan.18 table HMAN scope link

# ip route show table all | grep 10.215.144.251
10.215.144.251 dev lan.18 table HMAN scope link

# ip route show table all | grep 10.215.144.253
10.215.144.253 dev lan.18 table HMAN scope link

# ip rule show
0:      from all lookup local
1:      from all fwmark 0x200/0x200 lookup Tproxy
220:    from all lookup 220
990:    from all lookup HMAN
999:    from all lookup main
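
The effect of the rule order can be modelled roughly as follows: rules are consulted in ascending priority, and the first table that contains a matching route decides the outgoing device. This is a simplified Python sketch, not how the kernel is implemented. The `main` entry is an assumption (the connected route normally created for 10.215.144.91/22 on lan.1), the fwmark rule and the tables `local`, `Tproxy` and `220` are left out, and the names `TABLES`, `RULES` and `lookup` are invented for illustration.

```python
import ipaddress

# Simplified copies of two tables. The HMAN host routes come from the listing
# above; the `main` entry is an ASSUMPTION (connected route for lan.1).
TABLES = {
    "HMAN": [("10.215.144.251/32", "lan.18"), ("10.215.144.129/32", "lan.18")],
    "main": [("10.215.144.0/22", "lan.1")],
}

# (priority, table) pairs as in `ip rule show`; unmodelled rules are omitted.
RULES = [(990, "HMAN"), (999, "main")]

def lookup(destination):
    """Return (table, device) from the first rule whose table has a matching route."""
    addr = ipaddress.ip_address(destination)
    for _priority, table in sorted(RULES):
        matches = [
            (ipaddress.ip_network(prefix), dev)
            for prefix, dev in TABLES[table]
            if addr in ipaddress.ip_network(prefix)
        ]
        if matches:
            # Within one table the longest prefix wins.
            _net, dev = max(matches, key=lambda m: m[0].prefixlen)
            return table, dev
    return None

print(lookup("10.215.144.251"))  # host with a dedicated route in HMAN
print(lookup("10.215.144.10"))   # hypothetical address without one
```

In this model 10.215.144.251 is sent out of lan.18 because HMAN (priority 990) is consulted before main (priority 999), while an address without a host route falls through to the lan.1 subnet route.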

As far as the ARP messages are concerned I cannot clear the ARP cache in host B as I do not have access to it. I can clear it in the host making the ICMP request.
I can see this:

b8:59:9f:cc:bb:5c > 94:40:c9:26:dc:80, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 10.215.144.251 tell 192.168.240.1, length 28
94:40:c9:26:dc:80 > b8:59:9f:cc:bb:5c, ethertype ARP (0x0806), length 60: Ethernet (len 6), IPv4 (len 4), Reply 10.215.144.251 is-at 94:40:c9:26:dc:80, length 46

which is OK.
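
For reference, the lengths in that trace can be reproduced by assembling the request by hand: 14 bytes of Ethernet header plus 28 bytes of ARP payload give the 42 bytes tcpdump reports. The sketch below only builds the bytes in Python and does not send anything; the function names are made up, and the all-zero target MAC is an assumption based on tcpdump not printing one for this request.

```python
import socket
import struct

def mac_bytes(mac):
    """Convert aa:bb:cc:dd:ee:ff notation to 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))

def arp_request_frame(dst_mac, src_mac, sender_ip, target_ip):
    """Build an untagged Ethernet frame carrying an IPv4 ARP request."""
    ethernet = mac_bytes(dst_mac) + mac_bytes(src_mac) + struct.pack("!H", 0x0806)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # hardware address length
        4,                            # protocol address length
        1,                            # operation: request
        mac_bytes(src_mac),           # sender hardware address
        socket.inet_aton(sender_ip),  # sender protocol address
        bytes(6),                     # target hardware address (assumed zero)
        socket.inet_aton(target_ip),  # target protocol address
    )
    return ethernet + arp

frame = arp_request_frame(
    "94:40:c9:26:dc:80", "b8:59:9f:cc:bb:5c", "192.168.240.1", "10.215.144.251"
)
print(len(frame), len(frame) - 14)
```

The reply shows up as 60 and 46 bytes instead, presumably because the sender padded it to the minimum Ethernet frame size.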

Now, I realize the router isn't in the best of conditions as far as vlans are concerned, but the problem described is actually quite interesting. I don't see why pings to host B are sent to the MAC addr. of a NIC that is down on the router *just* because the netmask is different.
I'm also supposing that host B would start working if I were to leave it its /23 netmask but change the default gateway to 10.215.145.241.
It would probably also start working as soon as I changed 10.215.144.91/22 to 10.215.144.91/23 on the router.
But why exactly?

Could host B have an ARP cache entry for 10.215.111.210 or for 10.215.144.91 pointing to the wrong MAC addr.?

Last edited by vieri (2022-02-01 11:20:33)


#14 2022-02-01 13:12:57

-thc
Member
Registered: 2017-03-15
Posts: 775

Re: Linux router: reply packets sent to wrong network interface

Koatao wrote:

No, it isn't. But the exact network configuration is unclear.

So Koatao was right. This is a "one-armed" router with two VLANs on one interface.

vieri wrote:

(b8:59:9f:cc:bb:5c is on lan LACP aggregate)

Can you please explain which physical router interfaces are aggregated via LACP? And how?


#15 2022-02-01 14:13:11

seth
Member
Registered: 2012-09-03
Posts: 60,922

Re: Linux router: reply packets sent to wrong network interface


#16 2022-02-02 06:46:11

vieri
Member
Registered: 2018-12-03
Posts: 14

Re: Linux router: reply packets sent to wrong network interface

I tried this:

# arping -c 1 -s 10.215.144.91 10.215.144.252
ARPING 10.215.144.252 from 10.215.144.91 lan.18
Unicast reply from 10.215.144.252 [94:40:C9:26:E2:CE]  0.691ms
Sent 1 probes (1 broadcast(s))
Received 1 response(s)

and at the same time I checked this:

# tcpdump -n -vv -e -i lan.18 arp
b8:59:9f:cc:bb:5c > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 10.215.144.252 (ff:ff:ff:ff:ff:ff) tell 10.215.144.91, length 28
94:40:c9:26:e2:ce > b8:59:9f:cc:bb:5c, ethertype ARP (0x0806), length 60: Ethernet (len 6), IPv4 (len 4), Reply 10.215.144.252 is-at 94:40:c9:26:e2:ce, length 46

However, ICMP replies are still sent to the wrong MAC address (ac:1f:6b:f5:b7:1b) even when pinging from the Linux router through interface lan.18.

b8:59:9f:cc:bb:5c > 94:40:c9:26:e2:ce, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 9186, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.240.1 > 10.215.144.252: ICMP echo request, id 50791, seq 247, length 64
94:40:c9:26:e2:ce > ac:1f:6b:f5:b7:1b, ethertype IPv4 (0x0800), length 98: (tos 0xe0, ttl 255, id 44376, offset 0, flags [none], proto ICMP (1), length 84)
    10.215.144.252 > 192.168.240.1: ICMP echo reply, id 50791, seq 247, length 64

Beats me.


#17 2022-02-02 08:55:01

seth
Member
Registered: 2012-09-03
Posts: 60,922

Re: Linux router: reply packets sent to wrong network interface

How out of control is that host?
Can you solve the access problem w/ a $5 wrench?
https://xkcd.com/538/


#18 2022-02-02 09:14:41

vieri
Member
Registered: 2018-12-03
Posts: 14

Re: Linux router: reply packets sent to wrong network interface

I now have ssh access to host C from my previous post. The problem is that even if I did manage to reach the failing iLO hosts via ssh, I wouldn't know how to clear the ARP cache or do more testing there, as the shell is limited to this:

# ssh hp@10.215.144.253
hp@10.215.144.253's password:
User:hp logged-in to ILOCZ200201FM.(10.215.144.253 / FE80::9640:C9FF:FE26:FAAA)
iLO Advanced 2.14 at  Feb 11 2020
Server Name: vmwareilo03.domain.org
Server Power: On

</>hpiLO-> help

status=0
status_tag=COMMAND COMPLETED
Wed Feb  2 09:06:09 2022



DMTF SMASH CLP Commands:

help    : Used to get context sensitive help.
show    : Used to display values of a property or contents of a collection target.
show  -a  : Recursively show all targets within the current target.
show  -l <level>  : Recursively show targets within the current target based on 'level' specified.
         Valid values for 'level' is from 1 to 9.
create  : Used to create new instances in the name space of the MAP.
 Example: create /map1/accounts1 username=<lname1> password=<pwd12345> name=
 <dname1> group=<admin,config,oemHPE_vm,oemHPE_rc,oemHPE_power>

delete  : Used to destroy instances in the name space of the MAP.
 Example: delete /map1/accounts1/<lname1>

load    : Used to move a binary image from an URL to the MAP.
 Example: load /map1/firmware1 -source http://192.168.1.1/images/fw/iLO5_110.bin

reset   : Causes a target to cycle from enabled to disabled and back to enabled.

set     : Used to set a property or set of properties to a specific value.
set [<options>] [<target>] <propertyname>=<value>
start   : Used to cause a target to change state to a higher run level.
stop    : Used to cause a target to change state to a lower run level.
cd      : Used to set the current default target.
   Example: cd targetname
date    : Used to get the current date.
time    : Used to get the current time.
exit    : Used to terminate the CLP session.
version : Used to query the version of the CLP implementation or other CLP elements.

oemHPE_ping    : Used to determine if an IP address is reachable.
Example : oemHPE_ping 192.168.1.1

oemHPE_loadSSHKey    : Used to authorize a SSH Key File from an URL.
Example : oemHPE_loadSSHKey -source http://user:pwd@192.168.1.1/images/SSHkey1.pub
oemHPE_deleteSSHKey    : Used to remove a SSH Key associated with a user
Example : oemHPE_deleteSSHKey

HPE CLI Commands:

POWER    : Control server power.
UID      : Control Unit-ID light.
ONETIMEBOOT: Access One-Time Boot setting.
NMI      : Generate an NMI.
VM       : Virtual media commands.
LANGUAGE : Command to set or get default language
VSP      : Invoke virtual serial port.
TEXTCONS : Invoke Remote Text Console.
TESTTRAP : Sends a test SNMP trap to the configured alert destinations.



</>hpiLO->

I fiddled with a few commands and searched the web, but I can't seem to find anything useful for the problem I'm seeing except the ping command.

I'm "glad" to see that the problem is with this type of host -- not with the Linux router (I think).
It's still very puzzling though.


#19 2022-02-02 12:55:30

seth
Member
Registered: 2012-09-03
Posts: 60,922

Re: Linux router: reply packets sent to wrong network interface

Can you reboot it this way?

Looking at #16, you're resetting the ARP from and for 10.215.144.91 but the monitored traffic is for 192.168.240.1 ?


#20 2022-02-02 13:12:15

vieri
Member
Registered: 2018-12-03
Posts: 14

Re: Linux router: reply packets sent to wrong network interface

The IP addr. 192.168.240.1 is on lan.18, and the trace I showed in my previous post is when pinging 10.215.144.252 in vlan 18 from the Linux router (not from vlan 1 host 10.215.111.210).
Anyway, doing this does not alter behavior:

# arping -c 1 -s 192.168.240.1 10.215.144.252
ARPING 10.215.144.252 from 192.168.240.1 lan.18
Unicast reply from 10.215.144.252 [94:40:C9:26:E2:CE]  0.709ms
Sent 1 probes (1 broadcast(s))
Received 1 response(s)

Note that if I ping host C from the Linux router (the host that properly replies to ICMP because I changed its netmask) I get a similar trace but to the right MAC addr.:

b8:59:9f:cc:bb:5c > 94:40:c9:26:fa:aa, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 62056, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.240.1 > 10.215.144.253: ICMP echo request, id 21114, seq 57, length 64
94:40:c9:26:fa:aa > b8:59:9f:cc:bb:5c, ethertype IPv4 (0x0800), length 98: (tos 0xe0, ttl 255, id 54524, offset 0, flags [none], proto ICMP (1), length 84)
    10.215.144.253 > 192.168.240.1: ICMP echo reply, id 21114, seq 57, length 64

I guess I can try to reboot it if all the VMs have been moved to another node in the cluster.
That should sanitize the ARP cache.


#21 2022-02-02 17:30:07

vieri
Member
Registered: 2018-12-03
Posts: 14

Re: Linux router: reply packets sent to wrong network interface

It seems that I can reboot the ESXi node via ssh on the iLO interface with the command:

reset map1

I did that on the previously "failing" hosts, and they are all working as expected now.

So, HPE iLO firmware is to blame, I guess.

Thanks to all
