#1 2021-11-21 23:48:48

Brocellous
Member
Registered: 2017-11-27
Posts: 61

No route to host, but there is

I've been getting some unusual errors lately where all clients fail to connect to any remote host out of the blue, claiming the host is unreachable.

e.g. when 10.1.0.254 is my gateway

$ ping -n 10.1.0.254
PING 10.1.0.254 (10.1.0.254) 56(84) bytes of data.
From 10.1.0.172 icmp_seq=1 Destination Host Unreachable
From 10.1.0.172 icmp_seq=2 Destination Host Unreachable
[..]

The same happens for any client and any host. The odd part is that my wireless connection appears fine, ip addr shows no change, and even ip route reports the routes correctly!

$ ip route get 10.1.0.254
10.1.0.254 dev wlan0 src 10.1.0.172 uid 1000 
    cache

which has the correct iface and saddr. I'm also not running a firewall of any kind.

It happens once or twice a day with no apparent trigger. If I restart iwd it usually corrects itself, and if I leave it alone, normal connectivity is restored after about 10 minutes. It starts and ends without any additional logging in my journal, so I'm wondering whether there are debugging flags I could set to understand why I'm getting EHOSTUNREACH and why it doesn't affect ip route. Has anyone seen a similar issue? I'm using iwlwifi + iwd + networkd.

I think it started sometime in the past month. In case it's relevant, here's some hardware and package info:

$ lspci -kd::280
02:00.0 Network controller: Intel Corporation Wireless 8265 / 8275 (rev 78)
	Subsystem: Intel Corporation Dual Band Wireless-AC 8265
	Kernel driver in use: iwlwifi
	Kernel modules: iwlwifi
$ paclog --package linux --package linux-firmware --package iwd --package systemd | tail
[2021-10-31T18:00:26-0700] [ALPM] upgraded linux (5.14.14.arch1-1 -> 5.14.15.arch1-1)
[2021-11-03T22:17:34-0700] [ALPM] upgraded linux (5.14.15.arch1-1 -> 5.14.16.arch1-1)
[2021-11-04T02:26:54-0700] [ALPM] upgraded iwd (1.18-1 -> 1.19-1)
[2021-11-05T12:17:29-0700] [ALPM] upgraded linux-firmware (20210919.d526e04-1 -> 20211027.1d00989-1)
[2021-11-11T03:11:21-0700] [ALPM] upgraded iwd (1.19-1 -> 1.19-2)
[2021-11-12T11:52:57-0700] [ALPM] upgraded systemd (249.5-3 -> 249.6-3)
[2021-11-12T17:16:12-0700] [ALPM] upgraded linux (5.14.16.arch1-1 -> 5.15.2.arch1-1)
[2021-11-20T10:49:25-0700] [ALPM] upgraded iwd (1.19-2 -> 1.20-1)
[2021-11-21T01:54:03-0700] [ALPM] upgraded linux (5.15.2.arch1-1 -> 5.15.3.arch1-1)
[2021-11-21T14:45:11-0700] [ALPM] upgraded systemd (249.6-3 -> 249.7-1)

Looking at the package list, linux-firmware seems like a possible culprit, but I have no reliable repro, so it's hard to test.

Last edited by Brocellous (2021-12-04 09:51:37)

#2 2021-11-23 00:47:48

rsmarples
Member
Registered: 2009-05-12
Posts: 286

Re: No route to host, but there is

When it fails, check the output of `ip neigh`.
I'm willing to bet that the ARP entry has been lost, and what is really being reported is that there is no L2 route to the host because it's expired due to a lack of ARP replies to requests.
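
A minimal sketch of that check, assuming the interface is wlan0 as in the first post. The function name unresolved_neighbors is made up for illustration; it reads `ip neigh` output on stdin and prints the addresses whose state is INCOMPLETE or FAILED, i.e. entries with no usable link-layer address.

```shell
# Hypothetical helper: list neighbors without a resolved link-layer address.
# Expects `ip neigh` output on stdin; the state is the last field of each line.
unresolved_neighbors() {
    awk '$NF == "INCOMPLETE" || $NF == "FAILED" { print $1 }'
}

# Usage while the connection is down (wlan0 is an assumption):
#   ip neigh show dev wlan0 | unresolved_neighbors
```

If the gateway shows up in that list while `ip route get` still looks fine, the failure is at L2 (neighbor resolution) rather than in the routing table.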

#3 2021-11-23 06:39:36

Brocellous
Member
Registered: 2017-11-27
Posts: 61

Re: No route to host, but there is

Good idea, I'll be sure to try it.

#4 2021-11-26 05:37:29

Brocellous
Member
Registered: 2017-11-27
Posts: 61

Re: No route to host, but there is

@rsmarples You were right. ip neigh shows that the neighbor entries for my router and DNS server have lost their link-layer addresses. Here is the output:

$ ip neigh
68.xxx.xxx.xxx dev wlan0 lladdr 40:b0:76:af:15:78 STALE
10.1.0.254 dev wlan0  INCOMPLETE
10.1.0.253 dev wlan0  INCOMPLETE
fe80::42b0:76ff:feaf:1578 dev wlan0  router INCOMPLETE

254 and 253 are my router and DNS server, so those entries are expected. fe80:: is the IPv6 link-local address of my router, so that's fine too.

The 68.xxx.xxx.xxx entry is unexpected. It is the public ipv4 addr of my router, same as is returned by

$ curl --ipv4 ipapi.co/json | jq -r .ip
68.xxx.xxx.xxx

when my connection is working. Not sure why I have a neighbor entry for that.

I tried to see whether my laptop was sending ARP requests while my connection was bad, and it appears it was:

$ tshark arp
    1 0.000000000 IntelCor_cd:xx.xx → Broadcast    ARP 42 Who has 10.1.0.253? Tell 10.1.0.172
    2 0.026894775 IntelCor_cd:xx.xx → Broadcast    ARP 42 Who has 10.1.0.254? Tell 10.1.0.172
    3 0.723413432 Grandstr_5f:xx:xx → Broadcast    ARP 64 Who has 10.1.0.21? Tell 0.0.0.0
    4 1.013461504 IntelCor_cd:xx.xx → Broadcast    ARP 42 Who has 10.1.0.253? Tell 10.1.0.172
    5 1.040195147 IntelCor_cd:xx.xx → Broadcast    ARP 42 Who has 10.1.0.254? Tell 10.1.0.172
    6 2.030595138 IntelCor_cd:xx.xx → Broadcast    ARP 42 Who has 10.1.0.253? Tell 10.1.0.172
    7 2.053669152 IntelCor_cd:xx.xx → Broadcast    ARP 42 Who has 10.1.0.254? Tell 10.1.0.172
    8 3.040061163 IntelCor_cd:xx.xx → Broadcast    ARP 42 Who has 10.1.0.253? Tell 10.1.0.172
[...]

I saw requests for 254 and 253, each at a rate of about one per second, but apparently no replies. Not sure what to make of that.
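
A rough way to put numbers on that, as a sketch only: the helper below (arp_summary is a made-up name) counts requests and replies in tshark's one-line summaries. It assumes the default summary text, where requests contain "Who has" and replies contain "is at"; other Wireshark versions or column settings may word these differently.

```shell
# Hypothetical helper: count ARP requests and replies in tshark summary lines.
# Assumes requests read "Who has X? Tell Y" and replies read "X is at MAC".
arp_summary() {
    awk '
        /Who has/ { req++ }
        / is at / { rep++ }
        END { printf "requests=%d replies=%d\n", req, rep }
    '
}

# Usage (wlan0 and the 30 second capture window are assumptions):
#   tshark -i wlan0 -f arp -a duration:30 | arp_summary
```

A result with many requests and zero replies during an outage would match what the capture above suggests.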

#5 2021-12-04 00:17:56

Brocellous
Member
Registered: 2017-11-27
Posts: 61

Re: No route to host, but there is

Today I tested from another machine on the local network while my laptop was having this issue. From there I determined that my router did respond to arping, for both broadcast and unicast ARP requests.

I also tried to arping my laptop while it was down. Interestingly, it does reliably respond to the broadcast arping requests, but not the unicast ones.

EDIT:

The repeated ARP requests from my laptop also show up on the other machine in tshark arp. I'm guessing that means my laptop is somehow failing to receive the ARP replies.

Last edited by Brocellous (2021-12-04 09:41:18)
