My home network looks something like this:
ISP ----- Cable Modem ----- Arch Server ----- Home Network
The Internet-facing port, eno1, has a static IP address assigned by the ISP, and the Arch Server provides firewall and NAT services to the home network. Generally everything works great, but very intermittently (say once every couple of weeks) I inexplicably lose network connectivity through the eno1 port. Similarly, every so often when I reboot, say after a kernel upgrade, the machine comes up with no Internet access. In both cases I can fix the problem by taking the interface down and then up again, and re-adding the default route:
# ip link set eno1 down
# ip link set eno1 up
# ip route add default via 76.198.113.254
Prior to doing this, everything looks completely normal:
[root@toad ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 74:d4:35:18:11:34 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.1/24 brd 192.168.1.255 scope global enp2s0
       valid_lft forever preferred_lft forever
3: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 74:d4:35:18:11:36 brd ff:ff:ff:ff:ff:ff
    inet 76.198.113.142/24 brd 67.198.113.255 scope global eno1
       valid_lft forever preferred_lft forever
4: wlp3s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 0c:8b:fd:75:13:d6 brd ff:ff:ff:ff:ff:ff
[root@toad ~]# ip route
default via 67.198.113.254 dev eno1
76.198.113.0/24 dev eno1 proto kernel scope link src 76.198.113.142
192.168.1.0/24 dev enp2s0 proto kernel scope link src 192.168.1.1
[root@toad ~]#
I just can't get anything out or in through that port. After I take the interface down and up, the output of ip addr and ip route looks exactly the same as above; the only difference is that the interface works again.
I'm clueless as to how to go about debugging this problem or in fact even understanding what it could be. I suppose intermittent hardware failure is possible, but seems very unlikely, given the frequency with which this happens (i.e. not that frequently).
Any ideas, things I can test, etc.?
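One thing that could be tested: capture the state of the port while it is dead, before bouncing it, so there is something to compare against a working state. The following is only a rough sketch. The interface name and gateway come from the commands above; the helper names run and snapshot are made up for illustration, and it assumes ethtool is installed and that the kernel log is available through journalctl.

```shell
#!/bin/sh
# Sketch: snapshot the state of the WAN port while it is failing.
# IFACE and GATEWAY are taken from the post; adjust as needed.
IFACE="${IFACE:-eno1}"
GATEWAY="${GATEWAY:-76.198.113.254}"

# Run one command, recording the command line, its output and its exit status.
run() {
    printf '### %s\n' "$*"
    status=0
    "$@" 2>&1 || status=$?
    printf '### exit status: %s\n\n' "$status"
}

# Collect link counters, neighbour (ARP) entries, PHY state,
# a short gateway ping and the most recent kernel messages.
snapshot() {
    run ip -s link show dev "$IFACE"
    run ip neigh show dev "$IFACE"
    run ethtool "$IFACE"
    run ping -c 3 -W 2 "$GATEWAY"
    run journalctl -k -n 50 --no-pager
}
```

Running something like snapshot > /root/netdiag-broken.log during an outage and again after recovery gives two files to diff. A missing or FAILED neighbour entry for the gateway, or growing error counters, would each point in a different direction.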
It may be infrequent, but is it regular? That is, is there a pattern to the failures? My last ISP assigned me a static IP, and every fortnight (around 10-11am on a Sunday) the same thing would happen, they would run some sort of reset and my network would go down and I would have to bring the interface back up.
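To find out whether there is a pattern, one option is to log a timestamped UP/DOWN line at a fixed interval and look at the DOWN entries afterwards. This is a minimal sketch, not a tested tool: the gateway address is the one from the first post, while the log path, the PROBE variable and the function name check_wan are assumptions made for the example. It would be run periodically, for instance from cron or a systemd timer.

```shell
#!/bin/sh
# Sketch: append one timestamped UP/DOWN line per invocation, so outages
# can later be checked for a pattern (time of day, day of week).
GATEWAY="${GATEWAY:-76.198.113.254}"   # gateway from the first post
LOG="${LOG:-/var/log/wan-check.log}"   # hypothetical log path
PROBE="${PROBE:-ping -c 1 -W 2}"       # probe command, overridable

check_wan() {
    # $PROBE is left unquoted on purpose so its options are split into words.
    if $PROBE "$GATEWAY" >/dev/null 2>&1; then
        state=UP
    else
        state=DOWN
    fi
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S %a')" "$state" >> "$LOG"
}
```

After a few weeks, grep DOWN on the log file shows whether the failures cluster around a particular time or weekday.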
It may be infrequent, but is it regular? That is, is there a pattern to the failures?
I thought of this, but sadly it doesn't seem to be regular (unless a new regular pattern has just begun). My network went down yesterday morning and again this morning, which is quite unusual. Previously it would be once every couple of weeks (or less). I can't imagine what the ISP could do to make your network go down, though -- this shouldn't be possible; i.e. they can't take the interface down, or anything like that, so what could it be? Surely the kernel shouldn't be secretly disabling the device driver, or something similar?
More worrisome, this morning the trick of bringing the interface down and back up did not work. I tried it several times, and also tried resetting the firewall rules to a simple ruleset. Nothing worked other than rebooting the server. Fortunately it boots in about 2 seconds, but it still bothers me to have to reboot.
Impending hardware failure?
Impending hardware failure?
That's always a possibility. This incident and your comment have piqued my curiosity, though. I'm going to try to find out whether there are any known external conditions that can cause a network port to stop responding even though the interface is reported as up. This shouldn't be possible, of course, unless there's a reason for it, such as a risk of physical damage to the hardware driving the port.
Having to reboot to fix the problem this morning does make me more suspicious that it's an intermittent hardware problem. This build used a Gigabyte ITX board. If it is hardware, I'm switching back to the top-of-the-line ASUS boards, as it's not worth having to deal with issues like this.
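As a first step towards telling "reported up" apart from "actually passing traffic", the kernel's own view of the link and its error counters can be read from sysfs. The sketch below is only illustrative: the function name link_report and the SYSFS override are made up for the example, and the counters it prints cannot prove a hardware fault by themselves.

```shell
#!/bin/sh
# Sketch: print link state and error/drop counters for one interface.
SYSFS="${SYSFS:-/sys/class/net}"   # overridable root, mainly for testing

link_report() {
    dev="$SYSFS/$1"
    for f in operstate carrier statistics/rx_errors statistics/tx_errors \
             statistics/rx_dropped statistics/tx_dropped; do
        # A '?' means the file was missing or unreadable (carrier cannot be
        # read while the interface is administratively down).
        printf '%s=%s\n' "$f" "$(cat "$dev/$f" 2>/dev/null || echo '?')"
    done
}
```

Calling link_report eno1 during an outage and comparing it with a healthy run shows whether the carrier dropped or the error counters jumped. If neither changed, the link and driver look fine from the kernel's side and the cause may lie further upstream.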
The fact that it is irregular would point me to hardware. Good luck with further troubleshooting!