
#1 2018-04-28 22:25:56

mascip
Member
Registered: 2015-02-10
Posts: 29

Ethernet not working 26% of the time after startup

Hi, I'm running this version of Arch Linux:

$ cat /proc/version
Linux version 4.16.3-1-ARCH (builduser@heftig-2067) (gcc version 7.3.1 20180406 (GCC)) #1 SMP PREEMPT Thu Apr 19 09:17:56 UTC 2018

Most of the time my Ethernet connection (managed by the dhcpcd service) works after startup, but sometimes it doesn't, seemingly at random. Here is the track record for the last 50 startups: 74% successes and 26% failures (about one failure every 4 startups on average... is that a coincidence?), apparently in random order.
0 stands for Failure and 1 stands for Success: 01111111100111111011110100111101
"Failure" means that my Ethernet is not started by the end of startup.

One particularity of my motherboard is that it has two Ethernet network cards (called eth1 and eno1; you will see them below, in the output of $ ip a). I wonder if that's related. These ports sometimes change names to eth0 or eno0 when I move the Ethernet cable from one Ethernet socket to the other and restart the computer. I didn't choose these names.

When the Ethernet is not working after startup, this is what things look like for the dhcpcd service:

$ journalctl -b | grep "dhcp"
Apr 27 19:53:22 mascip-desk systemd[1]: Created slice system-dhcpcd.slice.
Apr 27 19:53:23 mascip-desk systemd[1]: Starting dhcpcd on eno1...
Apr 27 19:53:24 mascip-desk dhcpcd[350]: eno1: waiting for carrier
Apr 27 19:53:24 mascip-desk dhcpcd[350]: eno1: carrier acquired
Apr 27 19:53:24 mascip-desk dhcpcd[350]: DUID 00:01:00:01:22:71:2b:79:4c:72:b9:21:16:8f
Apr 27 19:53:24 mascip-desk dhcpcd[350]: eno1: IAID b9:21:16:90
Apr 27 19:53:24 mascip-desk dhcpcd[350]: eno1: adding address fe80::4e72:b9ff:fe21:1690
Apr 27 19:53:24 mascip-desk dhcpcd[350]: eno1: carrier lost
Apr 27 19:53:24 mascip-desk dhcpcd[350]: eno1: deleting address fe80::4e72:b9ff:fe21:1690
Apr 27 19:53:30 mascip-desk dhcpcd[569]: wlp1s0: soliciting a DHCP lease
Apr 27 19:53:31 mascip-desk dhcpcd[569]: wlp1s0: offered 192.168.1.7 from 192.168.1.1
Apr 27 19:53:32 mascip-desk dhcpcd[569]: wlp1s0: probing address 192.168.1.7/24
Apr 27 19:53:37 mascip-desk dhcpcd[569]: wlp1s0: leased 192.168.1.7 for 3600 seconds
Apr 27 19:53:37 mascip-desk dhcpcd[569]: wlp1s0: adding route to 192.168.1.0/24
Apr 27 19:53:37 mascip-desk dhcpcd[569]: wlp1s0: adding default route via 192.168.1.1
Apr 27 19:53:37 mascip-desk dhcpcd[569]: forked to background, child pid 587
Apr 27 19:53:54 mascip-desk dhcpcd[350]: timed out
Apr 27 19:53:54 mascip-desk dhcpcd[350]: timed out
Apr 27 19:53:54 mascip-desk systemd[1]: dhcpcd@eno1.service: Control process exited, code=exited status=1
Apr 27 19:53:54 mascip-desk dhcpcd[350]: dhcpcd exited
Apr 27 19:53:54 mascip-desk systemd[1]: dhcpcd@eno1.service: Failed with result 'exit-code'.
Apr 27 19:53:54 mascip-desk systemd[1]: Failed to start dhcpcd on eno1.
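
In the log above, the carrier on eno1 is acquired and lost within the same second. To make that kind of flapping easier to spot across boots, the dhcpcd lines can be reduced to just the carrier events for one interface. This is a minimal sketch that assumes the message format shown above; `carrier_events` is a hypothetical helper name, not part of dhcpcd or systemd.

```shell
# carrier_events IFACE: read journal text on stdin and print only the
# "carrier acquired" / "carrier lost" lines that dhcpcd logged for IFACE.
carrier_events() {
    grep -E "dhcpcd\[[0-9]+\]: $1: carrier (acquired|lost)"
}

# Example (run on the affected machine):
#   journalctl -b | carrier_events eno1
```

An empty result on a failed boot would suggest the link never came up at all, rather than coming up and dropping again.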

And when the Ethernet doesn't work, eno1 says "NO-CARRIER":

$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: wlp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:15:00:5f:aa:98 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.6/24 brd 192.168.1.255 scope global noprefixroute wlp1s0
       valid_lft forever preferred_lft forever
    inet6 fe80::215:ff:fe5f:aa98/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 4c:72:b9:21:16:8f brd ff:ff:ff:ff:ff:ff
4: eno1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
    link/ether 4c:72:b9:21:16:90 brd ff:ff:ff:ff:ff:ff
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:db:c5:bb:59 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever

And if I try to start it manually, it doesn't work (it only works when it is started at boot):

$ sudo systemctl start dhcpcd@eno1
Job for dhcpcd@eno1.service failed because the control process exited with error code.
See "systemctl status dhcpcd@eno1.service" and "journalctl -xe" for details.

In this case journalctl gives:

[...a lot of irrelevant things…]
Apr 28 22:39:56 mascip-desk polkitd[632]: Operator of unix-process:627:5652 FAILED to authenticate to gain authorization for action org.freedesktop.systemd1.manage-units for system-bus-name::1.12 [systemctl start dhcpcd@eno1] (owned by unix-user:user)
Apr 28 22:39:56 mascip-desk polkitd[632]: Unregistered Authentication Agent for unix-process:627:5652 (system bus name :1.10, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale C) (disconnected from bus)
Apr 28 22:40:04 mascip-desk sudo[650]:     user : TTY=tty1 ; PWD=/home/user ; USER=root ; COMMAND=/usr/bin/systemctl start dhcpcd@eno1
Apr 28 22:40:04 mascip-desk sudo[650]: pam_unix(sudo:session): session opened for user root by user(uid=0)
Apr 28 22:40:04 mascip-desk systemd[1]: Starting dhcpcd on eno1...
-- Subject: Unit dhcpcd@eno1.service has begun start-up
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit dhcpcd@eno1.service has begun starting up.
Apr 28 22:40:04 mascip-desk dhcpcd[653]: eno1: waiting for carrier
Apr 28 22:40:34 mascip-desk dhcpcd[653]: timed out
Apr 28 22:40:34 mascip-desk dhcpcd[653]: timed out
Apr 28 22:40:34 mascip-desk systemd[1]: dhcpcd@eno1.service: Control process exited, code=exited status=1
Apr 28 22:40:34 mascip-desk dhcpcd[653]: dhcpcd exited
Apr 28 22:40:34 mascip-desk systemd[1]: dhcpcd@eno1.service: Failed with result 'exit-code'.
Apr 28 22:40:34 mascip-desk systemd[1]: Failed to start dhcpcd on eno1.
-- Subject: Unit dhcpcd@eno1.service has failed
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit dhcpcd@eno1.service has failed.
[...a lot of irrelevant things…]

-------------------------------------------

On the other hand, 74% of the time, when the Ethernet is working after startup, this is what things look like for the dhcpcd service:

$ journalctl -b | grep "dhcp"
grep: warning: GREP_OPTIONS is deprecated; please use an alias or script
Apr 28 10:43:35 mascip-desk systemd[1]: Starting dhcpcd on eno1...
Apr 28 10:43:35 mascip-desk dhcpcd[361]: eno1: waiting for carrier
Apr 28 10:43:37 mascip-desk dhcpcd[361]: eno1: carrier acquired
Apr 28 10:43:37 mascip-desk dhcpcd[361]: DUID 00:01:00:01:22:71:2b:79:4c:72:b9:21:16:8f
Apr 28 10:43:37 mascip-desk dhcpcd[361]: eno1: IAID b9:21:16:8f
Apr 28 10:43:37 mascip-desk dhcpcd[361]: eno1: adding address fe80::4e72:b9ff:fe21:168f
Apr 28 10:43:37 mascip-desk dhcpcd[361]: eno1: soliciting a DHCP lease
Apr 28 10:43:37 mascip-desk dhcpcd[361]: eno1: soliciting an IPv6 router
Apr 28 10:43:38 mascip-desk dhcpcd[361]: eno1: offered 192.168.1.5 from 192.168.1.1
Apr 28 10:43:38 mascip-desk dhcpcd[361]: eno1: probing address 192.168.1.5/24
Apr 28 10:43:43 mascip-desk dhcpcd[361]: eno1: leased 192.168.1.5 for 3600 seconds
Apr 28 10:43:43 mascip-desk dhcpcd[361]: eno1: adding route to 192.168.1.0/24
Apr 28 10:43:43 mascip-desk dhcpcd[361]: eno1: adding default route via 192.168.1.1
Apr 28 10:43:43 mascip-desk dhcpcd[361]: forked to background, child pid 616
Apr 28 10:43:43 mascip-desk systemd[1]: Started dhcpcd on eno1.
Apr 28 10:43:49 mascip-desk dhcpcd[809]: wlp1s0: soliciting a DHCP lease
Apr 28 10:43:50 mascip-desk dhcpcd[616]: eno1: no IPv6 Routers available
Apr 28 10:43:50 mascip-desk dhcpcd[809]: docker0: new hardware address: 02:42:48:c9:45:d9
Apr 28 10:43:50 mascip-desk dhcpcd[616]: docker0: new hardware address: 02:42:48:c9:45:d9
Apr 28 10:43:52 mascip-desk dhcpcd[809]: wlp1s0: offered 192.168.1.6 from 192.168.1.1
Apr 28 10:43:52 mascip-desk dhcpcd[809]: wlp1s0: probing address 192.168.1.6/24
Apr 28 10:43:56 mascip-desk dhcpcd[809]: wlp1s0: leased 192.168.1.6 for 3600 seconds
Apr 28 10:43:56 mascip-desk dhcpcd[809]: wlp1s0: adding route to 192.168.1.0/24
Apr 28 10:43:56 mascip-desk dhcpcd[809]: wlp1s0: adding default route via 192.168.1.1
Apr 28 10:43:56 mascip-desk dhcpcd[809]: forked to background, child pid 1029

And when the Ethernet works, eno1 no longer says "NO-CARRIER"; its flags begin with "BROADCAST" and include "LOWER_UP":

$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 4c:72:b9:21:16:8f brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.5/24 brd 192.168.1.255 scope global noprefixroute eno1
       valid_lft forever preferred_lft forever
    inet6 fe80::4e72:b9ff:fe21:168f/64 scope link 
       valid_lft forever preferred_lft forever
3: wlp1s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 00:15:00:5f:aa:98 brd ff:ff:ff:ff:ff:ff
4: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 4c:72:b9:21:16:90 brd ff:ff:ff:ff:ff:ff
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:49:6b:fe:0a brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
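
Comparing the two `ip a` listings, the MAC address behind the name eno1 is not the same: 4c:72:b9:21:16:90 in the failing boot and 4c:72:b9:21:16:8f in the working one, with eth1 holding the other address each time. That could mean the two names are swapping between the physical ports from boot to boot, though this is only an inference from the output above. A small sketch to print name/MAC pairs so that boots can be compared; `iface_macs` is a hypothetical helper name.

```shell
# iface_macs: read `ip a` (or `ip link`) output on stdin and print one
# "name mac" pair for every interface that has a link/ether line.
iface_macs() {
    awk '
        # Header line such as "4: eno1: ...": remember the interface name.
        /^[0-9]+: /        { name = $2; sub(/:$/, "", name) }
        # Indented "link/ether MAC brd ..." line: print name and MAC.
        $1 == "link/ether" { print name, $2 }
    '
}

# Example (run after each boot and compare the results):
#   ip a | iface_macs
```

If the pairs differ between a working and a failing boot, the cable may simply be plugged into the port that did not get the name dhcpcd was enabled for.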

Any ideas about what might be happening, or what I could check next? I'm lost.





-------------------------------
I tried to move the Ethernet cable from one Ethernet socket to the other, and tried this:

$ sudo systemctl disable dhcpcd@eno1
Removed /etc/systemd/system/multi-user.target.wants/dhcpcd@eno1.service.
$ sudo systemctl enable dhcpcd@eth1
Created symlink /etc/systemd/system/multi-user.target.wants/dhcpcd@eth1.service -> /usr/lib/systemd/system/dhcpcd@.service.

But it doesn’t work. It says:

$ journalctl -u dhcpcd@eth1 --since "60 min ago"
-- Logs begin at Fri 2015-04-24 21:30:33 BST, end at Sat 2018-04-28 22:06:11 BST. --
Apr 28 21:52:04 mascip-desk systemd[1]: Dependency failed for dhcpcd on eth1.
Apr 28 21:52:04 mascip-desk systemd[1]: dhcpcd@eth1.service: Job dhcpcd@eth1.service/start failed with result 'dependency'.
Apr 28 22:02:17 mascip-desk systemd[1]: Dependency failed for dhcpcd on eth1.
Apr 28 22:02:17 mascip-desk systemd[1]: dhcpcd@eth1.service: Job dhcpcd@eth1.service/start failed with result 'dependency'.
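
The "Dependency failed" result may come from the unit template itself. As far as I know, Arch's dhcpcd@.service is bound to the device unit of the named interface, so if no interface called eth1 exists at that boot, the job would fail before dhcpcd even runs. That is an assumption and worth verifying with `systemctl cat`. A sketch of the checks; `device_unit_for` is a hypothetical helper and only handles plain names without characters that systemd would escape.

```shell
# device_unit_for IFACE: print the systemd device unit name that
# corresponds to a network interface with a simple name
# (no "-" or other characters that systemd escapes in unit names).
device_unit_for() {
    printf 'sys-subsystem-net-devices-%s.device\n' "$1"
}

# Example checks (run on the affected machine):
#   systemctl cat dhcpcd@eth1.service            # look for BindsTo= / After= lines
#   systemctl status "$(device_unit_for eth1)"   # is the device unit active?
#   ls /sys/class/net                            # which names exist right now?
```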


--------------------------------
I read somewhere that systemd-networkd could be used instead of dhcpcd: https://wiki.archlinux.org/index.php/Systemd-networkd
Maybe that's my way out of this problem?
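
If systemd-networkd is tried, one option is to match the port by MAC address rather than by name, so that a name change between boots would not matter. This is a sketch only: the file name is an arbitrary choice, `networkd_unit_for_mac` is a hypothetical helper, and the MAC in the example is the one eno1 carried in the working boot above, which is assumed (not confirmed) to belong to the cabled port.

```shell
# networkd_unit_for_mac MAC: print a minimal systemd-networkd .network
# file that enables DHCP on whichever interface has the given MAC.
networkd_unit_for_mac() {
    printf '[Match]\nMACAddress=%s\n\n[Network]\nDHCP=yes\n' "$1"
}

# Example (as root; the file name is an arbitrary choice):
#   networkd_unit_for_mac 4c:72:b9:21:16:8f > /etc/systemd/network/20-wired.network
#   systemctl disable dhcpcd@eno1
#   systemctl enable --now systemd-networkd
```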

Thank you for any suggestions.


#2 2018-04-29 10:09:53

seth
Member
Registered: 2012-09-03
Posts: 51,172

Re: Ethernet not working 26% of the time after startup

Apr 27 19:53:24 mascip-desk dhcpcd[350]: eno1: waiting for carrier
Apr 27 19:53:24 mascip-desk dhcpcd[350]: eno1: carrier acquired
...
Apr 27 19:53:24 mascip-desk dhcpcd[350]: eno1: carrier lost

Sloppy plug or broken cable.


#3 2018-04-29 11:32:08

mascip
Member
Registered: 2015-02-10
Posts: 29

Re: Ethernet not working 26% of the time after startup

Thank you seth :-)

I'm pretty sure that it's not the cable: I tried it with 3 different cables, one of them brand new, and it still does the same (I have re-tried it just now, to make sure).

And a sloppy plug is possible, but it does seem unlikely: I have restarted the computer 8 times in the last 10 minutes, with 5 successes and 3 fails in a random order: 1101001. During this time, the cable and plug have remained untouched. If it was a very sloppy plug, the cable would come right out and I would get a series of fails. The fact that fails and successes are alternating so randomly could indicate that the plug is "just a bit" sloppy, and that sometimes the computer manages to make that connection, while sometimes not. Possible...

I guess that, to check properly, I could try using the other socket.
I don't know why that didn't work, though:

dhcpcd@eth1.service/start failed with result 'dependency'.

What could be the cause of that? I searched online but didn't come up with anything interesting. Mostly it was people using the wrong name (for example "eth0" instead of "eth1"), and I checked that, or people having several dhcpcd services enabled at the same time, and I checked that too.


#4 2018-04-29 11:57:50

seth
Member
Registered: 2012-09-03
Posts: 51,172

Re: Ethernet not working 26% of the time after startup

The missing "dependency" is probably the carrier.

NO-CARRIER means the interface cannot talk to the other side. The usual cause is a physical failure, but it could also be a problem with the switch/router (whatever the other end of the cable is plugged into), so you should check that as well. The randomness suggests a loose connection, though.
Also ensure the plug is properly seated and the latch locked.
Worst case scenario would be a cold solder joint for the jack :-\
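
The kernel's own view of each link can also be read from sysfs, without dhcpcd involved. A minimal sketch; `link_states` is a hypothetical helper, and its optional argument exists only so that it can be pointed at another directory for testing.

```shell
# link_states [DIR]: print "name operstate" for every interface found
# under DIR (default: /sys/class/net). The body runs in a subshell so
# its variables do not leak into the calling shell.
link_states() (
    dir=${1:-/sys/class/net}
    for path in "$dir"/*; do
        [ -r "$path/operstate" ] || continue
        printf '%s %s\n' "${path##*/}" "$(cat "$path/operstate")"
    done
)

# Example (run right after a failed boot):
#   link_states
```

An interface with a working link normally reports "up" here, matching the "state UP" shown by ip.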


#5 2018-04-29 11:59:05

Lone_Wolf
Member
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 11,919

Re: Ethernet not working 26% of the time after startup

ethtool can be very useful for diagnostics.

Temporarily disable dhcpcd, boot into multi-user.target, and log in as root.

Run ethtool on the interface and post the output.
If everything checks out try starting the appropriate dhcpcd service.
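
ethtool takes the interface name as an argument, and its output includes a "Link detected" line. A small sketch that pulls out just that value; `link_detected` is a hypothetical helper name, and it assumes ethtool's usual English output.

```shell
# link_detected: read `ethtool IFACE` output on stdin and print the
# value of its "Link detected:" line (normally "yes" or "no").
link_detected() {
    sed -n 's/^[[:space:]]*Link detected:[[:space:]]*//p'
}

# Example (as root, once per port):
#   ethtool eno1 | link_detected
#   ethtool eth1 | link_detected
```

Running it for both ports shows which one the kernel currently considers cabled, independent of how the ports happen to be named at that boot.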


Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.


(A works at time B)  && (time C > time B ) ≠  (A works at time C)

