#1 2013-05-06 22:28:13

erikvv
Member
Registered: 2013-05-06
Posts: 7

Detected Hardware Unit Hang : Reset adapter unexpectedly

Hello all!


### summary ###

I upgraded my kernel from 3.5 to 3.8, and since then I have been experiencing timeouts on one of the network interfaces. When this happens, journalctl shows the following error:

May 06 23:23:41 PRIME kernel: e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
  TDH                  <3d>
  TDT                  <92>
  next_to_use          <92>
  next_to_clean        <3d>
buffer_info[next_to_clean]:
  time_stamp           <100039539>
  next_to_watch        <3e>
  jiffies              <100039856>
  next_to_watch.status <0>
MAC Status             <40080083>
PHY Status             <796d>
PHY 1000BASE-T Status  <3c00>
PHY Extended Status    <3000>
PCI Status             <10>
May 06 23:23:43 PRIME kernel: e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
  TDH                  <3d>
  TDT                  <92>
  next_to_use          <92>
  next_to_clean        <3d>
buffer_info[next_to_clean]:
  time_stamp           <100039539>
  next_to_watch        <3e>
  jiffies              <100039aae>
  next_to_watch.status <0>
MAC Status             <40080083>
PHY Status             <796d>
PHY 1000BASE-T Status  <3c00>
PHY Extended Status    <3000>
PCI Status             <10>
May 06 23:23:44 PRIME kernel: e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
May 06 23:23:44 PRIME kernel: br0: port 1(eno1) entered disabled state
May 06 23:23:48 PRIME kernel: e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
May 06 23:23:48 PRIME kernel: br0: port 1(eno1) entered forwarding state
May 06 23:23:48 PRIME kernel: br0: port 1(eno1) entered forwarding state
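The hex fields in the two hang reports above can be compared with a little arithmetic. This is only a sketch: `hex_delta` is a hypothetical helper, and it assumes that TDH/TDT are the transmit descriptor head and tail indices and that time_stamp is the jiffies value recorded when the stuck buffer was queued.

```shell
#!/bin/sh
# hex_delta A B: print B - A in decimal, where A and B are hex values
# copied from the log without the angle brackets (hypothetical helper).
hex_delta() {
    echo $(( 0x$2 - 0x$1 ))
}

hex_delta 3d 92                  # TDT - TDH: descriptors queued but not completed -> 85
hex_delta 100039539 100039856    # jiffies - time_stamp, first report  -> 797
hex_delta 100039539 100039aae    # jiffies - time_stamp, second report -> 1397
```

TDH and next_to_clean stay at 3d in both reports while jiffies keeps advancing, so the transmit ring makes no progress between them. The reports are about two seconds apart by wall clock and 600 jiffies apart, which would be consistent with a 300 Hz tick, but that is an inference from the timestamps and not something the log states.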

### Distro ###

Originally I was using Ubuntu. I upgraded from 12.10 to 13.04 and then the issue started. I borked my kernel in an attempt to fix things. After that I switched to Arch, but the issue remains.


### diagnostics ###

The motherboard is a P9X79 Deluxe (lspci reports it incorrectly as a P8P67 Deluxe). It has 3 network interfaces: one wireless, one Realtek 8111E (enp10s0), and one Intel 82579V (eno1). Only the Intel is affected by this problem.

lspci -vvv

....
00:19.0 Ethernet controller: Intel Corporation 82579V Gigabit Network Connection (rev 05)
        Subsystem: ASUSTeK Computer Inc. P8P67 Deluxe Motherboard
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 92
        Region 0: Memory at fbf00000 (32-bit, non-prefetchable) [size=128K]
        Region 1: Memory at fbf28000 (32-bit, non-prefetchable) [size=4K]
        Region 2: I/O ports at f040 [size=32]
        Capabilities: [c8] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: 00000000fee00000  Data: 4055
        Capabilities: [e0] PCI Advanced Features
                AFCap: TP+ FLR+
                AFCtrl: FLR-
                AFStatus: TP-
        Kernel driver in use: e1000e
        Kernel modules: e1000e
...

ethtool -i eno1

driver: e1000e
version: 2.3.2-NAPI
firmware-version: 0.13-4
bus-info: 0000:00:19.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no

ethtool -k eno1

Features for eno1:
rx-checksumming: on
tx-checksumming: on
        tx-checksum-ipv4: off [fixed]
        tx-checksum-ip-generic: on
        tx-checksum-ipv6: off [fixed]
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: off [fixed]
        tx-tcp6-segmentation: on
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: on
loopback: off [fixed]
rx-fcs: off
rx-all: off

### use case ###

I am using this Linux box as a router, as shown below. It also runs Samba and dhcpd for the LAN.

                           #----------#   #-----#   #--------#
                           |          |   |     |---|  eno1  |---(LAN)
             #---------#   |          |   |     |   #--------#   
(internet)---| enp10s0 |---| COMPUTER |---| br0 |         
             #---------#   |          |   |     |   #--------#   
                           |          |   |     |---| wlp8s0 |---(LAN)
                           #----------#   #-----#   #--------#

I use netctl to set up enp10s0 (dhcp) and br0 (static), and hostapd to set up wlp8s0.

iptables-save (I'm running a minimal setup to keep it simple for now)

*nat
:PREROUTING ACCEPT [7840:680033]
:INPUT ACCEPT [2619:163187]
:OUTPUT ACCEPT [283:17319]
:POSTROUTING ACCEPT [929:78831]
-A POSTROUTING -s 10.0.0.0/24 -o enp10s0 -j MASQUERADE
COMMIT

*filter
:INPUT ACCEPT [98426:606304255]
:FORWARD ACCEPT [7047:594273]
:OUTPUT ACCEPT [106432:61533766]
COMMIT

### reproducibility ###

The issue seems random and occurs under different circumstances, but I've managed to find a 100% reproducible case: on a PC in the LAN, I use a Firefox download manager to download a large file at 10 MB/s and save it directly to the router over SMB or SCP. The error occurs within a few seconds.

Oddly, when I download a file to a local drive and transfer it to the router afterwards (at much higher speed), the issue does not occur.


### what I've tried ###

- fresh installation
- updated bios
- installed latest drivers from intel
- removed the network bridge and wireless interface, and run the system with only the 2 wired interfaces
- replaced physical cables
- turned off auto-negotiation
- disabled rx flow control
- enabled arp filtering

nothing changed
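
The post does not say which commands were used for the last three items. Assuming ethtool and sysctl were the tools, they might look roughly like the sketch below. `run` and `link_tweaks` are hypothetical helper names, the 100 Mb/s forced speed is an assumption (gigabit over copper normally relies on auto-negotiation), and by default the commands are only printed, not executed.

```shell
#!/bin/sh
# Interface name from the post; override with IFACE=... if needed.
IFACE="${IFACE:-eno1}"

# Print the command unless RUN=1 is set, in which case execute it (needs root).
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

link_tweaks() {
    # Turn off auto-negotiation and force a fixed speed/duplex (assumed values).
    run ethtool -s "$IFACE" autoneg off speed 100 duplex full
    # Disable RX flow control (pause frames).
    run ethtool -A "$IFACE" rx off
    # Enable ARP filtering for all interfaces.
    run sysctl -w net.ipv4.conf.all.arp_filter=1
}
```

Calling `link_tweaks` prints the three commands for review; `RUN=1 link_tweaks` as root would apply them until the next reboot or link reset.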


### Final notes ###

I must say I'm very content with Arch so far. The wiki is great and many actions are simpler than on Ubuntu.

This is the only Linux PC I ever use, so please go easy on me.

One more thing I could try is to swap the interfaces: use the Realtek one for LAN and the Intel for internet. But that also means that when the issue occurs, I might not notice it, while others using the server remotely (of which there are many more) will be affected.

I really hope someone can help.

Last edited by erikvv (2013-05-22 20:45:32)

#2 2013-05-06 23:18:53

erikvv
Member
Registered: 2013-05-06
Posts: 7

Re: Detected Hardware Unit Hang : Reset adapter unexpectedly

I've swapped the network interfaces and the issue is gone. I'm still curious about the cause, though. Intel is supposed to be better than Realtek.

Last edited by erikvv (2013-05-06 23:19:21)

#3 2013-05-10 00:01:11

erikvv
Member
Registered: 2013-05-06
Posts: 7

Re: Detected Hardware Unit Hang : Reset adapter unexpectedly

Alas, the problem has appeared even in the new configuration, so it is not solved.

#4 2013-05-14 13:23:53

erikvv
Member
Registered: 2013-05-06
Posts: 7

Re: Detected Hardware Unit Hang : Reset adapter unexpectedly

I've contacted Asus support, and they want me to test on Windows.

Nothing against Windows, but I'd have to buy a license and take my entire server offline again. I don't have another system to replace it with, and I lease Minecraft servers on it, which require powerful hardware.

Last edited by erikvv (2013-05-14 13:25:14)

#5 2013-05-20 19:25:43

erikvv
Member
Registered: 2013-05-06
Posts: 7

Re: Detected Hardware Unit Hang : Reset adapter unexpectedly

nvm

Last edited by erikvv (2013-05-22 20:45:41)

#6 2013-06-26 16:06:34

demize
Package Maintainer (PM)
From: Stockholm, Sweden
Registered: 2012-10-23
Posts: 20

Re: Detected Hardware Unit Hang : Reset adapter unexpectedly

From #archlinux:

18:03:45   onox | demize: could you post at https://bbs.archlinux.org/viewtopic.php?id=162841 the following: I tried to disable tcp-segmentation-offload with: ethtool -K eno1 tso off (seems to work for my 82579LM)

Last edited by demize (2013-06-26 16:07:27)

#7 2014-02-08 05:00:17

adamcstephens
Member
Registered: 2013-11-10
Posts: 1

Re: Detected Hardware Unit Hang : Reset adapter unexpectedly

This worked for me on the 82566MM in my laptop, which I use as a NAT gateway. In my case the affected NIC was the internal/private interface.

ethtool -K enp0s25 tso off

To make it permanent, add this to /etc/netctl/<profile>:

ExecUpPost='/usr/bin/ethtool -K enp0s25 tso off'
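
To check whether the workaround is actually in effect, the `ethtool -k` output can be parsed for the top-level tcp-segmentation-offload line. This is a minimal sketch: `tso_state` and `disable_tso_if_on` are hypothetical helper names, and the interface name is passed in by the caller.

```shell
#!/bin/sh
# Read `ethtool -k` output on stdin and print the state of the top-level
# tcp-segmentation-offload feature (e.g. "on", "off", "off [fixed]").
# Indented sub-features such as tx-tcp-segmentation are ignored because
# their first field carries leading whitespace and does not match.
tso_state() {
    awk -F': ' '$1 == "tcp-segmentation-offload" { print $2; exit }'
}

# Disable TSO on the given interface only if it is currently on (needs root).
disable_tso_if_on() {
    iface="$1"
    state="$(ethtool -k "$iface" | tso_state)"
    if [ "$state" = "on" ]; then
        ethtool -K "$iface" tso off
    fi
}
```

For example, `ethtool -k enp0s25 | tso_state` should print `off` once the ExecUpPost line has run, and `disable_tso_if_on enp0s25` applies the same change by hand.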

Last edited by adamcstephens (2014-02-08 05:07:19)
