#1 2020-12-18 16:39:03

matsjj
Member
Registered: 2020-12-13
Posts: 3

Connectivity issues due to NETDEV WATCHDOG: transmit queue 3 timed out

Dear community :)

First and foremost, this community is amazing. I came here as a noob, and even though the initial installation has a steep learning curve, I have already learnt a ton. Coming from Debian, Arch feels *so much* snappier!

TLDR of my problem
My home server, which runs various self-hosted services, loses all connectivity at some random point after I disconnect my MacBook Pro from its power/Ethernet hub.

None of the other devices in my network can reach it, nor can it reach the gateway (or any other device). But the second I plug the connector back into my laptop, ping gets an answer again.
The network setup is my ISP router (Fritz!Box 7590) connected to an unmanaged switch. Laptops and phones generally connect to the router via WiFi, but at my desk I have a dedicated hub for my laptop which goes to the switch via Ethernet.

It usually happens between 5 minutes and 2 hours after I unplug my laptop. I have since brought the server up to my desk so that I can use the terminal. As far as I can tell, there is zero change in any of the outputs posted below.

Initially I had Debian installed, but this issue led me to reinstall Debian and ultimately install Arch (glad it did!).
It first happened with the on-board LAN port. I thought it was a hardware issue, so I bought a PCIe card with the same chipset, but had the same problem (Realtek, happens with both r8168 and r8169). I then bought yet another PCIe card, this time Intel (2 ports), and still have the same issue.

As far as I can tell, this happens with a pretty minimal install. I use netctl with a static address as described here
https://ostechnix.com/configure-static- … rch-linux/
But I have been able to replicate this with most other network managers, e.g. dhcpcd.



Some output, let me know about anything else you need

Thank you so much in advance. I've been fighting with this issue for the past 4 weeks.

Demonstration of the actual issue
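ping 192.168.178.1 (gateway IP; the output is from a German locale: "von" = from, "Zeit" = time, "Zielhost nicht erreichbar" = destination host unreachable)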

PING 192.168.178.1 (192.168.178.1) 56(84) bytes of data.
64 Bytes von 192.168.178.1: icmp_seq=1 ttl=64 Zeit=0.754 ms
64 Bytes von 192.168.178.1: icmp_seq=2 ttl=64 Zeit=0.759 ms
64 Bytes von 192.168.178.1: icmp_seq=3 ttl=64 Zeit=0.651 ms
64 Bytes von 192.168.178.1: icmp_seq=4 ttl=64 Zeit=0.917 ms
64 Bytes von 192.168.178.1: icmp_seq=5 ttl=64 Zeit=0.829 ms
64 Bytes von 192.168.178.1: icmp_seq=6 ttl=64 Zeit=0.530 ms
64 Bytes von 192.168.178.1: icmp_seq=7 ttl=64 Zeit=0.778 ms
64 Bytes von 192.168.178.1: icmp_seq=8 ttl=64 Zeit=0.962 ms
64 Bytes von 192.168.178.1: icmp_seq=9 ttl=64 Zeit=0.779 ms
64 Bytes von 192.168.178.1: icmp_seq=10 ttl=64 Zeit=0.883 ms
64 Bytes von 192.168.178.1: icmp_seq=11 ttl=64 Zeit=0.860 ms
64 Bytes von 192.168.178.1: icmp_seq=12 ttl=64 Zeit=1.30 ms
64 Bytes von 192.168.178.1: icmp_seq=13 ttl=64 Zeit=0.614 ms
64 Bytes von 192.168.178.1: icmp_seq=14 ttl=64 Zeit=1.26 ms
64 Bytes von 192.168.178.1: icmp_seq=15 ttl=64 Zeit=0.344 ms
64 Bytes von 192.168.178.1: icmp_seq=16 ttl=64 Zeit=0.514 ms
64 Bytes von 192.168.178.1: icmp_seq=17 ttl=64 Zeit=0.596 ms
64 Bytes von 192.168.178.1: icmp_seq=18 ttl=64 Zeit=0.493 ms
64 Bytes von 192.168.178.1: icmp_seq=19 ttl=64 Zeit=0.773 ms
64 Bytes von 192.168.178.1: icmp_seq=20 ttl=64 Zeit=0.502 ms
[...]
64 Bytes von 192.168.178.1: icmp_seq=7032 ttl=64 Zeit=0.905 ms
64 Bytes von 192.168.178.1: icmp_seq=7033 ttl=64 Zeit=0.602 ms
64 Bytes von 192.168.178.1: icmp_seq=7034 ttl=64 Zeit=0.778 ms
64 Bytes von 192.168.178.1: icmp_seq=7035 ttl=64 Zeit=0.844 ms
64 Bytes von 192.168.178.1: icmp_seq=7036 ttl=64 Zeit=0.993 ms
64 Bytes von 192.168.178.1: icmp_seq=7037 ttl=64 Zeit=0.899 ms
64 Bytes von 192.168.178.1: icmp_seq=7038 ttl=64 Zeit=0.893 ms
64 Bytes von 192.168.178.1: icmp_seq=7039 ttl=64 Zeit=0.841 ms
64 Bytes von 192.168.178.1: icmp_seq=7040 ttl=64 Zeit=22.4 ms
Von 192.168.178.191 icmp_seq=7049 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=7050 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=7051 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=7052 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=7053 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=7054 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=7055 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=7056 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=7057 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=7068 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=7069 Zielhost nicht erreichbar [host not reachable]
[...]
Now at some point I plug in my laptop...
[...]
Von 192.168.178.191 icmp_seq=22471 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=22474 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=22477 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=22480 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=22481 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=22500 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=22501 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=22502 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=22503 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=22507 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=22512 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=22513 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=22515 Zielhost nicht erreichbar
Von 192.168.178.191 icmp_seq=22518 Zielhost nicht erreichbar
64 Bytes von 192.168.178.1: icmp_seq=22253 ttl=64 Zeit=276246 ms
64 Bytes von 192.168.178.1: icmp_seq=22254 ttl=64 Zeit=275233 ms
64 Bytes von 192.168.178.1: icmp_seq=22255 ttl=64 Zeit=274219 ms
64 Bytes von 192.168.178.1: icmp_seq=22256 ttl=64 Zeit=273206 ms
64 Bytes von 192.168.178.1: icmp_seq=22257 ttl=64 Zeit=272194 ms
64 Bytes von 192.168.178.1: icmp_seq=22258 ttl=64 Zeit=271181 ms
64 Bytes von 192.168.178.1: icmp_seq=22259 ttl=64 Zeit=270168 ms
64 Bytes von 192.168.178.1: icmp_seq=22260 ttl=64 Zeit=269154 ms
64 Bytes von 192.168.178.1: icmp_seq=22261 ttl=64 Zeit=268142 ms
64 Bytes von 192.168.178.1: icmp_seq=22317 ttl=64 Zeit=211413 ms
64 Bytes von 192.168.178.1: icmp_seq=22318 ttl=64 Zeit=210401 ms
64 Bytes von 192.168.178.1: icmp_seq=22319 ttl=64 Zeit=209388 ms
64 Bytes von 192.168.178.1: icmp_seq=22320 ttl=64 Zeit=208375 ms
64 Bytes von 192.168.178.1: icmp_seq=22321 ttl=64 Zeit=207362 ms
64 Bytes von 192.168.178.1: icmp_seq=22322 ttl=64 Zeit=206349 ms
64 Bytes von 192.168.178.1: icmp_seq=22323 ttl=64 Zeit=205337 ms
64 Bytes von 192.168.178.1: icmp_seq=22324 ttl=64 Zeit=204324 ms
64 Bytes von 192.168.178.1: icmp_seq=22325 ttl=64 Zeit=203311 ms
64 Bytes von 192.168.178.1: icmp_seq=22371 ttl=64 Zeit=156713 ms
64 Bytes von 192.168.178.1: icmp_seq=22372 ttl=64 Zeit=155700 ms
64 Bytes von 192.168.178.1: icmp_seq=22373 ttl=64 Zeit=154688 ms
64 Bytes von 192.168.178.1: icmp_seq=22374 ttl=64 Zeit=153675 ms
64 Bytes von 192.168.178.1: icmp_seq=22375 ttl=64 Zeit=152663 ms
64 Bytes von 192.168.178.1: icmp_seq=22376 ttl=64 Zeit=151650 ms
64 Bytes von 192.168.178.1: icmp_seq=22377 ttl=64 Zeit=150637 ms
64 Bytes von 192.168.178.1: icmp_seq=22378 ttl=64 Zeit=149625 ms
64 Bytes von 192.168.178.1: icmp_seq=22379 ttl=64 Zeit=148613 ms
64 Bytes von 192.168.178.1: icmp_seq=22380 ttl=64 Zeit=147600 ms
64 Bytes von 192.168.178.1: icmp_seq=22441 ttl=64 Zeit=85809 ms
64 Bytes von 192.168.178.1: icmp_seq=22442 ttl=64 Zeit=84797 ms
64 Bytes von 192.168.178.1: icmp_seq=22443 ttl=64 Zeit=83784 ms
64 Bytes von 192.168.178.1: icmp_seq=22444 ttl=64 Zeit=82771 ms
64 Bytes von 192.168.178.1: icmp_seq=22445 ttl=64 Zeit=81759 ms
64 Bytes von 192.168.178.1: icmp_seq=22446 ttl=64 Zeit=80746 ms
64 Bytes von 192.168.178.1: icmp_seq=22447 ttl=64 Zeit=79733 ms
64 Bytes von 192.168.178.1: icmp_seq=22448 ttl=64 Zeit=78721 ms
64 Bytes von 192.168.178.1: icmp_seq=22449 ttl=64 Zeit=77708 ms
64 Bytes von 192.168.178.1: icmp_seq=22489 ttl=64 Zeit=37244 ms
64 Bytes von 192.168.178.1: icmp_seq=22490 ttl=64 Zeit=36233 ms
64 Bytes von 192.168.178.1: icmp_seq=22491 ttl=64 Zeit=35220 ms
64 Bytes von 192.168.178.1: icmp_seq=22492 ttl=64 Zeit=34207 ms
64 Bytes von 192.168.178.1: icmp_seq=22493 ttl=64 Zeit=33195 ms
64 Bytes von 192.168.178.1: icmp_seq=22494 ttl=64 Zeit=32182 ms
64 Bytes von 192.168.178.1: icmp_seq=22495 ttl=64 Zeit=31170 ms
64 Bytes von 192.168.178.1: icmp_seq=22496 ttl=64 Zeit=30158 ms
64 Bytes von 192.168.178.1: icmp_seq=22497 ttl=64 Zeit=29145 ms
64 Bytes von 192.168.178.1: icmp_seq=22498 ttl=64 Zeit=28133 ms
64 Bytes von 192.168.178.1: icmp_seq=22499 ttl=64 Zeit=27120 ms
64 Bytes von 192.168.178.1: icmp_seq=22525 ttl=64 Zeit=803 ms
64 Bytes von 192.168.178.1: icmp_seq=22526 ttl=64 Zeit=0.912 ms
64 Bytes von 192.168.178.1: icmp_seq=22527 ttl=64 Zeit=0.541 ms
64 Bytes von 192.168.178.1: icmp_seq=22528 ttl=64 Zeit=0.673 ms
64 Bytes von 192.168.178.1: icmp_seq=22529 ttl=64 Zeit=0.501 ms
64 Bytes von 192.168.178.1: icmp_seq=22530 ttl=64 Zeit=0.709 ms
64 Bytes von 192.168.178.1: icmp_seq=22531 ttl=64 Zeit=0.782 ms
64 Bytes von 192.168.178.1: icmp_seq=22532 ttl=64 Zeit=0.742 ms
64 Bytes von 192.168.178.1: icmp_seq=22533 ttl=64 Zeit=0.551 ms
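
One possible reading of the very large round-trip times (Zeit) at the end of the output above, offered as an assumption rather than a confirmed diagnosis: the echo requests were held back during the outage and all answered at nearly the same moment once the link recovered. If so, each later sequence number should show a round-trip time roughly one send interval (about 1 s by default) shorter than the one before. The sketch below checks this on the first five delayed replies copied from the output; the name `rtt_steps` is made up for illustration.

```python
# Round-trip times in ms for the first delayed replies, copied from the
# ping output above (icmp_seq -> Zeit).
delayed_rtts_ms = {
    22253: 276246,
    22254: 275233,
    22255: 274219,
    22256: 273206,
    22257: 272194,
}


def rtt_steps(rtts_ms):
    """Return how much the RTT shrinks from each sequence number to the next."""
    seqs = sorted(rtts_ms)
    return [rtts_ms[a] - rtts_ms[b] for a, b in zip(seqs, seqs[1:])]


print(rtt_steps(delayed_rtts_ms))  # [1013, 1014, 1013, 1012]
```

Steps of roughly 1013 ms, close to one send interval, would fit requests that sat in a queue and went out in one burst, which is at least consistent with a stalled transmit queue.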

In dmesg I found errors like the following, which would explain this to some degree, but I have no idea what causes them, and I had similar issues with all the other network cards.

[ 8820.760184] ------------[ cut here ]------------
[ 8820.760206] NETDEV WATCHDOG: enp16s0f1 (igb): transmit queue 3 timed out
[ 8820.760228] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:442 dev_watchdog+0x26d/0x280
[ 8820.760231] Modules linked in: hid_magicmouse apple_mfi_fastcharge hid_logitech_hidpp joydev mousedev hid_logitech_dj input_leds hid_generic usbhid hid amdgpu gpu_sched ttm snd_hda_codec_hdmi edac_mce_amd snd_hda_intel kvm snd_intel_dspcfg drm_kms_helper snd_hda_codec cec irqbypass snd_hda_core crct10dif_pclmul r8169 snd_hwdep crc32_pclmul snd_pcm rc_core ghash_clmulni_intel snd_timer aesni_intel snd realtek mdio_devres of_mdio syscopyarea sysfillrect crypto_simd sysimgblt fixed_phy fb_sys_fops wmi_bmof soundcore cryptd igb libphy pcspkr glue_helper ccp rapl i2c_algo_bit rng_core dca sp5100_tco i2c_piix4 k10temp evdev mac_hid wmi pinctrl_amd gpio_amdpt drm agpgart fuse ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 uas usb_storage crc32c_intel xhci_pci xhci_pci_renesas xhci_hcd
[ 8820.760285] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.13-arch1-1 #1
[ 8820.760288] Hardware name: Micro-Star International Co., Ltd. MS-7C52/B450M-A PRO MAX (MS-7C52), BIOS 3.00 08/13/2019
[ 8820.760294] RIP: 0010:dev_watchdog+0x26d/0x280
[ 8820.760299] Code: 68 10 79 ff eb 85 4c 89 f7 c6 05 dc b4 0c 01 01 e8 18 a9 fa ff 44 89 e9 4c 89 f6 48 c7 c7 28 62 a0 9b 48 89 c2 e8 d5 fb 15 00 <0f> 0b e9 63 ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44
[ 8820.760303] RSP: 0018:ffffa7a080003e90 EFLAGS: 00010286
[ 8820.760307] RAX: 0000000000000000 RBX: ffff901595ab18c0 RCX: 0000000000000000
[ 8820.760311] RDX: 0000000000000103 RSI: ffffffff9b959b0f RDI: 00000000ffffffff
[ 8820.760314] RBP: ffff9015953f43dc R08: 00000000000004d6 R09: 0000000000000001
[ 8820.760317] R10: 0000000000000000 R11: 0000000000000001 R12: ffff9015953f4480
[ 8820.760319] R13: 0000000000000003 R14: ffff9015953f4000 R15: ffff901595ab1940
[ 8820.760324] FS:  0000000000000000(0000) GS:ffff901598800000(0000) knlGS:0000000000000000
[ 8820.760327] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8820.760330] CR2: 00007fb0edff5ff8 CR3: 000000021518c000 CR4: 00000000003506f0
[ 8820.760333] Call Trace:
[ 8820.760338]  <IRQ>
[ 8820.760346]  ? pfifo_fast_init+0x110/0x110
[ 8820.760350]  ? pfifo_fast_init+0x110/0x110
[ 8820.760356]  call_timer_fn+0x2d/0x160
[ 8820.760361]  ? pfifo_fast_init+0x110/0x110
[ 8820.760365]  __run_timers+0x1ec/0x280
[ 8820.760371]  run_timer_softirq+0x2b/0x50
[ 8820.760377]  __do_softirq+0xff/0x344
[ 8820.760384]  asm_call_irq_on_stack+0x12/0x20
[ 8820.760387]  </IRQ>
[ 8820.760393]  do_softirq_own_stack+0x5d/0x80
[ 8820.760399]  irq_exit_rcu+0xd8/0x120
[ 8820.760406]  sysvec_apic_timer_interrupt+0x47/0xe0
[ 8820.760411]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 8820.760416] RIP: 0010:native_safe_halt+0xe/0x10
[ 8820.760421] Code: f0 80 48 02 20 48 8b 00 a8 08 75 c3 e9 7a ff ff ff cc cc cc cc cc cc cc cc cc cc cc e9 07 00 00 00 0f 00 2d 16 85 5d 00 fb f4 <c3> 90 e9 07 00 00 00 0f 00 2d 06 85 5d 00 f4 c3 cc cc 0f 1f 44 00
[ 8820.760424] RSP: 0018:ffffffff9be03e10 EFLAGS: 00000246
[ 8820.760428] RAX: 0000000000004000 RBX: ffff901593e6a000 RCX: 000000000000001f
[ 8820.760431] RDX: ffff901598800000 RSI: ffff901596c17c00 RDI: ffff901596c17c64
[ 8820.760434] RBP: ffff901596c17c64 R08: 00000805be460398 R09: 00000805b43b307f
[ 8820.760437] R10: 0000000000013818 R11: 0000000000000030 R12: 0000000000000001
[ 8820.760440] R13: ffffffff9bed4140 R14: 0000000000000001 R15: 000000005b5c1000
[ 8820.760448]  acpi_idle_do_entry+0x46/0x50
[ 8820.760455]  acpi_idle_enter+0xa3/0xf0
[ 8820.760462]  cpuidle_enter_state+0x8c/0x3c0
[ 8820.760467]  cpuidle_enter+0x29/0x40
[ 8820.760473]  do_idle+0x1ed/0x280
[ 8820.760478]  cpu_startup_entry+0x19/0x20
[ 8820.760485]  start_kernel+0x867/0x88c
[ 8820.760493]  secondary_startup_64+0xb6/0xc0
[ 8820.760499] ---[ end trace 606410dad22440aa ]---
[ 8820.760799] igb 0000:10:00.1 enp16s0f1: Reset adapter
[ 8824.264584] igb 0000:10:00.1 enp16s0f1: igb: enp16s0f1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[ 8904.600626] igb 0000:10:00.1 enp16s0f1: Reset adapter
[ 8908.167476] igb 0000:10:00.1 enp16s0f1: igb: enp16s0f1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX

cat /etc/netctl/enp16s0f1

Description='A basic static ethernet connection'
Interface=enp16s0f1
Connection=ethernet
IP=static
Address=('192.168.178.191/24')
Gateway='192.168.178.1'
DNS=('192.168.178.1')

ifconfig -a

enp16s0f0: flags=4098<BROADCAST,MULTICAST>  mtu 1500
        ether 80:61:5f:08:2c:4e  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device memory 0xf7420000-f743ffff

enp16s0f1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.178.191  netmask 255.255.255.0  broadcast 192.168.178.255
        ether 80:61:5f:08:2c:4f  txqueuelen 1000  (Ethernet)
        RX packets 1509  bytes 109289 (106.7 KiB)
        RX errors 0  dropped 777  overruns 0  frame 0
        TX packets 284  bytes 34393 (33.5 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device memory 0xf7400000-f741ffff

enp37s0: flags=4098<BROADCAST,MULTICAST>  mtu 1500
        ether 00:d8:61:c2:95:b8  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 4  bytes 200 (200.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 4  bytes 200 (200.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lspci -knn

00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Root Complex [1022:15d0]
	Subsystem: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Root Complex [1022:15d0]
00:00.2 IOMMU [0806]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 IOMMU [1022:15d1]
	Subsystem: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 IOMMU [1022:15d1]
00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 PCIe GPP Bridge [6:0] [1022:15d3]
	Kernel driver in use: pcieport
00:01.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 PCIe GPP Bridge [6:0] [1022:15d3]
	Kernel driver in use: pcieport
00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Internal PCIe GPP Bridge 0 to Bus A [1022:15db]
	Kernel driver in use: pcieport
00:08.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Internal PCIe GPP Bridge 0 to Bus B [1022:15dc]
	Kernel driver in use: pcieport
00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 61)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:7c52]
	Kernel driver in use: piix4_smbus
	Kernel modules: i2c_piix4, sp5100_tco
00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:7c52]
00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 0 [1022:15e8]
00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 1 [1022:15e9]
00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 2 [1022:15ea]
00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 3 [1022:15eb]
	Kernel driver in use: k10temp
	Kernel modules: k10temp
00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 4 [1022:15ec]
00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 5 [1022:15ed]
00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 6 [1022:15ee]
00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 7 [1022:15ef]
10:00.0 Ethernet controller [0200]: Intel Corporation 82576 Gigabit Network Connection [8086:10c9] (rev 01)
	Subsystem: Device [1dcf:0319]
	Kernel driver in use: igb
	Kernel modules: igb
10:00.1 Ethernet controller [0200]: Intel Corporation 82576 Gigabit Network Connection [8086:10c9] (rev 01)
	Subsystem: Device [1dcf:0319]
	Kernel driver in use: igb
	Kernel modules: igb
12:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller [1022:43d5] (rev 01)
	Subsystem: ASMedia Technology Inc. Device [1b21:1142]
	Kernel driver in use: xhci_hcd
	Kernel modules: xhci_pci
12:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller [1022:43c8] (rev 01)
	Subsystem: ASMedia Technology Inc. Device [1b21:1062]
	Kernel driver in use: ahci
12:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Bridge [1022:43c6] (rev 01)
	Kernel driver in use: pcieport
20:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
	DeviceName: Broadcom 5762
	Kernel driver in use: pcieport
20:01.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
	Kernel driver in use: pcieport
20:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
	Kernel driver in use: pcieport
20:05.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
	Kernel driver in use: pcieport
20:06.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
	Kernel driver in use: pcieport
20:07.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
	Kernel driver in use: pcieport
25:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 15)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:7c52]
	Kernel driver in use: r8169
	Kernel modules: r8169
29:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Picasso [1002:15d8] (rev c9)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:7c52]
	Kernel driver in use: amdgpu
	Kernel modules: amdgpu
29:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Raven/Raven2/Fenghuang HDMI/DP Audio Controller [1002:15de]
	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Raven/Raven2/Fenghuang HDMI/DP Audio Controller [1002:15de]
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel
29:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor [1022:15df]
	Subsystem: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor [1022:15df]
	Kernel driver in use: ccp
	Kernel modules: ccp
29:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Raven USB 3.1 [1022:15e0]
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:7c52]
	Kernel driver in use: xhci_hcd
	Kernel modules: xhci_pci
29:00.4 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Raven USB 3.1 [1022:15e1]
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:7c52]
	Kernel driver in use: xhci_hcd
	Kernel modules: xhci_pci
2a:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 61)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:7c52]
	Kernel driver in use: ahci

ip link

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp37s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:d8:61:c2:95:b8 brd ff:ff:ff:ff:ff:ff
3: enp16s0f0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 80:61:5f:08:2c:4e brd ff:ff:ff:ff:ff:ff
4: enp16s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 80:61:5f:08:2c:4f brd ff:ff:ff:ff:ff:ff

ip route

default via 192.168.178.1 dev enp16s0f1
192.168.178.0/24 dev enp16s0f1 proto kernel scope link src 192.168.178.191

hostnamectl

   Static hostname: smarthome
         Icon name: computer-desktop
           Chassis: desktop
        Machine ID: 4711987ec8ab4bcaac179d12ef05b10c
           Boot ID: 5527df14cb234000a048d5c4c0202027
  Operating System: Arch Linux
            Kernel: Linux 5.9.13-arch1-1
      Architecture: x86-64

systemctl list-unit-files --state=enabled

UNIT FILE        STATE   VENDOR PRESET
cronie.service   enabled disabled
getty@.service   enabled enabled
sshd.service     enabled disabled
remote-fs.target enabled enabled

4 unit files listed.

Last edited by matsjj (2020-12-22 07:12:13)

#2 2020-12-22 07:03:11

matsjj

Re: Connectivity issues due to NETDEV WATCHDOG: transmit queue 3 timed out

I fiddled some more with the BIOS, and indeed, for a while it worked and I was getting confident.

But just now it happened again, and this time dmesg showed the following:

[46854.421530] ------------[ cut here ]------------
[46854.421559] NETDEV WATCHDOG: enp16s0f1 (igb): transmit queue 3 timed out
[46854.421581] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:442 dev_watchdog+0x26d/0x280
[46854.421585] Modules linked in: tun veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c br_netfilter bridge stp llc overlay amdgpu snd_hda_codec_hdmi snd_hda_intel gpu_sched snd_intel_dspcfg snd_hda_codec ttm drm_kms_helper edac_mce_amd snd_hda_core cec rc_core kvm syscopyarea irqbypass sysfillrect sysimgblt r8169 fb_sys_fops snd_hwdep crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd snd_pcm realtek cryptd mdio_devres of_mdio wmi_bmof snd_timer glue_helper snd fixed_phy soundcore rapl libphy pcspkr igb ccp sp5100_tco i2c_algo_bit rng_core dca k10temp i2c_piix4 evdev mac_hid wmi pinctrl_amd gpio_amdpt drm fuse agpgart ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 uas usb_storage crc32c_intel xhci_pci xhci_pci_renesas xhci_hcd
[46854.421649] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.13-arch1-1 #1
[46854.421656] Hardware name: Micro-Star International Co., Ltd. MS-7C52/B450M-A PRO MAX (MS-7C52), BIOS 3.00 08/13/2019
[46854.421661] RIP: 0010:dev_watchdog+0x26d/0x280
[46854.421667] Code: 68 10 79 ff eb 85 4c 89 f7 c6 05 dc b4 0c 01 01 e8 18 a9 fa ff 44 89 e9 4c 89 f6 48 c7 c7 28 62 00 97 48 89 c2 e8 d5 fb 15 00 <0f> 0b e9 63 ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44
[46854.421670] RSP: 0018:ffff9d5880003e90 EFLAGS: 00010286
[46854.421675] RAX: 0000000000000000 RBX: ffff8f6a062598c0 RCX: 0000000000000000
[46854.421678] RDX: 0000000000000103 RSI: ffffffff96f59b0f RDI: 00000000ffffffff
[46854.421681] RBP: ffff8f6a078003dc R08: 00000000000004b1 R09: 0000000000000001
[46854.421684] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8f6a07800480
[46854.421687] R13: 0000000000000003 R14: ffff8f6a07800000 R15: ffff8f6a06259940
[46854.421691] FS:  0000000000000000(0000) GS:ffff8f6a18600000(0000) knlGS:0000000000000000
[46854.421695] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[46854.421698] CR2: 00007fd58a1a7300 CR3: 0000000191984000 CR4: 00000000003506f0
[46854.421701] Call Trace:
[46854.421706]  <IRQ>
[46854.421714]  ? pfifo_fast_init+0x110/0x110
[46854.421718]  ? pfifo_fast_init+0x110/0x110
[46854.421724]  call_timer_fn+0x2d/0x160
[46854.421729]  ? pfifo_fast_init+0x110/0x110
[46854.421733]  __run_timers+0x1ec/0x280
[46854.421739]  run_timer_softirq+0x2b/0x50
[46854.421746]  __do_softirq+0xff/0x344
[46854.421752]  asm_call_irq_on_stack+0x12/0x20
[46854.421756]  </IRQ>
[46854.421762]  do_softirq_own_stack+0x5d/0x80
[46854.421768]  irq_exit_rcu+0xd8/0x120
[46854.421775]  sysvec_apic_timer_interrupt+0x47/0xe0
[46854.421780]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[46854.421786] RIP: 0010:native_safe_halt+0xe/0x10
[46854.421791] Code: f0 80 48 02 20 48 8b 00 a8 08 75 c3 e9 7a ff ff ff cc cc cc cc cc cc cc cc cc cc cc e9 07 00 00 00 0f 00 2d 16 85 5d 00 fb f4 <c3> 90 e9 07 00 00 00 0f 00 2d 06 85 5d 00 f4 c3 cc cc 0f 1f 44 00
[46854.421794] RSP: 0018:ffffffff97403e10 EFLAGS: 00000246
[46854.421798] RAX: 0000000000004000 RBX: ffff8f6a13ab1800 RCX: 000000000000001f
[46854.421801] RDX: ffff8f6a18600000 RSI: ffff8f6a16f5ec00 RDI: ffff8f6a16f5ec64
[46854.421804] RBP: ffff8f6a16f5ec64 R08: 00002a9d24cdd77c R09: 0000000000000018
[46854.421807] R10: 0000000000002575 R11: 0000000000001224 R12: 0000000000000001
[46854.421810] R13: ffffffff974d4140 R14: 0000000000000001 R15: 00000000db8b3000
[46854.421819]  acpi_idle_do_entry+0x46/0x50
[46854.421826]  acpi_idle_enter+0xa3/0xf0
[46854.421833]  cpuidle_enter_state+0x8c/0x3c0
[46854.421838]  cpuidle_enter+0x29/0x40
[46854.421844]  do_idle+0x1ed/0x280
[46854.421849]  cpu_startup_entry+0x19/0x20
[46854.421856]  start_kernel+0x867/0x88c
[46854.421864]  secondary_startup_64+0xb6/0xc0
[46854.421870] ---[ end trace 4215a9b39f50a168 ]---
[46854.421924] igb 0000:10:00.1 enp16s0f1: Reset adapter
[46857.822159] igb 0000:10:00.1 enp16s0f1: igb: enp16s0f1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[47413.354830] igb 0000:10:00.1 enp16s0f1: Reset adapter
[47416.654976] igb 0000:10:00.1 enp16s0f1: igb: enp16s0f1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[47494.287262] audit: type=1101 audit(1608620348.496:358): pid=44152 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:accounting grantors=pam_access,pam_unix,pam_permit,pam_time acct="mats" exe="/usr/bin/sshd" hostname=192.168.178.23 addr=192.168.178.23 terminal=ssh res=success'
[47494.288014] audit: type=1103 audit(1608620348.496:359): pid=44152 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:setcred grantors=pam_shells,pam_faillock,pam_permit,pam_env,pam_faillock acct="mats" exe="/usr/bin/sshd" hostname=192.168.178.23 addr=192.168.178.23 terminal=ssh res=success'
[47494.288064] audit: type=1006 audit(1608620348.496:360): pid=44152 uid=0 old-auid=4294967295 auid=1000 tty=(none) old-ses=4294967295 ses=2 res=1
[47494.377432] audit: type=1130 audit(1608620348.582:361): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=user-runtime-dir@1000 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[47494.382168] audit: type=1101 audit(1608620348.589:362): pid=44155 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:accounting grantors=pam_access,pam_unix,pam_permit,pam_time acct="mats" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[47494.382172] audit: type=1103 audit(1608620348.589:363): pid=44155 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:setcred grantors=? acct="mats" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
[47494.382219] audit: type=1006 audit(1608620348.589:364): pid=44155 uid=0 old-auid=4294967295 auid=1000 tty=(none) old-ses=4294967295 ses=3 res=1
[47494.382793] audit: type=1105 audit(1608620348.589:365): pid=44155 uid=0 auid=1000 ses=3 msg='op=PAM:session_open grantors=pam_loginuid,pam_loginuid,pam_keyinit,pam_limits,pam_unix,pam_permit,pam_mail,pam_systemd,pam_env acct="mats" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[47494.386402] audit: type=1334 audit(1608620348.592:366): prog-id=9 op=LOAD
[47494.386407] audit: type=1334 audit(1608620348.592:367): prog-id=9 op=UNLOAD
[47501.087534] kauditd_printk_skb: 3 callbacks suppressed
[47501.087537] audit: type=1101 audit(1608620355.292:371): pid=44185 uid=1000 auid=1000 ses=2 msg='op=PAM:accounting grantors=pam_unix,pam_permit,pam_time acct="mats" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/0 res=success'
[47501.087627] audit: type=1110 audit(1608620355.292:372): pid=44185 uid=1000 auid=1000 ses=2 msg='op=PAM:setcred grantors=pam_faillock,pam_permit,pam_env,pam_faillock acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/0 res=success'
[47501.087715] audit: type=1105 audit(1608620355.296:373): pid=44185 uid=1000 auid=1000 ses=2 msg='op=PAM:session_open grantors=pam_limits,pam_unix,pam_permit acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/0 res=success'

Does anyone have an idea of where else to look for this kind of issue?

EDIT:
From here
https://community.nxp.com/t5/i-MX-Proce … m-p/546299

I found this comment

In general, useful information might include:
- was this preceded by any interface reconfiguration or link changes?
- extended network stats (ethtool -S)
- MDIO register dump (mii-tool -vv) (if the interface has an MDIO PHY) 

Having seen this error many times with different causes, I wrote a short summary for the support team here, which (with some references removed) may be generally useful: 

The watchdog will fire if all these conditions are met:
1. The interface is up
2. A TX queue is stopped (normally because it is full)
3. No packets have been added to the queue in the last 5 seconds
4. The driver has not told the kernel that the device is unable to transmit now (e.g. link is down). 

Conditions 2 and 3 together normally mean that the TX queue has been stopped for 5 seconds and therefore that few packets (not necessarily none at all) have been completed in that time.  The time taken for individual packets to be completed is *not* considered. 

This can happen due to:
a. Driver bug causing conditions 2 and 4 to be true during reconfiguration
b. MAC blocked by a pause frame flood
c. IRQ handling is delayed by a long time (can happen due to excessive serial logging)
d. Firmware bug causes driver to see link as up when it's not
e. Hardware fault (always a possibility)
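
The four watchdog conditions quoted above can be sketched as a single predicate. This is a simplified model for reasoning about the message, not the kernel's actual code; the function name, the parameter names and the 5 second default (taken from the quoted comment) are assumptions made for illustration.

```python
def watchdog_fires(interface_up, queue_stopped, carrier_ok,
                   seconds_since_last_tx, timeout_s=5.0):
    """Simplified model of the quoted conditions for a TX timeout warning.

    interface_up:          condition 1, the interface is up
    queue_stopped:         condition 2, the TX queue is stopped (usually full)
    seconds_since_last_tx: condition 3, time since a packet was last queued
    carrier_ok:            condition 4, the driver has NOT reported that the
                           device cannot transmit (e.g. link down)
    """
    return (interface_up
            and queue_stopped
            and seconds_since_last_tx > timeout_s
            and carrier_ok)


# Link reported as up, queue stuck for 6 seconds: the warning would fire.
print(watchdog_fires(True, True, True, 6.0))   # True
# Same stall, but the driver reported the link as down: no warning.
print(watchdog_fires(True, True, False, 6.0))  # False
```

Read this way, the warning only says that a stopped queue made no progress while the link was reported as up; it does not say why the queue stalled.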

Now this exact error occurs on both Debian and Arch, with the onboard Ethernet port, and two distinct PCIe Ethernet cards with two different drivers.
As such I would rule out a. and d.
I don't know enough about the underlying concepts to understand b. and c. and would appreciate further pointers in that direction.
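
For cause b., one way to look for a pause frame flood is to compare the flow control counters in `ethtool -S` output taken a few seconds apart. The sketch below is a minimal helper under assumptions: the keyword list used to pick counters is a guess (counter names differ between drivers), and the function names are made up for illustration.

```python
import subprocess


def parse_ethtool_stats(text):
    """Parse `ethtool -S` output ('name: value' lines) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.rpartition(":")
        value = value.strip()
        # Skips the header line and anything without a numeric value.
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def flow_control_deltas(before, after,
                        keywords=("flow_control", "pause", "xon", "xoff")):
    """Return the change of every counter whose name contains a keyword."""
    deltas = {}
    for name, new_value in after.items():
        if any(k in name.lower() for k in keywords):
            deltas[name] = new_value - before.get(name, 0)
    return deltas


def read_stats(interface):
    """Run `ethtool -S <interface>` and parse its output (may need root)."""
    out = subprocess.run(["ethtool", "-S", interface], check=True,
                         capture_output=True, text=True).stdout
    return parse_ethtool_stats(out)
```

Calling `read_stats("enp16s0f1")` twice while the problem is occurring and passing both results to `flow_control_deltas` shows whether the receive-side pause counters are climbing quickly. The dmesg lines above report "Flow Control: RX/TX", so flow control is negotiated on this link; as an experiment it can be switched off with `ethtool -A enp16s0f1 rx off tx off` to see whether the timeouts stop.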

Last edited by matsjj (2020-12-22 07:11:26)

#3 2021-01-02 11:49:13

matsjj

Re: Connectivity issues due to NETDEV WATCHDOG: transmit queue 3 timed out

I solved this today.
After replacing other key components, such as the switch, it turned out that the USB-C hub of my MacBook was simply at fault.
It looks like there is little regulation of those components, and this particular device basically created a network storm when it
- was connected to the charger to pass power through to the MacBook
- had an Ethernet cable connected
- was itself *not* connected to any downstream device like the MacBook, basically leaving it to its own devices, but powered through the charger

Some more insight here for anyone who finds this
https://old.reddit.com/r/mac/comments/a … _wireless/
