I'm using this ethernet dongle as a libvirt macvtap passthrough to a VM. It worked great until about a month ago, when it started to randomly stop working with the following error:
[32666.311689] ax88179_178a 2-1.4.1:1.0 ethlan0: Failed to read reg index 0x0000: -110
[32671.431771] ax88179_178a 2-1.4.1:1.0 ethlan0: Failed to read reg index 0x0001: -110
[32676.551716] ax88179_178a 2-1.4.1:1.0 ethlan0: Failed to read reg index 0x0009: -110
[32681.671638] ax88179_178a 2-1.4.1:1.0 ethlan0: Failed to read reg index 0x000a: -110
With the latest kernel it now crashes with the following watchdog trace. It works fine with the LTS kernel.
Anyone else having similar issues?
[ 2513.808887] ------------[ cut here ]------------
[ 2513.808908] NETDEV WATCHDOG: ethwan0 (ax88179_178a): transmit queue 0 timed out
[ 2513.809053] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:529 dev_watchdog+0x20b/0x220
[ 2513.809067] Modules linked in: wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha ip6_udp_tunnel udp_tunnel veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c br_netfilter bridge stp llc ksmbd crc32_generic rdma_cm iw_cm ib_cm ib_core cifs_arc4 overlay vhost_net tun vhost vhost_iotlb macvtap macvlan tap mousedev ax88179_178a usbnet mii joydev snd_hda_codec_hdmi hid_holtek_mouse usbhid uas usb_storage snd_sof_pci_intel_apl snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda intel_rapl_msr intel_rapl_common snd_sof_pci snd_sof_xtensa_dsp snd_sof intel_pmc_bxt snd_sof_utils intel_telemetry_pltdrv soundwire_bus intel_punit_ipc intel_telemetry_core snd_soc_skl snd_hda_codec_realtek snd_soc_hdac_hda snd_hda_codec_generic snd_hda_ext_core snd_soc_sst_ipc
[ 2513.809177] x86_pkg_temp_thermal intel_powerclamp snd_soc_sst_dsp coretemp snd_soc_acpi_intel_match snd_soc_acpi kvm_intel snd_soc_core kvm snd_compress ac97_bus irqbypass crct10dif_pclmul snd_pcm_dmaengine crc32_pclmul snd_hda_intel ghash_clmulni_intel snd_intel_dspcfg dell_wmi nls_iso8859_1 snd_intel_sdw_acpi serio ledtrig_audio vfat snd_hda_codec fat aesni_intel dell_smbios wmi_bmof dell_wmi_descriptor mei_pxp sparse_keymap crypto_simd mei_hdcp rfkill i915 snd_hda_core cryptd ee1004 ucsi_acpi rapl dcdbas snd_hwdep r8169 typec_ucsi intel_cstate snd_pcm pcspkr realtek typec drm_buddy mmc_block wmi roles ttm mdio_devres snd_timer snd drm_dp_helper i2c_i801 video intel_gtt mei_me soundcore libphy i2c_smbus mac_hid intel_lpss_pci intel_lpss mei idma64 dm_multipath dm_mod fuse bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 sdhci_pci cqhci sdhci crc32c_intel xhci_pci mmc_core xhci_pci_renesas
[ 2513.809286] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.18.15-arch1-1 #1 9ff3be2e7813d5f2c07119812e1642852fe6c646
[ 2513.809291] Hardware name: Dell Inc. Wyse 5070 Thin Client/0K6VXP, BIOS 1.14.0 11/11/2021
[ 2513.809293] RIP: 0010:dev_watchdog+0x20b/0x220
[ 2513.809297] Code: ff e9 40 ff ff ff 48 89 df c6 05 56 00 3f 01 01 e8 ea 74 f9 ff 44 89 e9 48 89 de 48 c7 c7 60 d0 96 91 48 89 c2 e8 b2 60 19 00 <0f> 0b e9 22 ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3
[ 2513.809299] RSP: 0018:ffffa9ae0012ce90 EFLAGS: 00010282
[ 2513.809301] RAX: 0000000000000000 RBX: ffff9bc584550000 RCX: 0000000000000027
[ 2513.809303] RDX: ffff9bc8efca16a8 RSI: 0000000000000001 RDI: ffff9bc8efca16a0
[ 2513.809305] RBP: ffff9bc5845504c8 R08: 0000000000000000 R09: ffffa9ae0012cca0
[ 2513.809306] R10: 0000000000000003 R11: ffffffff920caa08 R12: ffff9bc58455041c
[ 2513.809308] R13: 0000000000000000 R14: ffffffff90dd5a80 R15: ffff9bc5845504c8
[ 2513.809309] FS: 0000000000000000(0000) GS:ffff9bc8efc80000(0000) knlGS:0000000000000000
[ 2513.809311] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2513.809313] CR2: 00007f87a6bd8d40 CR3: 00000002e0210000 CR4: 0000000000352ee0
[ 2513.809315] Call Trace:
[ 2513.809319] <IRQ>
[ 2513.809323] ? pfifo_fast_reset+0x140/0x140
[ 2513.809327] call_timer_fn+0x24/0x130
[ 2513.809332] __run_timers+0x21c/0x2a0
[ 2513.809336] run_timer_softirq+0x1d/0x40
[ 2513.809338] __do_softirq+0xd0/0x2c9
[ 2513.809344] ? sched_clock_cpu+0xd/0xb0
[ 2513.809349] __irq_exit_rcu+0x8e/0xc0
[ 2513.809353] sysvec_apic_timer_interrupt+0x72/0x90
[ 2513.809358] </IRQ>
[ 2513.809358] <TASK>
[ 2513.809360] asm_sysvec_apic_timer_interrupt+0x19/0x20
[ 2513.809364] RIP: 0010:cpuidle_enter_state+0xdc/0x380
[ 2513.809369] Code: 00 00 31 ff e8 85 b5 7e ff 45 84 ff 74 16 9c 58 0f 1f 40 00 f6 c4 02 0f 85 92 02 00 00 31 ff e8 ca a2 84 ff fb 0f 1f 44 00 00 <45> 85 f6 0f 88 25 01 00 00 49 63 ce 48 8d 04 49 48 8d 04 81 49 8d
[ 2513.809370] RSP: 0018:ffffa9ae000dfe90 EFLAGS: 00000246
[ 2513.809372] RAX: ffff9bc8efcb2cc0 RBX: 0000000000000004 RCX: 0000000000000000
[ 2513.809374] RDX: 000002494aaec5a3 RSI: fffffffb82b16afe RDI: 0000000000000000
[ 2513.809375] RBP: ffff9bc8efcbe100 R08: 0000000000000000 R09: 0000000055785785
[ 2513.809377] R10: 0000000000000018 R11: 0000000000000c76 R12: ffffffff9214be00
[ 2513.809378] R13: 000002494aaec5a3 R14: 0000000000000004 R15: 0000000000000000
[ 2513.809385] cpuidle_enter+0x2d/0x40
[ 2513.809388] do_idle+0x1ba/0x220
[ 2513.809391] cpu_startup_entry+0x1d/0x20
[ 2513.809393] start_secondary+0x11c/0x140
[ 2513.809397] secondary_startup_64_no_verify+0xd5/0xdb
[ 2513.809404] </TASK>
[ 2513.809405] ---[ end trace 0000000000000000 ]---
[ 3009.267672] audit: type=1106 audit(1659274570.477:358): pid=3308 uid=1000 auid=1000 ses=4 msg='op=PAM:session_close grantors=pam_systemd_home,pam_limits,pam_unix,pam_permit
Offline
-110 is ETIMEDOUT, i.e. a timeout.
- Do you have the same problems when *not* passing the device through to the VM?
- Which kernel did you use before 5.18.15? 5.18.14?
- Are there other devices on the same hub ("lsusb -tv")?
- Do you use some power saving tools?
- What if you disable usb autosuspend?
https://wiki.archlinux.org/title/Power_ … utosuspend - "usbcore.autosuspend=-1" will globally disable it, but note that the aforementioned userspace tools will probably override that, so you'll have to configure the behavior there.
- Is there context in the system journal?
(Stuff that happens in userspace and might trigger the behavior)
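As a small aid for reading the negative codes in kernel log lines like the ones above, here is a minimal sketch that decodes them. The table is hard-coded from the Linux generic errno values (71 = EPROTO, 104 = ECONNRESET, 110 = ETIMEDOUT); the function name `decode_kernel_errors` and the regular expression are assumptions made for illustration, not part of any existing tool.

```python
import re

# Linux generic errno values (include/uapi/asm-generic/errno.h), hard-coded so
# the result does not depend on the platform running this script.
LINUX_ERRNO = {
    71: ("EPROTO", "protocol error"),
    104: ("ECONNRESET", "connection reset by peer"),
    110: ("ETIMEDOUT", "connection timed out"),
}

# Matches a trailing negative code such as ": -110" or "error -71".
_CODE_RE = re.compile(r"(?:error|:)\s*-(\d+)\s*$")


def decode_kernel_errors(log_text):
    """Return (line, name, description) for each line ending in a known -errno."""
    results = []
    for line in log_text.splitlines():
        match = _CODE_RE.search(line)
        if not match:
            continue
        code = int(match.group(1))
        if code in LINUX_ERRNO:
            name, description = LINUX_ERRNO[code]
            results.append((line.strip(), name, description))
    return results


if __name__ == "__main__":
    sample = (
        "[32666.311689] ax88179_178a 2-1.4.1:1.0 ethlan0: "
        "Failed to read reg index 0x0000: -110\n"
    )
    for line, name, description in decode_kernel_errors(sample):
        print(f"{name} ({description}): {line}")
```

Codes that are not in the table are skipped silently, so the sketch only ever labels values it actually knows.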
Offline
Thanks for the reply,
1. I did try another ax88179 USB adapter in a different computer and eventually got the same error.
2. I think the last stable kernel for this adapter was 5.17.x.
3. Yes, but on the other laptop I tested, nothing else was plugged into the USB ports.
4. No power saving tools.
5. Autosuspend is not enabled, but I will give that kernel parameter a try.
6. Didn't see anything out of the ordinary.
I do have a Realtek dual adapter coming sometime today; we'll see if that exhibits the same behavior.
Offline
5. Autosuspend is not enabled, but I will give that kernel parameter a try.
Frt, the default autosuspend timeout is 2 seconds.
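To check what is actually in effect rather than assuming, the values can be read from sysfs. The following is a minimal sketch: the paths `/sys/module/usbcore/parameters/autosuspend` and `/sys/bus/usb/devices/*/power/control` are the standard kernel locations, while the function names and the `sysfs_root` parameter (there only so the code can be pointed at a test directory) are assumptions for illustration.

```python
from pathlib import Path


def read_autosuspend(sysfs_root="/sys"):
    """Return the global usbcore autosuspend delay in seconds, or None if unreadable."""
    param = Path(sysfs_root) / "module" / "usbcore" / "parameters" / "autosuspend"
    try:
        return int(param.read_text().strip())
    except (OSError, ValueError):
        return None


def device_power_control(sysfs_root="/sys"):
    """Map each USB device directory name to its power/control value."""
    devices = Path(sysfs_root) / "bus" / "usb" / "devices"
    settings = {}
    if not devices.is_dir():
        return settings
    for dev in sorted(devices.iterdir()):
        control = dev / "power" / "control"
        try:
            settings[dev.name] = control.read_text().strip()
        except OSError:
            continue  # skip entries without a readable power/control file
    return settings


if __name__ == "__main__":
    delay = read_autosuspend()
    print(f"usbcore.autosuspend = {delay}")
    for name, control in device_power_control().items():
        print(f"{name}: power/control = {control}")
```

A negative global value means autosuspend is disabled by default for newly connected devices. Per device, `auto` means runtime suspend is allowed and `on` means the device is kept awake.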
Offline
This looks like the culprit:
https://lore.kernel.org/lkml/2022072716 … ation.org/
Problems observed:
======================================================================
1) Using ssh/sshfs. The remote sshd daemon can abort with the message:
"message authentication code incorrect"
This happens because the tcp message sent is corrupted during the
USB "Bulk out". The device calculate the tcp checksum and send a
valid tcp message to the remote sshd. Then the encryption detects
the error and aborts.
2) NETDEV WATCHDOG: ... (ax88179_178a): transmit queue 0 timed out
3) Stop normal work without any log message.
The "Bulk in" continue receiving packets normally.
The host sends "Bulk out" and the device responds with -ECONNRESET.
(The netusb.c code tx_complete ignore -ECONNRESET)
Under normal conditions these errors take days to happen and in
intense usage take hours.
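Symptom 2 in the list above leaves a recognizable line in the kernel log. Here is a minimal sketch for pulling such lines out of saved `dmesg` or `journalctl -k` output. The regular expression is modeled only on the watchdog line posted earlier in this thread, and the function name `find_tx_timeouts` is hypothetical.

```python
import re

# Pattern for the netdev watchdog line as it appears in this thread.
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)


def find_tx_timeouts(log_text, driver="ax88179_178a"):
    """Return (interface, queue) pairs for watchdog timeouts from the given driver."""
    hits = []
    for match in WATCHDOG_RE.finditer(log_text):
        if match.group("driver") == driver:
            hits.append((match.group("iface"), int(match.group("queue"))))
    return hits


if __name__ == "__main__":
    import sys

    # Example: journalctl -k | python this_script.py
    for iface, queue in find_tx_timeouts(sys.stdin.read()):
        print(f"{iface}: transmit queue {queue} timed out")
```

This only detects symptom 2. Symptoms 1 and 3 leave no driver message in the log, so a clean result does not rule out the bug.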
Offline
I'm facing this too.
After some upgrade (I don't remember the kernel version), USB tethering with my Samsung S22 failed.
Aug 02 08:57:41 universe kernel: usb 1-5: new high-speed USB device number 5 using xhci_hcd
Aug 02 08:57:47 universe kernel: usb 1-5: device descriptor read/64, error -110
Aug 02 08:57:57 universe kernel: usb 1-5: device descriptor read/64, error -71
Aug 02 08:57:57 universe kernel: usb 1-5: new high-speed USB device number 6 using xhci_hcd
Aug 02 08:57:58 universe kernel: usb 1-5: device descriptor read/64, error -71
Aug 02 08:57:58 universe kernel: usb 1-5: device descriptor read/64, error -71
Aug 02 08:57:58 universe kernel: usb usb1-port5: attempt power cycle
Aug 02 08:57:58 universe kernel: usb 1-5: new high-speed USB device number 7 using xhci_hcd
Aug 02 08:57:58 universe kernel: usb 1-5: Device not responding to setup address.
Aug 02 08:57:59 universe kernel: usb 1-5: Device not responding to setup address.
Aug 02 08:57:59 universe kernel: usb 1-5: device not accepting address 7, error -71
Aug 02 08:57:59 universe kernel: usb 1-5: new high-speed USB device number 8 using xhci_hcd
Aug 02 08:57:59 universe kernel: usb 1-5: Device not responding to setup address.
Aug 02 08:57:59 universe kernel: usb 1-5: Device not responding to setup address.
Aug 02 08:57:59 universe kernel: usb 1-5: device not accepting address 8, error -71
Aug 02 08:57:59 universe kernel: usb usb1-port5: unable to enumerate USB device
Aug 02 08:58:14 universe kernel: usb 1-3: new high-speed USB device number 9 using xhci_hcd
Aug 02 08:58:19 universe kernel: usb 1-3: device descriptor read/64, error -110
Aug 02 08:58:30 universe kernel: usb 1-3: device descriptor read/64, error -71
Aug 02 08:58:30 universe kernel: usb 1-3: new high-speed USB device number 10 using xhci_hcd
Aug 02 08:58:30 universe kernel: usb 1-3: device descriptor read/64, error -71
Aug 02 08:58:30 universe kernel: usb 1-3: device descriptor read/64, error -71
Aug 02 08:58:30 universe kernel: usb usb1-port3: attempt power cycle
Aug 02 08:58:31 universe kernel: usb 1-3: new high-speed USB device number 11 using xhci_hcd
Aug 02 08:58:31 universe kernel: usb 1-3: Device not responding to setup address.
Aug 02 08:58:31 universe kernel: usb 1-3: Device not responding to setup address.
Aug 02 08:58:31 universe kernel: usb 1-3: device not accepting address 11, error -71
Aug 02 08:58:31 universe kernel: usb 1-3: new high-speed USB device number 12 using xhci_hcd
Aug 02 08:58:31 universe kernel: usb 1-3: Device not responding to setup address.
Aug 02 08:58:32 universe kernel: usb 1-3: Device not responding to setup address.
Aug 02 08:58:32 universe kernel: usb 1-3: device not accepting address 12, error -71
Aug 02 08:58:32 universe kernel: usb usb1-port3: unable to enumerate USB device
Aug 02 09:00:00 universe vmnetBridge[751]: RTM_NEWLINK: name:wlp8s0 index:3 flags:0x00001002
Offline
rakotomandimby, I am almost certain that is a bad cable and is unrelated to the thread. Those are very low-level USB messages indicating the device won't even enumerate on the bus -- this is way before the system can even tell what the device is and what driver to use.
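To illustrate the distinction, here is a rough sketch that separates enumeration-stage failures from everything else in a saved log. The marker strings are copied from the log posted above. The function name `split_by_stage` and the plain substring matching are assumptions for illustration, not an exhaustive list of what the USB core can print.

```python
# Substrings taken from the enumeration failures quoted in this thread.
ENUMERATION_MARKERS = (
    "device descriptor read",
    "Device not responding to setup address",
    "device not accepting address",
    "unable to enumerate USB device",
)


def split_by_stage(log_text):
    """Separate USB enumeration failures from all other non-empty log lines."""
    enumeration, other = [], []
    for line in log_text.splitlines():
        if not line.strip():
            continue
        if any(marker in line for marker in ENUMERATION_MARKERS):
            enumeration.append(line)
        else:
            other.append(line)
    return enumeration, other
```

If the problem lines all land in the first list, no driver such as ax88179_178a has been bound yet, which points at the cable, port, or device rather than at the network driver.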
Offline
Found the culprit: my smartphone's USB had somehow hung.
On a whim I restarted the smartphone, and everything worked again.
Offline
Is the issue still present using 5.18.15 with https://git.kernel.org/pub/scm/linux/ke … f236d25f51 reverted?
https://drive.google.com/file/d/1CcqVgX … sp=sharing linux-5.18.15.arch1-1.1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1IEx5AM … sp=sharing linux-headers-5.18.15.arch1-1.1-x86_64.pkg.tar.zst
Offline
Thanks, I'm giving it a shot; it's been up and running for around 30 minutes now. Will report back tomorrow.
Offline
Just following up: the kernel with the patch reverted is still running without issues. Let's continue the discussion in the actual bug thread to avoid duplication: https://bugs.archlinux.org/task/75491
Offline
I have experienced a terrible bug since 5.18.15. My symptoms are different from jmandawg's, but I am mentioning it here because it is also fixed by installing loqs' kernel with the reverted patch.
Since 5.18.15 (and still in 5.18.16), whenever my laptop suspends, my entire home network goes down (and my family starts screaming). My laptop is connected to a dock, which is connected via ethernet to my main home router. When the laptop suspends, it puts that connection in some odd state that locks up the router (and thus disables the attached wifi access point, which most of my family relies on). If I disconnect the ethernet cable or the dock cable to the laptop, or wake up my laptop, the router and network recover immediately. I have switched back to the good 5.18.14 a few times to prove where the bug was introduced. I just discovered this forum thread, so I tried loqs' kernel and found that it fixes the bug as well.
It only happens when using the dock's ethernet connection. If I disconnect the dock's network cable but keep the dock connected, and then use a USB ethernet dongle connected directly to the laptop, the problem does not occur.
Last edited by bulletmark (2022-08-04 12:30:49)
Offline
Regarding my post above, I later found an error in my testing: loqs' kernel did not fix my problem, so my issue is unrelated to this thread. @loqs, after fixing my test process, I bisected the kernel between 5.18.14 and 5.18.15, found the offending commit, and raised a bug at https://bugzilla.kernel.org/show_bug.cgi?id=216333.
Offline
https://git.kernel.org/pub/scm/linux/ke … d2c17fb6e0 Revert "net: usb: ax88179_178a needs FLAG_SEND_ZLP"
@bulletmark you might want to add the author of the causal commit for your bug to the CC list of your bug report.
Offline
@loqs, the kernel bugzilla would not let me add him as CC, so I emailed him a link to the bug directly. He has corresponded back and forth with me a few times while I ran tests and provided data.
Offline
@loqs, the patch that fixes the bug I raised is included in the new kernel 5.19.6.arch1-1, currently in the Arch testing repo, and also in the new linux-lts 5.15.64-1.
Offline
I just reinstalled my system and the problem disappeared. Note that I installed my system 3 years ago and upgraded once or twice a week.
Offline
Hi,
I'm experiencing periodic disconnects with this one as well,
- kernel 6.7.5-arch1-1 x86_64
Feb 22 10:49:54 ax88179_178a 2-3.4:1.0 enp0s20f0u3u4: ax88179 - Link status is: 0
Feb 22 10:49:57 ax88179_178a 2-3.4:1.0 enp0s20f0u3u4: ax88179 - Link status is: 1
every few seconds.
Already did a
pacman -Syu
and rebooted. All packages, including linux-firmware, are up to date, but the problem still persists.
It started with the updates during the last week; it was working fine before that.
Can't really pinpoint it to a kernel version or a version of linux-firmware, but currently it's broken.
Last edited by Linux2Brain (2024-02-22 10:10:22)
Offline
I'm experiencing periodic disconnects with this one as well,
- kernel 6.7.5-arch1-1 x86_64
Feb 22 10:49:54 ax88179_178a 2-3.4:1.0 enp0s20f0u3u4: ax88179 - Link status is: 0
Feb 22 10:49:57 ax88179_178a 2-3.4:1.0 enp0s20f0u3u4: ax88179 - Link status is: 1
Why do you believe it is the same issue? The extract you provided does not match the outputs for this issue.
Can't really pinpoint it to a kernel version or a version of linux-firmware, but currently it's broken.
Why can you not pinpoint it? What were the results of downgrading just the kernel to a known good version? Similarly for linux-firmware, then for both, then for the full system to a known good date.
Offline
I thought this might be a firmware issue, too.
I rebooted into an Ubuntu 23 live stick, and the error occurred there, too.
I rebooted back into Arch and the error seems to be gone (after several reboots).
Maybe this is a separate issue with the UEFI or the USB controller (I have no logs regarding these points).
For now it works again, very strange.
Maybe the UEFI of the ThinkPad X390 is buggy and doesn't do a proper USB reset.
Regarding my point: consider it solved for now.
Offline