For sure not the power. I mean the same setup works with my other Linux laptop, as well as with my Windows 11 one from work, which I use quite regularly with this setup.
And might provide more current on the relevant USB port.
But the dock has active power supply, does it? Do you use that?
If you remove all devices from the dock, connect the dock, onemississippi twomississippi threemississippi and then connect only the (wired) keyboard: does it show up?
There were
https://bbs.archlinux.org/viewtopic.php?id=307802
https://bbs.archlinux.org/viewtopic.php?id=307641&p=3
https://bbs.archlinux.org/viewtopic.php?id=303879
But those predated the good (6.14) kernel from May, are supposed to be fixed and only hit on resume from sleep.
Have you btw. tried to boot w/ and w/o the dock resp. to use it WITHOUT closing the lid?
Offline
But the dock has active power supply, does it? Do you use that?
Yes it has and yes I do. And it is working as well, as the small LED on the underside shows.
If you remove all devices from the dock, connect the dock, onemississippi twomississippi threemississippi and then connect only the (wired) keyboard: does it show up?
still "assume dead" and not working, http://0x0.st/KAHz.log
Have you btw. tried to boot w/ and w/o the dock resp. to use it WITHOUT closing the lid?
With so many attempts by now I had several with the lid open and with the lid closed. In the past I used it lid closed, because the USB-C (?) is actually too old to provide full resolution for my two screens. But then during this troubleshooting ordeal I had the lid open several times, since I need the laptop keyboard with the external one not working.
I also tried a boot with the dock already attached. Makes no difference.
So to me the question to answer is why does xHCI consider the device dead? And are there parameters like timeouts and retries? I know, it should have worked with the older kernels, but at some point implausible options become more plausible.
Offline
I tried to look into xhci and found a kernel option to enable some "quirks". Comparing my working with my non-working notebook I see:
my non-working laptop now: 0x81109810
my non-working laptop back in May when it worked: 0x81109810 (so the same now and then)
my working laptop now: 0x200009810.
Out of curiosity I tried to set my faulty notebook to the same value with xhci-hcd.quirks=0x200009810.
But this actually leads to 0x281109810 in the logs after reboot. So the default quirks get OR'ed with my kernel parameter. How can I force my value?
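The OR'ing can be double-checked with plain shell arithmetic. This is only a minimal sketch: `combine_quirks` is a made-up helper name for illustration, not anything from the driver, and it assumes the driver really does a bitwise OR as the log output suggests.

```shell
#!/bin/sh
# combine_quirks is a hypothetical helper: it ORs two quirk masks,
# which is what the value in the log after reboot appears to reflect.
combine_quirks() {
    printf '0x%x\n' "$(( $1 | $2 ))"
}

# Default quirks of the non-working notebook OR'ed with the boot parameter.
combine_quirks 0x81109810 0x200009810
```

The result matches the 0x281109810 seen in the logs, so the parameter can only add bits, it never clears the ones the driver sets on its own.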
Offline
No, the module will apply the hardware quirks anyway, you're just activating some global defaults.
I also don't think this is a good idea.
So to me the question to answer is why does xHCI consider the device dead?
Not kernel or bios, unlikely power supply… I fear to ask because I should have asked that right away:
Is it the cable?
Offline
Do you have logs from that other laptop? Was this behavior always there?
I posted a log of my working laptop a bit further up. Here it is again, http://0x0.st/KTwG.log
Not sure how I should answer the question.
I mean, do you have old logs (from May or whenever) and in those logs, are the 1-5 devices disconnecting after a second and coming back later, or are they connecting once and staying connected until the dock is unplugged?
This is potentially important because on the other laptop, type C port USB 2.0 is routed to the motherboard xHCI controller, not the funky thunderbolt xHCI. And the dock behaves weirdly (disconnects briefly) on the motherboard controller, before the TBT controller shows up on PCIe. Maybe this weirdness is what breaks the poor TBT controller and maybe the weirdness wasn't always there and maybe it's the fault of the dock somehow. It could be due to dock FW, or different devices connected to the dock, or whether it has the PSU connected or not.
Of course the TBT has no excuse and it should still detect all devices after it comes back. Speaking of which, can you post a log (from the broken laptop) with some dynamic debug?
echo 'module usbcore +p ' >/proc/dynamic_debug/control
echo 'module xhci_hcd +p ' >/proc/dynamic_debug/control
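If it helps, the two writes can be wrapped in a small function so the same steps are repeated on every attempt. This is just a sketch: `enable_usb_debug` is a hypothetical name, and the optional argument (a different control file path) exists only so it can be dry-run against a scratch file.

```shell
#!/bin/sh
# enable_usb_debug is a hypothetical helper. It sends the same two
# dynamic debug queries as above to the control file (needs root).
# An alternative path can be passed as $1 for a dry run.
enable_usb_debug() {
    control="${1:-/proc/dynamic_debug/control}"
    for mod in usbcore xhci_hcd; do
        # Each write is one query; the kernel applies it immediately.
        echo "module $mod +p" > "$control" || return 1
    done
}
```

Run it as root before plugging in the dock, then save the kernel messages of the current boot with `journalctl -k -b > dock-debug.log` (the file name is arbitrary).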
Offline
I tried to look into xhci and found a kernel option to enable some "quirks". Comparing my working with my non-working notebook I see:
my non-working laptop now: 0x81109810
my non-working laptop back in May when it worked: 0x81109810 (so the same now and then)
my working laptop now: 0x200009810.
Out of curiosity I tried to set my faulty notebook to the same value with xhci-hcd.quirks=0x200009810.
But this actually leads to 0x281109810 in the logs after reboot. So the default quirks get OR'ed with my kernel parameter. How can I force my value?
You can't without patching the driver and recompiling.
But you are comparing TBT with the motherboard controller.
Both laptops have the same 8086:15c1 TBT xHCI and same 200009810 quirks, in the past or today.
I fear to ask because I should have asked that right away:
Is it the cable?
Or the ages old USB solution: keep rotating the plug 180° until it works
Offline
Is it the cable?
Well, it is the same cable that worked with the "good" laptop yesterday and my Win11 laptop last work week. And I cannot even exchange it, because one end is fixed to the dock: it is soldered to the PCB.
Offline
echo 'module usbcore +p ' >/proc/dynamic_debug/control
echo 'module xhci_hcd +p ' >/proc/dynamic_debug/control
here you go, http://0x0.st/KAN9.log
Of course I tried it with the connector rotated 180deg
The log from May is http://0x0.st/KTPd.txt
Offline
here you go, http://0x0.st/KAN9.log
Sep 21 23:11:58 calculus.home kernel: xhci_hcd 0000:39:00.0: Get port status 3-1 read: 0x2a0, return 0x100
Sep 21 23:11:58 calculus.home kernel: xhci_hcd 0000:39:00.0: Get port status 3-2 read: 0x2a0, return 0x100
Sep 21 23:11:58 calculus.home kernel: hub 3-0:1.0: state 7 ports 2 chg 0000 evt 0000
Sep 21 23:11:58 calculus.home kernel: hub 3-0:1.0: hub_suspend
Sep 21 23:11:58 calculus.home kernel: usb usb3: bus auto-suspend, wakeup 1
Sep 21 23:11:58 calculus.home kernel: usb usb3: suspend raced with wakeup event
Sep 21 23:11:58 calculus.home kernel: usb usb3: usb auto-resume
Sep 21 23:11:58 calculus.home kernel: hub 3-0:1.0: hub_resume
Sep 21 23:11:58 calculus.home kernel: xhci_hcd 0000:39:00.0: Get port status 3-1 read: 0x2a0, return 0x100
Sep 21 23:11:58 calculus.home kernel: xhci_hcd 0000:39:00.0: Get port status 3-2 read: 0x2a0, return 0x100
Sep 21 23:11:58 calculus.home kernel: hub 3-0:1.0: state 7 ports 2 chg 0000 evt 0000
Sep 21 23:11:58 calculus.home kernel: hub 3-0:1.0: hub_suspend
Sep 21 23:11:58 calculus.home kernel: usb usb3: bus auto-suspend, wakeup 1
Sep 21 23:11:58 calculus.home kernel: usb usb3: suspend raced with wakeup event
Sep 21 23:11:58 calculus.home kernel: usb usb3: usb auto-resume
Sep 21 23:11:58 calculus.home kernel: hub 3-0:1.0: hub_resume
I'm not entirely sure why this stuff is repeating, but I see similar output here when turning on any xHCI controller with no USB 2.0 devices connected.
The log from May is http://0x0.st/KTPd.txt
That's a log from the broken machine, I asked about old logs from the "good" machine.
Offline
That's a log from the broken machine, I asked about old logs from the "good" machine.
Oh, ok, that will have to wait a while, since I am out of country atm. But what do you expect to find? I mean, good machine was good then and still is good now.
Offline
Whether and what is different, I assume.
Is http://0x0.st/KAN9.log w/ usbcore.autosuspend supposedly disabled?
"usbcore.autosuspend=-1", https://wiki.archlinux.org/title/Kernel_parameters
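Whether the parameter actually took effect can be read back from sysfs after boot. A minimal sketch, assuming the usual module parameter path; `autosuspend_disabled` is a hypothetical helper name, and the optional argument only exists to point it at a different file for testing.

```shell
#!/bin/sh
# autosuspend_disabled is a hypothetical helper: it succeeds when the
# usbcore autosuspend delay is negative, i.e. autosuspend is off.
autosuspend_disabled() {
    param="${1:-/sys/module/usbcore/parameters/autosuspend}"
    [ "$(cat "$param")" -lt 0 ]
}

# Example use on the running system:
# autosuspend_disabled && echo "autosuspend is off" || echo "autosuspend is still on"
```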
Do the suspend/wakeup races still occur?
Offline
So, back at home, the solution finding can continue.
That's a log from the broken machine, I asked about old logs from the "good" machine.
This is the good machine, also from May: http://0x0.st/KMdg.log
Offline
"usbcore.autosuspend=-1"
This is a log with autosuspend off, http://0x0.st/KMdd.txt
Offline
This is the good machine, also from May: http://0x0.st/KMdg.log
This shows that the "disconnect after one second, reconnect after another one" behavior was always there, at least in the high-speed hub.
But there was no "HC died" on the 0000:39/3a:00.0 TBT xHCI controller, and now it's present on both machines and apparently it's still present if you boot old kernel packages. WTF.
I don't remember, did you try downgrading all "-firmware" packages and the kernel at the same time? Maybe you still have the installation ISO with old kernel and old userspace?
Sep 21 23:11:48 calculus.home kernel: usb usb4: New USB device found, idVendor=1d6b, idProduct=0003, bcdDevice= 6.14
Sep 21 23:11:48 calculus.home kernel: usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1
Sep 21 23:11:48 calculus.home kernel: usb usb4: Product: xHCI Host Controller
Sep 21 23:11:48 calculus.home kernel: usb usb4: Manufacturer: Linux 6.14.4-arch1-2 xhci-hcd
Sep 21 23:11:48 calculus.home kernel: usb usb4: SerialNumber: 0000:39:00.0
Sep 21 23:11:48 calculus.home kernel: usb usb4: usb_probe_device
Sep 21 23:11:48 calculus.home kernel: usb usb4: configuration #1 chosen from 1 choice
Sep 21 23:11:48 calculus.home kernel: xHCI xhci_add_endpoint called for root hub
Sep 21 23:11:48 calculus.home kernel: xHCI xhci_check_bandwidth called for root hub
Sep 21 23:11:48 calculus.home kernel: usb usb4: adding 4-0:1.0 (config #1, interface 0)
Sep 21 23:11:48 calculus.home kernel: hub 4-0:1.0: usb_probe_interface
Sep 21 23:11:48 calculus.home kernel: hub 4-0:1.0: usb_probe_interface - got id
Sep 21 23:11:48 calculus.home kernel: hub 4-0:1.0: USB hub found
Sep 21 23:11:48 calculus.home kernel: hub 4-0:1.0: 2 ports detected
Sep 21 23:11:48 calculus.home kernel: hub 4-0:1.0: standalone hub
Sep 21 23:11:48 calculus.home kernel: hub 4-0:1.0: no power switching (usb 1.0)
Sep 21 23:11:48 calculus.home kernel: hub 4-0:1.0: individual port over-current protection
Sep 21 23:11:48 calculus.home kernel: hub 4-0:1.0: TT requires at most 8 FS bit times (666 ns)
Sep 21 23:11:48 calculus.home kernel: hub 4-0:1.0: power on to power good time: 100ms
Sep 21 23:11:48 calculus.home kernel: hub 4-0:1.0: local power source is good
Sep 21 23:11:48 calculus.home kernel: typec port0: bound usb3-port1 (ops connector_ops)
Sep 21 23:11:48 calculus.home kernel: typec port0: bound usb4-port1 (ops connector_ops)
Sep 21 23:11:48 calculus.home kernel: usb usb4-port1: peered to usb3-port1
Sep 21 23:11:48 calculus.home kernel: usb usb4-port2: peered to usb3-port2
Sep 21 23:11:48 calculus.home kernel: usb usb4: port-1 no _DSM function 5
Sep 21 23:11:48 calculus.home kernel: usb usb4: port-2 no _DSM function 5
Sep 21 23:11:48 calculus.home kernel: usb usb4-port2: DeviceRemovable is changed to 1 according to platform information.
Sep 21 23:11:48 calculus.home kernel: hub 4-0:1.0: trying to enable port power on non-switchable hub
Sep 21 23:11:48 calculus.home kernel: xhci_hcd 0000:39:00.0: set port power 4-1 ON, portsc: 0x210
Sep 21 23:11:48 calculus.home kernel: xhci_hcd 0000:39:00.0: set port power 4-2 ON, portsc: 0x2a0
Sep 21 23:11:48 calculus.home kernel: xhci_hcd 0000:39:00.0: xHCI host controller not responding, assume dead
Sep 21 23:11:48 calculus.home kernel: xhci_hcd 0000:39:00.0: HC died; cleaning up
This is the dynamic debug log. Is it always crashing in the exact same place? Maybe there is something to it, though I don't know what it could be.
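To compare several boots, one could pull out the last message before the "assume dead" line from each saved log. A rough sketch: `last_before_death` is a hypothetical helper, it assumes GNU grep and journal-style lines containing "kernel: ", and it strips the timestamp and hostname so the results from different boots can be compared directly.

```shell
#!/bin/sh
# last_before_death is a hypothetical helper: print the log line that
# immediately precedes the first "assume dead" message in the file $1,
# without the "date host kernel: " prefix.
last_before_death() {
    grep -m1 -B1 'xHCI host controller not responding, assume dead' "$1" \
        | head -n 1 \
        | sed 's/^.*kernel: //'
}

# Example: for f in boot1.log boot2.log; do last_before_death "$f"; done
```

If every boot prints the same "set port power" line here, the controller is dying at the same step each time. Note that if the "assume dead" message happens to be the very first line of a file, the helper prints that message itself.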
Offline