#1 2025-02-26 09:05:42

nunibye
Member
Registered: 2025-02-05
Posts: 19

Advice for debugging kernel

context: https://bbs.archlinux.org/viewtopic.php?id=303199
I've been trying everything to fix my touchpad and I finally found that the problem is a conditional in commit 9140ce47872bfd89fca888c2f992faa51d20c2bc.

Here is the conditional from /drivers/dma/idma64.c:

static irqreturn_t idma64_irq(int irq, void *dev)
{
	struct idma64 *idma64 = dev;
	u32 status = dma_readl(idma64, STATUS_INT);
	u32 status_xfer;
	u32 status_err;
	unsigned short i;

	// HERE
	/* Since IRQ may be shared, check if DMA controller is powered on */
	if (status == GENMASK(31, 0))
		return IRQ_NONE;

	dev_vdbg(idma64->dma.dev, "%s: status=%#x\n", __func__, status);

	/* Check if we have any interrupt from the DMA controller */
	if (!status)
		return IRQ_NONE;

	status_xfer = dma_readl(idma64, RAW(XFER));
	status_err = dma_readl(idma64, RAW(ERROR));

	for (i = 0; i < idma64->dma.chancnt; i++)
		idma64_chan_irq(idma64, i, status_err, status_xfer);

	return IRQ_HANDLED;
}

I have been removing this conditional and compiling the kernel module myself, and that works great, but I just realized that removing the conditional floods my CPU with interrupts. I was going to add some print statements and try to add a rule that keeps the conditional in place but excludes my touchpad. However, I am worried that if I start printing random things from inside an interrupt handler, I will flood the console and my computer won't work. Do you have any advice on how to go about this? Any idea what exactly this function is doing and what kind of data I can access from inside of it?

Apologies if this is a stupid question, but I am new to Linux.
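On the console-flooding worry: the kernel has rate-limited print helpers such as printk_ratelimited() and dev_info_ratelimited(), which drop messages once a burst limit is reached within a time window. The block below is only a userspace sketch of that idea, not kernel code. The struct and function names (ratelimit, ratelimit_allow) and the tick-based clock are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of a print rate limiter; not kernel code. */
struct ratelimit {
	uint64_t interval;  /* window length, in arbitrary ticks */
	unsigned burst;     /* messages allowed per window */
	uint64_t begin;     /* tick at which the current window started */
	unsigned printed;   /* messages let through in this window */
	unsigned missed;    /* messages suppressed in this window */
};

/* Returns true if a message may be printed at time `now`. */
static bool ratelimit_allow(struct ratelimit *rl, uint64_t now)
{
	if (now - rl->begin >= rl->interval) {
		/* Window expired: start a new one and reset the counters. */
		rl->begin = now;
		rl->printed = 0;
		rl->missed = 0;
	}
	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}
	rl->missed++;
	return false;
}
```

With interval = 5 and burst = 2, the first two calls in a window return true and later calls return false until five ticks have passed. In the interrupt handler itself, the rate-limited kernel helpers would play this role, so a spamming IRQ produces a handful of log lines per window rather than one per interrupt.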

Offline

#2 2025-02-26 10:22:37

xerxes_
Member
Registered: 2018-04-29
Posts: 1,053

Re: Advice for debugging kernel

How about changing the touchpad section in the file /usr/share/X11/xorg.conf.d/40-libinput.conf to something like this:

Section "InputClass"
        Identifier "libinput touchpad catchall"
        MatchIsTouchpad "on"
        MatchDevicePath "/dev/input/event*"
        Option "Tapping" "True"
        Option "TappingDrag" "True"
        Option "ScrollMethod" "Edge"
#       Option "AccelProfile" "linear"
#       Option "AccelSpeed" "0.4"
        Option "DisableWhileTyping" "False"
        Driver "libinput"
EndSection

Maybe this will help?

Offline

#3 2025-02-26 10:38:51

ReDress
Member
From: Nairobi
Registered: 2024-11-30
Posts: 245

Re: Advice for debugging kernel

nunibye wrote:

Here is the conditional from /drivers/dma/idma64.c:

	// HERE
	/* Since IRQ may be shared, check if DMA controller is powered on */
	if (status == GENMASK(31, 0))
		return IRQ_NONE;

I have been removing this conditional and compiling the kernel module myself and that works great, but I just realized removing that conditional floods my CPU with interrupts.

My go-to idea would be to validate that the mask from GENMASK is what we want. It could be that someone made a mistake.

Offline

#4 2025-02-26 16:48:48

nunibye
Member
Registered: 2025-02-05
Posts: 19

Re: Advice for debugging kernel

Thanks for the suggestion xerxes_, but this would only change how libinput interprets the events that are written to /dev/input/eventX, not anything that actually produces the events, right? Without my kernel change (or a better one) I am not getting enough events written to /dev/input/eventX to do anything useful.

Offline

#5 2025-02-26 16:54:59

nunibye
Member
Registered: 2025-02-05
Posts: 19

Re: Advice for debugging kernel

Thanks for the reply ReDress, I think that the GENMASK is correct. I believe it produces 0xFFFFFFFF, which I think would indeed signal that the device is off. I think it may be something specific to my touchpad, with it communicating in a weird way. My idea was to hard-code a case for my touchpad in this driver, but I am not sure if that would make any sense, because it seems like that would likely still result in the high interrupt count. Do you know if my thoughts are accurate? Is this actually weird behavior? I'm guessing my touchpad shouldn't use the majority of a core, but I don't really have much to compare it to.
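For reference, the mask value can be checked outside the kernel. The block below is a userspace sketch of what GENMASK(h, l) computes for 32-bit values (bits l through h set, inclusive). The helper names genmask32 and looks_powered_off are made up here, and the reading of an all-ones value as "device not responding" follows the comment in the driver source quoted earlier.

```c
#include <stdint.h>

/*
 * Userspace re-implementation of the idea behind the kernel's
 * GENMASK(h, l) for 32-bit values: set bits l..h inclusive.
 * The name genmask32 is hypothetical, chosen for this sketch.
 */
static uint32_t genmask32(unsigned h, unsigned l)
{
	return (UINT32_MAX << l) & (UINT32_MAX >> (31 - h));
}

/* Mirrors the check in idma64_irq(): an all-ones status read. */
static int looks_powered_off(uint32_t status)
{
	return status == genmask32(31, 0);
}
```

genmask32(31, 0) evaluates to 0xFFFFFFFF, so the conditional only fires when every bit of the status register reads back as 1.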

Offline

#6 2025-02-26 17:37:57

xerxes_
Member
Registered: 2018-04-29
Posts: 1,053

Re: Advice for debugging kernel

If you don't know, you can just "do it": compile the kernel with your change and compare the behavior before and after.

You can watch the interrupts in the system with, for example, these commands:

watch lsirq
watch cat /proc/interrupts
watch cat /proc/softirqs
watch cat /proc/stat
watch cat /proc/schedstat
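To turn those counters into a number that can be compared before and after a change, one option is to read the line for a single IRQ from /proc/interrupts twice and subtract the totals. The helper below is a sketch that sums the per-CPU columns of one such line; the function name irq_line_total is made up, and it assumes the usual layout of "<irq>:" followed by one count per CPU and then the chip name.

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Sum the per-CPU counters on one /proc/interrupts line.
 * Returns -1 if the line does not start with "<irq>:" for the given irq.
 * Hypothetical helper for illustration.
 */
static long long irq_line_total(const char *line, int irq)
{
	char *end;
	long n = strtol(line, &end, 10);   /* strtol skips leading blanks */
	long long total = 0;

	if (end == line || *end != ':' || n != irq)
		return -1;
	end++;                             /* step over the ':' */

	for (;;) {
		const char *p = end;
		long long v;

		while (*p == ' ' || *p == '\t')
			p++;
		if (!isdigit((unsigned char)*p))
			break;                     /* reached the chip name */
		v = strtoll(p, &end, 10);
		if (*end != '\0' && !isspace((unsigned char)*end))
			break;                     /* token such as "28-fasteoi" */
		total += v;
	}
	return total;
}
```

Calling this on the same IRQ line one second apart and subtracting the two results gives an interrupts-per-second figure for that line.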

Last edited by xerxes_ (2025-02-26 17:43:59)

Offline

#7 2025-02-26 17:48:06

ReDress
Member
From: Nairobi
Registered: 2024-11-30
Posts: 245

Re: Advice for debugging kernel

nunibye wrote:

Thanks for the reply ReDress, I think that the GENMASK is correct. I believe it produces 0xFFFFFFFF which I think would indeed signal that the device is off. I think it may be something specific to my touchpad with it communicating in a weird way. My idea was to hard code in a case for my touchpad in this driver, but I am not sure if that would make any sense because it seems like that would likely still result in the high interrupt count. Do you know if my thoughts are accurate? Is this actually weird behavior? I'm guessing my touchpad shouldn't use the majority of core but I don't really have much to compare it to.

Then quite possibly this commit/change is not the problem and the device is actually sleeping, for some strange reason. Maybe someone just didn't set its status correctly when configuring it.

Last edited by ReDress (2025-02-26 17:48:56)

Offline

#8 2025-02-26 19:13:09

nunibye
Member
Registered: 2025-02-05
Posts: 19

Re: Advice for debugging kernel

xerxes_ wrote:

If you don't know, you can just "do it" and compare what's the difference before and after your kernel change and compile it.

You can see interrupts in system, for example, with this commands:

watch lsirq
watch cat /proc/interrupts
watch cat /proc/softirqs
watch cat /proc/stat
watch cat /proc/schedstat

Sorry I'm not sure what you mean exactly. Do what? Add a case to exclude my touchpad from the conditional? I would like to but I'm not really sure how to. Do you know if I can easily access anything inside of idma64.c that I can use to uniquely identify my device?

Offline

#9 2025-02-26 19:18:14

nunibye
Member
Registered: 2025-02-05
Posts: 19

Re: Advice for debugging kernel

ReDress wrote:
nunibye wrote:

Thanks for the reply ReDress, I think that the GENMASK is correct. I believe it produces 0xFFFFFFFF which I think would indeed signal that the device is off. I think it may be something specific to my touchpad with it communicating in a weird way. My idea was to hard code in a case for my touchpad in this driver, but I am not sure if that would make any sense because it seems like that would likely still result in the high interrupt count. Do you know if my thoughts are accurate? Is this actually weird behavior? I'm guessing my touchpad shouldn't use the majority of core but I don't really have much to compare it to.

Then quite possibly this commit/change is not the problem and the device is actually sleeping.

For some strange reason. Maybe someone just didn't set it's status correctly when configuring it.

I think so. It seems like this commit/change is there for a reason, but it is the commit where my touchpad stops working, so not really sure what else I can do. Would it be possible to see/debug if the touchpad is setting status correctly? Seems like that might be getting into a scary realm of proprietary hardware stuff. I'm thinking of just writing a script to tell the kernel to ignore my touchpad and bind it to my touchpad disable button. It would be annoying, but livable. Not sure exactly how to do that but I can't imagine it would be too hard.

Offline

#10 2025-02-26 19:24:53

nunibye
Member
Registered: 2025-02-05
Posts: 19

Re: Advice for debugging kernel

nunibye wrote:
xerxes_ wrote:

If you don't know, you can just "do it" and compare what's the difference before and after your kernel change and compile it.

You can see interrupts in system, for example, with this commands:

watch lsirq
watch cat /proc/interrupts
watch cat /proc/softirqs
watch cat /proc/stat
watch cat /proc/schedstat

Sorry I'm not sure what you mean exactly. Do what? Add a case to exclude my touchpad from the conditional? I would like to but I'm not really sure how to. Do you know if I can easily access anything inside of idma64.c that I can use to uniquely identify my device?

I think this is what is eating my cpu:
28 55100614 IR-IO-APIC 28-fasteoi idma64.1, i2c_designware.1
there is also this:
57    78020 IR-IO-APIC 57-fasteoi ELAN1206:00

My touchpad is an ELAN1206. Maybe I could filter out the junk not related to my touchpad somehow?

From dmesg:
[    3.912343] input: ELAN1206:00 04F3:30F1 Touchpad as /devices/pci0000:00/0000:00:15.1/i2c_designware.1/i2c-14/i2c-ELAN1206:00/0018:04F3:30F1.0002/input/input30

Offline

#11 2025-02-26 20:45:05

nunibye
Member
Registered: 2025-02-05
Posts: 19

Re: Advice for debugging kernel

I ran some random commands and I'm starting to think the interrupt spam might be from a device other than my touchpad, so maybe my idea of a conditional would work. I would need to find a way to identify my touchpad in the kernel module.

Offline

#12 2025-02-27 09:12:19

nunibye
Member
Registered: 2025-02-05
Posts: 19

Re: Advice for debugging kernel

My solution for now is to compile the kernel without the conditional and then always run this script with systemd:

#!/bin/bash

TOUCHPAD_NAME="ELAN1206"

find_touchpad_event() {
  for device in /sys/class/input/event*/device/name; do
    if [ -f "$device" ]; then
      if grep -q "$TOUCHPAD_NAME" "$device"; then
        event_path=$(dirname "$device")
        event_num=$(basename "$(dirname "$event_path")" | sed 's/event//')
        if [[ $(cat "$device") == *"Touchpad"* ]]; then
          echo "/dev/input/event$event_num"
          return 0
        fi
      fi
    fi
  done
  return 1
}

TOUCHPAD_EVENT=$(find_touchpad_event)
if [ -z "$TOUCHPAD_EVENT" ]; then
  exit 1
fi

LAST_EVENT_TIME=$(date +%s)
IS_SLEEPING=false

exec 3< <(evtest "$TOUCHPAD_EVENT")

while true; do
  if IFS= read -r -t 1 line <&3; then
    CURRENT_TIME=$(date +%s)
    LAST_EVENT_TIME=$CURRENT_TIME

    if [ "$IS_SLEEPING" = true ]; then
      # wake
      echo 'idma64.1' | sudo tee /sys/bus/platform/drivers/idma64/bind
      IS_SLEEPING=false
    fi

  else
    CURRENT_TIME=$(date +%s)
    DIFF=$((CURRENT_TIME - LAST_EVENT_TIME))
    if [ "$DIFF" -ge 1 ] && [ "$IS_SLEEPING" = false ]; then
      # sleep
      echo 'idma64.1' | sudo tee /sys/bus/platform/drivers/idma64/unbind
      IS_SLEEPING=true
    fi
  fi
done

There are quite a few interrupts when my touchpad is moving, but as soon as it stops I shut it down and wait until it starts moving again. Pretty hacky, but it works surprisingly well. I am able to do this since most touchpad events, and lots of what I imagine is junk, come through IRQ 57, but some come through IRQ 28 for some reason. This lets me check if it's being used without having the dirty IRQ enabled. Not a true solution, but whatever. I am worried that idma64.1 won't always be the correct device, and I couldn't figure out how to find the right one without trial and error.
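One possible way to avoid hard-coding idma64.1: the interrupt listing quoted earlier shows idma64.1 sharing IRQ 28 with i2c_designware.1, and the dmesg path shows the touchpad sitting under i2c_designware.1. Assuming the DMA instance to unbind is the one that shares an IRQ line with the touchpad's I2C controller, it could be looked up from /proc/interrupts instead of guessed. The sketch below only does the string handling for one line; both function names are made up for illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* True if `name` (e.g. "i2c_designware.1") appears on the line as a whole
 * name, so that "i2c_designware.10" does not count as a match. */
static int line_mentions(const char *line, const char *name)
{
	size_t n = strlen(name);
	const char *hit = strstr(line, name);

	while (hit) {
		if (!isdigit((unsigned char)hit[n]))
			return 1;
		hit = strstr(hit + 1, name);
	}
	return 0;
}

/*
 * Copy the first "idma64.<n>" name found on the line into out.
 * Returns 0 on success, -1 if none is found or out is too small.
 */
static int idma64_name_on_line(const char *line, char *out, size_t outsz)
{
	const char *start = strstr(line, "idma64.");
	size_t len;

	if (!start)
		return -1;
	len = strcspn(start, ", \t\n");    /* name ends at a separator */
	if (len + 1 > outsz)
		return -1;
	memcpy(out, start, len);
	out[len] = '\0';
	return 0;
}
```

A caller would scan /proc/interrupts line by line, pick the line where line_mentions() matches the touchpad's I2C controller, and pass the name returned by idma64_name_on_line() to the bind/unbind files instead of a fixed string.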

Offline

#13 2025-02-27 15:59:09

ReDress
Member
From: Nairobi
Registered: 2024-11-30
Posts: 245

Re: Advice for debugging kernel

Maybe something like this would function to exclude the touchpad from the filter.

We need to work backwards from the device (which we have as idma64->dma.dev) to the input device (input_dev), which should contain the name of the input device. We can then compare that name with what we expect from the touchpad.

Note: This assumes that your touch pad is loading the driver at https://github.com/torvalds/linux/blob/ … hid-elan.c

.....

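/* Sketch only: container_of() expects a pointer to a member embedded in the struct */
struct input_dev *input_dev = container_of(idma64->dma.dev, struct input_dev, dev.parent);

/* Apply the powered-off filter to everything except the touchpad */
if (strcmp(input_dev->name, "Elan Touchpad")) {
	if (status == GENMASK(31, 0))
		return IRQ_NONE;
}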

.....

Not particularly thaaaaat hard.
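A note on container_of, since the snippet above leans on it: the macro takes a pointer to a member that is embedded in a struct and subtracts the member's offset to recover the enclosing struct. The block below is a userspace sketch of that mechanism with made-up stand-in types (device_model, input_model). It is not the kernel's definition, which adds type checking.

```c
#include <stddef.h>

/* Userspace equivalent of the kernel macro, without its type checks. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct device_model {                /* stand-in for struct device */
	struct device_model *parent;     /* pointer field, like dev.parent */
	int id;
};

struct input_model {                 /* stand-in for struct input_dev */
	const char *name;
	struct device_model dev;         /* embedded member */
};

/* Valid use: d points at the dev member embedded in an input_model. */
static struct input_model *to_input_model(struct device_model *d)
{
	return container_of(d, struct input_model, dev);
}
```

This works because dev lives inside input_model. A parent pointer refers to a separate object, so handing that pointer to container_of does not lead back to the child, which may be one reason walking from the DMA device to the input device needs a different approach.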

Offline

#14 2025-03-16 19:17:44

nunibye
Member
Registered: 2025-02-05
Posts: 19

Re: Advice for debugging kernel

Sorry for the late reply. This doesn't seem to work for me. I think the problem might be more complicated than a simple conditional, unfortunately. IRQ 28 is the one that spams, but it is also where my touchpad is, so I don't think I can disable it. For now I wrote a C "driver" that completely fixes the problem without needing to modify the kernel.

Offline
