#1 2019-10-22 19:35:27

BeefEater
Member
Registered: 2019-03-20
Posts: 8

one CPU core at 100% usage after kernel upgrade, suspecting i915

After upgrading from 5.2.9-arch1-1 to 5.3.1-arch1-1, I am observing 100% usage on one of the two CPU cores. The load average never drops below 1.0. Here is what I have found so far:

Kernel version 5.3.1

[root@nuc2 pkg]# uname -a
Linux nuc2 5.3.1-arch1-1-ARCH #1 SMP PREEMPT Sat Sep 21 11:33:49 UTC 2019 x86_64 GNU/Linux
[root@nuc2 ~]# uptime
 14:25:35 up 36 min,  1 user,  load average: 1.04, 1.01, 0.95
[root@nuc2 ~]# cat /proc/interrupts 
            CPU0       CPU1       CPU2       CPU3       
   8:          0          0          0          0   IO-APIC    8-edge      rtc0
   9:          1          0          0          0   IO-APIC    9-fasteoi   acpi, INT0002
  18:          0          0          0          0   IO-APIC   18-fasteoi   i801_smbus
 115:          0          0          0          0   PCI-MSI 458752-edge      PCIe PME
 116:          0          0          0          0   PCI-MSI 460800-edge      PCIe PME, pciehp
 117:          0          0          0          0   PCI-MSI 462848-edge      PCIe PME
 118:          0         38          0          0   PCI-MSI 327680-edge      xhci_hcd
 119:          0          0       4442          0   PCI-MSI 311296-edge      ahci[0000:00:13.0]
 120:          0          0          0          0  INT0002 Virtual GPIO    2  ACPI:Event
 121:         27          0          0          0   PCI-MSI 425984-edge      mei_txe
 122:          0          0          0       1766   PCI-MSI 524288-edge      enp1s0
 124:          0          0          0          0   PCI-MSI 1572864-edge      iwlwifi
 125:          0          0      94635          0   PCI-MSI 32768-edge      i915
 126:        405          0          0          0   PCI-MSI 442368-edge      snd_hda_intel:card0
 NMI:          0          0          1         99   Non-maskable interrupts
 LOC:      14193       7789       8235    1467833   Local timer interrupts
 SPU:          0          0          0          0   Spurious interrupts
 PMI:          0          0          1         99   Performance monitoring interrupts
 IWI:          0          0          0          3   IRQ work interrupts
 RTR:          0          0          0          0   APIC ICR read retries
 RES:      19976       1230       1314       5554   Rescheduling interrupts
 CAL:       2910       2408       1560       1461   Function call interrupts
 TLB:         12         26         36          0   TLB shootdowns
 TRM:          0          0          0          0   Thermal event interrupts
 THR:          0          0          0          0   Threshold APIC interrupts
 DFR:          0          0          0          0   Deferred Error APIC interrupts
 MCE:          0          0          0          0   Machine check exceptions
 MCP:          7          7          7          7   Machine check polls
 HYP:          0          0          0          0   Hypervisor callback interrupts
 HRE:          0          0          0          0   Hyper-V reenlightenment interrupts
 HVS:          0          0          0          0   Hyper-V stimer0 interrupts
 ERR:          0
 MIS:          0
 PIN:          0          0          0          0   Posted-interrupt notification event
 NPI:          0          0          0          0   Nested posted-interrupt event
 PIW:          0          0          0          0   Posted-interrupt wakeup event

Then I downgraded to 5.2.9:

cd /var/cache/pacman/pkg ; pacman -U linux-5.2.9.arch1-1-x86_64.pkg.tar.xz linux-api-headers-5.1-1-any.pkg.tar.xz ; reboot

After downgrading to kernel 5.2.9:

[root@nuc2 ~]# uname -a
Linux nuc2 5.2.9-arch1-1-ARCH #1 SMP PREEMPT Fri Aug 16 11:29:43 UTC 2019 x86_64 GNU/Linux
[root@nuc2 ~]# uptime
 15:03:30 up 33 min,  1 user,  load average: 0.00, 0.01, 0.00
[root@nuc2 ~]# cat /proc/interrupts 
            CPU0       CPU1       CPU2       CPU3       
   0:         11          0          0          0   IO-APIC    2-edge      timer
   8:          0          0          0          0   IO-APIC    8-edge      rtc0
   9:          0          1          0          0   IO-APIC    9-fasteoi   acpi, INT0002
  18:          0          0          0          0   IO-APIC   18-fasteoi   i801_smbus
 115:          0          0          0          0   PCI-MSI 458752-edge      PCIe PME
 116:          0          0          0          0   PCI-MSI 460800-edge      PCIe PME, pciehp
 117:          0          0          0          0   PCI-MSI 462848-edge      PCIe PME
 118:          0          0         38          0   PCI-MSI 327680-edge      xhci_hcd
 119:          0          0          0       4279   PCI-MSI 311296-edge      ahci[0000:00:13.0]
 120:          0          0          0          0  INT0002 Virtual GPIO    2  ACPI:Event
 121:         32          0          0          0   PCI-MSI 425984-edge      mei_txe
 122:          0       1579          0          0   PCI-MSI 524288-edge      enp1s0
 124:          0          0          0          0   PCI-MSI 1572864-edge      iwlwifi
 125:          0          0          0       1605   PCI-MSI 32768-edge      i915
 126:        403          0          0          0   PCI-MSI 442368-edge      snd_hda_intel:card0
 NMI:          0          0          0          2   Non-maskable interrupts
 LOC:       6895      14695       5773      36116   Local timer interrupts
 SPU:          0          0          0          0   Spurious interrupts
 PMI:          0          0          0          2   Performance monitoring interrupts
 IWI:          0         12          0          0   IRQ work interrupts
 RTR:          0          0          0          0   APIC ICR read retries
 RES:       1577        896       1152        744   Rescheduling interrupts
 CAL:       3037       2380       1875       2065   Function call interrupts
 TLB:          1        110        197         14   TLB shootdowns
 TRM:          0          0          0          0   Thermal event interrupts
 THR:          0          0          0          0   Threshold APIC interrupts
 DFR:          0          0          0          0   Deferred Error APIC interrupts
 MCE:          0          0          0          0   Machine check exceptions
 MCP:          7          7          7          7   Machine check polls
 HYP:          0          0          0          0   Hypervisor callback interrupts
 HRE:          0          0          0          0   Hyper-V reenlightenment interrupts
 HVS:          0          0          0          0   Hyper-V stimer0 interrupts
 ERR:          0
 MIS:          0
 PIN:          0          0          0          0   Posted-interrupt notification event
 NPI:          0          0          0          0   Nested posted-interrupt event
 PIW:          0          0          0          0   Posted-interrupt wakeup event

What strikes me is that on an upgrade from 5.2.9 to 5.3.1:

  • the number of i915 interrupts increases by a factor of 59,

  • the number of local timer interrupts increases by a factor of 24,

  • the number of rescheduling interrupts increases by a factor of 6.
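
For anyone who wants to reproduce the comparison above, here is a minimal sketch. It assumes each kernel's `/proc/interrupts` output was saved to a file; the function names `sum_irq` and `irq_ratio` and the file names in the usage comments are made up for illustration. It sums one row across all CPUs and divides the new total by the old one.

```shell
# sum_irq FILE LABEL
# Print the total across all CPUs for one row of a saved /proc/interrupts
# snapshot. LABEL is the first column, e.g. "125:" (i915 here) or "LOC:".
sum_irq() {
  awk -v label="$2" '
    NR == 1 { ncpu = NF; next }   # header row: one field per CPU
    $1 == label {
      total = 0
      for (i = 2; i <= ncpu + 1; i++) total += $i
      print total
    }' "$1"
}

# irq_ratio OLD_FILE NEW_FILE LABEL
# Print new_total / old_total to one decimal place, or "n/a" if the old
# total is zero or the row is missing.
irq_ratio() {
  old=$(sum_irq "$1" "$3")
  new=$(sum_irq "$2" "$3")
  awk -v o="$old" -v n="$new" \
    'BEGIN { if (o > 0) printf "%.1f\n", n / o; else print "n/a" }'
}

# Usage (hypothetical file names):
#   cat /proc/interrupts > irq-5.2.9.txt    # on the old kernel
#   cat /proc/interrupts > irq-5.3.1.txt    # on the new kernel
#   irq_ratio irq-5.2.9.txt irq-5.3.1.txt 125:
#   irq_ratio irq-5.2.9.txt irq-5.3.1.txt LOC:
```

The ratios are only meaningful when both snapshots are taken after a similar uptime, as was the case here (36 and 33 minutes).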

It looks like something may be wrong with the i915 module in kernel 5.3.1. What else should I check? Where should I report this issue?

Last edited by BeefEater (2019-10-22 21:28:59)

#2 2019-10-22 19:49:05

loqs
Member
Registered: 2014-03-06
Posts: 17,378

Re: one CPU core at 100% usage after kernel upgrade, suspecting i915

What about the current 5.3 release, 5.3.7.arch1-1?

#3 2019-10-22 19:52:16

BeefEater
Member
Registered: 2019-03-20
Posts: 8

Re: one CPU core at 100% usage after kernel upgrade, suspecting i915

The issue is present in the current version 5.3.7. I tried bisecting to see at which kernel the issue appeared first. If there were no versions between 5.2.9 and 5.3.1, then the answer is: 5.3.1.

#4 2019-10-22 20:24:07

loqs
Member
Registered: 2014-03-06
Posts: 17,378

Re: one CPU core at 100% usage after kernel upgrade, suspecting i915

Try from the ALA https://archive.archlinux.org/packages/ … pkg.tar.xz
If the issue was introduced by 5.3, I would suggest testing linux-mainline 5.4-rc4 or drm-tip to see if the issue has already been fixed.
If not, I can provide step-by-step instructions for bisecting between 5.2 and 5.3.

#5 2019-10-22 20:49:48

BeefEater
Member
Registered: 2019-03-20
Posts: 8

Re: one CPU core at 100% usage after kernel upgrade, suspecting i915

5.2.14 does not have this issue:

[root@nuc2 ~]# uname -a
Linux nuc2 5.2.14-arch2-1-ARCH #1 SMP PREEMPT Thu Sep 12 10:42:38 UTC 2019 x86_64 GNU/Linux
[root@nuc2 ~]# uptime
 16:42:26 up 4 min,  1 user,  load average: 0.03, 0.10, 0.05

Apparently, 5.3.1 is the first version where this issue is present. @loqs, I would much appreciate a reference to instructions on how to proceed with compiling and testing linux-mainline. Thank you.

Also, it appears that the issue described in forum thread #250155 is similar to this one or possibly even the same one.

#6 2019-10-22 21:07:28

seth
Member
Registered: 2012-09-03
Posts: 51,311

Re: one CPU core at 100% usage after kernel upgrade, suspecting i915

Just install the linked AUR package: https://wiki.archlinux.org/index.php/Ar … g_packages and boot it (adding it to your bootloader will depend on your bootloader)

#7 2019-10-23 03:37:32

wangqr
Member
Registered: 2017-09-14
Posts: 6

Re: one CPU core at 100% usage after kernel upgrade, suspecting i915

I'm linked here from #250155. I've just tried linux-mainline-5.4rc4-1 from AUR, and the issue is present.

#8 2019-10-23 10:28:14

loqs
Member
Registered: 2014-03-06
Posts: 17,378

Re: one CPU core at 100% usage after kernel upgrade, suspecting i915

The following assumes the base-devel group and git are installed. I recommend enabling Makepkg#Parallel_compilation to reduce build times.

$ git clone git://git.archlinux.org/svntogit/packages.git --single-branch --branch "packages/linux"
$ cd packages/trunk
$ git checkout 755b012b4c22d24afcb10c5a38f82d29ea6eb156 #5.2.14.arch2-1
$ cd ../..
$ cp -r packages/trunk linux-git
$ rm -rf packages
$ cd linux-git
# Replace the PKGBUILD and 90-linux.hook with the ones below
$ makepkg -rsi # This confirms that 5.2, as built on your system, does not have the issue. Update the bootloader for the new kernel if needed

$ cd src/linux
$ git checkout v5.3
$ cd ../..
$ makepkg -ersi # This confirms that 5.3, as built on your system, does have the issue. Select the default for all prompted options

$ cd src/linux
$ git bisect start
$ git bisect good v5.2
$ git bisect bad v5.3
$ cd ../..
$ makepkg -ersif

$ cd src/linux
$ git bisect $result # Substitute good or bad here
$ cd ../..
$ makepkg -ersif # Repeat these four lines, testing each generated kernel, until git has found the bad commit
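
To decide between good and bad after booting each test kernel, something like the following could help. This is only a sketch: the helper name `classify_load` and the 0.5 threshold are assumptions, not part of the instructions above. It relies on the symptom reported in this thread, where the 1-minute load average sits near 1.0 on an affected kernel and near 0.0 otherwise.

```shell
# classify_load LOADAVG_FILE [THRESHOLD]
# Hypothetical helper: print "bad" if the 1-minute load average (first field
# of the file) is at or above THRESHOLD, otherwise "good".
# The default THRESHOLD of 0.5 is an assumption; adjust it for your system.
classify_load() {
  awk -v limit="${2:-0.5}" \
    '{ print (($1 + 0 >= limit + 0) ? "bad" : "good"); exit }' "$1"
}

# Usage on the running system, after letting it idle for a few minutes:
#   classify_load /proc/loadavg
# Then, in src/linux, pass the printed word to git bisect.
```

Let the machine idle first, since the load average right after boot is still dominated by startup activity.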

PKGBUILD

# Maintainer: Boohbah <boohbah at gmail.com>
# Contributor: Tobias Powalowski <tpowa@archlinux.org>
# Contributor: Thomas Baechler <thomas@archlinux.org>
# Contributor: Jonathan Chan <jyc@fastmail.fm>
# Contributor: misc <tastky@gmail.com>
# Contributor: NextHendrix <cjones12 at sheffield.ac.uk>

pkgbase=linux-git
_srcname=linux
pkgver=5.2.r0.g0ecfebd2b524
pkgrel=1
arch=('x86_64')
url="https://www.kernel.org/"
license=('GPL2')
makedepends=('kmod' 'inetutils' 'bc' 'libelf')
options=('!strip')
source=('git+https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git#tag=v5.2'
        #'git+https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git#tag=X.X.Y'
        'config'   # the main kernel config file
        '60-linux.hook'  # pacman hook for depmod
        '90-linux.hook'  # pacman hook for initramfs regeneration
        'linux.preset'   # standard config files for mkinitcpio ramdisk
)

sha256sums=('SKIP'
            'e0d0f140128a8574217701e61e874a0a108f3b8cd0f6e35d8b16afe897999f8e'
            'ae2e95db94ef7176207c690224169594d49445e04249d2499e9d2fbc117a0b21'
            '75f99f5239e03238f88d1a834c50043ec32b1dc568f2cc291b07d04718483919'
            'ad6344badc91ad0630caacde83f7f9b97276f80d26a20619a87952be65492c65')

_kernelname=${pkgbase#linux}
: ${_kernelname:=-ARCH}

pkgver() {
  cd "${_srcname}"

  git describe --long | sed -E 's/^v//;s/([^-]*-g)/r\1/;s/-/./g;s/\.rc/rc/'
}

prepare() {
  cd ${_srcname}

  cp -Tf ../config .config

  # set localversion to git commit
  sed -i "s|CONFIG_LOCALVERSION=.*|CONFIG_LOCALVERSION=\"${_kernelname}\"|g" ./.config
  sed -i "s|^.*CONFIG_LOCALVERSION_AUTO.*|CONFIG_LOCALVERSION_AUTO=y|" ./.config

  # don't run depmod on 'make install'. We'll do this ourselves in packaging
#  git tracks scripts/depmod.sh so do not change it when using the existing source dir for bisection
#  sed -i '2iexit 0' scripts/depmod.sh

  # get kernel version
  make prepare

  # load configuration
  # Configure the kernel. Replace the line below with one of your choice.
  #make menuconfig # CLI menu for configuration
  #make nconfig # new CLI menu for configuration
  #make xconfig # X-based configuration
  #make oldconfig # using old config from previous kernel version
  make olddefconfig # old config from previous kernel, defaults for new options
  # ... or manually edit .config
}

build() {
  cd ${_srcname}

  make bzImage modules
}

_package() {
  pkgdesc="The Linux kernel and modules (git version)"
  depends=('coreutils' 'linux-firmware' 'kmod' 'mkinitcpio>=0.7')
  optdepends=('crda: to set the correct wireless channels of your country')
  backup=("etc/mkinitcpio.d/${pkgbase}.preset")
  install=linux.install

  cd ${_srcname}

  # get kernel version
  _kernver="$(make kernelrelease)"
  _kernver=${_kernver%-dirty} #https://bbs.archlinux.org/viewtopic.php?id=236702
  _basekernel="$(make kernelversion)"
  _basekernel=${_basekernel%.*}

  mkdir -p "${pkgdir}"/{boot,usr/lib/modules}
  make INSTALL_MOD_PATH="${pkgdir}/usr" modules_install
  cp arch/x86/boot/bzImage "${pkgdir}/boot/vmlinuz-${pkgbase}"

  # make room for external modules
  local _extramodules="extramodules-${_basekernel}${_kernelname}"
  ln -s "../${_extramodules}" "${pkgdir}/usr/lib/modules/${_kernver}/extramodules"

  # add real version for building modules and running depmod from hook
  echo "${_kernver}" |
    install -Dm644 /dev/stdin "${pkgdir}/usr/lib/modules/${_extramodules}/version"

  # remove build and source links
  rm "${pkgdir}"/usr/lib/modules/${_kernver}/{source,build}

  # now we call depmod...
  depmod -b "${pkgdir}/usr" -F System.map "${_kernver}"

  # add vmlinux
  install -Dt "${pkgdir}/usr/lib/modules/${_kernver}/build" -m644 vmlinux

  # sed expression for following substitutions
  local _subst="
    s|%PKGBASE%|${pkgbase}|g
    s|%KERNVER%|${_kernver}|g
    s|%EXTRAMODULES%|${_extramodules}|g
  "

  # hack to allow specifying an initially nonexisting install file
  sed "${_subst}" "${startdir}/${install}" > "${startdir}/${install}.pkg"
  true && install=${install}.pkg

  # install mkinitcpio preset file
  sed "${_subst}" ../linux.preset |
    install -Dm644 /dev/stdin "${pkgdir}/etc/mkinitcpio.d/${pkgbase}.preset"

  # install pacman hooks
  sed "${_subst}" ../60-linux.hook |
    install -Dm644 /dev/stdin "${pkgdir}/usr/share/libalpm/hooks/60-${pkgbase}.hook"
  sed "${_subst}" ../90-linux.hook |
    install -Dm644 /dev/stdin "${pkgdir}/usr/share/libalpm/hooks/90-${pkgbase}.hook"
}

_package-headers() {
  pkgdesc="Header files and scripts for building modules for Linux kernel (git version)"

  cd ${_srcname}
  local _builddir="${pkgdir}/usr/lib/modules/${_kernver}/build"

  install -Dt "${_builddir}" -m644 Makefile .config Module.symvers
  install -Dt "${_builddir}/kernel" -m644 kernel/Makefile

  mkdir "${_builddir}/.tmp_versions"

  cp -t "${_builddir}" -a include scripts

  install -Dt "${_builddir}/arch/x86" -m644 arch/x86/Makefile
  install -Dt "${_builddir}/arch/x86/kernel" -m644 arch/x86/kernel/asm-offsets.s

  cp -t "${_builddir}/arch/x86" -a arch/x86/include

  install -Dt "${_builddir}/drivers/md" -m644 drivers/md/*.h
  install -Dt "${_builddir}/net/mac80211" -m644 net/mac80211/*.h

  # http://bugs.archlinux.org/task/13146
  install -Dt "${_builddir}/drivers/media/i2c" -m644 drivers/media/i2c/msp3400-driver.h

  # http://bugs.archlinux.org/task/20402
  install -Dt "${_builddir}/drivers/media/usb/dvb-usb" -m644 drivers/media/usb/dvb-usb/*.h
  install -Dt "${_builddir}/drivers/media/dvb-frontends" -m644 drivers/media/dvb-frontends/*.h
  install -Dt "${_builddir}/drivers/media/tuners" -m644 drivers/media/tuners/*.h

  # add xfs and shmem for aufs building
  mkdir -p "${_builddir}"/{fs/xfs,mm}

  # copy in Kconfig files
  find . -name Kconfig\* -exec install -Dm644 {} "${_builddir}/{}" \;

  # add objtool for external module building with the STACK_VALIDATION option enabled
  install -Dt "${_builddir}/tools/objtool" tools/objtool/objtool

  # remove unneeded architectures
  local _arch
  for _arch in "${_builddir}"/arch/*/; do
    [[ ${_arch} == */x86/ ]] && continue
    rm -r "${_arch}"
  done

  # remove files already in linux-docs package
  rm -r "${_builddir}/Documentation"

  # remove now broken symlinks
  find -L "${_builddir}" -type l -printf 'Removing %P\n' -delete

  # Fix permissions
  chmod -R u=rwX,go=rX "${_builddir}"

  # strip scripts directory
  local _binary _strip
  while read -rd '' _binary; do
    case "$(file -bi "${_binary}")" in
      *application/x-sharedlib*)  _strip="${STRIP_SHARED}"   ;; # Libraries (.so)
      *application/x-archive*)    _strip="${STRIP_STATIC}"   ;; # Libraries (.a)
      *application/x-executable*) _strip="${STRIP_BINARIES}" ;; # Binaries
      *) continue ;;
    esac
    /usr/bin/strip ${_strip} "${_binary}"
  done < <(find "${_builddir}/scripts" -type f -perm -u+w -print0 2>/dev/null)
}

_package-docs() {
  pkgdesc="Kernel hackers manual - HTML documentation that comes with the Linux kernel (git version)"

  cd ${_srcname}
  local _builddir="${pkgdir}/usr/lib/modules/${_kernver}/build"

  mkdir -p "${_builddir}"
  cp -t "${_builddir}" -a Documentation

  # Fix permissions
  chmod -R u=rwX,go=rX "${_builddir}"
}

pkgname=("${pkgbase}" "${pkgbase}-headers" "${pkgbase}-docs")
for _p in ${pkgname[@]}; do
  eval "package_${_p}() {
    $(declare -f "_package${_p#${pkgbase}}")
    _package${_p#${pkgbase}}
  }"
done

# vim:set ts=8 sts=2 sw=2 et:
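
As a side note on the pkgver() function in the PKGBUILD above: the sketch below wraps the same sed expression in a standalone function so that its effect on a `git describe --long` string can be checked without a kernel tree. The wrapper name `describe_to_pkgver` is made up for illustration, and the second input in the usage comments uses a hypothetical commit hash.

```shell
# describe_to_pkgver DESCRIBE_STRING
# Apply the PKGBUILD's sed expression to a `git describe --long` style string:
#   1. strip the leading "v"
#   2. prefix the commit count with "r"
#   3. turn "-" into "."
#   4. rejoin ".rc" to the version number
describe_to_pkgver() {
  printf '%s\n' "$1" |
    sed -E 's/^v//;s/([^-]*-g)/r\1/;s/-/./g;s/\.rc/rc/'
}

# Usage:
#   describe_to_pkgver v5.2-0-g0ecfebd2b524     # the tag used in the PKGBUILD
#   describe_to_pkgver v5.4-rc4-12-gabcdef0     # hypothetical hash
```

For the v5.2 tag this yields the pkgver value already written in the PKGBUILD, 5.2.r0.g0ecfebd2b524.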

90-linux.hook

[Trigger]
Type = File
Operation = Install
Operation = Upgrade
Target = boot/vmlinuz-%PKGBASE%
Target = usr/lib/initcpio/*

[Action]
Description = Updating %PKGBASE% initcpios...
When = PostTransaction
Exec = /usr/bin/mkinitcpio -p %PKGBASE%

#9 2019-10-23 17:47:22

BeefEater
Member
Registered: 2019-03-20
Posts: 8

Re: one CPU core at 100% usage after kernel upgrade, suspecting i915

Confirming that this issue has not been fixed in 5.4.0-rc4-mainline:

Linux nuc2 5.4.0-rc4-mainline #1 SMP PREEMPT Tue Oct 22 23:34:27 EDT 2019 x86_64 GNU/Linux
[root@nuc2 ~]# uptime
 13:42:38 up 15 min,  1 user,  load average: 1.00, 0.98, 0.70

Let me see what I can find from bisecting as suggested by loqs. Thanks loqs.

#10 2019-10-25 01:23:25

BeefEater
Member
Registered: 2019-03-20
Posts: 8

Re: one CPU core at 100% usage after kernel upgrade, suspecting i915

It is commit 6cfe7ec02e854278fb341e62db54d49a2b199c62 that results in one of the CPU cores being 100% busy.

What should I do next?

#11 2019-10-25 01:34:22

loqs
Member
Registered: 2014-03-06
Posts: 17,378

Re: one CPU core at 100% usage after kernel upgrade, suspecting i915

See https://01.org/linuxgraphics/documentat … eport-bugs on how to file the bug report upstream

#12 2019-10-25 02:40:48

BeefEater
Member
Registered: 2019-03-20
Posts: 8

Re: one CPU core at 100% usage after kernel upgrade, suspecting i915

The regression has been reported at https://bugs.freedesktop.org/show_bug.cgi?id=112125.
