Hello All!
Earlier today, I did a full system upgrade. This moved me up to kernel 6.7.4-arch1-1. I rebooted into the fresh kernel, only to be greeted with a wall of:
[ 1445.056460] ixgbe 0000:06:00.1: Warning firmware error detected FWSM: 0x00298040
[ 1446.549827] ixgbe 0000:06:00.0: Warning firmware error detected FWSM: 0x00298040
[ 1447.189749] ixgbe 0000:06:00.1: Warning firmware error detected FWSM: 0x00298040
[ 1448.683055] ixgbe 0000:06:00.0: Warning firmware error detected FWSM: 0x00298040
[ 1449.323017] ixgbe 0000:06:00.1: Warning firmware error detected FWSM: 0x00298040
[ 1450.816322] ixgbe 0000:06:00.0: Warning firmware error detected FWSM: 0x00298040
[ 1451.456314] ixgbe 0000:06:00.1: Warning firmware error detected FWSM: 0x00298040
[ 1451.648620] ixgbe 0000:06:00.1: Warning firmware error detected FWSM: 0x00298040
[ 1452.949625] ixgbe 0000:06:00.0: Warning firmware error detected FWSM: 0x00298040
[ 1453.589604] ixgbe 0000:06:00.1: Warning firmware error detected FWSM: 0x00298040
[ 1455.082924] ixgbe 0000:06:00.0: Warning firmware error detected FWSM: 0x00298040
[ 1455.722874] ixgbe 0000:06:00.1: Warning firmware error detected FWSM: 0x00298040
[ 1457.219503] ixgbe 0000:06:00.0: Warning firmware error detected FWSM: 0x00298040
[ 1457.856155] ixgbe 0000:06:00.1: Warning firmware error detected FWSM: 0x00298040
[ 1459.352789] ixgbe 0000:06:00.0: Warning firmware error detected FWSM: 0x00298040
[ 1459.989486] ixgbe 0000:06:00.1: Warning firmware error detected FWSM: 0x00298040
[ 1461.482743] ixgbe 0000:06:00.0: Warning firmware error detected FWSM: 0x00298040
[ 1462.122720] ixgbe 0000:06:00.1: Warning firmware error detected FWSM: 0x00298040
[ 1463.616050] ixgbe 0000:06:00.0: Warning firmware error detected FWSM: 0x00298040
[ 1464.255992] ixgbe 0000:06:00.1: Warning firmware error detected FWSM: 0x00298040
ixgbe is the driver for my Intel network adapter (Intel X540-AT2). Some reading online suggests that I need to manually update my ixgbe driver:
https://forum.proxmox.com/threads/pve-6 … ors.58592/
Okay, no problem! I download the source, go to compile, and...
make[1]: Entering directory '/usr/lib/modules/6.7.4-arch1-1/build'
CC [M] /home/c/intel/ixgbe-5.19.9/src/ixgbe_main.o
/home/c/intel/ixgbe-5.19.9/src/ixgbe_main.c: In function ‘ixgbe_clean_rx_irq’:
/home/c/intel/ixgbe-5.19.9/src/ixgbe_main.c:2389:17: error: implicit declaration of function ‘xdp_do_flush_map’; did you mean ‘xdp_do_flush’? [-Werror=implicit-function-declaration]
2389 | xdp_do_flush_map();
| ^~~~~~~~~~~~~~~~
| xdp_do_flush
/home/c/intel/ixgbe-5.19.9/src/ixgbe_main.c: In function ‘ixgbe_probe’:
/home/c/intel/ixgbe-5.19.9/src/ixgbe_main.c:12707:9: error: implicit declaration of function ‘pci_enable_pcie_error_reporting’ [-Werror=implicit-function-declaration]
12707 | pci_enable_pcie_error_reporting(pdev);
... more implicit declarations
It looks like the most recent Intel drivers are a few minor versions behind the current Linux kernel. A quick check verifies that xdp_do_flush_map was removed in 6.7.
AFAIK, my only recourse at this point is to downgrade my kernel while I wait for intel to catch up. However, looking up "pinning kernel versions" yields only this forum post, which does not cover downgrading your kernel.
To be clear:
Am I okay to just:
pacman -S linux=6.1
? Is there any way I can formalize this freeze? How do I list the available versions for the linux package?
Offline
I got version names from GitLab, but when I test whether pacman would even _let_ me install them, it fails:
$ sudo pacman -S linux=6.6.1.arch1-1
error: target not found: linux=6.6.1.arch1-1
Offline
On the phone, so keeping this short: look at https://wiki.archlinux.org/title/downgrading_packages
Offline
On the phone, so keeping this short: look at https://wiki.archlinux.org/title/downgrading_packages
Thank you! This is very helpful.
How can I verify if a given package also needs to be moved back? Does linux-firmware need to? What about linux-firmware-whence?
Offline
Wouldn't it be easier to just switch to linux-lts? Or am I missing something?
Offline
Wouldn't it be easier to just switch to linux-lts? Or am I missing something?
Probably, yeah
As for dependencies: it depends. The few times I had to downgrade the kernel, I downgraded linux and linux-firmware, and that was enough. But I don't even know what linux-firmware-whence is...
Offline
How can I verify if a given package also needs to be moved back? Does linux-firmware need to? What about linux-firmware-whence?
No, linux-firmware and linux-firmware-whence do not need to be downgraded. linux-headers and anything providing a kernel module, such as nvidia, would.
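For the downgrade itself, the wiki page linked earlier in the thread installs older versions from the local pacman cache with pacman -U. Below is a minimal sketch for listing which versions of a package are still cached. It assumes the default cache directory /var/cache/pacman/pkg and the usual name-pkgver-pkgrel-arch.pkg.tar.* file naming; list_cached is a hypothetical helper name, not a pacman command.

```shell
# list_cached PKG [DIR]
# Print cached package files for exactly PKG, sorted by version (GNU sort -V).
# Assumes file names of the form name-pkgver-pkgrel-arch.pkg.tar.<ext>.
list_cached() {
    pkg="$1"
    dir="${2:-/var/cache/pacman/pkg}"   # assumed default cache location
    # Requiring a digit right after "PKG-" keeps "linux" from also matching
    # linux-headers, linux-lts or linux-firmware; the trailing "$" skips .sig files.
    ls "$dir" \
        | grep -E "^${pkg}-[0-9][^-]*-[^-]+-[^-]+\.pkg\.tar\.[a-z0-9]+\$" \
        | sort -V
}
```

Running list_cached linux and list_cached linux-headers should show the candidates; a matching pair can then be installed together with sudo pacman -U followed by both file paths. To keep the next pacman -Syu from upgrading them again, an IgnorePkg = linux linux-headers line in the [options] section of /etc/pacman.conf should hold the versions until it is removed.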
Offline
I successfully downgraded to 6.6.4, then noticed Fuxino's suggestion and re-upgraded to 6.6.16 (current LTS). Thank you all for your suggestions; I was able to figure it out.
However, it appears that the second symbol used by Intel, pci_enable_pcie_error_reporting, was removed in 6.6, so both my stale kernel and LTS are too new for me to compile ixgbe-5.19.9 from source. I don't understand why I didn't get these errors before the system-wide update, as I was apparently already running an unsupported kernel.
Part of me suspects that, since I got through several months of hosting consistent traffic without an issue, I should just ignore the errors and carry on. The alternative is to build and install linux 6.5.
Offline
Actually...
There's another option.
On the 6.6 LTS kernel, there are only two problematic symbols:
* pci_disable_pcie_error_reporting
* pci_enable_pcie_error_reporting
#if defined(CONFIG_PCIEAER)
/* PCIe port driver needs this function to enable AER */
int pci_enable_pcie_error_reporting(struct pci_dev *dev);
int pci_disable_pcie_error_reporting(struct pci_dev *dev);
int pci_aer_clear_nonfatal_status(struct pci_dev *dev);
#else
static inline int pci_enable_pcie_error_reporting(struct pci_dev *dev)
{
	return -EINVAL;
}
static inline int pci_disable_pcie_error_reporting(struct pci_dev *dev)
{
	return -EINVAL;
}
static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
{
	return -EINVAL;
}
#endif
From the declarations above, we see that this functionality is gated behind CONFIG_PCIEAER. My kernel _does_ have this enabled. However, in theory, the driver should also work on kernels built without this option, where both functions are stubs that simply return -EINVAL. Peeking into the driver source code, the return value of these calls is ignored, which implies that the calls can fail without causing problems. If I comment out the two AER calls, the driver does compile.
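Whether a given kernel was built with CONFIG_PCIEAER can be checked against its config. A minimal sketch, assuming the running kernel exposes its config at /proc/config.gz (that depends on how the kernel was built); kconfig_builtin is a hypothetical helper name:

```shell
# kconfig_builtin OPTION [CONFIG_FILE]
# Exit status 0 if OPTION is set to "y" in the given kernel config,
# non-zero otherwise (including "=m" and "is not set").
kconfig_builtin() {
    opt="$1"
    cfg="${2:-/proc/config.gz}"   # assumed location of the running kernel's config
    # Decompress only when the file name says it is gzipped.
    case "$cfg" in
        *.gz) zcat -- "$cfg" ;;
        *)    cat -- "$cfg" ;;
    esac | grep -qx "${opt}=y"    # -x: the whole line must match
}
```

For example, kconfig_builtin CONFIG_PCIEAER && echo "AER built in" prints the message only when the option is set to y.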
I'm unsure what's riskier:
1.) Running with drivers that have worked for several months, but keep logging a firmware error
2.) Installing my modified drivers
Offline