Pages: 1
Topic closed
It seems that the latest kernel module in the mainline kernel is 4.0-0. I would like to try to install 4.9-2.2.4.0.
I've tried to get it to compile but most of the compile scripts are strictly tied to a supported OS, which Arch is not.
What is the best way to install the latest Mellanox kernel modules?
Offline
Did you try building the DKMS module from kernel-mft-4.15.1-9.src.rpm inside MLNX_OFED_SRC-4.9-2.2.4.0.tgz, which I found under RHEL/CentOS -> RHEL/CentOS 82 -> x86_64 -> Sources?
Offline
Nice! I got it installed. Now I need to test it and see if it fixes the Code 43 I was getting in my Windows VM. I'll let you know.
Thank you!
Offline
It did not. kernel-mft-4.15.1-9.src.rpm installs the mst_pci module, which seems to be for the tooling. What I need is the mlx4_core module updated. That comes in mlnx-ofa_kernel-4.9-OFED.4.9.2.2.4.1.src.rpm, which is the same source I was originally trying to compile. Unfortunately it does not come with a dkms.conf file.
During ./configure I keep running into:
checking for external module build target... configure: error: kernel module make failed; check config.log for details
config.log: https://gist.github.com/FallingSnow/d3b … 15fff14a62
configure:5502: cp conftest.c build && env CROSS_COMPILE= make -d /home/admin/MLNX_OFED_SRC-4.9-2.2.4.0/SRPMS/mlnx-ofa_kernel-4.9/compat/build/ MLNX_KERNEL_TEST=conftest.i LD=ld CC=gcc -f /home/admin/MLNX_OFED_SRC-4.9-2.2.4.0/SRPMS/mlnx-ofa_kernel-4.9/compat/build/Makefile MLNX_LINUX_CONFIG=/usr/lib/modules/5.10.14-arch1-1/build/.config LINUXINCLUDE=-include generated/autoconf.h -I/usr/lib/modules/5.10.14-arch1-1/build/arch/x86/include -Iarch/x86/include/generated -Iinclude -I/usr/lib/modules/5.10.14-arch1-1/build/arch/x86/include/uapi -Iarch/x86/include/generated/uapi -I/usr/lib/modules/5.10.14-arch1-1/build/include -I/usr/lib/modules/5.10.14-arch1-1/build/include/uapi -Iinclude/generated/uapi -I/usr/lib/modules/5.10.14-arch1-1/build/arch/x86/include -Iarch/x86/include/generated -I/usr/lib/modules/5.10.14-arch1-1/build/arch/x86/include -I/usr/lib/modules/5.10.14-arch1-1/build/arch/x86/include/generated -I/usr/lib/modules/5.10.14-arch1-1/build/include -I/usr/lib/modules/5.10.14-arch1-1/build/include -I/usr/lib/modules/5.10.14-arch1-1/build/include2 -include /usr/lib/modules/5.10.14-arch1-1/build/include/linux/kconfig.h -o tmp_include_depends -o scripts -o include/config/MARKER -C /usr/lib/modules/5.10.14-arch1-1/build EXTRA_CFLAGS=-Werror-implicit-function-declaration -Wno-unused-variable -Wno-uninitialized CROSS_COMPILE= M=/home/admin/MLNX_OFED_SRC-4.9-2.2.4.0/SRPMS/mlnx-ofa_kernel-4.9/compat/build >/dev/null 2>build/output.log; [ 0 -ne 0 ] && cat build/output.log 1>&2 && false || config/warning_filter.sh build/output.log
make[2]: *** No rule to make target '/home/admin/MLNX_OFED_SRC-4.9-2.2.4.0/SRPMS/mlnx-ofa_kernel-4.9/compat/build//home/admin/MLNX_OFED_SRC-4.9-2.2.4.0/SRPMS/mlnx-ofa_kernel-4.9/compat/build/'. Stop.
make[1]: *** [scripts/Makefile.build:471: __build] Error 2
make: *** [Makefile:1805: /home/admin/MLNX_OFED_SRC-4.9-2.2.4.0/SRPMS/mlnx-ofa_kernel-4.9/compat/build] Error 2
make[2]: *** No rule to make target '/home/admin/MLNX_OFED_SRC-4.9-2.2.4.0/SRPMS/mlnx-ofa_kernel-4.9/compat/build//home/admin/MLNX_OFED_SRC-4.9-2.2.4.0/SRPMS/mlnx-ofa_kernel-4.9/compat/build/'. Stop.
make[1]: *** [scripts/Makefile.build:471: __build] Error 2
make: *** [Makefile:1805: /home/admin/MLNX_OFED_SRC-4.9-2.2.4.0/SRPMS/mlnx-ofa_kernel-4.9/compat/build] Error 2
configure:5505: $? = 0
configure:5507: test -s build/conftest.i
configure:5510: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "compat_mlnx"
| #define PACKAGE_TARNAME "compat_mlnx"
| #define PACKAGE_VERSION "2.3"
| #define PACKAGE_STRING "compat_mlnx 2.3"
| #define PACKAGE_BUGREPORT "http://support.mellanox.com/SupportWeb/service_center/SelfService"
| #define PACKAGE_URL ""
| #define PACKAGE "compat_mlnx"
| #define VERSION "2.3"
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define SIZEOF_UNSIGNED_LONG_LONG 8
| /* end confdefs.h. */
|
| #include <linux/kernel.h>
|
| int
| main (void)
| {
|
| ;
| return 0;
| }
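To cut a config.log like the one above down to the lines that matter, a small filter can help. This is only a sketch: the function name make_errors is made up for illustration, and it assumes the usual "make[N]: *** ..." error format seen in the log.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct make error line once.
# Reads the files given as arguments, or stdin when none are given.
make_errors() {
    grep -E '^make(\[[0-9]+\])?: \*\*\*' "$@" | sort -u
}

# Example: make_errors config.log
```

In the log above this would collapse the repeated "No rule to make target" and "Error 2" lines into one copy each.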
Offline
I've got it to run with a tweak to the source while trying to configure/compile on linux-lts. Now I'm running into an unknown symbol error.
$ sudo modprobe mlx4_core
modprobe: ERROR: could not insert 'mlx4_core': Unknown symbol in module, or unknown parameter (see dmesg)
modprobe: ERROR: Error running install command '/usr/bin/modprobe --ignore-install mlx4_core && (if [ -f /usr/lib/rdma/mlx4-setup.sh -a -f /etc/rdma/mlx4.conf ]; then /usr/lib/rdma/mlx4-setup.sh < /etc/rdma/mlx4.conf; fi; /usr/bin/modprobe mlx4_en; if /usr/bin/modinfo mlx4_ib > /dev/null 2>&1; then /usr/bin/modprobe mlx4_ib; fi)' for module mlx4_core: retcode 1
modprobe: ERROR: could not insert 'mlx4_core': Invalid argument
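Since modprobe only says "see dmesg", it can be handy to pull the names of the missing symbols out of the kernel log. A minimal sketch, assuming the kernel's usual "Unknown symbol NAME (err N)" message format; the function name unknown_symbols is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list the symbols the kernel could not resolve.
# Reads kernel log text on stdin and prints each symbol name once.
unknown_symbols() {
    sed -n 's/.*Unknown symbol \([A-Za-z0-9_]*\).*/\1/p' | sort -u
}

# Example (reading the kernel log may require root):
#   dmesg | unknown_symbols
```

A symbol that belongs to another module from the same build usually means that module has to be loaded first.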
Offline
Was able to get it installed and working.
To compile it you must be on Linux LTS. I then compiled with:
./configure -j32 --with-mlx4-mod --with-user_access-mod --with-user_mad-mod --with-addr_trans-mod --with-pa-mr --with-core-mod
Once compiled you first need to insert the compat module (mlx_compat.ko) or you will get "mlx4_core: Unknown symbol backport_dependency_symbol (err -2)" in your dmesg.
# insmod MLNX_OFED_SRC-4.9-2.2.4.0/SRPMS/mlnx-ofa_kernel-4.9/compat/mlx_compat.ko
# insmod MLNX_OFED_SRC-4.9-2.2.4.0/SRPMS/mlnx-ofa_kernel-4.9/drivers/net/ethernet/mellanox/mlx4/mlx4_core.ko num_vfs=1 port_type_array=1,1 probe_vf=0
Then make sure to install the correct Windows drivers (WinOF) from the Mellanox website on your Windows VM: https://www.mellanox.com/products/adapt … ws/winof-2
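The load order above can be wrapped in a small script so the compat module always goes in first. This is a sketch, not a tested installer: the names SRC_DIR, DRY_RUN, run and load_mlx4 are made up here, the paths and module parameters are copied from the insmod lines above, and by default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Assumed location of the built tree (same relative path as in the post above).
SRC_DIR="${SRC_DIR:-MLNX_OFED_SRC-4.9-2.2.4.0/SRPMS/mlnx-ofa_kernel-4.9}"
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 as root to run them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Insert mlx_compat.ko first, then mlx4_core.ko; stop if the first step fails.
load_mlx4() {
    run insmod "$SRC_DIR/compat/mlx_compat.ko" || return 1
    run insmod "$SRC_DIR/drivers/net/ethernet/mellanox/mlx4/mlx4_core.ko" \
        num_vfs=1 port_type_array=1,1 probe_vf=0
}

# Example: load_mlx4
```

Keeping the dry run as the default makes it easy to check the paths before anything touches the running kernel.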
Offline
Could you share which tweak you used to get the build done?
Thanks in advance.
Offline
Oh man I have no idea what I did lol. I remember I followed the errors though. I might have just commented out some failing tests.
Offline
You may have your modified sources lying around?
Or, to do it quick'n'dirty, you may still have the 3 modules and could share them?
Last edited by efeu86 (2021-05-21 12:00:54)
Offline
I don't. I've wiped those drives to reuse them elsewhere.
What error are you getting? Maybe I can look at the source and the error and push you in the right direction.
Offline
For now I tried to compile on an Ubuntu 20.04 VM, because it is officially supported, but I'm getting the same errors as you:
https://pastebin.com/LKghCpUy
Tried different versions, but the errors are always the same. Trying to compile mlnx-ofed-kernel-4.9 from the MLNX_OFED_LINUX-4.9-3.1.5.0 package.
Offline
It looks like you're trying to compile on linux 5.10. I believe MLNX_OFED_LINUX-4.9.xxxxx modules will only work on older versions of linux-lts. Try kernel 5.4.XXX and go from there. If it works, move up to the next LTS version, if not, move down. See https://www.kernel.org/ for versions.
Also try different versions of the OFED modules. The newer MLNX_OFED 5.X.X branch might compile on the newer kernel but they drop support for some legacy equipment. See https://www.mellanox.com/support/mlnx-o … sw_drivers
I did a little looking and Ubuntu 20.04 seems to come with linux 5.4? So that is most likely the issue.
Last edited by FallenSnow (2021-05-28 20:17:15)
Offline
Hmm, looking at your logfile, you compiled (somehow) against kernel 5.10.14-arch1-1. Or have you downgraded?
Compiling against kernel 5.4 is no problem. It only happens on kernels 5.6 and up. The package I took is the new LTS driver for cx3(pro), and should work with kernel 5.8 and 5.10, but actually it does not.
I think I will run kernel 5.4 for now and wait until Mellanox fixes their "new" broken LTS driver.
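As a rough guard before starting a long build, the running kernel can be compared against that boundary. This is a sketch only: the function name kernel_older_than is hypothetical, and the 5.6 cut-off is just the observation reported above, not anything documented by Mellanox.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a kernel release is older than MAJOR.MINOR.
# $1 = release string (e.g. from uname -r), $2 = major, $3 = minor
kernel_older_than() {
    ver=${1%%-*}        # drop the distro suffix, e.g. "-arch1-1"
    major=${ver%%.*}    # text before the first dot
    rest=${ver#*.}      # text after the first dot
    minor=${rest%%.*}   # second version component
    [ "$major" -lt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -lt "$3" ]; }
}

# Example:
#   if kernel_older_than "$(uname -r)" 5 6; then
#       echo "kernel is below 5.6, the 4.9 sources may build"
#   else
#       echo "kernel is 5.6 or newer, the build was reported to fail here"
#   fi
```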
Offline
Hi!
I'm fighting with this same problem and it's driving me totally nuts.
I've spent the better part of a week trying to find a way to make the driver compile, with no success.
The kernel version is 5.14.14, and I'm using the driver's Debian source package.
The funny thing is: At first attempt (when I did of course not take any notes) I was somehow able to compile the driver, and I'm pretty sure it was one of the latest versions -- not the LTS version. And, it worked -- even though the card is Connect-X 3 Pro which, in theory, should not be supported since driver version 5.1.
But now I simply can't get the compilation to go through any more.
I know this comment does not shed much light on anything, but I'm trying to reach out to anyone else struggling with this.
Offline
So, it appears that while all previous versions refuse to compile on any newer kernel, version 5.5 of the Mellanox OFED drivers actually does compile -- at least on kernel 5.14.14.
As I mentioned above, the funny thing is that in theory support for Connect-X 3 Pro was dropped starting from version 5.1 -- yet 5.5 still does actually appear to work with Connect-X 3 Pro just fine.
Can't confirm yet but at least now the compilation is proceeding. I'll post more details a bit later.
Offline
Hi, any news?
I just want my Windows VMs to be able to use SR-IOV damn it!
Offline
@rootpeer as Jyri-poika pointed out, the newer drivers that "don't support" generations below Connect-X 4 do still seem to work with Connect-X 3 devices. I've tested this on my Debian system. Granted, I didn't have to compile the drivers myself, but the newest driver does seem to work on 5.15.12 with Connect-X 3 cards at least. You just have to get it to compile on Arch Linux.
Offline
Can't you guys build an AUR package for us compiling noobs?
Please!
Offline
I'm totally desperate with the MLNX driver and a CX3 card on TrueNAS SCALE (TrueNAS-22.02.0). I know this is the Arch forum, but maybe you can still help. First I went through this helpful post: https://xtremeownage.com/2022/03/26/tru … nfiniband/ . I could see the CX3 card and set up all the networks, including VMs with Linux. However, Windows VMs report Code 43.
I have installed the LTS MLNX_OFED_LINUX-5.4-3.1.0.0-debian10.8-x86_64 (it went through all of the compilation steps and loaded the mlx5 and ib modules). The kernel is 5.10.93 LTS + TrueNAS patches. However, with the mlx5 driver the CX3 is not visible in ibstat at all. Note that mlx4_core (the old one) can be loaded, but mlx4_ib cannot:
modprobe mlx4_ib
modprobe: ERROR: could not insert 'mlx4_ib': Invalid argument
@FallenSnow: How did you get the CX3 to work on your Debian system with the new driver? If possible, please post the steps you took.
Last edited by xdanil (2022-04-13 12:51:29)
Offline