uname -a
Linux Jammin1 5.3.1-arch1-1-ARCH #1 SMP PREEMPT Sat Sep 21 11:33:49 UTC 2019 x86_64 GNU/Linux
inxi -Nxxx
Network:
Device-1: Intel Ethernet I219-V vendor: ASUSTeK driver: e1000e v: 3.2.6-k
port: f000 bus ID: 00:1f.6 chip ID: 8086:15b8
systemctl list-unit-files|grep network
systemd-network-generator.service disabled
systemd-networkd-wait-online.service disabled
systemd-networkd.service disabled
systemd-networkd.socket disabled
network-online.target static
network-pre.target static
network.target static
FWIW, I have just an ethernet connection. No wireless (or bluetooth) capability at all. Only using dhcpcd, and haven't had any issues at all.
Eenie meenie, chili beanie, the spirits are about to speak -- Bullwinkle J. Moose
It's a big club...and you ain't in it -- George Carlin
Registered Linux user #149839
perl -e 'print$i=pack(c5,(41*2),sqrt(7056),(unpack(c,H)-2),oct(115),10); '
Offline
Suggestion to help identify the cause
git clone git://git.archlinux.org/svntogit/packages.git --single-branch --branch "packages/linux"
cd packages/trunk
Revert commits suggested by seth. So the prepare function of PKGBUILD becomes:
prepare() {
  cd $_srcname
  msg2 "Setting version..."
  scripts/setlocalversion --save-scmversion
  echo "-$pkgrel" > localversion.10-pkgrel
  echo "$_kernelname" > localversion.20-pkgname
  # git revert -n def4ec6dce393e2136b62a05712f35a7fa5f5e56
  # git revert -n 59653e6497d16f7ac1d9db088f3959f57ee8c3db
  # git revert -n ab6973aed6200510662856afce5e3d1e386b7b64
  # git revert -n f74dc880098b4a29f76d756b888fb31d81ad9a0c
  # git revert -n d17ba0f616a08f597d9348c372d89b8c0405ccf3
  # git revert -n caff422ea81e144842bc44bab408d85ac449377b
  git revert -n def4ec6dce393e2136b62a05712f35a7fa5f5e56
  local src
  for src in "${source[@]}"; do
    src="${src%%::*}"
    src="${src##*/}"
    [[ $src = *.patch ]] || continue
    msg2 "Applying patch $src..."
    patch -Np1 < "../$src"
  done
  msg2 "Setting config..."
  cp ../config .config
  make olddefconfig
  make -s kernelrelease > ../version
  msg2 "Prepared %s version %s" "$pkgbase" "$(<../version)"
}
I suggest enabling parallel compilation to reduce build time.
makepkg -rsi #build and install the package
Is the issue still present in the built package?
Edit:
Change reverted commit to the one referenced from https://bugzilla.kernel.org/show_bug.cgi?id=205047#c1
Last edited by loqs (2019-10-02 00:10:12)
Offline
Looks like it was bisected already and this is the culprit
https://git.kernel.org/pub/scm/linux/ke … a7fa5f5e56
I pinged the people involved with the commit to make sure they are aware of the bugzilla ticket.
Offline
@sambo99 you confirmed it was the same issue by building the kernel with def4ec6dce393e2136b62a05712f35a7fa5f5e56 reverted?
Offline
@loqs Tomasz reported that he tested that and the issue was exactly the same as the one I had.
If you have a DKMS package you would like me to try here, I am more than happy to.
Offline
@sambo99 the problems with building using DKMS include:
Upstream expects an in-tree build rather than an out-of-tree build.
The version is the same, so dkms will error with:
dkms install e1000e/5.3.1 -k 5.3.1-arch1-1-ARCH
Error! Module version 3.2.6-k for e1000e.ko.xz
is not newer than what is already found in kernel 5.3.1-arch1-1-ARCH (3.2.6-k).
Edit:
PKGBUILD
pkgname=e1000e-dkms
_modname=e1000e
pkgver=5.3.1
_tag=v5.3.1
pkgrel=1
pkgdesc=""
license=('GPL2')
arch=('any')
depends=('dkms')
makedepends=('git')
url=''
source=("git+https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git#tag=${_tag}?signed"
        'dkms.conf.in')
sha256sums=('SKIP'
            '96595e678b4f1fe92e0388212706bac1baf76568cb13ebb029a03f8b73217819')
validpgpkeys=(
  '647F28654894E3BD457199BE38DBBDC86092693E' # Greg Kroah-Hartman
)
prepare() {
  cd linux
  git revert -n def4ec6dce393e2136b62a05712f35a7fa5f5e56
}
package() {
  install -dm755 "${pkgdir}/usr/src/${_modname}-${pkgver}/"
  install -Dm644 dkms.conf.in "${pkgdir}/usr/src/${_modname}-${pkgver}/"dkms.conf
  sed -e "s/@MODNAME@/$_modname/" \
      -e "s/@PKGVER@/$pkgver/" \
      -i "${pkgdir}/usr/src/${_modname}-${pkgver}/"dkms.conf
  cd linux/drivers/net/ethernet/intel/e1000e
  find . -exec install -Dm644 '{}' "${pkgdir}/usr/src/${_modname}-${pkgver}/"'{}' \;
  sed -i 's/DRV_EXTRAVERSION "-k"/DRV_EXTRAVERSION "-l"/' "${pkgdir}/usr/src/${_modname}-${pkgver}/netdev.c"
}
dkms.conf.in
PACKAGE_NAME="@MODNAME@"
PACKAGE_VERSION="@PKGVER@"
MAKE[0]="make -C /usr/lib/modules/$kernelver/build M=$dkms_tree/@MODNAME@/@PKGVER@/build"
CLEAN="make clean"
BUILT_MODULE_NAME[0]="@MODNAME@"
DEST_MODULE_LOCATION[0]="/updates"
AUTOINSTALL="yes"
Edit2:
drivers/net/ethernet/intel/e1000e/netdev.c
#define DRV_EXTRAVERSION "-k"
#define DRV_VERSION "3.2.6" DRV_EXTRAVERSION
....
MODULE_VERSION(DRV_VERSION);
The version could be increased by bumping DRV_EXTRAVERSION or DRV_VERSION, but that introduces a change just to satisfy dkms.
Edit3:
Added sed call to increase version.
Last edited by loqs (2019-10-04 12:37:55)
Offline
I think the out-of-kernel driver is working, loqs.
https://sourceforge.net/projects/e1000/
I just modprobed the thing I built and it appears to be working.
Offline
@sambo99 if the version you tried was e1000e-3.6.0.tar.gz then that contains the commit the upstream bug report identified as the cause.
Edit:
If you were testing e1000e-3.6.0 successfully please report it upstream https://bugzilla.kernel.org/show_bug.cgi?id=205047#c4
Last edited by loqs (2019-10-07 19:31:44)
Offline
I do not think a plain modprobe test for the out-of-kernel driver really tells you much, as a simple
rmmod e1000e
modprobe e1000e
temporarily fixes the problem for me on the Arch Linux distributed kernel (5.3.5.arch1-1) as well.
Offline
@verbbis please report those findings to the kernel bug report.
Offline
Before I do that I'll have to investigate a bit more. It might not be a kernel bug exactly.
I use the 'embedded NIC' interface designations e.g. 'eno1' as those have been working for me rather consistently thus far. According to the dmesg logs an interface rename is performed when the driver is loaded.
[ 2.393543] e1000e 0000:03:00.0 eno1: renamed from eth1
The Intel card has 2 interfaces: eth0 and eth1 respectively. I have only one of them connected.
Now, something has changed during the recent updates so that eth1 gets renamed to eno1 during the boot process. It so happens that this particular interface is not connected on my box. When I perform the rmmod/modprobe dance, eth0 gets (correctly) renamed to eno1 instead and the DHCP lookups subsequently succeed.
[ 67.262513] e1000e 0000:00:19.0 eno1: renamed from eth0
That said, I do get
[ 2.284554] e1000e 0000:03:00.0: Disabling ASPM L0s L1
[ 2.284555] e1000e 0000:03:00.0: can't disable ASPM; OS doesn't have ASPM control
messages when the driver is loaded during boot, but no longer when I do the modprobe afterwards. So there is something fishy there regardless.
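To see which PCI device currently sits behind a given interface name, sysfs can be queried directly and compared with the addresses in the dmesg lines above (0000:00:19.0 vs 0000:03:00.0). This is only a minimal sketch, assuming the usual /sys/class/net layout for PCI NICs; the function name iface_pci_addr and its optional second argument are made up for illustration.

```shell
#!/bin/bash
# Print the PCI address behind a network interface, e.g. 0000:00:19.0.
# iface_pci_addr is a hypothetical helper; the optional second argument
# (sysfs root) exists only so the function can be tried on a fake tree.
iface_pci_addr() {
    local iface=$1 root=${2:-/sys/class/net}
    local link="$root/$iface/device"
    # Virtual interfaces such as lo have no device link.
    [[ -L $link ]] || return 1
    # The link resolves to the device directory; its last component is the address.
    basename "$(readlink -f "$link")"
}
```

Running iface_pci_addr eno1 once right after boot and once after the rmmod/modprobe dance should show whether the name has moved from one port to the other.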
Last edited by verbbis (2019-10-09 17:44:42)
Offline
Could be a separate issue. You could test reverting def4ec6dce393e2136b62a05712f35a7fa5f5e56 and see if your system still has the issue.
Offline
Reverting said commit had no effect for me. Surprisingly it's a different issue then. Strange that these problems coincided - it's the same driver after all...
FWIW, switching to linux-lts had no effect either, so mine is probably systemd-related.
Last edited by verbbis (2019-10-09 19:15:47)
Offline
I'm still not sure what we are seeing here. Essential stuff like kernel LAN drivers and/or systemd is unlikely to break.
Offline
Essential stuff like kernel LAN drivers [...] is unlikely to break.
https://bugzilla.kernel.org/show_bug.cgi?id=205047#c1 suggests that it does at times…
Essential stuff like […] systemd is unlikely to break.
*lol*
Offline
Hello
I have not yet updated my "remote" system, but it uses the e1000 and e1000e drivers.
Normally I update the system once a month, so now is the time to update it.
My problem is that this is a remote system and I don't have physical access to it. If anything goes wrong, I won't have access to it and I won't be able to downgrade the kernel immediately. It can take 2 days for someone to go there and downgrade. It's a production system and I cannot afford 2 days of downtime.
Looking at previous comments, can someone clarify:
1) Does the issue occur only with netctl, or with systemd-networkd too? Someone earlier said that switching to systemd-networkd solved the problem.
Anyone using systemd-networkd, please let me know.
2) The issue is reported for e1000e, but can it occur with e1000 too?
Anyone using the e1000 driver, please let me know if you have a similar issue.
3) Does this happen to all LAN cards using e1000e, or do some cards work without any issue?
Anyone using the e1000e driver without this issue, please let me know, along with your card name if possible.
Sorry for so many questions, but I have no other option.
Thank you in advance for your answers.
Last edited by amish (2019-10-20 13:35:59)
Offline
I'm not concerned by any of your questions (netctl / e1000e / with problem).
I, too, have a headless system affected by this bug. I have managed to keep it from becoming unreachable by putting a script in cron that checks for network connectivity and, if necessary, runs pacman -U to reinstall the old working kernel before rebooting. A better solution would be to create a systemd unit, but... oh well. It's working.
Last edited by kadafax (2019-10-21 08:20:55)
Offline
oh well. It's working.
What worked? Did the new 5.3+ kernel work with the e1000e-based card, or did your cron script detect the failure, downgrade, and reboot?
If it was the cron script, could you share it?
Thank you.
Offline
The script worked. Beware, it's kinda ugly; I'm not a pro. Edit your email address and the IP to ping. Be sure to have the correct path to the correct kernel pkg.
Also remember to remove the cronjob after the kernel downgrade and reboot if you want to avoid an unnecessary reboot caused by a temporary connection issue.
#!/bin/bash
# Failsafe: if the network is down, reinstall the old working kernel
# from the pacman cache and reboot.
LOCK=/tmp/checknet4kernellock

# A leftover lock means a previous run has not finished; report it and stop.
if [ -f "$LOCK" ]; then
    echo "lock detected" | mail -s "Failsafe kernel Downgrade and Reboot: Lock detected" your@email
    exit 0
fi
touch "$LOCK"

# Ping a trusted IP address that you are sure is up
if ! ping -c3 -q RELIABLE_IP_ADDRESS > /dev/null; then
    echo "At $(date) I've downgraded the kernel because no network - Remove the cronjob ? " | mail -s "Failsafe kernel Downgrade and Reboot: done" your@email
    /usr/bin/pacman -U --noconfirm /var/cache/pacman/pkg/linux-5.2.14.arch2-1-x86_64.pkg.tar.xz
    rm -f "$LOCK"
    /sbin/reboot
    exit 0
fi

rm -f "$LOCK"
exit 0
Offline
Thank you very much for the script. I will try it within a week.
Maybe the script is already sufficient for you, but in case you still want to improve it:
1) You can check the current kernel version, say via uname, and downgrade only if it's >= 5.3.0
2) Downgrade only if the e1000e module is loaded (say via lsmod), i.e. only if the system is using that module
3) Copy the working linux package elsewhere so that pacman -Sc (routine maintenance) does not accidentally remove it from the pacman cache dir after the upgrade.
4) Delete the cron file if everything is fine or not applicable
Thanks again.
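Points 1 and 2 could be sketched roughly as follows. This is only an illustration under assumptions: the helper names version_ge, module_loaded and should_downgrade are made up, the version comparison relies on GNU sort -V, and the module check reads /proc/modules (the same data lsmod shows) rather than parsing lsmod output.

```shell
#!/bin/bash
# Hypothetical helpers for the guards suggested above.

# True if version $1 >= version $2 (uses GNU sort -V).
version_ge() {
    [[ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" == "$2" ]]
}

# True if the named module is listed in /proc/modules.
# Another file can be passed as the second argument for testing.
module_loaded() {
    local mod=$1 list=${2:-/proc/modules}
    awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$list"
}

# Downgrade only on a 5.3+ kernel that actually has e1000e loaded.
should_downgrade() {
    version_ge "$(uname -r)" "5.3.0" && module_loaded e1000e
}
```

In the cron script, the ping test could then be wrapped in "if should_downgrade; then ... fi" so that machines on an older kernel or without e1000e are left alone.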
Last edited by amish (2019-10-21 14:44:14)
Offline
For me, it is intermittent. Sometimes I boot up and it works fine, other times it cycles through disconnecting over and over. When it isn't working, restarting NetworkManager seems to fix it:
sudo systemctl restart NetworkManager
Since it is intermittent, that could just be giving me another bite at the apple, but it has worked each time I've tried it so far.
Offline
@teamosil,
Exactly the same situation as you are describing.
Another puzzling thing: the laptop is connected to the router by Ethernet cable. Most of the time, the LED on the router connector was off when the laptop was turned off.
Now it is on, even when the laptop is off. What could it mean?
The router light goes off once Arch is started.
Last edited by Fixed (2019-11-02 10:51:35)
XFCE4 under Arch on Honor MagicBook
Offline
There seems to be progress in the corresponding kernel bug.
Maybe with 5.4.0 we'll have working LAN connections again.
Offline
Tested 5.4-rc7: it doesn't work.
Offline
Interestingly, kernel linux-ck has the same problem, but not linux-lqx. Has anyone tried linux-zen (lqx is derived from it)?
Offline