#1 2018-02-20 23:15:07

DCx86
Member
From: Uppsala
Registered: 2018-02-20
Posts: 47

Failure to communicate with kernel device-mapper driver

Hello everyone,

About me:
This is my very first time trying to reinstall Linux and I'm not too familiar with the system. I just enrolled at https://linuxacademy.com/ to learn more about Linux, and I thought I would do it on a Linux machine. That is when all hell broke loose.

Device:
Laptop: DELL XPS 15 9560
OS: Arch Linux

Hardware change:
SSD upgraded from 256 GB to a 1 TB Samsung 960 Pro NVMe drive.

Short description:
After the SSD upgrade, I started installing Arch Linux, carefully following the steps. I'm not sure what happened, but I got some errors, so I decided to abort, reboot and start all over. The system then failed to reboot.

Error:

...
[FAIL] Failed unmounting /run/archiso/bootmnt
...
[  60.860507] watchdog: watchdog0: watchdog did not stop!
[  60.958301] systemd-shutdown[631]: Failed to unmount /run/archiso/bootmnt: Device or resource busy
[  60.958452] systemd-shutdown[1]: Failed to wait for process: Protocol error
[  60.024254] watchdog: watchdog0: watchdog did not stop!
/dev/mapper/control: open failed: No such device
Failure to communicate with kernel device-mapper driver.
Check that device-mapper is available in the kernel.
Incompatible libdevmapper 1.02.146 (2017-12-18) and kernel driver (unknown version).
Command failed
umount: can't unmount /dev/loop0: Invalid argument
umount: can't unmount /oldrun/archiso/cowspace: Invalid argument
[  245.383327] INFO: task reboot:652 blocked for more than 120 seconds.
[  245.383913]         Not tainted 4.14.15-1-ARCH #1
[  245.384491] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

Troubleshoot:
Searching on Google, I haven't found a specific solution. I'm afraid to try random things, but it seems to be a kernel or hardware problem.
I have tried running a few commands, and here is the output:

lspci -nn

[  134.257976] NMI watchdog: Watchdog detected hard LOCKUP on cpu ?
[  195.349989] INFO: rcu_preempt detected stalls on CPUs/tasks:
[  195.350010] o7-...: (1 GPs behind) idle=bba/140000000000000/0 softirq=1096/1098 fqs=1565
[  195.350040] o(detected by 6, t=18018 jiffies, g=787, c=786, q=22)
[  248.203341] INFO: task lspci:581 blocked for more than 120 seconds.
[  248.203373]         Not tainted 4.14.15-1-ARCH #1
[  248.203421] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

After running the lspci -nn command, I pressed the power button to shut down the laptop. This is what I got, after which the laptop froze:

...
[FAILED] Failed unmounting Temporary /etc/pacman.d/gnupg directory.
...

journalctl -p 3 -xb

Feb 20 22:24:00 archiso kernel: ACPI Error: [\_SB_.PCI0.XHC_.RHUB.HS11] Namespace lookup failure, AE_NOT_FOUND (20170728/dswload-210)
Feb 20 22:24:00 archiso kernel: ACPI Exception: AE_NOT_FOUND, During name lookup/catalog (20170728/psobject-252)
Feb 20 22:24:00 archiso kernel: ACPI Exception: AE_NOT_FOUND, (SSDT:xh_rvp11) while loading table (20170728/tbxfload-228)
Feb 20 22:24:00 archiso kernel: ACPI Error: 1 table load failures, 12 successful (20170728/tbxfload-246)
Feb 20 22:24:00 archiso kernel: pcieport 0000:00:1d.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00eB(Transmitter ID)
Feb 20 22:24:00 archiso kernel: pcieport 0000:00:1d.0:      device [8086:a118] error status/mask=00001000/00002000
Feb 20 22:24:00 archiso kernel: pcieport 0000:00:1d.0:       [12] Replay Timer Timeout
Feb 20 22:24:00 archiso kernel: nouveau 0000:01:00.0:  DRM: Pointer to TMDS table invalid
Feb 20 22:24:00 archiso kernel: pcieport 0000:00:1d.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00eB(Transmitter ID)
Feb 20 22:24:00 archiso kernel: pcieport 0000:00:1d.0:      device [8086:a118] error status/mask=00001000/00002000
Feb 20 22:24:00 archiso kernel: pcieport 0000:00:1d.0:       [12] Replay Timer Timeout
Feb 20 22:24:00 archiso kernel: sd 2:0:0:0: [sda] No Caching mode page found
Feb 20 22:24:00 archiso kernel: sd 2:0:0:0: [sda] Assuming drive cache: write through
Feb 20 22:24:00 archiso kernel: pcieport 0000:00:1d.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00eB(Transmitter ID)
Feb 20 22:24:00 archiso kernel: pcieport 0000:00:1d.0:      device [8086:a118] error status/mask=00001000/00002000
Feb 20 22:24:00 archiso kernel: pcieport 0000:00:1d.0:       [12] Replay Timer Timeout

lspci -t

[  1809.011858] NMI watchdog: Watchdog detected hard LOCKUP on cpu ?
[  1872.206301] INFO: rcu_preempt detected stalls on CPUs/tasks:
[  1872.206340] o7-...: (1 GPs behind) idle=aaa/140000000000000/0 softirq=1744/1746 fqs=79
[  1872.206372] o(detected by 1, t=18919 jiffies, g=912, c=911, q=13)
[  1872.207411] rcu_preempt kthread starved for 1177 jiffies! g912 c911 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=1
[  1966.343000] INFO: task lspci:645 blocked for more than 120 seconds
[  1966.343059]         Not tainted 4.14.15-1-ARCH #1
[  1966.343090] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  2052.259600] INFO: rcu_preempt detected stalls on CPUs/tasks:
[  2052.259642] o7-...: (1 GPs behind) idle=aaa/140000000000000/0 softirq=1744/1746 fqs=199
[  2052.259676] o(detected by 3, t=72934 jiffies, g=912, c=911, q=50)

lspci -tv

[  74.100438] NMI watchdog: Watchdog detected hard LOCKUP on cpu 6
[  135.130009] INFO: rcu_preempt detected stalls on CPUs/tasks:
[  135.130029] o6-...: (1 GPs behind) idle=dc2/140000000000000/0 softirq=2073/2075 fqs=426
[  135.130059] o(detected by 0, t=18018 jiffies, g=741, c=740, q=16)
[  245.383320] INFO: task lspci:586 blocked for more than 120 seconds
[  245.383371]         Not tainted 4.14.15-1-ARCH #1
[  245.383384] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

I'm really sorry I can't provide much more, but at this point I'm afraid of messing things up even further.

#2 2018-02-20 23:42:13

jasonwryan
Anarchist
From: .nz
Registered: 2009-05-09
Posts: 30,424

Re: Failure to communicate with kernel device-mapper driver

Moving to Newbie Corner...


Arch + dwm   •   Mercurial repos  •   Surfraw

Registered Linux User #482438

#3 2018-02-21 01:20:15

dif
Member
From: Stalowa Wola, Poland
Registered: 2009-12-22
Posts: 137

Re: Failure to communicate with kernel device-mapper driver

I cannot tell you what the problem is. I can only tell you what I would do.
If I were not able to switch off my laptop, I would take out the battery. In fact, I have had to switch off my laptop this way once.
It booted OK afterwards.
Just make sure your boot media, a USB disk or whatever you use, is not corrupted. I understand that you have been running Arch Linux from the previous disk, so all the UEFI settings are already set up in such a way that you do not have to change anything in there.

Off topic: One thing I can recommend is using EFISTUB instead of any kind of bootloader. It is just simpler, and you do not have to use something you do not really need.

#4 2018-02-23 17:49:57

DCx86
Member
From: Uppsala
Registered: 2018-02-20
Posts: 47

Re: Failure to communicate with kernel device-mapper driver

Is there a way to get around the messages mentioned above?
While googling around, I found a suggestion to suppress them by setting the kernel parameter pcie_aspm=off

So, I opened /etc/default/grub and changed this line to:

GRUB_CMDLINE_LINUX_DEFAULT="quiet pcie_aspm=off"

After making these changes, I am supposed to update GRUB. The update-grub command doesn't work in Arch Linux, so
I have tried:

grub-mkconfig -o /boot/grub/grub.cfg

but this gives me the following error: failed to get canonical path of 'airootfs'.
I have also tried the same command from inside a chroot, and there the command cannot be found.
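
For what it's worth, whether a kernel parameter such as pcie_aspm=off actually reached the running kernel can be checked after a reboot by looking at /proc/cmdline. A minimal sketch follows; the function name has_kernel_param is made up for illustration, and the optional second argument exists only so the check can be tried against a test file:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given parameter appears as a whole word
# on the kernel command line (default source: /proc/cmdline).
has_kernel_param() {
    # Put each space-separated parameter on its own line, then look for an
    # exact, fixed-string match of the whole line.
    tr -s ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}
```

Usage would be something like: has_kernel_param pcie_aspm=off && echo "parameter is active"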

I am unable to install Arch Linux on my new SSD at all. I have tried booting from a different USB stick. Once I am in as root, the first thing I try to do is wipe the disk completely using the following command:

 dd if=/dev/zero of=/dev/nvme0n1 status=progress

and it floods the console with errors.
Any suggestions?

#5 2018-02-23 18:57:44

frostschutz
Member
Registered: 2013-11-15
Posts: 1,418

Re: Failure to communicate with kernel device-mapper driver

There should be no need for a zero wipe on SSDs; you can use TRIM instead (with blkdiscard). mkfs.ext4 (and others) even do this without asking, so there is no undo if you format something by accident.
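
As a rough sketch of the TRIM approach: this assumes the new drive shows up as /dev/nvme0n1 (an assumption, check with lsblk first) and that blkdiscard from util-linux is on the install medium. The wrapper name discard_whole_disk and the DRY_RUN switch are made up for illustration. By default it only prints the command, because a real discard destroys everything on the device.

```shell
#!/bin/sh
# Hypothetical helper: discard (TRIM) a whole NVMe SSD instead of zero-filling it.
# Prints the command by default; set DRY_RUN=0 to really run it (destroys all data!).
discard_whole_disk() {
    dev="$1"
    case "$dev" in
        # Something like /dev/nvme0n1p2 is a partition, not the whole drive
        /dev/nvme[0-9]*n[0-9]*p*)
            echo "refusing: $dev looks like a partition" >&2
            return 1 ;;
        # Something like /dev/nvme0n1 is a whole NVMe namespace
        /dev/nvme[0-9]*n[0-9]*) ;;
        *)
            echo "refusing: $dev does not look like an NVMe namespace" >&2
            return 1 ;;
    esac
    if [ "${DRY_RUN:-1}" = "0" ]; then
        blkdiscard "$dev"
    else
        echo "blkdiscard $dev"
    fi
}
```

Calling discard_whole_disk /dev/nvme0n1 prints the command for review; running it with DRY_RUN=0 as root would perform the actual discard.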

If you get hard CPU lockups and the like, something is wrong with the kernel, with the hardware, or with the compatibility between the two. It's very difficult to make a suggestion. Basically: no idea.

Run memtest86(+) just in case, try a different kernel version, do a BIOS update (or, if you already did that and got one of the faulty Intel microcode Meltdown/Spectre fixes, undo it or see if there is an update for the update by now), reset the BIOS to defaults, ...

DCx86 wrote: "but this gives me the following error: failed to get canonical path of 'airootfs'."

Are you working within a proper chroot with bind-mounts for /proc /sys /dev?
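
For reference, a sketch of what that could look like from the live medium. The 'airootfs' in the error suggests grub-mkconfig was probably run against the live environment's own root rather than the installed system. The partition names below (root on /dev/nvme0n1p2, EFI system partition on /dev/nvme0n1p1 mounted at /boot) are assumptions to adapt, and the helper name is made up. arch-chroot from arch-install-scripts takes care of /proc, /sys and /dev inside the target. The function only prints the steps so they can be reviewed before running anything:

```shell
#!/bin/sh
# Hypothetical helper: print the steps for regenerating grub.cfg from the live ISO.
# $1 = root partition, $2 = EFI system partition (both are placeholders to adapt).
print_grub_regen_steps() {
    root_part="$1"
    esp_part="$2"
    # arch-chroot mounts /proc, /sys and /dev in the target before running the command
    cat <<EOF
mount $root_part /mnt
mount $esp_part /mnt/boot
arch-chroot /mnt grub-mkconfig -o /boot/grub/grub.cfg
umount -R /mnt
EOF
}
```

Note that grub-mkconfig has to exist inside the installed system for the third step to work; a "command not found" inside the chroot would usually mean the grub package is not installed there yet.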

#6 2018-02-23 20:15:55

DCx86
Member
From: Uppsala
Registered: 2018-02-20
Posts: 47

Re: Failure to communicate with kernel device-mapper driver

Since I can't find a way to install the OS on the 1 TB SSD, I have decided to switch back to the old SSD, on which I was able to install Arch Linux successfully.

Does this mean I have a hardware problem? Could the SSD be faulty? How can I check?

Thank you for your help.
