I'm trying to set up Kdump on my Arch Linux laptop because I need to capture a kernel crash dump when a kernel panic occurs. After two days of research and experimentation I have not managed to get it working, so I'm asking for your help. Please excuse my English.
These are the steps I took to set up Kdump:
Downloading and extracting the Linux kernel from https://www.kernel.org. I downloaded version 3.14.4 because Arch Linux currently ships 3.14.4-1. Are there any differences between the upstream kernel and the version packaged by Arch Linux?
Editing the configuration file before building the kernel, using /proc/config.gz from 3.14.4-1 as a template. This is my kernel config:
Processor type and features ->
[*] kexec system call default
[*] kernel crash dumps
[*] kexec jump
(0x1000000) Physical address where the kernel is loaded
[*] Build a relocatable kernel
(0x1000000) Alignment value to which kernel should be aligned
File systems ->
Pseudo filesystems ->
[*] /proc/vmcore support
Kernel hacking ->
Compile-time checks and compiler options ->
[*] Compile the kernel with debug info
[*] Panic on Oops
[*] Kernel debugging
Device Drivers ->
IOMMU Hardware Support ->
[ ] Support for Intel IOMMU using DMA Remapping Devices (Red Hat website says: "A limitation in the current implementation of the Intel IOMMU driver can occasionally prevent the kdump service from capturing the core dump image. To use kdump on Intel architectures reliably, it is advised that the IOMMU support is disabled").
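The options listed above can be cross-checked against a kernel config file. This is only a minimal sketch: it assumes the menu entries correspond to the symbols CONFIG_KEXEC, CONFIG_CRASH_DUMP, CONFIG_PROC_VMCORE, CONFIG_DEBUG_INFO and CONFIG_RELOCATABLE, and the function name check_kdump_config is made up for illustration.

```shell
#!/bin/bash
# Report which Kdump-related options are not set to "y" in a kernel config.
# Accepts a plain config file or a gzip-compressed one such as /proc/config.gz.
check_kdump_config() {
    local cfg=$1 opt missing=0
    for opt in CONFIG_KEXEC CONFIG_CRASH_DUMP CONFIG_PROC_VMCORE \
               CONFIG_DEBUG_INFO CONFIG_RELOCATABLE; do
        # gzip -cdf decompresses gzip input and passes plain text through unchanged
        if ! gzip -cdf "$cfg" | grep -qx "${opt}=y"; then
            echo "missing: $opt"
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   check_kdump_config /proc/config.gz
```

The function prints one line per option that is not built in and returns non-zero if any is missing, so it can be used in a script before building or rebooting.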
Compiling and installing the custom kernel and its modules.
Generating an initramfs image with mkinitcpio.
Generating a new GRUB configuration and adding the crashkernel parameter to it, reserving 128 MB of RAM for the dump capture kernel. I don't think I need to specify an offset here, because the kernel is relocatable. Please correct me if I'm wrong.
linux /vmlinuz-linux-kdump root=UUID=fbc5f457-ce39-44fb-94bb-a618a9b2d3c3 rw crashkernel=128M
Installing kexec-tools.
Creating and enabling Kdump service.
[Unit]
Description=Load dump capture kernel
After=local-fs.target
[Service]
ExecStart=/usr/bin/kexec -p /boot/vmlinuz-linux-kdump --initrd=/boot/initramfs-linux-kdump.img --append="root=UUID=fbc5f457-ce39-44fb-94bb-a618a9b2d3c3 rw single irqpoll maxcpus=1 reset_devices"
Type=oneshot
[Install]
WantedBy=multi-user.target
Rebooting the system and selecting the vmlinuz-linux-kdump kernel. This is the kernel in which I will try to reproduce my kernel panic. The same kernel image is also used as the dump capture kernel.
Testing.
Checking if kexec has loaded dump capture kernel:
cat /sys/kernel/kexec_crash_loaded
1
Checking the amount of memory reserved for dump capture kernel:
cat /sys/kernel/kexec_crash_size
134217728
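The two sysfs checks above can be combined into a single pre-flight test before triggering a crash. This is a rough sketch; the function name kdump_ready and its optional directory argument are invented here, only the two sysfs file names come from the steps above.

```shell
#!/bin/bash
# Return 0 only if a crash kernel is loaded and memory is reserved for it.
# The directory argument defaults to /sys/kernel; it is a parameter only so
# that the function can be tried against a fake directory.
kdump_ready() {
    local dir=${1:-/sys/kernel} loaded size
    loaded=$(cat "$dir/kexec_crash_loaded" 2>/dev/null) || return 1
    size=$(cat "$dir/kexec_crash_size" 2>/dev/null) || return 1
    [ "$loaded" = 1 ] && [ "$size" -gt 0 ]
}

# Usage:
#   kdump_ready && echo "crash kernel loaded" || echo "crash kernel NOT loaded"
```

With the values shown above (1 and 134217728) the function succeeds. Note that it only confirms the kernel is loaded, not that it will actually boot on panic.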
Checking the address range used for dump capture kernel:
cat /proc/iomem
00000000-00000fff : reserved
00001000-0009ebff : System RAM
0009ec00-0009ffff : reserved
000a0000-000bffff : PCI Bus 0000:00
000c0000-000cfbff : Video ROM
000d0000-000d3fff : PCI Bus 0000:00
000d4000-000d7fff : PCI Bus 0000:00
000d8000-000dbfff : PCI Bus 0000:00
000dc000-000fffff : reserved
000e0000-000effff : Extension ROM
000f0000-000fffff : System ROM
00100000-bf8a0fff : System RAM
01000000-01515ff6 : Kernel code
01515ff7-018d63bf : Kernel data
019f6000-01b37fff : Kernel bss
2d000000-34ffffff : Crash kernel <---
bf8a1000-bf8a6fff : reserved
bf8a7000-bf9b6fff : System RAM
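The "Crash kernel" range in the listing above can be checked against the requested reservation by computing its size. A small sketch, with the function name crash_region_mib chosen only for illustration:

```shell
#!/bin/bash
# Print the size in MiB of the "Crash kernel" range found in /proc/iomem-style
# input read from stdin. Lines look like "2d000000-34ffffff : Crash kernel".
crash_region_mib() {
    local line range start end
    while IFS= read -r line; do
        case $line in
            *"Crash kernel"*)
                range=${line%% :*}            # keep "2d000000-34ffffff"
                range=${range//[[:space:]]/}  # strip any indentation
                start=${range%-*}
                end=${range#*-}
                # The range is inclusive, hence the +1.
                echo $(( (0x$end - 0x$start + 1) / 1048576 ))
                return 0
                ;;
        esac
    done
    return 1
}

# Usage (run as root; unprivileged reads may show zeroed addresses on newer kernels):
#   crash_region_mib < /proc/iomem
```

For the range 2d000000-34ffffff this gives 128, which agrees with the 134217728 bytes reported by kexec_crash_size.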
Triggering a kernel crash:
echo c > /proc/sysrq-trigger
The system hangs for approximately 15 seconds and then reboots, going through the BIOS screen. The dump capture kernel is not booted on panic. I have no idea what I'm doing wrong here.
I'm able to boot this kernel using:
kexec -l /boot/vmlinuz-linux-kdump --initrd=/boot/initramfs-linux-kdump.img --append="root=UUID=fbc5f457-ce39-44fb-94bb-a618a9b2d3c3 rw single irqpoll maxcpus=1 reset_devices"
kexec -e
But I'm unable to boot it on panic.
Could you please point me in the right direction? Has anyone experienced a similar Kdump problem? I would very much appreciate any help.
Last edited by archuser_4573 (2014-06-19 20:23:43)
Offline
I should revisit the wiki page and redo Kdump configuration step by step.
One thing I noticed in your instructions concerns items #1 and #2. Use the PKGBUILD file from ABS instead. The 'linux' package contains config.x86_64; set these compile options in it
CONFIG_DEBUG_INFO=y
CONFIG_CRASH_DUMP=y
CONFIG_PROC_VMCORE=y
as described at the wiki page. https://wiki.archlinux.org/index.php/Kdump
Everything else in your instructions looks fine.
Read it before posting http://www.catb.org/esr/faqs/smart-questions.html
Ruby gems repository done right https://bbs.archlinux.org/viewtopic.php?id=182729
Fast initramfs generator with security in mind https://wiki.archlinux.org/index.php/Booster
Offline
Thanks for the reply.
I compiled the kernel again, using ABS this time, but the Kdump problem persists.
Offline
After a lot of trial and error I managed to solve my problem, but I have not isolated the exact cause yet.
This is what I changed:
I disabled Core Multi-Processing in the BIOS. This looks like the most important change I made. It seems that the maxcpus=1 kernel parameter doesn't work correctly, but I don't know for sure.
kexec now reuses the kernel parameters of the running kernel:
[Unit]
Description=Load dump capture kernel
After=local-fs.target
[Service]
ExecStart=/usr/bin/kexec -p /boot/vmlinuz-linux-custom --initrd=/boot/initramfs-linux-custom.img --reuse-cmdline
Type=oneshot
[Install]
WantedBy=multi-user.target
I changed the crashkernel parameter to 512M in grub.cfg:
linux /vmlinuz-linux-custom root=UUID=fbc5f457-ce39-44fb-94bb-a618a9b2d3c3 rw crashkernel=512M
The system now reboots into the crash kernel on panic, and /proc/vmcore is present.
Last edited by archuser_4573 (2014-06-13 16:22:24)
Offline
I tried to isolate the cause of my Kdump problems and found two issues:
The maxcpus=1 kernel parameter doesn't work while the Core Multi-Processing BIOS setting is enabled. Using the nr_cpus=1 parameter works fine. I also found some information about this issue: http://lists.infradead.org/pipermail/ke … 05378.html, https://lists.fedoraproject.org/piperma … 25900.html.
Not using the --reuse-cmdline kexec option causes a problem during the initramfs stage:
:: running early hook [udev]
:: running hook [udev]
:: Triggering uevents...
ERROR: device '' not found. Skipping fsck.
ERROR: Unable to find root device ''.
You are being dropped to a recovery shell
Type 'exit' to try and continue booting
The root variable is not set when default_mount_handler() is called from init_functions:
default_mount_handler() {
    if [ ! -b "$root" ]; then
        err "Unable to find root device '$root'."
        echo "You are being dropped to a recovery shell"
        echo " Type 'exit' to try and continue booting"
        launch_interactive_shell
        msg "Trying to continue (this will most likely fail) ..."
    fi
    msg ":: mounting '$root' on real root"
    if ! mount ${fstype:+-t $fstype} -o ${rwopt:-ro}${rootflags:+,$rootflags} "$root" "$1"; then
        echo "You are now being dropped into an emergency shell."
        launch_interactive_shell
        msg "Trying to continue (this will most likely fail) ..."
    fi
}
This initramfs issue is only visible when I'm using a freshly installed Arch Linux virtual machine. On my laptop the screen still shows the kernel panic output (it doesn't refresh). Blindly typing and executing the ./init command makes the HDD LED blink.
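For the first issue above, an existing crash kernel command line can be switched from maxcpus to nr_cpus mechanically. A minimal sketch, assuming the parameters are held in a plain string; the function name use_nr_cpus is hypothetical:

```shell
#!/bin/bash
# Replace a maxcpus=N parameter with nr_cpus=N in a kernel command line string.
# Only whole parameters are matched, so unrelated text containing "maxcpus"
# inside another word is left alone.
use_nr_cpus() {
    sed -E 's/(^|[[:space:]])maxcpus=([0-9]+)/\1nr_cpus=\2/g' <<< "$1"
}

# Usage:
#   use_nr_cpus "root=/dev/sda3 single irqpoll maxcpus=1 reset_devices"
#   prints: root=/dev/sda3 single irqpoll nr_cpus=1 reset_devices
```

Whether nr_cpus=1 is sufficient may depend on the hardware or hypervisor, so it is worth verifying on the actual machine.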
Last edited by archuser_4573 (2014-06-12 20:27:21)
Offline
Interesting, the nr_cpus=1 behavior might have changed recently. Could you please update the wiki page with the information you've found?
But I am a bit surprised by --reuse-cmdline. You already passed the --append="root= parameter to kexec, so why can't the crash kernel use it correctly?
Offline
Interesting, the nr_cpus=1 behavior might have changed recently. Could you please update the wiki page with the information you've found?
I added a small note about the maxcpus=1 parameter issue to the wiki. The advice to use nr_cpus=1 comes from Vivek Goyal, the author of the official Kdump documentation on the kernel.org website.
However, neither maxcpus=1 nor nr_cpus=1 works on VirtualBox. Reducing the virtual machine to one CPU core is the only solution I found. I hope it's just a VirtualBox issue. I'm going to test it on KVM/QEMU tomorrow.
But I am a bit surprised by --reuse-cmdline. You already passed the --append="root= parameter to kexec, so why can't the crash kernel use it correctly?
Using double quotes after --append=
ExecStart=/usr/bin/kexec -p /boot/vmlinuz-linux --initrd=/boot/initramfs-linux.img --append="root=/dev/sda3 single irqpoll nr_cpus=1 reset_devices"
results in:
cat /proc/cmdline
"root=/dev/sda3 memmap=exactmap memmap=635K@4K memmap=130420K@753664K elfcorehdr=884084K memmap=64K#2097088K
however after removing double quotes:
ExecStart=/usr/bin/kexec -p /boot/vmlinuz-linux --initrd=/boot/initramfs-linux.img --append=root=/dev/sda3 single irqpoll nr_cpus=1 reset_devices
it works fine:
cat /proc/cmdline
root=/dev/sda3 memmap=exactmap memmap=635K@4K memmap=130420K@753664K elfcorehdr=884084K memmap=64K#2097088K
I don't understand why it works this way. It looks like systemd's parsing behavior.
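One way to see why the stray double quote matters is to extract root= the way a simple parser would. The sketch below is a simplified stand-in, not mkinitcpio's actual parsing code, and the two sample strings are shortened versions of the /proc/cmdline outputs above.

```shell
#!/bin/bash
# Simplified stand-in for an initramfs command-line parser (NOT mkinitcpio's
# actual code): split on whitespace and print the value of the first word
# that starts with "root=".
get_root() {
    local word
    local -a words
    read -ra words <<< "$1"
    for word in "${words[@]}"; do
        case $word in
            root=*) echo "${word#root=}"; return 0 ;;
        esac
    done
    return 1
}

# Shortened versions of the two command lines shown above.
quoted='"root=/dev/sda3 memmap=exactmap elfcorehdr=884084K'
plain='root=/dev/sda3 memmap=exactmap elfcorehdr=884084K'

get_root "$plain"                           # prints /dev/sda3
get_root "$quoted" || echo "no root= word"  # prints: no root= word
```

With the leading quote, the first word is "root=/dev/sda3 rather than root=/dev/sda3, so no word begins with root= and the variable stays empty, which would match the "Unable to find root device ''" error. Note also that in both outputs above the parameters single, irqpoll, nr_cpus=1 and reset_devices are absent, which suggests the ExecStart line was split on whitespace and only the first word reached --append. Quoting the whole argument, as in "--append=root=/dev/sda3 single irqpoll nr_cpus=1 reset_devices", might keep them together, but that is an assumption that has not been tested in this thread.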
Last edited by archuser_4573 (2014-06-17 20:02:04)
Offline
I tested Kdump on a dual-core KVM/QEMU virtual machine with SMP enabled. Strangely, it works even without specifying nr_cpus=1 or maxcpus=1. This is my configuration:
[Unit]
Description=Load dump capture kernel
After=local-fs.target
[Service]
ExecStart=/usr/bin/kexec -p /boot/vmlinuz-linux --initrd=/boot/initramfs-linux.img --append=root=/dev/vda3 single irqpoll reset_devices
Type=oneshot
[Install]
WantedBy=multi-user.target
Offline