Hi!
Roughly a month ago my system reached Windows 3.11 levels of stability (before that it had run rock solid for almost a year). It freezes and only the power switch helps. I've been trying to set up kexec/kdump to get a kernel dump and ask somebody at the company I work for to debug it for me. As you can see, I'm quite optimistic in thinking that kdump will help with a freeze, but I don't have many options left.
Getting to the point: I was following the documentation found in the kernel sources at linux/Documentation/kdump/kdump.txt, and I found some discrepancies between that file and what I can see in the config file. Is kdump.txt out of date, or is the Arch config file customized so heavily that it explains the discrepancies? I'm afraid the former is true. Here is what I found:
A) First kernel:
a) Enable "kexec system call" feature (in Processor type and features).
CONFIG_KEXEC=y
b) This kernel's physical load address should be the default value of
0x100000 (0x100000, 1 MB) (in Processor type and features).
CONFIG_PHYSICAL_START=0x100000
c) Enable "sysfs file system support" (in Pseudo filesystems).
CONFIG_SYSFS=y
d) Boot into first kernel with the command line parameter "crashkernel=Y@X".
Use appropriate values for X and Y. Y denotes how much memory to reserve
for the second kernel, and X denotes at what physical address the reserved
memory section starts. For example: "crashkernel=64M@16M".
a) ok
b) not in "Processor type and features" but in "Firmware Drivers"
c) ok
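To make the "crashkernel=Y@X" notation concrete, here is a minimal sketch that splits the parameter into a size and a start address. The function names are hypothetical and only the K/M/G suffixes are handled; it is not how the kernel itself parses the option. It shows that the 16M offset in the example is the address 0x1000000.

```python
import re

# Suffix multipliers accepted by this sketch (binary units only).
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a string such as '64M' into a number of bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def parse_crashkernel(param):
    """Split 'crashkernel=Y@X' into (size_in_bytes, start_address)."""
    name, _, value = param.partition("=")
    if name != "crashkernel" or "@" not in value:
        raise ValueError(f"expected crashkernel=Y@X, got {param!r}")
    size, start = value.split("@", 1)
    return parse_size(size), parse_size(start)


size, start = parse_crashkernel("crashkernel=64M@16M")
print(f"reserve {size} bytes starting at {hex(start)}")
```

The start address computed this way is the value the documentation expects the second kernel to be loaded at.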
B) Second kernel:
a) Enable "kernel crash dumps" feature (in Processor type and features).
CONFIG_CRASH_DUMP=y
b) Specify a suitable value for "Physical address where the kernel is
loaded" (in Processor type and features). Typically this value
should be same as X (See option d) above, e.g., 16 MB or 0x1000000.
CONFIG_PHYSICAL_START=0x1000000
c) Enable "/proc/vmcore support" (Optional, in Pseudo filesystems).
CONFIG_PROC_VMCORE=y
d) Disable SMP support and build a UP kernel (Until it is fixed).
CONFIG_SMP=n
e) Enable "Local APIC support on uniprocessors".
CONFIG_X86_UP_APIC=y
f) Enable "IO-APIC support on uniprocessors"
CONFIG_X86_UP_IOAPIC=y
a) does not exist
b) not in "Processor type and features" but in "Firmware Drivers"
c) does not exist
d) ok
e) does not exist
f) does not exist
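A comparison like the one above can also be done mechanically. The following is only a sketch: the helper names and the sample config text are made up for illustration, and the wanted values are simply the ones kdump.txt lists for the second kernel. On a running system the config text would come from the kernel's .config (or /proc/config.gz where the kernel exposes it).

```python
def read_config(text):
    """Map option name -> value for the lines of a kernel .config."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
        elif line.startswith("# CONFIG_") and line.endswith(" is not set"):
            # Disabled options are recorded as comments; treat them as "n".
            options[line[2:-len(" is not set")]] = "n"
    return options


def compare(options, wanted):
    """Return {name: (wanted, found)} for every option that differs."""
    diffs = {}
    for name, want in wanted.items():
        found = options.get(name)  # None means the option is absent
        if found != want:
            diffs[name] = (want, found)
    return diffs


# Values kdump.txt lists for the second (capture) kernel.
SECOND_KERNEL = {
    "CONFIG_CRASH_DUMP": "y",
    "CONFIG_PHYSICAL_START": "0x1000000",
    "CONFIG_PROC_VMCORE": "y",
    "CONFIG_SMP": "n",
    "CONFIG_X86_UP_APIC": "y",
    "CONFIG_X86_UP_IOAPIC": "y",
}

# Hypothetical config fragment, for illustration only.
SAMPLE = """\
CONFIG_KEXEC=y
CONFIG_PHYSICAL_START=0x100000
# CONFIG_SMP is not set
"""

for name, (want, found) in compare(read_config(SAMPLE), SECOND_KERNEL).items():
    print(f"{name}: want {want}, found {found}")
```

An option reported as found None is absent from the file altogether, which is different from one that is present but disabled.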
Any comments? Should I just add the missing options to the .config file and compile it?
cheers
waldek
Offline
You need patches for the missing pieces - have a look here.
Offline
AFAIK LKCD and kexec/kdump are different projects. kexec/kdump is supported in the kernel and does not require any patches, while LKCD does. I thought kexec/kdump would be easier to use.
Offline
Whoops - sorry. See following (double) post.
Offline
Got news for you if you think kdump doesn't need patches.
Dumping is one of those areas where Linux is severely lacking. I've kept an eye on this for a while - it's part of what I do in the real world.
LKCD seemed a likely candidate, but seems to have atrophied - deprecated if you prefer.
kexec seems reasonably accepted - kdump ain't there yet. From the site:
Kdump is a kexec based crash dumping mechanism for Linux. Kdump functionality is broken mainly in two components, user space and kernel space. Kernel space patches are already part of -mm tree. User space component is nothing but a patch on top of existing kexec tools.
Hmmmm ...
Should we read that as written, "Kdump functionality is broken mainly in two components ...", or should we hope it was supposed to be "Kdump functionality is broken out mainly in two components ..."?
I must admit I haven't tested it.
If you want to capture a crash dump, basically you need to pick the project you want to use (there are others BTW), install it ahead of time, and wait for it to trip.
I haven't checked whether any are available in Arch ...
Offline
Apologies for any confusion. I just did a quick google for "CONFIG_CRASH_DUMP", which led me straight to lkcd, but another bit of digging helps, this time in the kernel docs. Here's your problem:
config CRASH_DUMP
	bool "kernel crash dumps (EXPERIMENTAL)"
	depends on EMBEDDED
	depends on EXPERIMENTAL
	depends on HIGHMEM
	help
	  Generate crash dump after being started by kexec.

config PROC_VMCORE
	bool "/proc/vmcore support (EXPERIMENTAL)"
	depends on PROC_FS && EMBEDDED && EXPERIMENTAL && CRASH_DUMP
	help
	  Exports the dump image of crashed kernel in ELF format.
So you'll only see CRASH_DUMP if you turn on EMBEDDED, EXPERIMENTAL, and HIGHMEM, and after activating it, you'll see PROC_VMCORE.
The X86_UP options are a bit more puzzling, because you should see X86_UP_APIC when you turn off SMP, and X86_UP_IOAPIC when you turn on X86_UP_APIC. Are you doing this with make menuconfig, or are you editing the config file manually? Manual editing won't work for this kind of thing.
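The dependency chain described here can be sketched in a few lines. The table below is copied from the two Kconfig entries quoted above; the function name is hypothetical, and the sketch only models plain "depends on" lists, so negated conditions such as the SMP one for the X86_UP options are left out.

```python
# Dependencies taken from the two Kconfig entries quoted in this post.
DEPENDS = {
    "CRASH_DUMP": ["EMBEDDED", "EXPERIMENTAL", "HIGHMEM"],
    "PROC_VMCORE": ["PROC_FS", "EMBEDDED", "EXPERIMENTAL", "CRASH_DUMP"],
}


def missing_dependencies(symbol, enabled):
    """List what still has to be enabled before `symbol` shows up.

    The result is ordered so that each entry comes after the
    entries it depends on.
    """
    missing = []

    def visit(name):
        for dep in DEPENDS.get(name, []):
            if dep in enabled or dep in missing:
                continue
            visit(dep)           # enable the dependency's own dependencies first
            missing.append(dep)

    visit(symbol)
    return missing


# Hypothetical starting point: only PROC_FS and EXPERIMENTAL are enabled.
print(missing_dependencies("PROC_VMCORE", {"PROC_FS", "EXPERIMENTAL"}))
```

With that starting point the list contains EMBEDDED and HIGHMEM before CRASH_DUMP, which is the order in which the options would have to be switched on in menuconfig.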
Offline
Thanks a lot for your replies and help. As you can see, I don't have much experience in this area, but I have had a lot of freezes recently, so I'm getting desperate enough to think about a reinstall, though I dread it when I consider how much customization I've done - hopefully most of it is in /etc and /home.
@syg00: I was under the impression that kexec/kdump is already implemented in the stock 2.6.15 kernel available in Arch. The documentation is there, and judging by what tomk said, the config options are there too. The only patch they mentioned was for the tools, not for the kernel as such (unless there's something I'm missing), and it worked fine for me. As I said before, I don't know what I'm talking about, so this is only the impression I got from reading the docs. Should I go for the -mm kernel?
The guys at the company I work for use LKCD; I had a look and I'm probably too stupid to implement it. That's why I started with kexec/kdump - it looked clearer to me. As for "out": sometimes I'm glad I'm not a native speaker, in my English the two read roughly the same ;-)
@tomk: you're right, I tried to modify the config I took from /proc. I'll give it another try using make config. Thanks a lot for your time!
cheers
waldek
Offline
Nope - my turn to apologise.
Been doing other stuff, and obviously not staying on top of this.
LCA is on in New Zealand next week. I see there is a presentation on implementing kdump on PPC64. Don't have one, but I'll see if I can attend that session just out of interest.
As for the quote, "broken" means just that - not working; "broken out" in such a context means separated.
Offline