Question & Answer
Question
How do you configure KDump to capture operating system dump files in an IBM Smart Analytics System 5600 environment or a Balanced Warehouse D5100 environment?
Answer
KDump is a Novell Linux tool that collects an operating system dump file with additional diagnostic information about a system crash or hang. It is useful when the standard Linux log files do not provide enough information to determine the root cause.
KDump can be configured in your IBM Smart Analytics System 5600 environment or your Balanced Warehouse D5100 environment to collect an operating system dump file in the event of a system crash.
A. Verify that the correct RPM packages are installed on your system
1. Log in as the root user on each administration node, data node, and standby node, and verify that the correct Novell RPM packages are installed on your system:
- rpm -qa | grep <package_name>
where <package_name> represents the name of the Novell package.
The following RPM packages need to be installed before you can configure and use KDump in your environment:
- kernel-kdump
- kdump
- kexec-tools
Note: The version of the kernel-kdump package has to match the version of the Linux kernel you are running in your system.
Example: If you are running kernel-smp-2.6.16.60-0.85.1 as your kernel version, then you have to install the kernel-kdump-2.6.16.60-0.85.1 package.
2. If you do not have the correct Novell RPM packages installed, you will need to download the correct packages from the Novell site.
Note: Access to the packages on the Novell site is restricted and requires a valid Novell license and ID.
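The package check in step 1 can be scripted so that it reports only what is missing. The following is a minimal sketch and not part of the original procedure. The function names missing_kdump_packages and expected_kdump_package are hypothetical, and the sketch assumes that package names follow the <name>-<version>-<release> pattern printed by rpm -qa.

```shell
#!/bin/sh
# Hypothetical helper: reads a package list (one name per line, as printed
# by "rpm -qa") on standard input and prints each required package that
# does not appear in it.
missing_kdump_packages() {
    installed=$(cat)
    for pkg in kernel-kdump kdump kexec-tools; do
        # Anchor at line start so "kdump" does not match "kernel-kdump-...".
        printf '%s\n' "$installed" | grep -q "^${pkg}-[0-9]" || echo "$pkg"
    done
}

# Hypothetical helper: derives the kernel-kdump package name that matches
# a kernel package name such as kernel-smp-2.6.16.60-0.85.1.
expected_kdump_package() {
    # Strip the "kernel-<flavor>-" prefix to leave "<version>-<release>".
    echo "kernel-kdump-${1#kernel-*-}"
}
```

On a node, you might run rpm -qa | missing_kdump_packages as root. No output means that all three packages are present.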
B. Configure the kernel parameters
1. Log in as root on each administration node, data node, and standby node.
2. Edit the /etc/sysctl.conf file and add the following parameters:
kernel.suid_dumpable = 1
kernel.sysrq = 1
kernel.panic_on_oops = 1
3. Issue the following command to reload the kernel configuration from the /etc/sysctl.conf file:
sysctl -p
4. Optionally, you can run the following commands to verify that the kernel parameters have been changed.
a. sysctl -A | grep dumpable
The command should return the following output: kernel.suid_dumpable = 1
b. sysctl -A | grep sysrq
The command should return the following output: kernel.sysrq = 1
c. sysctl -A | grep panic_on_oops
The command should return the following output: kernel.panic_on_oops = 1
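If you repeat the edit on many nodes, it can help to make it idempotent, so that running it twice does not add duplicate lines. The following is a minimal sketch under that assumption. The function names ensure_sysctl_setting and apply_kdump_sysctl_settings are hypothetical and are not part of any KDump or sysctl tooling.

```shell
#!/bin/sh
# Hypothetical helper: sets "key = value" in a sysctl-style file.
# An existing line for the key is rewritten; otherwise a line is appended.
ensure_sysctl_setting() {
    conf=$1 key=$2 value=$3
    if grep -q "^[[:space:]]*${key}[[:space:]]*=" "$conf" 2>/dev/null; then
        # Rewrite through a temporary copy, then copy the content back so
        # that the original file keeps its owner and permissions.
        sed "s|^[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$conf" > "$conf.tmp" &&
            cat "$conf.tmp" > "$conf" && rm -f "$conf.tmp"
    else
        printf '%s = %s\n' "$key" "$value" >> "$conf"
    fi
}

# Hypothetical helper: applies the three settings from step 2 to a file.
apply_kdump_sysctl_settings() {
    for key in kernel.suid_dumpable kernel.sysrq kernel.panic_on_oops; do
        ensure_sysctl_setting "$1" "$key" 1
    done
}
```

As root, you might call apply_kdump_sysctl_settings /etc/sysctl.conf and then run sysctl -p as described in step 3.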
C. Configure the GRUB boot string
You need to add a kernel parameter to the /boot/grub/menu.lst file to reserve a specific amount of memory for KDump in the event of a system crash.
1. Identify the appropriate value for the crashkernel parameter:
a. Determine the amount of physical memory on each administration node, data node, and standby node.
b. Identify the appropriate value for the crashkernel parameter from the table below.
Memory on the node | Value of the crashkernel parameter |
13 - 48 GB | 128M@16M |
49 - 128 GB | 256M@16M |
129 - 256 GB | 512M@16M |
Example: For a node with 64 GB of physical memory, select the value in the second row, 256M@16M.
2. On each node, edit the /boot/grub/menu.lst file and add the following parameter to the kernel boot string:
crashkernel=<value_from_table>
where <value_from_table> represents the string in the table you identified in step 1b.
Example: Using the previous example of a node with 64 GB of physical memory, you would add crashkernel=256M@16M to the node boot string in the /boot/grub/menu.lst file.
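The table lookup can be expressed as a small function, which is convenient when nodes in the same system have different memory sizes. This is a sketch only. The function name crashkernel_for_memory_gb is hypothetical, and the caller is assumed to supply the physical memory as a whole number of GB.

```shell
#!/bin/sh
# Hypothetical helper: maps physical memory in whole GB to the crashkernel
# value from the table above. Returns non-zero for sizes the table omits.
crashkernel_for_memory_gb() {
    gb=$1
    if [ "$gb" -ge 13 ] && [ "$gb" -le 48 ]; then
        echo "128M@16M"
    elif [ "$gb" -ge 49 ] && [ "$gb" -le 128 ]; then
        echo "256M@16M"
    elif [ "$gb" -ge 129 ] && [ "$gb" -le 256 ]; then
        echo "512M@16M"
    else
        echo "no table entry for ${gb} GB" >&2
        return 1
    fi
}
```

For example, echo "crashkernel=$(crashkernel_for_memory_gb 64)" prints crashkernel=256M@16M, which is the string to add to the boot string for a 64 GB node.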
D. Configure the KDump parameters
Configure KDump by editing the KDump configuration file: /etc/sysconfig/kdump. The KDump configuration settings should be set to the same values on all administration nodes, data nodes, and standby nodes in your system.
1. Configure the KDUMP_RUNLEVEL parameter. This parameter sets the runlevel to boot the KDump kernel.
- If the vmcore file is written to a file system that is mounted in multi-user mode or to a file system that is mounted over the network (for example, an NFS-mounted directory), set the value to runlevel 3. Add the following string to the KDump configuration file to set the parameter to runlevel 3:
KDUMP_RUNLEVEL="3"
- If the vmcore file is written to any other type of file system, use the default value of runlevel 1. Add the following string to the KDump configuration file:
KDUMP_RUNLEVEL="1"
2. Configure the KDUMP_SAVEDIR parameter. This parameter determines the directory where the vmcore dump file will be copied. You can set the value to any directory you prefer. The following examples show how to set the value to a local directory on the node and how to set the value to an NFS-mounted directory.
To copy the dump file to a local directory on the node, add the following string to the KDump configuration file:
KDUMP_SAVEDIR="file:///work_files/DUMPDIR/data01"
The vmcore dump file is copied to the directory /work_files/DUMPDIR/data01/<timestamp>.
To copy the dump file to an NFS-mounted directory, add the following string to the KDump configuration file:
KDUMP_SAVEDIR="nfs://10.11.12.13://DUMPDIR/data02"
If the node boots to the KDump kernel, the vmcore dump file is copied to the /DUMPDIR/data02/<timestamp> directory on the node with the IP address 10.11.12.13. The /DUMPDIR/data02/ directory is mounted to the /mnt directory on the local node.
Note: The example assumes that the /DUMPDIR/data02 directory is exported on the NFS server (10.11.12.13) and that it can be NFS-mounted. You can use the following command to verify that the file system can be mounted:
mount -t nfs 10.11.12.13:/DUMPDIR/data02 /mnt
The NFS directory does not need to be mounted to the /mnt mount point for KDump to work.
3. Configure the KDUMP_DUMPLEVEL parameter. This parameter determines the dump level, that is, what is dumped and what is stripped from the vmcore file. Specify a value in the range 0 - 31. If you specify a value of 0, nothing is stripped and the vmcore file has the largest possible size. If you specify a value of 31, the maximum amount is stripped and the vmcore file has the smallest possible size. To create the smallest possible vmcore file, add the following string to the KDump configuration file:
KDUMP_DUMPLEVEL=31
4. Configure the KDUMP_DUMPFORMAT parameter. This parameter determines the format of the dump file and can be used to reduce the size of the vmcore dump file. To specify a smaller compressed dump file, add the following string to the KDump configuration file:
KDUMP_DUMPFORMAT="compressed"
5. On each administration node, data node, and standby node, set KDump to start each time the node reboots:
chkconfig kdump on
6. Start KDump by issuing the following command on each administration node, data node, and standby node:
rckdump restart
Note: Run the rckdump restart command each time you change the KDump configuration file. You do not need to reboot the node to activate the changes.
7. If you are configuring KDump for the first time, reboot each administration node, data node, and standby node to activate the change you made to the GRUB boot string.
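Because the settings must be identical on every node, a quick sanity check of the configuration file before you restart KDump can catch simple mistakes. The following is a minimal sketch, not a complete validator. The function name check_kdump_config is hypothetical, and it checks only the two rules stated in steps 1 and 3: the dump level range, and runlevel 3 for an NFS save directory.

```shell
#!/bin/sh
# Hypothetical helper: sanity-checks a KDump configuration file against
# the rules described in steps 1 and 3. Prints a message and returns
# non-zero on the first problem found. Pass a path that contains a slash,
# for example /etc/sysconfig/kdump.
check_kdump_config() {
    (
        unset KDUMP_RUNLEVEL KDUMP_SAVEDIR KDUMP_DUMPLEVEL
        # The file uses shell variable syntax, so it can be sourced.
        . "$1" || exit 2
        # Assumption: KDUMP_DUMPLEVEL is set explicitly, as in step 3.
        case "$KDUMP_DUMPLEVEL" in
            ''|*[!0-9]*)
                echo "KDUMP_DUMPLEVEL must be an integer from 0 to 31"; exit 1 ;;
        esac
        if [ "$KDUMP_DUMPLEVEL" -gt 31 ]; then
            echo "KDUMP_DUMPLEVEL must be an integer from 0 to 31"; exit 1
        fi
        case "$KDUMP_SAVEDIR" in
            nfs://*)
                if [ "$KDUMP_RUNLEVEL" != "3" ]; then
                    echo "KDUMP_RUNLEVEL should be 3 for an NFS save directory"; exit 1
                fi ;;
        esac
    )
}
```

You might run check_kdump_config /etc/sysconfig/kdump on each node and issue rckdump restart only if the check succeeds.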
E. Verify that KDump is configured correctly
Optionally, you can verify that KDump is configured correctly by simulating a system crash and generating a vmcore file.
Note: The simulation crashes the system, which causes the node to reboot and generate the vmcore file.
1. Log in as the root user on each administration node, data node, and standby node, and issue the following command:
echo c > /proc/sysrq-trigger
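Two small checks can frame the simulated crash: one before it, to confirm that the node booted with a crashkernel parameter, and one after the reboot, to confirm that a vmcore file was written. The following is a sketch under stated assumptions. Both function names are hypothetical, and list_vmcore_files assumes a local save directory in which the dump file name starts with vmcore.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a kernel command line (the content of
# /proc/cmdline) contains a crashkernel= parameter.
has_crashkernel_param() {
    case " $1 " in
        *" crashkernel="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: lists vmcore files found under a local save
# directory, including its <timestamp> subdirectories.
list_vmcore_files() {
    find "$1" -type f -name 'vmcore*' 2>/dev/null
}
```

Before you trigger the crash, has_crashkernel_param "$(cat /proc/cmdline)" should succeed. After the node reboots, list_vmcore_files /work_files/DUMPDIR/data01 should print the path of the new dump file if you used the local directory from the earlier example.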
Document Information
Modified date: 16 June 2018
UID: swg21623587