When an Oracle Linux system experiences a kernel panic and crashes unexpectedly or hangs, information about the system state and kernel calls leading up to the crash can be useful for troubleshooting. The Kdump feature provides a dumping mechanism for kernel crash information. In Oracle Linux platform images, the OS is either fully configured or partially configured to generate a crash dump, depending on the image release date.
Kdump includes a second kernel that resides in a reserved part of the system memory so that it can capture information about a stopped kernel. Kdump uses the kexec system call to boot into the second kernel, called a capture kernel, without the need to reboot the system, and then captures the contents of the stopped kernel’s memory as a crash dump. For more information about the contents of a crash dump, see What's Inside a Linux Kernel Core Dump.
For Oracle Linux instances, crash dump information collected by Kdump is copied into the /var/oled/crash/<ip-address>-<YYYY-MM-DD>-<HH:MM:SS> directory, by default. A new <ip-address>-<YYYY-MM-DD>-<HH:MM:SS> directory is created for each crash dump, for example:
[opc@<instance_name> crash] ls -a
127.0.0.1-2025-02-07-15:18:07
127.0.0.1-2025-02-07-16:28:19
The dump directory contains the crash dump file (vmcore), a text file, and a log file, for example:
[opc@<instance_name><127.0.0.1-2025-02-07-16:28:19>] ls -a
vmcore
vmcore-dmesg.txt
kexec-dmesg.log
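The directory layout described above lends itself to a quick check from the command line. The following is a minimal sketch, not part of the Kdump tooling: the function names latest_dump_dir and check_dump_files are hypothetical, and the sort assumes that directory names follow the <ip-address>-<YYYY-MM-DD>-<HH:MM:SS> pattern with an IPv4 address.

```shell
# Print the name of the most recent dump directory under a base path
# (default: /var/oled/crash).
latest_dump_dir() {
    base="${1:-/var/oled/crash}"
    # Names look like <ip-address>-<YYYY-MM-DD>-<HH:MM:SS>; sorting on everything
    # after the first "-" orders the directories by timestamp.
    ls -1 "$base" 2>/dev/null | sort -t- -k2 | tail -n 1
}

# Report which of the three expected files are present in a dump directory.
check_dump_files() {
    dir="$1"
    for f in vmcore vmcore-dmesg.txt kexec-dmesg.log; do
        if [ -e "$dir/$f" ]; then
            echo "found   $f"
        else
            echo "missing $f"
        fi
    done
}
```

For example, check_dump_files "/var/oled/crash/$(latest_dump_dir)" reports on the newest dump.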
Important
If you have an Oracle Linux instance that is unreachable or unresponsive, you can send a diagnostic interrupt to troubleshoot. A diagnostic interrupt causes the instance's OS to crash and reboot. To use the console or API to send a diagnostic interrupt, you must have Kdump configured to generate a crash dump. For more information, see Sending a diagnostic interrupt.
Setting the Memory Reserved for a Crash Dump
If you are using an Oracle Linux platform image, Kdump is installed and either fully or partially configured. You can change the amount of memory that is reserved for saving the crash dump, also called the crashkernel memory reservation. In Oracle Linux 8 and earlier, the default memory reservation is set to adjust automatically: GRUB_CMDLINE_LINUX="crashkernel=auto". However, crashkernel=auto is not supported in Oracle Linux 9, so you must set a specific amount of reserved memory using the crashkernel parameter.
Define an offset value for the reserved memory. Because the crashkernel reservation occurs early in the boot process, some systems require that you reserve memory with a certain fixed offset. When a fixed offset is specified, the reserved memory begins at that point. For example, to reserve 128 MB of memory, starting at 16 MB:
GRUB_CMDLINE_LINUX="crashkernel=128M@16M"
Save the changes and refresh the GRUB configuration.
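The refresh step and a later verification can be sketched as follows. This is a hedged example: the function names are hypothetical, and the output path /boot/grub2/grub.cfg is an assumption that might not match every system, particularly some UEFI configurations.

```shell
# Regenerate the GRUB configuration after editing GRUB_CMDLINE_LINUX.
# Assumption: /boot/grub2/grub.cfg is the active configuration file.
refresh_grub() {
    sudo grub2-mkconfig -o /boot/grub2/grub.cfg
}

# Extract the crashkernel value from a kernel command line string.
crashkernel_value() {
    # Put each argument on its own line, keep the crashkernel value,
    # and print the last one if the option appears more than once.
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^crashkernel=//p' | tail -n 1
}
```

After the next reboot, crashkernel_value "$(cat /proc/cmdline)" shows the reservation that the running kernel was started with, for example 128M@16M.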
In the /etc/kdump.conf file, you can change the location where the crash dump files are saved, transfer them over SSH, or export them to a network share.
By default, if Kdump fails to send its result to the configured output locations, it reboots the server. This action deletes any data that has been collected for the dump. To prevent this outcome, change the Kdump configuration.
Edit /etc/kdump.conf to uncomment and change the default value in the file as follows:
default dump_to_rootfs
The dump_to_rootfs option tries to save the result to a local directory, which can be useful if a network share is unreachable. Alternatively, you can use the shell option to copy the data manually from the command line.
Note
The poweroff, reboot, and halt options are also valid for the default Kdump failure action. However, these actions discard the collected data. See the kdump.conf(5) man page, installed as /usr/share/man/man5/kdump.conf.5.gz, for more information.
When you have finished, save the changes and restart the kdump service.
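The restart and a basic sanity check can be sketched as follows. The function names are hypothetical. The sketch assumes that the service unit is named kdump and that the kernel exposes /sys/kernel/kexec_crash_loaded, which reads 1 when a capture kernel is loaded.

```shell
# Print the failure action set by the last uncommented "default" line.
kdump_failure_action() {
    conf="${1:-/etc/kdump.conf}"
    awk '$1 == "default" { action = $2 } END { if (action != "") print action }' "$conf"
}

# Restart the service so that the edited configuration is loaded.
restart_kdump() {
    sudo systemctl restart kdump
}

# Succeed only if a capture kernel is loaded. The optional argument exists
# only so that the check can be pointed at a different file.
capture_kernel_loaded() {
    [ "$(cat "${1:-/sys/kernel/kexec_crash_loaded}" 2>/dev/null)" = "1" ]
}
```

Running kdump_failure_action should print dump_to_rootfs after the edit above, and capture_kernel_loaded should succeed once the service has restarted.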
Initiate the crash from the console or command line:
echo 1 > /proc/sys/kernel/sysrq
echo c > /proc/sysrq-trigger
This forces the kernel to crash, and the dump files are copied into the /var/oled/crash/<ip-address>-<YYYY-MM-DD>-<HH:MM:SS> directory, by default, or to the location you have selected in the configuration.
Reboot the instance.
Review the crash dump files in the /var/oled/crash/<ip-address>-<YYYY-MM-DD>-<HH:MM:SS> directory:
The kernel message buffer includes the most essential information about the system crash, and it is always dumped first into the vmcore-dmesg.txt file. This is useful when an attempt to get the full vmcore file fails, for example because of a lack of space on the target location.
As the kexec tool boots into the second kernel and captures the contents of the crashed kernel’s memory, it also writes to the kexec-dmesg.log file so you can trace the process. For example, at the end of the file you can see the crash dump save process:
...
Feb 07 16:28:19 linux9 systemd[1]: Starting Kdump Vmcore Save Service...
Feb 07 16:28:19 linux9 kdump[504]: Kdump is using the default log level(3).
Feb 07 16:28:19 linux9 kdump[541]: saving to /kdumproot/var/oled/crash/127.0.0.1-2025-02-07-16:28:19/
Feb 07 16:28:19 linux9 kdump[546]: saving vmcore-dmesg.txt to /kdumproot/var/oled/crash/127.0.0.1-2025-02-07-16:28:19/
Feb 07 16:28:19 linux9 kdump[552]: saving vmcore-dmesg.txt complete
Feb 07 16:28:19 linux9 kdump[554]: saving vmcore
Feb 07 16:28:21 linux9 kdump.sh[555]:
Checking for memory holes : [ 0.0 %] /
...
Copying data : [100.0 %] \ eta: 0s
Feb 07 16:28:21 linux9 kdump.sh[555]: The dumpfile is saved to /kdumproot/var/oled/crash/127.0.0.1-2025-02-07-16:28:19//vmcore-incomplete.
Feb 07 16:28:21 linux9 kdump.sh[555]: makedumpfile Completed.
Feb 07 16:28:21 linux9 kdump[559]: saving vmcore complete
Feb 07 16:28:21 linux9 kdump[561]: saving the /run/initramfs/kexec-dmesg.log to /kdumproot/var/oled/crash/127.0.0.1-2025-02-07-16:28:19//
The vmcore file contains the crash dump information. To analyze the crash dump, you need a utility that can read the vmcore file format. See Analyzing Crash Dumps for information on using the crash utility.
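A first pass over the two text files described above can be scripted. The following is a minimal sketch with hypothetical function names. The completion check relies on the "saving vmcore complete" message shown in the log excerpt, and the search patterns for vmcore-dmesg.txt are illustrative assumptions, not an exhaustive list.

```shell
# Succeed if kexec-dmesg.log in the dump directory records a finished vmcore save.
dump_completed() {
    grep -q 'saving vmcore complete' "$1/kexec-dmesg.log"
}

# Print numbered lines from vmcore-dmesg.txt that commonly mark the failure.
panic_lines() {
    grep -n -i -E 'kernel panic|BUG:|Oops|sysrq ?: trigger a crash' "$1/vmcore-dmesg.txt"
}
```

For example, dump_completed /var/oled/crash/127.0.0.1-2025-02-07-16:28:19 && echo "vmcore saved" confirms the save before you spend time loading the vmcore file.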
Analyzing Crash Dumps
You can use the crash utility to analyze the crash dumps collected by Kdump. In Oracle Linux platform images, crash is installed by default. For other Linux instances, use the command line to install it: sudo dnf install crash.
Configure an Oracle Linux Instance to Use the crash Utility
To analyze a crash dump with crash, complete the following configuration tasks:
Enable the Oracle Linux debuginfo repository by creating the /etc/yum.repos.d/debuginfo.repo file with root privileges and the following contents, for example:
[debuginfo]
name=Oracle Linux 8 Debuginfo Packages
baseurl=https://oss.oracle.com/ol8/debuginfo/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=1
Run the install command each time the kernel is updated through the package manager. The debuginfo package is only functional when it matches the running kernel, and it's not replaced automatically when a newer kernel version is installed on the system.
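Because the debuginfo package must match the running kernel, it helps to derive the package name from uname -r. The following sketch is an assumption, not the documented install command: it supposes that Unbreakable Enterprise Kernel releases contain "uek" in their version string and use the kernel-uek-debuginfo package, and that other kernels use kernel-debuginfo. Verify the package name against the debuginfo repository before relying on it.

```shell
# Print the debuginfo package name for a kernel release (default: running kernel).
# Assumption: UEK releases contain "uek" and map to kernel-uek-debuginfo.
debuginfo_package() {
    release="${1:-$(uname -r)}"
    case "$release" in
        *uek*) echo "kernel-uek-debuginfo-$release" ;;
        *)     echo "kernel-debuginfo-$release" ;;
    esac
}

# Install the package that matches the running kernel.
install_debuginfo() {
    sudo dnf install -y "$(debuginfo_package)"
}
```

Rerunning install_debuginfo after each kernel update keeps the debuginfo package aligned with the kernel that is actually booted.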
Analyze a Crash Dump Using the crash Utility
To analyze a crash dump, provide the vmcore information to crash, and then use the crash shell options to retrieve crash dump information. For detailed information about using the crash utility, type man crash at a command prompt or see the crash documentation.
In the command, $(uname -r) identifies the running kernel version, <ip-address>-<YYYY-MM-DD>-<HH:MM:SS> represents the directory that is created for the crash dump files, and the vmcore file contains the crash dump.
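An invocation that matches this description can be assembled as shown below. This is a sketch under one main assumption: that the debuginfo package installs the uncompressed kernel image as /usr/lib/debug/lib/modules/<kernel-version>/vmlinux. The function name crash_cmd is hypothetical. Using $(uname -r) is only correct when the kernel that crashed is the same version as the kernel that is running now.

```shell
# Print the crash command line for a dump directory.
# $1: dump directory; $2: kernel release (default: the running kernel).
crash_cmd() {
    dump_dir="$1"
    release="${2:-$(uname -r)}"
    # Assumption: debuginfo places vmlinux under /usr/lib/debug/lib/modules/.
    printf 'crash %s %s\n' "/usr/lib/debug/lib/modules/$release/vmlinux" "$dump_dir/vmcore"
}
```

To start the session, run the printed command with root privileges, for example by prefixing it with sudo.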
The crash shell starts and displays some system crash info, such as:
kmem -i displays kernel memory usage information, for example:
crash> kmem -i
                PAGES       TOTAL   PERCENTAGE
TOTAL MEM     1974231      7.5 GB   ----
FREE           208962    816.3 MB   10% of TOTAL MEM
USED          1765269      6.7 GB   89% of TOTAL MEM
SHARED         365066      1.4 GB   18% of TOTAL MEM
BUFFERS        111376    435.1 MB   5% of TOTAL MEM
CACHED        1276196      4.9 GB   64% of TOTAL MEM
SLAB           120410    470.4 MB   6% of TOTAL MEM
TOTAL HUGE     524288        2 GB   ----
HUGE FREE      524288        2 GB   100% of TOTAL HUGE
TOTAL SWAP    2498559      9.5 GB   ----
SWAP USED       81978    320.2 MB   3% of TOTAL SWAP
SWAP FREE     2416581      9.2 GB   96% of TOTAL SWAP
COMMIT LIMIT  3485674     13.3 GB   ----
COMMITTED      850651      3.2 GB   24% of TOTAL LIMIT
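The PAGES and TOTAL columns are related by the page size. As a worked check, the sketch below converts a page count to a size, assuming 4 KiB pages (typical on x86_64, but not universal) and binary units. The function name pages_to_size is hypothetical.

```shell
# Convert a page count to a human-readable size.
# $1: number of pages; $2: page size in bytes (default: 4096, an assumption).
pages_to_size() {
    awk -v pages="$1" -v page_size="${2:-4096}" 'BEGIN {
        bytes = pages * page_size
        if (bytes >= 1024 ^ 3) printf "%.1f GB\n", bytes / 1024 ^ 3
        else printf "%.1f MB\n", bytes / 1024 ^ 2
    }'
}
```

With these assumptions, pages_to_size 1974231 prints 7.5 GB and pages_to_size 208962 prints 816.3 MB, which agrees with the TOTAL MEM and FREE rows above.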
When you have finished analyzing the core dump, exit the shell by typing exit or q.