CentOS/RHEL kdump configuration:

Title:

CentOS/RHEL kdump configuration:

Author:

Douglas O’Leary <dkoleary@olearycomputers.com>

Description:

kdump storage requirements and ssh configuration

Date created:

11/2014

Date updated:

11/2014

Disclaimer:

Standard: Use the information that follows at your own risk. If you screw up a system, don’t blame it on me…



Overview:

Linux typically takes ideas from the major OS vendors and improves upon them, so I was quite surprised to find the crash dump process so under-documented and primitive. That being said, the Linux kdump utility does provide a method of storing the crash dump remotely which, to my knowledge, is not supported by any of the major vendors.

The basics of the kdump facility are documented well enough in chapter 29 of the RHEL Deployment Guide.

Dump location sizing:

Completely missing from that documentation is any kind of sizing requirements for the location that kdump will store the crash. Turns out, that’s somewhat critical.

  • If you’re using local disk, the size must be roughly equivalent to the RAM size even though the resulting vmcore will be compressed. A test on a system with 192 gigs of RAM produced crash dumps of ~3.5 gigs. If the local/SAN storage isn’t roughly equivalent to RAM, though, kdump won’t store the crash dump and will never tell you why.

  • If you’re using remote storage, you can get away with using appropriately sized filesystems; however, as the phrase goes, unexpected results will occur if the crash dump needs more space than is provided.

    A recent test with a KVM guest with 4 gigs of RAM successfully wrote to a 200 meg remote filesystem. The kdump utility did complain about the space during a restart; but, it did run and was able to save the crash dump:

    # service kdump restart
    Stopping kdump:                                            [  OK  ]
    Detected change(s) the following file(s):
    
      /etc/kdump.conf
    Rebuilding /boot/initrd-2.6.32-431.5.1.el6.x86_64kdump.img
    Warning: There might not be enough space to save a vmcore.
             The size of kdump@vmhost:/ignite/kdump/tmp.qtXs8kA8Br should be greater than 3922736 kilo bytes.
    Starting kdump:                                            [  OK  ]
    

A reasonable approach would be:

  • Critical systems, like production database servers

    • Local storage matching RAM size.

    • Consider keeping the dump location unmounted to prevent it from becoming a dumping ground for temporary data.

      • Provision and partition an appropriately sized LUN.

      • Add ext3 ${dev} to /etc/kdump.conf (see the sketch after this list).

      • service kdump restart

      • Consider adding comments to /etc/fstab to identify the device and purpose.

  • Other Linux systems should use remote storage, possibly attached to a management system such as a syslog server. Suggested size is roughly equivalent to the RAM size of the system with the most RAM. There should be few instances of multiple systems crashing at the same time; however, this setup will support that if required.
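
For the local-storage case on critical systems, the resulting /etc/kdump.conf is minimal. A sketch, assuming the dedicated LUN was partitioned and formatted as /dev/sdb1 (a hypothetical device name):

    # Dump to a dedicated, normally unmounted ext3 partition
    ext3 /dev/sdb1
    # path is interpreted relative to the dump device; store the vmcore at its top level
    path /
    core_collector makedumpfile -c --message-level 1 -d 17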

Remote storage access:

kdump supports raw and filesystem for local storage and nfs and ssh for remote storage. There are plenty of administrators who still think nfs is evil. While their experience may have basis in fact, nfs has come a long way over the past 10 years or so. I’m a great fan of using nfs where it’s appropriate to do so. This isn’t one of those places.

In order to use nfs for a global dump location, you’re going to have very wide nfs mount permissions. Yet the data that’ll be included in these crash dumps will be incredibly sensitive. Anything that’s in RAM when the box crashes, be it passwords, passphrases, or database data (including unencrypted PCI, HIPAA, or SOX data), will be accessible to anyone who can access the crash dump.

It would be one thing if nfs were the only option; but, it’s not. The kdump facility also allows ssh access to remote filesystems.

General kdump configuration:

Before diving into the intricacies of remote crash dumping, there are three items that must be done regardless of how the crash dump is to be stored:

  1. Update kernel crash related parameters.

  2. Update boot menus to enable crash memory reservations.

  3. Verify kdump is configured to run.

    chkconfig --list kdump
    chkconfig kdump on
    

Crash memory reservations:

Kdump reserves a small chunk of RAM via the crashkernel parameter that’s appended to the kernel lines in the boot menus. The amount of RAM to reserve is based on the size of physical RAM configured on the system. See the table below for correct numbers. As I’ve found out the hard way, if the reserved RAM is too low, the resulting crash dump will be incomplete and unusable for root cause analysis.

Looking through the results of the kdump helper, it seems like the offset isn’t used at all in RHEL6. You’ll have a chance to test when you start crashing the boxes. NOTE: You may need a Red Hat subscription to access that link.

    RAM size          Ksize   Offset
    ---------------   -----   ------
    Up to 2 GB        128M    16M
    2 GB - 5.99 GB    256M    24M
    6 GB - 7.99 GB    512M    16M
    >= 8 GB           768M    32M

The exact value for the crashkernel parameter seems to differ between physical and virtual systems. Several of the boot menus may be soft linked to each other; it’s safest, however, to simply edit all three.

    vi /etc/grub.conf /boot/grub/grub.conf /boot/grub/menu.lst

  • Physicals: Add crashkernel=${Ksize}@${Offset} to the end of each kernel line.

  • Virtuals: Add crashkernel=${Ksize} to the end of each kernel line.
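
For example, on a physical server with 8 GB of RAM or more, each kernel line would end up looking something like the line below; the root device and other boot options are hypothetical, and only the crashkernel entry at the end matters. A 4 GB KVM guest would get crashkernel=256M instead.

    kernel /vmlinuz-2.6.32-431.5.1.el6.x86_64 ro root=/dev/mapper/vg_root-lv_root rhgb quiet crashkernel=768M@32M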

Using ssh to access remote filesystems:

As one may surmise from other lessons learned entries, I am a big fan of secure shell and public key authentication (pka). Kdump’s use of ssh/pka ties into my philosophy very well. For obvious reasons, the key used for dumping crashes will have to be null passphrased. As noted in sudo vs ssh/pka, there are three rules for using null passphrased keys:

  1. The key should not be used for interactive shells

  2. The key should not be the default.

  3. The key should be locked down to the commands required.

The kdump utility can support all three rules. Here’s how to configure it:

Step 1: Create the key:

On a system that will be configured to dump crashes to a remote system, create a null passphrased key as root:

# cd ~/.ssh
# ssh-keygen -t dsa -P "" -f ./kdump
Generating public/private dsa key pair.
Your identification has been saved in ./kdump.
Your public key has been saved in ./kdump.pub.
The key fingerprint is:
85:45:5f:e0:65:f9:59:01:bc:f8:19:75:2d:5b:8e:0e root@vmhost.olearycomputers.com
The key's randomart image is:
+--[ DSA 1024]----+
|         .o oo=oo|
|         o o =o.=|
|        . . + oB+|
|         . .Eoo.o|
|        S   .oo  |
|             o.  |
|                 |
|                 |
|                 |
+-----------------+

I’m not usually a fan of sharing private keys; but, this is one use where it’s appropriate. Since this key will be used for crash dumps only, it would be appropriate to use it across all systems that will be configured to dump cores remotely. It also makes configuring additional clients quite a bit easier. If you decide to go that route, simply copy the public/private key pairs to root’s .ssh directory on all client systems.
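
For example, to push the shared pair to a second client (the hostname is hypothetical):

    # scp -p ~/.ssh/kdump ~/.ssh/kdump.pub root@dumper2:/root/.ssh/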

Step 2: Configure ssh on the server hosting the crash filesystem:

  • Create a kdump user:

    groupadd -g 25 kdump
    useradd -u 25 -g kdump -c 'kdump user' -d /home/kdump -s /bin/bash kdump
    
  • Disable password authentication:

    # perl -i -ple 's/!!/NP/g if (m{^kdump})' /etc/shadow
    
  • Create the authorized_keys file.

    • If following the recommendations in sudo vs ssh/pka (see the note after this list),

      • copy /root/.ssh/kdump.pub from the client system to /etc/sshkeys/authorized_keys.kdump on the storage system.

      • chmod 640 /etc/sshkeys/authorized_keys.kdump

      • chgrp kdump /etc/sshkeys/authorized_keys.kdump

    • Otherwise, create ~kdump/.ssh/authorized_keys

      • mkdir -p -m 700 ~kdump/.ssh

      • copy /root/.ssh/kdump.pub from client to ~kdump/.ssh/authorized_keys

      • chown -R kdump:kdump ~kdump/.ssh

  • Verify/troubleshoot access:

    • On the client system: ssh -l kdump ${crash_server} hostname

    • For example:

      # h
      dumper1.olearycomputers.com
      # ssh -l kdump vmhost hostname
      vmhost.olearycomputers.com
      
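Note that the /etc/sshkeys layout only works if sshd on the storage server already points AuthorizedKeysFile there, as set up in sudo vs ssh/pka; presumably something along the lines of:

    # grep -i authorizedkeysfile /etc/ssh/sshd_config
    AuthorizedKeysFile /etc/sshkeys/authorized_keys.%u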

Step 3: Configure an appropriately sized filesystem on the storage server:

  • Provision storage, then create a volume group, logical volume, and filesystem per your environment’s standards, as sketched below.

  • Mount it in an appropriate location. Examples in this doc use /ignite/kdump.

  • Set permissions on dump location: chown kdump:kdump /ignite/kdump
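
Putting Step 3 together, a sketch assuming a hypothetical 200 GB LUN presented as /dev/sdc and a volume group named vg_dumps:

    # pvcreate /dev/sdc
    # vgcreate vg_dumps /dev/sdc
    # lvcreate -L 195G -n lv_kdump vg_dumps
    # mkfs -t ext4 /dev/vg_dumps/lv_kdump
    # mkdir -p /ignite/kdump
    # echo '/dev/vg_dumps/lv_kdump /ignite/kdump ext4 defaults 1 2' >> /etc/fstab
    # mount /ignite/kdump
    # chown kdump:kdump /ignite/kdump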

Step 4: Configure kdump on client system:

  • The following settings in /etc/kdump.conf on the client system configure a core collector that filters out zero and free memory pages, and identify the ssh key to use, the user and storage host to connect to, and the path to use on that host. Update as needed for your environment:

    core_collector makedumpfile -c --message-level 1 -d 17
    sshkey /root/.ssh/kdump
    ssh kdump@vmhost
    path /ignite/kdump
    
  • After updating /etc/kdump.conf, restart the kdump service (service kdump restart). Reboot the system if you updated the boot menus. Troubleshoot any issues.

  • Lastly, crash the client system to verify the configuration:

    echo 1 > /proc/sys/kernel/sysrq && echo c > /proc/sysrq-trigger
    
  • Troubleshoot as necessary.
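
Once the client comes back up, confirm the dump actually landed on the storage server. kdump should create a per-crash subdirectory under the configured path, so a quick recursive listing on the storage host is enough; directory names and sizes will vary:

    # ls -lRh /ignite/kdump
    # du -sh /ignite/kdump/*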

Step 5: Lock the ssh access to required commands:

The fcmds script is an update to the sshroot script that was originally described in sudo vs ssh/pka. You can use this script to monitor the commands that are executed via the kdump key, or you can use the kdump script and its associated list of valid commands to completely lock it down.

On the storage host:

  • Ensure sshd is appropriately configured:

    F=/etc/ssh/sshd_config
    perl -i -ple 's/^(permitrootlogin).*/\1 without-password/gi' ${F}
    perl -i -ple 's/(loglevel).*/\1 VERBOSE/gi' ${F}
    service sshd restart
    
  • Copy fcmds and kdump to /usr/local/sbin

  • Insert the appropriate forced command at the front of kdump’s authorized_keys file:

    command="/usr/local/sbin/fcmds" ssh-dss AAAAB3NzaC1k [[snip]]
    

    OR:

    command="/usr/local/sbin/kdump" ssh-dss AAAAB3NzaC1k [[snip]]
    
  • If using the kdump script, create the valid commands file by copying kdump.cmds to /etc/ssh/fcmds/kdump, as sketched below. Update scripts and paths as needed to match your environment.
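
A minimal sketch of that last step, assuming the kdump.cmds file shown in the Scripts section below:

    # mkdir -p /etc/ssh/fcmds
    # cp kdump.cmds /etc/ssh/fcmds/kdump
    # chmod 644 /etc/ssh/fcmds/kdump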

Step 6: Configure and test other clients:

  • Copy kdump public/private keys to /root/.ssh on all clients. If you don’t want to use the same key pair, create and configure new keys as documented above.

  • Update /etc/kdump.conf on all clients.

  • Test crash each client in turn, verifying appropriate results and troubleshooting as necessary.
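
If the same key pair and dump location are used everywhere, rolling out the remaining clients can be scripted. A sketch, with hypothetical hostnames; each client’s boot menus still need the crashkernel parameter and a reboot:

    for h in dumper2 dumper3 dumper4
    do
        scp -p ~/.ssh/kdump ~/.ssh/kdump.pub root@${h}:/root/.ssh/
        scp -p /etc/kdump.conf root@${h}:/etc/kdump.conf
        ssh root@${h} 'chkconfig kdump on; service kdump restart'
    done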

Scripts:

Remember to update scripts and paths as needed to match your environment.

fcmds:

#!/bin/ksh

##########################################################################
# fcmds:    Forced command for ssh/pka access to root.  If remote command
#           is given, this script will log the command to syslog;
#           otherwise, it'll log in normally, source the profile, and execute
#           ksh.
# Author:   Doug O'Leary
# Created:  01/02/08
##########################################################################
# $Log: sshroot,v $
#
# Revision 1.3 2014/1/29 jroess
# Added case handling of various shells bash,ksh,zsh
# Added forced display of /etc/motd
#
# Revision 1.2  2008/01/04  09:52:05  09:52:05  root ()
# replaced exec w/eval to allow multiple commands in ssh batch mode
#
# Revision 1.1  2008/01/02  22:12:01  22:12:01  root ()
# Initial revision
#
##########################################################################

user=$(/usr/bin/id -un 2>/dev/null)
[[ ${#user} -eq 0 ]] && user=$(echo $LOGNAME)
shell=$(/usr/bin/getent passwd $user | awk 'BEGIN{FS=":"}{print $7}' | \
    awk 'BEGIN{FS="/"}{print $3}' )
os=$(uname -s)
echo ${os} | grep -i linux > /dev/null 2>&1
[[ $? -eq 0 ]] && logl="authpriv" || logl="auth"

if [ "${SSH_ORIGINAL_COMMAND}x" != "x" ]
then
    logger -p ${logl}.info "ssh[${PPID}]/pka executed ${SSH_ORIGINAL_COMMAND}"
    eval "${SSH_ORIGINAL_COMMAND}"
else
    case "$shell" in
        bash)
            [ -f /etc/motd ] && cat /etc/motd
            echo "sshroot"
            /bin/$shell --login
            ;;
        ksh)
            [ -f /etc/motd ] && cat /etc/motd
            /bin/$shell -
            ;;
        zsh)
            [ -f /etc/motd ] && cat /etc/motd
            #/bin/$shell ./.zprofile
            /bin/$shell
            ;;
        *)
            echo "too bad, software pirate!"
            exit 1
            ;;
    esac
fi

kdump:

#!/bin/ksh

##########################################################################
# kdump:    Forced command for the kdump ssh key.  If the remote command
#           matches an entry in the valid commands file, it is logged to
#           syslog and executed; anything else, including interactive
#           logins, is logged and rejected.
# Author:   Doug O'Leary
# Created:  01/02/07
##########################################################################
# $Log:     sshroot,v $
# Revision 1.2  2008/01/04  09:52:05  09:52:05  root ()
# replaced exec w/eval to allow multiple commands in ssh batch mode
#
# Revision 1.1  2008/01/02  22:12:01  22:12:01  root ()
# Initial revision
#
##########################################################################

VC=/etc/ssh/fcmds/kdump

if [ "${SSH_ORIGINAL_COMMAND}x" != "x" ]
then
    echo "${SSH_ORIGINAL_COMMAND}" | fgrep -f  ${VC} > /dev/null 2>&1
    if [ $? -eq 0 ]
    then
            logger -p authpriv.info "ssh/pka executed ${SSH_ORIGINAL_COMMAND}"
            eval "${SSH_ORIGINAL_COMMAND}"
    else
            logger -p auth.warning "invalid command for this key! ${SSH_ORIGINAL_COMMAND}"
            # echo "too bad, software pirate!"
            echo "REJECTED: Invalid command!"
            exit 1
    fi
else
    logger -p auth.warning "non-interactive key attempted interactive login!"
    echo "too bad, software pirate!"
    exit 1
fi

kdump.cmds:

dd of=/ignite/kdump
df -P /ignite/kdump
mkdir /ignite/kdump
mkdir -p /ignite/kdump
mktemp -dqp /ignite/kdump
mv /ignite/kdump
rmdir /ignite/kdump

Summary:

The primary purpose of this page is to rectify some of the shortcomings in the kdump documentation, specifically storage requirements and ssh configuration.

As is the case with most things in UNIX, there will be multiple ways to skin this particular cat so feel free to change things to whatever works for you. If you find something wrong or have a more clever way to approach this, feel free to contact me directly.

Thanks.

Doug O’Leary