=======================================================
MCSG: Symptoms/Cause matrix
=======================================================

:Title: MCSG: Symptoms/Cause matrix
:Author: Douglas O'Leary
:Description: MCSG: Symptoms/Cause matrix
:Date created: 06/2007
:Date updated: 06/2008
:Disclaimer: Standard: Use the information that follows at your own risk. If you screw up a system, don't blame it on me...

This, more than any other document in the MCSG section, is going to be a work in progress. The following are symptoms of a cluster problem and how to go about fixing them. Please send me a note with anything that can/should be added.

There are a couple of places to look for errors. Probably the first stop is the package log maintained in the package directory, ``/etc/cmcluster/${pkg}``. The next stop is the syslog and/or the log file created via cmsetlog. Read through those logs carefully; there's usually only one line indicating the problem and it's easy to miss.

* Symptom: One of the nodes reboots for unknown reasons.

  Cause: Please check out the :doc:`Reasons for TOCs ` for possible causes.

* Symptom: Activation mode requested for volume group ${vg} conflicts with configured mode

  Cause: Almost definitely caused by ${vg} not having the exclusive bit set. With the cluster running, execute ``vgchange -c y ${vg}``.

* Symptom: Node ${node} is currently unable to run ${pkg}

  Cause: The local switch for the node is disabled. Check via ``cmviewcl -vp ${pkg}`` and reset with ``cmmodpkg -e -n ${node} ${pkg}``.

* Symptom: Can't find service name ${service}

  Cause: Possibly a problem with the way the service name was identified in the package configuration file. Ensure the name is a single word without quotes.

* Symptom:

  #. Package starts on the primary node, then shortly thereafter stops.
  #. Package starts up on each of the adoptive nodes, then shortly thereafter stops.
  #. Once the package has gone through all the adoptive nodes, it dies completely.
  #. The local switches for all primary/adoptive nodes read disabled.

  Cause: If the package is running any services, ensure each of them runs in an infinite loop and never exits on its own. If a service exits, by default, it will cause a package switch and set the node's local switch to disabled.

* Symptom: cmviewcl command hangs

  Cause: Run ``ps -ef | grep cmclconfd``. If there are bunches of cmclconfd processes, kill them all, then restart inetd by running ``inetd -c``.
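
The cmclconfd check in the last symptom can be sketched as a small shell helper. This is only an illustration, not part of the cluster software: the function names ``count_cmclconfd`` and ``report_cmclconfd`` and the ``THRESHOLD`` value are made up for the example, and what counts as "bunches" is a judgment call. The sketch only counts and reports; it deliberately does not kill anything.

```shell
#!/bin/sh
# Sketch only: counts cmclconfd processes in `ps -ef`-style output.
# THRESHOLD is an assumed value, not something the cluster software defines.
THRESHOLD=${THRESHOLD:-5}

# Reads `ps -ef` output on stdin and prints the number of cmclconfd lines,
# skipping grep/awk commands that would match their own command line.
count_cmclconfd() {
    awk '/cmclconfd/ && !/grep/ && !/awk/ { n++ } END { print n + 0 }'
}

# Prints a one-line verdict for a given count; it does not kill anything.
report_cmclconfd() {
    count=$1
    if [ "$count" -gt "$THRESHOLD" ]; then
        echo "suspect: $count cmclconfd processes"
    else
        echo "ok: $count cmclconfd processes"
    fi
}
```

On a cluster node, something like ``report_cmclconfd "$(ps -ef | count_cmclconfd)"`` would give a quick read before deciding whether to kill the processes and run ``inetd -c`` as described above.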