MCSG: Reasons for TOC

Title:

MCSG: Reasons for TOC

Author:

Douglas O’Leary <dkoleary@olearycomputers.com>

Description:

MCSG: Reasons for TOC

Date created:

06/2007

Date updated:

07/2008

Disclaimer:

Standard: Use the information that follows at your own risk. If you screw up a system, don’t blame it on me…

Transfer of Control (TOC) is a flashy name for a hard system reboot. It’s not a system crash, so there won’t be a dump to analyze; however, it’s not a clean shutdown -r either, so filesystems will have to be checked and any RDBMSs will have to go through their recovery procedures.

MCSG will TOC a system to release system resources and to ensure data integrity. It does so in the following scenarios:

  1. A two-node cluster loses heartbeat, at which point a single-node cluster will form. The node that loses the race to the lock disk will TOC.

  2. The cluster daemon, cmcld, dies for any reason.

  3. NODE_FAIL_FAST_ENABLED is set to YES in a package configuration file.

  4. The cluster lvm daemon, cmlvmd, dies for any reason.

  5. System safety time is disabled via the cmsetsafety command.

  6. SERVICE_FAIL_FAST_ENABLED = YES is set (causes reboot).
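Two of the scenarios above come down to package configuration settings, so they can be checked ahead of time. What follows is a minimal sketch, not a supported tool: it assumes the package configuration is plain text with one "KEYWORD VALUE" (or "KEYWORD = VALUE") pair per line and "#" starting a comment. The function name find_fail_fast is made up for illustration, and the node parameter is accepted both with and without the _ENABLED suffix because the exact spelling may differ between releases.

```python
# Sketch: flag fail-fast parameters set to YES in package configuration text.
# Assumed file format: "KEYWORD VALUE" or "KEYWORD = VALUE", '#' starts a comment.

# Both spellings of the node parameter are accepted (assumption, see lead-in).
FAIL_FAST_KEYS = (
    "NODE_FAIL_FAST",
    "NODE_FAIL_FAST_ENABLED",
    "SERVICE_FAIL_FAST_ENABLED",
)


def find_fail_fast(config_text):
    """Return (line_number, keyword) pairs for fail-fast parameters set to YES."""
    hits = []
    for lineno, raw in enumerate(config_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        parts = line.replace("=", " ").split()  # tolerate an optional '='
        if len(parts) < 2:
            continue
        key, value = parts[0].upper(), parts[1].upper()
        if key in FAIL_FAST_KEYS and value == "YES":
            hits.append((lineno, key))
    return hits


if __name__ == "__main__":
    # Hypothetical package configuration fragment, for illustration only.
    sample = (
        "PACKAGE_NAME pkg1\n"
        "NODE_FAIL_FAST_ENABLED YES\n"
        "# SERVICE_FAIL_FAST_ENABLED YES\n"
        "SERVICE_FAIL_FAST_ENABLED = yes\n"
    )
    for lineno, key in find_fail_fast(sample):
        print(f"line {lineno}: {key} is YES; a failure here can TOC the node")
```

Run it against the text of each package configuration file on the cluster; any hit marks a package whose failure can take the whole node down rather than just the package.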