Performance Monitoring
Standard disclaimer: use the information that follows at
your own risk. If you screw up a system, don't blame it on me...
mailto: dkoleary@olearycomputers.com
Overview
sar, the system activity reporter, is
a performance monitoring tool that comes standard with most flavors of
UNIX. It does a good job of collecting statistics and helping to
identify three of the four possible UNIX-based bottlenecks. The
first script below, sadc.ksh, sets up sar. Two other scripts, which
are forthcoming, generate reports on the CPUs and disks. These
are the two reports that I've used most often and also the two that need
the most manipulation to provide any meaningful results.
A couple of notes:
- The scripts that follow were generated on HP platforms. Generally,
  sar is set up the same between the various vendors; however, verify the
  binary paths in the scripts if you're setting this up on systems other
  than HP.
- Most major vendors supply a -r option to their sar commands to report on
  memory. HP, for some reason, doesn't. What they do have that
  other vendors don't is the -Mu option which will break out the cpu stats
  by CPU. That comes in handy to verify that the applications are hitting
  all the cpus simultaneously.
And now, on to the scripts:
- sadc.ksh: Set up and run the data collection.
- graph.sar-u: Report on CPU stats averaged out over an hour.
- graph.sar-d: Report on disk statistics that break user-defined thresholds.
sadc.ksh
As mentioned above, sadc.ksh sets up sar.
It should be kicked off via cron. The script takes two command line
arguments and will complain if it doesn't get them. The first identifies
the scan rate: how many seconds between system polls. The second
argument identifies the number of polls. The script will initiate
the sadc data collector with the scanning rate and number you specify and
store the data in /var/adm/sa/sar.YYMMDD.
This means that you'll have to use the -f command
line argument to sar to get information from the file. For instance,
to get the cpu stats from today's file, the command line is:
sar -u -f /var/adm/sa/sar.000204
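For illustration only, here is a minimal sketch of a wrapper that behaves the way sadc.ksh is described above. It is not the author's script. The function names, the SADC and SA_DIR variables, and the DRY_RUN switch are all made up for this example, and the sadc path is the usual HP-UX location, so verify it on your own platform.

```shell
# Hypothetical sketch of a sadc wrapper; NOT the author's sadc.ksh.
# Assumed: sadc lives in /usr/lbin/sa on HP-UX. Check the path elsewhere.
SADC=${SADC:-/usr/lbin/sa/sadc}
SA_DIR=${SA_DIR:-/var/adm/sa}

# Print today's data file name, e.g. /var/adm/sa/sar.000204
sar_file() {
    echo "$SA_DIR/sar.$(date +%y%m%d)"
}

# $1 = seconds between polls, $2 = number of polls
start_collector() {
    if [ $# -ne 2 ]; then
        echo "usage: start_collector <seconds-between-polls> <number-of-polls>" >&2
        return 1
    fi
    if [ -n "$DRY_RUN" ]; then
        # DRY_RUN is an illustrative switch: show the command, don't run it
        echo "$SADC $1 $2 $(sar_file)"
    else
        "$SADC" "$1" "$2" "$(sar_file)"
    fi
}
```

With these functions loaded, `DRY_RUN=1; start_collector 120 720` prints the sadc command line it would run, and `sar -u -f "$(sar_file)"` reads today's file without typing the date by hand.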
Depending on the scan rate you specify, some of these files can get
pretty big. Make sure you have enough room in the /var filesystem.
Better yet, create a new filesystem for the sar stats and either mount
it at /var/adm/sa or soft link it there. I usually like to have at
least a gig for a good performance evaluation, preferably two gigs.
Speaking of my normal (performance evaluation) setup: I will run
the collector every two minutes for the full day. This catches just
about any twitch the system does. It provides the detail necessary
and, with the reporting scripts, can be averaged out to provide a much
higher level perspective. When I'm not actively running a performance
evaluation, I will back the scan rate off to about 15 minutes between polls.
At the two-minute poll rate, the sar files will get up to 10-12 megs each.
As you can tell, it won't take long to fill up a filesystem with files
like that. The cron table entries look like:
0 0 * * * /usr/bin/ksh /usr/local/sbin/sadc.ksh 120 720 # Full rate performance evaluation.
0 0 * * * /usr/bin/ksh /usr/local/sbin/sadc.ksh 900 96 # Reduced rate for normal system monitoring.
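The two arguments in each entry are chosen so that the interval times the count covers a full day (86400 seconds). If you pick a different scan rate, a small helper like the one below can work out the matching count. The function name is hypothetical, and it assumes the interval divides evenly into a day.

```shell
# Hypothetical helper: number of polls needed to cover 24 hours.
# $1 = seconds between polls (assumed to divide 86400 evenly)
polls_for_day() {
    echo $(( 86400 / $1 ))
}

polls_for_day 120   # two-minute scan rate
polls_for_day 900   # fifteen-minute scan rate
```

The two calls print 720 and 96, which match the counts used in the cron entries.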
Make sure you either clean up the sar files periodically or stop the
data collection when the performance evaluation is complete.
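One way to handle the cleanup is a find command run from cron. The sketch below is an assumption about how you might do it, not part of the original scripts; the function name and the 30-day default are made up, so pick a retention period that suits your filesystem.

```shell
# Hypothetical cleanup helper: remove sar data files older than N days.
# $1 = directory holding the sar files (default /var/adm/sa)
# $2 = age in days (30 is an arbitrary example default)
purge_old_sar() {
    dir=${1:-/var/adm/sa}
    days=${2:-30}
    find "$dir" -type f -name 'sar.*' -mtime +"$days" -exec rm -f {} \;
}
```

Calling `purge_old_sar /var/adm/sa 14` would keep roughly the last two weeks of data. Try it with `-print` in place of the `-exec` clause first if you want to see what would be removed.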
graph.sar-u
Notes on the graph.sar-u script will go here.
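The script itself isn't shown here, so the following is only a rough sketch of hourly averaging and not the author's code. It assumes sar -u prints a timestamp followed by the %usr, %sys, %wio and %idle columns, as HP-UX does; the function name is hypothetical. Each sample is binned by the hour in its timestamp, which is a simplification because sar stamps a sample at the end of its interval.

```shell
# Hypothetical sketch: average "sar -u" samples by hour.
# Assumed input columns: HH:MM:SS %usr %sys %wio %idle
hourly_cpu_avg() {
    awk '
    # Keep only data lines: a timestamp followed by a numeric %usr value
    $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ && $2 ~ /^[0-9.]+$/ {
        h = substr($1, 1, 2)
        n[h]++
        usr[h] += $2; sys[h] += $3; wio[h] += $4; idle[h] += $5
    }
    END {
        # Print hours in order, skipping hours with no samples
        for (i = 0; i < 24; i++) {
            h = sprintf("%02d", i)
            if (n[h] > 0)
                printf "%s %.1f %.1f %.1f %.1f\n", h,
                    usr[h] / n[h], sys[h] / n[h], wio[h] / n[h], idle[h] / n[h]
        }
    }'
}
```

Usage would be along the lines of `sar -u -f /var/adm/sa/sar.000204 | hourly_cpu_avg`, which prints one line per hour with the averaged columns.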
graph.sar-d
Notes on the graph.sar-d script will go here.
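Again, the script isn't shown here, so this is only a sketch of threshold filtering and not the author's code. It assumes the HP-UX sar -d layout (device, %busy, avque, ...), where only the first device line of each sample carries a timestamp. The function name and the default thresholds of 50 percent busy and a queue length of 3 are made-up examples.

```shell
# Hypothetical sketch: list "sar -d" samples that exceed a threshold.
# $1 = %busy threshold, $2 = avque threshold (example defaults only)
# Assumed input columns: [HH:MM:SS] device %busy avque r+w/s blks/s avwait avserv
disk_over_threshold() {
    awk -v busy="${1:-50}" -v que="${2:-3}" '
    {
        off = 0
        # Only the first device line of a sample carries the timestamp
        if ($1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/) { ts = $1; off = 1 }
        dev = $(off + 1); b = $(off + 2); q = $(off + 3)
        # Header and Average lines fail the numeric check and are skipped
        if (b ~ /^[0-9.]+$/ && q ~ /^[0-9.]+$/ &&
            (b + 0 > busy + 0 || q + 0 > que + 0))
            print ts, dev, b, q
    }'
}
```

For example, `sar -d -f /var/adm/sa/sar.000204 | disk_over_threshold 50 3` would print the timestamp, device, %busy and avque for every sample over either limit.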