IBM Support

Perl script to collect IBM Spectrum Protect server monitoring data V3.1

Troubleshooting


Problem

Servermon is a script that collects IBM Spectrum Protect server data for troubleshooting problems with server processes, client sessions, or performance. Version 3.1 adds several enhancements.

Environment

Important changes to version 3.1:

If you have used Servermon in the past, please review this document carefully, as there have been some enhancements and the script behaves slightly differently. Here are the highlights of the changes:

  1. Servermon now runs for 24 hours by default instead of 72 cycles of 1200 seconds.
  2. Servermon will run "always-on" by default and create one .zip file daily for each collection.
  3. The .zip file will be created at the end time specified.
  4. The administrator password will no longer be stored in clear text in servermon.ini.

For Linux x86 servers running Spectrum Protect 8.1.6.0 only:

Please upgrade to 8.1.6.1 or later. If you cannot upgrade at this time, disable instrumentation by editing servermon.ini and changing the value of "instrumentation" to N:

Example:

instrumentation = N

Other platforms and versions do not have an issue with instrumentation, so it should remain enabled on them.
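The edit described above can also be scripted. The following is a minimal sketch, not part of servermon: set_ini_value is a hypothetical helper name, and it assumes the key appears in only one stanza of servermon.ini and that neither key nor value contains characters that are special to sed (such as "|" or "&").

```shell
# set_ini_value FILE KEY VALUE  (hypothetical helper, not shipped with servermon)
# Replaces the first token after "KEY =" and leaves any trailing "# comment" alone.
# Assumes KEY appears in only one stanza and contains no regex metacharacters.
set_ini_value() {
    file=$1
    key=$2
    value=$3
    tmp="$file.tmp.$$"
    sed "s|^\([[:space:]]*$key[[:space:]]*=[[:space:]]*\)[^#[:space:]]*|\1$value|" "$file" > "$tmp" &&
        mv "$tmp" "$file"
}

# Example: disable instrumentation on a Linux x86 server at 8.1.6.0
# set_ini_value servermon.ini instrumentation N
```

Keeping a copy of the original servermon.ini before changing it is a sensible precaution.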


Resolving The Problem

Collecting Data

servermon.pl is a Perl script that collects data from IBM Spectrum Protect servers. It typically does so at 20-minute intervals called diagCycles, with additional collections when the script starts and ends.

The script can also collect a server trace and can stop the collection when a specific server message is logged in the activity log.

The script primarily collects IBM Spectrum Protect data, such as server instrumentation for thread-level analysis and various QUERY and SHOW commands used for troubleshooting. It also collects DB2 data for cases where a more in-depth analysis of a server problem is needed. Finally, it collects operating system data, such as CPU, memory and disk metrics.

This script has a very low overhead. It has been tested both internally and with customers and is not known to cause performance degradation.

This technote provides the most current version of the server monitoring script, which is supported at all levels of IBM Spectrum Protect server V7.1 and higher. To set up monitoring, you need the following three files from the downloads section:

  • servermon.pl - the script file itself
  • servermon.ini - configuration parameters
  • commands.ini - commands that are submitted by the script


Download the files to a local directory. Collected data is written to the current directory from which the script is run.

The script can be run a few different ways:

  • in the foreground (default)
  • in the background (noprompt parameter)
  • on a storage agent (no database)


Note: The script requires a Perl interpreter. Some operating systems, such as Windows, do not ship with Perl installed. Strawberry Perl and ActivePerl are two popular distributions, but there are others as well.

1. While logged in as the instance owner, create a directory called "servermon" in the instance directory (where dsmserv.opt is located).

2. Download the three .zip files and extract them. Copy servermon.pl, commands.ini and servermon.ini to the servermon directory created in Step 1. Collected data is written to the servermon directory as long as the script is invoked from there.

3. Before running servermon, you need to update some of the options in the servermon.ini file. The file is divided into configuration stanzas, and each line assigns a value to one of a fixed set of variables in the form:
variable = value # some comment

At a minimum the following options in the [TSM] stanza need to be updated:
[TSM]
server        = SERVERSTANZA     # server stanza for UNIX, TCPSERVERADDRESS for Windows
tcpport       = 1500         # only used on Windows

 

Optionally, but preferably, also update CollCycleEndTime in the [Run Time Options] stanza. It sets the time at which one 24-hour cycle ends and a new one begins, and it should reflect the start of the client backup window. The default is 17:00. For example, if the backup window starts at 18:00, set it to:

[Run Time Options]
CollCycleEndTime = 18:00 # When AlwaysOn=Y, when will new collection cycle start

To find the proper value for the server variable on *IX systems, look at the server stanza that is configured either in the default dsm.sys file or in the file that the DSM_CONFIG environment variable points to. On Windows systems, specify the IP address of the server, or alternatively localhost or 127.0.0.1. The value for tcpport is the port number that the server listens on.
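As a quick way to see which stanza names are available on a *IX system, the names can be listed from the client system options file. This is a minimal sketch: list_server_stanzas is a hypothetical helper, the path passed to it is whatever dsm.sys file applies on your system, and it assumes the SERVERNAME option is spelled out in full (in any letter case) rather than abbreviated.

```shell
# list_server_stanzas FILE  (hypothetical helper, not part of servermon)
# Prints the name that follows each SERVERNAME option in a dsm.sys-style file.
# Assumes the option name is written in full; the comparison ignores case.
list_server_stanzas() {
    sysfile=$1
    awk 'tolower($1) == "servername" { print $2 }' "$sysfile"
}

# Example (the path is a placeholder):
# list_server_stanzas /path/to/dsm.sys
```

One of the printed names is the value to use for the server variable in the [TSM] stanza.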

4. Optional, but recommended: Create a Spectrum Protect administrator ID named SERVERMON with a password of your choice. Grant it SYSTEM authority, and use this administrator ID to run servermon. This way you can see in the activity log which commands are issued by servermon, which helps troubleshooting if you suspect a problem with one of the servermon commands.
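The administrator described in this step is typically created with the REGISTER ADMIN and GRANT AUTHORITY server commands. The sketch below only prints the two commands so they can be reviewed and then entered in an administrative command-line (dsmadmc) session. servermon_admin_commands is a hypothetical helper, and the password shown in the example is a placeholder.

```shell
# servermon_admin_commands ADMIN PASSWORD  (hypothetical helper)
# Prints the server commands that create the administrator and grant it
# SYSTEM authority. Nothing is sent to the server by this function.
servermon_admin_commands() {
    admin=$1
    password=$2
    printf 'REGISTER ADMIN %s %s\n' "$admin" "$password"
    printf 'GRANT AUTHORITY %s CLASSES=SYSTEM\n' "$admin"
}

# Example (placeholder password):
# servermon_admin_commands SERVERMON 'ChooseYourOwnPassword'
```

Password rules depend on the server configuration, so the chosen password may need to meet a minimum length or complexity.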

5. When running in the background, it is no longer necessary to put the administrator ID and password in servermon.ini; the credentials are obfuscated and stored in a separate file. Before running servermon, save the credentials using the buildcredentials option:

perl servermon.pl buildcredentials

To run the script in the background, note that it must be run locally as the instance user in order to collect a complete set of docs on an IBM Spectrum Protect server.

Windows:
Note that on Windows the script still runs in the foreground; the instructions here run it without any prompts, using the presets in servermon.ini. Start it from the server console rather than an RDP session so that servermon doesn't stop when the RDP session ends.

Run the script from a DB2 command window:
Start -> Run -> db2cmd
From there, cd to the target directory where servermon.pl was saved and issue this command:
perl servermon.pl noprompt

AIX:
Change directory to where servermon.pl is stored and issue:
nohup perl64 servermon.pl noprompt show-details &
(the perl64 interpreter prevents out of memory conditions on AIX systems)

Linux:
Change directory to where servermon.pl is stored and issue:
nohup perl servermon.pl noprompt show-details &
or
nohup ./servermon.pl noprompt show-details &

To run the script in the foreground with prompts, the same requirement applies: it must be run locally as the instance user in order to collect a complete set of docs on an IBM Spectrum Protect server.

Windows:
Run the script from a DB2 command window:
Start -> Run -> db2cmd
From there, cd to the target directory where servermon.pl was saved and issue this command:
perl servermon.pl prompt

AIX:
Change directory to where servermon.pl is stored and issue:
perl64 servermon.pl prompt
(the perl64 interpreter prevents out of memory conditions on AIX systems)

Linux:
Change directory to where servermon.pl is stored and issue:
perl servermon.pl prompt
or
./servermon.pl prompt

3. Follow the prompts which are listed below with their explanation:

Press enter to accept default values shown in brackets:

Enter the servername as found in dsm.opt or dsm.sys [servermon]:
On a Windows server, specify the TCP/IP address of the target server. Because the script is expected to run on the same machine as the server, localhost usually works.
For all non-Windows environments, specify the servername as found in dsm.sys/dsm.opt.

Collect SHOW THREADS output? (Y/N) [Y]:
Specifies whether or not to run the "show threads" command. Default value is Y.

Collect SHOW DEDUPDELETE output? (Y/N) [Y]:
If running legacy deduplication (DEVCLASS=FILE sequential pools), specify Y for this option. If you are running only container pools, enter N.

Collect SHOW for memory statistics? (Y/N) [N]:
SHOW ALLOC/MEM/MEMTREND output is not collected by default, but the information is helpful if the symptom to investigate is memory usage related. Please use as advised by IBM support.

Collect SHOW REPLICATION output? (Y/N) [N]:
SHOW REPLICATION can sometimes take a long time to run, so the default is N. Please use as advised by IBM support.

Run Procstack command? (Y/N) [N]:
Collects process call stack.  Please use as advised by IBM support.

Enable collectstmt instrumentation? (N/Y) [Y]:
Enabling COLLECTSTMT will result in the generation of instrumentation statistics for SQL statements. Default value is Y.

Enable SNAPSHOT FOR DYNAMIC SQL? (N/Y) [N]:
Enabling SNAPSHOT FOR DYNAMIC SQL will result in the generation of additional DB2 statistics. As the collection could impact server performance the default is set to not collect the statistics.
Note: collecting a snapshot for dynamic SQL usually requires enabling certain DB2 monitors. Please use as advised by IBM support.

Monitor log spanning? (N/Y) [Y]:
Log span monitoring is enabled by default. The information gathered can help to analyze transactions that use a lot of active log space.

Enter comma separated stop events, e.g. ANR#####E,ANR9999D_########## [none]:
Here you can specify a comma separated list of messages as logged to the activity log. If any of the messages specified is found during the runtime of the script, the script will go through some final doc collection and then stop.

Enter whitespace separated traceflags [none]:
Here you can specify any server traceflags requested by IBM support. Tracing stops when the script is stopped by sending SIGINT or when the number of diagnostic cycles is reached (see below).

Enter traceflags to disable [none]:
Here you can specify any server traceflags to disable. For example, if you specified the trace aggregate AF, you might want to disable the AFTXN subset. Specify as instructed by IBM support.

Copy FFDCLOG file? (N/Y) [Y]:
At the end of each documentation collection cycle (see below), add a copy of the FFDCLOG file to the collected docs. Default is "Y". This is only supported for IBM Spectrum Protect servers, not storage agents.

Enter the DB2 instance name [tsminst1]:
Press Enter if your instance name is tsminst1, otherwise enter your instance name.

To stop the data collection, use CTRL+C and let the script terminate the last collection cycle and create the .zip file.

1. To collect monitoring data for an IBM Spectrum Protect storage agent (STA), review the client dsm.sys or dsm.opt file for the following options:

  • Lanfreecommmethod (supported on *IX via server stanza)
  • Lanfreeshmport (supported in *IX via server stanza)
  • Lanfreetcpport
  • Lanfreetcpserveraddress


2. On *IX systems, create a server stanza in dsm.sys for the storage agent that matches the communication definitions identified in Step 1.

TCPServeraddress -> Lanfreetcpserveraddress
TCPPort -> Lanfreetcpport
Commmethod -> Lanfreecommmethod
Shmport -> Lanfreeshmport
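The mapping above can be applied mechanically. The following is a minimal sketch that reads the LAN-free options from a client options file and prints a matching server stanza for review. sta_stanza_from_client_opts is a hypothetical helper, the stanza name is your own choice, and the option names are assumed to be written in full (in any letter case).

```shell
# sta_stanza_from_client_opts STANZA_NAME OPTFILE  (hypothetical helper)
# Prints a dsm.sys-style stanza whose communication options are copied from
# the LANFREE* options in OPTFILE, following the mapping listed above.
# Assumes option names are not abbreviated; nothing is written to dsm.sys.
sta_stanza_from_client_opts() {
    stanza_name=$1
    optfile=$2
    echo "SErvername $stanza_name"
    awk '
        { key = tolower($1) }
        key == "lanfreetcpserveraddress" { print "  TCPServeraddress " $2 }
        key == "lanfreetcpport"          { print "  TCPPort " $2 }
        key == "lanfreecommmethod"       { print "  COMMMethod " $2 }
        key == "lanfreeshmport"          { print "  SHMPort " $2 }
    ' "$optfile"
}

# Example (stanza name and path are placeholders):
# sta_stanza_from_client_opts sta_monitor /path/to/dsm.sys
```

The printed stanza can then be appended to dsm.sys by hand once it has been checked.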


For Windows systems, you need to configure the STA to allow TCP communication by adding "commmethod tcpip" to dsmsta.opt, if it is not already present, in order for the script to work properly.

3. To start the script in the foreground, no additional parameter is needed. To start the script in the background, do the following:
Windows:
From a Windows Command Prompt, change directory to where servermon.pl is stored and issue this command:
perl servermon.pl prompt

AIX:
Change directory to where servermon.pl is stored and issue:
perl64 servermon.pl prompt
(the perl64 interpreter prevents out of memory conditions on AIX systems)
 

Linux:
Change directory to where servermon.pl is stored and issue:
perl servermon.pl prompt
or
./servermon.pl prompt

Press enter to accept default values shown in brackets:

Enter the servername as found in dsm.opt or dsm.sys [servermon]:
On a Windows server, specify the TCP/IP address of the storage agent. Because the script is expected to run on the same machine as the storage agent, localhost usually works. For all non-Windows environments, specify the servername as found in dsm.sys/dsm.opt.


1. If the script is run with noprompt and a set number of diagnostic/collection cycles, it can also be started from a task scheduler (making sure that the DB2 environment is established correctly first). This way, the script can run continuously from day to day to gather performance data or monitoring information for a given problem. In addition, the files created can be archived (and then deleted) with IBM Spectrum Protect for historical purposes, using the archive retention policies set up in the archive copy group.

2. You must change the [DB2] stanza if your instance name is something other than the default of tsminst1:

[DB2]
instance        = tsminst1     # DB2 instance name, if your DB2INSTANCE environment variable is set it

dbalias         = tsmdb1       # DB2 database alias for TSM database

When the script is invoked it will verify that the instance specified exists, and if not the script will provide a list of instances to select from.
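For the task scheduler approach in item 1, a cron entry on a *IX system could look like the line printed by the sketch below. This is an illustration under several assumptions: servermon_cron_entry is a hypothetical helper, the DB2 environment is assumed to be established by sourcing sqllib/db2profile in the instance owner's home directory, servermon.ini is assumed to be configured already, and the paths in the example are placeholders. On AIX, perl64 would replace perl, as in the commands shown earlier.

```shell
# servermon_cron_entry HOUR MINUTE DB2_HOME SERVERMON_DIR  (hypothetical helper)
# Prints one crontab line (minute hour day month weekday command) that loads
# the DB2 environment, changes to the servermon directory and starts the
# script without prompts. Install it in the instance user's crontab.
servermon_cron_entry() {
    hour=$1
    minute=$2
    db2_home=$3
    servermon_dir=$4
    printf '%s %s * * * . %s/sqllib/db2profile && cd %s && perl servermon.pl noprompt >/dev/null 2>&1\n' \
        "$minute" "$hour" "$db2_home" "$servermon_dir"
}

# Example (placeholder paths), starting daily at 17:00:
# servermon_cron_entry 17 0 /home/tsminst1 /tsminst1/servermon
```

The printed line is meant to be reviewed and added with crontab -e while logged in as the instance user.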


In most cases, the script creates a .zip file in the current directory. That is the file that needs to be sent to support.

In some cases, if the script is interrupted before it completes the collection cycle, it will be necessary to zip or tar.Z the data collection directory located in the current directory. The directory has the following naming convention: YYYYMMDD-HHMM-swg21590928
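Packaging a leftover collection directory by hand could be done as in the following sketch. archive_collection_dirs is a hypothetical helper that looks in the current directory for directories matching the naming convention above. It uses tar with gzip rather than compress (tar.Z), on the assumption that gzip is more widely installed; use zip or compress instead if support asks for that format.

```shell
# archive_collection_dirs  (hypothetical helper, run from the servermon directory)
# Creates NAME.tar.gz for every directory named YYYYMMDD-HHMM-swg21590928
# in the current directory and prints the name of each archive created.
# gzip is used here in place of compress; this is an assumption, not a requirement.
archive_collection_dirs() {
    for dir in [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]-swg21590928; do
        [ -d "$dir" ] || continue    # skips the unexpanded pattern when nothing matches
        tar -cf - "$dir" | gzip > "$dir.tar.gz" && echo "$dir.tar.gz"
    done
}

# Example:
# cd /path/to/servermon && archive_collection_dirs
```

The original directory is left in place, so it can be removed once the archive has been sent.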

Data collected can be submitted to IBM using either ECUREP or Blue Diamond, depending on the account.

If using ESR, update the PMR to indicate that data has been sent.


Filename        Download                        Version    cksum                           md5sum
servermon.pl    servermon-pl-V3-11.zip          3.11       856320126 111430 servermon.pl   4cfc9afc72f432ff6164c4646e238f59 *servermon.pl
servermon.ini   servermon-ini-V3-10-001_0.zip   3.10.001   1118347151 4756 servermon.ini   5bdf34393a45d2e1acb282f8114bcda4 *servermon.ini
commands.ini    commands-ini-V3-11-002.zip      3.11.002   579768499 24663 commands.ini    dc80e152f9ddfd66b5f6f9c056ba3855 *commands.ini

The checksums above apply to the files after they are extracted from the .zip files.


Related information

IT28080: SPECTRUM PROTECT SERVER CRASH WHEN SHOW DBCONN COMMAND IS RUN
IT21957: LINUX SERVER PROCESSES CAN HANG AFTER SHOW THREADS IS RUN (fixed in 7.1.8/8.1.3)
IC96438: DSMSERV CRASH ADMINSTRUMENTEND END INSTRUMENTATION COMMAND (fixed in 7.1.1)
Script out of memory problem on AIX
ActivePerl
Cygwin
Strawberry Perl for Windows

Document information

More support for: IBM Spectrum Protect

Component: Server

Software version: All Supported Versions

Operating system(s): Platform Independent

Reference #: 1432937

Modified date: 12 March 2019