Basic TPC Server Health Check

Technote (FAQ)


Question

How do I perform a basic health check of my TPC server?

Cause

Normal operation, administration, preventive maintenance and planning

Answer

Note: This health check covers the TPC server and its software only. The health of the devices that TPC monitors is beyond the scope of this document.

Periodic TPC server health checks are an important part of maintaining a smoothly functioning TPC environment. They are also an important part of disaster recovery readiness.

It is particularly important to conduct a health check before major configuration changes or upgrades, because a poorly functioning TPC environment is likely to lead to errors, failures, and instability (for example, server hangs or the need for frequent reboots).

DOCUMENTATION

Do you keep a document that records your critical TPC server configuration information? This information is vital for the team administering and maintaining TPC, and for situations requiring support assistance. Examples of information you should document:

1) user accounts and passwords for TPC (login, host authentication, DB2 admin, common user, WAS admin console, JazzSM/TCR admin user)

2) deviations from a standard/default install (different install path/drive for TPC and/or DB2, multi-server install details, etc.)

TPC SERVER PLATFORM SPECIFICATIONS

The server that hosts TPC should be at supported levels (OS version, memory, number of CPUs and processor speed, web browser version, DB2 version, and so on) according to the TPC supported products and platforms document (see "Related information" below). Review this document for your TPC version and verify that your server meets the listed requirements.

TPC SERVER ACCESS

1) Are you able to log in to TPC with your administrator and user accounts?

2) Are the passwords for the DB2 admin, TPC service accounts, and TPC common user set to never expire, or are proper controls and procedures documented and followed to change or update passwords before they expire?

TPC SERVER LOGS

Review the most recent TPC server logs for error messages that identify problems needing resolution:

1) Data server (most recent <TPC>/data/log/server_xxxxxx.log, TPCD_xxxxxx.log, Scheduler_xxxxxx.log files)

2) Device server (<TPC>/device/log/msgTPCDeviceServer.log, traceTPCDeviceServer.log, dmSvcTrace.log files)

Check the TPC server directories for old/large logs, dumps, etc. that can be deleted to save space. Refer to "Related Information" below for links to technotes on these topics.
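
As a rough illustration of the cleanup check above, the following Python sketch lists files under a directory that are either large or old. The function name `find_cleanup_candidates` and the default thresholds (50 MB, 30 days) are assumptions made for this example, not TPC recommendations. The sketch only reports candidates and deletes nothing; follow the technotes under "Related information" before removing any file.

```python
import time
from pathlib import Path


def find_cleanup_candidates(log_dir, min_size_bytes=50 * 1024 * 1024,
                            min_age_days=30, now=None):
    """Return (path, size_bytes, age_days) for files that are large or old.

    The thresholds are illustrative assumptions, not TPC recommendations.
    """
    now = time.time() if now is None else now
    candidates = []
    for path in Path(log_dir).rglob("*"):
        if not path.is_file():
            continue
        info = path.stat()
        age_days = (now - info.st_mtime) / 86400
        if info.st_size >= min_size_bytes or age_days >= min_age_days:
            candidates.append((str(path), info.st_size, round(age_days, 1)))
    # Largest files first, since they free the most space.
    return sorted(candidates, key=lambda item: item[1], reverse=True)
```

Call it with the log directory of your own installation, for example the data/log or device/log directory under your TPC install path, and review the returned list by hand.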

DB2 HEALTH AND RECOVERABILITY

1) Do you take regular backups of the TPC database? Check to make sure you have a current/recent backup and that scheduled backups are taking place as planned.

2) Are the DB2 services up and running? Is the DB2 TCP/IP port (usually port 50000) present in the netstat command output and in the LISTENING state?

3) If you have the DB2 Control Center available (DB2 v9.7 and older), use the Health Center to check for alert conditions needing attention, and use the 'Recommendation Advisor' for guidance on remedies.

4) Consider installing and using IBM Data Studio for DB2 v10.1 and newer versions for access to tools equivalent to the DB2 Control Center Health Center.

5) Locate and scan the 'db2diag.log' file for error messages and conditions requiring action or attention. Note: If this file is very large, consider running the 'db2support' command to capture the current log, and then running 'db2diag -A' to archive the current log and start a new one.
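
Two of the checks in this list can be roughly scripted. The Python sketch below is an illustration only, not an IBM-supplied tool. `port_is_listening` attempts a TCP connection, which complements rather than replaces the netstat check. `count_diag_levels` tallies db2diag.log records by severity, assuming each record header carries a field of the form `LEVEL: Error` or `LEVEL: Severe`. Both function names are hypothetical.

```python
import re
import socket
from collections import Counter

# Assumes db2diag.log record headers contain a field such as "LEVEL: Error".
LEVEL_PATTERN = re.compile(r"\bLEVEL:\s*(\w+)")


def port_is_listening(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port is accepted."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: treat as not listening.
        return False


def count_diag_levels(lines):
    """Count db2diag.log records by the value of their LEVEL field."""
    counts = Counter()
    for line in lines:
        match = LEVEL_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

For example, `port_is_listening("localhost", 50000)` tests the usual DB2 port from the server itself. To summarize a log, open your db2diag.log with `errors="replace"` and pass the file object to `count_diag_levels`. A nonzero count for Severe or Error is a prompt to read those records, not a diagnosis.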

OVERALL SERVER HEALTH

1) Do you have backup software running on your server to back up the server for disaster recovery? Check to make sure you have a current/recent backup of the server that can be used for recovery, and that scheduled backups are taking place as planned.

2) Check your system disks/filesystems for adequate disk space (for example, OS: C:, / (root), /tmp filesystems; application: TPC install disk/filesystem, DB2 install/database disk/filesystem)

3) Check the system and application event and error logs (Windows event logs, AIX errpt, Linux syslog) for error messages or conditions needing action or attention.
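
The disk space check in item 2 can be sketched in Python with the standard library's `shutil.disk_usage`. The function name `low_space_paths` and the 10% free-space threshold are assumptions chosen for illustration; pick a threshold that suits your environment.

```python
import shutil


def low_space_paths(paths, min_free_fraction=0.10, usage_fn=shutil.disk_usage):
    """Return (path, free_fraction) for paths whose free space is below the threshold.

    The 10% default is an illustrative assumption, not a TPC requirement.
    """
    flagged = []
    for path in paths:
        usage = usage_fn(path)
        if usage.total == 0:
            # Skip pseudo-filesystems that report no capacity.
            continue
        free_fraction = usage.free / usage.total
        if free_fraction < min_free_fraction:
            flagged.append((path, round(free_fraction, 3)))
    return flagged
```

Pass the paths that matter on your server, such as the OS, TPC install, and DB2 database filesystems. An empty result means every listed path is above the threshold.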

Related information

Supported Hardware/Platforms/Products
Managing TPC Log Files
Cleaning up the TPC server directories
DB2 Health Check
Basic DB2 Maintenance for TPC


Cross reference information

Segment: Storage Management
Product: Tivoli Storage Productivity Center Standard Edition
Platform: AIX, Linux, Windows
Version: 4.2, 4.2.1, 4.2.2

Document information


More support for:

Tivoli Storage Productivity Center

Software version:

5.1, 5.1.1, 5.2

Operating system(s):

AIX, Linux, Windows

Reference #:

1666252

Modified date:

2014-03-06
