How do I perform a basic health check of my TPC server?
Normal operation, administration, preventive maintenance and planning
Note: This health check focuses on the TPC server and software itself. The health of the devices that TPC monitors is beyond the scope of this document.
Periodic TPC server health checks are an important part of maintaining a smoothly functioning TPC environment. They are also an important part of disaster recovery readiness.
It is particularly important to conduct a health check prior to major configuration changes or upgrades, as a poorly functioning TPC environment is likely to lead to errors, failure, and instability (server hangs, needing frequent reboots, etc.).
Do you keep a document that records your critical TPC server configuration information? This information is vital for the team administering and maintaining TPC, and for situations requiring support assistance. Examples of information you should document:
1) user accounts and passwords for TPC (login, host authentication, DB2 admin, common user, WAS admin console, JazzSM/TCR admin user)
2) deviations from a standard/default install (different install path/drive for TPC and/or DB2, multi-server install details, etc.)
TPC SERVER PLATFORM SPECIFICATIONS
When TPC was installed, the requirements for the server - OS version, memory, number of CPUs and processor speed, web browser version, DB2 version, etc. - should be at supported levels according to the TPC supported products and platforms document (link below). Review this document for your TPC version and verify that your server meets the requirements that are listed.
TPC SERVER ACCESS
1) Are you able to log in to TPC with your administrator and user accounts?
2) Are the passwords for the DB2 admin, TPC service accounts, and TPC common user set to never expire? If not, are proper controls/procedures documented and followed to change/update the passwords before they expire?
TPC SERVER LOGS
Review the most recent TPC server logs for error messages that identify problems that need to be resolved:
1) Data server (most recent <TPC>/data/log/server_xxxxxx.log, TPCD_xxxxxx.log, Scheduler_xxxxxx.log files)
2) Device server (<TPC>/device/log/msgTPCDeviceServer.log, traceTPCDeviceServer.log, dmSvcTrace.log files)
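As an aid to the log review above, here is a minimal Python sketch that picks the newest log matching a file pattern and lists the lines that look like errors. The function names `find_error_lines` and `newest_log` are hypothetical, and the markers it searches for (`ERROR`, `SEVERE`, `Exception`) are assumptions, not a documented TPC message format. Adjust the pattern to match what your own logs contain.

```python
import re
from pathlib import Path

# Assumed error markers; these are NOT taken from TPC documentation.
DEFAULT_PATTERN = re.compile(r"ERROR|SEVERE|Exception")

def find_error_lines(log_path, pattern=DEFAULT_PATTERN, limit=50):
    """Return up to `limit` (line_number, text) pairs whose text matches `pattern`."""
    hits = []
    with open(log_path, "r", encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if pattern.search(line):
                hits.append((number, line.rstrip("\n")))
                if len(hits) >= limit:
                    break
    return hits

def newest_log(log_dir, glob_pattern):
    """Return the most recently modified file matching glob_pattern, or None."""
    candidates = list(Path(log_dir).glob(glob_pattern))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

For example, `newest_log("<TPC>/data/log", "server_*.log")` (with `<TPC>` replaced by your install path) returns the latest Data server log, which can then be passed to `find_error_lines`. The sketch only reads files; it does not change them.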
Check the TPC server directories for old/large logs, dumps, etc. that can be deleted to save space. Refer to "Related Information" below for links to technotes on these topics.
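One way to find cleanup candidates is to walk the TPC directories and report files that are either large or have not been modified for some time. The sketch below is a report-only illustration: the function name `find_cleanup_candidates` and the thresholds (100 MB, 30 days) are assumptions chosen for the example, not IBM recommendations. Confirm with the relevant technotes that a file is safe to remove before deleting it.

```python
import time
from pathlib import Path

def find_cleanup_candidates(root, min_size_mb=100, min_age_days=30):
    """List files under `root` that are at least `min_size_mb` in size
    OR have not been modified for at least `min_age_days`.

    Returns (path, size_bytes, age_days) tuples, largest files first.
    Nothing is deleted; this only reports.
    """
    now = time.time()
    min_size_bytes = min_size_mb * 1024 * 1024
    candidates = []
    for path in Path(root).rglob("*"):
        try:
            if not path.is_file():
                continue
            info = path.stat()
        except OSError:
            continue  # file vanished or is unreadable; skip it
        age_days = (now - info.st_mtime) / 86400
        if info.st_size >= min_size_bytes or age_days >= min_age_days:
            candidates.append((path, info.st_size, age_days))
    candidates.sort(key=lambda item: item[1], reverse=True)
    return candidates
```

Run it against the TPC install directory and review the output by hand. Files that are still being written to by a running server should be left alone.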
DB2 HEALTH AND RECOVERABILITY
1) Do you take regular backups of the TPC database? Check to make sure you have a current/recent backup and that scheduled backups are taking place as planned.
2) Are the DB2 services up and running? Does the DB2 TCP/IP port (usually port 50000) appear in the output of a netstat command, in the LISTENING state?
3) If you have the DB2 Control Center available (DB2 v9.7 and older), use the Health Center to check for alert conditions needing attention, and use the 'Recommendation Advisor' for guidance on remedies.
4) Consider installing and using IBM Data Studio for DB2 v10.1 and newer versions for access to tools equivalent to the DB2 Control Center Health Center.
5) Locate and scan the 'db2diag.log' file for error messages and conditions requiring action/attention. Note: If this file is very large, consider running a 'db2support' command to capture the current log, and then run 'db2diag -A' to archive the current log and initialize a new one.
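As a complement to the netstat check in item 2, a short Python sketch can confirm that something accepts TCP connections on the DB2 port. The function name `port_is_listening` is hypothetical, and port 50000 is only the usual default mentioned above; your instance may use a different port. A successful connection shows that a listener is present, not that DB2 itself is healthy.

```python
import socket

def port_is_listening(host="localhost", port=50000, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

Calling `port_is_listening()` on the TPC server itself tests the default port locally. Passing the server host name from another machine also exercises any firewall in between.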
OVERALL SERVER HEALTH
1) Do you have backup software running on your server to back up the server for disaster recovery? Check to make sure you have a current/recent backup of the server that can be used for recovery, and that scheduled backups are taking place as planned.
2) Check your system disks/filesystems for adequate disk space (for example, OS: C:, /root, /tmp filesystems; application: TPC install disk/filesystem, DB2 install/database disk/filesystem).
3) Check system and application event/error logs, errpt/syslog for error messages or conditions needing action/attention.
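The disk space check in item 2 can be sketched in Python with the standard library. The function name `free_space_report` and the 10% free-space threshold are assumptions for illustration; choose a threshold that suits your environment, and pass in the paths that apply to your server.

```python
import shutil

def free_space_report(paths, min_free_percent=10.0):
    """Map each path to (free_percent, ok), or to None if it cannot be checked.

    `ok` is True when the free space is at least `min_free_percent`.
    """
    report = {}
    for path in paths:
        try:
            usage = shutil.disk_usage(path)
        except OSError:
            report[path] = None  # path missing or not accessible
            continue
        if usage.total == 0:
            report[path] = None  # avoid dividing by zero on unusual filesystems
            continue
        free_percent = 100.0 * usage.free / usage.total
        report[path] = (round(free_percent, 1), free_percent >= min_free_percent)
    return report
```

On a Linux or AIX server this might be called as `free_space_report(["/", "/tmp"])` plus the TPC and DB2 filesystems. On Windows, pass drive roots such as `"C:\\"`.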
Segment: Storage Management
Product: Tivoli Storage Productivity Center Standard Edition
Platform: AIX, Linux, Windows
Version: 4.2, 4.2.1, 4.2.2