This section describes some essential commands
for troubleshooting and performance monitoring on Linux and UNIX platforms.
For details on any of these commands, enter "man" followed by the command name
on the command line. Use these commands to gather and process data
that can help identify the cause of a problem with
your system. Once the data is collected, it can be examined by someone
who is familiar with the problem, or provided to IBM Software Support if requested.
Troubleshooting commands (AIX)
The following AIX® system commands are useful for DB2® troubleshooting:
- errpt
- The errpt command reports system errors such
as hardware errors and network failures.
- For an overview that shows one line per error, use errpt
- For a more detailed view that shows one page for each error, use errpt -a
- For errors with an error number of "1581762B", use errpt -a -j 1581762B
- To find out if you ran out of paging space in the past, use errpt | grep SYSVMM
- To find out if there are token ring card or disk problems, check
the errpt output for the phrases "disk" and "tr0"
- lsps
- The lsps -a command monitors and displays how
paging space is being used.
- lsattr
- This command displays various operating system parameters. For
example, use the following command to find out the amount of real
memory on the database partition (reported in the realmem attribute):
lsattr -l sys0 -E
- xmperf
- For AIX systems using Motif,
this command starts a graphical monitor that collects and displays
system-related performance data. The monitor displays three-dimensional
diagrams for each database partition in a single window, and is good
for high-level monitoring. However, if activity is low, the output
from this monitor is of limited value.
- spmon
- If you are using system partitioning as part of the Parallel System
Support Program (PSSP), you might need to check if the SP Switch is
running on all workstations. To view the status of all database partitions,
use one of the following commands from the control workstation:
- spmon -d for ASCII output
- spmon -g for a graphical user interface
Alternatively, use the command netstat -i from
a database partition workstation to see if the switch is down. If
the switch is down, there is an asterisk (*) beside the database partition
name. For example: css0* 65520 <Link>0.0.0.0.0.0
The asterisk does not display if the switch is up.
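The netstat -i check described above can be scripted. The following is a minimal sketch, assuming the output format of the example line (the interface name is the first field, with a trailing asterisk when the switch is down). The function name down_interfaces is hypothetical and is not part of AIX, PSSP, or DB2.

```shell
# Hypothetical helper: print every interface name that carries a trailing "*".
# Reads "netstat -i" output on standard input, one interface per line.
down_interfaces() {
    # $1 is the interface name; the pattern matches a literal "*" at its end.
    awk '$1 ~ /\*$/ { print $1 }'
}
```

To use it, pipe the command output into the function, for example: netstat -i | down_interfaces. No output means that no interface is flagged as down.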
Troubleshooting commands (Linux and UNIX)
The following system commands are for all Linux and UNIX systems, including AIX, unless otherwise noted.
- df
- The df command lets you see if file systems
are full.
- To see how much free space is in all mounted file systems, use df
- To see how much free space is in all file systems with names containing
"dev", use df | grep dev
- To see how much free space is in your home file system, use df /home
- To see how much free space is in the file system "tmp", use df /tmp
- To see if there is enough free space on the machine, check the
output from the following commands: df /usr, df /var, df /tmp, and df /home
- truss
- This command is useful for tracing system calls in one or more
processes. On Linux, the strace command serves the same purpose.
- pstack
- Available for Solaris 2.5.1 or later, the /usr/proc/bin/pstack command
displays stack traceback information. The /usr/proc/bin directory
contains other tools for debugging processes that seem to be suspended.
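The df checks listed above can be combined into one pass over the output. The following is a minimal sketch, assuming that df accepts the POSIX -P option (one line per file system, with the capacity in the fifth field and the mount point in the sixth) and that mount points contain no spaces. The function name full_filesystems and the 90 percent threshold are hypothetical choices for illustration.

```shell
# Hypothetical helper: list mount points whose usage is at or above a limit.
# Reads "df -P" output on standard input; $1 is the limit in percent.
full_filesystems() {
    awk -v limit="$1" '
        NR > 1 {                      # skip the header line
            pct = $5
            sub(/%/, "", pct)         # "95%" becomes "95"
            if (pct + 0 >= limit)
                print $6, $5          # mount point, then capacity
        }
    '
}
```

For example, df -P /usr /var /tmp /home | full_filesystems 90 prints only the file systems that are at least 90 percent full.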
Performance monitoring tools
The following tools are available for monitoring the performance of your system.
- vmstat
- This command is useful for determining if something is suspended
or just taking a long time. You can monitor the paging rate, found
under the page in (pi) and page out (po) columns. Other important
columns on AIX are the number of active virtual memory pages (avm) and
the size of the free list (fre).
- iostat
- This command is useful for monitoring I/O activities. You can
use the read and write rate to estimate the amount of time required
for certain SQL operations (if they are the only activity on the system).
- netstat
- This command displays the network traffic on each database
partition and the number of error packets encountered. It is useful
for isolating network problems.
- system file
- Available for the Solaris operating system, the /etc/system file
contains definitions for kernel configuration limits such as the maximum
number of users allowed on the system at a time, the maximum number
of processes per user, and the interprocess communication (IPC) limits
on size and number of resources. These limits are important because
they affect DB2 performance
on a Solaris operating system machine.
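The paging-rate check described for vmstat can be automated. The following is a minimal sketch, assuming the vmstat output has a header line that names the pi and po columns, as described above. Some platforms label these columns differently (Linux, for example, reports swap activity as si and so), in which case the sketch prints nothing. The function name paging_samples and the threshold argument are hypothetical.

```shell
# Hypothetical helper: print the vmstat samples whose paging activity
# (pi + po) exceeds a limit. Reads vmstat output on standard input;
# $1 is the limit and defaults to 0.
paging_samples() {
    awk -v limit="${1:-0}" '
        !pi_col {
            # Still looking for the header: find the pi and po fields by name.
            for (i = 1; i <= NF; i++) {
                if ($i == "pi") pi_col = i
                if ($i == "po") po_col = i
            }
            next
        }
        # Data lines only: repeated headers are skipped by the numeric check.
        $pi_col ~ /^[0-9]+$/ && $po_col ~ /^[0-9]+$/ {
            if ($pi_col + $po_col > limit)
                print "pi=" $pi_col, "po=" $po_col
        }
    '
}
```

For example, vmstat 5 12 | paging_samples 10 takes twelve samples at five-second intervals and prints those in which more than ten pages were paged in or out. Locating the columns by name, instead of by fixed position, keeps the sketch independent of how many columns precede them.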