Data Collector not Connecting to Managing Server
The Data Collector (DC) does not appear on the Managing Server (MS) Console because it has not connected properly.
Resolving the problem
The following steps will help you isolate the problem when the Data Collector does not connect to the Managing Server. Please do the following:
- Verify WebSphere AppServer was Configured
On the DC side, verify that the server.xml file was configured and updated for ITCAM. To do this, search for genericJvmArguments and check whether the ITCAM parameters are included. If the ITCAM parameters do not appear in genericJvmArguments, the configuration was not successful, even if the configuration logs suggested that it was. In an ND environment, you may want to force a node sync to ensure that the changes are pushed down to the AppServer.
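The search described above can also be scripted. The following is a minimal sketch, not an IBM-provided tool: it collects every genericJvmArguments attribute from the text of a server.xml and reports whether an argument beginning with -Xrunam (the ITCAM argument named later in this document) is present. The sample XML is a hypothetical, heavily simplified stand-in for a real server.xml.

```python
import xml.etree.ElementTree as ET


def generic_jvm_arguments(server_xml_text):
    """Return every genericJvmArguments attribute value found in the XML text."""
    root = ET.fromstring(server_xml_text)
    values = []
    for element in root.iter():
        value = element.attrib.get("genericJvmArguments")
        if value is not None:
            values.append(value)
    return values


# Hypothetical, simplified stand-in for a real server.xml (assumption).
SAMPLE = """<server>
  <processDefinitions>
    <jvmEntries genericJvmArguments="-Xrunam_example -Dsample.flag=true"/>
  </processDefinitions>
</server>"""

for args in generic_jvm_arguments(SAMPLE):
    print("ITCAM -Xrunam argument present:", "-Xrunam" in args)
```

An empty result, or a value without any ITCAM argument, points to the unsuccessful configuration described above.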
- Check Firewalls
Are there any firewalls between the DCs and the MS? If so, ports need to be opened between the MS and the DC.
From DC to MS, you need the Kernel and PS ports open. Kernel ports are needed for connectivity and PS ports are needed for data to flow. These are defined in the <MS_HOME>/bin/setenv.sh script on the Managing Server. By default, the ports needed are:
The ports needed from the MS to the DC are defined in the <DC_HOME>/runtime/<was.node.svr>/ datacollector.properties file. By default, the ports needed are:
Note that only one port from each range will be used by a given DC. So, for example, if you have 10 DCs on a given host, you will need at least 10 ports defined in each range so that the MS can communicate back to the DC.
Verify that you can get to a port from both directions by telnetting. For example:
telnet <DC IP address or hostname> port
telnet DCHOST.us.ibm.com 8300
telnet <MS IP address or hostname> port
telnet MSHOST.us.ibm.com 9122
Review the ITCAM Install Guide for firewall setup instructions.
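If telnet is not installed on a host, the same reachability test can be approximated with a short script. This is a sketch only: the function name and the timeout value are illustrative choices, and the host names and ports in the comments are the examples used above.

```python
import socket


def port_is_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example calls, using the sample hosts and ports from the telnet commands:
#   port_is_reachable("DCHOST.us.ibm.com", 8300)   # run on the MS
#   port_is_reachable("MSHOST.us.ibm.com", 9122)   # run on the DC
```

As with telnet, a successful connection only shows that the port is open in that direction. Run the check from both the MS and the DC.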
- Make sure you can download files from the MS to the DC
Test to see if files can be downloaded from the MS to the DC by executing the following from the DC:
Use the codebase port defined to your MS in setenv.sh. The default codebase port is 9122.
If the files get downloaded via this mechanism but do not get downloaded when DC code is communicating with MS code, the problem may be with the WebSphere Application Server configuration where the DC is being configured.
- Check Network Interface Cards (NICs)
Do multiple NICs exist on the DC? If so, you may need to set the following properties
in the Data Collector definitions found in the <DC_HOME>/runtime/<wasver.node.svr>/
am.socket.bindip=<IP address by which DC knows itself>
am.socket.exportip=<IP address by which MS knows DC>
Make sure the DC uses am.socket.bindip to access the MS and the MS accesses the DC by using am.socket.exportip.
From the Data Collector, you can get the DC IP address that should be used for the am.socket.bindip setting by executing ipconfig, ifconfig, or pinging the DC hostname. From the Managing Server, you can get the DC IP address to be used for the am.socket.exportip by executing nslookup <DC_HOSTNAME>, looking in the /etc/hosts file, or pinging the DC hostname. Make sure you can ping the DC IP address from the MS.
Another alternative is to set the java.rmi.server.hostname argument in the DC AppServer. This IP address should be set to the IP address that the MS should use for the DC.
You can add the java.rmi.server.hostname definition to the system generated data collector properties file for the DC or through the WAS Admin Console. For the properties file, add:
java.rmi.server.hostname=192.168.10.161 (or whatever the DC IP address is that you want to be used)
If you are adding this definition through the WebSphere Admin Console, navigate as follows:
WebSphere Application Server 6, 7, 8
- Select Server > Application Servers and select the <ServerName>.
- In the Configuration tab, navigate to Server Infrastructure > Java and Process Management > Process Definition > Additional Properties: Custom Properties.
WebSphere Application Server 5
- Select Server > Application Servers and select the <ServerName>.
- Navigate to Additional Properties: Process Definition > Additional Properties: Environment Entries
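To review which of the address-related properties discussed in this section are set in a data collector properties file, a small script can list them. This is an illustrative sketch: it handles only plain key=value lines, and the sample text is hypothetical, reusing the example IP address shown above.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # or ! comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


# Property names taken from this section.
ADDRESS_KEYS = (
    "am.socket.bindip",
    "am.socket.exportip",
    "java.rmi.server.hostname",
)


def address_settings(text):
    """Return the address-related properties; None marks a property that is not set."""
    props = parse_properties(text)
    return {key: props.get(key) for key in ADDRESS_KEYS}


# Hypothetical excerpt of a data collector properties file (assumption).
SAMPLE = """# sample excerpt
am.socket.bindip=192.168.10.161
am.socket.exportip=192.168.10.161
"""

print(address_settings(SAMPLE))
```

A value of None for a property simply means that it is not defined in the file. Whether it needs to be defined depends on the NIC setup described above.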
Is the DC that is having problems the only DC on the physical box, or do multiple DCs exist on one box? If there are multiple DCs, special setup may be required. For example, if a firewall exists, each DC needs its own set of RMI ports defined in the system generated datacollector.properties file (probe_rmi_port, probe_controller_rmi_port). Alternatively, you can put a range for each of these in the datacollector.properties file and recycle the DC AppServer.
Check the WebSphere server.xml file for proxy host settings similar to the following:
- http.proxyHost - sets a proxy host in WebSphere
- http.nonProxyHosts - indicates the hosts which should be connected to directly and not through the proxy server.
- value - the list of hosts that should be connected to directly and not through the proxy server.
Go to the <DC_HOME>/runtime/<was.node.appsvr>/dc.java.properties file and make sure that
dc.operation.mode=ms,wr or dc.operation.mode=ms
dc.operation.mode must include the ms setting if the DC is to communicate with the MS. wr is needed for the DC to connect to the ITM environment.
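The rule above amounts to checking a comma-separated list for the ms entry. A minimal sketch, assuming the value has already been read from dc.java.properties (the function name is illustrative):

```python
def ms_mode_enabled(mode_value):
    """Return True if a dc.operation.mode value lists the ms mode."""
    modes = [mode.strip() for mode in mode_value.split(",")]
    return "ms" in modes


print(ms_mode_enabled("ms,wr"))  # both MS and ITM connectivity requested
print(ms_mode_enabled("wr"))     # ITM only; the DC will not contact the MS
```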
If different IDs are used to install, permissions on certain directories may affect installations after the first DC (e.g. permissions on /tmp/ibm_am_installed_dc). Permission errors are not always obvious. Search for "Denied" in the logs.
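The log search suggested above can be done with any text search tool. As one illustration, the sketch below returns the numbered lines of a log that mention "denied" in any letter case. The sample log lines are invented for the example and are not real ITCAM output.

```python
def lines_mentioning_denied(log_text):
    """Return (line_number, line) pairs for lines containing 'denied', ignoring case."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if "denied" in line.lower()
    ]


# Invented sample log content (assumption, not real ITCAM output).
SAMPLE_LOG = """starting configuration
cannot write marker file: Permission denied
configuration finished
"""

for number, line in lines_mentioning_denied(SAMPLE_LOG):
    print(number, line)
```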
In the DC server.xml, verify that the Generic JVM Arguments contain no -Xrun arguments other than -Xrunam*. Because of a Java limitation prior to Java 1.5, ITCAM cannot run if another -Xrun argument is present.
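The -Xrun check above can be expressed as a simple filter over the argument string. This is a sketch under one assumption: the arguments are separated by whitespace and none of them contains quoted spaces. The sample string is hypothetical; -Xrunjdwp stands in for any non-ITCAM -Xrun argument.

```python
def unexpected_xrun_arguments(generic_jvm_arguments):
    """Return the -Xrun arguments that do not start with -Xrunam."""
    return [
        arg
        for arg in generic_jvm_arguments.split()
        if arg.startswith("-Xrun") and not arg.startswith("-Xrunam")
    ]


# Hypothetical Generic JVM Arguments value (assumption).
SAMPLE_ARGS = "-Xrunam_example -Xrunjdwp:transport=dt_socket -Dsample.flag=true"

print(unexpected_xrun_arguments(SAMPLE_ARGS))
```

Any argument the function returns is a candidate for removal before the DC can start correctly.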
ITCAM cannot run if xhealthcenter is installed. If you see xhealthcenter in the server.xml where the DC is installed, xhealthcenter will have to be removed. You will also see an error similar to the following in the WebSphere SystemErr.log when xhealthcenter is installed:
<PPECONTROLLER, c1cd50e4-1757-e001-2e4d-afc9e243e427.620, 10.48.62.11>
Unable to join Kernel 10.48.32.85:9120 [12/7/11 13:09:02:884 EST] 00000164 SystemErr
java.rmi.ServerException: RemoteException occurred in server thread;
nested exception is: java.rmi.UnmarshalException: error unmarshalling arguments;
nested exception is: java.lang.ClassNotFoundException: com.cyanea.probe.ControllerAVMData
Go to the MS and execute "klctl.sh status" and make sure all of the components are running.
On the DC side, verify that the MS home directory is correct in the WebSphere's variables.xml for that AppServer. The home directory may also be defined via the java.rmi.server.codebase setting in the <DC_HOME>/runtime/<wasver.node.svr>/<wasver.node.svr>.datacollector.properties file.
If an error message exists in the MS kernel logs indicating that the connection is refused for an RMI connection, make sure that the correct hostname or IP address is listed for the connection. See the section above for multiple NIC cards for setup considerations.
If the customer has other applications using RMI, the default am.rmisocket.timeout setting of -1 (infinite) may cause a problem. In this case, set am.rmisocket.timeout=180000 in datacollector.properties.
Here is an explanation:
The DC uses an RMI mechanism to join the MS kernel. By default, the 'exportObject(Remote obj, int port)' method of class 'java.rmi.server.UnicastRemoteObject' is used to export ProbeController so that it can receive incoming calls from the kernel. In this case the timeout is set to -1 (infinite).
The customer's applications also use RMI. After the applications start, the DC RMI connection is reset, but because of the infinite timeout no reconnection with the kernel happens.
In the DC AppServer SystemOut.log, you should see a PPECONTROLLER message stating that the DC has joined the kernel if the DC connects to the MS. If the DC is configured via the MS Application Monitor and MS has communicated back to the DC, you will see PPEPROBE messages in the log as well.
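To pull the relevant messages out of a large log, a script can collect the lines that carry each marker. This is a minimal sketch: it only finds lines containing PPECONTROLLER or PPEPROBE and does not interpret them. The sample text reuses the error excerpt shown earlier in this document, which illustrates why the matched lines still need to be read: a PPECONTROLLER line can also report a failure to join.

```python
MARKERS = ("PPECONTROLLER", "PPEPROBE")


def marker_lines(log_text):
    """Return, for each marker, the log lines that contain it."""
    found = {marker: [] for marker in MARKERS}
    for line in log_text.splitlines():
        for marker in MARKERS:
            if marker in line:
                found[marker].append(line)
    return found


# Excerpt taken from the error example earlier in this document.
SAMPLE_LOG = """<PPECONTROLLER, c1cd50e4-1757-e001-2e4d-afc9e243e427.620, 10.48.62.11>
Unable to join Kernel 10.48.32.85:9120
"""

result = marker_lines(SAMPLE_LOG)
print(len(result["PPECONTROLLER"]), "PPECONTROLLER line(s)")
print(len(result["PPEPROBE"]), "PPEPROBE line(s)")
```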
|Systems and Asset Management||Tivoli Composite Application Manager for Application Diagnostics||AIX, HP-UX, Linux, Solaris, Windows||7.1|
|Systems and Asset Management||Tivoli Composite Application Manager for Application Diagnostics on z/OS||z/OS||6.1|
More support for:
Tivoli Composite Application Manager for WebSphere
Software version: All Versions
Operating system(s): AIX, HP-UX, Linux, Solaris, Windows, z/OS
Reference #: 1267468
Modified date: 10 January 2011