Data Collector not Connecting to Managing Server

Technote (troubleshooting)


This document applies only to the following language version(s):

US English

Problem (Abstract)

The Data Collector (DC) does not appear on the Managing Server (MS) Console because it has not connected properly.

Resolving the problem

Use the following checks to isolate the problem when the Data Collector does not connect to the Managing Server:

  • Verify WebSphere AppServer was Configured

    On the DC side, verify that the server.xml file was configured and updated for ITCAM. To do this, search for genericJvmArguments and check whether the ITCAM parameters are included. If the ITCAM parameters do not exist in genericJvmArguments, the configuration was not successful, even if the configuration logs suggested that it was. In a Network Deployment (ND) environment, you may want to force a node synchronization to ensure that the changes are pushed down to the AppServer.
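    The search described above can also be scripted. The following is a minimal Python sketch, not an ITCAM tool: it collects every genericJvmArguments attribute value in a server.xml document and checks it for a marker substring. The function names, the trimmed sample XML, and the "-Xrunam" marker (taken from the -Xrunam* argument mentioned later in this document) are assumptions for illustration; substitute the parameters that your ITCAM version actually adds.

```python
import xml.etree.ElementTree as ET


def generic_jvm_arguments(server_xml_text):
    """Return every genericJvmArguments attribute value found in the XML text."""
    root = ET.fromstring(server_xml_text)
    return [
        elem.attrib["genericJvmArguments"]
        for elem in root.iter()
        if "genericJvmArguments" in elem.attrib
    ]


def has_marker(server_xml_text, marker):
    """True if any genericJvmArguments value contains the marker substring."""
    return any(marker in args for args in generic_jvm_arguments(server_xml_text))


# Hypothetical, heavily trimmed server.xml used only for illustration.
SAMPLE = """<process:Server xmlns:process="http://example.invalid/process">
  <processDefinitions>
    <jvmEntries genericJvmArguments="-Xrunam_example -Dsample.flag=1"/>
  </processDefinitions>
</process:Server>"""

print(has_marker(SAMPLE, "-Xrunam"))  # prints True for the sample above
```

    To check a real file, read its contents into a string and pass that string to has_marker in place of SAMPLE.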

  • Check Firewalls

    Are there any firewalls between the DCs and the MS? If so, ports must be opened in both directions between the MS and the DC.

    From DC to MS, you need the Kernel and PS ports open. Kernel ports are needed for connectivity and PS ports are needed for data to flow. These are defined in the <MS_HOME>/bin/setenv.sh script on the Managing Server. By default, the ports needed are:

    PORT_KERNEL_CODEBASE01=9122
    PORT_KERNEL_RFS01=9120
    PORT_KERNEL_RMI01=9118
    PORT_PS=9103
    PORT_PS2=9104

    The ports needed from the MS to the DC are defined in the <DC_HOME>/runtime/<was.node.svr>/datacollector.properties file. By default, the ports needed are:

    probe.rmi.port=8200-8299
    probe.controller.rmi.port=8300-8399

    Note that only one port from each range will be used by a given DC. So, for example, if you have 10 DCs on a given host, you will need at least 10 ports defined in each range so that the MS can communicate back to the DC.
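    The sizing rule above is simple arithmetic: a range must contain at least as many ports as there are DCs on the host. The following Python sketch shows one way to check a range value; the helper names are hypothetical and are not part of ITCAM.

```python
def parse_port_range(value):
    """Parse '8200-8299' (or a single port such as '8200') into (low, high)."""
    low, _, high = value.strip().partition("-")
    low_port = int(low)
    high_port = int(high) if high else low_port
    if low_port > high_port:
        raise ValueError(f"range is reversed: {value!r}")
    return low_port, high_port


def range_covers(value, dc_count):
    """True if the range holds at least one port per DC on the host."""
    low_port, high_port = parse_port_range(value)
    return (high_port - low_port + 1) >= dc_count


# The default range from datacollector.properties holds 100 ports.
print(range_covers("8200-8299", 10))  # prints True
```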

    Verify that you can get to a port from both directions by telnetting. For example:

    On MS:
    telnet <DC IP address or hostname> port
    telnet DCHOST.us.ibm.com 8300

    On DC:
    telnet <MS IP address or hostname> port
    telnet MSHOST.us.ibm.com 9122

    Review the ITCAM Install Guide for firewall setup instructions.
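    If telnet is not available on a host, the same reachability test can be approximated with a short script. This is a sketch using only the Python standard library; the function name and the timeout value are illustrative assumptions. It reports whether a TCP connection to the given host and port can be opened, which is what the telnet test shows.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

    For example, port_is_open("MSHOST.us.ibm.com", 9122) run on the DC corresponds to the second telnet command above. A False result does not distinguish a firewall block from a service that is not listening.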
    • Make sure you can download files from the MS to the DC

      Test to see if files can be downloaded from the MS to the DC by executing the following from the DC:

      wget http://<MS_HOSTNAME>:9122/kernel.core.jar

      Use the codebase port defined to your MS in setenv.sh. The default codebase port is 9122.

      If the files get downloaded via this mechanism but do not get downloaded when DC
      code is communicating with MS code, the problem may be with the WebSphere
      Application Server configuration where the DC is being configured.
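      Where wget is not installed, a comparable download test can be sketched in Python. The helper names below are hypothetical; the URL layout and the default port 9122 come from the wget example above. The opener is built with an empty ProxyHandler so that the request goes directly to the MS and ignores any proxy settings in the environment, which is an assumption about how you want the test to behave.

```python
import urllib.error
import urllib.request


def codebase_url(ms_host, port=9122, file_name="kernel.core.jar"):
    """Build the codebase URL used in the wget example."""
    return f"http://{ms_host}:{port}/{file_name}"


def can_download(url, timeout=10.0):
    """Return the number of bytes fetched, or None if the request failed."""
    # An empty ProxyHandler disables proxies for this request.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return len(response.read())
    except (urllib.error.URLError, OSError):
        return None
```

      A call such as can_download(codebase_url("<MS_HOSTNAME>")) returns a byte count when the file is served and None when the connection or the request fails.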

    • Check Network Interface Cards (NICs)
      Do multiple NICs exist on the DC? If so, you may need to set the following properties
      in the Data Collector definitions found in the <DC_HOME>/runtime/<wasver.node.svr>/
      <wasver.node.svr>.datacollector.properties file:

      am.socket.bindip=<IP address by which DC knows itself>
      am.socket.exportip=<IP address by which MS knows DC>

      Make sure the DC uses am.socket.bindip to access the MS and the MS accesses the DC by using am.socket.exportip.

      From the Data Collector, you can get the DC IP address to use for the am.socket.bindip setting by running ipconfig (Windows) or ifconfig (UNIX or Linux), or by pinging the DC hostname. From the Managing Server, you can get the DC IP address to use for am.socket.exportip by running nslookup <DC_HOSTNAME>, looking in the /etc/hosts file, or pinging the DC hostname. Make sure you can ping the DC IP address from the MS.
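      As a cross-check on a multi-NIC host, the following Python sketch asks the operating system which local address it would use to reach a given remote host, and resolves a hostname the way a simple lookup would. The function names are hypothetical. The first value is only a candidate for am.socket.bindip when run on the DC against the MS address; the second is only a candidate for am.socket.exportip when run on the MS against the DC hostname.

```python
import socket


def local_ip_towards(remote_host, remote_port):
    """Return the local IPv4 address the OS selects to reach the remote host."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as probe:
        # connect() on a UDP socket sends no packets; it only selects a route.
        probe.connect((remote_host, remote_port))
        return probe.getsockname()[0]


def resolve(hostname):
    """Return the IPv4 address that the local resolver gives for hostname."""
    return socket.gethostbyname(hostname)
```

      Compare the results with the output of ifconfig, ipconfig, and nslookup before changing the properties file; the script reflects the routing table and resolver of the host it runs on, nothing more.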

      Alternatively, set the java.rmi.server.hostname argument in the DC AppServer to the IP address that the MS should use to reach the DC.

      You can add the java.rmi.server.hostname definition either to the system-generated data collector properties file for the DC or through the WAS Admin Console. For the properties file, add the following line, substituting the DC IP address that you want the MS to use:

      java.rmi.server.hostname=192.168.10.161

      If you are adding this definition through the WebSphere Admin Console, navigate as follows:

      WebSphere Application Server 6, 7, 8

      1. Select Server > Application Servers and select the <ServerName>.
      2. In the Configuration tab, navigate to Server Infrastructure > Java and Process Management > Process Definition > Additional Properties: Custom Properties.

      WebSphere Application Server 5

      1. Select Server > Application Servers and select the <ServerName>.
      2. Navigate to Additional Properties: Process Definition > Additional Properties: Environment Entries.
    • Set up for Multiple DCs on one Machine

      Is the DC that is having problems the only DC on the physical machine, or do multiple DCs exist on that machine? If there are multiple DCs, special setup may be required. For example, if a firewall exists, each DC needs its own set of RMI ports defined in the system-generated datacollector.properties file (probe.rmi.port, probe.controller.rmi.port). Alternatively, you can specify a range for each of these properties in the datacollector.properties file and then recycle the DC AppServer.

    • Check for proxy hosts in WebSphere

      Check the WebSphere server.xml file for proxy host settings similar to the following:

      <systemProperties xmi:id="Property_1216843378838"
      name="http.nonProxyHosts"
      value="10.64.181.151|10.64.181.152|10.64.181.153|10.64.181.156|10.64.181.157|10.64.181.158"/>

      • http.proxyHost - sets a proxy host in WebSphere
      • http.nonProxyHosts - indicates the hosts which should be connected to directly and not through the proxy server.
      • value - the list of hosts that should be connected to directly and not through the proxy server.
      If the DC is not connecting to the MS and there is a proxy server defined in the DC WebSphere server.xml, make sure that the MS hostname or IP address is included in the value list for name="http.nonProxyHosts".
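      The check in the previous paragraph can be sketched as follows. The Java http.nonProxyHosts value is a list of patterns separated by "|", and a pattern may start or end with a "*" wildcard. The Python helpers below are hypothetical and deliberately simplified: they handle only a single leading or trailing wildcard and compare names without regard to case.

```python
def matches(host, pattern):
    """Compare one host against one nonProxyHosts pattern (simplified)."""
    host, pattern = host.lower(), pattern.lower()
    if pattern.startswith("*"):
        return host.endswith(pattern[1:])
    if pattern.endswith("*"):
        return host.startswith(pattern[:-1])
    return host == pattern


def bypasses_proxy(host, non_proxy_hosts):
    """True if host appears in a '|' separated http.nonProxyHosts value."""
    patterns = [p.strip() for p in non_proxy_hosts.split("|") if p.strip()]
    return any(matches(host, pattern) for pattern in patterns)


# Value copied from the server.xml example above.
VALUE = ("10.64.181.151|10.64.181.152|10.64.181.153|"
         "10.64.181.156|10.64.181.157|10.64.181.158")
print(bypasses_proxy("10.64.181.152", VALUE))  # prints True
print(bypasses_proxy("10.64.181.154", VALUE))  # prints False
```

      If the MS is listed by hostname in one place and by IP address in another, test both forms, because the comparison is purely textual.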
    • Make sure the setting to communicate with the MS is set correctly

      Open the <DC_HOME>/runtime/<was.node.appsvr>/dc.java.properties file and make sure that it contains one of the following settings:

      dc.operation.mode=ms,wr
      dc.operation.mode=ms

      The dc.operation.mode value must include ms if the DC is to communicate with the MS. The wr value is needed for the DC to connect to the ITM environment.
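      A quick way to confirm the setting is to read the property and test whether ms is one of its comma-separated values. The Python sketch below is an illustration only; the helper names are hypothetical, and the parser handles just simple key=value lines, not every feature of the Java properties format.

```python
def read_property(properties_text, key):
    """Return the last value assigned to key in simple key=value text, or None."""
    value = None
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        name, separator, rest = line.partition("=")
        if separator and name.strip() == key:
            value = rest.strip()
    return value


def talks_to_ms(properties_text):
    """True if dc.operation.mode is set and includes the ms value."""
    mode = read_property(properties_text, "dc.operation.mode")
    if mode is None:
        return False
    return "ms" in [part.strip() for part in mode.split(",")]


print(talks_to_ms("dc.operation.mode=ms,wr"))  # prints True
print(talks_to_ms("dc.operation.mode=wr"))     # prints False
```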

    • Check Permissions

      If different user IDs are used to install the DCs, permissions on certain directories (for example, /tmp/ibm_am_installed_dc) may affect installations after the first DC. Permission errors are not always obvious, so search for "Denied" in the logs.
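      The log search can be automated when there are many log files. The following Python sketch walks a directory tree and reports each line that contains "denied" in any letter case. The function name and the assumption that log files end in .log are illustrative; adjust the pattern to match your installation.

```python
from pathlib import Path


def find_denied(log_dir, needle="denied"):
    """Return (path, line number, line) for each log line containing needle."""
    hits = []
    for path in sorted(Path(log_dir).rglob("*.log")):
        if not path.is_file():
            continue
        with path.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line.lower():
                    hits.append((str(path), number, line.rstrip()))
    return hits
```

      Each hit gives the file and line number, which makes it easier to see which directory or file the permission problem refers to.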

    • Verify that only one Profiler exists if Java 1.4.2 or 1.3 is used

      In the DC server.xml, verify that the Generic JVM Arguments contain no -Xrun arguments other than -Xrunam*. This is a Java limitation prior to Java 1.5, and ITCAM cannot run if another -Xrun argument is present.
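      Once you have the Generic JVM Arguments string, the check for extra -Xrun arguments is a simple filter. The Python sketch below splits the string on whitespace and lists every -Xrun argument that does not start with -Xrunam. The function name and the sample argument strings are hypothetical, and quoted arguments that contain spaces are not handled.

```python
def foreign_xrun_arguments(generic_jvm_arguments):
    """Return the -Xrun arguments that are not -Xrunam* arguments."""
    return [
        token
        for token in generic_jvm_arguments.split()
        if token.startswith("-Xrun") and not token.startswith("-Xrunam")
    ]


# Hypothetical argument string with a second profiler configured.
ARGS = "-Xrunam_example -Xrunhprof:cpu=samples -Xmx512m"
print(foreign_xrun_arguments(ARGS))  # prints ['-Xrunhprof:cpu=samples']
```

      An empty list means that no other -Xrun argument was found in the string that was checked.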

    • Verify that xhealthcenter is not installed in the AppServer

      ITCAM cannot run if xhealthcenter is installed. If you see xhealthcenter in the server.xml where the DC is installed, xhealthcenter will have to be removed. You will also see an error similar to the following in the WebSphere SystemErr.log when xhealthcenter is installed:

      <PPECONTROLLER, c1cd50e4-1757-e001-2e4d-afc9e243e427.620, 10.48.62.11>
      Unable to join Kernel 10.48.32.85:9120 [12/7/11 13:09:02:884 EST] 00000164 SystemErr
      java.rmi.ServerException: RemoteException occurred in server thread;
      nested exception is: java.rmi.UnmarshalException: error unmarshalling arguments;
      nested exception is: java.lang.ClassNotFoundException: com.cyanea.probe.ControllerAVMData

    • Make sure the Managing Server (MS) is up and running

      Go to the MS and execute "klctl.sh status" and make sure all of the components are running.

    • Verify MS Home Directory

      On the DC side, verify that the MS home directory is correct in the WebSphere variables.xml file for that AppServer. The home directory may also be defined via the java.rmi.server.codebase setting in the <DC_HOME>/runtime/<wasver.node.svr>/<wasver.node.svr>.datacollector.properties file.

    • Check for RMI Connection error in kernel log (kl*.log)

      If an error message in the MS kernel logs indicates that an RMI connection was refused, make sure that the correct hostname or IP address is listed for the connection. See the "Check Network Interface Cards (NICs)" section above for setup considerations.

    • No Reconnection by DC with Kernel

      If the customer has other applications that use RMI, the default RMI socket timeout of -1 (infinite) may cause a problem. In this case, set am.rmisocket.timeout=180000 in datacollector.properties.

      Here is an explanation:
      The DC uses an RMI mechanism to join the MS kernel. By default, the exportObject(Remote obj, int port) method of the java.rmi.server.UnicastRemoteObject class is used to export ProbeController so that it can receive incoming calls from the kernel. In this case the timeout is set to -1 (infinite).

      When the customer's applications also use RMI, the DC RMI connection is reset after those applications start, but because of the infinite timeout the DC does not reconnect to the kernel.

    • Once everything is verified, make sure DC is connecting

      In the DC AppServer SystemOut.log, you should see a PPECONTROLLER message stating that the DC has joined the kernel if the DC connects to the MS. If the DC is configured via the MS Application Monitor and MS has communicated back to the DC, you will see PPEPROBE messages in the log as well.
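      To confirm the result quickly, you can count the lines in SystemOut.log that carry each message prefix. The Python sketch below only looks for the PPECONTROLLER and PPEPROBE tokens named above; it does not parse the message text, because the exact wording is not given here. The function name and the sample log lines are hypothetical.

```python
MARKERS = ("PPECONTROLLER", "PPEPROBE")


def count_markers(log_text, markers=MARKERS):
    """Count the log lines that contain each marker token."""
    counts = {marker: 0 for marker in markers}
    for line in log_text.splitlines():
        for marker in markers:
            if marker in line:
                counts[marker] += 1
    return counts


# Hypothetical log excerpt, for illustration only.
SAMPLE_LOG = "\n".join([
    "<PPECONTROLLER, id, host> example controller message",
    "unrelated application line",
    "<PPEPROBE, id, host> example probe message",
    "<PPEPROBE, id, host> another example probe message",
])
print(count_markers(SAMPLE_LOG))  # prints {'PPECONTROLLER': 1, 'PPEPROBE': 2}
```

      A count of zero for PPECONTROLLER suggests that the DC never attempted or completed the join; read the matching lines themselves to see whether they report success or an error.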

    Cross reference information
    Segment | Product | Component | Platform | Version | Edition
    Systems and Asset Management | Tivoli Composite Application Manager for Application Diagnostics | | AIX, HP-UX, Linux, Solaris, Windows | 7.1 |
    Systems and Asset Management | Tivoli Composite Application Manager for Application Diagnostics on z/OS | | z/OS | 6.1 |

    Product Alias/Synonym

    ITCAM4WS
    ITCAMfWS
    ITCAM4WAS
    ITCAMfWAS

    Document information


    More support for:

    Tivoli Composite Application Manager for WebSphere

    Software version:

    All Versions

    Operating system(s):

    AIX, HP-UX, Linux, Solaris, Windows, z/OS

    Reference #:

    1267468

    Modified date:

    2011-01-10
