IOB002I message written by the IOBSNMP subagent

Technote (troubleshooting)


Problem (Abstract)

The IOBSNMP SNMP subagent writes one of the following two IOB002I messages to the IOBSNMP JES joblog:

IOB002I <date> <time> Could not obtain handle from agent. Retrying
IOB002I <date> <time> Could not obtain handle from agent. Exiting

This may occur upon starting IOBSNMP, or after IOBSNMP has been running successfully for some time. Additionally, this message may be accompanied by the following message:

IOB032I <date> <time> Disconnecting from SNMP Agent w/ RC -5
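When reviewing a saved copy of the joblog, it can help to pull out just these messages. The following is a minimal Python sketch, not part of IOBSNMP. The function name scan_joblog is hypothetical, and the date and time fields are matched loosely because their exact format is not given here.

```python
import re

# Message shapes are taken from the text above. The <date> <time> fields are
# matched loosely (assumption: their exact format is not specified here).
IOB002I = re.compile(
    r"IOB002I\s+.*?\s*Could not obtain handle from agent\.\s+(Retrying|Exiting)"
)
IOB032I = re.compile(
    r"IOB032I\s+.*?\s*Disconnecting from SNMP Agent w/ RC\s+(-?\d+)"
)


def scan_joblog(lines):
    """Return (message_id, detail) tuples for each IOB002I / IOB032I line.

    detail is "Retrying" or "Exiting" for IOB002I, and the integer
    return code for IOB032I.
    """
    found = []
    for line in lines:
        match = IOB002I.search(line)
        if match:
            found.append(("IOB002I", match.group(1)))
            continue
        match = IOB032I.search(line)
        if match:
            found.append(("IOB032I", int(match.group(1))))
    return found
```

The position of the first IOB002I entry relative to the start of the joblog gives a quick indication of whether the failure happened at startup or later.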

Cause

This message is issued for two different reasons:

  1. IOBSNMP cannot connect to the SNMP agent. This is often the result of a configuration problem. For example:
    • Attempting to connect to the SNMP agent using the wrong port
    • Attempting to connect to the SNMP agent using the wrong community name
    • In a CINET environment, associating IOBSNMP with the wrong TCP/IP stack
    • The permission bits on the DPI UNIX pathname used for the connection are not set to read and write. The default DPI UNIX pathname is /tmp/dpi_socket.

      When a configuration problem such as one of these is the cause, the error typically occurs when attempting to start the IOBSNMP subagent.
  2. IOBSNMP is disconnected from the SNMP agent. This is often the result of a timeout on the DPI connection between IOBSNMP and the SNMP agent. When this is the cause, the IOBSNMP subagent typically initializes successfully, runs for some period of time, and later is disconnected.

Resolving the problem

Because the problems discussed above have several possible causes, the best way to troubleshoot them is to work through the following checklist:

1. Determine if the problem is occurring at startup, or if IOBSNMP runs successfully for some time before the message is issued. If the problem occurs at startup, proceed to checklist item #2. Otherwise, skip ahead to checklist item #8.

2. Ensure that the SNMP agent is active.

3. Verify that the community name being used by IOBSNMP is consistent with the SNMP agent's community name configuration. To do this, first determine what community names and IP addresses are configured for the SNMP agent. If the agent being used is the one shipped with the z/OS Communications Server, check the agent's PW.SRC file (or the SNMPD.CONF file) for the community name information. See the PW.SRC and SNMPD.CONF descriptions in the SNMP chapter of the IP Configuration Reference for additional information. Note that if an SNMPD.CONF file exists, any PW.SRC file configuration will not be used.

Once the correct community name being used by the SNMP agent has been determined, check the IOBSNMP subagent to determine what community name it is using to connect to the SNMP agent. This will be configured in the JCL used to start IOBSNMP by the '-c' parameter:


    //IOBSNMP PROC P=' -c public -p 161 -s TCPIP -d 0'
    //IOBSNMP EXEC PGM=IOBSNMP,TIME=1440,REGION=4096K,
    // DYNAMNBR=5,PARM='&P.'
    //*
    //SYSPRINT DD SYSOUT=*
    //SYSUDUMP DD SYSOUT=*

If no '-c' parameter exists, the default of 'public' is used. The community name is case sensitive, and any inconsistency between the community names used by IOBSNMP and the SNMP agent will cause an error.
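The parameter handling described above can be sketched in Python as a quick offline check of a PARM string. This is an illustration only, not how IOBSNMP itself parses its parameters. The names parse_parm and community_matches are hypothetical, and only the defaults stated in this document ('public' and 161) are assumed.

```python
# Defaults stated in this document: community 'public', port 161.
# No defaults are stated for -s or -d, so they stay None when absent.
DEFAULTS = {"-c": "public", "-p": "161", "-s": None, "-d": None}


def parse_parm(parm):
    """Split a PARM string such as ' -c public -p 161 -s TCPIP -d 0'
    into a dict keyed by option flag, filling in the stated defaults."""
    tokens = parm.split()
    options = dict(DEFAULTS)
    i = 0
    while i < len(tokens):
        if tokens[i] in options and i + 1 < len(tokens):
            options[tokens[i]] = tokens[i + 1]
            i += 2
        else:
            i += 1
    return options


def community_matches(parm, agent_community):
    """Case-sensitive comparison, since community names are case sensitive."""
    return parse_parm(parm)["-c"] == agent_community
```

For example, community_matches("-c Public -p 161", "public") returns False because the comparison is case sensitive.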

If you are using the SNMP agent shipped with the z/OS Communications Server, you can verify that an incorrect community name is being used by activating an SNMP agent trace at level 72. The trace will write a message indicating that a failure occurred with return codes -14 and SNMP_RC_NOT_AUTHENTICATED. This indicates that an SNMP subagent or manager has attempted to access the SNMP agent with a bad community name or from the wrong IP address.

4. If the community name is correct, ensure that the SNMP agent is listening on the port specified by the '-p' parameter in IOBSNMP's started procedure.


    //IOBSNMP PROC P=' -c public -p 161 -s TCPIP -d 0'
    //IOBSNMP EXEC PGM=IOBSNMP,TIME=1440,REGION=4096K,
    // DYNAMNBR=5,PARM='&P.'
    //*
    //SYSPRINT DD SYSOUT=*
    //SYSUDUMP DD SYSOUT=*

If no '-p' parameter is specified in IOBSNMP's started procedure, it will use 161 as the default. Verify which port the SNMP agent is listening on by issuing the netstat command:



    netstat
    MVS TCP/IP NETSTAT CS V1R7

     User Id  Conn     Local Socket  Foreign Socket  State  
     -------  ----     ------------  --------------  -----
     OSNMPD   0000009A 0.0.0.0..161  *..*            UDP

If the port numbers do not match, alter the '-p' parameter for IOBSNMP to cause IOBSNMP to use the correct port.
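If the netstat output has been captured to a file, the port comparison can be automated. The Python sketch below assumes the column layout shown in the sample output above (job name first, local socket third, address and port separated by two periods). The function name listening_ports is hypothetical.

```python
def listening_ports(netstat_output, jobname):
    """Return the set of local ports listed for jobname.

    Assumes the layout of the sample output above: the first field is the
    user ID (job name) and the third field is the local socket in the
    form address..port.
    """
    ports = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[0] == jobname and ".." in fields[2]:
            port = fields[2].rsplit("..", 1)[1]
            if port.isdigit():
                ports.add(int(port))
    return ports
```

The result can then be compared with the '-p' value (or the default of 161) used by IOBSNMP.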

5. If running in a CINET (multiple stack) environment, ensure that the stack name specified by the '-s' parameter in IOBSNMP's started procedure is the same stack with which the SNMP agent is associated.


    //IOBSNMP PROC P='-c public -p 161 -s TCPIP -d 0'
    //IOBSNMP EXEC PGM=IOBSNMP,TIME=1440,REGION=4096K,
    // DYNAMNBR=5,PARM='&P.'
    //*
    //SYSPRINT DD SYSOUT=*
    //SYSUDUMP DD SYSOUT=*


6. The IOBSNMP subagent uses the DPI API to connect to the SNMP Agent by way of a UNIX connection. The default UNIX pathname used for the connection is /tmp/dpi_socket. Ensure that the permission bits for the pathname are set to read and write.
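One way to inspect the permission bits is sketched below in Python. This document does not say which permission classes (owner, group, other) need read and write access, so the sketch simply reports all three. The function name rw_report is hypothetical.

```python
import os
import stat


def rw_report(path):
    """Report, per permission class, whether both read and write are set.

    Which classes IOBSNMP actually needs is not stated in this document,
    so all three are reported and the decision is left to the reader.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return {
        "owner": bool(mode & stat.S_IRUSR) and bool(mode & stat.S_IWUSR),
        "group": bool(mode & stat.S_IRGRP) and bool(mode & stat.S_IWGRP),
        "other": bool(mode & stat.S_IROTH) and bool(mode & stat.S_IWOTH),
    }
```

Calling rw_report("/tmp/dpi_socket") on the system in question shows which classes currently have both bits set.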


7. If the problem continues to occur despite verifying that the correct configuration is being used, use the '-d 3' option to obtain the IOBSNMP trace and contact OSA level 2 support. This option is specified in the procedure used to start IOBSNMP.


    //IOBSNMP PROC P='-c public -p 161 -s TCPIP -d 3'
    //IOBSNMP EXEC PGM=IOBSNMP,TIME=1440,REGION=4096K,
    // DYNAMNBR=5,PARM='&P.'
    //*
    //SYSPRINT DD SYSOUT=*
    //SYSUDUMP DD SYSOUT=*

The output of this trace will be sent to IOBSNMP's joblog.

If you are using the SNMP agent shipped with the z/OS Communications Server, it is beneficial to activate a level 72 trace using the following MODIFY command:


    MODIFY OSNMPD,TRACE,LEVEL=72

where OSNMPD is the jobname for your SNMP agent. The output from this trace is written to the SYSLOG Daemon.

8. If the IOB002I message is not issued at startup, then the IOBSNMP subagent is most likely being disconnected from the SNMP agent due to a timeout condition. Ideally, it would be best to obtain an IOBSNMP level 3 trace and, if you are using the z/OS Communications Server's SNMP agent, an SNMP agent level 72 trace. Information on these two traces can be found in checklist item #7, above. The SNMP agent trace will show the following messages:


    .. select() timed out
    cDPIpacket: Major=2, Version=2, Release=0, Id=3541, Type=SNMP_DPI_CLOSE
    cDPIclose: close reason_code=7 (timeout)
    IDSTMOE.I0493373.SOURCE.S@AGV123(2638):
    Sent Subagent packet on fd 8
    # Dump of 9 byte outgoing Subagent packet, count 9703
    00 07 02 02 00 0d d5 09 07
    EZZ6229I Closing DPI UNIX socket connection, fd=8
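If a trace has been captured, the close reason can be searched for rather than read by eye. The Python sketch below matches only the cDPIclose line format shown in the sample above. The function names are hypothetical, and other reason codes or wordings are not covered here.

```python
import re

# Matches lines of the form shown above:
#   cDPIclose: close reason_code=7 (timeout)
CLOSE_RE = re.compile(r"cDPIclose: close reason_code=(\d+) \(([^)]*)\)")


def dpi_close_reasons(trace_text):
    """Return (reason_code, reason_text) for every cDPIclose line found."""
    return [(int(code), text) for code, text in CLOSE_RE.findall(trace_text)]


def closed_by_timeout(trace_text):
    """True if any DPI close in the trace was reported as a timeout."""
    return any(text == "timeout" for _, text in dpi_close_reasons(trace_text))
```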

However, given the unpredictable nature of this type of problem, tracing is often difficult. As such, steps #9 and #10 below provide additional suggestions which you can use in an attempt to resolve the problem without tracing.

9. First, try increasing the timeout value which the SNMP agent uses for the IOBSNMP subagent. Whenever an SNMP subagent connects to an SNMP agent, it specifies timeout values the agent should use when sending queries for values supported by that subagent. By default, some of the values specified by IOBSNMP are three seconds. Depending on what values are being queried, IOBSNMP may require more than three seconds to respond, causing the SNMP agent to time out the connection with the subagent.

The method used to increase the timeout value used by the SNMP agent will vary depending on which SNMP product you use. If you are using z/OS Communications Server's SNMP agent, use an SNMP manager to issue a SET command to increase the value of the following objects:


    saTtimeout.1.3.6.1.2.1.10.7.2.0
    saTtimeout.1.3.6.1.4.1.2.6.188.1.1.0
    saTtimeout.1.3.6.1.4.1.2.6.188.1.3.0
    saTtimeout.1.3.6.1.4.1.2.6.188.1.4.0
    saTtimeout.1.3.6.1.4.1.2.6.188.1.8.0

An example of doing this using the z/OS UNIX snmp command line interface (also shipped with the z/OS Communications Server) follows. Note that <commname> should be replaced with the appropriate community name set up for your system. In this example, the timeout values are increased to ten seconds.

    osnmp -c <commname> set saTtimeout.1.3.6.1.2.1.10.7.2.0 10
    osnmp -c <commname> set saTtimeout.1.3.6.1.4.1.2.6.188.1.1.0 10
    osnmp -c <commname> set saTtimeout.1.3.6.1.4.1.2.6.188.1.3.0 10
    osnmp -c <commname> set saTtimeout.1.3.6.1.4.1.2.6.188.1.4.0 10
    osnmp -c <commname> set saTtimeout.1.3.6.1.4.1.2.6.188.1.8.0 10

You can also display the current timeout values for the different MIB subtrees by issuing the following command on the osnmp command line interface:

    osnmp -v -c <commname> walk saTtimeout

You must make this change every time the IOBSNMP subagent is restarted. This includes any time that IOBSNMP is disconnected from the agent, as the timeout value will be reset to the default when the subagent reconnects.
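Because the change has to be reapplied after every restart or reconnect, it may be convenient to generate the commands from a single list. The Python sketch below only builds the command strings shown in the example above and does not run them. The names SUBTREE_OIDS and build_set_commands are hypothetical.

```python
# The five object instances listed in this document.
SUBTREE_OIDS = [
    "1.3.6.1.2.1.10.7.2.0",
    "1.3.6.1.4.1.2.6.188.1.1.0",
    "1.3.6.1.4.1.2.6.188.1.3.0",
    "1.3.6.1.4.1.2.6.188.1.4.0",
    "1.3.6.1.4.1.2.6.188.1.8.0",
]


def build_set_commands(community, seconds):
    """Build one osnmp set command per subtree, in the form used above."""
    return [
        f"osnmp -c {community} set saTtimeout.{oid} {seconds}"
        for oid in SUBTREE_OIDS
    ]


if __name__ == "__main__":
    # Replace <commname> with the community name set up for your system.
    for command in build_set_commands("<commname>", 10):
        print(command)
```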

10. Another possibility, particularly if the IOB002I message seems to be issued during periods of increased system workload, is that the IOBSNMP subagent is not getting enough CPU time. As a result, when the SNMP agent sends a query to IOBSNMP, IOBSNMP cannot obtain enough CPU time to fulfill the query prior to exceeding the timeout value.

To prevent this, increase IOBSNMP's dispatching priority. This increases the likelihood that IOBSNMP will be dispatched more often, and therefore helps it obtain enough CPU time to fulfill the SNMP agent's queries. However, this still may not prevent the error on systems where heavy CPU workloads are experienced.

11. If none of these suggestions provides a resolution, attempt to gather the IOBSNMP trace (level 3) and SNMP agent trace (level 72) described in step #7, and contact OSA level 2 support.


Document information


More support for:

z/OS Communications Server
All

Software version:

1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 1.10, 1.11, 1.12, 1.13, 2.1

Operating system(s):

z/OS

Reference #:

1227463

Modified date:

2006-01-09
