IBM Support

IC72754: DEVICE SERVER CRASHES WITH OUTOFMEMORYERROR EXCEPTION.

APAR status

  • Closed as program error.

Error description

  • This defect is due to an overload of the logger files,
    specifically the ones that log communications with the CIMOM.
    
    In device manager logs:
    2010.09.19 14:25:51.940-05:00 java.lang.OutOfMemoryError
     at java.lang.StringBuffer.ensureCapacityImpl(StringBuffer.java:386)
     at java.lang.StringBuffer.append(StringBuffer.java:234)
     at com.ibm.tpc.disk.common.util.DiskMessageHelper.handleArray(DiskMessageHelper.java:526)
     at com.ibm.tpc.disk.common.util.DiskMessageHelper.toMsgString(DiskMessageHelper.java:436)
     at com.ibm.tpc.disk.common.log.LogTraceHelper.exit(LogTraceHelper.java:388)
     at com.ibm.tpc.disk.common.CIMOMManager.getCIMOMs(CIMOMManager.java:537)
     at com.ibm.tpc.discovery.ProbeFabricAgents.cleanUpCorruptFabricCIMScanners(Discover.java:3546)
     at com.ibm.tpc.discovery.ProbeFabricAgents.process(Discover.java:1412)
     at com.ibm.tpc.Router.perform(Router.java:1050)
     at com.ibm.tpc.Router.perform(Router.java:586)
     at com.ibm.tpc.fabric.FabricProbeAgentAssigner.makeAssignmentsAndRunProbes(FabricProbeAgentAssigner.java:4366)
     at com.ibm.tpc.discovery.ProbeFabric.process(Discover.java:924)
     at com.ibm.tpc.Router.perform(Router.java:1050)
     at com.ibm.tpc.Router.perform(Router.java:586)
     at com.ibm.tpc.discovery.ProcessProcessor.process(ProcessProcessor.java:69)
     at com.ibm.tpc.infrastructure.threads.TPCThread.run(TPCThread.java:258)   com.ibm.tpc.fabric.FabricProbeAgentAssigner makeAssignmentsAndRunProbes
    2010.09.19 14:26:11.915-05:00 java.lang.OutOfMemoryError
     at java.lang.String.concat(String.java:486)
     at com.ibm.tpc.discovery.snmp.scanner.SNMPScanner.runSnmpQuery(SNMPScanner.java:1641)
     at com.ibm.tpc.discovery.snmp.scanner.CiscoMDSScanner.runProcess(CiscoMDSScanner.java:2778)
     at com.ibm.tpc.discovery.snmp.scanner.CiscoMDSScanner.invoke(CiscoMDSScanner.java:624)
     at com.ibm.tpc.discovery.tsanm.OutbandScanner.process(OutbandScanner.java:273)
     at com.ibm.tpc.infrastructure.threads.TPCThread.run(TPCThread.java:258)   com.ibm.tpc.discovery.tsanm.OutbandScanner process
    2010.09.19 14:47:09.890-05:00 java.lang.OutOfMemoryError
    
    Javacores are generated under
    \device\apps\was\profiles\deviceServer
    
    NULL           ------------------------------------------------------------------------
    0SECTION       TITLE subcomponent dump routine
    NULL           ===============================
    1TISIGINFO     Dump Event "systhrow" (00040000) Detail "java/lang/OutOfMemoryError" received
    1TIDATETIME    Date:                 2010/09/19 at 14:17:18
    1TIFILENAME    Javacore filename:    /opt/IBM/app/TPC/device/apps/was/profiles/deviceServer/javacore.20100919.141659.340014.0073.txt
    NULL           ------------------------------------------------------------------------
    0SECTION       GPINFO subcomponent dump routine
    NULL           ================================
    2XHOSLEVEL     OS Level         : AIX 6.1
    2XHCPUS        Processors -
    3XHCPUARCH       Architecture   : ppc
    3XHNUMCPUS       How Many       : 4
    NULL
    1XHERROR2      Register dump section only produced for SIGSEGV, SIGILL or SIGFPE
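
To check whether a given javacore was produced by this kind of failure, the header tags shown in the excerpt above (1TISIGINFO and 1TIDATETIME) can be read programmatically. The following is a minimal sketch only. The class JavacoreSummary and its methods are hypothetical names introduced for illustration and are not part of TPC or of any IBM tool. It assumes that in the actual javacore file each tagged record sits on a single line (the excerpt in this APAR is wrapped for display).

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of TPC or IBM tooling.
final class JavacoreSummary {
    // Example record: 1TISIGINFO     Dump Event "systhrow" (00040000) Detail "java/lang/OutOfMemoryError" received
    private static final Pattern SIGINFO = Pattern.compile(
        "^1TISIGINFO\\s+Dump Event \"([^\"]+)\" \\(([0-9A-Fa-f]+)\\) Detail\\s+\"([^\"]+)\"");
    // Example record: 1TIDATETIME    Date:                 2010/09/19 at 14:17:18
    private static final Pattern DATETIME = Pattern.compile(
        "^1TIDATETIME\\s+Date:\\s+(\\d{4}/\\d{2}/\\d{2}) at (\\d{2}:\\d{2}:\\d{2})");

    final String event;      // e.g. systhrow
    final String detail;     // e.g. java/lang/OutOfMemoryError
    final String timestamp;  // e.g. 2010/09/19 14:17:18, or null if not found

    private JavacoreSummary(String event, String detail, String timestamp) {
        this.event = event;
        this.detail = detail;
        this.timestamp = timestamp;
    }

    boolean isOutOfMemory() {
        return "java/lang/OutOfMemoryError".equals(detail);
    }

    // Returns a summary if a 1TISIGINFO record is present, otherwise empty.
    static Optional<JavacoreSummary> parse(List<String> lines) {
        String event = null, detail = null, timestamp = null;
        for (String raw : lines) {
            String line = raw.trim(); // tolerate indented copies of the file
            Matcher sig = SIGINFO.matcher(line);
            if (event == null && sig.find()) {
                event = sig.group(1);
                detail = sig.group(3);
                continue;
            }
            Matcher dt = DATETIME.matcher(line);
            if (timestamp == null && dt.find()) {
                timestamp = dt.group(1) + " " + dt.group(2);
            }
        }
        return event == null
            ? Optional.empty()
            : Optional.of(new JavacoreSummary(event, detail, timestamp));
    }
}
```

With the lines of a javacore file loaded into a list (for example through java.nio.file.Files.readAllLines), JavacoreSummary.parse(lines) returns the dump event and its timestamp, and isOutOfMemory() reports whether the dump was triggered by java/lang/OutOfMemoryError.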
    

Local fix

  • Restart the Device server.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED: TPC 4.2 and 4.2.1 users that have a TPC      *
    *                 environment with a lot of logging            *
    *                 (many probes, discoveries especially         *
    *                 on the fabric area), that get Out of         *
    *                 memory errors on the device server           *
    *                 side.                                        *
    ****************************************************************
    * PROBLEM DESCRIPTION: When running a lot of discoveries,      *
    *                      probes (having a large environment      *
    *                      with many CIMOMs) scheduled to run      *
    *                      once a day, the trace logging gets      *
    *                      very large, with millions of logging    *
    *                      objects.  The heap dump analysis shows  *
    *                      the class LogTraceHelperFactory         *
    *                      occupying 78% of the total memory       *
    *                      (with LinkedList instances). The        *
    *                      device server crashes with out of       *
    *                      memory. A quick workaround is to        *
    *                      restart the device server and to try    *
    *                      to minimize the logging (running        *
    *                      scheduled necessary discovery and       *
    *                      probes once per day, setting the        *
    *                      debug level to min).                    *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    -
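
The problem description above attributes the heap growth to trace logging objects held in LinkedList instances. The APAR does not describe how the fix was implemented, so the following is only a generic sketch of the failure pattern and of one common way to cap it. BoundedTraceBuffer is a hypothetical class written for illustration and is not the TPC LogTraceHelperFactory code. An in-memory collection that only ever receives entries grows with the number of probes and discoveries, while a buffer with a fixed capacity discards the oldest entry once it is full.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration; not the TPC LogTraceHelperFactory implementation.
final class BoundedTraceBuffer {
    private final Deque<String> entries = new ArrayDeque<>();
    private final int capacity;
    private long dropped; // number of old entries discarded so far

    BoundedTraceBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Adds an entry; when the buffer is full, the oldest entry is evicted
    // first so that memory use stays proportional to the capacity.
    synchronized void add(String entry) {
        if (entries.size() == capacity) {
            entries.removeFirst();
            dropped++;
        }
        entries.addLast(entry);
    }

    synchronized int size() {
        return entries.size();
    }

    synchronized long dropped() {
        return dropped;
    }

    synchronized String oldest() {
        return entries.peekFirst();
    }

    public static void main(String[] args) {
        // An unbounded list would retain every one of these entries.
        BoundedTraceBuffer buffer = new BoundedTraceBuffer(1000);
        for (int i = 0; i < 1_000_000; i++) {
            buffer.add("trace entry " + i);
        }
        System.out.println(buffer.size() + " retained, " + buffer.dropped() + " dropped");
    }
}
```

The trade-off in this sketch is that older trace entries are lost once the cap is reached, which is the same effect the workaround aims for by reducing the debug level and the frequency of probes.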
    

Problem conclusion

  • The customer can contact support to tune the TPC configuration
    and restart the Device server as a workaround, or upgrade to a
    4.2.1 level that includes this fix.  The fix for this APAR is
    targeted for the following maintenance packages:
    
    | fix pack | 4.1.1-TIV-TPC-FP0006 - target April 2011
    | fix pack | 4.2.1-TIV-TPC-FP0002 - target February 2011
    
    http://www-01.ibm.com/support/docview.wss?&uid=swg21320822
    
    The target dates for future fix packs do not represent a formal
    commitment by IBM. The dates are subject to change without
    notice.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IC72754

  • Reported component name

    MULTIPLE DEVICE

  • Reported component ID

    5648HWN01

  • Reported release

    420

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2010-11-23

  • Closed date

    2010-12-23

  • Last modified date

    2010-12-23

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    MULTIPLE DEVICE

  • Fixed component ID

    5648HWN01

Applicable component levels

  • R411 PSY

       UP


Document Information

Modified date:
16 September 2021