IBM Support

IV55838: VMWARE VI AGENT WORKSPACE HANGS AND STOPS SHOWING DATA.


APAR status

  • Closed as program error.

Error description

  • The VMware VI Agent and the subnodes representing ESX servers
    remain online, but updates to the set of ESX servers, situation
    status, and TEP displays stop.
    
    
    ENVIRONMENT:
    
    IBM Tivoli Monitoring for Virtual Environments: VMware VI Agent
    7.2 FixPack 2 InterimFix 1.
    VMware VI agents that manage large vCenter environments with
    short situation evaluation intervals, history collection
    intervals, and/or subnode update intervals.
    
    The problem was observed on Windows platforms, but it also
    applies to other platforms.
    
    This APAR affects the factory agent code:
    
    1) The agent runs a number of threads. A set of these threads
    deadlock among themselves. Once this happens, the agent still
    writes log entries, but its main operations stop: it does not
    handle requests from the TEMA, it does not properly process
    events from the provider, and it does not discover new ESX
    servers or delete old ones. Enough of the main threads are
    locked that the agent no longer does any useful work.
    
    2) Debugging this issue can be difficult. Set the following
    trace options in the agent environment file:
    On Windows: $CANDLEHOME\tmaitm6\KVMENV_<instance-name>
    On Linux: $CANDLEHOME/config/vm_<instance-name>.config
    KBB_RAS1=ERROR (UNIT:kpx ST ERR) (UNIT:kraafira ALL) (UNIT:kraatblm ALL) (UNIT:genericagent ST) (UNIT:genericagent FLOW) (UNIT:subnode FLOW) (UNIT:custom FLOW)
    COUNT=20
    LIMIT=10
    These settings require about 20*10=200 MB of disk space (up to
    20 trace files of 10 MB each). Then wait for the hang to occur.
    
    The trace shows the lockup at a call to xxxxx.acquireLock():
    the Entry trace is logged, but no matching Exit trace follows.
    Search for a string such as the following:
    
    (534394A4.018D-9A8:subnodemanager.cpp,159,"SubnodeManager::acquireLock") Entry
    
    If no corresponding Exit trace appears after this entry, the
    agent is in the hang state.
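
The search described above can be automated. The following is a minimal sketch, not an IBM-supplied tool. It assumes that each trace line has the layout of the sample line (hex timestamp, sequence number, thread ID, source file, line number, quoted function name, then the message) and that the matching exit line carries a message beginning with "Exit". The function name find_unmatched_entries and the regular expression are illustrative assumptions.

```python
import re
from collections import defaultdict

# Assumed line layout, inferred from the sample line in this APAR:
# (<hex time>.<hex seq>-<thread id>:<file>,<line>,"<function>") <message>
TRACE_RE = re.compile(
    r'^\((?P<time>[0-9A-Fa-f]+)\.(?P<seq>[0-9A-Fa-f]+)-(?P<thread>[0-9A-Fa-f]+):'
    r'(?P<file>[^,]+),(?P<line>\d+),"(?P<func>[^"]+)"\)\s*(?P<msg>.*)$'
)


def find_unmatched_entries(lines, func_suffix="acquireLock"):
    """Return {(thread, function): [entry lines]} for Entry traces with no Exit.

    Only functions whose name ends with func_suffix are considered.
    The "Exit" message prefix is an assumption about the trace format.
    """
    pending = defaultdict(list)  # (thread, function) -> stack of Entry lines
    for raw in lines:
        text = raw.strip()
        match = TRACE_RE.match(text)
        if not match or not match.group("func").endswith(func_suffix):
            continue  # not a trace line for the function of interest
        key = (match.group("thread"), match.group("func"))
        message = match.group("msg")
        if message.startswith("Entry"):
            pending[key].append(text)
        elif message.startswith("Exit") and pending[key]:
            pending[key].pop()  # pair this Exit with the latest Entry
    # Keep only threads that still have an Entry waiting for its Exit.
    return {key: entries for key, entries in pending.items() if entries}
```

Passing an open trace file (or any iterable of lines) to find_unmatched_entries returns the threads that entered acquireLock and never left it. An entry near the end of a trace file that is still being written may simply not have exited yet, so a result is only meaningful once the agent has stopped updating.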
    

Local fix

Problem summary

  • The VMware VI Agent and the subnodes representing ESX servers
    remain online, but updates to the set of ESX servers, situation
    status, and TEP displays stop.
    

Problem conclusion

  • The factory team has provided a fix that resolves the thread
    synchronization issue that caused the agent to enter the hang
    state.
    The fix for this APAR is contained in the following maintenance
    package:
    Interim Fix: 0002, 7.2.0.2-TIV-ITM_VMWVI-IF0002
    

Temporary fix

Comments

APAR Information

  • APAR number

    IV55838

  • Reported component name

    ITMF VE VM WARE

  • Reported component ID

    5724L92VI

  • Reported release

    720

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2014-02-24

  • Closed date

    2014-04-19

  • Last modified date

    2014-04-19

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    ITMF VE VM WARE

  • Fixed component ID

    5724L92VI

Applicable component levels

  • R720 PSY

       UP


Document Information

Modified date:
03 October 2021