IBM Support

IV80830: UNIX OS AGENT MAY CORRUPT MEMORY WHILE COLLECTING METRICS FOR THE DISK GROUP.

APAR status

  • Closed as program error.

Error description

  • Random memory corruption may occur in the UNIX OS Agent while
    collecting metrics for the Disk group.
    Affected Platforms / Versions:
     This issue affects all UNIX OS agent versions since 6.23 FP1
     and does not depend on the UNIX platform.
    
    Diagnostics:
     RAS1 logs at ERROR level show a sequence like this before
    ending:
    
    mount_stat.cpp,446,"get_mount_info")
      WARNING: The statvfs64 timeout expired!
    mount_stat.cpp,447,"get_mount_info")
      WARNING: The mounted file system "/example/file/system" is probably unreachable
    .
    With more detailed RAS1 logging enabled, the RAS1 logs end with
    the "exec_statvfs64" thread:
    mount_stat.cpp,189,"exec_statvfs64") statvfs64 executed successfully for "/example/file/system"
    mount_stat.cpp,294,"exec_statvfs64") Exit: 0x0
    .
    The traceback of the coring thread varies because the crash is
    the result of memory corruption. The corruption occurs when the
    call to statvfs64() returns and the "exec_statvfs64" thread
    exits after the calling thread has already timed out.
    .
    Initial Impact:
     High - monitoring agent crashes.
    .
    Additional Keywords:
     UNIXDISK
     core
     dump
     sigsegv
     mount_stat
     kuxagent
    
    Local Fix:
     Increase KBB_NFS_TIMEOUT to prevent timeout of statvfs64
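
The warning sequence described under Diagnostics can be searched for in a collected log. The following is a minimal sketch, not an IBM tool: it assumes only the two message texts quoted in this APAR, and the names summarize_timeouts, TIMEOUT_TEXT and UNREACHABLE are hypothetical. The full RAS1 line layout is not shown here, so the match is done on the message text alone.

```python
import re
from collections import Counter

# Message texts as quoted in this APAR. The path match stops at a quote or at
# whitespace, so a mount point containing spaces would be cut short.
TIMEOUT_TEXT = "The statvfs64 timeout expired!"
UNREACHABLE = re.compile(r'WARNING: The mounted file system "([^"\s]+)')

def summarize_timeouts(lines):
    """Count statvfs64 timeouts and tally the file systems reported unreachable."""
    timeouts = 0
    unreachable = Counter()
    for line in lines:
        if TIMEOUT_TEXT in line:
            timeouts += 1
        match = UNREACHABLE.search(line)
        if match:
            unreachable[match.group(1)] += 1
    return timeouts, unreachable

if __name__ == "__main__":
    sample = [
        'mount_stat.cpp,446,"get_mount_info")',
        '  WARNING: The statvfs64 timeout expired!',
        'mount_stat.cpp,447,"get_mount_info")',
        '  WARNING: The mounted file system "/example/file/system" is probably unreachable',
    ]
    count, by_fs = summarize_timeouts(sample)
    print(count, dict(by_fs))
```

A file system that appears repeatedly in the tally is a candidate for the slow or unreachable mount that triggers the timeout.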
    

Local fix

  • Increase the KBB_NFS_TIMEOUT setting to a value larger than the
    time the statvfs64 call takes to return. The setting is made in
    lz.ini and the value is specified in seconds.
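
To pick a value, it helps to know how long a file system query takes on the affected host. The sketch below is one possible way to measure it, using Python's os.statvfs as a stand-in for the agent's statvfs64 call; the helper name time_statvfs is hypothetical. The measurement only reflects the moment it is taken, and a hung NFS mount would block this script just as it blocks the agent.

```python
import os
import time

def time_statvfs(path, repeats=3):
    """Return the slowest observed os.statvfs() duration for path, in seconds."""
    worst = 0.0
    for _ in range(repeats):
        start = time.monotonic()
        os.statvfs(path)  # same kind of query the agent issues per mount point
        worst = max(worst, time.monotonic() - start)
    return worst

if __name__ == "__main__":
    mount_point = "/"  # replace with the mount point named in the warning
    print(f"{mount_point}: slowest statvfs took {time_statvfs(mount_point):.3f} s")
```

If the slowest observed time approaches or exceeds the 2-second default, the timeout would be raised above it. An entry of the form KBB_NFS_TIMEOUT=<seconds> is assumed here; this APAR does not show the exact syntax.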
    

Problem summary

  • Problem Description: The Monitoring Agent for UNIX OS may
     corrupt memory while monitoring disks.
    Problem/Problem Summary: A condition has been identified where
     the buffer containing data for one filesystem can be overwritten
     by the data for another filesystem. This memory corruption is
     caused by a timing condition between two internal agent threads.
     It may occur when one or more filesystems do not respond within
     the default 2-second timeout for the statvfs64 system call.
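
The timing condition can be modelled in a few lines. This is an illustrative sketch only, not the agent's code and not the actual fix: Python cannot corrupt memory, so the example shows the logical effect (data for one filesystem landing in the result for another). All names (fake_statvfs, collect_shared, collect_private, run_demo) are hypothetical, and giving each call its own buffer is an assumed remedy.

```python
import threading

def fake_statvfs(path, release):
    """Hypothetical stand-in for statvfs64(): blocks until `release` is set."""
    release.wait()
    return {"path": path}

def collect_shared(mounts, timeout):
    """Risky pattern: every worker writes into one buffer that the caller reuses."""
    buffer, report, workers = {}, {}, []
    for path, release in mounts:
        def worker(p=path, r=release):
            buffer.update(fake_statvfs(p, r))  # a late return still writes here
        t = threading.Thread(target=worker)
        t.start()
        workers.append(t)
        t.join(timeout)  # the caller gives up after `timeout` seconds
        report[path] = None if t.is_alive() else buffer
    return report, workers

def collect_private(mounts, timeout):
    """Safer pattern: each call owns its buffer, so a late write is harmless."""
    report, workers = {}, []
    for path, release in mounts:
        slot = {}  # private to this filesystem
        def worker(p=path, r=release, out=slot):
            out.update(fake_statvfs(p, r))
        t = threading.Thread(target=worker)
        t.start()
        workers.append(t)
        t.join(timeout)
        report[path] = None if t.is_alive() else slot
    return report, workers

def run_demo(collect):
    """Query a hung filesystem, then a healthy one, then let the hung call return."""
    slow, fast = threading.Event(), threading.Event()
    fast.set()  # the healthy filesystem answers immediately
    report, workers = collect([("/slow", slow), ("/fast", fast)], timeout=0.2)
    slow.set()  # the hung call returns after the caller has moved on
    for t in workers:
        t.join()
    return report

if __name__ == "__main__":
    print(run_demo(collect_shared)["/fast"])   # {'path': '/slow'}
    print(run_demo(collect_private)["/fast"])  # {'path': '/fast'}
```

In the shared-buffer variant, the entry recorded for "/fast" ends up holding the data for "/slow", which mirrors the overwrite described above. In the private-buffer variant, the late result is written to a buffer nobody else uses.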
    

Problem conclusion

  • Fix/Problem Conclusion: Memory corruption avoided.
    
    The fix for this APAR will be contained in the following
    maintenance packages:
    
    | FixPack    | 6.3.0-TIV-ITM-FP0007
    | InterimFix | 6.3.0.5-TIV-ITM_UNIX-IF0004
    

Temporary fix

Comments

APAR Information

  • APAR number

    IV80830

  • Reported component name

    ITM AGENT UNIX

  • Reported component ID

    5724C040U

  • Reported release

    623

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2016-01-26

  • Closed date

    2016-03-21

  • Last modified date

    2016-03-21

  • APAR is sysrouted FROM one or more of the following:

    IV80542

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    ITM AGENT UNIX

  • Fixed component ID

    5724C040U

Applicable component levels

Document Information

Modified date:
08 March 2023