IBM Support

IT16280: SSL ENABLED NODE REPLICATION CAN CAUSE SOURCE SERVER CRASH

APAR status

  • Closed as program error.

Error description

  • An IBM Spectrum Protect (Tivoli Storage Manager) source server
    can crash during REPLICATE NODE processing when server-to-
    server communications are SSL enabled (SSL=YES).
    
    The failing callstack (Linux example) looks similar to the
    following:
    
      open_memstream()
      __vsyslog_chk()
      __libc_message()
      _int_malloc()
      malloc()
      fdopen@@GLIBC_2.2.5()
      psSysFopen()
      pkShowMsg()
      pkFreeTracked()
      matchDNS()
      verifyPartnerIdentity()
      ssltcpOpen()
      smOpenSession()
      SignOnToServer()
      StartConversation()
      smServerOpenOutbound()
      smReplGetSession()
      NrGetSession()
      DoServerHandshake()
      NrReplicationThread()
      StartThread()
    
    A message similar to the following might also appear in the
    dsmserv.err file, which is further evidence of this problem:
    
      06/27/2016 06:06:03  ANR9999D pkmemory.c(781): Error 9
                           detected while freeing memory; free
                           was called from ssltcomm.c(3507),
                           memory was allocated from
                           ssltcomm.c(3424).
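
As a rough aid for spotting this message, the sketch below scans log text for the ANR9999D signature quoted above. It is only an illustration: the function name `has_it16280_signature` is hypothetical, and the pattern assumes the message wording matches the example, with the source line numbers free to vary between server levels.

```python
import re

# Pattern for the ANR9999D memory-free error quoted above. The line numbers
# (781, 3507, 3424) may differ between server levels, so they are matched
# as arbitrary digits. This is an assumption, not a documented format.
SIGNATURE = re.compile(
    r"ANR9999D pkmemory\.c\(\d+\): Error \d+ detected while freeing memory; "
    r"free was called from ssltcomm\.c\(\d+\), "
    r"memory was allocated from ssltcomm\.c\(\d+\)"
)


def has_it16280_signature(log_text: str) -> bool:
    """Return True if the log text contains the pkmemory/ssltcomm message."""
    # The message wraps across several lines in dsmserv.err, so collapse
    # every run of whitespace into a single space before matching.
    flattened = " ".join(log_text.split())
    return SIGNATURE.search(flattened) is not None
```

The function could be applied to the contents of dsmserv.err read as a single string. A match only suggests this problem; the callstack should still be compared.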
    
    Subsequent attempts to start the source server after the
    crash may also result in additional crashes, which have
    the following stack:
    
      matchDNS()
      verifyPartnerIdentity()
      ssltcpOpen()
      smOpenSession()
      SignOnToServer()
      StartConversation()
      smServerOpenOutbound()
      smReplGetSession()
      NrGetSession()
      NrPingReplServer()
      NrDoHeartbeats()
      imReplMonitor()
      AdmServerMonitorThread()
      StartThread()
    
    This issue is triggered by a mismatch between the server-to-
    server high-level address (HLA) definition within the IBM
    Spectrum Protect server and the subject alternative name within
    the SSL certificate being used.  If the certificate uses
    a DNS name, then the HLA definition must specify that DNS
    name instead of a dotted IP address for the system.
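
The condition described above can be checked ahead of time with a small script. The following is a minimal sketch, not the server's own matchDNS() logic: the function names are hypothetical, the DNS names are assumed to have already been read from the certificate's subject alternative name, and the comparison is a plain case-insensitive match with no wildcard handling.

```python
import ipaddress


def is_ip_literal(address: str) -> bool:
    """Return True if the address is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(address)
        return True
    except ValueError:
        return False


def hla_matches_certificate(hla: str, san_dns_names: list[str]) -> bool:
    """Return True if the HLA is one of the certificate's DNS names.

    san_dns_names holds the DNS entries from the certificate's subject
    alternative name (assumed to be extracted elsewhere).
    """
    # A dotted IP address can never equal a DNS name entry, which is the
    # mismatch this APAR describes.
    if is_ip_literal(hla):
        return False
    # DNS names are case-insensitive; ignore an optional trailing dot.
    wanted = hla.rstrip(".").lower()
    return any(wanted == name.rstrip(".").lower() for name in san_dns_names)
```

For example, with a certificate whose subject alternative name lists only `tsm-target.example.com` (a made-up host name), an HLA of `192.0.2.10` would be reported as a mismatch, while an HLA of `tsm-target.example.com` would pass.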
    
    Customer/L2 Diagnostics:
    None.
    
    Initial Impact:
    High
    
    Tivoli Storage Manager Versions Affected:
    All supported server users of SSL-enabled REPLICATE NODE
    processing.
    
    Additional Keywords:
    TSM IBM SPECTRUM PROTECT CRASH ABEND MATCHDNS PKMEMORY ABORT
    

Local fix

  • If the source system continues to crash on restart after this
    issue has been experienced, you must complete the following
    steps to recover the source system:
    
      1. Halt the target server.
      2. Start the source server.
      3. Disable SSL server-to-server communications (SSL=NO)
         on the source server.
      4. Restart the target server.
    
    Alternatively, unloading the certificate label from the
    server's key database may also allow the source server to
    start.
    
    To prevent this crash before fixing maintenance can be applied,
    change the HLA used on the DEFINE SERVER command for the
    server-to-server communications for SSL-enabled node
    replication to use the DNS name specified by the subject
    alternative name within your SSL certificate, and NOT a
    dotted IP address.
    
    Alternatively, do not use SSL-enabled server-to-server
    communications until fixing maintenance can be applied.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * All Tivoli Storage Manager server users.                     *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See error description.                                       *
    * This problem also can be encountered on any other            *
    * server-server or storage agent-server SSL session            *
    * using 3rd party certificates.                                *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply fixing level when available. This problem is currently *
    * projected to be fixed in level 7.1.7.                        *
    * Note that this is subject to change at the discretion of     *
    * IBM.                                                         *
    ****************************************************************
    

Problem conclusion

  • This problem was fixed.
    Affected platforms:  AIX, HP-UX, Solaris, Linux, and Windows.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT16280

  • Reported component name

    TSM SERVER

  • Reported component ID

    5698ISMSV

  • Reported release

    71L

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2016-07-25

  • Closed date

    2016-08-05

  • Last modified date

    2016-08-05

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    TSM SERVER

  • Fixed component ID

    5698ISMSV

Applicable component levels

  • R63A PSY

       UP

  • R63H PSY

       UP

  • R63L PSY

       UP

  • R63S PSY

       UP

  • R63W PSY

       UP

  • R61Z PSY

       UP

  • R71A PSY

       UP

  • R71H PSY

       UP

  • R71L PSY

       UP

  • R71S PSY

       UP

  • R71W PSY

       UP

  • R71Z PSY

       UP



Document information

More support for: Tivoli Storage Manager

Software version: 7.1.3

Reference #: IT16280

Modified date: 05 August 2016