IBM Support

IT20512: SYSTEMD DOES NOT NOTICE IF DSMSERV PROCESS IS STOPPED OUTSIDE SYSTEMD

APAR status

  • Closed as program error.

Error description

  • If the IBM Spectrum Protect server is stopped outside the systemd
    service manager, for example from the dsmadmc administrative
    command line, a systemctl status query still reports the service
    as running. Before the server service can be started again with
    systemctl, systemd must first stop the service.
    
    IBM Spectrum Protect Versions Affected:
    IBM Spectrum Protect server on Linux utilizing systemd bootstrap
    system (Red Hat Enterprise Linux starting from v7.0 and SUSE
    Linux Enterprise Server starting from v12)
    
    Initial Impact:
    Low
    
    Additional Keywords:
    Spectrum Protect; TSM; server; systemd; systemctl; service
    status; startup script; rc.dsmserv
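
The stop-then-start sequence described above can be sketched as a small shell helper. This is only an illustration: the function name restart_stale_unit is hypothetical, and the unit name passed to it (tsminst1 in the usage note) is an assumption that depends on how the startup script was installed.

```shell
# Hypothetical helper: clear the stale "running" state, then start the server.
# Assumption: $1 is the systemd unit name generated for the startup script.
restart_stale_unit() {
    if [ -z "$1" ]; then
        echo "usage: restart_stale_unit <unit>" >&2
        return 2
    fi
    # systemd still considers the unit active, so it must be stopped first.
    # The result of the stop is ignored because the server is already down.
    systemctl stop "$1"
    # With the unit inactive, the start request is acted on.
    systemctl start "$1"
}
```

For example, restart_stale_unit tsminst1 would stop and then start a unit named tsminst1. The function returns the exit status of the final systemctl start.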
    

Local fix

  • Use the service command to manage the server service.
    Alternatively, the recommended workaround is to create a backup
    copy of the existing startup script and modify the original
    using the patch below. Adjust the pidfile: line in the header
    to use the actual instance name. After updating the script, run
    the following command:
      systemctl daemon-reload
    $ diff -u dsmserv.rc.orig dsmserv.rc
    --- dsmserv.rc.orig     2017-08-04 18:57:56.797066670 -0700
    +++ dsmserv.rc  2017-08-11 16:04:36.929097292 -0700
    @@ -5,7 +5,7 @@
     # chkconfig: - 90 10
     # description: Starts/Stops an IBM Tivoli Storage Manager Server instance
     # processname: dsmserv
    -# pidfile: /var/run/dsmserv_instancename.pid
    +# pidfile: /var/run/dsmserv_instancename_su.pid
     #***********************************************************************
     # Distributed Storage Manager (ADSM)                                   *
    @@ -17,7 +17,7 @@
     #                                                                      *
     # OCO Source Materials                                                 *
     #                                                                      *
    -# 5765-303 (C) Copyright IBM Corporation 1990, 2009                    *
    +# 5765-303 (C) Copyright IBM Corporation 1990, 2017                    *
     #***********************************************************************
     #
    @@ -42,6 +42,11 @@
     # If any of these assumptions are not valid, then the script will require
     # some modifications to work.  To start with, look at the
     # instance, instance_user, and instance_dir variables set below...
    +#
    +# Note that on distributions that use the systemd init system, the
    +# line in the header above that starts with "#pidfile:" MUST be updated
    +# by replacing 'instancename' in dsmserv_instancename_su.pid with the
    +# actual instance name.
     # First of all, check for syntax
     if [[ $# != 1 ]]
    @@ -78,6 +83,7 @@
     instance_user=$instance
     instance_dir="${instance_home}/tsminst1"
     pidfile="/var/run/${prog}_${instance}.pid"
    +supidfile="/var/run/${prog}_${instance}_su.pid"
     PATH=/sbin:/bin:/usr/bin:/usr/sbin:$serverBinDir
    @@ -135,11 +141,13 @@
                exit 0
             else
                $serverBinDir/rc.dsmserv -u $instance_user -i $instance_dir -q >/dev/null 2>&1 &
    +           suPID=$!
                # give enough time to server to start
                sleep 5
                # if the lock file got created, we did ok
                if [[ -f $instance_dir/dsmserv.v6lock ]]
                then
    +              echo $suPID>$supidfile
                   gawk --source '{print $4}' $instance_dir/dsmserv.v6lock>$pidfile
                   [ $? = 0 ] && echo "Succeeded" || echo "Failed"
                   rc=$?
    @@ -192,6 +200,14 @@
                   echo "Be sure to remove $pidfile."
                   exit 1
                fi
    +           # remove the su pid file so that we don't try to kill same pid again
    +           rm $supidfile
    +           if [[ $? != 0 ]]
    +           then
    +              echo "Process $prog instance $instance stopped, but unable to remove $supidfile"
    +              echo "Be sure to remove $supidfile."
    +              exit 1
    +           fi
             else
                echo "$prog instance $instance is not running."
             fi
    @@ -219,26 +235,28 @@
              if [[ -n $running ]]
              then
                 g_status="running"
    +            rc=0
              else
    -            g_status="stopped"
    -            # remove the pidfile if stopped.
    -            if [[ -e $pidfile ]]
    +            # see if server lock file still exits.  if so then the server
    +            # probably crashed instead of shutting down gracefully
    +            if [[ -f $instance_dir/dsmserv.v6lock ]]
                 then
    -                rm $pidfile
    -                if [[ $? != 0 ]]
    -                then
    -                    echo "$prog instance $instance stopped, but unable to remove $pidfile"
    -                    echo "Be sure to remove $pidfile."
    -                fi
    +               g_status="failed"
    +               rc=1
    +            else
    +               g_status="stopped"
    +               rc=3
                 fi
              fi
           else
             g_status="stopped"
    +        rc=3
           fi
           if [[ $1 == 1 ]]
           then
                 echo "Status of $prog instance $instance: $g_status"
           fi
           fi
    +      return $rc
     }
     restart() {
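
The last hunk of the patch makes the status function return a code in addition to printing a state: 0 for running, 1 for "failed" (process gone but the server lock file remains), and 3 for stopped. The following simplified stand-in shows that decision logic in isolation. It is a sketch, not the shipped script: the function name instance_status is hypothetical, and the kill -0 liveness test is an assumption, because the patch does not show how the script sets its running variable.

```shell
# Hypothetical, simplified stand-in for the patched status logic.
# $1 = pid file, $2 = server lock file (dsmserv.v6lock in the patch).
instance_status() {
    if [ -f "$1" ]; then
        pid=$(cat "$1")
        # Assumption: kill -0 sends no signal and only tests whether the
        # process exists and can be signalled by the current user.
        if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
            echo "running"
            return 0
        fi
        # Process is gone but the lock file is still there:
        # the server probably crashed instead of shutting down.
        if [ -f "$2" ]; then
            echo "failed"
            return 1
        fi
    fi
    echo "stopped"
    return 3
}
```

A caller can branch on the exit status instead of parsing the printed text, for example: instance_status "$pidfile" "$lockfile" || echo "not running (rc=$?)".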
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * All IBM Spectrum Protect server users.                       *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See error description.                                       *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply fixing level when available. This problem is currently *
    * projected to be fixed in levels 7.1.9 and 8.1.3. Note that   *
    * this is subject to change at the discretion of IBM.          *
    *                                                              *
    * Note that the copy of the startup script used for existing   *
    * instances will need to be manually updated once the fixing   *
    * level is installed.                                          *
    ****************************************************************
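
Updating an instance's copy of the startup script by hand includes setting the real instance name in the "# pidfile:" header line, as the local fix notes. A minimal sketch of that step follows. The function name set_pidfile_instance is hypothetical, the ".orig" backup suffix mirrors the file name used in the diff, and the instance name is assumed to contain no characters that are special to sed.

```shell
# Hypothetical helper: back up a startup script, then put the real
# instance name into its "# pidfile:" header line.
# $1 = path to the script, $2 = instance name (for example tsminst1).
set_pidfile_instance() {
    if [ ! -f "$1" ]; then
        echo "no such script: $1" >&2
        return 1
    fi
    # Keep a backup copy before changing the original.
    cp -p "$1" "$1.orig" || return 1
    # Rewrite only the header line that starts with "# pidfile:".
    # Other occurrences of the placeholder are left untouched.
    sed "/^# pidfile:/ s/dsmserv_instancename/dsmserv_$2/" "$1.orig" > "$1"
}
```

After the header is updated, systemctl daemon-reload must still be run so that systemd picks up the changed script.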
    

Problem conclusion

  • This problem was fixed.
    Affected platforms for reported release:  Linux
    Platforms fixed:  Linux.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT20512

  • Reported component name

    TSM SERVER

  • Reported component ID

    5698ISMSV

  • Reported release

    71L

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2017-05-17

  • Closed date

    2017-08-02

  • Last modified date

    2017-09-14

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    TSM SERVER

  • Fixed component ID

    5698ISMSV

Applicable component levels

  • R81L PSY

       UP



Document information

More support for: Tivoli Storage Manager

Software version: 7.1.3

Reference #: IT20512

Modified date: 14 September 2017