IBM Support

VM65776: VARY PROC LOCK HANG PREVENTS MCW002 ABEND

A fix is available

APAR status

  • Closed as program error.

Error description

  • System hang because unresponsive processor handler is prevented
    from generating ABENDMCW002 due to a Vary Proc Lock hang.
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All users of z/VM.                           *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    ****************************************************************
    * RECOMMENDATION: APPLY PTF                                    *
    ****************************************************************
    The z/VM OPERATOR console shows repeated MSHCPMPG9152E messages
    for the same unresponsive processor. The system stays up but is
    hung and unusable. If Monitor was running, it probably stopped
    generating sample records around the time of the first
    MSHCPMPG9152E message.
    
    When the unresponsive processor detection front-end (HCPMPGUP)
    finds an unresponsive processor, it stacks a call to the
    unresponsive processor recovery (HCPMPCPR) to determine whether
    the processor can be restarted to recover from the error, or to
    generate ABENDMCW002 and restart the system.  That process
    requires the Vary Processor Lock (HCPRCCVA).  A lock hang on
    that lock will prevent the recovery action from being performed
    and the system will eventually hang.  There are cases where the
    unresponsive processor may prevent a task that holds HCPRCCVA
    from completing.  In that case HCPRCCVA is never released and
    the unresponsive processor recovery is not able to run and
    generate ABENDMCW002 and restart the system.
    
    When unresponsive processor recovery is blocked by an HCPRCCVA
    hang, the system hangs and becomes unusable. The customer impact
    is greater than that of an abend and restart because the system
    remains unavailable for as long as it stays hung, even though it
    is still running.
    
    The HCPRCCVA hang is not the cause of the outage.  The cause of
    the outage is the function that caused a processor to become
    unresponsive.  However, the hang on the HCPRCCVA lock prevents
    the unresponsive processor recovery from running and so
    prevents the generation of ABENDMCW002.  Without a dump of the
    system it is not possible to determine the cause of the
    unresponsive processor.
    
    This APAR addresses system availability and FFDC (first-failure
    data capture) aspects of the problem.
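
The blocking sequence described above can be pictured with a small Python model. This is only an illustrative sketch, not CP code: the threading.Lock stands in for HCPRCCVA, and the function names, the event, and the timeouts are all assumptions added so that the example terminates. In the real hang the lock holder never finishes and the recovery never gets the lock.

```python
import threading

vary_proc_lock = threading.Lock()        # stands in for HCPRCCVA (assumption)
processor_responded = threading.Event()  # never set: the processor is unresponsive


def lock_holder(holding):
    """Hypothetical task that holds the lock while waiting on the processor."""
    with vary_proc_lock:
        holding.set()
        # The real task would wait indefinitely; the timeout exists only so
        # that this sketch finishes.
        processor_responded.wait(timeout=0.5)


def recovery(wait_seconds):
    """Hypothetical recovery: it needs the lock before it can act."""
    if vary_proc_lock.acquire(timeout=wait_seconds):
        vary_proc_lock.release()
        return "recovery ran"
    return "recovery blocked"


holding = threading.Event()
holder = threading.Thread(target=lock_holder, args=(holding,))
holder.start()
holding.wait()                                   # the lock is now held
result_during_hang = recovery(wait_seconds=0.1)  # cannot get the lock
holder.join()                                    # sketch only: holder gives up
result_after_release = recovery(wait_seconds=0.1)
print(result_during_hang, "/", result_after_release)
```

While the holder is stuck, the recovery cannot restart the processor or generate ABENDMCW002, which is the condition this APAR addresses.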
    

Problem conclusion

  • The APAR fix adds the capability to detect a Vary Proc Lock
    (HCPRCCVA) hang in unresponsive processor recovery before it
    attempts to acquire HCPRCCVA.  If a lock hang on HCPRCCVA is
    detected, an ABENDMPC008 dump is generated.  This
    satisfies the FFDC concern by generating a dump as close to the
    point of failure as is reasonable.  The dump allows the cause
    of the unresponsive processor to be diagnosed.  The fix also
    improves availability by detecting a permanent error closer to
    the point it occurs and forcing an abend dump and re-IPL rather
    than allowing the system to remain up in an unusable state.
    
    Changed parts:
    - HCPRCC ASSEMBLE
    - HCPMPC ASSEMBLE
    - HCPLCK ASSEMBLE
    - HCPSGP ASSEMBLE
    - HCPMTC ASSEMBLE
    - HCPCCF ASSEMBLE
    
    
    SRL changes:
    GC24-6270-01 CP Messages and Codes - z/VM Version 7 Release 1
    - Page 107 - add the MPC008 abend information.
    - This is Chapter 2. System Codes - CP Abend Codes
    GC24-6177-12 CP Messages and Codes - z/VM Version 6 Release 4
    - Page 86 - add the MPC008 abend information.
    - This is Chapter 2. System Codes - CP Abend Codes -
      Abend Codes A - M
    
    
    MPC008
    Explanation: This module is distributed as object code only;
           therefore, no source program materials are available.
    User response: See z/VM: Diagnosis Guide for information on
           gathering the documentation you need to assist IBM in
           diagnosing the problem; then contact your IBM Support
           Center personnel.
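
The check that the fix adds ahead of the HCPRCCVA acquisition can be sketched as a small Python model. This is a sketch under stated assumptions, not the CP implementation: the APAR text does not say how a lock hang is recognized, so the held-since timestamp, the 60-second threshold, the "retry later" outcome, and every name below are hypothetical. Only the abend codes MCW002 and MPC008 come from the APAR.

```python
from dataclasses import dataclass
from typing import Optional

HANG_THRESHOLD_SECONDS = 60.0  # hypothetical; the APAR gives no value


class Abend(Exception):
    """Models an abend; the code is the only information carried."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code


@dataclass
class VaryProcLock:
    """Stand-in for HCPRCCVA; held_since is None when the lock is free."""
    held_since: Optional[float] = None


def recover_unresponsive_processor(lock, now, restartable):
    # New step modeled on the fix: look for a hang before trying to acquire.
    if lock.held_since is not None:
        if now - lock.held_since >= HANG_THRESHOLD_SECONDS:
            raise Abend("MPC008")  # dump close to the point of failure
        return "retry later"       # assumption: held, but not yet a hang
    lock.held_since = now          # acquire the lock
    try:
        if restartable:
            return "processor restarted"
        raise Abend("MCW002")      # existing path: abend and restart
    finally:
        lock.held_since = None     # release the lock


def outcome(lock, now, restartable):
    """Return the recovery result, or the abend name if one was raised."""
    try:
        return recover_unresponsive_processor(lock, now, restartable)
    except Abend as abend:
        return "ABEND" + abend.code


print(outcome(VaryProcLock(), 0.0, True))
print(outcome(VaryProcLock(held_since=0.0), 120.0, False))
```

The point of the model is the ordering: the hang check runs before the acquisition, so a stuck lock leads to a dump instead of a silent wait.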
    

Temporary fix

  • FOR RELEASE VM/ESA CP/ESA R640:
    PREREQ: VM65988 VM66105
    CO-REQ: NONE
    IF-REQ: NONE
    FOR RELEASE VM/ESA CP/ESA R710:
    PREREQ: NONE
    CO-REQ: NONE
    IF-REQ: NONE
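
The per-release requisites above, together with the PTF numbers listed under "Applicable component levels", can be captured in a small lookup. The following Python sketch is illustrative only: the table layout and the function name are hypothetical, and how the set of already applied APARs is obtained from a system is outside its scope.

```python
# Requisite and PTF data transcribed from this APAR; the structure is an
# assumption made for illustration.
FIXES = {
    "R640": {"ptf": "UM35376", "prereq": ["VM65988", "VM66105"]},
    "R710": {"ptf": "UM35377", "prereq": []},
}


def missing_prereqs(release, applied_apars):
    """Return the prerequisite APARs for a release that are not yet applied."""
    return [apar for apar in FIXES[release]["prereq"]
            if apar not in applied_apars]


# Example: an R640 system that already has VM65988 applied.
print(FIXES["R640"]["ptf"], missing_prereqs("R640", {"VM65988"}))
```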
    

Comments

APAR Information

  • APAR number

    VM65776

  • Reported component name

    VM CP

  • Reported component ID

    568411202

  • Reported release

    640

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2019-04-16

  • Closed date

    2019-06-13

  • Last modified date

    2020-12-16

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    UM35376 UM35377

Modules/Macros

  • HCPCCF   HCPLCK   HCPMPC   HCPMTC   HCPRCC   HCPSGP
    

Fix information

  • Fixed component name

    VM CP

  • Fixed component ID

    568411202

Applicable component levels

  • R640 PSY UM35376

       UP19/06/19 P 2001

  • R710 PSY UM35377

       UP19/06/19 P 1902

Document Information

Modified date:
12 January 2021