IBM Support

IV95025: CAA DOES NOT START AFTER REPDISK REPLACEMENT WITH SAVEBASE ERROR APPLIES TO AIX 7100-04

APAR status

  • Closed as program error.

Error description

  • When a repository disk is replaced, the clvdisk
    attribute of the cluster0 device in ODM is updated
    with the UUID of the new primary repository disk.
    The savebase command is then run to update the
    Boot Logical Volume.
    
    Consider the failure of an entire data centre:
    * One of two PowerHA/CAA nodes fails.
    * One of two storage subsystems fails.
      The failed storage subsystem hosts
      - the primary repository disk
      - the rootvg disk of the remaining node
        that was used at boot.
    
    In this case the savebase step of the Automatic
    Repository Replacement (ARR) operation fails.
    syslog.caa looks like:
    ...
    Apr  3 12:36:54 ha1clC caa:info cluster[8454150]:
     cluster_utils.c        cl_run_log_method
      11951   523    START '/usr/sbin/chdev -l cluster0
       -a clvdisk='2e199ead-1ed0-aeb4-e331-8c59c3f668f6''
    ...
    Apr  3 12:36:55 ha1clC caa:info cluster[8454150]:
     cluster_utils.c         cl_run_log_method
       11951   523   START '/usr/sbin/savebase '
    Apr  3 12:36:55 ha1clC caa:info cluster[8454150]:
     cluster_utils.c         cl_run_log_method
      11982   523     FINISH return = 1
    ...
    Apr  3 12:36:55 ha1clC caa:err|error cluster[8454150]:
     caa_config.c     cl_th_cfg_msg   6691    523
       savebase failed on ha1clC.mainz.de.ibm.com.
       Please run savebase on ha1clC.mainz.de.ibm.com.
    ...
    Apr  3 12:36:57 ha1clC caa:info cluster[8454150]:
     cl_chrepos.c  automatic_repository_update
      2297    1       FINISH rc = 0
    Apr  3 12:36:57 ha1clC caa:info cluster[8454150]:
     caa_protocols.c       recv_protocol_slave
      1542    1       Returning from Automatic
       Repository replacement rc = 0
    ...
    
    CAA will not start after a reboot of the
    remaining node. syslog.caa looks like:
    ...
    Apr  3 14:02:37 ha1clC caa:err|error cluster[3473558]:
     cluster_utils.c  cluster_repository_read_data
      4777    1       Could not get name of cluster
       repository disk from ODM (ODMDIR=/etc/objrepos).
    Apr  3 14:02:37 ha1clC caa:info cluster[3473558]:
     cluster_utils.c       cl_kern_repos_check
      11858   1       Could not read the respository.
    ...
    Apr  3 14:02:37 ha1clC caa:err|error
     cluster[3473558]:   clusterconf_lib.c
       _find_and_load_repos     1482    1
        cluster_repository_query() found a UUID
         but no corresponding disk. This condition
         may be temporary.
    Apr  3 14:02:37 ha1clC caa:warn|warning
     cluster[3473558]:    1035-242 clusterconf:
      Non-fatal error when loading the topology.
    ...
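
    The symptoms above can be searched for in a copy of syslog.caa. The following is a minimal sketch, not an IBM-provided tool. The function names and the symptom labels are hypothetical, and the match strings are taken only from the excerpts shown in this APAR, so other levels of CAA may word the messages differently. Wrapped continuation lines are joined before matching because the excerpts above are line-wrapped.

```python
import re

# An entry starts with a syslog timestamp such as "Apr  3 12:36:55".
ENTRY_START = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s")

# Hypothetical labels mapped to message fragments quoted in this APAR.
SYMPTOMS = {
    "savebase_failed": "savebase failed on",
    "repos_name_missing_in_odm": "Could not get name of cluster repository disk from ODM",
    "uuid_without_disk": "found a UUID but no corresponding disk",
}


def join_entries(lines):
    """Merge wrapped continuation lines into one string per log entry."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line or line == "...":
            continue  # skip blank lines and elision markers
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    # Collapse runs of whitespace so fragments match regardless of wrapping.
    return [" ".join(entry.split()) for entry in entries]


def find_symptoms(lines):
    """Return (label, entry) pairs for every entry matching a known fragment."""
    hits = []
    for entry in join_entries(lines):
        for label, fragment in SYMPTOMS.items():
            if fragment in entry:
                hits.append((label, entry))
    return hits
```

    A typical use would be to pass the lines of the log file, for example `find_symptoms(open(path).read().splitlines())`, where `path` points to a copy of syslog.caa.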
    

Local fix

  • Run
    $ clusterconf -r <new primary rep disk>
    on the remaining node after the reboot.
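
    To confirm which disk became the new primary repository, the UUID written by the chdev step can be read back from syslog.caa. The following is a minimal sketch under the assumption that the chdev entry has the form shown in the error description. The function name is hypothetical, and mapping the UUID to the disk name expected by clusterconf -r is not covered here.

```python
import re

# Matches the clvdisk value in an entry such as:
#   START '/usr/sbin/chdev -l cluster0 -a clvdisk='2e199ead-...''
CLVDISK = re.compile(r"clvdisk='?([0-9a-fA-F-]{36})'?")


def last_clvdisk_uuid(log_text):
    """Return the UUID from the most recent chdev clvdisk entry, or None."""
    # Collapse whitespace so an entry wrapped over several lines still matches.
    flat = " ".join(log_text.split())
    matches = CLVDISK.findall(flat)
    return matches[-1] if matches else None
```

    The last match is returned because the repository may have been replaced more than once within the same log.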
    

Problem summary

  • savebase can fail on a mirrored rootvg if one disk
    is not accessible.
    

Problem conclusion

  • savebase was changed to look for boot logical volumes (BLVs)
    on all disks in the mirrored rootvg.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IV95025

  • Reported component name

    AIX V7.1

  • Reported component ID

    5765H4000

  • Reported release

    710

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2017-04-12

  • Closed date

    2017-06-30

  • Last modified date

    2018-09-21

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    IV97722 IV97730 IV97733 IV97748 IV97751 IJ01657 U882085

Fix information

  • Fixed component name

    AIX V7.1

  • Fixed component ID

    5765H4000

Applicable component levels

  • R710 PSY U882085

       UP18/08/22 I 1000

Document Information

Modified date:
19 April 2022