IBM Support

TSAMP HADR takeover fails when network cable unplugged

Troubleshooting


Problem

When HADR is used with Tivoli System Automation for Multiplatforms (TSAMP) 3.2.1.1 and a network cable is unplugged, the HADR takeover does not complete and fails.

Environment

HADR environment with a TSAMP version lower than 3.2.1.3

Diagnosing The Problem

At first two additional relationships were added to

Add a new relationship for your HADR database by running the following commands as root:

rgreq -o lock HADR-rg
mkrel -p dependson -S IBM.Application:HADR-rs -G IBM.Equivalency:db2_public_network_0 \
HADR-rs_DependsOn_db2_public_network_0-rel
rgreq -o unlock HADR-rg

However, this did not completely solve the problem.
The trace shows that the HADR resource is not stopped as it should be. As soon as DB2 reaches the HADR disconnected status, a lock is set on the resource group, which prevents TSAMP from taking any further action. In this case the IBM.ServiceIP resource was not stopped, and therefore the resource group could not be started on the other node.

09/02/14 10:12:08.948728 T(4141890448) _RCD ReportState: Resource : eth0/Fixed/IBM.NetworkInterface/db2b reported state change: 2
09/02/14 10:12:08.967404 T(4141890448) _RCD Resource::doRIBMAction Offline Request against db2-rs on node db2b.
09/02/14 10:12:08.967438 T(4141890448) _RCD Resource::doRIBMAction Offline Request against HADR-rs on node db2b.
09/02/14 10:12:08.988111 T(4141890448) _RCD ReportState: Resource : HADR-rs/Fixed/IBM.Application/db2b reported state change: 6
09/02/14 10:12:08.994833 T(4141890448) _RCD ReportState: Resource : db2-rs/Fixed/IBM.Application/db2b reported state change: 6
09/02/14 10:12:13.338019 T(4141890448) _RCD ReportState: Resource : HADR-rs/Fixed/IBM.Application/db2b reported state change: 2
09/02/14 10:12:16.763561 T(4141890448) _RCD LockResource request injected: HADR-rg/ResGroup/IBM.ResourceGroup
09/02/14 10:12:20.856603 T(4141890448) _RCD ReportState: Resource : db2-rs/Fixed/IBM.Application/db2b reported state change: 2
09/02/14 10:24:04.687419 T(4141890448) _RCD ReportState: Resource : eth0/Fixed/IBM.NetworkInterface/db2b reported state change: 1
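To make the sequence of state changes in such a trace easier to read, the numeric codes can be translated into names. The following is a minimal sketch, assuming the trace has been saved as plain text in the layout shown above and that the codes follow the standard RSCT OpState values (1 Online, 2 Offline, 5 Pending Online, 6 Pending Offline). The function name decode_states is hypothetical and not part of TSAMP.

```shell
# decode_states <trace-file>
# Hypothetical helper: print the time, resource, and decoded state for every
# "ReportState" line of a saved trace excerpt.
decode_states() {
    awk '
    BEGIN {
        # Assumed mapping: standard RSCT OpState values
        name[1] = "Online"
        name[2] = "Offline"
        name[5] = "Pending Online"
        name[6] = "Pending Offline"
    }
    /ReportState: Resource :/ {
        code = $NF + 0                      # last field is the numeric state
        label = (code in name) ? name[code] : "state " code
        # Field 2 is the time stamp, field 8 is name/type/class/node
        printf "%s %s -> %s\n", $2, $8, label
    }' "$1"
}
```

Run against the trace above, the first line would be reported as eth0/Fixed/IBM.NetworkInterface/db2b going Offline, which corresponds to the unplugged cable.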

The trace does not show a stop of the IBM.ServiceIP resource, which would look like the following entry:
09/04/14 09:23:39.927985 T(4134083472) _RCD ReportState: Resource : db2ip_172_xx_xx_xx-rs/Fixed/IBM.ServiceIP/db2b reported state change: 2
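The check for a missing IBM.ServiceIP stop can be scripted. This is a minimal sketch, assuming the trace was saved to a text file in the layout shown above and that a state change to 2 marks the stop, as in the example entry. The function name serviceip_stopped is hypothetical.

```shell
# serviceip_stopped <trace-file>
# Hypothetical helper: report whether the trace contains a state change to 2
# for any IBM.ServiceIP resource.
serviceip_stopped() {
    # Matches lines such as:
    #   ... ReportState: Resource : <name>/Fixed/IBM.ServiceIP/<node> reported state change: 2
    if grep -Eq 'ReportState: Resource : [^ ]*/IBM\.ServiceIP/[^ ]* reported state change: 2[[:space:]]*$' "$1"; then
        echo "ServiceIP stop reported"
    else
        echo "no ServiceIP stop reported"
    fi
}
```

If the function reports no stop after the network interface has gone offline, the trace matches the failure pattern described here.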

Resolving The Problem

It was found that this is the same problem as described in APAR IV03673.

This APAR is fixed in TSAMP version 3.2.1.3 and newer.

The APAR description is: RESOURCES RESET AFTER SUCCESSFUL START IN CASE ANOTHER RESOURCE STARTED AT THE SAME TIME HAS A LONG RUNNING STARTCOMMAND

Although the APAR description does not indicate it, it was confirmed that the root cause is the same.

The problem can be resolved by installing fix pack 3.2.1.3 or newer, such as 3.2.2.8 (the most recent 3.2.2.x fix pack at the time of writing).
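To decide whether a node is below the fixed level, the installed version can be compared with 3.2.1.3 component by component. The following is a minimal sketch; the function name version_lt and the variable installed are hypothetical, and the value 3.2.1.1 is only a placeholder for the version found on your nodes.

```shell
# version_lt A B
# Hypothetical helper: succeed (exit 0) if dotted version A is lower than B.
version_lt() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, ".")
        nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            # Missing components count as 0; "+ 0" forces a numeric comparison
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi < yi) exit 0
            if (xi > yi) exit 1
        }
        exit 1   # equal versions are not lower
    }'
}

installed="3.2.1.1"   # placeholder: replace with the version installed on the node
if version_lt "$installed" "3.2.1.3"; then
    echo "TSAMP $installed is below 3.2.1.3: install fix pack 3.2.1.3 or newer"
else
    echo "TSAMP $installed is at or above the fixed level 3.2.1.3"
fi
```

The comparison is done in awk rather than with a version-aware sort so that it does not depend on GNU-specific options.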


Document Information

Modified date:
17 June 2018

UID

swg21684693