IBM Support

Persistent Incomplete state on managed systems at 01Ax770_030 and later

Troubleshooting


Problem

A subset of POWER7 server firmware levels has a problem that can cause the server to go to Incomplete state the first time it is managed by HMC 7.7.8 or later. The problem also affects Flex nodes when the FSM is updated to 1.3.2 or later. This document describes the problem, the recovery procedure, and the fix.

Resolving The Problem

Problem Description
A subset of POWER7 server firmware levels has a problem that can cause the server to go to Incomplete state the first time it is managed by HMC 7.7.8 or later. The problem also affects Flex nodes when the FSM is updated to 1.3.2 or later.

The Incomplete state occurs only when all of the following conditions are met:
1. Server firmware level is at (or is upgraded to) one of the following impacted levels:
01Ax770 levels 032 (GA) to 064;
01Ax773 levels 033 (GA) to 054;
01Ax780 levels 040 (GA) to 054.

2. The server is managed by either of the following:
HMC V7R7.8.0 or later
FSM 1.3.2 or later

3. One or more partitions are running during the management console connection attempts.
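
The three conditions above can be expressed as a small check. The following is a minimal sketch, not an IBM tool: the function names, the level string format (an optional "01" prefix, "A", one platform letter, the release, an underscore, and the level), and the representation of console versions as tuples are all assumptions made for illustration. The ranges are copied as listed above; compare them with the levels in the Fix section before relying on the result.

```python
import re

# Impacted level ranges per release, inclusive, transcribed from the list above.
IMPACTED_RANGES = {
    "770": (32, 64),
    "773": (33, 54),
    "780": (40, 54),
}

# Earliest management console versions that expose the problem (condition 2).
MIN_CONSOLE_VERSION = {
    "HMC": (7, 7, 8),
    "FSM": (1, 3, 2),
}

# Assumed level format, for example "01AL770_048" or "AM780_056".
_LEVEL_RE = re.compile(r"^(?:01)?A[A-Z](\d{3})_(\d{3})$")


def firmware_is_impacted(level):
    """Condition 1: the firmware level falls inside an impacted range."""
    match = _LEVEL_RE.match(level.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized firmware level: {level!r}")
    release, number = match.group(1), int(match.group(2))
    bounds = IMPACTED_RANGES.get(release)
    if bounds is None:
        return False  # release is not in the impacted list
    low, high = bounds
    return low <= number <= high


def incomplete_state_expected(level, console, console_version, partitions_running):
    """True only when all three conditions hold at the same time."""
    return (
        firmware_is_impacted(level)
        and tuple(console_version) >= MIN_CONSOLE_VERSION[console]
        and partitions_running > 0
    )
```

For example, `incomplete_state_expected("01AL770_048", "HMC", (7, 7, 8), 2)` evaluates to `True`, while the same server managed by HMC 7.7.7, or with no partitions running, evaluates to `False`.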

For disruptive server firmware upgrades to an impacted level, this implies the following additional conditions:
- One or more partitions are defined to auto-start ("Automatically start with managed system") at server power on.
- The server power on after the firmware upgrade is to Operating, which allows the partitions to auto-start.

For management console upgrades and conversions, this implies one of the following:
- The HMC is upgraded from HMC 7.7.7 to HMC 7.7.8 or later.
- An FSM-managed node is converted from FSM 1.3.1 or earlier to HMC 7.7.8 or later.
- The FSM is upgraded to 1.3.2 or later.


Symptoms
The problem is typically exposed during one of the following tasks:

1. Server shows incomplete state on the first server power on after a disruptive server firmware upgrade to one of the impacted releases.

2. Server shows incomplete state after upgrading the HMC from V7R7.7 to V7R7.8 or later.

3. Flex node shows incomplete state after migrating the node from FSM 1.3.1 or earlier to HMC V7R7.8 or later.

4. Flex node shows incomplete state after updating the FSM to 1.3.2 or later.


Prevention

When upgrading from an earlier EC level to an impacted level, upgrade directly to the fixed level instead.

For servers with an impacted server firmware level already applied, concurrently apply the fix prior to upgrading the management console.


Recovery

Power off partitions and rebuild.
- Power off all partitions that are running.
Because the server is in Incomplete state on the HMC, this must be done directly from the operating system of each partition. Wait several minutes for all of the partition shutdown commands to complete.

- Run the server rebuild task: Operations > Rebuild
After the last partition has shut down, the rebuild task takes the server from Incomplete state to an Operating or Powered Off state, depending on the current setting of the server property "Power off the system after all the logical partitions are powered off." If the server powers off, it must be powered on to standby before continuing.

- Verify the state returns to Operating.

Once the recovery is completed, the Incomplete state will not recur, even if the fix is not applied.


Fix

The issue is fixed in server firmware:
770.31 - Ax770_063 or later.
780.10 - Ax780_056 or later.
773.12 - AF773_056 or later; AF783 all levels.
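
A check of whether a given level already contains the fix could look like the following. This is a hedged sketch only: the function name and the assumed level string format (an optional "01" prefix, "A", one platform letter, the release, an underscore, and the level) are illustrative and are not part of any IBM interface. The thresholds are taken from the list above, with 783 treated as fixed at every level.

```python
import re

# First fixed level per release, from the list above. 783 is fixed at all levels.
FIRST_FIXED_LEVEL = {
    "770": 63,
    "780": 56,
    "773": 56,
    "783": 0,
}

# Assumed level format, for example "01AL770_063" or "AF773_056".
_LEVEL_RE = re.compile(r"^(?:01)?A[A-Z](\d{3})_(\d{3})$")


def has_fix(level):
    """Return True if the level is at or above the first fixed level."""
    match = _LEVEL_RE.match(level.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized firmware level: {level!r}")
    release, number = match.group(1), int(match.group(2))
    if release not in FIRST_FIXED_LEVEL:
        # Releases not listed in this document are outside its scope.
        raise ValueError(f"release {release} is not covered by this document")
    return number >= FIRST_FIXED_LEVEL[release]
```

For example, `has_fix("AL770_048")` evaluates to `False`, so that server would need the fix applied before the management console is upgraded, while `has_fix("AF773_056")` evaluates to `True`.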


Document Information

Modified date:
22 September 2021

UID

nas8N1020088