IBM Support

Error Codes Seen on the Power Hardware Management Console (HMC)

Question & Answer


Question

How do I know if the errors that show up on the HMC are important or not?

Cause


The system attention LED activated on one of the systems managed by the HMC.

Answer


The HMC functions as the service focal point (SFP) for the systems it manages. A common misconception is that the errors shown in the Manage Serviceable Events task are HMC-centric. Most of them are conditions that occurred on the managed systems and LPARs themselves. Reference A has some brief illustrations of SFP and the flow of data from the Electronic Service Agent functions on the HMC (or AIX LPAR) to IBM (if configured).

The system administrator typically sees a service event notification on the HMC (classic view) when the Attention LED lights up with the yellow triangle and exclamation point. The LED usually shows at the system level and, if the event is associated with a particular LPAR, also in the Status column of that LPAR. To view the actual event and look closer at any available details, use the Manage Serviceable Events task wizard. The task can be accessed by clicking the wrench icon as shown below or from the task menus for a particular system.




Most of the time, the details of a service event on the HMC include information about what the event means. Select the event, then use the "Selected" pull-down menu and choose "View Details" to get more information about the error event. The error code shown in the illustrations represents a disk failure. If the HMC had been configured to "call home," it would have initiated a service request to IBM and transmitted the diagnostic data automatically. Where call home cannot be configured, the administrator has to do the extra analysis to determine whether additional actions are required. When the event is coded as "Call Home Required," the event details together with the information associated with the FRU provide the extra data needed to determine the issue on the server.








Unlike the error shown above, user notification events are not always as cut and dried when you are trying to determine whether the alert requires administrator attention or only monitoring for trends. The HMC links to its own internal database of error code information, or out to the IBM Knowledge Center if it can connect to the Internet. The most up-to-date information can usually be found by searching IBM Knowledge Center online. The following references will aid you the most in your search.

Reference codes
https://www.ibm.com/support/knowledgecenter/POWER8/p8eai/reference_codes_parent.htm?cp=POWER8

SRNs
http://www.ibm.com/support/knowledgecenter/en/P8DEA/p8eai/srn_info.htm

Failing function codes
https://www.ibm.com/support/knowledgecenter/en/8231-E1D/p7eb7/ffckickoff.htm

Some common SRNs that show up on systems as user notifications are temporary adapter-related errors. These usually begin with a "#" (pound sign), such as #2E38xxxx, #2E17xxxx, and many similar codes. The Failing function codes document gives good information about the type of device and what the codes generally indicate.

For example, the Failing function codes list shows that #2E38xxxx and #2E17xxxx relate to network adapters. Click the links referenced below to see more documentation about the adapters. You might also need to review the detailed event text, as was shown above for the disk failure, to see whether an LPAR was identified.

Failing function code 2E38
An Ethernet PCIe adapter might be failing.
https://www.ibm.com/support/knowledgecenter/en/8231-E1D/p7eb7/ffc2e3b.htm

Failing function code 2E17
An adapter might be failing.
https://www.ibm.com/support/knowledgecenter/en/8231-E1D/p7eb7/ffc2e17.htm
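
The lookup described above can be sketched in a few lines of Python. This is only an illustration, not an IBM tool. It assumes that the four characters after the "#" are the failing function code, as the #2E38xxxx and #2E17xxxx examples suggest. The function names and the two-entry table are hypothetical, and the descriptions are copied from the entries quoted above.

```python
import re

# Descriptions copied from the two failing function code entries quoted above.
# A real lookup would consult the full Failing function codes list.
FFC_DESCRIPTIONS = {
    "2E38": "An Ethernet PCIe adapter might be failing.",
    "2E17": "An adapter might be failing.",
}

# Assumption: an SRN of this kind is an optional "#", four hex digits
# (the failing function code), then four more alphanumeric characters.
_SRN_PATTERN = re.compile(r"^#?([0-9A-Fa-f]{4})[0-9A-Za-z]{4}$")


def failing_function_code(srn):
    """Return the four-character failing function code, or None if no match."""
    match = _SRN_PATTERN.match(srn.strip())
    if match is None:
        return None
    return match.group(1).upper()


def describe_srn(srn):
    """Return the known description for an SRN, or a pointer to the full list."""
    ffc = failing_function_code(srn)
    if ffc is None:
        return "Not recognized as a #-prefixed SRN."
    return FFC_DESCRIPTIONS.get(
        ffc,
        "No local entry for %s; check the Failing function codes list." % ffc,
    )


if __name__ == "__main__":
    for code in ("#2E38xxxx", "#2E17xxxx"):
        print(code, "->", describe_srn(code))
```

A code that is missing from the table is reported as unknown rather than guessed at, so the administrator is sent back to the Failing function codes document.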

Unlike call home required errors, user notification events usually just indicate a temporary condition that occurred. With network adapters, such conditions are usually associated with link loss. Again, use the HMC error details together with the error report in your LPARs and the IBM Knowledge Center documents to decide whether the condition is a concern or one that can simply be monitored more closely. Events that IBM Development has deemed to need attention from IBM Hardware Service generate a call home required reference code. If you seek assistance from IBM Support for a call home required issue, open a hardware service request against the system designated in the event.
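
The triage described in this document can be summarized as a small decision function. The sketch below is a reading aid only. The `ServiceEvent` class and its field names are hypothetical and do not correspond to any HMC interface. The returned text paraphrases the guidance given above.

```python
from dataclasses import dataclass


@dataclass
class ServiceEvent:
    """Hypothetical record of the facts an administrator reads off the HMC."""
    reference_code: str
    system: str
    call_home_required: bool
    call_home_configured: bool


def next_step(event):
    """Return the suggested follow-up for an event, per the guidance above."""
    if event.call_home_required and event.call_home_configured:
        # The HMC opens the service request and sends diagnostic data itself.
        return "HMC calls home automatically for %s." % event.system
    if event.call_home_required:
        # No call home: the administrator reviews the event and FRU details
        # and opens a hardware service request against the named system.
        return "Open a hardware service request for %s." % event.system
    # User notification: usually a temporary condition worth watching.
    return (
        "Review HMC details, the LPAR error report, and IBM Knowledge Center; "
        "monitor %s for a trend." % event.system
    )


if __name__ == "__main__":
    # "Server-A" is a made-up system name used only for this example.
    event = ServiceEvent("#2E38xxxx", "Server-A", False, True)
    print(next_step(event))
```

The function only orders the questions an administrator would ask. The actual decision still depends on the event details and the referenced documentation.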


Document Information

Modified date:
17 June 2018

UID

isg3T1023944