IBM Support

N3001-001 IBM Netezza PureData System for Analytics Host reboots

Troubleshooting


Problem

Hosts reboot unexpectedly

Symptom

One or both N3001-001 hosts reboot unexpectedly.

Cause

Power to the host power supplies is interrupted.

Environment

N3001-001 IBM Netezza PureData System for Analytics

Diagnosing The Problem

Check the output of the last reboot command to determine how often the host reboots:
[root@nzdev01 ~]# last reboot
system boot 2.6.32-431.17.1. Thu Feb 18 03:05 - 15:57 (12:51)
system boot 2.6.32-431.17.1. Wed Feb 17 07:07 - 15:57 (1+08:50)
system boot 2.6.32-431.17.1. Tue Feb 16 02:32 - 15:57 (2+13:24)
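
To compare boot times against other logs, it can help to reduce the last reboot output to just the boot timestamps. The following is a minimal sketch, not an IBM-supplied tool: the function name boot_times is hypothetical, and it assumes each boot record contains the text "system boot" followed somewhere by a three-letter weekday, month, day, and HH:MM time, as in the sample above.

```shell
# Print "<weekday> <month> <day> <HH:MM>" for each boot record read on stdin.
# Assumption: records contain "system boot" and a three-letter weekday field.
boot_times() {
  awk '/system boot/ {
    for (i = 1; i <= NF - 3; i++) {
      if ($i ~ /^(Mon|Tue|Wed|Thu|Fri|Sat|Sun)$/) {
        # The weekday is followed by month, day of month, and boot time.
        print $i, $(i+1), $(i+2), $(i+3)
        break
      }
    }
  }'
}

# Usage on the host:
#   last reboot | boot_times
```

Because the function searches for the weekday field instead of relying on a fixed column, it should tolerate minor differences in the leading columns of the output.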

Check the output of ipmitool sel list for informational messages, warnings, or errors around the time of the reboots. For example:


Power Supply #0x70 | Power Supply AC lost | Asserted
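
On a host with a long event log, filtering for power supply entries makes the relevant events easier to spot. This is a small sketch under the assumption that the entries of interest contain the text "Power Supply", as in the sample entry above. The function name power_events is hypothetical.

```shell
# Keep only the System Event Log lines that mention a power supply.
# Reads `ipmitool sel list` output on stdin; matching is case-insensitive.
power_events() {
  grep -i 'power supply'
}

# Usage on the host:
#   ipmitool sel list | power_events
```

Compare the timestamps of any matching entries with the boot times reported by last reboot.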

Note that if both power supplies lose power at the same time, the system cannot record the power loss for both power supplies. If the timing is just right, the system might not record the loss of either power supply, but this is rare.

Look for any environmental causes for power loss, such as the data center having power problems.

If the host /var/log/messages file contains no messages about a fencing event and the IPMI log contains power loss entries, suspect an external power cause.
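
The rule above can be sketched as a small shell function. This is illustrative only: the function name suspect_external_power is hypothetical, the search string "fenc" is an assumed pattern for fencing messages (the exact message text is not given here), and "AC lost" is taken from the sample IPMI entry shown earlier.

```shell
# $1 = file holding saved `ipmitool sel list` output
# $2 = syslog file to inspect (for example /var/log/messages)
# Assumption: fencing messages contain "fenc"; power loss entries contain "AC lost".
suspect_external_power() {
  sel_file=$1
  log_file=$2
  if grep -qi 'AC lost' "$sel_file" && ! grep -qi 'fenc' "$log_file"; then
    echo "suspect external power cause"
  else
    echo "inconclusive"
  fi
}

# Usage on the host:
#   ipmitool sel list > /tmp/sel.txt
#   suspect_external_power /tmp/sel.txt /var/log/messages
```

An "inconclusive" result only means that the two conditions were not both met; it does not rule out a power problem.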

If the host is connected to a UPS, find out whether the UPS runs a self-test at the time of the reboots. During a self-test, a weak UPS battery might be unable to sustain the power requirement of the host, which causes the host to reboot.

This can be spotted fairly easily if the reboots occur at the same, or nearly the same, time of day.
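
One way to look for such a pattern is to count boots by hour of day. The sketch below is an illustration, not an IBM-supplied tool: the function name boots_per_hour is hypothetical, and it assumes the first HH:MM field on each "system boot" line is the boot time, as in the last reboot sample shown earlier.

```shell
# Count boot records per hour of day from `last reboot` output on stdin.
# Prints "<hour> <count>" lines sorted by hour.
boots_per_hour() {
  awk '/system boot/ {
    for (i = 1; i <= NF; i++) {
      # The first HH:MM field on the line is taken as the boot time.
      if ($i ~ /^[0-2][0-9]:[0-5][0-9]$/) {
        split($i, t, ":")
        count[t[1]]++
        break
      }
    }
  }
  END { for (h in count) print h, count[h] }' | sort
}

# Usage on the host:
#   last reboot | boots_per_hour
```

A single hour with a high count suggests a scheduled event, such as a UPS self-test, around that time.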

Resolving The Problem

If the host is not connected to a UPS, obtaining a UPS is strongly advised for consistent power.

If the host is connected to a UPS, have the health of the UPS checked or, if it is a smart UPS, check any logs that might be available.

If the host is not connected to a UPS and a UPS is available, the UPS can be used to determine whether the issue is an incoming power issue rather than a power supply issue.

If a UPS is not available, suspect that the PDU might have an issue.


Document Information

Modified date:
17 October 2019

UID

swg21979502