Special instructions to remove latent Parity Inconsistency (PI) errors and upgrade a disk drive to a new firmware version

Technote (troubleshooting)


Problem (Abstract)

IBM has received reports of parity inconsistency (PI) errors on disk drive models running certain firmware releases when used as 'parity disks'. The affected disk drive identifiers, and the IBM® System Storage™ N series storage systems or storage expansion units in which these disk drives are used, are provided in the specific alerts issued for each type (FC, SAS, etc.) of disk drives that experience the PI errors. The instructions provided in this document must be used to upgrade the affected disk drives with a firmware release which resolves the reported PI errors.

Cause

The PI errors do not cause issues with data availability, and the storage system continues to function in the presence of the errors. The errors are detected and corrected automatically by Data ONTAP® during the RAID parity scrub. However, parity disks that experience PI errors are more susceptible to repeat occurrences. PI errors may affect the data integrity of the storage system if a data disk fails before RAID parity scrub is able to correct the parity inconsistencies.

Environment

IBM® System Storage™ N series storage systems or storage expansion units in which the affected disk drives are used

Resolving the problem

If an N series storage system has experienced a PI error in the last 3 months (as described in the alerts announcing the firmware versions available for the affected disk drives), the procedure described in these special instructions must be used to upgrade the disk drive firmware to the new releases.

If an N series storage system has not experienced a PI error in the last 3 months, the normal procedure can be followed to upgrade to the new release of the disk drive firmware.
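
The choice between the special and the normal procedure can be sketched as a small check. The Python below is illustrative only: the function name choose_upgrade_procedure is hypothetical, and treating "the last 3 months" as 90 days is an assumption that this document does not state.

```python
from datetime import date, timedelta
from typing import Optional

# Assumption: "the last 3 months" is approximated as 90 days.
PI_ERROR_WINDOW = timedelta(days=90)


def choose_upgrade_procedure(last_pi_error: Optional[date], today: date) -> str:
    """Return 'special' if a PI error falls inside the window, else 'normal'.

    last_pi_error is the date of the most recent PI error, or None if the
    storage system has never reported one.
    """
    if last_pi_error is not None and today - last_pi_error <= PI_ERROR_WINDOW:
        return "special"
    return "normal"
```

For example, a system whose most recent PI error was 48 days ago would be routed to the special procedure, while a system with no recorded PI error would follow the normal one.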

Instructions for upgrading HDD firmware are available from the N series support website:

  • Refer to the Important information for N series support if you have not created and registered your IBM ID to access N series content on the Web.
  • Follow the appropriate special procedures described in this notification to upgrade the disk drive firmware.

There are two methods for upgrading disk drive firmware:
  1. Nondisruptive drive firmware upgrade
  2. Disruptive drive firmware upgrade

Please choose one of these methods and follow the instructions below:

Nondisruptive drive firmware upgrade:

A nondisruptive disk drive firmware upgrade can be performed on any storage system where disk drives are part of either RAID-DP or 'Syncmirrored' RAID-4/RAID-DP aggregates. This procedure uses the background disk drive firmware upgrade capability, and it does not require a storage system reboot. The background disk drive firmware upgrade is available on all Data ONTAP releases running on N series storage systems.
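
The eligibility rule above can be written as a simple predicate. This is a minimal sketch: the function name and the RAID type spellings it accepts are assumptions made for illustration, not Data ONTAP output formats.

```python
def supports_nondisruptive_upgrade(raid_type: str, syncmirrored: bool) -> bool:
    """Apply the eligibility rule for one aggregate.

    RAID-DP aggregates qualify whether or not they are mirrored.
    RAID-4 aggregates qualify only when they are SyncMirrored.
    Any other RAID type is treated as not eligible.
    """
    # Normalize spellings such as "RAID-DP", "raid_dp" and "raid4".
    raid = raid_type.strip().lower().replace("-", "").replace("_", "")
    if raid == "raiddp":
        return True
    if raid == "raid4":
        return syncmirrored
    return False
```

A storage system would qualify only if the predicate holds for every aggregate that contains affected disk drives.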

Because a normal RAID parity scrub in a storage system may take a long time (hours to days) to complete, it is preferable to use manual RAID parity scrubs to speed up the parity scrub process.

The series of steps involved are as follows:

1. Record option settings:
    These options are hidden and are not visible to the customer. However, they can be viewed and set by specifying their complete names. Before setting the options, it is advised to record their original values. This step must be performed on both nodes of the High Availability (HA) pair.
    options raid.h.disk.skip_mask.enable
    options raid.lost.write.threshold
    options disk.read_after_write_verify_fua.enable
    options disk.read_after_write_verify.enable

2. Set options as follows:
    The options should be set on both nodes of the HA pair. This ensures consistent behavior in the event of an unexpected takeover while scrubs are running.
    options raid.h.disk.skip_mask.enable off
    options raid.lost.write.threshold 0
    options disk.read_after_write_verify_fua.enable off
    options disk.read_after_write_verify.enable on
    The purpose of these settings is to prevent new PI errors from being generated. The storage system may experience a slight write performance degradation while these settings are in effect.

3. Reset the scrub progress:
    This step ensures that only one complete scrub cycle is required to clean any latent PI errors in RAID groups.
    aggr scrub stop

4. Start the scrub on RAID groups (or aggregates) containing the affected disk drive FRU part numbers identified in the Alerts which reference these instructions.
    aggr scrub start [aggregate | raidgroup]
      (where aggregate or raidgroup is the identifier of the aggregate or RAID group to be scrubbed)

5. Wait for the scrubs to complete on all aggregates/RAID groups as started in step 4.
    Use the following command to monitor and determine if the scrubs have completed.

    aggr scrub status -v

6. Start the disk drive firmware upgrade, or upgrade the storage system to the Data ONTAP release which has bundled the new disk drive firmware release.
    Make sure the 'raid.background_disk_fw_update.enable' option is 'on'. The default value for this option is 'on'.

7. Verify the disk drive firmware upgrade has completed.
    The command 'sysconfig -a' can be used to verify that the intended disk drives have been upgraded to the new disk drive firmware release.

8. Revert options to the values recorded in Step 1. The default values are:
    options raid.lost.write.threshold 1
    options disk.read_after_write_verify_fua.enable on
    options disk.read_after_write_verify.enable off
    options raid.h.disk.skip_mask.enable on
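
The option and scrub commands in the steps above can be assembled into an ordered plan, which is useful for reviewing what will be entered on each node of the HA pair. The following Python is a sketch under stated assumptions: it only builds command strings and does not connect to or run anything on a storage system, the helper names are hypothetical, and falling back to the documented defaults when no recorded value exists is a design choice of this example rather than part of the procedure.

```python
from typing import Dict, List

# Values from step 2 of this document.
SCRUB_SETTINGS: Dict[str, str] = {
    "raid.h.disk.skip_mask.enable": "off",
    "raid.lost.write.threshold": "0",
    "disk.read_after_write_verify_fua.enable": "off",
    "disk.read_after_write_verify.enable": "on",
}

# Default values listed in step 8 of this document.
DEFAULTS: Dict[str, str] = {
    "raid.lost.write.threshold": "1",
    "disk.read_after_write_verify_fua.enable": "on",
    "disk.read_after_write_verify.enable": "off",
    "raid.h.disk.skip_mask.enable": "on",
}


def option_commands(values: Dict[str, str]) -> List[str]:
    """Render one 'options <name> <value>' command per entry."""
    return [f"options {name} {value}" for name, value in values.items()]


def scrub_commands(targets: List[str]) -> List[str]:
    """Steps 3 to 5: reset scrub progress, start scrubs, then check status."""
    commands = ["aggr scrub stop"]
    commands += [f"aggr scrub start {target}" for target in targets]
    commands.append("aggr scrub status -v")
    return commands


def revert_commands(recorded: Dict[str, str]) -> List[str]:
    """Step 8: prefer values recorded in step 1, else the documented default."""
    merged = {**DEFAULTS, **recorded}
    return option_commands(merged)
```

Calling option_commands(SCRUB_SETTINGS), then scrub_commands(["aggr0"]), then revert_commands(recorded) yields the commands in the order the procedure uses them; "aggr0" is a placeholder aggregate name, and recorded is the mapping of option names to the values noted in step 1.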

Disruptive drive firmware upgrade:

If it is acceptable to have downtime (for example, there is a planned maintenance window), then disk drive firmware can be upgraded during reboot. The advantage of this approach is that it does not require a scrub cycle to be run to clean any latent PI errors.

The series of steps involved are as follows:

1. Disable automatic background disk drive firmware upgrade:
    options raid.background_disk_fw_update.enable off
    Disabling automatic background firmware upgrade ensures that a boot-time disk drive firmware upgrade will occur.

2. Perform appropriate setup to ensure that disk drive firmware gets upgraded on the next reboot.

3. Initiate disk drive firmware upgrade on boot.
    Note: This is a disruptive step and client access will be disabled for a short time. Firmware upgrades should start after the reboot. For a High Availability (HA) configuration, the partner must be in a responsive state.

4. Verify that the disk drive firmware upgrade has completed.
    The command 'sysconfig -a' can be used to verify that the intended disk drives have been upgraded to the new disk drive firmware release.

5. Set the background disk drive firmware upgrade option to its default value.
    options raid.background_disk_fw_update.enable on
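
The disruptive procedure can be kept as a checklist that pairs each step with its command where this document gives one. The Python below is a minimal sketch: the Step type and render_checklist helper are hypothetical names, and steps 2 and 3 carry no command because none is specified here.

```python
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    description: str
    command: Optional[str]  # None where this document gives no command


DISRUPTIVE_STEPS: List[Step] = [
    Step("Disable automatic background disk drive firmware upgrade",
         "options raid.background_disk_fw_update.enable off"),
    Step("Set up the disk drive firmware upgrade for the next reboot", None),
    Step("Reboot to initiate the disk drive firmware upgrade", None),
    Step("Verify that the disk drive firmware upgrade has completed",
         "sysconfig -a"),
    Step("Restore the background upgrade option to its default value",
         "options raid.background_disk_fw_update.enable on"),
]


def render_checklist(steps: List[Step]) -> List[str]:
    """Return numbered checklist lines, flagging steps without a command."""
    lines = []
    for number, step in enumerate(steps, start=1):
        if step.command:
            suffix = f": {step.command}"
        else:
            suffix = " (manual step; no command given here)"
        lines.append(f"{number}. {step.description}{suffix}")
    return lines
```

Printing the result of render_checklist(DISRUPTIVE_STEPS) gives a five-line list that can be ticked off during the maintenance window.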


Cross reference information

Segment | Product | Component | Platform | Version | Edition
Network Attached Storage (NAS) | System Storage EXN2000 FC Expansion unit (2863-001) | Hard Disk Drives (HDD) | | |
Network Attached Storage (NAS) | System Storage EXN4000 Storage Expansion unit (2863-004) | Hard Disk Drives (HDDs) | | |
Network Attached Storage (NAS) | System Storage N3300 M/T 2859-A10- A20 | Hard Disk Drives (HDD) | | |
Network Attached Storage (NAS) | System Storage N3400 M/T 2859-A11- A21 | Hard Disk Drives (HDD) | | |
Network Attached Storage (NAS) | System Storage N3600 M/T 2862-A10- A20 | Hard Disk Drives (HDD) | | |

Document information

More support for: N series Disk drive firmware
Version: Not Applicable
Operating system(s): Data ONTAP
Reference #: S1003638
Modified date: 2011-03-04
