IBM Support

Data Collection for JFS/JFS2 Filesystem Corruption PSI

Question & Answer


Question

What data should I collect in order to provide the best possibility for problem source identification of JFS or JFS2 filesystem or data corruption?

Cause

Information must be collected from a system that has experienced JFS or JFS2 filesystem or data corruption so that Problem Source Identification (PSI) can be performed.

Answer


The following is a comprehensive list of data that might be required to determine the source of filesystem or data corruption within a JFS or JFS2 filesystem. Collect the data before taking any corrective action to repair the affected filesystem.


1) Collect a snap

A full snap must be collected from the server before any corrective action is taken.

Remove any existing snap data:
# snap -r

Gather a full snap, but do not compress it yet. 
# snap -a

Navigate to the empty "testcase" directory under the snap output directory:
# cd /tmp/ibmsupt/testcase

Now run the commands in steps 2-7 to add the remaining required data to the snap "testcase" directory.

2) Start a script session

To log subsequent commands along with their stdout and stderr to a file within the snap testcase directory, execute the following command:

# script myscript.out


3) Collect dumpfs

Gather dumpfs output for the filesystem while it is still in its corrupted state (before any corrective action is taken):

# /usr/sbin/dumpfs /fsname > dumpfs.fsname
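
The descriptive file names used in this and later steps can be derived from the mount point. The following is a minimal sketch, assuming a POSIX-style shell such as ksh; fs_label is a hypothetical helper (not part of AIX) and /data/app is a placeholder mount point.

```shell
# Hypothetical helper: turn a mount point into a file-name-safe label.
fs_label() {
    mount_point=$1
    # Strip the leading slash, then replace remaining slashes with underscores.
    label=$(printf '%s' "${mount_point#/}" | tr '/' '_')
    # The root filesystem would otherwise yield an empty label.
    [ -n "$label" ] || label=root
    printf '%s\n' "$label"
}

# Example (run from /tmp/ibmsupt/testcase on the affected system):
#   /usr/sbin/dumpfs /data/app > "dumpfs.$(fs_label /data/app)"
```

With this naming, output for /data/app would be saved as dumpfs.data_app, which keeps files from several filesystems distinguishable in the testcase directory.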


4) Collect filesystem metadata (JFS2 only)

There are a number of possible reasons for filesystem metadata corruption, as noted in the following technote:

If filesystem metadata corruption occurs and PSI is requested, collecting and analyzing the filesystem metadata in its corrupted state, before any corrective action, can be crucial. The following technote describes how to collect filesystem metadata for analysis:

*** Save all output into the /tmp/ibmsupt/testcase subdirectory using descriptive file names.
*** The metacapture flag is only valid on JFS2 filesystems at AIX 5.2 TL8 and above, AIX 5.3 and AIX 6.1


5) Collect fileplace information

If data corruption is suspected (rather than filesystem metadata corruption), use the fileplace command to record how the affected file's blocks are placed on disk, which may help determine the cause of the data corruption.
*** Save all output into the /tmp/ibmsupt/testcase subdirectory using descriptive file names.
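
As one possible way to capture fileplace output under a descriptive name, the sketch below wraps the command in a hypothetical collect_fileplace function. It assumes the -p (physical placement) and -v (verbose) flags of the AIX fileplace command are the ones wanted for the case; confirm with IBM Support which options are needed. The file path in the example is a placeholder.

```shell
# Hypothetical helper: save fileplace output for one suspect file.
# Run from /tmp/ibmsupt/testcase so the output lands in the snap tree.
collect_fileplace() {
    target=$1
    # Name the output file after the file being examined.
    out="fileplace.$(basename "$target").out"
    # -p: physical volume placement, -v: verbose (an assumed choice of flags).
    fileplace -pv "$target" > "$out" 2>&1
    # Report which file was written.
    printf '%s\n' "$out"
}

# Example:
#   collect_fileplace /fsname/path/to/file
```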


6) Repair the filesystem with fsck

*** If you ran fsck -yvv during step 4, skip this step.

If you have collected the information from the preceding sections and need the filesystem back in use, you can use the fsck command to attempt to repair the metadata corruption. For additional verbosity, add the "vv" flags:

# fsck -yvv /dev/<lvname>


7) Collect fscklog output on JFS2 filesystems

The undocumented 'fscklog' command lists the output of either the most recent fsck run or the run prior to it. Use the "-p" flag to list the "prior to the last" log for JFS2 filesystems. If all data is collected before any corrective action is taken, the fsck output from step 6 should be sufficient. However, if fsck was already run before data collection began, the fscklog output recording that earlier run might be useful.

To check most recent fsck log output for /fsname:
# /sbin/helpers/jfs2/fscklog /fsname > fscklog.out

To check previous fsck log output for /fsname:
# /sbin/helpers/jfs2/fscklog -p /fsname > fscklogprev.out


8) Upload the testcase for analysis

Exit the script session, then compress and upload the snap data for review:

# exit
# snap -c
# cd /tmp/ibmsupt
# mv snap.pax.Z TSNNNNNNNNN.snap.pax.Z (where TSNNNNNNNNN is your case number)
# ftp testcase.software.ibm.com

Log in as user 'anonymous', using your email address as the password.

ftp> cd /toibm/aix
ftp> bin
ftp> put TSNNNNNNNNN.snap.pax.Z
ftp> bye
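
The rename step can be guarded against typos in the case number. This is a minimal sketch only: snap_upload_name is a hypothetical helper, TS001234567 is a made-up case number, and the "TS plus nine digits" pattern is an assumption inferred from the TSNNNNNNNNN placeholder above.

```shell
# Hypothetical helper: build the upload file name from a case number.
snap_upload_name() {
    case_number=$1
    case $case_number in
        # Assumed format: "TS" followed by exactly nine digits.
        TS[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *)
            echo "unexpected case number format: $case_number" >&2
            return 1
            ;;
    esac
    printf '%s.snap.pax.Z\n' "$case_number"
}

# Example (after "snap -c", from the directory holding snap.pax.Z):
#   name=$(snap_upload_name TS001234567) && mv snap.pax.Z "$name"
```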


Document Information

Modified date:
09 December 2019

UID

isg3T1011157