Data sets that REORG TABLESPACE uses
The REORG TABLESPACE utility uses a number of data sets during its operation.
The following table describes the data sets that REORG TABLESPACE uses. The table lists the DD name that is used to identify the data set, a description of the data set, and an indication of whether it is required. Include statements in your JCL for each required data set, and any optional data sets that you want to use.
Data set | Description | Required? |
---|---|---|
SYSIN | Input data set that contains the utility control statement. | Yes |
SYSUT1 | A temporary data set for sort input. | No |
SYSPRINT | Output data set for messages. | Yes |
STPRIN01 | A data set that contains messages from the sort program (usually, SYSOUT or DUMMY). This data set is used when statistics are collected on at least one data-partitioned secondary index. |
No1 |
SYSDISC | Data set that contains discarded records from REORG DISCARD. The default DD name is SYSDISC. | No3 |
SYSPUNCH | Data set that contains a LOAD statement that is generated by REORG, which loads records that REORG DISCARD or REORG UNLOAD EXTERNAL wrote to the DISCARD or UNLOAD data sets. The default DD name is SYSPUNCH. | No4 |
UTPRINT | A data set that contains messages from the sort program (usually, SYSOUT or DUMMY). | Yes |
Unload data set | Data set that contains the unloaded data that is to be
reloaded during the RELOAD phase. Specify its DD or template name with the UNLDDN option or with the
RECDSN field on the DB2I Utilities panel. The data set must be a sequential data set that is
readable by BSAM. The default DD name is SYSREC. The unload data set must be large enough to contain all the unloaded records from all the tables in the target table space. If at least one table in the table space does not have an index, REORG cannot use the SORTDATA method with SHRLEVEL CHANGE. As a result, you must unload the data in the SYSREC data set. |
Yes5 |
Sequential copies | From one to four output data sets that are to contain the image copies. Specify their DD or template names with the COPYDDN and RECOVERYDDN options of the utility control statement. | No6 |
FlashCopy® image copies | For table space or index space level copies, a VSAM data set for the
output FlashCopy image copy of each partition or piece.
For a partition level or piece level copy, a VSAM data set for the output FlashCopy image copy of the partition or piece. |
No12 |
Work data sets | Temporary data sets for sort input and output. The DD names have the form DATAWKnn. | No7 |
Work data sets | Temporary data sets for sort input and output when sorting keys, or for sorting data when SORTDATA is specified but NOSYSREC is not. If index build parallelism is used, the DD names have the form SWnnWKmm. If index build parallelism is not used, the DD names have the form SORTWKnn | Yes8 |
Sort work data sets | Temporary data sets for sort input and output when collecting inline statistics on at least one data-partitioned secondary index. The DD names have the form ST01WKnn. | No2,9 |
Sort work data sets | Temporary data sets for unload parallelism. The DD names have the form DAnnWKmm. | Yes10 |
Print data sets | Data sets for unload parallelism. The DD names have the form
DTPRINnn.
Every time you invoke REORG TABLESPACE, new DTPRINnn data sets are dynamically allocated. REORG TABLESPACE does not reuse DTPRINnn data sets from previous job steps. This behavior might cause the available JES2 job queue elements to be consumed more quickly than expected. |
Yes10,11 |
Note:
|
The following objects are named in the utility control statement and do not require DD statements in the JCL:
- Table space
- Object that is to be reorganized.
Calculating the size of the unload data set
The required size for the unload data set varies depending on the options that you use for REORG.
- If you use REORG with UNLOAD PAUSE or CONTINUE and you specify KEEPDICTIONARY (assuming that a
compression dictionary already exists), the size of the unload data set, in bytes, is the VSAM
high-allocated RBA for the table space. You can obtain the high-allocated RBA from the associated
VSAM catalog.
For SHRLEVEL CHANGE, also add the result of the following calculation (in bytes) to the VSAM high-used RBA:
number of records * 11
- If you use REORG with UNLOAD ONLY, UNLOAD PAUSE, or CONTINUE and you do not specify
KEEPDICTIONARY, you can calculate the size of the unload data set, in bytes, by using the following
formula:
The maximum row length is the row length, including the 6-byte record prefix, plus the length of the longest clustering key. If multiple tables exist in the table space, use the following formula to determine the maximum row length:maximum row length * number of rows
For SHRLEVEL CHANGE, also add the result of the following formula to the preceding result:Sum over all tables ((row length + (2 * number of VARBIN columns)) * number of rows)
In the preceding formula:(21 * ((NEARINDREF + FARINDREF) * 1.1))
- NEARINDREF
- Is the value that is obtained from the NEARINDREF column of the SYSIBM.SYSTABLEPART catalog table. The accuracy of the data set size calculation depends on recent information in the SYSTABLEPART catalog table.
- FARINDREF
- Is the value that is obtained from the FARINDREF column of the SYSIBM.SYSTABLEPART catalog table.
- If you have variable-length fields, the calculation in step 2 might result in excessive space. Use the average uncompressed row length, multiplied by the number of rows.
- If you use REORG with UNLOAD PAUSE or CONTINUE with the DISCARD option, and the table has variable length fields, use the maximum row length in the calculation. The DISCARD option without the NOPAD option pads the variable length fields.
For certain table spaces in the catalog and directory, the unload data set for the table spaces have a different format. The calculation for the size of this data set is as follows:
data set size in bytes = (28 + longrow) * numrows
In the preceding formula:
- longrow
- Is the length of the longest row in the table space.
- numrows
- Is the number of rows in the data set.
The length of the row is calculated as follows:
Sum of column lengths + 4 bytes for each link
The length of the column is calculated as follows:
Maximum length of the column + 1 (if nullable) + 2 (if varying length)
Calculating the size of the sort work data sets
Using two or three large SORTWKnn data sets is preferable to using several small ones. If adequate space is not available, you cannot run REORG.
Sort work data sets cannot span volumes. Smaller volumes require more sort work data sets to sort the same amount of data; therefore, large volume sizes can reduce the number of sort work data sets that are needed.
When you allocate sort work data sets on disk, allocate at least 1.2 times the amount of space that is used by the data that is to be sorted.
Allocating twice the space that is used by the unload data sets is usually adequate for the sort work data sets. For compressed data, if the data needs to be uncompressed for processing, allocate additional sort space. Use the compression ratio of the existing data to calculate the additional sort space that is required. Some examples of when the compressed data needs to be uncompressed include, but are not limited to:
- UNLOAD PAUSE without KEEPDICTIONARY
- UNLOAD CONTINUE without KEEPDICTIONARY
- Discard processing
- After schema changes, such as ALTER ADD COLUMN
- When AUX YES processing is in effect
Tape devices are not supported for sort work data sets.
Specifying a destination for sort program messages
The REORG utility job step must contain a UTPRINT DD statement that defines a destination for messages that are issued by the sort program during the SORT phase of REORG. DB2I, the %DSNU CLIST command, and the DSNUPROC procedure use the following default DD statement:
//UTPRINT DD SYSOUT=A
Calculating the size of the statistics sort work data sets:
To calculate the approximate size (in bytes) of the ST01WKnn data set, use the following formula:
2 ×(maximum record length × numcols × (count + 2) × number of indexes)
The variables in the preceding formula have the following values:
- maximum record length
- Maximum record length of the SYSCOLDISTSTATS record that is processed when collecting frequency statistics (You can obtain this value from the RECLENGTH column in SYSTABLES.)
- numcols
- Number of key columns to concatenate when you collect frequent values from the specified index.
- count
- Number of frequent values that DB2® is to collect.