Stacking data sets

When two or more data sets are placed on the same tape volume or set of tape volumes, the data sets are said to be stacked. Use data set stacking to increase the efficiency of tape media use and to decrease the number of tape volumes needed by allocation. Data set stacking is also useful when you send data offsite; you can group related data sets together on a reduced number of tape volumes.

A data set collection is a group of data sets you intend to allocate on the same tape volume or set of tape volumes as a result of data set stacking. You can stack data sets on a single volume (that is, a data set resides on one volume but shares that volume with at least one other data set). You can also stack data sets on multiple volumes (that is, a data set spans two or more volumes and shares at least one of those volumes with one or more data sets or portions of data sets). The stacking might be done in one step, in multiple steps within the same job, or in different jobs.

You can request data set stacking by specifying the data set sequence number on the LABEL parameter in combination with either the volume reference (VOL=REF) or volume serial (VOL=SER) subparameters. You can also use the UNIT=AFF subparameter to reduce the number of tape drives required. VOL=SER is not recommended, and it is required only when the existing data sets cannot be referenced by the catalog, or specific volumes must be used for output.
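As a minimal sketch of the VOL=SER form described above, the following DD statement stacks a new data set as the second data set on a specific volume. The names dsnB and volser are hypothetical placeholders, and the sketch assumes that one data set already exists on that volume. Where the existing data set can be found through the catalog, VOL=REF remains the preferred form.

```jcl
//jobname   JOB  ..........
//stepname  EXEC PGM=pgmname
//* Hypothetical sketch: dsnB and volser are placeholders.
//* LABEL=(2,SL) requests the second data set position on the tape.
//OUTDD1    DD DSN=dsnB,
//             UNIT=TAPE,LABEL=(2,SL),
//             DISP=(NEW,CATLG),
//             VOL=SER=volser
```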

The following table shows the JCL parameters that IBM® recommends for requesting data set stacking. For example, to stack multiple data sets created in different steps of a job on the same tape volume, specify a volume reference to the DD statement that placed the previous data set on the tape.

Because relative GDG names cannot be used in the VOL=REF subparameter, IBM recommends the technique shown in Example 5 in Examples of data set stacking when you need to refer to a relative generation data set by data set name in a VOL=REF subparameter.

Table 1. IBM-Recommended Parameters for Data Set Stacking
First data set created in the step:
  First step of first job: No VOL=REF, because this is the first data set on the tape.
  First step of subsequent job: VOL=REF by data set name to the last data set on the stacked tape. (1)(2)
  Steps 2 through 'n' of any job: VOL=REF by DD name to the last stacked DD in the previous step. (2)

Data sets 2 through 'n' created in the step (the same in the first step of the first job, the first step of a subsequent job, and steps 2 through 'n' of any job):
  VOL=REF by DD name to the immediately preceding stacked DD in this step. (3)
  UNIT=AFF to the immediately preceding stacked DD in this step. (3)

Notes:
  1. See Example 5 for a special consideration if the data set name to be referred to is a generation data set (GDS) that uses a relative reference.
  2. In these cases, at least one of the volume serial numbers on which the data set collection currently resides (that is, a volume on which the previously created data sets reside) is known. See Examples 2, 4, and 5.
    • If none of the volume serial numbers on which data set stacking is to be done are known at the start of a step, then if any DD statement within that step extends onto another volume, the following DD statements within that step will know about the new volumes and will correctly stack their data sets onto the end of the new volumes. For example:
      //jobname   JOB  ..........
      //stepname  EXEC PGM=pgmname
      //OUTDD1    DD DSN=dsnA,    or    dsnA(+1),
      //             UNIT=TAPE,LABEL=(1,SL),        
      //             DISP=(NEW,CATLG),          
      //             VOL=(,RETAIN,,99)                 
      //OUTDD2    DD DSN=dsnB,    or    dsnB(+1),
      //             UNIT=AFF=OUTDD1,LABEL=(2,SL),     
      //             DISP=(NEW,CATLG),          
      //             VOL=(,RETAIN,,99,REF=*.OUTDD1)    
      //OUTDD3    DD DSN=dsnC,    or    dsnC(+1),
      //             UNIT=AFF=OUTDD2,LABEL=(3,SL),     
      //             DISP=(NEW,CATLG),          
      //             VOL=(,,,99,REF=*.OUTDD2)

      Because dsnA will be the first data set on the tape, no volume serial numbers are known at the start of the step, either explicitly (through the VOL=SER subparameter) or implicitly (through the VOL=REF subparameter). Therefore, if dsnA starts on volume1 and ends on volume2, OUTDD2 will know about both of those volume serial numbers and will correctly stack dsnB after dsnA on volume2. Similarly, if dsnB then starts on volume2 and ends on volume3, OUTDD3 will know about both of those volume serial numbers and will correctly stack dsnC after dsnB on volume3.

    • If any of the volume serial numbers on which data set stacking is to be done are known at the start of a step, then if any DD statement within that step extends onto another volume, the following DD statements within that step will not know about the new volumes and will incorrectly attempt to stack their data sets onto the end of what was the last volume when that step began. For example:
      //jobname   JOB  ..........
      //STEP1     EXEC PGM=pgmname
      //OUTDD1    DD DSN=dsnA,    or    dsnA(+1),
      //             UNIT=TAPE,LABEL=(1,SL),        
      //             DISP=(NEW,CATLG,DELETE),          
      //             VOL=(,RETAIN,,99)                  
      //*
      //STEP2     EXEC PGM=pgmname
      //OUTDD2    DD DSN=dsnB,    or    dsnB(+1),
      //             UNIT=TAPE,LABEL=(2,SL),            
      //             DISP=(NEW,CATLG),              
      //             VOL=(,RETAIN,,99,REF=*.STEP1.OUTDD1)
      //OUTDD3    DD DSN=dsnC,    or    dsnC(+1),
      //             UNIT=AFF=OUTDD2,LABEL=(3,SL),         
      //             DISP=(NEW,CATLG),              
      //             VOL=(,,,99,REF=*.OUTDD2)  

      Because dsnA will be the first data set on the tape, no volume serial numbers are known at the start of STEP1, either explicitly (through the VOL=SER subparameter) or implicitly (through the VOL=REF subparameter). Therefore, if dsnA starts on volume1 and ends on volume2, OUTDD2 will know about both of those volume serial numbers and will correctly stack dsnB after dsnA on volume2. However, those volume serial numbers are already known at the start of STEP2, so if dsnB then starts on volume2 and ends on volume3, OUTDD3 will still know only about volume2 and will incorrectly attempt to stack dsnC after dsnB on volume2. The system will not allow this, and the job will abend. If this is a possibility, it is better to stack only one data set per step, as shown in Example 3 in Examples of data set stacking. This approach limits the number of data sets that a single job can stack to 255, which is the maximum number of steps in a job.

  3. For example, the DD statement for the 52nd stacked data set of the data set collection should specify VOL=REF and UNIT=AFF referring back to the DD statement for the 51st stacked data set of the collection.
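The one-data-set-per-step approach suggested in note 2 can be sketched as follows. This is an illustrative sketch in the style of the examples above, not the text of Example 3; the step names, DD names, and data set names are placeholders. Each step creates a single stacked data set and refers back by DD name to the stacked DD in the previous step, so no DD statement depends on another DD in its own step that might extend onto a new volume.

```jcl
//jobname   JOB  ..........
//STEP1     EXEC PGM=pgmname
//* First data set on the tape, therefore no VOL=REF.
//OUTDD1    DD DSN=dsnA,
//             UNIT=TAPE,LABEL=(1,SL),
//             DISP=(NEW,CATLG),
//             VOL=(,RETAIN,,99)
//*
//STEP2     EXEC PGM=pgmname
//* Refer back to the only stacked DD in the previous step.
//OUTDD2    DD DSN=dsnB,
//             UNIT=TAPE,LABEL=(2,SL),
//             DISP=(NEW,CATLG),
//             VOL=(,RETAIN,,99,REF=*.STEP1.OUTDD1)
//*
//STEP3     EXEC PGM=pgmname
//OUTDD3    DD DSN=dsnC,
//             UNIT=TAPE,LABEL=(3,SL),
//             DISP=(NEW,CATLG),
//             VOL=(,,,99,REF=*.STEP2.OUTDD2)
```

UNIT=AFF is omitted here because each step contains only one stacked DD statement, so there is no preceding stacked DD in the same step to refer to.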

Sample parameters:

VOLUME=REF to previous DD in this step: VOLUME=REF=*.ddname

VOLUME=REF to last DD in previous step: VOLUME=REF=*.stepname.ddname or VOLUME=REF=*.stepname.procstepname.ddname

VOLUME=REF to last data set on the tape (used in the first step of a different job): VOLUME=REF=datasetname
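As a sketch of the last form, the first step of a subsequent job might add a data set to an existing stack by referring to the last data set on the tape by name. The sketch assumes that an earlier job already stacked three data sets and cataloged the last one as dsnC; dsnD, OUTDD4, and the sequence number 4 are illustrative placeholders.

```jcl
//jobname   JOB  ..........
//STEP1     EXEC PGM=pgmname
//* Assumes dsnC is cataloged and is the last (third) data set on the tape.
//* VOL=REF by data set name obtains the volume serials from dsnC.
//OUTDD4    DD DSN=dsnD,
//             UNIT=TAPE,LABEL=(4,SL),
//             DISP=(NEW,CATLG),
//             VOL=(,RETAIN,,99,REF=dsnC)
```

If dsnC were a generation data set identified by a relative reference, this form would not apply directly, as described in note 1.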