Joining fields from different data sets

You can use ICETOOL's SPLICE operator to create output records in a variety of ways by splicing together up to 50 fields from records that have the same ON field values, but different information. The output records are sorted by the ON field values. The records to be spliced can originate from different input data sets, making it possible to perform various "join" and "match" operations.

You can use up to 10 ON fields; all of the ON field values are used for sorting and splicing. For example, if you use ON(1,4,CH), 'ABCD+01' and 'ABCD-01' are sorted in that order (by 1,4,CH,A) and counted as two occurrences of 'ABCD'. However, if you use ON(1,4,CH) and ON(5,3,FS), 'ABCD-01' is sorted before 'ABCD+01' (by 1,4,CH,A and 5,3,FS,A) and counted as one occurrence of 'ABCD-01' and one occurrence of 'ABCD+01'.

To do a join or match operation on two input data sets, you need to have the fields aligned appropriately for each pair of records to be spliced. Typically, you accomplish that by copying and reformatting one or both input data sets to temporary data sets, so you can splice the temporary data sets together.

Suppose you have two input data sets, REGION.IN1 and REGION.IN2, as shown in Table 1 and Table 2.

Table 1. REGION.IN1 data set for join
Region	Headquarters	Regional Director
1 5	6 20	21 35
East West North South	Philadelphia San Jose Boston Charlotte	C. Kent B. Wayne P. Parker D. Prince

Table 2. REGION.IN2 data set for join
Office	Region	Employees	Evaluation	Established
1 4	5 9	10 13	14 23	24 27
0001 0001 0002 0002 0001 0003 0002 0003 0001 0004	East South South East West East West South North South	0050 0023 0068 0125 0052 0028 0105 0054 0200 0070	Fair Good Fair Excellent Good Good Excellent Fair Fair Good	1983 1976 1978 1986 1995 1994 2001 1992 1991 2002

From these two input data sets, you want to create an output data set, REGION.OUT. For each record in REGION.IN2, you want to look up the corresponding Region in REGION.IN1, and combine fields from the two records into one output record in REGION.OUT, as shown in Table 3.

Table 3. REGION.OUT data set for join
Office	Region	Regional Director	Employees	Evaluation	Headquarters
1 4	5 9	10 24	25 28	29 38	39 53
0001 0002 0003 0001 0001 0002 0003 0004 0001 0002	East East East North South South South South West West	C. Kent C. Kent C. Kent P. Parker D. Prince D. Prince D. Prince D. Prince B. Wayne B. Wayne	0050 0125 0028 0200 0023 0068 0054 0070 0052 0105	Fair Excellent Good Fair Good Fair Fair Good Good Excellent	Philadelphia Philadelphia Philadelphia Boston Charlotte Charlotte Charlotte Charlotte San Jose San Jose

Write the following ICETOOL statements and JCL statements to create REGION.OUT from REGION.IN1 and REGION.IN2:

* Reformat REGION.IN1 to T1 so it can be spliced
  COPY FROM(REGNIN1) TO(T1) USING(CTL1)
* Reformat REGION.IN2 to T1 so it can be spliced
  COPY FROM(REGNIN2) TO(T1) USING(CTL2)
* Splice records in T1 with matching ON fields
  SPLICE FROM(T1) WITHALL -
    ON(5,5,CH)   -   Region
    WITH(1,4)    -   Office
    WITH(25,4)   -   Employees
    WITH(29,10)  -   Evaluation
    TO(REGNOUT)
/*
//REGNIN1 DD DSN=A123456.REGION.IN1,DISP=SHR
//REGNIN2 DD DSN=A123456.REGION.IN2,DISP=SHR
//T1 DD DSN=&&&T1,UNIT=3390,SPACE=(CYL,(5,5)),DISP=(MOD,PASS)
//REGNOUT DD DSN=A123456.REGION.OUT,DISP=(NEW,CATLG,DELETE),UNIT=3390,
//  SPACE=(CYL,(5,5))
//CTL1CNTL DD *
* Move REGION.IN1 fields to their locations for the
* output data set
  OUTREC FIELDS=(5:1,5,      Region
                 10:21,15,   Regional Director
                 39:6,15)    Headquarters
/*
//CTL2CNTL DD *
* Move REGION.IN2 fields to their locations for the
* output data set
  OUTREC FIELDS=(1:1,4,      Office
                 5:5,5,      Region
                25:10,4,     Employees
                29:14,10,    Evaluation
                53:X)
/*

The first COPY operator writes records to temporary data set T1 with the fields from REGION.IN1 as shown in Table 4.

Table 4. T1 data set fields from REGION.IN1
Blanks	Region	Regional Director	Blanks	Blanks	Headquarters
1 4	5 9	10 24	25 28	29 38	39 53

The second COPY operator writes records at the end (MOD) of temporary data set T1 with the fields from REGION.IN2 as shown in Table 5.

Table 5. T1 data set fields from REGION.IN2
Office	Region	Blanks	Employees	Evaluation	Blanks
1 4	5 9	10 24	25 28	29 38	39 53

Because the Region field is the ON field you want SPLICE to use to match your records, you must move it to the same location in both types of reformatted records (positions 5-9). You must move all of the other fields that you want in your output records to the locations in which you want them to appear in the combined record. Put blanks in each record for the corresponding fields in the other type of record. For example, the Regional Director field appears in positions 10-24 of the reformatted REGION.IN1 records while blanks appear in the corresponding positions 10-24 of the reformatted REGION.IN2 records, and the Office field appears in positions 1-4 of the reformatted REGION.IN2 records, while blanks appear in the corresponding positions 1-4 of the reformatted REGION.IN1 records.

The FROM(T1) operand tells DFSORT that the input for the SPLICE operation is the T1 data set, which contains the reformatted REGION.IN1 and REGION.IN2 records, in that order. The order of the input records is important, because the first record for each matching ON field acts as the base record on to which the second or subsequent record is spliced.

In this case, all of the reformatted REGION.IN1 records come before all of the reformatted REGION.IN2 records, so the SPLICE operator splices the WITH fields from the reformatted REGION.IN2 records to the reformatted REGION.IN1 record whenever their ON fields match. WITHALL tells SPLICE to splice the first record with every other record for which the ON fields match. Without WITHALL, only the first and last records are spliced.

This is just one example of the many types of join operations you can do with SPLICE, using some of its available operands.

You can also use a USING data set to specify DFSORT INCLUDE, OMIT, INREC, and OUTFIL statements for your SPLICE operation. INCLUDE or OMIT and INREC statement processing is performed before SPLICE processing. OUTFIL statement processing is performed after SPLICE processing.

For complete details of all of the operands you can use with SPLICE, as well as many more examples of join operations, see z/OS DFSORT Application Programming Guide.