Joining fields from different data sets

You can use ICETOOL's SPLICE operator to create output records in a variety of ways by splicing together up to 50 fields from records that have the same ON field values, but different information. The output records are sorted by the ON field values. The records to be spliced can originate from different input data sets, making it possible to perform various "join" and "match" operations.

You can use up to 10 ON fields; all of the ON field values are used for sorting and splicing. For example, if you use ON(1,4,CH), 'ABCD+01' and 'ABCD-01' are sorted in that order (by 1,4,CH,A) and counted as two occurrences of 'ABCD'. However, if you use ON(1,4,CH) and ON(5,3,FS), 'ABCD-01' is sorted before 'ABCD+01' (by 1,4,CH,A and 5,3,FS,A) and counted as one occurrence of 'ABCD-01' and one occurrence of 'ABCD+01'.

To do a join or match operation on two input data sets, you need to have the fields aligned appropriately for each pair of records to be spliced. Typically, you accomplish that by copying and reformatting one or both input data sets to temporary data sets, so you can splice the temporary data sets together.

Suppose you have two input data sets, REGION.IN1 and REGION.IN2, as shown in Table 1 and Table 2.

Table 1. REGION.IN1 data set for join
Region Headquarters Regional Director

1   5

6         20

21         35

East
West
North
South

Philadelphia
San Jose
Boston
Charlotte

C. Kent
B. Wayne
P. Parker
D. Prince

Table 2. REGION.IN2 data set for join
Office Region Employees Evaluation Established

1  4

5   9

10  13

14      23

24  27

0001
0001
0002
0002
0001
0003
0002
0003
0001
0004

East
South
South
East
West
East
West
South
North
South

0050
0023
0068
0125
0052
0028
0105
0054
0200
0070

Fair
Good
Fair
Excellent
Good
Good
Excellent
Fair
Fair
Good

1983
1976
1978
1986
1995
1994
2001
1992
1991
2002

From these two input data sets, you want to create an output data set, REGION.OUT. For each record in REGION.IN2, you want to look up the corresponding Region in REGION.IN1, and combine fields from the two records into one output record in REGION.OUT, as shown in Table 3.

Table 3. REGION.OUT data set for join
Office Region Regional Director Employees Evaluation Headquarters

1  4

5   9

10        24

25  28

29        38

39        53

0001
0002
0003
0001
0001
0002
0003
0004
0001
0002

East
East
East
North
South
South
South
South
West
West

C. Kent
C. Kent
C. Kent
P. Parker
D. Prince
D. Prince
D. Prince
D. Prince
B. Wayne
B. Wayne

0050
0125
0028
0200
0023
0068
0054
0070
0052
0105

Fair
Excellent
Good
Fair
Good
Fair
Fair
Good
Good
Excellent

Philadelphia
Philadelphia
Philadelphia
Boston
Charlotte
Charlotte
Charlotte
Charlotte
San Jose
San Jose

Write the following ICETOOL statements and JCL statements to create REGION.OUT from REGION.IN1 and REGION.IN2:
* Reformat REGION.IN1 to T1 so it can be spliced
  COPY FROM(REGNIN1) TO(T1) USING(CTL1)
* Reformat REGION.IN2 to T1 so it can be spliced
  COPY FROM(REGNIN2) TO(T1) USING(CTL2)
* Splice records in T1 with matching ON fields
  SPLICE FROM(T1) WITHALL -
    ON(5,5,CH)   -   Region
    WITH(1,4)    -   Office
    WITH(25,4)   -   Employees
    WITH(29,10)  -   Evaluation
    TO(REGNOUT)
/*
//REGNIN1 DD DSN=A123456.REGION.IN1,DISP=SHR
//REGNIN2 DD DSN=A123456.REGION.IN2,DISP=SHR
//T1 DD DSN=&&&T1,UNIT=3390,SPACE=(CYL,(5,5)),DISP=(MOD,PASS)
//REGNOUT DD DSN=A123456.REGION.OUT,DISP=(NEW,CATLG,DELETE),UNIT=3390,
//  SPACE=(CYL,(5,5))
//CTL1CNTL DD *
* Move REGION.IN1 fields to their locations for the
* output data set
  OUTREC FIELDS=(5:1,5,      Region
                 10:21,15,   Regional Director
                 39:6,15)    Headquarters
/*
//CTL2CNTL DD *
* Move REGION.IN2 fields to their locations for the
* output data set
  OUTREC FIELDS=(1:1,4,      Office
                 5:5,5,      Region
                25:10,4,     Employees
                29:14,10,    Evaluation
                53:X)
/*

The first COPY operator writes records to temporary data set T1 with the fields from REGION.IN1 as shown in Table 4.

Table 4. T1 data set fields from REGION.IN1
Blanks Region Regional Director Blanks Blanks Headquarters

1  4

5   9

10     24

25  28

29     38

39        53

The second COPY operator writes records at the end (MOD) of temporary data set T1 with the fields from REGION.IN2 as shown in Table 5.

Table 5. T1 data set fields from REGION.IN2
Office Region Blanks Employees Evaluation Blanks

1  4

5   9

10     24

25  28

29     38

39        53

Because the Region field is the ON field you want SPLICE to use to match your records, you must move it to the same location in both types of reformatted records (positions 5-9). You must move all of the other fields that you want in your output records to the locations in which you want them to appear in the combined record. Put blanks in each record for the corresponding fields in the other type of record. For example, the Regional Director field appears in positions 10-24 of the reformatted REGION.IN1 records while blanks appear in the corresponding positions 10-24 of the reformatted REGION.IN2 records, and the Office field appears in positions 1-4 of the reformatted REGION.IN2 records, while blanks appear in the corresponding positions 1-4 of the reformatted REGION.IN1 records.

The FROM(T1) operand tells DFSORT that the input for the SPLICE operation is the T1 data set, which contains the reformatted REGION.IN1 and REGION.IN2 records, in that order. The order of the input records is important, because the first record for each matching ON field acts as the base record on to which the second or subsequent record is spliced.

In this case, all of the reformatted REGION.IN1 records come before all of the reformatted REGION.IN2 records, so the SPLICE operator splices the WITH fields from the reformatted REGION.IN2 records to the reformatted REGION.IN1 record whenever their ON fields match. WITHALL tells SPLICE to splice the first record with every other record for which the ON fields match. Without WITHALL, only the first and last records are spliced.

This is just one example of the many types of join operations you can do with SPLICE, using some of its available operands.

You can also use a USING data set to specify DFSORT INCLUDE, OMIT, INREC, and OUTFIL statements for your SPLICE operation. INCLUDE or OMIT and INREC statement processing is performed before SPLICE processing. OUTFIL statement processing is performed after SPLICE processing.

For complete details of all of the operands you can use with SPLICE, as well as many more examples of join operations, see z/OS DFSORT Application Programming Guide.