Compare stage
The Compare stage is a processing stage. It can have two input links and a single output link.
The Compare stage performs a column-by-column comparison of records in two presorted input data sets. You can restrict the comparison to specified key columns.
The Compare stage does not change the table definition, partitioning, or content of the records in either input data set. It transfers both data sets intact to a single output data set generated by the stage. The comparison results are also recorded in the output data set.
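The behavior described above can be approximated with a minimal Python sketch. It is not DataStage code. It assumes that records are paired by position, that both inputs have the same number of records, and that each record is a plain dictionary. The result codes (0, 1, -1) and the function names are hypothetical, chosen only for illustration; the codes the stage actually emits may differ.

```python
# Hypothetical result codes, for illustration only; the real stage defines its own.
EQUAL, FIRST_GREATER, FIRST_LESS = 0, 1, -1


def compare_records(first, second, keys=None):
    """Compare two records column by column, optionally only on key columns."""
    columns = keys if keys is not None else list(first)
    for col in columns:
        if first[col] > second[col]:
            return FIRST_GREATER
        if first[col] < second[col]:
            return FIRST_LESS
    return EQUAL


def compare_data_sets(first_set, second_set, keys=None):
    """Pair records by position; pass both records through unchanged with a result code."""
    return [
        {"result": compare_records(a, b, keys), "first": a, "second": b}
        for a, b in zip(first_set, second_set)
    ]


first_set = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
second_set = [{"id": 1, "name": "a"}, {"id": 2, "name": "c"}]

all_columns = [row["result"] for row in compare_data_sets(first_set, second_set)]
keys_only = [row["result"] for row in compare_data_sets(first_set, second_set, keys=["id"])]
print(all_columns)  # [0, -1]
print(keys_only)    # [0, 0]
```

Restricting the comparison to the key column `id` hides the difference in `name`, which is why the second pair compares as equal in the keyed run. In both runs the input records appear unchanged in the output rows.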
You can use runtime column propagation in this stage and allow InfoSphere® DataStage® to define the output column schema for you at runtime. The stage outputs a data set with three columns:
- result. Carries the code giving the result of the comparison.
- first. A subrecord containing the columns of the first input link.
- second. A subrecord containing the columns of the second input link.
If you define the output columns yourself instead of relying on runtime column propagation, set them up as follows:
- Specify the parent column for the output data corresponding to the first input link, and set the SQL type to unknown.
- Specify the actual columns that carry your data and make these subrecords of the parent column. Name each column first.colname, for example first.col1, first.col2, and so on. Make each column a subrecord by selecting the column, selecting edit row from the shortcut menu, and specifying a level number (for example, 03) for that column. (You can speed up this process by making the first column a subrecord and using the propagate values feature to make the remaining columns subrecords of the parent column.)
- Specify the parent column for the output data corresponding to the second input link, and set the SQL type to unknown.
- Specify the actual columns that carry the data from the second input link, name them second.colname (for example, second.col1, second.col2), and make these subrecords of the parent column.
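As a rough illustration of the manually defined output columns described above (a parent column per input link, with the data columns as its subrecords), the following Python sketch builds the resulting column list. It is not a DataStage API: the function name and the dictionary keys `name`, `sql_type`, and `level` are hypothetical, and the type of the result column is left out because the text above does not state it.

```python
def output_columns(first_cols, second_cols, level="03"):
    """Return the output column list: result, then each parent and its subrecord columns."""
    columns = [{"name": "result"}]  # type not stated in the text, so none is given here
    for parent, cols in (("first", first_cols), ("second", second_cols)):
        # Parent column for one input link, with the SQL type set to unknown.
        columns.append({"name": parent, "sql_type": "Unknown"})
        # Data columns named parent.colname, made subrecords through a level number.
        columns.extend({"name": f"{parent}.{col}", "level": level} for col in cols)
    return columns


for column in output_columns(["col1", "col2"], ["col1", "col2"]):
    print(column)
```

The order of the list mirrors the steps: each parent column is followed immediately by its subrecord columns, and every subrecord column carries the same level number.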
The stage editor has three pages:
- Stage Page. This is always present and is used to specify general information about the stage.
- Input Page. This is where you specify details about the two input data sets that are being compared.
- Output Page. This is where you specify details about the processed data being output from the stage.