IBM InfoSphere DataStage and InfoSphere QualityStage, Version 8.5

Combine Records stage

The Combine Records stage is a restructure stage. It can have a single input link and a single output link.

The Combine Records stage combines records (that is, rows), in which particular key-column values are identical, into vectors of subrecords. As input, the stage takes a data set in which one or more columns are chosen as keys. All adjacent records whose key columns contain the same value are gathered into the same record as subrecords.

Shows columns being combined into a vector of subrecords
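
The grouping behavior described above can be sketched in Python. This is only an illustration of the idea, not how InfoSphere DataStage implements the stage: the function name `combine_records`, the dictionary row representation, and the output column name `subrec` are assumptions made for the example. Note that only adjacent rows are gathered, which is why the input must already be sorted on the key columns.

```python
from itertools import groupby


def combine_records(rows, key_columns, subrec_name="subrec"):
    """Gather adjacent rows with identical key values into one output row.

    Hypothetical helper: each output row holds a single column
    (subrec_name) whose value is a list ("vector") of the input
    rows ("subrecords") that shared the same key values.
    """
    def key_of(row):
        # Build a tuple so that one or more key columns work the same way.
        return tuple(row[column] for column in key_columns)

    combined = []
    # groupby only merges *adjacent* items with equal keys.
    for _, group in groupby(rows, key=key_of):
        combined.append({subrec_name: [dict(row) for row in group]})
    return combined


if __name__ == "__main__":
    sample = [
        {"keycol": 1, "value": "a"},
        {"keycol": 1, "value": "b"},
        {"keycol": 2, "value": "c"},
    ]
    for record in combine_records(sample, ["keycol"]):
        print(record)
```

In this sketch, rows with the same key that are not adjacent end up in separate output records, so unsorted input produces more output records than there are distinct keys.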

The data set input to the Combine Records stage must be key partitioned and sorted. This ensures that rows with the same key column values are located in the same partition and are processed by the same node. Choosing the (auto) partitioning method ensures that partitioning and sorting are done. If partitioning and sorting are carried out in separate stages before the Combine Records stage, InfoSphere® DataStage® in auto mode detects this and does not repartition (alternatively, you can explicitly specify the Same partitioning method).
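
To illustrate why key partitioning and sorting matter, the following Python sketch distributes rows across partitions by hashing the key columns and then sorts each partition on the same keys. It is a conceptual model only: the function name `key_partition_and_sort`, the use of CRC32 as the hash, and the list-of-lists partition layout are assumptions for the example and do not reflect the engine's actual partitioners.

```python
import zlib


def key_partition_and_sort(rows, key_columns, num_partitions):
    """Hash-partition rows on their key columns, then sort each partition.

    Hypothetical helper: rows with equal key values always hash to the
    same partition, and sorting makes them adjacent within it.
    """
    def key_of(row):
        return tuple(row[column] for column in key_columns)

    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # CRC32 of the key's text form is stable across runs,
        # unlike Python's built-in hash() for strings.
        digest = zlib.crc32(repr(key_of(row)).encode("utf-8"))
        partitions[digest % num_partitions].append(row)

    for partition in partitions:
        # Sorting within the partition places equal keys next to each other.
        partition.sort(key=key_of)
    return partitions


if __name__ == "__main__":
    sample = [{"keycol": k, "value": v} for k, v in [(2, "x"), (1, "y"), (2, "z")]]
    for index, partition in enumerate(key_partition_and_sort(sample, ["keycol"], 2)):
        print(index, partition)
```

After this step, every key value lives in exactly one partition and its rows are adjacent there, which is the precondition the Combine Records stage relies on.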

Shows a Combine Records stage with a single input link and a single output link

The stage editor has three pages: the Stage page, the Input page, and the Output page.


This topic is also in the IBM InfoSphere DataStage and QualityStage Parallel Job Developer's Guide.

Last updated: 2012-10-8