Configuring duplicate row handling

The Netezza connector handles duplicates by filtering duplicates in the input data and also by inserting only rows that do not exist in the target table. You can configure the connector to check for duplicate rows. The connector uses additional SQL logic to detect duplicate rows.

About this task

The connector groups rows by the key values and orders them based on the value of the ROWID column. The row added recently has a higher ROWID. After the rows are ranked within every key combination, the connector selects one of the combination and filters out the rest. When Use unique key column is set to Yes, the connector uses the specified column in place of ROWID.

Procedure

  1. Double-click the connector stage icon to open the connector properties.
  2. Select the Input tab.
  3. Select the Properties tab.
  4. Specify an option for Write mode.
  5. Set the Check duplicate rows option to Yes.
  6. Set the Duplicate row action option to Filter. The connector keeps only one row for each key combination and filters out the rest. When the option is set to Fail, the job fails if any duplicates are detected.

Results

The following table summarizes the behavior of connector for different write modes, when duplicate row handling is enabled:
Write mode Behavior
Insert Selects the first row for each key combination and ignores the rest. If the key is not already in the target table, the selected row is inserted. Else, no action is taken.
Update Selects the last row for each key combination and ignores the rest. If the matching row exists in the target table, the selected row is used to update the target table. Else, no action is taken.
Delete No change in behavior. All matching rows are deleted.
Update then insert
  • Update: Selects the last row for each key combination and ignores the rest. If the matching row exists in the target table, the selected row is used to update the target table. Else, no action is taken.
  • Insert: Selects the last row for each key combination and ignores the rest. If the key is not already in the target table, the selected row is inserted. Else, no action is taken.
Delete then insert
  • Delete: No change in behavior. All matching rows are deleted.
  • Insert: Selects the last row for each key combination and ignores the rest. Inserts the selected row into the target table.
Action column Breaks the input rows into groups for execution in different SQL statements based on the value of the action column and other factors. It ensures that the result is identical to the result you get if each row was applied one at a time in the original incoming order.