The Match Designer histogram displays the distribution
of the composite weights.
About this task
If you move the Current Data handle on the histogram to a new weight,
the data grid automatically scrolls to the records with the selected
weight. Likewise, if you reposition the selection within the data
display, the Current Data handle is repositioned in the histogram.
Move the Current Data handle by either of the following actions:
- To display records of a certain weight, move the Current
Data handle along the Weight axis. Ascending
by Weight Sort moves the Current Data handle
to the lowest detail weight value.
Note: For Unduplicate Match specifications,
the Current Data handle is available only when
the data display is in match pair order. To display in match pair
order, right-click the data display and click Group by
Match Pairs.
- To adjust the Clerical Cutoff or Match Cutoff settings, move the
cutoff handle along the Weight axis. The changed cutoffs show in the
Cutoff Values pane.
The following list contains some points to remember about cutoffs:
- The clerical cutoff is a composite weight above which record pairs
are considered to be possible matches. Record pairs with weights between
the clerical cutoff and the match cutoff are known as clericals and
typically are reviewed to determine whether they are matches or
nonmatches. If you do not want to review clericals, make the match
cutoff weight equal to the clerical cutoff weight.
- Cutoff weights can be negative values. However, negative cutoff
weights create extremely inclusive sets of matched records. If you use
negative values for cutoff weights, the histogram shows many values at
highly negative weights, because most cases are nonmatched pairs.
However, record pairs that are obvious disagreements are not a large
part of the matching process, and thus, negative weights are not often
shown.
- There is another large group of values at highly positive weights
for the matched cases. You can set the cutoff values for the match run
by inspecting this histogram. Set the clerical cutoff at the weight
where the spike in the histogram drops close to the axis. Set the other
cutoff weight where the nonmatched cases start to dominate. Experiment
and examine the test results as a guide for setting the cutoffs.
- For a Reference Many-to-One Duplicate match type, there is an
additional cutoff weight called a duplicate cutoff. This cutoff is
optional. If you use the duplicate cutoff, set it higher than the
match cutoff weight. If more than one record pair receives a composite
weight that is higher than the match cutoff, these records are declared
duplicates if their composite weight is equal to or greater than the
duplicate cutoff.
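
The relationship between composite weights and the two main cutoffs
can be illustrated with a minimal sketch. This sketch is not part of
the product. The function name classify_pair, the sample weights, and
the cutoff values are hypothetical, and treating a weight that is
exactly equal to a cutoff as meeting that cutoff is an assumption. The
duplicate cutoff is left out.

```python
from collections import Counter


def classify_pair(weight, clerical_cutoff, match_cutoff):
    """Label one record pair by its composite weight (illustrative only)."""
    if clerical_cutoff > match_cutoff:
        raise ValueError("clerical cutoff must not exceed match cutoff")
    # Assumption: a weight equal to a cutoff counts as meeting it.
    if weight >= match_cutoff:
        return "match"
    if weight >= clerical_cutoff:
        return "clerical"
    return "nonmatch"


# Hypothetical composite weights, including negative ones.
weights = [-12.5, -3.0, 4.2, 7.9, 15.0, 22.4]

# Separate cutoffs leave a band of clericals to review.
labels = [classify_pair(w, clerical_cutoff=5.0, match_cutoff=12.0)
          for w in weights]
counts = Counter(labels)

# Equal cutoffs remove the clerical band entirely.
labels_no_review = [classify_pair(w, clerical_cutoff=5.0, match_cutoff=5.0)
                    for w in weights]
```

With equal cutoffs, no pair in the sketch is labeled clerical, which
mirrors the advice to make the match cutoff equal to the clerical
cutoff if you do not want to review clericals.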
Procedure