Performance best practice with QualityStage Address Verification Interface
How does one achieve optimal performance with InfoSphere QualityStage Address Verification Interface?
Performance of the Address Verification Interface (AVI) is affected by the input data, stage property settings, and parallel engine configuration.
Experiments with USA address data sets indicate that the following steps could help improve the performance of the QS AVI Validation function:
1. Include the PostCode input field to avoid most addresses being flagged as unverified.
2. Sort the input on the PostCode field. In-house experiments showed an improvement of about 44% when the input was sorted on PostCode.
3. If possible, always include the Country input field. As much as a 74% degradation in performance was observed when the Country field was omitted.
4. If possible, avoid using unfielded input. Unfielded input is all address data in one column, with no differentiation between the address line, postal code, and other address data. Unfielded input can degrade performance by as much as 10%.
5. Avoid using the 'Validation' processing type with 'Suggestion' mode for batch processing. 'Suggestion' mode is not designed for batch processing.
6. Increase the parallel engine processing node count in the APT_CONFIG_FILE if your computer has CPU resources available. AVI throughput scales linearly as the node count is increased.
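Steps 1 through 4 concern how the input is shaped before it reaches the AVI stage. The following Python sketch illustrates that preparation outside of a job: it keeps the address data fielded, fills in a missing Country value, and sorts the records on PostCode. The column names, the helper names prepare_rows and to_csv, and the default country value "USA" are assumptions made for illustration; they are not part of the AVI stage, and the default only makes sense for a single-country data set. Inside a job, the same result would normally be produced by upstream stages rather than by a script.

```python
import csv
import io

# Hypothetical column names; map them to the input fields of your own AVI stage.
FIELDS = ["AddressLine1", "City", "State", "PostCode", "Country"]


def prepare_rows(rows, default_country="USA"):
    """Return fielded rows with Country filled in, sorted on PostCode."""
    prepared = []
    for row in rows:
        # Keep the data fielded: one column per address element (step 4).
        clean = {name: (row.get(name) or "").strip() for name in FIELDS}
        # Fill in a missing Country value (step 3). The default is an
        # assumption that only holds for a single-country data set.
        if not clean["Country"]:
            clean["Country"] = default_country
        prepared.append(clean)
    # Sort on PostCode (step 2). Rows with an empty PostCode (see step 1)
    # sort first, which makes them easy to find and fix.
    prepared.sort(key=lambda r: r["PostCode"])
    return prepared


def to_csv(rows):
    """Serialize the prepared rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        {"AddressLine1": "1 Main St", "City": "Austin", "State": "TX",
         "PostCode": "78701"},
        {"AddressLine1": "5 Elm Ave", "City": "Boston", "State": "MA",
         "PostCode": "02108", "Country": "USA"},
    ]
    print(to_csv(prepare_rows(sample)))
```

PostCode values are compared as text so that leading zeros are preserved.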
Performance results vary depending on the operating system you run on and other system variables. The percentages provided here are only for your reference.
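Because results vary by system, it can be useful to measure the effect of each change in your own environment. The following Python sketch shows one possible way to compare timings that you have collected yourself: percent_change compares the elapsed time of two runs, and scaling_efficiency compares measured throughput at several node counts with ideal linear scaling. Both function names and the sample figures in the usage section are hypothetical; the figures are placeholders, not measurements.

```python
def percent_change(baseline_seconds, new_seconds):
    """Percentage change in elapsed time relative to the baseline run.

    Negative values mean the new run was faster than the baseline.
    """
    if baseline_seconds <= 0:
        raise ValueError("baseline_seconds must be positive")
    return (new_seconds - baseline_seconds) * 100.0 / baseline_seconds


def scaling_efficiency(rows_per_second_by_nodes):
    """Compare measured throughput with ideal linear scaling.

    Takes a dict {node_count: rows_per_second} and returns
    {node_count: efficiency}. An efficiency of 1.0 means throughput grew
    in proportion to the node count, relative to the smallest node count
    that was measured.
    """
    if not rows_per_second_by_nodes:
        raise ValueError("at least one measurement is required")
    base_nodes = min(rows_per_second_by_nodes)
    base_rate = rows_per_second_by_nodes[base_nodes]
    return {
        nodes: rate / (base_rate * nodes / base_nodes)
        for nodes, rate in sorted(rows_per_second_by_nodes.items())
    }


if __name__ == "__main__":
    # Placeholder numbers; replace them with timings from your own jobs.
    print(percent_change(baseline_seconds=120.0, new_seconds=90.0))
    print(scaling_efficiency({1: 1000.0, 2: 1900.0, 4: 3600.0}))
```

Run each job configuration more than once on the same input before comparing, so that a single unusually slow or fast run does not skew the comparison.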
|Information Management||InfoSphere Information Server||AIX, Linux, Solaris, Windows||9.1, 8.7, 8.5|
More support for:
Address Verification Interface
Software version: 10.0
Operating system(s): AIX, Linux, Solaris, Windows
Reference #: 1625670
Modified date: 15 March 2013