Parsing, validating, and transliterating address data

You can configure the Address Verification stage to parse or validate address data. The stage also transliterates the data. You configure the stage in the IBM® InfoSphere® DataStage® and QualityStage® Designer client.

You can parse the addresses that come from your input stage. Parsing separates each address into components, such as street or postal code, and you assign each component to a column.
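To make the idea of parsing concrete, the following Python sketch splits one address line into named components that could each populate a column. It does not show how the Address Verification stage parses data. The pattern, the column names, and the assumed "number street, city postal code" layout with a five-digit postal code are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical pattern for a simple "<number> <street>, <city> <postal code>"
# layout. Real addresses vary widely by country; this is illustrative only.
ADDRESS_PATTERN = re.compile(
    r"^\s*(?P<house_number>\d+)\s+(?P<street>[^,]+?)\s*,\s*"
    r"(?P<city>[^,]+?)\s+(?P<postal_code>\d{5})\s*$"
)


def parse_address(line):
    """Return a dict of column name -> address component, or None if no match."""
    match = ADDRESS_PATTERN.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_address("123 Main Street, Springfield 62701"))
```

Each key in the returned dictionary plays the role of a column. An input line that does not fit the assumed layout returns None instead of a partial result.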

You can validate the addresses against postal reference files to check the correctness of the data. The validation process assesses address deliverability and provides a status such as very likely, fair chance, or unlikely to be deliverable.

You can also generate geographic location information and a summary report as part of address validation. A Validation Summary report shows the following items:
  • Total number of records processed
  • Number and percentage of records that the stage passed, failed, validated, corrected, or suggested another address for
  • Number and percentage of records that the stage failed because of postal code, city, street, country, or region
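The counts and percentages in such a report amount to a simple tally. The following Python sketch computes them from a list of per-record outcomes. The status labels and the function name are hypothetical, and the sketch does not reproduce the format or the logic of the report that the stage generates.

```python
from collections import Counter


def summarize(statuses):
    """Return (total, {status: (count, percentage)}) for a list of outcomes."""
    total = len(statuses)
    counts = Counter(statuses)
    # Percentages are rounded to one decimal place; an empty input yields
    # an empty summary, so no division by zero occurs.
    summary = {
        status: (count, round(100.0 * count / total, 1))
        for status, count in counts.items()
    }
    return total, summary


if __name__ == "__main__":
    # Hypothetical outcome labels, one per processed record.
    outcomes = ["validated", "validated", "corrected", "failed"]
    total, summary = summarize(outcomes)
    print("Total records processed:", total)
    for status, (count, pct) in summary.items():
        print(f"{status}: {count} ({pct}%)")
```

The same tally, applied to a per-record failure reason such as postal code or city, would give the breakdown in the last report item.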

If you choose the validation processing type, ensure that you have access to current postal validation reference files.

Transliteration is performed on the address data after the parse or validation processing. Transliteration converts addresses from one script to another. You can transliterate addresses that are written in non-Latin scripts, such as Greek or Hebrew, to the Latin character set, or from Latin to a native, non-Latin character set. Use transliterated addresses to store data consistently in one common writing system.
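As an illustration of script-to-script conversion, the following Python sketch transliterates Greek text to Latin characters with a small lookup table. The mapping is a simplified, hypothetical scheme. It is not a published transliteration standard, and it is not the method that the Address Verification stage uses.

```python
import unicodedata

# Simplified, hypothetical Greek-to-Latin table (lowercase letters only).
GREEK_TO_LATIN = {
    "α": "a", "β": "v", "γ": "g", "δ": "d", "ε": "e", "ζ": "z",
    "η": "i", "θ": "th", "ι": "i", "κ": "k", "λ": "l", "μ": "m",
    "ν": "n", "ξ": "x", "ο": "o", "π": "p", "ρ": "r", "σ": "s",
    "ς": "s", "τ": "t", "υ": "y", "φ": "f", "χ": "ch", "ψ": "ps",
    "ω": "o",
}


def transliterate(text):
    """Convert Greek letters in text to Latin; leave other characters as-is."""
    # Decompose accented letters and drop the combining accent marks.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    out = []
    for ch in stripped:
        latin = GREEK_TO_LATIN.get(ch.lower())
        if latin is None:
            out.append(ch)  # digits, spaces, and Latin letters pass through
        elif ch.isupper():
            out.append(latin.capitalize())
        else:
            out.append(latin)
    return "".join(out)


if __name__ == "__main__":
    print(transliterate("Οδός Αθηνάς 12"))
```

Because several Greek letters map to the same Latin letter in this table, the conversion is not reversible. A Latin-to-native conversion would need its own rules.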

When you configure the Address Verification stage, you can use the Fast Path navigation in the stage editor as a shortcut. The Fast Path takes you through the required tabs:
  • Stage > Processing, where you select the parse or validation processing type
  • Stage > Options
  • Input > Address Columns
  • Output > Mapping

In addition to the Fast Path tabs, you can select and modify any of the other available tabs.