Using predefined data rule definitions as templates for new data rule definitions

To build data rule definitions more quickly, copy a predefined data rule definition and modify the copy to fit your specific data conditions.

About this task

For example, suppose that you have a field named Region that represents a specific segment of the world. The Region field is defined as a text field that is five characters long, and its first two characters must be one of the following alphabetic codes:
  • AM (Africa and Middle East)
  • AP (Asia and Pacific)
  • EU (Europe)
  • NA (North America)
  • SA (South America)

The predefined rule definitions do not include the exact rule definition that you need. However, the predefined rule definition TextSubstrInRefList contains the following description: Substring text value starting in position 3 for length 3 is in reference list; applied to string data. This rule definition seems similar to what you need because you want to evaluate a substring for inclusion in a reference list. You can copy the predefined rule definition and modify it to fit your needs.

Procedure

  1. Log in to the IBM® InfoSphere® Information Server console.
  2. Open your project.
  3. From the Develop navigator menu, select Data Quality.
  4. From the Published Rules folder in the Data Quality workspace, select the predefined data rule definition. For example, select 08 Validity and Completeness > Text > TextSubstrInRefList.
  5. From the Tasks list, select Create a Copy, and then click OK.
    Figure 1. The TextSubstrInRefList data rule definition highlighted in the Data Quality workspace, with the cursor over the Create a Copy task
    A copy of the data rule definition is created in the All folder.
  6. Open the copy of the data rule definition in the All folder, and then edit it as necessary. For example, make these changes:
    1. Change the rule definition name. For example, call it Region_SubstrInRefList.
    2. Change the substring function from substring(TextField, 3, 3) to substring(Region, 1, 2). With this change, the substring starts at the first character of the Region field and is two characters long.
    3. Change the reference list data from {'AAA','AAB','BAA','CCC'} to {'AM','AP','EU','NA','SA'}.
  7. Save the modified data rule definition.
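
To check your understanding of what the modified rule definition evaluates, the logic can be sketched outside the product. The following Python sketch is only an illustration, not InfoSphere Information Analyzer rule syntax. The names substring, REGION_CODES, and region_prefix_is_valid are hypothetical, and the sketch assumes that positions are 1-based (as the procedure describes) and that the comparison is case-sensitive.

```python
# Illustrative sketch only: mimics the logic of the modified rule definition.
# All names here are hypothetical and are not part of the product.

# Reference list from the example: valid two-character region codes.
REGION_CODES = {"AM", "AP", "EU", "NA", "SA"}


def substring(value: str, start: int, length: int) -> str:
    """Return `length` characters of `value`, starting at 1-based position `start`."""
    begin = start - 1  # convert the 1-based position to a 0-based index
    return value[begin:begin + length]


def region_prefix_is_valid(region: str) -> bool:
    """Return True when the first two characters of `region` are in the reference list.

    Assumption: the comparison is case-sensitive.
    """
    return substring(region, 1, 2) in REGION_CODES


if __name__ == "__main__":
    for sample in ["EU123", "NA001", "XX999"]:
        print(sample, region_prefix_is_valid(sample))
```

In this sketch, a value such as EU123 passes because its first two characters are in the list, and a value such as XX999 does not.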

What to do next

If you want to use the data rule definition to analyze data quality within InfoSphere Information Analyzer, generate a data rule from the rule definition. If you want to use the data rule definition in another InfoSphere Information Analyzer project, or in the Data Rules stage, publish the data rule definition.