IBM Support

Data cleansing feature in Integration Composer 7.5 can create duplicate entries for upgraded mappings



A data cleansing feature was added in Integration Composer 7.5 to normalize or cleanse the identifying attributes of the computer system.


Data cleansing overview

For computer systems to reconcile from multiple discovery sources, the identifying attributes of a computer system must be the same from each source. Integration Composer uses the Data Integration Service (DIS) component to generate a globally unique identifier (GUID) for each computer system for reconciliation. DIS uses the Common Data Model (CDM) naming rules to generate the GUID.

A data cleansing feature was added in Integration Composer 7.5 to normalize or cleanse the identifying attributes of a computer system before generating a DIS GUID. This cleansing ensures that the same DIS GUID is generated for a computer system coming from multiple sources.

For example, a serial number imported from Tivoli Asset Discovery for Distributed (TADd) may be formatted as ABC-D, while the same serial number in Tivoli Application Dependency Discovery Manager (TADDM) may have the format A-B-C-D. The data cleansing feature normalizes both values to ABCD and uses this value to generate the DIS GUID. The original serial number value is stored in the database unchanged.
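
The effect of the serial number example can be sketched in Python. This is an illustration only, not Integration Composer code: the function names cleanse_serial and stand_in_guid are hypothetical, the regular expression is an assumed rule that strips non-alphanumeric characters, and the SHA-1 hash is only a stand-in for the DIS GUID algorithm, which this document does not describe.

```python
import hashlib
import re


def cleanse_serial(raw: str) -> str:
    """Hypothetical cleansing step: drop every non-alphanumeric character."""
    return re.sub(r"[^A-Za-z0-9]", "", raw)


def stand_in_guid(identifying_value: str) -> str:
    """Stand-in for DIS GUID generation (assumption: a deterministic hash)."""
    return hashlib.sha1(identifying_value.encode("utf-8")).hexdigest()


tadd_serial = "ABC-D"     # format as imported from TADd
taddm_serial = "A-B-C-D"  # format as imported from TADDM

# Both raw values cleanse to the same string, so the derived identifiers match.
tadd_guid = stand_in_guid(cleanse_serial(tadd_serial))
taddm_guid = stand_in_guid(cleanse_serial(taddm_serial))

# Without cleansing, the two formats would yield different identifiers.
raw_guids_differ = stand_in_guid(tadd_serial) != stand_in_guid(taddm_serial)
```

The raw strings are left untouched, which mirrors the statement that the original serial number is stored unchanged; only the value fed into identifier generation is normalized.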

For more about the data cleansing feature, see the links under Related information.

Data cleansing rules

Integration Composer 7.5 provides out-of-the-box cleansing rules that are defined in the following location:


A limited number of rules are available to normalize the data:

Trim: Removes leading and trailing white space from the value.
UpperCase: Converts the value to uppercase.
LowerCase: Converts the value to lowercase.
Regex: Applies a regular expression to the value to format it. This is the most commonly used and most powerful rule, because any regular expression can be applied.
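
The four rule types can be modeled as small functions applied in sequence. The sketch below is a hypothetical Python analogy, not the Integration Composer rule format: the names trim, upper_case, lower_case, regex, and apply_rules are invented for illustration, and the rule order and the pattern are assumptions.

```python
import re
from typing import Callable, List

# A rule takes a value and returns the cleansed value.
Rule = Callable[[str], str]


def trim() -> Rule:
    """Trim: remove leading and trailing white space."""
    return lambda value: value.strip()


def upper_case() -> Rule:
    """UpperCase: convert the value to uppercase."""
    return lambda value: value.upper()


def lower_case() -> Rule:
    """LowerCase: convert the value to lowercase."""
    return lambda value: value.lower()


def regex(pattern: str, replacement: str) -> Rule:
    """Regex: replace every match of the pattern with the replacement."""
    compiled = re.compile(pattern)
    return lambda value: compiled.sub(replacement, value)


def apply_rules(value: str, rules: List[Rule]) -> str:
    """Apply each rule in order, feeding the result into the next rule."""
    for rule in rules:
        value = rule(value)
    return value


# Assumed rule chain for a serial number: trim, uppercase, strip punctuation.
serial_rules = [trim(), upper_case(), regex(r"[^A-Z0-9]", "")]
cleansed = apply_rules("  a-b-c-d ", serial_rules)
```

Order matters in a chain like this: the pattern above only keeps uppercase letters and digits, so it relies on the uppercase rule having run first.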

Because DIS uses the CDM, the out-of-the-box rules attempt to follow the CDM recommendations on how values need to be specified.

In Integration Composer 7.5, the out-of-the-box cleansing rules attempt to match the values from Tivoli Asset Discovery for Distributed (TADd) and Tivoli Application Dependency Discovery Manager (TADDM). However, the rules may need to be adjusted if other discovery tools are used. In that case, analyze the data from each source and modify the cleansing rules so that the values are formatted consistently.

Duplicate records

When upgrading to Integration Composer 7.5, new DIS GUIDs are generated for the Deployed Assets that are processed during the mapping. Because DIS GUIDs are used for reconciliation, a newly generated DIS GUID that differs from the one already stored for a computer system may result in duplicate records for that computer system.

To avoid possible duplicates, a tool can be provided for Integration Composer that generates cleansed DIS GUIDs for all Deployed Assets in the database. This tool needs to be run before any 7.5 mappings are executed. You can request this tool from Integration Composer support.
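
The duplicate scenario, and the reason for regenerating GUIDs before running 7.5 mappings, can be sketched as follows. Everything here is a hypothetical model: the record store is a plain dictionary, stand_in_guid is a SHA-1 stand-in for DIS GUID generation, regenerate_guids is not the support tool itself, and the sketch assumes that pre-upgrade GUIDs were derived from uncleansed values.

```python
import hashlib
import re
from typing import Dict


def stand_in_guid(value: str) -> str:
    """Stand-in for DIS GUID generation (assumption: a deterministic hash)."""
    return hashlib.sha1(value.encode("utf-8")).hexdigest()


def cleanse(value: str) -> str:
    """Assumed cleansing rule: strip non-alphanumerics, then uppercase."""
    return re.sub(r"[^A-Za-z0-9]", "", value).upper()


def reconcile(records: Dict[str, dict], serial: str) -> bool:
    """Model a 7.5 mapping run; return True if a new record was created."""
    key = stand_in_guid(cleanse(serial))
    created = key not in records
    records[key] = {"serial": serial}  # original value stored unchanged
    return created


def regenerate_guids(records: Dict[str, dict]) -> Dict[str, dict]:
    """Re-key every existing record by its cleansed identifier."""
    return {stand_in_guid(cleanse(r["serial"])): r for r in records.values()}


# Existing record keyed by an identifier built from the raw serial number.
legacy = {stand_in_guid("ABC-D"): {"serial": "ABC-D"}}

# Mapping run without regeneration: the cleansed key misses the old record.
without_tool = dict(legacy)
duplicate_created = reconcile(without_tool, "ABC-D")

# Mapping run after regeneration: the cleansed key finds the existing record.
with_tool = regenerate_guids(legacy)
duplicate_avoided = not reconcile(with_tool, "ABC-D")
```

In this model the first run ends with two records for the same computer system, while the second run ends with one, which is the ordering requirement described above.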

Related information

Information Center: Naming and Reconciliation Service
Information Center: Naming attribute standardization

Cross reference information

Segment: Systems and Asset Management
Product: Tivoli Integration Composer
Component: Not Applicable
Platform: Linux, Windows, AIX
Version: 7.5
Edition: All Editions

Document information

More support for: Control Desk

Software version: 7.5

Operating system(s): AIX, HP-UX, Linux, Solaris, Windows

Software edition: All Editions

Reference #: 1589561

Modified date: 04 April 2012
