Preparing to load master data in virtual MDM with the MDM Workbench

You use the MDM Workbench to set up your initial data load.

In the following diagram, the example workflow shows the high-level tasks for configuring the data load. Some tasks are iterative and are repeated until you achieve the results that you require. The diagram is provided only as an example. Your team might use a different workflow. In addition, the diagram implies that each task follows sequentially, whereas in reality some tasks might be performed simultaneously.


The graphic shows the tasks that can be performed to load data for virtual MDM.

In MDM Workbench, a configuration project is a container that holds the operational server configuration and its associated files. When you configure the member model, you use the Configuration editor. You can start with the Basic editor view to configure the sources, data model, matching, and strings.

Algorithms identify and link member records across the source systems. They also identify data issues and linkages across the systems that potentially represent the same member or object. The algorithms standardize each data element and compare both similarities and differences to determine the likelihood of a match.

After you create a project in MDM Workbench, you must import applicable artifacts, such as algorithms, string data files, and weight tables. You upload the configuration file from the MDM Workbench to an operational server by using the Deploy option. Validation errors are displayed and must be fixed before deployment.

So that you can match and score member records, data derivation standardizes, buckets, and compares data.

You use ETL operations to extract data from either a single source or multiple sources and load it into InfoSphere MDM. MDM APIs and utilities are integrated with InfoSphere DataStage to provide ETL operations for your data.

Weights indicate how likely it is that two members match or do not match. With the MDM Workbench, you can generate weights for attributes that are associated with a particular entity type.

The bulk cross match (BXM) compares and links thousands of records per second. This process establishes entities that are based on the starting source system records. If you do not use entities in your implementation, you do not need to perform a bulk cross match. The purpose of linkages is to enable an accurate enterprise-wide view of a member. Member records can be linked automatically when their comparison score is above the auto-link threshold.

The Generate Threshold Analysis Pairs job is a cross-match program that generates random pairs for weight generation. In the Analytics view, you evaluate aspects of the configuration, such as buckets and entities. You also evaluate your configuration to find errors and potential performance problems.
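
The standardize, bucket, compare, and auto-link ideas described above can be illustrated with a small, self-contained Python sketch. It is a toy model only and does not use any InfoSphere MDM API: the attribute names, the weight values, the bucketing rule, and the threshold value are all hypothetical and were chosen purely for illustration. In a real implementation, these settings come from the algorithm configuration and the generated weight tables.

```python
from itertools import combinations

# Hypothetical per-attribute weights: (agreement weight, disagreement weight).
# Real weights are generated from your own data; these values are made up.
WEIGHTS = {"name": (4.0, -2.0), "dob": (5.0, -3.0), "zip": (2.0, -1.0)}

# Hypothetical auto-link threshold, for illustration only.
AUTO_LINK_THRESHOLD = 8.0


def standardize(record):
    """Uppercase each attribute value and keep only letters and digits."""
    return {
        key: "".join(ch for ch in str(value).upper() if ch.isalnum())
        for key, value in record.items()
    }


def bucket_key(std_record):
    """Toy bucketing rule: first letter of the name plus the birth year."""
    return (std_record["name"][:1], std_record["dob"][:4])


def score(a, b):
    """Sum agreement or disagreement weights over all weighted attributes."""
    total = 0.0
    for attr, (agree, disagree) in WEIGHTS.items():
        total += agree if a[attr] == b[attr] else disagree
    return total


def link_candidates(records):
    """Return (id_a, id_b, score) for pairs that score above the threshold.

    Only records that share a bucket are compared, which avoids comparing
    every record with every other record.
    """
    buckets = {}
    for rec_id, rec in records.items():
        std = standardize(rec)
        buckets.setdefault(bucket_key(std), []).append((rec_id, std))

    links = []
    for members in buckets.values():
        for (id_a, a), (id_b, b) in combinations(members, 2):
            pair_score = score(a, b)
            if pair_score > AUTO_LINK_THRESHOLD:
                links.append((id_a, id_b, pair_score))
    return links
```

In this sketch, two records from different sources that differ only in letter case, spacing, or punctuation standardize to the same values, fall into the same bucket, and score above the threshold, so they are linked. Records that share a bucket but disagree on several attributes score below the threshold and stay unlinked.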


Last updated: 27 Jun 2018