Information Management IBM InfoSphere Master Data Management, Version 10.1

mpxlink utility

The mpxlink utility is a cross match program that enables entity linkage.

The mpxlink utility takes comparison results from the mpxcomp utility and creates entity link and task files (.unl files) that can be loaded into the database. This utility can be run from the command line or from IBM® Initiate® Workbench (Initiate menu > New Job Set). See the usage topics about IBM Initiate Workbench for more information.

All options and flags are case-insensitive; option values are case-sensitive.
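As a rough illustration of this rule, the following Python sketch normalizes option names without regard to case while leaving option values untouched. The function name normalize_args and the sample value "Identity" are hypothetical; they are not part of mpxlink.

```python
def normalize_args(argv):
    """Lowercase option names (tokens starting with '-'); keep values as given."""
    return [tok.lower() if tok.startswith("-") else tok for tok in argv]


# Two spellings of the same options compare equal after normalization,
# while the (hypothetical) value "Identity" keeps its original case.
a = normalize_args(["-entType", "Identity", "-NOBXMDIFF"])
b = normalize_args(["-ENTTYPE", "Identity", "-noBxmDiff"])
print(a == b)
```

A wrapper script that compares or de-duplicates options could apply this kind of normalization before passing the original arguments to the utility.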

Generating task sets can be a lengthy operation.

If you want to retain existing Enterprise IDs (entrecnos) while doing an incremental cross match (IXM), you must use the correct options:

Table 1. mpxlink options
Option Type Description Default
-entType Name Entity type name. This option identifies the type of entity being computed. If you are implementing multiple entity types (for example, identity and household), you must run mpxlink for each type. This option is required and there is no default setting. NONE
-bxmInpDir dirName .bin input directory. The directory where the input binary (.bin) files to link are stored. Input files can be from the mpxcomp utility output, or other processes such as an IXM.

This directory is typically the work directory on the server hosting your hub configuration. This option is required and there is no default setting.

You can list multiple directories for this option; separate multiple directories with single spaces.

NONE
-bxmOutDir dirName .bin output directory. Indicates where you want the BXM output files to be located. This directory is relative to the project's work directory on the hub:

MAD_HOMEDIR\inst\mpinet_instance_name\work\project_name\work\bxm_output_dir

Also generate bulk cross match data in the designated BXM output directory.

NONE
-unlOutDir dirName .unl output directory. The directory in which you want the mpxlink output binary files located. Binary output files are used by the relationship linkers. The binary output file is named mpx_bxmxmem.bin.

This directory is typically relative to the work directory on the server hosting the hub configuration.

Generating the output in binary form is optional; specifying an output directory with this option is what causes binary output to be generated. In other words, if no directory is specified here, no binary output is generated.

NONE
-nMemParts N Number of member partitions (MemParts). MemParts are used to partition the data set. Typically this partition is done for memory considerations. Because the mpxlink utility requires the entire input data set (for example, the binary files of comparison results) to be read into memory at once, breaking the data set into smaller pieces allows them to fit into available memory.

The MemParts option differs from the MxmParts option in that MemParts breaks up the memHead and memCmpd data files, whereas MxmParts breaks up link and task files (the output of the mpxcomp utility).

The MemParts value set here must be the same as the MemParts value set in mpxcomp, and in the utility that created the input for mpxcomp (for example, mpxfsdvd, mpxprep, or mpxredvd). In other words, the MemParts setting in mpxcomp determines how many partitioned file segments are passed to mpxlink; the mpxlink MemParts setting must accurately reflect the number of partitioned file segments coming from mpxcomp.

There is a performance consideration to partitioning the data set: the higher the MemParts is set, the slower the mpxlink process.

Leave this value set to 1 unless memory is an issue. The maximum value is 100.

1
-nMxmParts N Number of maximum out partitions. Like MemParts, the MxmParts option partitions the output of the mpxcomp process. As with MemParts, this option is used when the output file is too large to be read into memory in its entirety, and needs to be broken up into smaller sections to fit into available memory.

The MxmParts option differs from MemParts in that MxmParts breaks up link and task files (the output of the mpxcomp utility), whereas MemParts breaks up the memHead and memCmpd data files.

The MxmParts value set here must be the same as the MxmParts value set in mpxcomp, which provides the input to mpxlink. In other words, the MxmParts setting in mpxcomp determines how many partitioned file segments are passed to mpxlink. The mpxlink MxmParts setting must accurately reflect the number of partitioned file segments coming from mpxcomp.

Leave this value set to 1 unless memory is an issue. The maximum value is 100.

1
-{no}bxmDiff   Use explicit different records from entrule. This option controls whether mpxlink uses existing entity rules when forming entities. For example, if two members in an entity are separated in IBM Initiate Inspector, a non-identity rule is created by the Master Data Engine. (Likewise, if two members are manually linked, an identity rule is created.) The mpxrule utility captures these rules as "same" (identity) or "diff" (non-identity) rules. If you re-crossmatch an existing database, including these rules prevents the mpxlink utility from re-forming linkages (in the case of a "diff" rule), or forces members to be in the same entity (in the case of a "same" rule).

The input data used here is created with the corresponding mpxcomp utility -bxmDiff option (Use explicit different records from entrule in IBM Initiate Workbench).

-noBxmDiff
-{no}bxmSame   Use explicit same records from entrule. Like -bxmDiff, this option controls whether the mpxlink utility uses existing entity rules when forming entities. See description for the -bxmDiff option.

The input data used here is created with the corresponding mpxcomp utility -bxmSame option (Use explicit same records from entrule in IBM Initiate Workbench).

-noBxmSame
-{no}bxmXeia   Use implicit link records from entlink. This option instructs mpxlink to include the output from the mpxxeia utility. The mpxxeia utility captures existing entity data. The input data used here is created with the corresponding mpxcomp utility -bxmXeia option (Use implicit link records from entlink in IBM Initiate Workbench). -noBxmXeia
-{no}bxmPD   Use potential duplicate task records from entxtsk. The mpxlink utility uses this data to form review identifier tasks that can be loaded into the database.

The input data used here is created with the corresponding mpxcomp utility -bxmRvid (Use reviewid records from mpxcomp in IBM Initiate Workbench).

-noBxmPD
-{no}bxmPL   Use potential linkage task records from the mpxxtask utility (entxtsk), which captures existing task information from the database. -noBxmPL
-{no}bxmRI   Use review identifier task records from the mpxxtask utility (entxtsk), which captures existing task information from the database. -noBxmRI
-{no}bxmRule   Use member rule records from the mpxprep, mpxredvd, or mpxfsdvd utilities. Member rules express the relationship between the survivor and obsolete members in a merge. Because the input data used here is created by default in the mpxprep utility, it is not necessary to specify a corresponding option in mpxprep. -bxmRule
-{no}bxmLink   Use linkage records from the mpxcomp utility. The mpxlink utility uses this data to form entities that can be loaded into the database. The input data used here is created with the corresponding mpxcomp utility -bxmLink option (Use linkage records from mpxcomp in IBM Initiate Workbench). -bxmLink
-{no}bxmTask   Use task records from the mpxcomp utility. The mpxlink utility uses this data to form tasks that can be loaded into the database. The input data used here is created with the corresponding mpxcomp utility -bxmTask option (Use task records from mpxcomp in IBM Initiate Workbench). -bxmTask
-{no}bxmRvid   Use review identifier records from the mpxcomp utility. The mpxlink utility uses this data to form review identifier tasks that can be loaded into the database.

The input data used here is created with the corresponding mpxcomp utility -bxmRvid option (Use reviewid records from mpxcomp in IBM Initiate Workbench).

-bxmRvid
-{no}entLink   Instruct the engine to write new linkages and entity level tasks to a .unl file (mpi_entlink.unl). -entLink
-{no}entXeia   Instruct the engine to write historical Enterprise ID data to a .unl file (mpi_entxeia.unl). -entXeia
-{no}entXtsk   Instruct the engine to write information about tasks related to an entity to a .unl file (mpi_entxtsk.unl). -entXtsk
-{no}seqGen   When specified, this option writes a .unl file containing updated sequence generator numbers that can then be loaded into the database. The engine normally updates this table properly on startup. This option is useful for an installation that, when doing multiple links, needs to update the sequence numbers without starting an engine. -noSeqGen
-{no}tskSets   Compute full task set information. Assigns a task set number to a member in a task. A task set identifies a group (two or more) of records explicitly identified as being in a task.

For example, if memrecnos 1, 2, and 3 are in a potential duplicate task, they are all assigned tskset=1. If memrecnos 4 and 5 are in a potential linkage task, they are assigned tskset=2, and so on.

This data is typically used for reporting purposes.

-noTskSets
-{no}tskRelatedMembers   Create a count of members in a task so that when you have a trigger member, you can tell that there are n members in the task. The count is only calculated when a member is cross matched. -tskRelatedMembers
-{no}strict   Forces xeia (entity linkage) information to default to existing information (rules and prior data). Setting this option to -strict makes the mpxlink utility sensitive to anomalies in the data.

Disable this option to instruct mpxlink to ignore anomalies in the data, such as inconsistencies or discrepancies arising from live updates to the table (that is, discrepancies that might occur because the data changes while it is being collected by the mpx utilities that create the input for mpxlink).

This option is typically used for reporting purposes.

-strict
-ixmMode   Indicates IXM mode. Used for IXM only. FALSE
-entRecno N Used with .unl only. The starting entity record number for the .unl.

The option allows for specifying an entity record number to start with for the creation of the mpi_entlink_xx.unl file. The parameter is optional. If not set, then the mpxlink utility defaults to applying 1 as the starting entity record number.

1
-tskRecno N Used with .unl only. Allows for specification of a starting task record number in the .unl file. This option reads the tskrecno from the mpi_seqgen table. mpi_seqgen.tskrecno
-audRecno N Used with .unl only. Common audRecno for all .unl files. This option sets the audit record number for the .unl files that are loaded into the mpi_audhead database table. When the -{no}audHead option (Write mpi_audhead.unl in IBM Initiate Workbench) is enabled, you can set the -audRecno option to an existing mpi_audhead record number. 2
-usrRecno N Used with .unl only. Common usrRecno for all .unl files. This option sets the user record number for the .unl files that are loaded into the mpi_audhead database table. When the -{no}audHead option (Write mpi_audhead.unl in IBM Initiate Workbench) is enabled, you can set this option to an existing mpi_usrhead user record number. 1
-ixnRecno N Used with .unl only. This setting is the ixnRecno for the audhead record. This option sets the transaction record number for the .unl files that are loaded into the mpi_audhead database table. When the -{no}audHead option (Write mpi_audhead.unl in IBM Initiate Workbench) is enabled, you can set this option to an existing mpi_ixnhead record number. 71
-evtTypeno N Used with .unl only. This setting is the evtTypeno for the audhead record. Use this option to specify an event type for the audhead records. When the -{no}audHead option (Write mpi_audhead.unl in IBM Initiate Workbench) is enabled, you can set this option to an existing mpi_evttype event type number. 0
-{no}audHead   Used with .unl only. Writes mpi_audhead.unl file, and uses the audrecno specified in the -audRecno option. (Common audit record number for all .unl option.) This option is commonly used in new implementations where no audit records exist yet. -noAudHead
-bktOutDir dirName Used with NTE only; the output directory for the BXM files. NONE
-entBktd   Used with NTE only. Write entBktd information. Used together, this option and the -bktOutDir option allow the mpxlink utility to generate a binary bucket file that is consumed by the mpxcomp utility to rescore members that exist in the same transitive entity. This file is used for non-transitive entities to get scores between members that were brought together by a "glue" member and would not have a score generated from the traditional binary bucket file produced during the mpxprep or mpxfsdvd process. This setting allows a second pass using the mpxcomp and mpxlink utilities to produce accurate non-transitive entities. Although non-transitive entities can be produced with a single pass through the mpxcomp and mpxlink utilities, the two-pass approach improves accuracy. FALSE
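The partition constraints described for -nMemParts and -nMxmParts in Table 1 (a maximum of 100, and values that must match the mpxcomp settings) can be checked before a run. The following Python sketch assembles an argument list from options listed in the table and validates those two constraints. The function names, the entity type value "id", and the directory names are hypothetical, and the exact command-line layout is an assumption rather than confirmed syntax.

```python
MAX_PARTS = 100  # maximum stated for -nMemParts and -nMxmParts


def check_parts(name, link_value, comp_value):
    """Validate a partition count against the constraints in Table 1."""
    if not 1 <= link_value <= MAX_PARTS:
        raise ValueError(f"{name} must be between 1 and {MAX_PARTS}, got {link_value}")
    if link_value != comp_value:
        raise ValueError(
            f"{name} for mpxlink ({link_value}) must match mpxcomp ({comp_value})"
        )


def build_mpxlink_args(ent_type, bxm_inp_dirs, unl_out_dir,
                       mem_parts=1, mxm_parts=1,
                       comp_mem_parts=1, comp_mxm_parts=1):
    """Return an argument list for mpxlink (assumed layout, not verified)."""
    check_parts("nMemParts", mem_parts, comp_mem_parts)
    check_parts("nMxmParts", mxm_parts, comp_mxm_parts)
    return [
        "mpxlink",
        "-entType", ent_type,
        # Multiple input directories are passed as space-separated values.
        "-bxmInpDir", *bxm_inp_dirs,
        "-unlOutDir", unl_out_dir,
        "-nMemParts", str(mem_parts),
        "-nMxmParts", str(mxm_parts),
    ]


# Hypothetical values: entity type "id", two input directories, and a
# mpxcomp run that used 2 member partitions.
args = build_mpxlink_args("id", ["work/bxm1", "work/bxm2"], "work/unl",
                          mem_parts=2, comp_mem_parts=2)
print(" ".join(args))
```

Failing fast on a mismatched partition count avoids starting a long-running link job that cannot read its input segments correctly.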

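The -bxmSame and -bxmDiff descriptions in Table 1 say that "same" rules force members into one entity and "diff" rules prevent linkages from being re-formed. The following Python sketch illustrates that idea with a simple union-find grouping. It is an illustration under stated assumptions only, not the algorithm that mpxlink actually uses; the function name form_entities and the sample member numbers are hypothetical.

```python
def form_entities(members, links, same_rules=(), diff_rules=()):
    """Group members into entities, honoring "same" and "diff" rules."""
    parent = {m: m for m in members}

    def find(m):
        # Follow parent pointers to the group root, compressing the path.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    def violates_diff(a, b):
        # Merging the groups of a and b is refused if a "diff" rule
        # has one member in each of the two groups.
        roots = {find(a), find(b)}
        return any({find(x), find(y)} == roots for x, y in diff_rules)

    # "Same" rules are applied first and unconditionally.
    for a, b in same_rules:
        parent[find(a)] = find(b)

    # Candidate links are applied only when no "diff" rule is violated.
    for a, b in links:
        if find(a) != find(b) and not violates_diff(a, b):
            parent[find(a)] = find(b)

    groups = {}
    for m in members:
        groups.setdefault(find(m), []).append(m)
    return sorted(groups.values())


# Members 1 and 4 are forced together; members 2 and 3 are kept apart.
entities = form_entities(
    members=[1, 2, 3, 4],
    links=[(1, 2), (2, 3), (3, 4)],
    same_rules=[(1, 4)],
    diff_rules=[(2, 3)],
)
print(entities)  # [[1, 2, 4], [3]]
```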

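The task set numbering described for the -tskSets option in Table 1 can be sketched as follows. This minimal Python example reproduces the numbering from that option's example (memrecnos 1, 2, and 3 in one task; 4 and 5 in another). The function name assign_task_sets and the input layout are hypothetical and do not reflect the format of the files that mpxlink writes.

```python
def assign_task_sets(tasks):
    """Map each memrecno to a task set number, counting tasks from 1."""
    tskset_by_member = {}
    for tskset, members in enumerate(tasks, start=1):
        if len(members) < 2:
            raise ValueError("a task set groups two or more records")
        for memrecno in members:
            tskset_by_member[memrecno] = tskset
    return tskset_by_member


# First task: potential duplicate; second task: potential linkage.
tasks = [[1, 2, 3], [4, 5]]
print(assign_task_sets(tasks))  # {1: 1, 2: 1, 3: 1, 4: 2, 5: 2}
```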
Timestamp Last updated: 14 Nov 2014