IBM® InfoSphere® Global Name Recognition (GNR) is a name-recognition and name-scoring technology that classifies, searches, analyzes, and compares global name data sets. It is ideal for managing data on individuals in a multicultural world. Integrating the GNR APIs with your MDM operational server gives you the technology to cast a wider net across candidates.
A virtual MDM implementation is integrated with GNR by using the GNRMETA bucket generation function in your algorithm configuration.
The operational server uses the IBM NameWorks (GNM) analyze component to obtain name variants for use during candidate selection. The analyze() method transliterates and parses the name. It also returns gender information, a culture classification, a list of variant name forms for the name (name parts), and a list of countries where the name is found (country of association information).
The connection between the operational server and GNR is configured by using either the madconfig enable_gnr utility target or the Enable/Disable GNRMETA job in InfoSphere MDM Workbench.
The virtual MDM bucket generation function, GNRMETA, calls out to GNR for the name variants and corresponding percentages that are produced by the analyze() method. Each percentage is the frequency of a particular variant relative to the other variants. The variants are then filtered against a percentage threshold: only variants whose percentage is greater than or equal to the threshold are used in bucketing. The threshold is configured with the derivation argument percent=value.
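The threshold filter described above can be pictured with a minimal Python sketch. It is an illustration only, not the operational server's implementation: the Variant type, the filter_variants helper, and the variant names and percentages are all hypothetical.

```python
from typing import List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical stand-in for one variant returned by analyze()."""
    name: str
    percent: int  # frequency relative to the other variants


def filter_variants(variants: List[Variant], threshold: int) -> List[Variant]:
    """Keep only variants whose percentage is >= the threshold."""
    return [v for v in variants if v.percent >= threshold]


# Made-up values for illustration; real values come from GNR.
variants = [
    Variant("Omar", 66),
    Variant("Umar", 20),
    Variant("Amar", 10),
    Variant("Omer", 4),
]

kept = filter_variants(variants, threshold=10)
print([v.name for v in kept])  # ['Omar', 'Umar', 'Amar']
```

Because the comparison is "greater than or equal to", a variant that sits exactly at the threshold (10 in this sketch) is kept.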
GNRMETA is similar to the EQMETA function with an equivalency string code (equistrcode) of NICKNAME. EQMETA, with NICKNAME, looks up the various nickname forms of a token and then passes them through the META function. With GNRMETA, the lookup is done with GNR instead of a NICKNAME table.
GNRMETA takes two data derivation arguments (dvdArgs). The first is the phonetic function. The second is the percentage threshold value, which is specified as percent=value, where the value must be an integer. For example: IDENTAPHONE, percent=10.
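To make the shape of the two arguments concrete, the following Python sketch splits a string such as the example above into a phonetic function name and an integer threshold. The parse_dvd_args helper is hypothetical and is not how the product reads its configuration; it only mirrors the rules stated here (two arguments, the second in the form percent=value with an integer value).

```python
from typing import Tuple


def parse_dvd_args(raw: str) -> Tuple[str, int]:
    """Split 'PHONETIC, percent=N' into (phonetic function, threshold)."""
    parts = [p.strip() for p in raw.split(",")]
    if len(parts) != 2:
        raise ValueError(f"expected two arguments, got {len(parts)}: {raw!r}")

    phonetic, percent_arg = parts
    key, sep, value = percent_arg.partition("=")
    if not phonetic or sep != "=" or key.strip() != "percent":
        raise ValueError(f"second argument must be percent=value: {raw!r}")

    value = value.strip()
    # The threshold must be an integer, so reject values such as '1.5'.
    if not (value.isascii() and value.isdigit()):
        raise ValueError(f"percent value must be an integer: {value!r}")

    return phonetic, int(value)


print(parse_dvd_args("IDENTAPHONE, percent=10"))  # ('IDENTAPHONE', 10)
```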
GNRMETA is a bucket generation function that can be used with the existing virtual MDM standardization, comparison, and phonetic functions. You must use either the PXNM or the BXNM bucketing function.
The key benefit of this integration is that it returns more results during candidate selection. For example, when you use EQMETA with NICKNAME and the input name "Omar" (nickname "Umar"), you get one bucket: OMR. With GNRMETA, you get three buckets: AMR, OMR, and UMR.
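One way to picture the difference is the toy Python sketch below. Everything in it is an assumption made for illustration: toy_phonetic is a simple stand-in that is not the META or IDENTAPHONE function, the nickname table and the variant list (including "Amar") are made-up data, and modeling the nickname lookup as a mapping to a single canonical form is a simplification rather than a description of how EQMETA works internally.

```python
from typing import Dict, List


def toy_phonetic(name: str) -> str:
    """Toy stand-in for a phonetic function: keep the first letter,
    drop every later vowel. Not the product's phonetic algorithm."""
    upper = name.upper()
    return upper[:1] + "".join(c for c in upper[1:] if c not in "AEIOU")


# Hypothetical lookup data, for illustration only.
NICKNAME_TABLE: Dict[str, str] = {"UMAR": "OMAR"}
GNR_VARIANTS: Dict[str, List[str]] = {"OMAR": ["Omar", "Umar", "Amar"]}


def eqmeta_like_buckets(name: str) -> List[str]:
    """Nickname-table style: resolve to one form, then apply the phonetic code."""
    canonical = NICKNAME_TABLE.get(name.upper(), name.upper())
    return [toy_phonetic(canonical)]


def gnrmeta_like_buckets(name: str) -> List[str]:
    """Variant style: apply the phonetic code to every variant of the name."""
    variants = GNR_VARIANTS.get(name.upper(), [name])
    return sorted({toy_phonetic(v) for v in variants})


print(eqmeta_like_buckets("Omar"))   # ['OMR']
print(gnrmeta_like_buckets("Omar"))  # ['AMR', 'OMR', 'UMR']
```

Each extra bucket is another key under which candidate records can be found, which is why the variant-based approach selects more candidates.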
Implementing this feature can have some effect on performance, but the effect is minimal.