All of the language-related files are in the INSTALL_HOME/com.ibm.mdm.bigmatch.social.erflow/scripts/ directory.
Entity files
The hil_modules directory contains entity definitions, such as
MDM_Member/MDM_Member.hil, and matching function definitions. Each
script is located in a folder of the same name (for example,
MDM_Member.hil is in the MDM_Member folder). Entities define the logical
objects that are used as input and produced as output. All entities must be
declared before they are used.
- Customer.hil
- The Customer entity represents the format that is accepted by
the high-level integration language deterministic matching. Customize
this file only if the attributes required for matching change.
- MDM_Member.hil
- The MDM_Member entity represents the format of a member that is
accepted by the probabilistic matching engine as part of the high-level
integration language probabilistic flow. Every attribute that the algorithm
matches on or manipulates must be present in the MDM_Member.hil file.
If you modify your PME algorithm, you must adjust this file to match
the changes in the matching algorithm.
- MDM_Person.hil
- The MDM_Person entity represents the format of the Big Match exported
entity. It must be customized in accordance with the Big Match schema.
- SDA_sentiment.hil
- The SDA_sentiment entity represents the format of the sentiment
extraction information that is produced by the IBM® Accelerator for Social Data Analytics. It
must be adjusted if the IBM Accelerator for Social Data Analytics
output format changes.
- SDA_social_media_profile
- The SDA_social_media_profile entity represents the format of the
binary (sequence file) profile information that is produced by the IBM Accelerator for Social Data
Analytics. It must be adjusted if the IBM Accelerator for Social Data Analytics output
format changes.
High-level integration language files
The hil directory contains the flow files. The files are listed in the
order in which they run, which outlines the flow of the Social MDM
Matching application.
- person2er.hil
- Converts Big Match profiles
into the format that can be processed by deterministic rules. The
script also preprocesses and prepares some attributes and statistics
to be included in the flow. Edit this file to:
- Customize mapping between Big Match profiles
and customer data (deterministic matching format).
- Collect extra entity-based statistics to be used in the matching
process.
- er.hil
- This script contains the deterministic rules of the high-level
integration language process. Edit this file to:
- Customize the existing deterministic rules and add new deterministic
rules to the flow.
- createMembers.hil
- This script is responsible for converting Big Match entity
profiles and IBM Accelerator
for Social Data Analytics social profiles into the PME member format
that can be processed by the probabilistic matching engine. Edit this
file to:
- Customize the parameters that are included in the probabilistic matching
process.
- pme.hil
- This script is a template for the probabilistic flow. This script
starts the data derivation comparison function from the PME algorithms.
The flows use the attributes that are defined in the PME algorithm.
Important: Do not edit this file.
- combineLinks.hil
- This script is responsible for post-processing the results of
the probabilistic rules with deterministic filtering and combining
the results of all matching rules within the flow. Edit this file
to:
- Add new probabilistic rules to the flow.
- Customize threshold settings for probabilistic rules.
- Customize existing deterministic filtering rules.
- prepareToLoadIntoHBase.hil
- This script prepares data to be loaded into HBase tables, from
which the Social UI can retrieve it. Edit this file to:
- Add new attributes to the UI if they are not included by
default or if the Social or Big Match profile
structure is altered.
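The run order and the edit restriction described in the list above can be captured in a small helper. The following Python sketch is illustrative only and is not part of the product: the function names are hypothetical, and it assumes that the flow files sit directly in the hil directory that you pass in.

```python
from pathlib import Path

# Flow scripts in the order in which this topic lists them (the run order).
FLOW_ORDER = [
    "person2er.hil",
    "er.hil",
    "createMembers.hil",
    "pme.hil",
    "combineLinks.hil",
    "prepareToLoadIntoHBase.hil",
]

# pme.hil is a template that this topic says must not be edited.
READ_ONLY = {"pme.hil"}


def missing_flow_files(hil_dir):
    """Return the flow scripts from FLOW_ORDER that are absent from hil_dir.

    hil_dir is an assumed location; pass the path of your own hil directory.
    """
    hil_dir = Path(hil_dir)
    return [name for name in FLOW_ORDER if not (hil_dir / name).is_file()]


def editable_flow_files():
    """Return the flow scripts that can be customized, in run order."""
    return [name for name in FLOW_ORDER if name not in READ_ONLY]
```

A check such as missing_flow_files can be run before a customization pass to confirm that every flow file is in place, and editable_flow_files gives the set of files that are safe to change.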
Each of the high-level integration language files
previously mentioned has an associated
.properties file.
Property files define the inputs into the matching
*.hil scripts.
A property file must be modified when a new input is added to
its script. In the following example, each named entry, such as
"CustomerData": { type: "sequence", filename: "ER/CustomerData.sequence"},
is a single input binding.
hil.jaql.bindings: { "SDA_TWProfile": { type: "text", filename: "IntegratedProfiles_Fused*"}, \
"CustomerData": { type: "sequence", filename: "ER/CustomerData.sequence"}, \
"TW_Customer_Links": { type: "sequence", filename: "ER/TW_Customer_Links.sequence"}, \
"ScoredLinksRule02": { type: "sequence", filename: "PME/algorithms/social/rule02/output/ScoredLinks.sequence"}, \
"ScoredLinksRule05": { type: "sequence", filename: "PME/algorithms/social/rule05/output/ScoredLinks.sequence"}, \
"ScoredLinksRule09": { type: "sequence", filename: "PME/algorithms/social/rule09/output/ScoredLinks.sequence"} }
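When a new input is added to a script, the binding list grows by one entry. The following Python sketch shows one way to generate the property value in the same layout as the example above. It is an illustration under stated assumptions: render_bindings is a hypothetical helper, and the "ExampleNewInput" binding and its file name are invented for the example and do not exist in the product.

```python
def render_bindings(bindings):
    """Render a hil.jaql.bindings property from {name: (type, filename)}.

    Entries are written one per line and joined with a trailing backslash,
    which is the line-continuation character in .properties files.
    """
    entries = [
        '"%s": { type: "%s", filename: "%s"}' % (name, kind, filename)
        for name, (kind, filename) in bindings.items()
    ]
    return "hil.jaql.bindings: { " + ", \\\n".join(entries) + " }"


bindings = {
    "CustomerData": ("sequence", "ER/CustomerData.sequence"),
    # Hypothetical new input, added for illustration only.
    "ExampleNewInput": ("sequence", "ER/ExampleNewInput.sequence"),
}
print(render_bindings(bindings))
```

Generating the value from a dictionary keeps the quoting, commas, and continuation characters consistent, which are the details most easily broken when the property is edited by hand.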