Migrating master data from internal format to XML format

You must convert master data to XML when migrating from IBM® InfoSphere® Master Data Management Server for Product Information Management Versions 6.0.0, 9.0.0, or 9.1.0 to IBM InfoSphere Master Data Management Collaboration Server Version 11.4.

Before you begin

  1. Create XML table spaces and buffer pools as described in the installation guide.
  2. Ensure that the database user for InfoSphere MDM Collaboration Server application has appropriate database privileges as mentioned in the installation guide.
  3. Stop the InfoSphere MDM Collaboration Server application on the local server.
  4. Run the delete_old_versions.sh script to delete all the old versions. Use your company code and the current date as values for the code and end_date parameters.
  5. Optional - Enhance memory available to the Java™ process: Running the data conversion script is a memory intensive process. It is recommended to provide 2 GB of memory to the Java process by updating the custom_java_options parameter in the $TOP/bin/conf/env_settings.ini file. Run the configureEnv.sh after you modify the custom_java_options parameter to ensure that the updated new memory settings come into effect.
    custom_java_options=-Xmx2048m -Xms256m

About this task

The data migration script requires twice the number of DB connections than the number of execution threads. For example, if you use 10 threads in the command-prompt option, then it requires 20 DB connections. You must configure the number of DB connections by using these two configuration parameters:
  • db_maxConnection
  • db_maxConnection_default

Both of the parameters are available in the common.properties file. If the values of these parameters are already equal to or greater than the required DB connections, then you do not need to change the values. If the values of these parameters are less than the required DB connections, then you must increase the values to improve speed.

If you have more the 100,000 master data entries in the catalogs, increase the number threads. For example, you can use 50 or 100 threads to complete data migration faster. The highest number of threads you can use depends on the following variables in InfoSphere MDM Collaboration Server. You can evaluate the optimum number of threads by running the migration script in a test environment.
  • Number of processor cores available
  • Number of specs defined
  • Available memory

All of the specs that are defined in the InfoSphere MDM Collaboration Server environment are retrieved and used in the data conversion process. If you have more than 100 specs, update the spec caching parameters to the appropriate values to prevent frequent database trips and improve overall script performance.

Procedure

  1. Get the spec count in InfoSphere MDM Collaboration Server by running the following command.
    $PERL5LIB/runSQL.pl --sql_command="select count(*) from spc;"
  2. Update these parameters with the values retrieved from the query in Step 1. These parameters are available in the mdm-cache-config.properties file in the $TOP/etc/default directory.
    specCache_KEY_START_VERSION_TO_VALUE.maxElementsInMemory=<spec count>
    
    specCache_KEY_TO_CURRENT_START_VERSION.maxElementsInMemory=<spec count>
    
    specCache_KEY_VERSION_TO_START_VERSION.maxElementsInMemory=<spec count>
    Note: Running the configureEnv.sh script after you update the mdm-cache-config.properties file overwrites the parameters with default values.
  3. Run the data migration shell script, migrateDataToXml.sh. This file is in the $TOP/bin/migration directory.
    migrateDataToXml.sh --company=<company name> --threads=<number of threads>
    Where
    --company
    Specifies the company code that is used to log on to InfoSphere MDM Collaboration Server.
    --threads
    Specifies the number of threads. Use a larger number of threads to improve speed but ensure that enough DB connections exist. The default is 1. It is recommended to provide at least 10 threads.
    For example,
    migrateDataToXml.sh --company=ibm --threads=50

What to do next

You can verify that the script worked as intended by observing the errors that are displayed in the console and logged in the default.log file in the $TOP/logs/default directory. A list of entries that failed to migrate is displayed at the end of execution.

The data migration script updates the database in batches. If there is an error, the entire batch is rolled back. Hence the number of entries that failed to migrate is more than the actual failures. Update the index_regeneration_batch_size parameter value to 1 in the common.properties file, then run the data migration script again to narrow down the entries with actual failures. A table of possible reasons for errors during data migration and recommended actions is given:
Table 1. Possible reasons for errors and recommended actions
Reasons for error during data migration Recommended action
Data migration displays the following error:

java.sql.SQLException:
ORA-00600: internal error
code, arguments: [KGHALP1],
[0x000000000], [], [], [],
[], [], [], [], [], [], []

Ensure that you are using the correct patch level for the database. If required, upgrade the database then run the data migration script again.
Data migration displays the following error:

java.sql.SQLException:
ORA-31061: XDB error:
XML event error
ORA-19202: Error occurred
in XML processing
In line 46 of orastream:
LPX-00241: entity reference
is not well formed

One of the attribute values in the entry has the "&" character or an invalid value. Update the entry to remove the invalid value from the item edit interface of InfoSphere MDM Collaboration Server and then attempt data migration again.


Last updated: 22 Jul 2017