Configuring the batch processor

You configure the batch processor by editing the properties in the Batch.properties file.

MaxExceptionsAllowed=-1
This property defines the maximum number of exceptions allowed. When the batch processor detects that this number has been reached, it terminates the process. A value of -1 tells the controller to ignore any exceptions, meaning there will be no maximum.
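The threshold rule can be modeled in a few lines. The following Python sketch is illustrative only and is not the batch processor's code. The function name should_terminate is hypothetical, and the sketch assumes that the limit counts as reached when the running exception count equals or exceeds the configured value.

```python
def should_terminate(exception_count: int, max_exceptions_allowed: int = -1) -> bool:
    """Return True when the configured exception limit has been reached."""
    if max_exceptions_allowed == -1:
        # -1 disables the limit, so exceptions never stop the run.
        return False
    # Assumption: "reached" means the count is equal to or above the limit.
    return exception_count >= max_exceptions_allowed
```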
queueSize=100
This property defines the size of all queues used by the batch processor. The queueSize must be set to a value greater than zero. Once a queue reaches the queueSize limit, any operation that attempts to add an element to the queue will be blocked. Similarly, any operation that attempts to take an element from an empty queue will be blocked.
Note: Batch processor queues order elements in first in, first out order.
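The blocking behavior described above matches that of a bounded FIFO queue. The following Python sketch models it with the standard library queue.Queue class; it is not the batch processor's implementation. The helper names make_queue, try_put, and try_take are hypothetical, and the short timeouts exist only so that the demonstration returns instead of blocking indefinitely.

```python
import queue

QUEUE_SIZE = 100  # mirrors queueSize=100


def make_queue(size: int = QUEUE_SIZE) -> queue.Queue:
    """Create a bounded FIFO queue; the size must be greater than zero."""
    if size <= 0:
        # In Python, maxsize=0 would mean "unbounded", so reject it here.
        raise ValueError("queueSize must be greater than zero")
    return queue.Queue(maxsize=size)


def try_put(q: queue.Queue, item, timeout: float = 0.05) -> bool:
    """Try to add an item; return False if the queue stayed full."""
    try:
        q.put(item, timeout=timeout)  # blocks while the queue is full
        return True
    except queue.Full:
        return False


def try_take(q: queue.Queue, timeout: float = 0.05):
    """Try to take an item; return None if the queue stayed empty."""
    try:
        return q.get(timeout=timeout)  # blocks while the queue is empty
    except queue.Empty:
        return None
```

With a queue of size 2, a third try_put call returns False until an element is taken, and elements come back in the order in which they were added.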
ServerConfiguration.provider_url=<PROVIDER_URL>
This property defines the URL of the InfoSphere® MDM operational server. For example:
corbaloc:iiop:mdm-host:2809
ServerConfiguration.context_factory=com.ibm.websphere.naming.WsnInitialContextFactory
This property defines the WebSphere® Application Server context factory.
ReaderQueue=com.dwl.batchframework.queue.FileReaderQueue
This property defines the default reader queue used by the batch processor.
WriterQueue=com.dwl.batchframework.queue.WriterChainedQueue
This property defines the default writer queue used by the batch processor.
To log the request and response messages of failed transactions, add the WriterQueue=com.dwl.batchframework.queue.FailWriter property to batch_extension.properties.
Writer.ForceStopAtCriticalError=false
This property defines whether a batch instance is able to gracefully shut down if it encounters a critical error.
  • true – the batch instance will terminate immediately without waiting for the messages in the writer queue to be written to the output file.
  • false – the batch instance is able to gracefully shut down.
Writer.includeTimestamp=true
This property defines whether timestamps are written to the output files.
  • true – the time and date when the message was written to the output files is recorded in the output file entry.
  • false – no time and date is included in the output files.
Writer.DateFormat=yyyy-MM-dd HH:mm:ss,SSS
This property defines the format used to record timestamps in the output file, if applicable.
Note: If this property is not defined or is defined with an invalid date format, then the default date format will be used: yyyy-MM-dd HH:mm:ss,SSS.
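To show what a timestamp in the default format looks like, the following Python sketch renders a date and time with millisecond precision. It assumes that the pattern letters follow the conventions of Java date format patterns (SSS for milliseconds); the function name format_default_timestamp is hypothetical and is not part of the product.

```python
from datetime import datetime


def format_default_timestamp(moment: datetime) -> str:
    """Render a timestamp in the shape of yyyy-MM-dd HH:mm:ss,SSS."""
    base = moment.strftime("%Y-%m-%d %H:%M:%S")
    millis = moment.microsecond // 1000  # keep three digits of precision
    return f"{base},{millis:03d}"


print(format_default_timestamp(datetime(2024, 1, 2, 3, 4, 5, 678000)))
# 2024-01-02 03:04:05,678
```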
message_generator=com.ibm.mdm.batchframework.message.PlainMessageGenerator
This property defines the message generator that interprets an entity record or a line in an input file and generates a batch message.
home=<batch home>
This property defines the home directory of the batch processor. If the directory does not exist, the batch processor creates it, along with the following subdirectories:
  • input – The root directory of the default input files. If you are defining an input file with a relative path in an addTask XML request, the batch processor tries to expand the relative path from the input directory.
  • stage – The directory that stores all Stage, Result, and Restart files.
  • logs – The directory that stores all Activity log files.
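The directory layout and the relative path expansion can be sketched as follows. This Python model is illustrative only; the function names prepare_home and resolve_input_file are hypothetical, and leaving absolute paths unchanged is an assumption that the text above does not confirm.

```python
from pathlib import Path

SUBDIRECTORIES = ("input", "stage", "logs")


def prepare_home(home: Path) -> None:
    """Create the home directory and its subdirectories if they are missing."""
    for name in SUBDIRECTORIES:
        (home / name).mkdir(parents=True, exist_ok=True)


def resolve_input_file(home: Path, file_name: str) -> Path:
    """Expand a relative input path from <home>/input."""
    path = Path(file_name)
    # Assumption: absolute paths are used as given.
    return path if path.is_absolute() else home / "input" / path
```

For example, with a home directory of /opt/batch (a made-up location), the relative name data/in.xml resolves to /opt/batch/input/data/in.xml.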
instanceName=batch01
This property defines the name of the batch processor instance. The default name is batch01.

Only one instance of the batch processor is allowed in a single location.

Tip: To run another instance at the same time, you must copy the batch processor package to another location and give it a different instance name.

A batch job can only be restarted on the same instance where it was originally processed.

resultCategorizer=com.ibm.mdm.batchframework.message.BatchMessageCategorizer
This property defines the categorizer that the batch processor uses to determine if each processed message is a success or a failure. There are two included categorizers:
  • com.ibm.mdm.batchframework.message.BatchMessageCategorizer is the default categorizer, and uses the message status to determine if a processed message is a success or failure.
  • com.ibm.mdm.batchframework.message.ResultCodeMessageCategorizer uses the value of the <ResultCode> tag to determine if a processed message is a success or failure.
resultCategorizer.options=restart
This property defines how the batch processor is restarted:
  • restart is the default value, and indicates a normal batch job restart from the place where a stopped batch job halted processing.
  • restartWithErrors indicates that a batch job restart will also retry any failed transactions.
progressStatus.refreshTime=60
This property defines the heartbeat interval time. For details, see Monitoring the batch processing heartbeat.
duringChangeThreshold=10
This property defines the threshold time, in seconds, that the batch processor instance waits to pick up a batch job after it has been added or updated. When the batch processor instance detects that a job has been recently added or updated, it waits for the length of time defined in this threshold before beginning work on the job chain.
Important: This threshold should not be adjusted.
maxCommentLength=500
This property defines the maximum character length of the Task Comment string. If a Task Comment exceeds this length, the comment will be truncated.
Tip: To prevent a long comment from being cut off when it is stored in ALERT.DESCRIPTION (which has a limit of 1000 characters), you can modify the ALERT.DESCRIPTION definition so that it matches the value of maxCommentLength.
Important: When considering the storage space for the comment string, do not forget to take into account languages that use multibyte characters, if applicable to your implementation.
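The difference between character length and storage size is easy to underestimate. The following Python sketch truncates a comment to 500 characters and then measures its size in bytes. It assumes UTF-8 encoding, which might not match your database configuration, and the function names truncate_comment and utf8_size are hypothetical.

```python
MAX_COMMENT_LENGTH = 500  # mirrors maxCommentLength=500


def truncate_comment(comment: str, max_length: int = MAX_COMMENT_LENGTH) -> str:
    """Cut the comment to at most max_length characters."""
    return comment[:max_length]


def utf8_size(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


ascii_comment = truncate_comment("a" * 600)
cjk_comment = truncate_comment("日" * 600)  # each character takes 3 bytes in UTF-8
print(len(ascii_comment), utf8_size(ascii_comment))  # 500 500
print(len(cjk_comment), utf8_size(cjk_comment))      # 500 1500
```

Both comments are 500 characters long, but the second needs three times the storage, which matters if the column length is measured in bytes.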
mdm.database.uri=
This property defines the database connection URI.

For example, when using a data source:

mdm.database.uri=jndi:jdbc/DWLCustomer

For example, when using a JDBC URL:

mdm.database.uri=jdbc:db2://localhost:50000/MDMDB;user=db2admin;password=db2admin
mdm.database.prop.sslConnection=
This property determines whether the database connection uses SSL.

For example:

mdm.database.prop.sslConnection=true
database.jdbc.driver=
This property defines the fully qualified Java™ class name of the JDBC driver. This property is mandatory when you use a JDBC URL.

For example, for IBM® DB2®:

database.jdbc.driver = com.ibm.db2.jcc.DB2Driver
jta.jndi=jta/usertransaction
This property defines the Java Transaction API UserTransaction that the batch processor will use to start a transaction.
runtime.override.input.csv=ReaderQueue=com.ibm.mdm.batchframework.bulkprocessing.queue.TitledSingleLineCSVFileReaderQueue; message_generator=com.ibm.mdm.batchframework.message.CSVStringMessageGenerator
runtime.override.input.db=ReaderQueue=com.ibm.mdm.batchframework.bulkprocessing.queue.DatabaseReaderQueue; message_generator=com.ibm.mdm.batchframework.message.CSVStringMessageGenerator

runtime.override is the keyword used for characteristics that are overridden at runtime. The text after the first equals sign is loaded as a set of key and value pairs, each a property, separated by semicolons. If several properties have the same key, the following order determines which one takes effect (later sources override earlier ones):

  1. Batch.properties
  2. Batch extension properties
  3. runtime.override.input.<inputType>
  4. runtime.override.jobdef.<TaskDefinitionId>
  5. RuntimeOverride specified in the batch job definition comment
  • input.csv defines the runtime override for batch jobs that read input from a CSV file. The batch job definition comment contains the <File> tag.
    • TitledSingleLineCSVFileReaderQueue can read a single-line, CSV-formatted input file with enclosing quotation marks. Each line of the input file is treated as a record with multiple columns that are separated by commas.
    • TitledCSVFileReaderQueue is an alternate input definition that can read an RFC 4180-compliant, CSV-formatted input file. For details, see RFC 4180 at http://tools.ietf.org/html/rfc4180. If necessary, you can also replace it with the name of a customized Java class that meets your business requirements.
  • input.db defines the runtime override for batch jobs that read a database as the input. The batch job definition comment contains either dynamic search SQL or a SQLOverride.
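The parsing and precedence rules for runtime.override can be sketched as follows. This Python model is illustrative only and is not how the batch processor is implemented; the function names parse_override and resolve are hypothetical. It splits the override value on semicolons, splits each pair on its first equals sign, and lets later layers replace earlier ones.

```python
def parse_override(value: str) -> dict:
    """Split 'k1=v1; k2=v2' into a dict, splitting each pair on its first '='."""
    pairs = {}
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        key, _, val = part.partition("=")
        pairs[key.strip()] = val.strip()
    return pairs


def resolve(*layers: dict) -> dict:
    """Merge layers in order; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Layer 1: values taken from the Batch.properties defaults shown earlier.
batch_properties = {
    "ReaderQueue": "com.dwl.batchframework.queue.FileReaderQueue",
    "message_generator": "com.ibm.mdm.batchframework.message.PlainMessageGenerator",
    "queueSize": "100",
}
# Layer 3: the runtime.override.input.csv value.
csv_override = parse_override(
    "ReaderQueue=com.ibm.mdm.batchframework.bulkprocessing.queue."
    "TitledSingleLineCSVFileReaderQueue; "
    "message_generator=com.ibm.mdm.batchframework.message.CSVStringMessageGenerator"
)
effective = resolve(batch_properties, csv_override)
```

In this example, effective takes ReaderQueue and message_generator from the CSV override and keeps queueSize from Batch.properties, because the override does not define that key.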

runtime.override.jobdef.10=ParseAndExecConfiguration.OperationType=All; ParseAndExecConfiguration.requesterName=cusadmin; ParseAndExecConfiguration.requesterLanguage=100; ParseAndExecConfiguration.Parser=TCRMService; ParseAndExecConfiguration.Constructor=TCRMService; ParseAndExecConfiguration.CompositeTxn=no
This property defines other properties that use the runtime override function for batch jobs with a task definition ID of 10.

For each out-of-the-box job definition ID (except 100, which is reserved for implicitly created batch jobs), there is a corresponding runtime override property. You can customize the job definition IDs and add new properties for newly introduced job definition IDs. This enables you to adjust the runtime characteristics of all jobs with the specific job definition ID.

resultfile.sort.max_chunk_size=1000000
This property defines the number of results in the result file that are sorted in memory by the batch processor as one portion, and are stored in a single file. The larger the value of this property, the faster the sorting will be. However, a larger value also uses more memory.
Attention: If this value is too large, it might cause an Out of Memory error.
resultfile.sort.max_open_chunks=200
This property defines the number of sorted chunk files that are merged in a single pass. The batch processor opens these files at the same time, so the value must not exceed the operating system's limit on open files.
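Together, the two resultfile.sort properties describe a chunked sort followed by a bounded merge. The following Python sketch shows that general technique (an external merge sort) under simplifying assumptions: in-memory lists stand in for the chunk files, and the function names sort_in_chunks and merge_chunks are hypothetical. It is not the batch processor's actual sorting code.

```python
import heapq
from typing import Iterable, List


def sort_in_chunks(results: Iterable[str], max_chunk_size: int) -> List[List[str]]:
    """Sort the results in memory, one chunk of max_chunk_size at a time."""
    chunks, current = [], []
    for item in results:
        current.append(item)
        if len(current) == max_chunk_size:
            chunks.append(sorted(current))  # one sorted chunk per "file"
            current = []
    if current:
        chunks.append(sorted(current))
    return chunks


def merge_chunks(chunks: List[List[str]], max_open_chunks: int) -> List[str]:
    """Merge sorted chunks, combining at most max_open_chunks per pass."""
    if max_open_chunks < 2:
        raise ValueError("max_open_chunks must be at least 2")
    while len(chunks) > 1:
        # Each pass merges groups of chunks, so fewer chunks remain afterward.
        chunks = [
            list(heapq.merge(*chunks[i:i + max_open_chunks]))
            for i in range(0, len(chunks), max_open_chunks)
        ]
    return chunks[0] if chunks else []
```

A larger chunk size produces fewer chunks and fewer merge passes at the cost of memory, and a larger merge width reduces the number of passes at the cost of more files open at once.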
taskcategory=8
This property defines the task category type used to create batch jobs.
job.duplicateKeyError=<ReasonCode>12</ReasonCode>
This property defines the error phrase that indicates a duplicate key error while creating a batch job.
job.requesterName=cusadmin
This property defines the user for task management in the batch processor. The tasks and comments will be created or updated by the user defined here.
priority_min=10001
This property defines the minimum priority type used for task management in the batch processor. Together with the related priority_max property, it defines the range of priority types that can be used for task management. These values represent the code types in the CdPriorityTp table.
priority_max=10020
This property defines the maximum priority type used for task management in the batch processor. Together with the related priority_min property, it defines the range of priority types that can be used for task management. These values represent the code types in the CdPriorityTp table.