Batch processing readers and writers

The batch processor is not dependent on any specific batch input or output source or data format. Instead, it has externalized the components that perform reading, parsing, response construction, and writing tasks.

The batch processor is shipped with prebuilt readers and writers that can be used without customization. The InfoSphere® MDM Request/Response framework also contains some parsers and constructors that can be used to help process batch jobs.

If your organization’s batch input and output structures can be handled by a combination of the prebuilt readers, writers, parsers, and constructors, then you do not need to develop any customized external components. For example, if the batch input is XML in which each line contains one XML request, and the expected output is also XML in which each line contains one XML response, then you can use the prebuilt components to handle this input and output. However, if the input structure, the output structure, or both cannot be handled with the available components, you must develop new pluggable components. For more information, see the section on building custom batch jobs.
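
The one-request-per-line layout described above can be illustrated with a minimal Java sketch. It is not the FileReaderQueue implementation; the class name LineRecordSketch, the method readRecords, and the choice to skip blank lines are all assumptions made for illustration only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the InfoSphere MDM API.
class LineRecordSketch {

    // Treats every non-blank line of the input as exactly one record.
    static List<String> readRecords(Reader source) {
        List<String> records = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Assumption: blank lines carry no record and are skipped.
                if (!line.trim().isEmpty()) {
                    records.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        // Two single-line XML requests; the element content is a placeholder.
        String input = "<TCRMService><!-- request 1 --></TCRMService>\n"
                     + "<TCRMService><!-- request 2 --></TCRMService>\n";
        for (String record : readRecords(new StringReader(input))) {
            System.out.println("record: " + record);
        }
    }
}
```

Input that spreads one request over several lines would be split into several partial records by this approach, which is the kind of structure that calls for a different reader or a custom component.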

The default readers and writers delivered with InfoSphere MDM are:

File line reader
The file line reader reads batch input from a file where each line in the file represents one record. This reader is implemented by the com.dwl.batchframework.queue.FileReaderQueue class.
Titled CSV file reader
The titled CSV file reader reads CSV formatted files in which the first line is a title that must match the batch job definitions, and each line following the title line represents one record. This reader is implemented by the com.ibm.mdm.batchframework.bulkprocessing.queue.TitledSingleLineCSVFileReaderQueue class.
Restriction: The standard titled CSV file reader cannot read input file layouts in which:
  • A record is split across multiple lines.
  • More than one record appears on a single line.
Tip: InfoSphere MDM includes an enhanced titled CSV file reader that can read both of these file layouts. However, the enhanced titled CSV file reader might process batches more slowly because of the additional computation required. The enhanced reader is implemented by the com.ibm.mdm.batchframework.bulkprocessing.queue.TitledCSVFileReaderQueue class.
Extended file reader
The extended file reader can read a variety of XML requests that conform to the InfoSphere MDM platform service request schemas, such as TCRMService, DWLAdminService, or DWLCompositeServiceRequest, as well as any XML request that is configurable through the properties file. In the process, the parser populates all of the configuration properties that tell the server which request parser, response constructor, or target application to use. The extended file reader is implemented by the com.ibm.mdm.batchframework.queue.XFileReaderQueue class.

To use this reader, in addition to the usual property definitions, you must include a TxTokens property in the batch_extension.properties file. Set its value to the top-level element of the XML request (that is, TCRMService, DWLAdminService, or DWLCompositeServiceRequest). For example: ParseAndExecConfiguration.TxTokens=TCRMService. The value of the TxTokens property cannot be used anywhere in the body of the request XML. Default namespace XML files are supported, but XSD files are not.

Database reader
The database reader reads inputs directly from a database using a query that is defined in a batch job. This reader is implemented by the com.ibm.mdm.batchframework.bulkprocessing.queue.DatabaseReaderQueue class.
File line writer
The file line writer writes the batch job output to a file formatted so that each record is on a separate line. This writer is implemented by the com.dwl.batchframework.queue.FileWriterQueue class.
Chained file writer
The chained file writer writes the batch job output to one or more files. The number of output files is configured using the Writer.properties file. The chained file writer is implemented by the com.dwl.batchframework.queue.WriterChainedQueue class.
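
The layout that the standard titled CSV file reader expects (a title line followed by exactly one record per line) can be sketched in Java as follows. This is not the TitledSingleLineCSVFileReaderQueue implementation. The class TitledCsvSketch, the method parse, the exact-match comparison of titles, and the naive comma split (no quoted fields) are assumptions for illustration; how the product actually matches titles against batch job definitions is not specified here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of the InfoSphere MDM API.
class TitledCsvSketch {

    // Validates the title line, then returns the fields of each record line.
    static List<String[]> parse(List<String> lines, List<String> expectedTitles) {
        if (lines.isEmpty()) {
            throw new IllegalArgumentException("missing title line");
        }
        // Assumption: titles must match the expected names exactly and in order.
        List<String> titles = Arrays.asList(lines.get(0).split(",", -1));
        if (!titles.equals(expectedTitles)) {
            throw new IllegalArgumentException(
                "title line " + titles + " does not match " + expectedTitles);
        }
        List<String[]> records = new ArrayList<>();
        for (int i = 1; i < lines.size(); i++) {
            // Naive split: quoted fields containing commas are not handled.
            String[] fields = lines.get(i).split(",", -1);
            // A differing field count suggests a record split across lines
            // or several records on one line.
            if (fields.length != titles.size()) {
                throw new IllegalArgumentException(
                    "line " + (i + 1) + " has " + fields.length
                    + " fields, expected " + titles.size());
            }
            records.add(fields);
        }
        return records;
    }

    public static void main(String[] args) {
        // Column names and values are made-up sample data.
        List<String> lines = Arrays.asList("id,name", "1,Ann", "2,Bob");
        List<String[]> records = parse(lines, Arrays.asList("id", "name"));
        System.out.println(records.size() + " records read");
    }
}
```

The field-count check is only a rough signal of a layout problem. A reader that must reassemble records split across lines, or separate several records on one line, has to do more work per line, which is consistent with the slower processing noted for the enhanced reader.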

For more information on available parsers and constructors, see Request and Response Framework.