Batch processing readers and writers
The batch processor is not dependent on any specific batch input or output source or data format. Instead, it has externalized the components that perform reading, parsing, response construction, and writing tasks.
The batch processor is shipped with prebuilt readers and writers that can be used without customization. The InfoSphere® MDM Request/Response framework also contains some parsers and constructors that can be used to help process batch jobs.
If your organization’s batch input and output structures can be handled by a combination of the prebuilt readers, writers, parsers, and constructors, then you do not need to develop any customized external components. For example, if the batch input is in an XML data format where each line contains one XML request, and the expected output is also XML with each line containing one XML response, then you can use the prebuilt components to handle this input and output. However, if the input structure, the output structure, or both cannot be handled with the available components, you must develop new pluggable components. For more information, see the section on building custom batch jobs.
The default readers and writers delivered with InfoSphere MDM are:
- File line reader
- The file line reader reads batch input from a file where each line in the file represents one record. This reader is implemented by the com.dwl.batchframework.queue.FileReaderQueue class.
- Titled CSV file reader
- The titled CSV file reader reads CSV-formatted files in which the first line is a title that must match the batch job definitions, and each line following the title line represents one record. This reader is implemented by the com.ibm.mdm.batchframework.bulkprocessing.queue.TitledSingleLineCSVFileReaderQueue class.
Restriction: The standard titled CSV file reader cannot read input file layouts in which:
- A record is split across multiple lines.
- More than one record appears on a single line.
Tip: InfoSphere MDM includes an enhanced titled CSV reader that can read both of these file layouts. However, the enhanced titled CSV file reader might process batches more slowly because of the additional computation required. The enhanced reader is implemented by the com.ibm.mdm.batchframework.bulkprocessing.queue.TitledCSVFileReaderQueue class.
- Extended file reader
- The extended file reader can read a variety of XML requests that conform to the InfoSphere MDM platform service request schemas, such as TCRMService, DWLAdminService, and DWLCompositeServiceRequest, or any XML request that is configurable through the properties file. In the process, the parser populates all configuration properties that are necessary to inform the server of the request parser, response constructor, or target application. The extended file reader is implemented by the com.ibm.mdm.batchframework.queue.XFileReaderQueue class.
To use this reader, in addition to the usual property definitions, the batch_extension.properties file must include a TxTokens property whose value is the top-level element of the XML request (that is, TCRMService, DWLAdminService, or DWLCompositeServiceRequest). For example: ParseAndExecConfiguration.TxTokens=TCRMService. The value of the TxTokens property cannot be used anywhere in the body of the request XML. XML files that use a default namespace are supported, but XSD files are not.
- Database reader
- The database reader reads inputs directly from a database using a query that is defined in a batch job. This reader is implemented by the com.ibm.mdm.batchframework.bulkprocessing.queue.DatabaseReaderQueue class.
- File line writer
- The file line writer writes the batch job output to a file formatted so that each record is on a separate line. This writer is implemented by the com.dwl.batchframework.queue.FileWriterQueue class.
- Chained file writer
- The chained file writer writes the batch job output to one or more files. The number of output files is configured using the Writer.properties file. The chained file writer is implemented by the com.dwl.batchframework.queue.WriterChainedQueue class.
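The layout restriction on the standard titled CSV file reader can be checked before a batch run. The following Java code is a minimal sketch and is not part of InfoSphere MDM. The class name TitledCsvLayoutCheck and its methods are hypothetical, and the sketch assumes comma-separated fields with double-quote quoting, which might not match your input files. It treats any line whose field count differs from the title line, or that leaves a quoted field open, as a sign that a record is split across lines or that more than one record shares a line.

```java
import java.util.List;

// Hypothetical helper, not an InfoSphere MDM class.
public class TitledCsvLayoutCheck {

    /**
     * Counts the comma-separated fields on one line.
     * Returns -1 if a quoted field is still open at the end of the line,
     * which suggests that the record continues on the next line.
     */
    public static int countFields(String line) {
        int fields = 1;
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                // An escaped quote ("") toggles twice, so the state is preserved.
                inQuotes = !inQuotes;
            } else if (c == ',' && !inQuotes) {
                fields++;
            }
        }
        return inQuotes ? -1 : fields;
    }

    /**
     * Returns the 1-based number of the first line that does not fit the
     * "title line, then one record per line" layout, or 0 if every line fits.
     * An empty input has no title line and is reported as line 1.
     */
    public static int firstBadLine(List<String> lines) {
        if (lines.isEmpty()) {
            return 1;
        }
        int expected = countFields(lines.get(0));
        if (expected < 0) {
            return 1;
        }
        for (int i = 1; i < lines.size(); i++) {
            if (countFields(lines.get(i)) != expected) {
                return i + 1;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Assumed sample data: the third line holds two records.
        List<String> sample = List.of("id,name", "1,Ann", "2,Bo,3,Cy");
        System.out.println("First line that breaks the layout: " + firstBadLine(sample));
    }
}
```

If a check of this kind reports a nonzero line number, the enhanced titled CSV file reader is the more likely fit for that input file.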
For more information on available parsers and constructors, see Request and Response Framework.
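The TxTokens rule for the extended file reader can also be checked ahead of time. The following Java code is a minimal sketch under stated assumptions and is not part of InfoSphere MDM. The class name TxTokensCheck and its methods are hypothetical, and the interpretation of "body" as everything between the top-level opening and closing tags is an assumption. Only the property key ParseAndExecConfiguration.TxTokens comes from the text above. The sketch reads the property with java.util.Properties and then uses plain string matching, not an XML parser, to confirm that the token does not occur between the first opening tag and the last closing tag of that name.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not an InfoSphere MDM class.
public class TxTokensCheck {

    // Property key taken from the example in the documentation.
    static final String KEY = "ParseAndExecConfiguration.TxTokens";

    /** Reads the TxTokens value from properties text, or returns null if it is absent. */
    public static String readTxToken(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String value = props.getProperty(KEY);
        return value == null ? null : value.trim();
    }

    /**
     * Finds the first opening tag and the last closing tag named by the token,
     * and returns true only if the token does not occur in the text between them.
     * This is simple string matching under the assumptions stated above.
     */
    public static boolean tokenAbsentFromBody(String xml, String token) {
        int start = xml.indexOf("<" + token);
        int end = xml.lastIndexOf("</" + token + ">");
        if (start < 0 || end < start) {
            return false; // the token is not the enclosing element
        }
        int bodyStart = xml.indexOf('>', start);
        if (bodyStart < 0 || bodyStart > end) {
            return false;
        }
        String body = xml.substring(bodyStart + 1, end);
        return !body.contains(token);
    }

    public static void main(String[] args) {
        String token = readTxToken("ParseAndExecConfiguration.TxTokens=TCRMService\n");
        // Assumed, heavily simplified request; real requests carry more elements.
        String request = "<TCRMService><RequestControl/></TCRMService>";
        System.out.println("Token absent from body: " + tokenAbsentFromBody(request, token));
    }
}
```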