Java batch persistence configuration

Java™ batch can configure a job repository to use a persistent store to persist status, checkpoints, and application persistent data across multiple runs of a job instance. By supplying the restarted job with the appropriate data, the persistent store enables a job instance to be restarted if an earlier run fails or must be stopped.

Job repository

A job repository includes either a persistent store that is supported by a database, or a memory-based persistent store that is intended for development only. The Liberty batch function allows flexibility in defining an environment where multiple servers may use a common persistent store, or job repository. It is important to understand that many functions are scoped to all servers that are configured to use the same job repository. For more information, see Job repository.

Java batch memory-based persistence configuration

The batch persistence allows a job instance to be restarted if the execution ends in a FAILED or STOPPED state. In the absence of database configuration for batch persistence, Java batch defaults to memory-based persistence to track status, checkpoints, and application persistent data across multiple runs of a job instance.

Note: The default memory-based persistence comes with a significant limitation. If the batch container runtime environment or the server JVM crashes or is restarted, persistence is lost. This function is intended for development purposes only and is not to be considered for production systems or critical batch processing support.
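
As a minimal sketch of this default, assuming the batch-1.0 feature name (which this topic does not state), a server configuration that enables batch but defines no DefaultDataSource and no batchPersistence element falls back to memory-based persistence:

<!-- Illustrative only: batch enabled, no database configuration for batch persistence. -->
<server>
    <featureManager>
        <feature>batch-1.0</feature>
    </featureManager>
    <!-- No DefaultDataSource, databaseStore, or batchPersistence element is defined,
         so job status and checkpoints are kept in memory and are lost on restart. -->
</server>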

Java batch database persistence configuration

The batch persistence can be configured through a database store. The database store references a data source, which in turn references a JDBC driver, a specific database location, and other database custom properties. The qualified names of the batch persistence tables can be configured by using the schema and tablePrefix attributes of the database store itself.

A default database store, named defaultDatabaseStore, is provided and is activated by configuring the default data source, named DefaultDataSource.

Alternatively, you can configure a different database store by using a databaseStore element, and configure batch persistence to use it by adding a batchPersistence element with a jobStoreRef attribute that references your databaseStore.

Java batch database connection pooling considerations

The Java batch runtime in general follows a get-use-close pattern when a call is made to the database and typically does not hold onto JDBC connections for longer than the time it is using them. This means that the question of the number of needed connections to run a specific number of jobs and partitions on a server is a decision given almost entirely to the administrator.

To avoid deadlock, you do not need to set the connection pool size to a value greater than or equal to the number of running jobs and partitions. However, resource contention caused by too few connections might still lead to suboptimal performance. Some minimum number of connections, likely greater than 1, is also required to run special paths, such as activation of the batch components. Administrators can start with the default values and use tools such as connection pool metrics and monitoring to balance resource consumption against performance. These considerations apply to the batch runtime's own usage of JDBC connections. Administrators might still need to consider JDBC connection pooling separately as it is used by application code.
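
As a starting point for this kind of tuning, the following sketch sets explicit pool limits on a batch data source through a nested connectionManager element. The id BatchDS and the values 20 and 2 are illustrative assumptions, not recommendations, and the remaining data source properties are elided:

<dataSource id="BatchDS">
    <jdbcDriver libraryRef="DerbyLib" />
    <!-- Hypothetical pool limits; adjust them based on connection pool metrics. -->
    <connectionManager maxPoolSize="20" minPoolSize="2" />
    ...
</dataSource>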

Automatic versus manual creation of database tables

By default, the batch runtime auto-creates non-existent tables in the batch persistence database store. This default auto-create behavior also extends to existing tables: new columns are created when maintenance that contributes the new column definitions has been applied.

Alternatively, the ddlGen script can be used to generate a DDL based on the server configuration. If necessary, the DDL can be customized before manually creating the tables. This DDL also incorporates server configuration such as schema and tablePrefix and contains the appropriate SQL for the database type of the data source referenced by the database store.

Note: Customized DDL must use positive integer primary key IDs. As a limitation of database persistence, Java batch does not accept negative or zero integer IDs in the primary key identity columns. The Java batch container runtime runs only jobs whose job IDs in those columns are positive integers.

The auto-creation of tables can be disabled by using the createTables="false" attribute on the databaseStore. Use this option to ensure that only your manually created tables are used: if the batch runtime unexpectedly does not find them, it does not auto-create tables in their place.

To learn about manually creating tables by customizing the generated DDL, including possible customizations, see the Liberty Batch - Job Repository Configuration white paper. Although this white paper is for DB2® on z/OS® operating systems, you might find the information useful for other databases and platforms.

Persistence configuration samples

Default database store with auto-created tables, which are configured with Derby database RUNTIMEDB:

<!-- Derby JDBC driver --> 
<library id="DerbyLib"> 
    <fileset dir="${server.config.dir}/resources/derby" /> 
</library> 

<!-- Data source for batch tables, and possibly other components. --> 
<dataSource id="DefaultDataSource"> 
    <jdbcDriver libraryRef="DerbyLib" /> 
    <properties.derby.embedded  
        databaseName="${server.config.dir}/resources/RUNTIMEDB"     
        createDatabase="create" 
        user="user" password="pass" /> 
</dataSource> 

Batch-specific database store with manually created tables, custom schema, table prefix, batch data source:

<batchPersistence jobStoreRef="BatchDatabaseStore"/>

<!-- DB Store config only used by batch components -->
<databaseStore id="BatchDatabaseStore" dataSourceRef="BatchDS"
    createTables="false" schema="HLQ" tablePrefix="JB1"/> 

<!-- Data source for batch tables --> 
<dataSource id="BatchDS"> 
    <jdbcDriver libraryRef="DerbyLib" /> 
     ...
</dataSource> 

Default database store with auth config, custom schema, auto-created tables, default data source:

<!-- DB Store used by batch and possibly other runtime components. --> 
<databaseStore id="defaultDatabaseStore" schema="HLQ">
    <authData user="user1" password="password1"/>  
</databaseStore>

<!-- Data source for batch tables, and possibly other components. -->
<dataSource id="DefaultDataSource">
    <jdbcDriver libraryRef="DerbyLib" /> 
    ...
</dataSource> 

Reference

For a discussion of manually creating tables by customizing the generated DDL, including possible customizations, see the Liberty Batch - Job Repository Configuration white paper: https://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102716