Vol. 6, Issue 12 - 10 Dec 2008
By Snehal S. Antani
WebSphere XD Technical Lead
IBM Software Services for IBM WebSphere solutions
Reto Estermann
Director and Head of Finance Repository Solutions
Swiss Reinsurance
Swiss Reinsurance (Swiss Re) has worked closely with IBM product development to design and test IBM WebSphere XD Compute Grid for z/OS – a modern Java-based batch-processing platform for the enterprise. Read on to learn more about Swiss Re’s implementation and how you can get involved in a growing community of users who are sharing best practices and helping to influence the product’s direction.
The Swiss Re and IBM collaboration
As the world’s largest reinsurance company, Swiss Re has a long history of collaboration with IBM. Swiss Re adopted DB2 early and even located its US headquarters across the street from IBM in Armonk, NY.
Swiss Re and IBM joined forces early on to design IBM WebSphere XD Compute Grid. A true partnership, the collaboration ensured that Swiss Re’s requirements, concerns, and issues were quickly resolved, while IBM had a sounding board to identify new product features and validate ideas of interest to the z/OS community, where batch processing is tremendously important.
Swiss Re’s finance IT group is responsible for over 50 percent of corporate-wide batch jobs and executes over 6,000 batch jobs per day for reporting and data reconciliation. Due to acquisition-related growth, the finance IT group needed to modernize its batch-processing infrastructure with the objectives of reducing execution costs and improving agility without sacrificing performance. The group chose Java as its strategic application language and embarked on a project to move from traditional COBOL-based batch applications to Java-based batch.
The decision was based on:
- A shift from general-purpose MIPS on z/OS to IBM System z Application Assist Processor (zAAP)-eligible MIPS, which reduce costs
- Concerns about the diminishing availability of COBOL skills, which could impact the group’s ability to deliver new functions
- A desire to improve time to market by leveraging modern design patterns, tooling, open source libraries, and container-managed infrastructure services to speed development
- A move towards a shared-services infrastructure run on WebSphere Application Server, where the group has common batch and online transaction processing (OLTP) components, such as business services, application development process, testing and deployment infrastructure, operational management, disaster recovery and security.
The company also chose to strategically commit to the IBM System z platform, where over 20 terabytes of the company’s enterprise data is now stored. Swiss Re was one of the earliest adopters of the new IBM System z10 server.
The resulting Swiss Re project is called COBOL-Java Konversion (COJAK).
Batch processing with Compute Grid
IBM WebSphere XD Compute Grid delivers a modern batch-processing platform for enterprises such as Swiss Re. How does it do that?
First, it supports today’s batch processing needs, including:
- 24x7 batch processing, where batch can be executed concurrently with online transaction processing (OLTP)
- Sharing business services across batch and OLTP, where a service can be executed in multiple execution environments without sacrificing efficiencies, such as bulk-data processing
- Parallel-processing and caching features, where large problems can be partitioned, governed, and processed in parallel across a collection of server resources while hiding the complexities of multi-threading and management
- Container-managed batch qualities-of-service, such as checkpoint algorithms, restart mechanisms, multi-threading, and threshold policies, so the developer can focus on business logic
- Application design patterns for building agile applications, where object-oriented design and service-orientation allow emerging middleware technologies, such as persistence and caching, to be adopted easily
- The qualities-of-service of IBM WebSphere Application Server, such as security, thread-pooling, connection-pooling, scalability and z/OS integration.
Next, the platform includes:
- Runtime components for submitting, dispatching, monitoring, and governing the execution of batch applications across a collection of resources
- Workload-management integrations for goals-oriented execution, where batch jobs can be throttled, paced, and reprioritized to meet batch and OLTP service-level agreements
- Operational control, external scheduler integration, and management, with reduced operational complexity for parallel processing, disaster recovery, and high availability
- End-to-end application development tooling for lightweight, plain old Java object (POJO)-based development, development frameworks and libraries, unit test, and application deployment.
Finally, the solution meets the requirements of heterogeneous enterprise environments by supporting:
- Platform-neutral batch applications that allow the location of the application’s data to dictate the application deployment platform
- Standardized batch application architecture across platforms
- Standardized operational control and job management for all platforms, where the enterprise scheduler, in conjunction with WebSphere XD Compute Grid, can provide a common infrastructure for operational management; archiving, auditing, and log management; and scheduling for batch running on z/OS and non-z/OS platforms.
WebSphere XD Compute Grid is modeled after the z/OS job entry subsystem (JES), which uses components such as job control language (JCL), a job dispatcher, and job executors (JES initiators).
Similarly, WebSphere XD Compute Grid has an XML-based job description meta-data called xJCL, a job dispatcher called the job scheduler (JS), and multi-threaded job executors called grid execution endpoints (GEE). In addition, a parallel job manager (PJM) component partitions and governs the execution of parallel batch jobs.
Each of these WebSphere XD Compute Grid components is a Java Enterprise Edition (JEE) application that can be deployed to its own WebSphere Application Server cluster, which lets you leverage WebSphere Application Server security, scalability, availability, and life-cycle management best practices.
COJAK project architecture
Swiss Re chose IBM Tivoli Workload Scheduler for z/OS to serve as its enterprise-wide scheduling environment. As a result, all batch jobs are submitted through and monitored by the Tivoli Workload Scheduler for z/OS. WebSphere XD Compute Grid integrates with external schedulers on z/OS through JES and WSGrid, a Java message service (JMS)-based connector that is shipped with the Compute Grid product.
By delivering a connector that can be called from a JCL step, the integration approach enables all external schedulers that operate on JCL and JES to seamlessly integrate with WebSphere XD Compute Grid. As a result, Swiss Re can manage (e.g. submit, monitor, stop, restart, and cancel) WebSphere XD Compute Grid jobs with little impact on operational and process mechanisms (e.g. auditing and archiving).
The Swiss Re job flow, shown in Figure 1, is generally as follows:
- Tivoli Workload Scheduler determines which job should be executed – for example, based on operational plans.
- Tivoli Workload Scheduler submits the corresponding JCL to JES.
- The WSGrid connector is initialized (the submitted JCL contains a traditional pgm=WGRID statement) and passes the xJCL as a parameter.
- WSGrid submits the xJCL to the WebSphere XD Compute Grid job scheduler.
- The Compute Grid job scheduler, using the z/OS workload manager (WLM), selects the best grid execution endpoint (GEE) to which to dispatch the batch job.
- The job executes in the selected GEE. Job logs, job execution status data, and the job and step return codes are transmitted to the WSGrid connector, and the output is directed to SYSOUT, like a standard MVS batch job.
- Batch jobs executing within the GEE can dynamically submit new jobs through Tivoli Workload Scheduler.

Figure 1: The Swiss Re system architecture and end-to-end flow of batch jobs from Tivoli Workload Scheduler to Compute Grid.
Figure 2 puts into perspective Swiss Re’s WebSphere XD Compute Grid batch infrastructure, COBOL-batch infrastructure, and potential OLTP infrastructure, which is currently hosted on distributed platforms.

Figure 2: Swiss Re system architecture, including traditional COBOL-based batch and potential OLTP environment on z/OS.
COJAK application architecture
In Figure 3, shown below, the Enterprise JavaBeans (EJB) container (using an EJB wrapper), along with the WebSphere XD Compute Grid batch container (using Compute Grid batch wrappers), manages transaction and security boundaries in the system. Exposed and private services within the application kernel are POJOs, and use standard application design patterns for accessing persistence technologies via the data-access layer.
Initially Swiss Re used Hibernate persistence technology to speed development; however, given the data-intensive focus of batch, and therefore the demands for efficient bulk-data processing and highly optimized SQL queries, the company is investigating a shift to another persistence technology. The data-access layer will make that shift transparent to the application kernel.

Figure 3: Swiss Re’s application architecture, where the containers for OLTP and batch govern the transactions and security.
The kernel development approach Swiss Re chose was a data-injection pattern for sharing services across batch and OLTP. With this pattern, business services shared across multiple execution environments are written so that the business records to be processed are injected into the shared service, rather than the service acquiring the data itself. The data-acquisition burden is then shifted to execution wrappers – OLTP and batch wrappers in this case. For example, the following code describes a shared business service:
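A minimal sketch of such a service, not Swiss Re's actual code: only the processRecord method name and the three kinds of validation come from this article. The class names, fields, and validation rules are assumptions made for illustration.

```java
// Hypothetical domain object; field names are assumptions for illustration.
final class AccountRecord {
    final String accountId;
    final boolean active;
    final boolean fraudSuspected;
    final long balanceCents;
    final long amountCents; // signed amount to apply to the balance

    AccountRecord(String accountId, boolean active, boolean fraudSuspected,
                  long balanceCents, long amountCents) {
        this.accountId = accountId;
        this.active = active;
        this.fraudSuspected = fraudSuspected;
        this.balanceCents = balanceCents;
        this.amountCents = amountCents;
    }
}

// Shared service (kernel): a POJO that never acquires or persists data itself.
final class AccountService {

    // Returns the processed record, or null when a validation fails.
    AccountRecord processRecord(AccountRecord record) {
        if (!record.active) {
            return null; // account-status validation
        }
        if (record.fraudSuspected) {
            return null; // fraud-detection validation (placeholder rule)
        }
        long newBalance = record.balanceCents + record.amountCents;
        if (newBalance < 0) {
            return null; // balance validation
        }
        // Business logic: apply the amount and hand the result back to the caller.
        return new AccountRecord(record.accountId, true, false, newBalance, 0L);
    }
}
```

Because the record arrives as a method argument, the same AccountService instance can be called from a batch loop or from an online request handler without modification.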
The processRecord method is the shared-service, and is indifferent as to how the record to be processed was obtained. This shared-service focuses instead on applying a variety of business validations – such as account status, fraud detection and session balance. If the validations are successful, business logic is executed and the processed record is returned to its caller.
By designing the shared-service as data injected, batch-optimized bulk data readers and writers can feed data into the service and persist the processed data. Moreover, the bulk data readers and writers are delivered as patterns via the batch data stream (BDS) framework – a free development library for building Compute Grid applications. Figure 4 shows an example of a shared-service and where it fits in the batch application flow.
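The batch side of this pattern can be sketched as a simple read-process-write loop. The RecordReader and RecordWriter interfaces below are hypothetical stand-ins and not the BDS framework's API, and the Kernel class is a placeholder for the shared service.

```java
import java.util.Locale;

// Kernel stand-in: receives an injected record and returns the processed one,
// or null when the record is rejected. Here it only trims and upper-cases.
final class Kernel {
    String processRecord(String record) {
        String trimmed = record.trim();
        return trimmed.isEmpty() ? null : trimmed.toUpperCase(Locale.ROOT);
    }
}

// Hypothetical bulk reader and writer contracts (assumed names, not BDS types).
interface RecordReader {
    boolean hasNext();
    String next();
}

interface RecordWriter {
    void write(String record);
}

// Batch wrapper: owns data acquisition and persistence; the kernel stays unaware.
final class BatchStep {
    private final RecordReader in;
    private final RecordWriter out;
    private final Kernel kernel = new Kernel();

    BatchStep(RecordReader in, RecordWriter out) {
        this.in = in;
        this.out = out;
    }

    // Processes every input record and returns how many were written.
    int run() {
        int written = 0;
        while (in.hasNext()) {
            String processed = kernel.processRecord(in.next());
            if (processed != null) {
                out.write(processed);
                written++;
            }
        }
        return written;
    }
}
```

In a real batch container the loop, checkpointing, and restart would be container-managed; the sketch keeps the loop explicit only to show where the reader and writer sit relative to the kernel.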

Figure 4: Bulk data readers and writers working with the shared-service application kernel. The input data stream (Input DS) is an object factory that injects input domain objects into the kernel. The output data stream (Output DS) persists output domain objects.
For the OLTP case, the following code snippet shows the OLTP wrapper:
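A minimal sketch of such a wrapper, not Swiss Re's actual code: RecordDao, OltpService, and InMemoryDao are hypothetical names, and the Kernel class is a placeholder for the shared service.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Kernel stand-in: the same shared processRecord logic that batch would call.
final class Kernel {
    String processRecord(String record) {
        String trimmed = record.trim();
        return trimmed.isEmpty() ? null : trimmed.toUpperCase(Locale.ROOT);
    }
}

// Hypothetical data-access layer, optimized for single-record access by key.
interface RecordDao {
    String findById(String id);
    void save(String id, String record);
}

// OLTP wrapper: acquires one record, injects it into the kernel, persists the result.
final class OltpService {
    private final RecordDao dao;
    private final Kernel kernel = new Kernel();

    OltpService(RecordDao dao) {
        this.dao = dao;
    }

    // Returns true when the record exists and passes the kernel's validations.
    boolean handleRequest(String id) {
        String record = dao.findById(id);
        if (record == null) {
            return false;
        }
        String processed = kernel.processRecord(record);
        if (processed == null) {
            return false;
        }
        dao.save(id, processed);
        return true;
    }
}

// In-memory DAO, present only to make the sketch runnable.
final class InMemoryDao implements RecordDao {
    private final Map<String, String> rows = new HashMap<>();

    public String findById(String id) {
        return rows.get(id);
    }

    public void save(String id, String record) {
        rows.put(id, record);
    }
}
```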
The OLTP service acquires the business record from the data-access layer, which has been optimized to randomly access a single business record at a time (a typical OLTP interaction). The business validations and logic within the kernel are reused, where the OLTP service is unaware of the actual processing to be completed.
By applying the data-injection pattern, optimizations such as parallel processing can be applied to the kernel transparently. Since the kernel is unaware of how the data record was acquired, multiple instances of the kernel can be created, where each instance operates on a different segment of the large input dataset. Figure 5 shows this application architecture.

Figure 5: Constrained queries, the parallel job manager, and the data-injection pattern for parallel processing
In Figure 5, the input data stream can be modified to execute constrained data queries, for example by applying a “where” clause to the select statement (or by applying byte-ranges for files, unique file names, etc.). The parallel job manager (PJM), the parallel processing engine in Compute Grid, can be used to instantiate multiple instances of the kernel across the cluster of grid nodes based on user-defined parallelization algorithms.
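The idea of constrained queries feeding parallel kernel instances can be sketched as follows. This is an illustration under stated assumptions, not the PJM API: a plain java.util.concurrent thread pool stands in for the parallel job manager, the ACCOUNT table and ID column are hypothetical, and processPartition only counts keys where a real job would read the rows and inject them into the kernel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// One partition of the input: a half-open key range [fromId, toId).
final class Partition {
    final long fromId;
    final long toId;

    Partition(long fromId, long toId) {
        this.fromId = fromId;
        this.toId = toId;
    }

    // Constrained query for this partition; table and column names are hypothetical.
    String toSql() {
        return "SELECT * FROM ACCOUNT WHERE ID >= " + fromId + " AND ID < " + toId;
    }
}

final class Partitioner {
    // Splits [minId, maxId) into at most 'parts' contiguous, non-overlapping ranges.
    static List<Partition> split(long minId, long maxId, int parts) {
        List<Partition> result = new ArrayList<>();
        long size = (maxId - minId + parts - 1) / parts; // ceiling division
        for (long from = minId; from < maxId; from += size) {
            result.add(new Partition(from, Math.min(from + size, maxId)));
        }
        return result;
    }
}

final class ParallelRun {
    // Runs one task per partition and returns the total number of keys processed.
    static long run(List<Partition> partitions, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (Partition p : partitions) {
                Callable<Long> task = () -> processPartition(p);
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : results) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for partitions", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("partition failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    // Stand-in for "read the rows in range and inject each into the kernel".
    static long processPartition(Partition p) {
        return p.toId - p.fromId;
    }
}
```

Because each partition carries its own key range, no two tasks read the same rows, which is what allows the kernel instances to run independently.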
The shared-services architecture is intended to enable service reuse without sacrificing performance optimizations within each execution environment. The data-injection pattern is one such approach. Coupled with object-oriented techniques and patterns, this architecture enables agile applications that can adopt new technologies and performance optimizations with little impact on the application kernel.
What’s next for WebSphere XD Compute Grid
IBM continues to actively enhance WebSphere XD Compute Grid. Next steps for the development team include:
- Further enable 24x7 batch through enterprise workload management integrations, such as job pacing, job throttling and dynamic checkpoint intervals.
- Pursue JEE vendor portability and deliver a portable batch container (GEE) that can run on many popular application servers.
- Support multiple programming models including Spring Batch and JZOS.
Join the WebSphere XD Compute Grid Community
Reto Estermann, Swiss Re project lead for COJAK, along with Philipp Spaeti, IBM Client IT Architect (CITA) to Swiss Re, has formed a WebSphere XD Compute Grid user group for z/OS. The community will bring users together to share best practices, convey requirements to the IBM product development team, and help the development team to validate ideas and features with the community. In addition, an IBM developerWorks forum for WebSphere XD Compute Grid lets you interact directly with IBM experts and other users.
Learn more
- developerWorks: Emerging technologies make WebSphere Extended Deployment Compute Grid ideal for handling mission-critical batch workloads
- developerWorks: Introduction to batch programming using WebSphere Extended Deployment Compute Grid
- Technical introduction: WebSphere Extended Deployment (XD) Compute Grid (1.1MB)
- developerWorks: Compute Grid frequently asked questions (FAQ)
- developerWorks: Development tooling with XD Compute Grid
- developerWorks: WebSphere Extended Deployment Compute Grid Wiki
