Scalability of the InfoSphere Information Server engine

IBM® InfoSphere® Information Server is built on a highly scalable software architecture that delivers high levels of throughput and performance.

For maximum scalability, data integration software must use all available system resources to accomplish data integration tasks. This capability must extend beyond Symmetric Multiprocessing (SMP) systems to include both Massively Parallel Processing (MPP) systems and grid systems.

InfoSphere Information Server components use grid, SMP, and MPP environments to optimize the use of all available hardware resources.

For example, when you use the IBM InfoSphere DataStage® and QualityStage® Designer to create a data-flow graph, the underlying hardware architecture and the number of processors are irrelevant. A separate configuration file defines how much parallel processing the job uses and where that processing runs. This configuration is bound to the job at run time and determines the resources required from the underlying computing system.

As Figure 1 shows, the configuration provides a clean separation between creating the data-flow graph and the parallel execution of the application. This separation simplifies the development of scalable data integration systems that run in parallel.

Figure 1. Hardware complexity made simple
IBM InfoSphere Information Server adapts to sequential or parallel hardware configurations
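
The separation between the job design and the run-time configuration can be illustrated with a minimal sketch. This is not InfoSphere Information Server code and does not use the product's configuration file format. The names run_job and clean, and the "nodes" dictionary key, are hypothetical and were chosen only for illustration. Threads stand in for processing nodes. The sketch assumes that the job logic is written once and that only the configuration passed at run time decides how many partitions process the data.

```python
from concurrent.futures import ThreadPoolExecutor


def clean(record):
    """Hypothetical job logic: written once, with no reference to hardware."""
    return record.strip().upper()


def run_job(records, config):
    """Run the same job logic with the parallelism that the configuration names."""
    nodes = config["nodes"]  # assumed key; one entry per logical processing node
    # Round-robin partitioning: record i goes to partition i % len(nodes).
    partitions = [records[i::len(nodes)] for i in range(len(nodes))]
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        results = list(pool.map(lambda part: [clean(r) for r in part], partitions))
    # Collect the partitions and sort so that the output does not depend
    # on how many partitions were used.
    return sorted(r for part in results for r in part)


# Two hypothetical configurations for the same job design.
sequential = {"nodes": ["node1"]}
parallel = {"nodes": ["node1", "node2", "node3", "node4"]}

records = [" alpha", "bravo ", " charlie ", "delta"]
assert run_job(records, sequential) == run_job(records, parallel)
```

In this sketch, moving from one node to four changes only the configuration object. The clean and run_job functions are untouched, which mirrors the idea that the data-flow graph does not need to be redesigned when the hardware changes.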

Without dynamic support for scalable hardware environments, the following problems can occur:

InfoSphere Information Server uses parallel processing technology so that large volumes of information can be processed quickly. With this technology, processing capacity does not limit project results, and solutions can expand to new hardware and use the processing power of all available hardware.