Job optimization

Optimization pushes processing functionality and the related data I/O into database sources, database targets, or Hadoop clusters, depending on the optimization options that you choose.

When you optimize a job, Balanced Optimization searches the job for patterns of stages, links, and property settings. The patterns typically include one or more of the supported database connector stages (Teradata, DB2®, Netezza, or Oracle) or Big Data File stages. When a candidate pattern is found, Balanced Optimization moves the matched processing into the SQL of the corresponding source or target database, or into the Hadoop cluster, and removes or replaces any stages and links that are no longer needed. The tool then adjusts the remaining stages and links. After the tool has found a pattern and modified the job design, it repeats the search on the modified job. Optimization terminates when none of the patterns match anything further in the optimized job, which indicates that there is no more work to be done.
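
The search-and-rewrite cycle described above can be pictured as a loop that runs until no pattern matches. The following Python sketch is only an illustration of that idea and is not how Balanced Optimization is implemented: the job is modeled as a simple sequence of stage names, and the two pattern functions, the `+WHERE` and `+EXPR` suffixes, and the `optimize` function are all hypothetical names invented for this example.

```python
from typing import Callable, List, Optional, Tuple

# Toy model (assumption): a job is a linear sequence of stage names.
Job = Tuple[str, ...]
# A pattern returns the rewritten job, or None when it does not match.
Pattern = Callable[[Job], Optional[Job]]


def push_filter_into_source(job: Job) -> Optional[Job]:
    """Hypothetical pattern: fold a Filter that follows a source into the source SQL."""
    for i in range(len(job) - 1):
        if job[i].startswith("DBSource") and job[i + 1] == "Filter":
            # The Filter stage is removed; the source stage absorbs its work.
            return job[:i] + (job[i] + "+WHERE",) + job[i + 2:]
    return None


def push_transform_into_target(job: Job) -> Optional[Job]:
    """Hypothetical pattern: fold a Transformer that precedes a target into the target SQL."""
    for i in range(len(job) - 1):
        if job[i] == "Transformer" and job[i + 1].startswith("DBTarget"):
            # The Transformer stage is removed; the target stage absorbs its work.
            return job[:i] + (job[i + 1] + "+EXPR",) + job[i + 2:]
    return None


def optimize(job: Job, patterns: List[Pattern]) -> Job:
    """Apply patterns repeatedly until none of them matches the job."""
    changed = True
    while changed:
        changed = False
        for pattern in patterns:
            rewritten = pattern(job)
            if rewritten is not None:
                job = rewritten
                changed = True
                break  # restart the search on the modified job
    return job


PATTERNS: List[Pattern] = [push_filter_into_source, push_transform_into_target]
original: Job = ("DBSource", "Filter", "Transformer", "DBTarget")
optimized = optimize(original, PATTERNS)
print(optimized)
```

In this toy model each successful rewrite removes one stage, so the loop is guaranteed to stop. The example job shrinks from four stages to two, with the filter and transformation logic attached to the source and target stages.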

Generally, optimizations are performed in priority order. When there is ambiguity (for example, when some processing could be performed in either a source database or a target database), the processing is pushed into the database target.
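
As a rough illustration of this ordering rule, the sketch below selects the next optimization from a list of candidates by priority and breaks ties in favor of the target. The `Candidate` class, its numeric `priority` field (lower values first), and the `next_candidate` function are assumptions made for this example; the product's actual priority scheme is not described here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    name: str
    priority: int  # hypothetical: a lower number is applied earlier
    location: str  # "source" or "target"


def next_candidate(candidates: List[Candidate]) -> Candidate:
    """Pick the highest-priority candidate; on a tie, prefer the target."""
    return min(
        candidates,
        key=lambda c: (c.priority, 0 if c.location == "target" else 1),
    )


# The same aggregation could be pushed into either database (equal priority).
candidates = [
    Candidate("push aggregation", priority=1, location="source"),
    Candidate("push aggregation", priority=1, location="target"),
    Candidate("push sort", priority=2, location="source"),
]
chosen = next_candidate(candidates)
print(chosen.name, "->", chosen.location)
```

Here the aggregation is eligible for both the source and the target at the same priority, so the target placement is chosen.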