Data set size and virtual storage

The relationship between data set size and the amount of available virtual storage is critical to the performance of DFSORT. There are four cases to consider.

When virtual storage is larger than the data set, DFSORT may be able to perform the sort entirely within virtual storage, without needing to store intermediate data. This is called an in-main-storage sort. It is the preferred method for sorting small data sets, since it minimizes I/O usage as well as CPU and elapsed time.
When virtual storage is smaller than the data set, Hiperspace or work data sets are needed to store the intermediate data. Provided virtual storage is sufficient (see Table 1 for guidelines), DFSORT is still able to perform an efficient sort, with elapsed and CPU times close to those of an in-main-storage sort. I/O or Hiperspace usage increases, however, reflecting the need to write intermediate data to work data sets or Hiperspace.
When virtual storage is reduced further or the data set size is increased, DFSORT does what it can to maintain performance but is forced to use Hiperspace or work data sets less efficiently as the ratio of data set size to available storage increases. The loss of efficiency adversely affects elapsed time and EXCP counts.
This performance degradation can be especially dramatic when work data sets are allocated on devices attached to non-synchronous storage control units or connected to ESCON channels. In such cases, it is especially important to follow the guidelines given in "Virtual storage guidelines".
When virtual storage is very small or the data set size is very large, DFSORT may require several additional passes over the data to perform the sort. This is known as intermediate merging. DFSORT issues message ICE247I to indicate that intermediate merging was required; processing continues with degraded performance. Figure 1 shows the benefit of increasing virtual storage to eliminate intermediate merging.

Figure 1. Benefits of Eliminating Intermediate Merging

Bar chart showing savings in elapsed time, CPU time, and EXCP counts by eliminating intermediate merging.

All other factors being equal, the range of data set sizes that DFSORT can sort efficiently (or sort without requiring intermediate merging) grows roughly as the square of the virtual storage size. That is, doubling the virtual storage available to an application enables it to handle data sets four times as large with the same degree of efficiency. Likewise, halving the virtual storage limits the application to data sets only one-fourth as large at the same efficiency.
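
The square relationship can be illustrated with a small calculation. The following Python sketch is not part of DFSORT and does not reproduce any DFSORT formula; it only applies the rough rule of thumb stated above. The function name `scaled_capacity` and the baseline figures (100 MB sorted efficiently with 4 MB of virtual storage) are hypothetical values chosen for illustration.

```python
def scaled_capacity(baseline_size, baseline_storage, new_storage):
    """Estimate the largest data set sortable at the same efficiency.

    Applies only the rough square-law rule of thumb: capacity grows as
    the square of the virtual storage ratio. All other factors are
    assumed to be equal, which is rarely exactly true in practice.
    """
    if baseline_storage <= 0 or new_storage <= 0:
        raise ValueError("storage sizes must be positive")
    ratio = new_storage / baseline_storage
    return baseline_size * ratio ** 2


# Hypothetical baseline: 100 MB sorted efficiently with 4 MB of storage.
print(scaled_capacity(100, 4, 8))  # doubling storage -> 400.0
print(scaled_capacity(100, 4, 2))  # halving storage  -> 25.0
```

Because the estimate depends only on the ratio of the two storage sizes, any consistent unit can be used for storage, and the result is in the same unit as the baseline data set size.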