Nested loop join (METHOD=1)

In nested loop join DB2® scans the composite (outer) table. For each row in that table that qualifies (by satisfying the predicates on that table), DB2 searches for matching rows of the new (inner) table.

Begin program-specific programming interface information.

DB2 concatenates any matching rows that it finds with the current row of the composite table. If no rows match the current row, then:

  • For an inner join, DB2 discards the current row.
  • For an outer join, DB2 concatenates a row of null values.

Stage 1 and stage 2 predicates eliminate unqualified rows during the join. (For an explanation of those types of predicate, see Stage 1 and stage 2 predicates.) DB2 can scan either table using any of the available access methods, including table space scan.

End program-specific programming interface information.

Performance considerations for nested loop join

A nested loop join repetitively scans the inner table of the join.

That is, DB2 scans the outer table once, and scans the inner table as many times as the number of qualifying rows in the outer table. Therefore, the nested loop join is usually the most efficient join method when the values of the join column passed to the inner table are in sequence and the index on the join column of the inner table is clustered, or the number of rows retrieved in the inner table through the index is small.

When nested loop join is used

DB2 often uses a nested loop join in the following situations.

  • The outer table is small.
  • Predicates with small filter factors reduce the number of qualifying rows in the outer table.
  • Either an efficient, highly clustered index exists on the join columns of the inner table, or DB2 can dynamically create a sparse index on the inner table, and use that index for subsequent access.
  • The number of data pages that are accessed in the inner table is small.
  • No join columns exist. Hybrid and sort-merge joins require join columns; nested loop joins do not.

Example: left outer join

The following figure illustrates a nested loop for a left outer join. The outer join preserves the unmatched row in OUTERT with values A=10 and B=6. The same join method for an inner join differs only in discarding that row. The following figure illustrates a nested loop join.

Figure 1. Nested loop join for a left outer join
Begin figure description. A flow chart with sample data that illustrates a merge scan join. End figure description.

Example: one-row table priority

For a case like the following example, with a unique index on T1.C2, DB2 detects that T1 has only one row that satisfies the search condition. DB2 makes T1 the first table in a nested loop join.

SELECT * FROM T1, T2
  WHERE  T1.C1 = T2.C1 AND
         T1.C2 = 5;

Example: Cartesian join with small tables first

A Cartesian join is a form of nested loop join in which no join predicates exist between the two tables. DB2 usually avoids a Cartesian join, but sometimes it is the most efficient method, as in the following example. The query uses three tables: T1 has 2 rows, T2 has 3 rows, and T3 has 10 million rows.
SELECT * FROM T1, T2, T3
  WHERE  T1.C1 = T3.C1 AND
         T2.C2 = T3.C2 AND
         T3.C3 = 5;

Join predicates are between T1 and T3 and between T2 and T3. No predicate joins T1 and T2.

Assume that 5 million rows of T3 have the value C3=5. Processing time is large if T3 is the outer table of the join and tables T1 and T2 are accessed for each of 5 million rows.

However, if all rows from T1 and T2 are joined, without a join predicate, the 5 million rows are accessed only six times, once for each row in the Cartesian join of T1 and T2. It is difficult to say which access path is the most efficient. DB2 evaluates the different options and could decide to access the tables in the sequence T1, T2, T3.

End program-specific programming interface information.

Sorts on the composite table

Begin program-specific programming interface information.
DB2 might sort the composite table under the following conditions:

  • The join columns in the composite table and the new table are not in the same sequence.
  • The join column of the composite table has no index.
  • The index is poorly clustered.

Nested loop join with a sorted composite table has the following performance advantages:

  • Uses sequential detection efficiently to prefetch data pages of the new table, reducing the number of synchronous I/O operations and the elapsed time.
  • Avoids repetitive full probes of the inner table index by using the index look-aside.
    End program-specific programming interface information.

Nested loop join with sparse index access

A value of PRIMARY_ACCESSTYPE = T indicates that DB2 dynamically creates a sparse index on the inner table and uses the sparse index to search the work file that is built on the inner table.

Nested loop join with sparse index has the following performance advantages:

  • Access to the inner table is more efficient when the inner has no efficient index on the join columns.
  • A sort of the composite table is avoided when composite table is relatively large.

Start of changeMemory allocation for sparse index is controlled by the value of the MXDTCACH subsystem parameter. However, if the available storage is insufficient when the query executes, DB2 might be unable to build the sparse index, degrading performance. The solution is to resolve the memory shortage. However, if you cannot do so, reduce the MXDTCACH setting.

End program-specific programming interface information.
End of change