Investigating join operations
A join operation retrieves rows from more than one table and combines them. The operation specifies at least two tables, but the two tables need not be distinct.
Composite tables
A composite table represents the result of accessing one or more tables in a query. If a query contains a single table, only one composite table exists. If one or more joins are involved, an outer composite table consists of the intermediate result rows from the previous join step. This intermediate result might, or might not, be materialized into a work file.
The new table (or inner table) in a join operation is the table that is newly accessed in the step.
A join operation can involve more than two tables. In these cases, the operation is carried out in a series of steps. For non-star joins, each step joins only two tables.
Composite table example
DB2® performs the following steps to complete the join operation:
- Accesses the first table (METHOD=0), named TJ (TNAME), which becomes the composite table in step 2.
- Joins the new table TK to TJ, forming a new composite table.
- Sorts the new table TL (SORTN_JOIN=Y) and the composite table (SORTC_JOIN=Y), and then joins the two sorted tables.
- Sorts the final composite table (TNAME is blank) into the order (SORTC_ORDERBY=Y) that you want.
The following tables show a subset of columns in a plan table for this join operation:
METHOD | TNAME | ACCESS-
TYPE |
MATCH-
COLS |
ACCESS-
NAME |
INDEX-
ONLY |
TSLOCK-
MODE |
---|---|---|---|---|---|---|
0 | TJ | I | 1 | TJX1 | N | IS |
1 | TK | I | 1 | TKX1 | N | IS |
2 | TL | I | 0 | TLX1 | Y | S |
3 | 0 | N |
SORTN
UNIQ |
SORTN
JOIN |
SORTN
ORDERBY |
SORTN
GROUPBY |
SORTC
UNIQ |
SORTC
JOIN |
SORTC
ORDERBY |
SORTC
GROUPBY |
---|---|---|---|---|---|---|---|
N | N | N | N | N | N | N | N |
N | N | N | N | N | N | N | N |
N | Y | N | N | N | Y | N | N |
N | N | N | N | N | N | Y | N |
Join conditions
A join operation typically matches a row of one table with a row of another on the basis of a join condition. For example, the condition might specify that the value in column A of one table equals the value of column X in the other table (WHERE T1.A = T2.X).
Two kinds of joins differ in what they do with rows in one table that do not match on the join condition with any row in the other table:
- An inner join discards rows of either table that do not match any row of the other table.
- An outer join keeps unmatched rows of one or the
other table, or of both. A row in the composite table that results
from an unmatched row is filled out with null values. As the following
table shows, outer joins are distinguished by which unmatched rows
they keep.
Table 3. Join types and kept unmatched rows Outer join type Included unmatched rows Left outer join The composite (outer) table Right outer join The new (inner) table Full outer join Both tables
Join condition example
Suppose that you issue the following statement to explain an outer join:
EXPLAIN PLAN SET QUERYNO = 10 FOR
SELECT PROJECT, COALESCE(PROJECTS.PROD#, PRODNUM) AS PRODNUM,
PRODUCT, PART, UNITS
FROM PROJECTS LEFT JOIN
(SELECT PART,
COALESCE(PARTS.PROD#, PRODUCTS.PROD#) AS PRODNUM,
PRODUCTS.PRODUCT
FROM PARTS FULL OUTER JOIN PRODUCTS
ON PARTS.PROD# = PRODUCTS.PROD#) AS TEMP
ON PROJECTS.PROD# = PRODNUM
The following table shows a subset of the plan table for the outer join.
QUERYNO | QBLOCKNO | PLANNO | TNAME | JOIN_TYPE |
---|---|---|---|---|
10 | 1 | 1 | PROJECTS | |
10 | 1 | 2 | TEMP | L |
10 | 2 | 1 | PRODUCTS | |
10 | 2 | 2 | PARTS | F |
Column JOIN_TYPE identifies the type of outer join with one of these values:
- F for FULL OUTER JOIN
- L for LEFT OUTER JOIN
- Blank for INNER JOIN or no join
At execution, DB2 converts every right outer join to a left outer join; thus JOIN_TYPE never identifies a right outer join specifically.
Materialization with outer join
Sometimes DB2 has to materialize a result table when an outer join is used in conjunction with other joins, views, or nested table expressions. You can tell when this happens by looking at the TABLE_TYPE and TNAME columns of the plan table. When materialization occurs, TABLE_TYPE contains a W, and TNAME shows the name of the materialized table as DSNWFQB(xx), where xx is the number of the query block (QBLOCKNO) that produced the work file.