Oracle hash joins are notoriously hard to set up within Oracle, and the Oracle Metalink archives are full of problem reports regarding successful hash join invocation. I will discuss hash joins in detail in Chapter 16, but for now, just be aware that Oracle reads the driving table into a RAM array of hash_area_size and uses a special hashing technique to join the memory array with the larger table.
For equi-join operations, hash joins can outperform nested loop joins, especially in cases where the driving table is small enough to fit entirely into the hash_area_size. If the driving table is too large, the hash join will write temporary segments into the TEMP tablespace, slowing down the query. Since the reading of the table rows for a hash join is the most time-consuming operation in a hash join, setting parallelism on the table can dramatically improve the performance and throughput of the query.
Here is an example of a query that forces a parallel hash join. Note that the emp table is set as the driving table:
select /*+ use_hash(e,b) parallel(e, 4) parallel(b, 4) */
e.ename,
hiredate,
b.comm
from
emp e,
bonus b
where
e.ename = b.ename
;
Here is the execution plan for the hash join. Note that both tables in this join are using parallel query to obtain their rows:
OPERATION
----------------------------------------------------------------------
OPTIONS OBJECT_NAME POSITION
------------------------------ ---------------------------- ----------
OTHER_TAG
----------------------------------------------------------------------
SELECT STATEMENT
3
HASH JOIN
1
PARALLEL_TO_SERIAL
TABLE ACCESS
FULL EMP 1
PARALLEL_TO_PARALLEL
TABLE ACCESS
FULL BONUS 2
For equi-join SQL, hash joins are often faster than nested loop joins, especially in cases where the driving table is filtered into a small number of rows in the query’s where clause.
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment