How to run hash join in Oracle. 3 min Quick Guide

Asiri Gunasena

Published:

Blog

hash join in oracle

Oracle Uses this merging mechanism to merge large datasets. Hash Join in Oracle is effective when a dataset is fit to the memory. Optimizers choose this joining method when two datasets fit in memory. A hash table is used to store each row. The hash function determines the location of the hash table.

Oracle Join Types

There are different types of join methods in Oracle in this article let’s discuss hash join in Oracle below is the Oracle joining mechanism to merge two or more tables. For that Oracle considers statistics, Cost

  1. Hash Join
  2. Nested loop
  3. Sort Merge

Hash join in Oracle most effective smaller dataset fit with memory. If the dataset does not fit into memory, the Oracle database partitions the source. Then joining happens each partition by partition. This is doing a lot of IO in temporary tablespace. (How to delete temp tablespace). If the dataset fits in memory two mapping datasets are read only a single time. then reduce the IO cost on join.

Otherwise, join proceed partition by partition. This increases IO Cost because have to read each partition separately.

How to run hash join in Oracle.

The algorithm takes joining keys as input parameters and then the hash function generates value which is the index of the hash table or slot of the hash table.

Hash Tables

Let’s consider a small example to understand how to store records in a hash table. hash join in Oracle happens taking records of the hash table. The hash function is used to arrange a slot of the hash table. If there are many records in the same slot, then a linked list is used to keep the records.

Let’s take a small dataset in the Employee table. The table’s primary key is employee_id. The other side has an Employee Subject table and there is a foreign key as employee_id. Once we need to join two tables using employee_id then how to work hash join in Oracle.

EMPLOYEE_ID    EMPLOYEE _NAME                   MANAGER_ID 
-------------  ------------------------------   ---------- 
           100 Pat                              200        
           200 Jone                             201        
           300 Jimmy                            114        
           400 Katty                            203        
           500 Taro                             121        

What is a Hash Table?

In this example pass employee_id as input value for hash function. The function returns the hash value. In this example generate 5 hash values for each record.

--
f(100) = 3
f(200) = 1
f(300) = 3
f(400) = 2
f(500) = 5
--

In his example, the hash function returns 3 for both employee_id’s 100 and 300. So hash table contains two records for hash value 3. That means the hash table needs to use a linked list to keep both records.

--
1    200,Jone,201,1800
2    400,Katty,203,2400
3    100,Pat,200,1700 -> 300,Jimmy,114,1700
4
5    500,Taro,121,1500
--

In this example, hash value 3 contains two records. This is called hash collision. In such a situation manage by using a linked list. Below explain the full example of handling such a situation hash join in Oracle.

hash join in oracle

Hash Join in Oracle with Simple Steps

In this section let’s try to understand the hash join that manages to load into memory. In this time optimizers use smaller data sources. The dataset join keys are in memory. Below are the steps to join.

  1. In the first table the full table scans and applies the hash function for the join keys in the Program Global Area (PGA- which is a private memory region). Then create the BUILD TABLE for the smaller dataset.
--
For small_data_set_record (Select * from employee)
Loop
Slot_number := Hash Function (small_table_set_record_join_keys);
Insert to hash table (Slot_number, small_data_set_record);
End Loop;
--
  1. Once completed the first dataset then creates the second dataset called probe table. In this process perform a full scan through the data set and retrieve matching records as output.
--
For large_data_set_record( select * from Emp_Subject) 
Loop
   Large_slot_number= Hash Function(lerge_data_set_joon_keys);
   Small_table_record= Check Hash Table for Macthine record (Large_slot_number);
   If Small_table_record Exist Then
     Out = Small_table_record + large_data_set_record;
   End If.
End Loop
--
  1. For the large dataset execute the hash function for join keys.
  2. Retrieve the hash function for the record.
  3. If records do not exist, then fetch the next records.
  4. The large data set iterated until the loop and identified a matching record from a hash table.
  5. If multiple rows are present in the slot for the same hash value, then walk through the linked list and check each row. In the above example, slot 3 contains two records

Sample Code and Execution Plan

Let’s execute the above example in the Oracle database to check how to execute Hash Join in Oracle.  You already know we have two tables Employee and Employee_Subject. Let’s join two tables using employee_id.

--
SELECT *
FROM   Employee_table e, Employee_Subject_table s
WHERE  e. employee_id = s. employee_id;
=================================

CREATE TABLE employee_table(
    employee_id VARCHAR2(50) ,
    employee_name VARCHAR2(150) ,
    PRIMARY KEY(employee_id)
);

CREATE TABLE employee_subject_table(
    employee_subject_id VARCHAR2(50) ,
    start_date DATE ,
    employee_id VARCHAR2(50) ,
    PRIMARY KEY(employee_subject_id)
);

declare 
  -- insertemployee 1000000 records into database
  i integer:=0;
begin
LOOP
   i:= i+1;
     INSERT INTO employee_table (employee_id, employee_name) 
   VALUES ('emp-'||i, 'emp-name'||i);
   EXIT WHEN i > 999; END LOOP;
end;

declare 
  -- insert 1000000 records into database
  i integer:=0;
 
begin

LOOP
   i:= i+1;
     INSERT INTO employee_subject (employee_subject_id,employee_id, start_date) 
   VALUES ('CT-'||i, round(dbms_random.value(1,999),0), sysdate);
   EXIT WHEN i > 99999; END LOOP;
end;
--

Let’s take the execution plan for the small query. This is easy to understand how to hash join in Oracle. In our example, the small table is the Employee Table and the big table is the Employee_subject Table. Let’s add our theory to this execution plan.

hash join in oracle
  1. First full table access for the Employee table and create a hash table.
  2. Then scan the Employee Subject table and match it with the hash table.
  3. Identified matching records from the hash table.
  4. Once completed the big table Returns the full record set.

Example 2

What happens if the hash table does not have enough memory? Then oracle uses temporary space to hold data. The hash table stores part in memory and other remaining data sets on disk. Oracle uses temporary table space to store each remaining partition data in the disk.

If big table data match with memory there is no need to load disk data into memory. Otherwise, oracle loads each partition data from disk to memory. Those temporary table space records will load until the end of the hash table.

USE_HASH Hint is used to force Oracle to use hash join.

Ref: Oracle

ennicode

Address: 89/1 Rabbegamuwa, Handessa, Kandy, Central, Sri Lanka

Email to: Primary: [email protected]

Services

E-Learning

Company Websites

Support and Configuration work

Banners, Covers, and Post

Web Development & Configurations

Content Writing and Marketing

Contact

Ennicode