How to Master Oracle SQL Connect by Hierarchical Queries

Asiri Gunasena

Published:

SQL

oracle SQL connect by

If you have a table or multiple tables with hierarchical data then use Oracle SQL connect by to select complete structing using a single query. This is a big benefit for reducing recursive function calls and reducing huge code blocks into small pieces of code.

Oracle SQL connect by mapping records with the same table connecting with parent records and selecting all the hierarchy by saying the position of the row. There is a relationship between parent and child that can be introduced in the query itself. There are 2 main clauses you need to identify below.

  1. START WITH
  2. CONNECT BY PRIOR

START WITH: This is the starting point of the hierarchy. This can be a single record or multiple records. LEVEL=0. This is optional. Without Start with Clause, Oracle decides which rows to root in at a time.

CONNECT BY PRIOR: This is used to walk from parent record t until the final children record. This has a continuing parent then moves to the first child and takes that child as a parent then searches for children for it and continues this process until the end of a single child.

Connect by SQL Oracle

Connect by prior refers to the parent row. You can have multiple prior conditions as well. The below examples show how we can use conditions.

--
... PRIOR expr = expr
or
... expr = PRIOR expr
--
oracle SQL connect by

Prior generally used to map parent record with the current record. In that case prior can end up in an infinite loop.

Oracle SQL Connect by Example

Let’s discuss a simple example of an employee table. All the workers in a company are employees but according to the position of the employee, there is a hierarchy of people. In this example let’s have 3 levels of hierarchy and see how to take all the records.

Create Table

--
CREATE TABLE employee_table(
    employee_id VARCHAR2(50) ,
    employee_nic VARCHAR2(50) ,
    employee_name VARCHAR2(100) ,
    manager_id VARCHAR2(50) ,
    department_id VARCHAR2(50) ,
    PRIMARY KEY(employee_id)
);
--

Insert Records

--
declare 
  -- insert 1000000 records into database
  i integer:=99;
  
  type namesarray IS VARRAY(10) OF VARCHAR2(50); 
   names namesarray; 
begin
names := namesarray('DEP_1', 'DEP_2', 'DEP_3', 'DEP_4', 'DEP_5','DEP_6','DEP_7','DEP_8','DEP_9','DEP_10');

  

LOOP
   i:= i+1;
   if i=100 then
     
     INSERT INTO employee_table (employee_id, employee_nic, employee_name, manager_id,department_id) 
   VALUES ('EMP-'||i,1000000+i,'EMP-NAME-'||i, null, names(round(dbms_random.value(1,10),0)));
   elsif i>100 and i<110 then
     INSERT INTO employee_table (employee_id, employee_nic, employee_name, manager_id,department_id) 
   VALUES ('EMP-'||i,1000000+i,'EMP-NAME-'||i, 'EMP-'||100, names(round(dbms_random.value(1,10),0)));
   elsif i>110 and i<200 then
     INSERT INTO employee_table (employee_id, employee_nic, employee_name, manager_id,department_id) 
   VALUES ('EMP-'||i,1000000+i,'EMP-NAME-'||i, 'EMP-'||round(dbms_random.value(100,110),0), names(round(dbms_random.value(1,10),0)));
   elsif i>200 and i<400 then
     INSERT INTO employee_table (employee_id, employee_nic, employee_name, manager_id,department_id) 
   VALUES ('EMP-'||i,1000000+i,'EMP-NAME-'||i, 'EMP-'||round(dbms_random.value(110,200),0), names(round(dbms_random.value(1,10),0)));
   else
     INSERT INTO employee_table (employee_id, employee_nic, employee_name, manager_id,department_id) 
   VALUES ('EMP-'||i,1000000+i,'EMP-NAME-'||i, 'EMP-'||round(dbms_random.value(200,400),0), names(round(dbms_random.value(1,10),0)));
   
   end if;
   EXIT WHEN i > 1999; END LOOP;
end; 
--

Oracle SQL Connect by Example 1

Now we have data in the table and we can start with simple Oracle SQL connect by query. In this example, I would like to take all the employees connected with their managers.

--
SELECT employee_id, employee_name, manager_id
   FROM EMPLOYEE_TABLE
   CONNECT BY PRIOR employee_id = manager_id;
--

Here we don’t have new things. all are discussed early. Here we don’t have a starting position. So each and every record is taken as the parent and evaluated until complete the connect by prior condition.

Connect by Level Oracle Example 2

Additionally, there is a LEVEL keyword to show the level or position of the hierarchy. We discuss LEVEL later again.

--
SELECT employee_id, employee_name, manager_id, department_id, LEVEL
   FROM EMPLOYEE_TABLE
   CONNECT BY PRIOR employee_id = manager_id;
--
oracle sql connect by

Oracle start with Connect by prior Example 3

As we discussed in the introduction START WITH is used to find the start location of the dataset. In other words Root of the hierarchy. The root can be a single record or multiple records. In this example, the CEO or No Manager Employee is the Root. See the example of how to write an Oracle SQL connect by with the START WITH clause.

oracle sql connect by

Here 1st record is the root. He does not have a manager and LEVEL is 1 only for him.

connect_by_isleaf Oracle example 4

oracle SQL connect by is for hierarchy. if you wish to find only leaf records. That means there are no expanded records and no can’t be further expanded. In such case use connect_by_isleaf pseudo column. this is return 1 if this is going to be a leaf record otherwise it is a parent.

--
SELECT employee_id, employee_name, manager_id,department_id, LEVEL,CONNECT_BY_ISCYCLE
   FROM EMPLOYEE_TABLE
   where CONNECT_BY_ISLEAF = 1
   START WITH manager_id is null
   CONNECT BY NOCYCLE PRIOR employee_id = manager_id
   order by LEVEL;
--

Similar to this process if you are willing to find a record which has at least a single leaf record. that is simple now. use the same use connect_by_isleaf pseudo column again but use filter criteria as CONNECT_BY_ISLEAF = 0. Then you get all the records that have multiple leaves in the structure or a single leaf in the structure

--
SELECT employee_id, employee_name, manager_id,department_id, LEVEL,CONNECT_BY_ISCYCLE
   FROM EMPLOYEE_TABLE
   where CONNECT_BY_ISLEAF = 0
   START WITH manager_id is null
   CONNECT BY NOCYCLE PRIOR employee_id = manager_id
   order by LEVEL;
--

You can check above is correct by executing a simple query below. select * from EMPLOYEE_TABLE t where manager_id like '%EMP-112%'

connect by level in Oracle sql example 5

oracle SQL connect by LEVEL is pseudocolumn and it returns 1 for the Root record. 2 LEVEL is for children of the root records. The last row is the LEAF row which is don’t have any children. LEVEL pseudocolumn helps to filter records in each level. If you with t filter 1st and 2nd level of record and LEVEL filter in your query.

--
SELECT employee_id, employee_name, manager_id,department_id, LEVEL,CONNECT_BY_ISCYCLE
   FROM EMPLOYEE_TABLE
   where LEVEL < 3
   START WITH manager_id is null
   CONNECT BY NOCYCLE PRIOR employee_id = manager_id
   order by LEVEL;
--

Also LEVEL keyword can be used in your RADER BY clause as well. Then records sort according to the LEVEL of the hierarchy.

CONNECT_BY_ROOT example 6

--
SELECT employee_id, employee_name, manager_id,department_id, LEVEL,CONNECT_BY_ROOT  employee_name
   FROM EMPLOYEE_TABLE
   where LEVEL < 5
   START WITH employee_id in ('EMP-101','EMP-102')
   CONNECT BY PRIOR employee_id = manager_id
   order by LEVEL;
--

See the query here we started from START WITH employee_id in (‘EMP-101’, ‘EMP-102’) which means Root will be ‘EMP-101’ or ‘EMP-102’. But looking at the final output we can’t decide which record is the Root of this record. This is Not a parent. IT IS THE Root. You can take any value of the Root record by using CONNECT_BY_ROOT (Column).

This helps to filter records later in two queues according to Root. You cannot specify this operator in the START WITH condition or the CONNECT BY condition. And it is only for hierarchical queries only.

sys_connect_by_path in Oracle example

sys_connect_by_path is also valid in hierarchical queries. This returns the path from the root to the current node. Column values must be separated from char values. This also helps to identify the rad t root that row returned by CONNECT BY condition.

--
SELECT employee_id, employee_name, manager_id,department_id, LEVEL,CONNECT_BY_ROOT  employee_name, SYS_CONNECT_BY_PATH(employee_name, '/')
   FROM EMPLOYEE_TABLE
   where LEVEL < 5
   START WITH employee_id in ('EMP-101','EMP-102')
   CONNECT BY PRIOR employee_id = manager_id
   order by LEVEL;
--

Oracle Connect by Performance

CONNECT BY PRIOR clause is Oracle provided a great solution for hierarchical queries but connect by prior in Oracle performance issues are headache.

It’s because of large dataset misuse of keywords, less focus, or keywords not in the correct order. And there are many causes for connecting prior in Oracle performance.

Oracle connect by Performance improvement

  1. Indexing:

Add index into Connect by and START WITH clause and if there is a where condition then if required add index there as well.

--
CREATE INDEX emp_hierarchy_idx ON employee_table(manager_id);
--
  1. Limit the Depth:

The LEVEL keyword is used to set the number of levels in the tree hierarchy. If you are not required all the levels use the LEVEL keyword to limit it.

--
SELECT * FROM employee_table 
WHERE LEVEL <= 3 
CONNECT BY PRIOR employee_id = manager_id;
--
  1. Materialized Views:

If you have static data in your table this is a good solution for CONNECT BY performance. Data is already precomputed and that time is saved if you can use Materialized Views for queries.

--
CREATE MATERIALIZED VIEW emp_hierarchy AS SELECT * 
FROM employees 
WHERE LEVEL <= 3 
CONNECT BY PRIOR employee_id = manager_id 
START WITH employee_id = 100;
--
  1. Add inner query

The below examples show how to add an inner query if required. This is to filter out the CONNECT BY dataset. Query explicitly reduces the set of data set who work in Sales, which Oracle can then do before performing the CONNECT BY.

--
SELECT * FROM (
               SELECT * FROM employees 
                WHERE dept_id = 'SALE'
              )
START WITH manager_id is null
CONNECT BY PRIOR employee_id = manager_id
--

CONNECT BY and START WITH are doing related to hierarchical things. Therefore writing queries efficiently is good for performance.

Summary

oracle SQL connect by is for hierarchical queries this helps to write big recursive methods in t single small query with few keywords. The most important is Oracle SQL connect by keyword and START WITH keyword, but it is optional. once executing the query below steps are used to fetch data.

  • Oracle selects the Root rows of the hierarchy. Use START WITH condition t find the root records.
  • Oracle selects children for all the root records that satisfy the Oracle SQL connect by condition. This is happened until find the last leaf record.
  • If a query is available where the condition filter applies each record and eliminates each row individually. Take only satisfied records according to the condition.

REF: https://docs.oracle.com

Categories SQL
ennicode

Address: 89/1 Rabbegamuwa, Handessa, Kandy, Central, Sri Lanka

Email to: Primary: [email protected]

Services

E-Learning

Company Websites

Support and Configuration work

Banners, Covers, and Post

Web Development & Configurations

Content Writing and Marketing

Contact

Ennicode