2023年3月14日 星期二

[Oracle] CONNECT BY and Recursive CTE for hierarchy data

  Oracle   CTE   Hierarchy   CONNECT BY

 

Problem


Assume that we have a table, EMPLOYEES, which include hierarchy data like this,

 



Each employee has a manager, and a manager also has his/her manager, except JB, who is the boss.

We are going to learn how to use the following 2 ways to traverse the hierarchy rows.

1.  START WITH...CONNECT BY...

2.  Recursive CTE

 

START WITH...CONNECT BY...


We can use Oracle's START WITH... CONNECT BY... clause to traverse hierarchal rows.

Pattern


SELECT columns
FROM table
START WITH {condition to find the root row}
CONNECT BY {condition to find next row};
\

 

l   START WITH specifies the root row of the hierarchy.

l   CONNECT BY specifies the relationship of parent and child rows.

l   PRIOR indicates the recursive condition to traverse all the rows.
e.q. PRIOR A = B means using current row's column A to match next row's column B. And A = PRIOR B means use current row's column B to match next row's column A.

 

SQL

The following SQL lists the top-down hierarchy by starting from JB that has no manager and traverse the records that have manager as JB, and so on.

Notice that the condition to find next row is PRIOR EMP_NO = MANAGER_NO, and that means we use EMP_NO to find(match) the next row by its MANAGER_NO.

 


SELECT EMP_NO, NAME, MANAGER_NO
FROM EMPLOYEES
START WITH MANAGER_NO IS NULL
CONNECT BY PRIOR EMP_NO = MANAGER_NO;

 

Result:



 

And the following SQL lists the bottom-up hierarchy by from James and shows his reporting line.

Notice that the condition to find next row is EMP_NO = PRIOR MANAGER_NO, and that means we use MANAGER_NO to find(match) the next row by its EMP_NO.


SELECT EMP_NO, NAME, MANAGER_NO
FROM EMPLOYEES
START WITH NAME = 'James'
CONNECT BY EMP_NO = PRIOR MANAGER_NO;

 

Result:

 


CTE Recursive


We can do the same thing by using CTE recursive.

SQL


WITH t1(EMP_NO,NAME,MANAGER_NO) AS
(
    SELECT EMP_NO,NAME,MANAGER_NO FROM EMPLOYEES
    WHERE NAME = 'James'
    UNION ALL
    SELECT t2.EMP_NO, t2.NAME, t2.MANAGER_NO FROM EMPLOYEES t2, t1
    WHERE t2.EMP_NO = t1.MANAGER_NO
)
SELECT * FROM t1;

 

 

Result:

 


(Optional) Create a Function


Let's create a Function FindReportLine that returns the report line of an exployee by his/her name.


Function by START WITH... CONNECT BY...


CREATE OR REPLACE FUNCTION FindReportLine (
    in_name IN VARCHAR2
) RETURN SYS_REFCURSOR IS
    emp_cursor SYS_REFCURSOR;
BEGIN
    OPEN emp_cursor FOR
    SELECT EMP_NO, NAME, MANAGER_NO
    FROM EMPLOYEES
    START WITH NAME = in_name
    CONNECT BY EMP_NO = PRIOR MANAGER_NO;
    RETURN emp_cursor;
END;


Function by Recursive CTE


CREATE OR REPLACE FUNCTION FindReportLine (
    in_name IN VARCHAR2
) RETURN SYS_REFCURSOR IS
    emp_cursor SYS_REFCURSOR;
BEGIN
    OPEN emp_cursor FOR
    WITH t1(emp_no, name, manager_no) as (
        SELECT EMP_NO, NAME, MANAGER_NO
        FROM EMPLOYEES
        WHERE name = in_name
        UNION ALL
        SELECT t2.EMP_NO, t2.name, t2.MANAGER_NO
        FROM EMPLOYEES t2 INNER JOIN t1
        ON t2.EMP_NO = t1.MANAGER_NO
    )
    SELECT * FROM t1;
    RETURN emp_cursor;
END;


Use the Function

Since the function returns the rows as SYS_REFCURSOR, we can parse the result by XML.

XMLTABLE maps the result of an XQuery evaluation into relational rows and columns.


SELECT * FROM xmltable(
'/ROWSET/ROW'
PASSING xmltype(FindReportLine('James'))
columns
EMP_NO PATH 'EMP_NO',
NAME PATH 'NAME',
MANAGER_NO PATH 'MANAGER_NO'
);


The XMLSEQUENCE could also do the same thing, though it's deprecated.


SELECT
extractvalue(column_value,'/ROW/EMP_NO') EMP_NO,
extractvalue(column_value,'/ROW/NAME') NAME,
extractvalue(column_value,'/ROW/MANAGER_NO') MANAGER_NO
FROM TABLE(xmlsequence(FindReportLine('James')));


(Optional) Aggregate the Result

We can use LISTAGG function in Oracle to combine multiple rows into a single row with each value separated by a specified character or symbol. For example, the following SQL aggregates and show the names of the report line of James.


SELECT LISTAGG(NAME, ' / ') WITHIN GROUP ( ORDER BY ROWNUM ) AS REPORT_LINE
FROM
(
SELECT EMP_NO, NAME, MANAGER_NO
FROM EMPLOYEES
START WITH NAME = 'James'
CONNECT BY EMP_NO = PRIOR MANAGER_NO
);

The result will be "James / Jack / JB".


Reference


[Sql Server] Recursive CTE for hierarchy data

[PostgreSQL] Recursive CTE for hierarchy data