Recursive Queries and The Power of Common Table Expressions

Yesterday I was working on a requirement for a client that asked me to show a list of Organizational Units and the every single parent Organization above every Organization. The source table gave each Org Unit’s ID with their parent Org Unit’s ID. In Oracle’s PSQL, this requirement was met using the Start With and Connect By statements. In TSQL, however, there is no such thing. To meet this requirement, I could have written multiple select statements Union’ed together with differing amounts of joins back to the OrgUnit table to go up in the OrgUnit hierarchy multiple levels, but this would have been extremely painful, time consuming, and the query would have performed very poorly. Instead, I used the magic of Common Table Expressions (CTE) to build a simple query with a recursive join.

If you’re not familiar with CTE’s, check out this link about using Common Table Expressions.

For this example, I’m using the DimEmployee table in the AdventureWorksDW database. So let’s take a look at the query and then we’ll discuss it.

;with a as ( /*This select statement is our anchor statement. This will bring back our top level employees that do not report to anyone.*/ Select EmployeeKey, /*If the ParentEmployeeKey IS NULL, then we want to set the EmployeeKey as the ParentEmployeeKey*/ CASE WHEN ParentEmployeeKey IS NULL THEN EmployeeKey ELSE ParentEmployeeKey END as ParentEmployeeKey, FirstName + ' ' + LastName + ', ' + Title as Boss, FirstName + ' ' + LastName + ', ' + Title as Employee, 0 as [Level] From DimEmployee where ParentEmployeeKey is null Union All /*This select statement joins our source table back to our CTE using the ParentEmployeeKey.*/ Select e.EmployeeKey, e.ParentEmployeeKey, a.Employee as Boss, e.FirstName + ' ' + e.LastName + ', ' + e.Title, [Level] + 1 as [Level] From DimEmployee e /*This is our recursive join back to our CTE.*/ inner join a on a.EmployeeKey = e.ParentEmployeeKey ) Select * From a

The first thing you’ve probably noticed is that my CTE contains two select statements union’ed together. The first select statement is whats called our anchor statement. The first select statement only brings back our top level employees that do not report to anyone. In the case of AdventureWorksDW, there is only one top level employee (Ken Sanchez, CEO), but your situation could have multiple top levels.

The second select statement returns all employees, but it does something very interesting. The select statement actually join DimEmployee to the CTE it is contained in, thus creating a recursive join. By joining DimEmployee.ParentEmployeeKey to the EmployeeKey in the CTE, we are able to recursively join our results back to itself over and over until no records with a ParentEmployeeKey are left.

But if you remember, my requirements were to show every ParentEmployeeKey in the hierarchy of each Employee. So our first CTE alone isn’t going to cut it. To get all ParentEmployeeKey’s in the hierarchy, we will need a second CTE with a recursive join. Here is the select statement that returns those results.

;with a as ( /*This select statement is our anchor statement. This will bring back our top level employees that do not report to anyone.*/ Select EmployeeKey, /*If the ParentEmployeeKey IS NULL, then we want to set the EmployeeKey as the ParentEmployeeKey*/ CASE WHEN ParentEmployeeKey IS NULL THEN EmployeeKey ELSE ParentEmployeeKey END as ParentEmployeeKey, FirstName + ' ' + LastName + ', ' + Title as Boss, FirstName + ' ' + LastName + ', ' + Title as Employee, 0 as [Level] From DimEmployee where ParentEmployeeKey is null Union All /*This select statement joins our source table back to our CTE using the ParentEmployeeKey.*/ Select e.EmployeeKey, e.ParentEmployeeKey, a.Employee as Boss, e.FirstName + ' ' + e.LastName + ', ' + e.Title, [Level] + 1 as [Level] From DimEmployee e /*This is our recursive join back to our CTE.*/ inner join a on a.EmployeeKey = e.ParentEmployeeKey ) , b as ( /*Once again this is our anchor statement. We are only bringing in Employees who report to someone besides themselves.*/ Select EmployeeKey, ParentEmployeeKey, Boss, Employee, [Level] From a Where ParentEmployeeKey != EmployeeKey UNION ALL /* This is where it gets tricky. In this recursive join, we join on the ParentEmployeeKey's in CTE a with the EmployeeKeys in this CTE, but we display the ParentEmployeeKey of this CTE. This allows us to show the parent of the parent recursively. Essentially we are going back up the hierarchy to display the boss of each level in the hierarchy.*/ Select a.EmployeeKey, b.ParentEmployeeKey, b.Boss, a.Employee, b.Level From a inner join b on a.ParentEmployeeKey = b.EmployeeKey ) /* In the final select statement, we must once again query our first CTE, a, to gather the employee that do no report to anyone. Those employees are at level 0. */ Select EmployeeKey, ParentEmployeeKey, Boss, Employee, [Level] as LevelsDownFromTop From a Where [Level] = 0 union all /* Then we query CTE b to get all the other employees along with every employee above them in their hierarchy. */ Select * From b Order by employeekey, [Level]

 

 

 

 

 

 

 

 

 

 

Essentially what we are doing in our second CTE, called “b”, is we are joining on the ParentEmployeeKey’s in the first CTE, “a”, with the EmployeeKey in b. This allows us to show the parent of the parent so we can go up the hierarchy and return all the bosses in an employee’s hierarchy. Here is a screen shot of some of the results.

QueryResults

 

As you can see, the third record shows that Guy Gilbert reports to Jo Brown. One level up from Jo Brown, we see that Guy Gilbert reports to Peter Krebs, who is Jo Brown’s boss. And finally we can see that Guy also reports to Ken Sanchez, since Ken is the CEO.