Output 6.12 PROC SQL Output for Expanding a Hierarchy
How It Works
This solution uses a self-join (reflexive join) to match employees and their supervisors.
The SELECT clause assigns aliases of A and B to two instances of the same table and
retrieves data from each instance. From instance A, the SELECT clause performs the
following:
• selects the ID column and assigns it a label of Employee ID
• selects and concatenates the FirstName and LastName columns into one output
column and assigns it a label of Employee Name
From instance B, the SELECT clause performs the following:
• selects the ID column and assigns it a label of Supervisor ID
• selects and concatenates the FirstName and LastName columns into one output
column and assigns it a label of Supervisor Name
In both concatenations, the SELECT clause uses the TRIM function to remove trailing
spaces from the data in the FirstName column, and then concatenates the data with a
single space and the data in the LastName column to produce a single character value for
each full name.
trim(A.FirstName)||' '||A.LastName label="Employee Name"
When PROC SQL applies the WHERE clause, the two table instances are joined. The
WHERE clause conditions restrict the output to only those rows in table A that have a
supervisor ID that matches an employee ID in table B. This operation provides a
supervisor ID and full name for each employee in the original table, except for those
who do not have a supervisor.
where A.Supervisor=B.ID and A.Supervisor is not missing;
Note: Although there are no missing values in the Employees table, you should check
for and exclude missing values from your results to avoid unexpected results. For
example, if there were an employee with a blank supervisor ID number and an
employee with a blank ID, then they would produce an erroneous match in the
results.
Expanding Hierarchical Data in a Table 191