SQL Query Review: Staff Reporting Structure and Department Compensation Analytics

I need help reviewing two SQL queries for a staff database system. I’m concerned about performance and correctness issues.

First Query - Staff Reporting Chain

This query shows employee info (job title, first name, last name) plus their supervisor’s title by self-joining the staff table. The supervisor column stores full names as “FirstName LastName” format.

SELECT 
    emp.JobTitle, 
    emp.FirstName, 
    emp.LastName, 
    mgr.JobTitle AS SupervisorTitle
FROM 
    staff_data AS emp
INNER JOIN 
    staff_data AS mgr
    ON emp.ManagerName = mgr.FirstName + ' ' + mgr.LastName
WHERE 
    emp.ManagerName IS NOT NULL
ORDER BY 
    mgr.JobTitle ASC, 
    emp.JobTitle ASC;

Second Query - Department Pay Analysis

This analyzes total compensation by department, finds highest paid person per department, and calculates what percentage the top earner represents.

WITH DeptPayCTE AS 
(
    SELECT DepartmentID, SUM(Compensation) AS TotalDeptPay
    FROM staff_records_2024
    GROUP BY DepartmentID
),
HighestPaidCTE AS 
(
    SELECT DepartmentID, EmployeeName, Compensation, 
           ROW_NUMBER() OVER (PARTITION BY DepartmentID ORDER BY Compensation DESC) AS RankNum
    FROM staff_records_2024
),
TopEarnerCTE AS 
(
    SELECT DepartmentID, EmployeeName AS HighestPaidName, Compensation AS MaxPay
    FROM HighestPaidCTE
    WHERE RankNum = 1
)
SELECT 
    dp.DepartmentID,
    dp.TotalDeptPay,
    te.HighestPaidName,
    te.MaxPay,
    ROUND(CAST(te.MaxPay AS DECIMAL(12, 4)) / dp.TotalDeptPay * 100, 4) AS PayPercentage,
    CASE
        WHEN te.MaxPay < 0.5 * dp.TotalDeptPay THEN 'Yes'
        ELSE 'No'
    END AS HasOptimizationOpportunity
FROM 
    DeptPayCTE dp
JOIN 
    TopEarnerCTE te ON dp.DepartmentID = te.DepartmentID
ORDER BY 
    dp.DepartmentID ASC;

Main concerns:

  1. Is the string concatenation join in the first query problematic for performance?
  2. Could the decimal casting cause precision or division errors?
  3. Should I use DENSE_RANK instead of ROW_NUMBER for handling salary ties?
  4. Are there better approaches for these requirements?

The first query seems slow and the name matching feels unreliable. The second query works but I’m worried about edge cases with zero values.

man, that first query is a total performance nightmare—string concatenation in joins really ruins your chances of using indexes. way better to store manager IDs instead, makes it cleaner and faster. and be careful with that second query, it’ll crash if any dept has zero total pay due to division by zero!

Yes, the string concatenation in your first query can significantly impact performance. It won’t utilize indexes efficiently and requires evaluating the function for each row. A better approach would be to create a computed column with the concatenated name and index it, or even store the manager ID directly instead of names, which enhances foreign key relationships.

Regarding your second query, casting to decimal is a great strategy to avoid integer division issues, but ensure you accommodate departments with zero total compensation to prevent division errors. The use of ROW_NUMBER is suitable here since you want a single top earner per department, while DENSE_RANK could complicate results by allowing multiple top earners.

In summary, revise the first query to use numeric IDs instead of string concatenation, and include a filter for departments with zero compensation in the second query.

nice setup! quick question though - what if you’ve got employees with the same name in that first query? looks like it could easily grab the wrong manager. and for the second query, what happens when multiple people tie for the top salary? row_number might skip some of them.