Why does SQL Server handle similar queries differently in terms of optimization?

I’ve been experimenting with a basic Employees table in SQL Server 2019. I’m trying to find employees who earn less than 30,000 and whose managers have left the company. I wrote two queries that should give the same result:

-- Query 1
SELECT E1.employee_id
FROM Employees E1
LEFT JOIN Employees E2 ON E1.manager_id = E2.employee_id
WHERE E1.salary < 30000
  AND E2.employee_id IS NULL
  AND E1.manager_id IS NOT NULL
ORDER BY E1.employee_id

-- Query 2
SELECT employee_id
FROM Employees
WHERE salary < 30000
  AND manager_id NOT IN (SELECT employee_id FROM Employees)
ORDER BY employee_id

The first query runs in about 362 ms, while the second takes 776 ms when tested on 13 different tables. I thought SQL Server would optimize both queries the same way. Can someone explain why there’s a performance difference? Is it because of the query structure, or could it be related to indexes and statistics? I’m confused because I always believed SQL Server’s optimizer would handle logically equivalent queries similarly.

hey, lucaspixel23, thats a good point—maybe the optimizer treats outer joins diff than subqueries. its kinda weird, right? have you looked at the execution plans to see which part is slower? curious to hear if any indexing tweaks helped?

Your observation is intriguing, LucasPixel23. The performance difference likely stems from how SQL Server’s optimizer handles JOIN operations versus subqueries. The LEFT JOIN approach in Query 1 allows for more efficient data access patterns, especially if there’s an index on manager_id. Query 2’s NOT IN subquery might force a less optimal execution plan, potentially causing more table scans or lookups. To truly understand the discrepancy, I’d recommend examining the execution plans for both queries. This will highlight the specific operations causing the performance gap and potentially reveal opportunities for query or index optimization.

yo lucas, good catch on that! sql server’s optimizer can be weird sometimes. maybe the subquery in Query 2 is throwing it off? have u tried rewriting it as an EXISTS clause? that might help. also, check if there’s any missing indexes on manager_id or employee_id columns. just a thought!