I’m working with two database tables and trying to combine them using a LEFT JOIN operation. Here’s my current query:
SELECT DISTINCT emp.badge_number, proj.project_code
FROM project_files proj
LEFT JOIN employee_data emp ON emp.employee_id = proj.employee_id
AND lower(emp.department) LIKE N'%dev-team-alpha%'
WHERE proj.project_code IN ('A1205.pdf','B3847.pdf','C4921.pdf','D7103.pdf');
The results I’m getting look like this:
badge_number project_code
--------------------------------
NULL B3847.PDF
NULL C4921.PDF
NULL D7103.PDF
DEV-EMP192 C4921.PDF
DEV-EMP205 D7103.PDF
I have two main issues here. First, the file A1205.pdf is completely missing from my results. Second, I’m seeing duplicate NULL entries for C4921.PDF and D7103.PDF when I only want one row per file.
What I actually want to see is:
badge_number project_code
--------------------------------
NULL A1205.PDF
NULL B3847.PDF
DEV-EMP192 C4921.PDF
DEV-EMP205 D7103.PDF
How can I modify my query to get these results without the duplicates?
You’re getting multiple NULL rows because there are several employee records per project that don’t match your ‘dev-team-alpha’ filter. When the LEFT JOIN fails to find a matching employee, SQL creates a NULL row - but if there are multiple non-matching employees for the same project, you get multiple NULL rows.
Here’s how to fix it - filter the employee data first, then join with distinct project records:
SELECT
CASE WHEN emp.badge_number IS NOT NULL THEN emp.badge_number ELSE NULL END as badge_number,
proj.project_code
FROM (
SELECT DISTINCT project_code, employee_id
FROM project_files
WHERE project_code IN ('A1205.pdf','B3847.pdf','C4921.pdf','D7103.pdf')
) proj
LEFT JOIN (
SELECT employee_id, badge_number
FROM employee_data
WHERE lower(department) LIKE N'%dev-team-alpha%'
) emp ON emp.employee_id = proj.employee_id;
This eliminates the duplicate NULL issue by ensuring you only get one row per project.
Hmm, interesting! First thing - is A1205.pdf actually in your project_files table? Also, how many employees are linked to each project? I’m wondering if you’ve got multiple employees per project, which would explain those null duplicates when the join fails.
your WHERE clause is totally filtering out A1205.pdf - it prolly doesn’t have any employee records. try a subquery to grab distinct projects first, then left join with employees. this’ll work better and stop those annoying duplicate nulls.