Database query to fetch ID when specific value is absent from all records

I need help with a database query that returns a customer_id only when NONE of the associated records contain a particular value.

Here’s my sample data:

customer_id | purchase_date
100         | 2020-01-15
100         | 2020-07-22
100         | 2020-12-10

I want to write a query that returns the customer_id only if none of the purchase_date values match 2020-07-22. My current attempt doesn’t work:

select customer_id
from sales_table
where purchase_date != '2020-07-22'

This query incorrectly returns customer_id 100 because two rows don’t match the date, even though one row does match. I need zero results since at least one record contains the target date.

The query should return results only for cases like this where no matching dates exist:

customer_id | purchase_date
200         | 2020-03-10
200         | 2020-08-15
200         | 2020-11-20

How can I modify my query to exclude entire customer_id groups when any row contains the specified date?

try GROUP BY with a HAVING clause instead:

select customer_id 
from sales_table 
group by customer_id 
having count(case when purchase_date = '2020-07-22' then 1 end) = 0

this groups records by customer, then filters out anyone who bought something on that date.

Your WHERE clause is checking individual rows instead of looking at each customer as a whole. You need to work at the customer level with NOT EXISTS or NOT IN subqueries. NOT EXISTS is the cleanest approach: sql SELECT DISTINCT customer_id FROM sales_table s1 WHERE NOT EXISTS ( SELECT 1 FROM sales_table s2 WHERE s2.customer_id = s1.customer_id AND s2.purchase_date = '2020-07-22' ); This checks if each customer has any records matching that date. Only customers without matches get returned. NOT EXISTS usually runs faster than NOT IN on larger datasets, especially if you’ve got indexes on customer_id and purchase_date.

Interesting problem! What happens with null dates in your data though? Does NOT EXISTS handle those properly or do you need additional null checks? Also wondering about performance - have you tested both approaches on a bigger dataset to see which one’s actually faster?