I’m looking for assistance with grouping continuous date records in SQL Server 2016. I have a list of products along with their entry dates, and I want to count how many consecutive dates exist in each sequence.
Sample data:
ProductID
EntryDate
5678
01/03/2022
5678
02/03/2022
5678
03/03/2022
5678
04/03/2022
5678
07/03/2022
5678
09/03/2022
9876
05/03/2022
9876
06/03/2022
9876
20/03/2022
Expected output:
ProductID
EntryDate
SequenceCount
5678
01/03/2022
4
5678
07/03/2022
1
5678
09/03/2022
1
9876
05/03/2022
2
9876
20/03/2022
1
I want to visualize the first date in each sequence of consecutive dates and how many dates fall within that sequence.
Here’s what I have tried so far:
WITH DateGroups AS (
SELECT ProductID, EntryDate,
GroupDate = DATEADD(day, -ROW_NUMBER() OVER (ORDER BY EntryDate), EntryDate)
FROM ProductTable
)
SELECT ProductID, MIN(EntryDate) as EntryDate, COUNT(ProductID) as SequenceCount
FROM DateGroups
GROUP BY GroupDate, ProductID, EntryDate
ORDER BY ProductID, EntryDate
However, this returns incorrect results, as each row shows a count of 1 instead of grouping the consecutive dates. Can anyone help me correct this logic to accurately identify and count the consecutive date sequences for each product?
you’re on the right track, but drop EntryDate from the GROUP BY - it’s breaking the consecutive date grouping. Here’s the fix:
WITH DateGroups AS (
SELECT ProductID, EntryDate,
DATEADD(day, -ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY EntryDate), EntryDate) as grp
FROM ProductTable
)
SELECT ProductID, MIN(EntryDate) as EntryDate, COUNT(*) as SequenceCount
FROM DateGroups
GROUP BY ProductID, grp
ORDER BY ProductID, MIN(EntryDate)
You’re grouping by EntryDate, which breaks the aggregation. That’s your main problem. Your ROW_NUMBER() idea is solid but needs tweaking. Here’s the trick: when you subtract row position from consecutive dates, you get the same value for each sequence. I ran into this exact issue tracking daily logins. You’ve got to partition by ProductID or sequences from different products get jumbled together. Also, COUNT(ProductID) acts weird with NULLs - just use COUNT(*) instead. Fix: partition ROW_NUMBER() by ProductID, drop EntryDate from GROUP BY, then use that calculated group identifier to aggregate your sequences. Works every time for consecutive date problems.
Interesting approach! Quick question though - what happens with duplicate dates for the same product? Your sample data looks clean, but real-world data gets messy. Also, does performance take a hit with larger datasets when you’re using windowing functions?