Limiting SQL Server results by distinct values using FETCH NEXT

I’m working on a database query for training sessions and need help with pagination.

My table returns data like this:

training_id | session_date | student_name
-----------+--------------+-------------
TR001      | 2022-01-15   | Mike Smith
TR001      | 2022-01-15   | Sarah Jones
TR001      | 2022-01-16   | Mike Smith
TR001      | 2022-01-16   | Sarah Jones

Each training course appears multiple times because of different dates and students. I need to paginate by getting the first 25 unique training courses, not 25 total rows.

My current query doesn’t work properly:

SELECT training_id, session_date, student_name
FROM training_sessions
ORDER BY session_date
OFFSET 0
FETCH NEXT 25 ROWS ONLY

This gives me 25 rows but they might represent only a few actual training courses since each course has multiple entries.

I tried using a CTE approach:

WITH unique_trainings AS 
(
    SELECT training_id
    FROM training_sessions
    WHERE session_date <= '2022-12-31'
    ORDER BY training_id
    OFFSET 0 ROWS
    FETCH NEXT 25 ROWS ONLY
)
SELECT * 
FROM training_sessions t
INNER JOIN unique_trainings ut ON ut.training_id = t.training_id
WHERE session_date <= '2022-12-31'

Is there a better way to limit results based on distinct column values rather than total row count?

your CTE approach works well, but there’s a simpler way. skip the join and use EXISTS instead - it’s typically faster for this. another option: use DISTINCT in a subquery with TOP 25, then join back for the related rows.

You’re on the right track with CTEs, but there’s a way to make this cleaner. The issue is you’re grabbing all columns in the first CTE when you only need distinct training IDs for pagination. Try DENSE_RANK() instead of separate CTEs. It numbers distinct values consecutively - perfect for paginating unique training courses. Use DENSE_RANK() OVER (ORDER BY training_id) to give the same rank to all rows from the same training, then filter with rank <= 25. This cuts out the joins and gives you cleaner code. You’ll get exactly 25 distinct training courses with all their rows.

Interesting challenge! Try ROW_NUMBER() with a partition - windowing functions should work well here. What SQL Server version are you on? And do you need all rows from those 25 trainings or just samples from each?