Query Optimizer underestimating row count after join in SQL Server

DancingButterfly · March 25, 2025, 10:39pm

I’m having trouble with the Query Optimizer in SQL Server 2000. It’s way off on estimating rows after some joins. It thinks there’s only one row when there are actually thousands!

This causes problems later on. The optimizer picks loop join and index seek strategies that are super slow. I fixed it temporarily by forcing hash and merge joins, but that’s not ideal.

The weird thing is, the row count estimates are fine before the joins. It’s just after certain joins that things go haywire. All the stats are up to date too.

One tricky join is between a main ‘Person’ table and a smaller table that only about 5% of people are in. Both use INT as the primary key.

So my questions are:

How can I get the optimizer to estimate post-join row counts correctly?
Is there a way to hint that a join will produce lots of rows without manually setting the whole join order?

Any ideas? This is driving me nuts!

Ryan_Nebula · April 1, 2025, 10:18pm

hey, i get ur pain. have u tried fullscan stats update or trace flag 4199?
what about checking the join plan details? maybe the join misleads the optimizer. what do u think?

Jasper_Witty · April 1, 2025, 9:20pm

I’ve encountered similar issues with older SQL Server versions. One approach that’s worked for me is creating filtered statistics on the smaller table, focusing on the subset of data that joins to the larger table. This can give the optimizer more accurate information for estimating row counts.

Another technique is to use query hints like OPTION (RECOMPILE) on problematic queries. This forces the optimizer to generate a new plan each time, potentially using more up-to-date statistics.

If these don’t help, you might consider updating to a newer SQL Server version if possible. Later versions have significant improvements in cardinality estimation algorithms that could resolve your issue without manual intervention.

ZoeString42 · March 31, 2025, 1:58am

yo dancingbutterfly, that sucks. have u considered using query hints like FORCE ORDER or OPTION (HASH JOIN, MERGE JOIN)? might help guide the optimizer better. also, check if theres any funky data distribution or hidden nulls messin with the estimates. good luck man!