Friday, 6 November 2015

Out of Date Statistics

This week a BI specialist came up to me and asked me to inspect a query that was not performing very well.  He knew that I have attended a week long Microsoft course called ‘SQL Server 2012: Performance Tuning- Design, Internals and Architecture’ so I have had the opportunity to help him out on a number of occasions for many performance based questions. 

He sent the query over and I ran it and inspected the actual query plan by pressing the ‘Include Actual Execution Plan’ button in SQL Server 2012 Management Studio.







(Example: Include Actual Execution Plan button)

The query was not very long or complicated so I was surprised when it took several minutes to return data.  When I inspected the query execution plan I didn’t notice anything that looked like it should be causing a long execution time.  Quite the opposite was true actually.

When I broke the query into chunks and ran each chunk individually the performance was fine until I introduced a specific table.  We will call it TableA.  I investigated the meta-data of TableA and discovered that it was a large table but it had appropriate indexes on it including a clustered b-tree index on a surrogate integer column. 

When I examined the query plan for that TableA, it indicated that it was using an Clustered Index Seek on with a Key Look up.  I was surprised to see this because this really is the ideal situation and is the best choice for highly selective queries.  It means that the optimizer has located an appropriate index.  The Clustered index seek means that the engine will not have to scan all the rows in the data pages to find what it is looking for.  It can traverse the b-tree index and in just a couple reads get right to the index key it is looking for.  It can then do a quick key look up and go right to the needed data.








(Example: Clustered Index Seek operator)





(Example: Key Lookup operator)

So this struck me as strange. If everything was so ideal then why was it taking so long?  If you guessed ‘Statistics’ then you are correct.  The first thing that came to mind was that the statistics may have been out of date for that index. 

I suggested that the developer identify the statistics associated with that index and see when it was last updated. 

















(Example: Statistics node associated with the Products table)






















(Example: Property page of a statistic object shows the last time they were updated.)

If the statistics were out of date then they should be updated by running a similar command to the following:
 
USE AdventureWorks2012;
GO
UPDATE STATISTICS Production.Product(Products)
    WITH FULLSCAN, NORECOMPUTE;
GO

'The following example updates the Products statistics in the Product table, forces a full scan of all rows in the Product table, and turns off automatic statistics for the Products statistics.'

The developer came back to me later that day and said he had updated the statistics and now the query was completing in just a few seconds.

Thanks for reading,

SQL Canuck

No comments:

Post a Comment