I’m on my way to the PASS Business Analytics conference today. If you are attending, and you are reading this blog post, come find me and Rob Collie (blog | @powerpivotpro) during the welcome reception on Wednesday night; we’d love to talk data.
Speaking of data…I was finishing up a session at the Colorado Springs SQL Saturday recently when I was struck with an idea. The session is “Database Design: Size DOES Matter” and it’s been well received over the past year (probably because there are so many awful database designs out there). The talk focuses on datatypes, how to recognize when they are a problem, and what to do to mitigate the performance issues.
My thought was brought about by the question I ask during the session: “How many rows do you want to fit on a page?” The answer is “all of them”, but I also accept “as many as possible”. Since SQL Server (and most all database platforms) will read data pages from disk into a memory buffer, the more rows you can store on one page results in better performance. If you have a query that returns 100 rows, but you can only store one row on a page, then you will need to pull 100 pages from disk in order to return the 100 rows. If you can fit the same 100 rows onto one page then your query will perform faster.
So, wouldn’t it be cool if you could quickly look into a database to see which objects were close to storing one row per page?
Turns out we can! I have a script here that works for SQL2012 and later. This script will return the name of the table, the name of the index, the number of pages, the number of rows, and the ratio of rows to pages. It then orders the results by the objects storing the fewest number of rows on a page, giving you the opportunity to focus your tuning efforts on those objects first.
Here’s the quick script for you. And yes, you *could* consider this my own “Rowsie Project“.
Script disclaimer, for people who need to be told this sort of thing:
DISCLAIMER: Do not run code you find on the internet in your production environment without testing it first. Do not use this code if your vision becomes blurred. Seek medical attention if this code runs longer than four hours. On rare occasions this code has been known to cause one or more of the following: nausea, headaches, high blood pressure, popcorn cravings, and the impulse to reformat tabs into spaces. If this code causes your servers to smoke, seek shelter. Do not taunt this code.
/*============================================= File: SQL_Server_check_rows_and_pages.sql Author: Thomas LaRock, http://thomaslarock.com/contact-me/ http://thomaslarock.com/2014/05/size-matters-table-rows-and-database-data-pages Summary: This script will check the number of rows stored on a database data page. The more rows on a page, the better. If you are storing few rows per page then you will want to check the design, specifically the column datatypes chosen. Date: May 5th, 2014 SQL Server Versions: SQL2012, SQL2014 You may alter this code for your own purposes. You may republish altered code as long as you give due credit. THIS CODE AND INFORMATION IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A PARTICULAR PURPOSE. =============================================*/ SELECT st.name, si.name, si.dpages, sp.rows , 1.0*sp.rows/si.dpages AS [Ratio] FROM sys.sysindexes si INNER JOIN sys.partitions sp ON si.id = sp.object_id INNER JOIN sys.tables st on si.id = st.object_id WHERE si.dpages > 1 --objects that have used more than one page AND st.type = 'U' --user tables only AND si.indid = sp.index_id AND si.rows > 1000 --objects with more than 1000 rows ORDER BY [Ratio] ASC
I’m thinking the next logical step for something like this is to then find which of these objects are currently in memory. While it is good to know which parts of the schema need adjusting, it is also useful to know which ones are being accessed frequently. But I’ll leave that exercise for another blog post.
See you in San Jose!