You’re Doing it Wrong: 5 Factors That Affect Database Performance

Yes, I mean you. You’re doing it wrong.

You didn’t mean for it to be that way, I know. But the truth is that things happen, and when they do I am brought in to find out what has gone wrong. I see repeating patterns in the database issues I am asked to investigate. I keep track of them as I see them, and I have been able to place them into some general buckets for you here.

When it comes to database performance, these are the five factors that I see causing end users pain.

1. Memory Configuration

I’ve written before about memory configurations but this bears repeating: if you are relying on the default settings for Microsoft SQL Server then you are simply asking for trouble.

SQL Server (like other database systems such as Oracle and Sybase) needs to read data pages into its internal memory before they can be used. Of course, your server needs memory to operate as well. When your database engine and your server are competing for the same memory resources, you get bad performance. You want your server and your database engine to be like dancing partners, not like my kids fighting over the last cupcake.

It is often easy to spot when your database server is under memory pressure. The Available MBytes memory counter is the leading indicator that your server O/S is seeing significant memory pressure (PRO TIP: if the amount of ‘available’ memory is low, that is often bad). From there you can investigate a bit more to see how much of the available server memory is being used by your database instance. For SQL Server, a quick query would be this one from Glenn Berry (blog | @GlennAlanBerry):

-- Good basic information about OS memory amounts and state (Query 28)
SELECT total_physical_memory_kb, available_physical_memory_kb,
       total_page_file_kb, available_page_file_kb,
       system_memory_state_desc
FROM sys.dm_os_sys_memory WITH (NOLOCK) OPTION (RECOMPILE);

-- You want to see "Available physical memory is high"
-- This indicates that you are not under external memory pressure

The best way to avoid this issue altogether is to configure the max memory setting for your instance. If you want to know what value to choose for your max memory setting, I would point you to this blog post by Jonathan Kehayias (blog | @SQLPoolboy); I like the formula he has been using, and it is better than any I had been using previously.
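For reference, here is a minimal sketch of how that value gets applied with sp_configure. The 57344 number below is just an example for a hypothetical 64GB server (leaving roughly 8GB for the O/S and everything else), not a recommendation; run Jonathan’s formula against your own hardware to get your number.

-- Example only: cap the instance at 57344 MB (56GB) on a hypothetical 64GB server
-- 'max server memory (MB)' is an advanced option, so expose it first
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 57344;
RECONFIGURE;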

2. Putting It All On One Disk

As a database administrator I know that for years we have been advocating the use of dedicated drives for our data and log files. Not only is it good for performance, but it helps to mitigate disasters by keeping the transaction logs on a different drive than the data. We have even been able to convince server admins that we need a dedicated set of disks just for tempdb for an additional performance gain. So why am I seeing a sudden increase in database servers using one disk for data, logs, and tempdb?

Virtualization.

It would seem that for some reason folks think that by being virtual it is now OK to load up everything on one disk device. If you are lucky, that disk device is dedicated to your guest, but I am willing to bet that it is actually shared with other guests as well. This leads to a LOT of disk saturation (an excessive amount of read and write activity). It also leads to my colleagues saying things like “hey, can you pass along my contact info to those guys, they’ll be needing some help soon.”

It is extra overhead to create extra devices, I know. But loading up everything on one big RAID-5 device isn’t the answer (those can fail, too). As a result of this over-provisioning of disk resources, I now tell my customers that they can no longer expect an average disk read and write latency of 20ms or less. For many, those days are long past. I see averages these days closer to 40ms, double what had been considered acceptable previously.

You can get that performance back by using dedicated devices, just like you used to have dedicated drives.
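If you want to see where your own averages sit, here is a rough sketch that uses sys.dm_io_virtual_file_stats to calculate the average read and write latency for each database file since the instance last started:

-- Average read and write latency (ms) per database file since instance startup
SELECT DB_NAME(vfs.database_id) AS [Database Name],
       mf.physical_name AS [File],
       CASE WHEN vfs.num_of_reads = 0 THEN 0
            ELSE vfs.io_stall_read_ms / vfs.num_of_reads END AS [Avg Read ms],
       CASE WHEN vfs.num_of_writes = 0 THEN 0
            ELSE vfs.io_stall_write_ms / vfs.num_of_writes END AS [Avg Write ms]
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
INNER JOIN sys.master_files AS mf
        ON vfs.database_id = mf.database_id
       AND vfs.file_id = mf.file_id
ORDER BY [Avg Write ms] DESC;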

3. Excessive Indexing

I often see database designs that contain WAY too many indexes. How many is too many? That depends, of course. The way I like to define “too many” is by first checking to see if there are duplicate indexes. If I have duplicates, then I have too many indexes. Simple enough in theory, right?

The way I see a lot of duplicate indexes being introduced into systems is twofold: using the Database Engine Tuning Advisor (DTA), or using the missing index hints from SQL Server Management Studio (SSMS). I can spot the use of DTA indexing easily, as those indexes will have a name prefixed with ‘_dta’. The indexes created from the hints shown in SSMS do not have a similar default naming convention, but when you come across a table with a dozen indexes that are essentially duplicates of others, you can start to get a sense that someone has been relying on advice from one tool or the other.
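As a rough illustration of what I mean by duplicates (a generic sketch, not my full duplicate index script), comparing the key columns of the indexes on each table will surface the exact matches:

-- A generic sketch: find indexes on the same table with identical key columns
;WITH index_keys AS
(
    SELECT i.[object_id], i.index_id, i.name,
           key_cols = STUFF((SELECT ',' + CAST(ic.column_id AS VARCHAR(10))
                             FROM sys.index_columns AS ic
                             WHERE ic.[object_id] = i.[object_id]
                               AND ic.index_id = i.index_id
                               AND ic.is_included_column = 0
                             ORDER BY ic.key_ordinal
                             FOR XML PATH('')), 1, 1, '')
    FROM sys.indexes AS i
    WHERE i.index_id > 0
      AND OBJECTPROPERTY(i.[object_id], 'IsUserTable') = 1
)
SELECT OBJECT_NAME(a.[object_id]) AS [Table Name],
       a.name AS [Index Name],
       b.name AS [Duplicate Of]
FROM index_keys AS a
INNER JOIN index_keys AS b
        ON a.[object_id] = b.[object_id]
       AND a.key_cols = b.key_cols
       AND a.index_id > b.index_id
ORDER BY [Table Name];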

The idea behind an index is a simple one: you want to find a piece of information faster than going through each and every page of data on disk. When your database is only used for reads, the additional indexes are not likely to be as much of a problem for query performance. The problems really kick in with the DUI (Delete, Update, Insert) statements, as each one of those statements will need to touch each of the indexes on the table. That’s when performance comes to a crawl, and it can also result in deadlocking.

You can follow the link from earlier in this section for my script on finding duplicate indexes. Here is another piece of code from Glenn Berry (blog | @GlennAlanBerry) that will help you to determine if you have indexes that are seeing more write activity than reads (PRO TIP: that’s not ideal):

-- Possible Bad NC Indexes (writes > reads)
SELECT OBJECT_NAME(s.[object_id]) AS [Table Name], i.name AS [Index Name], i.index_id,
       user_updates AS [Total Writes],
       user_seeks + user_scans + user_lookups AS [Total Reads],
       user_updates - (user_seeks + user_scans + user_lookups) AS [Difference]
FROM sys.dm_db_index_usage_stats AS s WITH (NOLOCK)
INNER JOIN sys.indexes AS i WITH (NOLOCK)
        ON s.[object_id] = i.[object_id]
       AND i.index_id = s.index_id
WHERE OBJECTPROPERTY(s.[object_id], 'IsUserTable') = 1
  AND s.database_id = DB_ID()
  AND user_updates > (user_seeks + user_scans + user_lookups)
  AND i.index_id > 1
ORDER BY [Difference] DESC, [Total Writes] DESC, [Total Reads] ASC;
-- Consider your complete workload
-- Investigate further before dropping an index

Get rid of the indexes you don’t need. You will see a reduction in your I/O consumption and likely better performance. Your database and its backup files will be smaller, too.
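And when you have confirmed that an index really is dead weight, removing it is a one-liner (the names here are made up for illustration):

-- Hypothetical names: drop an index you have confirmed is not needed
DROP INDEX [IX_Orders_dta_2791] ON [dbo].[Orders];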

4. Plan Cache Bloat

When you submit a query for execution, the database engine will check to see if a plan already exists in memory. If it does, great! This saves time, as the query optimizer will not need to create a new plan. The area of memory that contains the query plans is named the plan cache. With each new release of SQL Server we are given more ways to dive into the internal memory stores, so I find the topic of the plan cache, and plan cache bloat, coming up more often these days.

[Image caption: Feeling bloated? This won’t help your plan cache.]

There is a lot of information available about the plan cache, more on troubleshooting plan cache issues, and a great article written by Bob Beauchemin (blog | @bobbeauch) on how data access code affects query performance. It’s a lot of reading, so I am going to do my best to simplify things for you a bit here.

You don’t want to see an excessive number of query plans in your plan cache that are only used once (i.e., ‘ad-hoc’ plans) because these plans take up memory that could otherwise be used to store data pages instead. You can determine for yourself how your plan cache is allocated by using this query written by Robert Pearl (blog | @PearlKnows):

WITH CACHE_ALLOC AS
(
    SELECT objtype AS [CacheType],
           COUNT_BIG(objtype) AS [Total Plans],
           SUM(CAST(size_in_bytes AS DECIMAL(18,2))) / 1024 / 1024 AS [Total MBs],
           AVG(usecounts) AS [Avg Use Count],
           SUM(CAST((CASE WHEN usecounts = 1
                          THEN size_in_bytes
                          ELSE 0
                     END) AS DECIMAL(18,2))) / 1024 / 1024 AS [Total MBs - USE Count 1],
           CASE
               WHEN (GROUPING(objtype) = 1) THEN COUNT_BIG(objtype)
               ELSE 0
           END AS GTOTAL
    FROM sys.dm_exec_cached_plans
    GROUP BY objtype
)
SELECT [CacheType], [Total Plans], [Total MBs],
       [Avg Use Count], [Total MBs - USE Count 1],
       CAST([Total Plans] * 1.0 / SUM([Total Plans]) OVER() * 100.0 AS DECIMAL(5,2)) AS Cache_Alloc_Pct
FROM CACHE_ALLOC
ORDER BY [Total Plans] DESC;

That helps you discover if you have plan cache issues after they have already happened. How do you discover if you are having problems at the moment? That’s simple, too. You just want to check for an excessive number of compilations. After all, you can’t have a lot of these ad-hoc plans getting stored in the plan cache without them all being compiled. Robert comes to the rescue again with this bit of code for us:

SELECT t1.cntr_value AS [Batch Requests/sec],
       t2.cntr_value AS [SQL Compilations/sec],
       plan_reuse = CONVERT(DECIMAL(15,2),
                    (t1.cntr_value * 1.0 - t2.cntr_value * 1.0) / t1.cntr_value * 100)
FROM master.sys.dm_os_performance_counters AS t1,
     master.sys.dm_os_performance_counters AS t2
WHERE t1.counter_name = 'Batch Requests/sec'
  AND t2.counter_name = 'SQL Compilations/sec';

You can take some quick measurements to see if your system is seeing an excessive number of compilations. A commonly cited guideline is that no more than 10% of your batch requests should be compilations; in other words, you want the plan_reuse value above to be 90% or higher.

The best way I know to avoid plan cache issues is to write your code with query plan reuse in mind.
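What does writing with plan reuse in mind look like? Here is a quick sketch; the table and parameter names are invented for illustration. Concatenating literal values into the statement produces a new single-use plan for every distinct value, while a parameterized call through sp_executesql lets one plan be reused:

-- Ad-hoc: every distinct literal value compiles its own single-use plan
-- SELECT OrderID, OrderDate FROM dbo.Orders WHERE CustomerID = 42;

-- Parameterized: one plan, reused for every value (hypothetical object names)
EXEC sp_executesql
    N'SELECT OrderID, OrderDate FROM dbo.Orders WHERE CustomerID = @CustomerID',
    N'@CustomerID INT',
    @CustomerID = 42;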

5. One Size Does Not Fit All

My experience as a database administrator is that most systems start out as a way to store data. They are designed with one purpose in mind: to stuff as much data as possible, as quickly as possible, inside the tables. If we are lucky, these databases are designed to be normalized, to help with the insertion of data and to protect data integrity. These are often called online transaction processing (OLTP) systems.

Eventually someone else comes along and says “what a lovely bunch of data you have there, may I cursor through all your rows?” They start asking for exports of the data, or they start writing reports directly against the data. They want to mine every piece of that data looking for some piece of information that will help them make the best possible business decision. What they are doing is trying to use the data for analytical processing. They want an OLAP (online analytical processing) system.

Here’s the problem: OLTP is not the same as OLAP. When end users try to use a system designed for OLTP as an OLAP system, it leads to performance issues. For SQL Server this is usually where locking and blocking rear their heads. It was not uncommon for me to see contention between users that wanted to insert data and users that wanted to generate reports. I had an alert built just to notify me when a session was blocked for more than five minutes. Can you imagine waiting that long and considering it part of your “normal” processing?

I can. I see it frequently when OLTP systems are being used for OLAP purposes.

What you want to do here is build a reporting solution for those OLAP users. That solution could be just a few tables optimized for reporting purposes or a full-blown SSRS solution. Another possible answer is to denormalize your data, if possible, but understand that this moves performance issues from the reporting functions onto the DUI functions.
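As a minimal sketch of the first option (every table and column name here is invented), you could rebuild a small denormalized reporting table during an off-hours window so the report queries stop contending with the OLTP inserts:

-- Hypothetical names: rebuild a denormalized reporting table off-hours
TRUNCATE TABLE rpt.DailySales;

INSERT INTO rpt.DailySales (SaleDate, CustomerName, TotalAmount)
SELECT CAST(o.OrderDate AS DATE),
       c.CustomerName,
       SUM(od.Quantity * od.UnitPrice)
FROM dbo.Orders AS o
INNER JOIN dbo.Customers AS c ON c.CustomerID = o.CustomerID
INNER JOIN dbo.OrderDetails AS od ON od.OrderID = o.OrderID
GROUP BY CAST(o.OrderDate AS DATE), c.CustomerName;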

There you go: the five buckets that I see causing performance issues, and each one is something you control.

If only you weren’t doing it wrong.

Comments

  1. whoa whoa whoa, you mean to tell me that the vendor app that has a 93.81% share of the plan cache via ad-hoc queries is doing it wrong?!

  2. Just discovered your site, definitely interesting. I agree with what you’re saying and clearly you know your stuff, one thing I’ve seen a lot of is using OLTP databases for reports / analytics and in fact I’d say it’s fairly common. What do you propose is a good solution for this? What if people want to run reports and analysis on a very live and active database?

    • Jeremy,

      It depends. What I have seen and heard done involves things like replication, or mirroring. The idea is to offload the data to a different system. With SQL2012 you can use AlwaysOn to easily create some read-only replicas of your data to use as a reporting solution.

      I’ve also seen the case where a database was restored nightly as the reporting database, so that way each user had a different database to use each day in order to reduce contention.

      HTH

    • Philo,

      It is a well documented best practice to configure the max server memory setting. I linked to Jonathan’s post on the topic (and he has others), but here is a link to MSDN as well: http://msdn.microsoft.com/en-us/library/ms178067.aspx

      That MSDN article states that “SQL Server adds memory only when its workload requires more memory; a server at rest does not increase the size of its virtual address space.” Of course I’ve seen wrong articles before on MSDN, but I don’t believe that to be the case here.

    • This is very incorrect. Starting in SQL Server 2005, when the SQLOS was introduced, the Resource Monitor thread inside of SQL Server monitors for LowMemoryResourceNotification being set by the Windows OS, which is what triggers external memory pressure notifications internally in SQLOS and causes the internal and external clock hands for all caches to sweep. If you don’t set ‘max server memory’, the SQLOS will constantly be in a process of growing, then shrinking, memory usage, causing cache flushing to repeatedly occur, and it will never get to a stable state. Setting ‘max server memory’ is still a best practice in all versions of SQL Server through 2012. You can look at the resource monitor ring buffer entries as shown here:
      http://sqlskills.com/blogs/jonathan/post/Identifying-External-Memory-Pressure-with-dm_os_ring_buffers-and-RING_BUFFER_RESOURCE_MONITOR.aspx
      You are misinterpreting what Guy, who I’ve discussed numerous memory internals topics with, is saying on that reply.

  3. A great article, I declare myself ‘guilty’ of excessive indexing after up-sizing an Access database. The script exposed the ‘overhead’ indexes, all writes & no reads 🙁

    • Jacob,

      That depends on your workload, whether you are seeing memory issues, and whether the ad-hoc plans are being reused at all. I would tell you to enable the ‘optimize for ad hoc workloads’ configuration setting in order to reduce the space taken up by single-use plans.
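      For reference, turning that setting on is a quick bit of T-SQL (it is an advanced option):

      EXEC sp_configure 'show advanced options', 1;
      RECONFIGURE;
      EXEC sp_configure 'optimize for ad hoc workloads', 1;
      RECONFIGURE;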

  4. Quick question – for 4. Plan Cache Bloat, how does one interpret the results that appear in the various columns? I don’t mean exact answers, just relative guidelines for what might appear high or low. Thanks – great article.

    • There is no definitive answer for this, but if I saw that a majority of the plans in my cache were ad-hoc or single use, then I’d investigate further. HTH
