29 Aug 2013 You Need Three Heads To Read This Post Properly
Here’s the first item for you:
Recently Grant Fritchey (blog | @gfritchey) wrote an article titled “Be Cautious Offering Guidance”. The point of the article was not to call out the fact that Microsoft offered limited information on the MSDN article for the CXPACKET wait event. The article was explaining that offering guidance can be a hard thing to convey. In the CXPACKET entry Microsoft had offered a one sentence suggestion as to how to eliminate CXPACKET waits:
You may consider lowering the degree of parallelism if contention on this wait type becomes a problem.
Yeah, and I may consider a handful of other things, too. Why not tell me about those as well? One sentence is simply not enough information for this wait event.
Keep that in mind as we move on to item number two:
Around the time that his post came out I was alerted that the CXPACKET article inside of the Confio LogicalRead section was out of date and contained errors. Unlike the MSDN entry our post was more than one sentence. I’m not excited about having bad information on SQL Server laying about the internet, and neither is anyone else. More to the point, Confio is not in the business of passing around bad information. So having this article sitting there was an issue for us, one that we took very seriously.
Keep that in mind as we move to item number three:
My role for Confio has recently been altered. My new title is “Technical Evangelist”. Confio has never had a full time Technical Evangelist before and in many ways we are both trying to figure out what it is I will be doing on a regular basis. One item that has been tasked to me? Helping to ensure the accuracy of articles that currently exist inside of LogicalRead.
There we have it. Three items, all in one blog post. A fine example of working in parallel (i.e., parallelism), right?
No, not really.
Parallelism for the three items above would have required you to grow two extra heads so that each head could read one-third of this post. That didn’t happen here (I assume), so this post isn’t an example of something working in parallel.
The Two Questions Every Administrator Needs To Answer
I often tell people that as data administrators they are tasked with two questions, no matter if they are being reactive or proactive, that they will ask themselves over and over again:
- Is this a problem?
- What can I do about it?
When users see CXPACKET waits they often assume that parallelism is the problem. (HINT: It isn’t, really). Once they have (wrongfully) determined that they need to eliminate CXPACKET waits they then skip to question 2 and start thinking about ways to solve the issue without solving the real root cause.
And that’s where the trouble begins. The intertubz is full of links that tell you the only solution is to alter the ‘max degrees of parallelism’ configuration setting in order to “fix” CXPACKET waits.
You aren’t fixing anything by doing that. You are, at best, avoiding the real problem, and likely making performance even worse for everyone else using that instance by forcing their queries to use fewer processors than they have been using previously.
But It *IS* a Problem, So What Should We Do About CXPACKET?
Here are the five things I would recommend a user do if they are concerned about CXPACKET and have identified that a query is taking longer to execute than it should.
- Tune the query by focusing on what the underlying sessions are actually waiting for (HINT: It is not CXPACKET). You can find the associated parallel tasks in the sys.dm_os_waiting_tasks system view.
- If tuning is not an option the adjust the cost threshold for parallelism.
- If neither option above works, then consider using the MAXDOP query hint.
- If none of that works, then consider limiting parallelism for the entire instance by setting max degree of parallelism to be a number less than the number of logical cores. (Note that this setting will be overridden by the use of the MAXDOP query hint, should both be used at the same time).
- If absolutely none of that works for you, then you may want to try using Resource Governor to set the degree of parallelism for a workload group.
You can read the new LogicalRead post on CXPACKET for yourself, it contains more information on what the wait event is, why it happens, and the possible resolutions for you to try.