Just when I thought we had put him away for good, Egghead is back. After we caught him previously violating our servers in ways never thought possible, he disappeared for a few weeks, only to resurface this morning. In what I surely thought was to be an April Fool’s Day joke there was an email in my inbox this morning detailing a slight issue with the hourly transaction log backups on a particular server. Seems that our job failed on a specific vendor database, and the workaround was to take a full backup and then rerun our job to completion.
But why the sudden failure? And then I reacted as if there was a bacon grease fire that can only put out by eating all of the bacon as quickly as possible before pouring salt in the pan. I recalled a blog entry just two days ago by Colin Stasiuk and started looking for a possible correlation. Could it be possible that something, somewhere, had issued this command and broken the chain?
I looked and found that this vendor was nice enough to violate my SQL Agent by installing a few jobs. I guess I should thank them because this made my forensics much easier. I saw that one of the jobs had run at 2AM last night, which was precisely the time our transaction log job had failed. Coincidence? No way, not with vendor code involved. I skipped through the job, found the stored procedure it was calling, which led me to another stored procedure that was doing the nasty.
As best as I can tell, the logic is quite easy to follow. At 2am on the first day of month this job will run to clean up old data. Truth be told this made me more excited than trying to design my own donut. I rarely see this type of functionality built into systems these days, where older data is either archived or in this case just deleted. As I step through the code I came across this gem, which made me only slightly nervous:
/* Check to see if the cutoff date is NULL. If so, the cleanup process is skipped, but the transaction log is still truncated. */
Um…what? You would skip your cleanup process, but still decide to truncate your log? Ok, no need to panic. At least, not yet. Still, it seems odd that this is a process only running once a month. Would you want to skip your cleanup process for another month? And would you want to wait a month in between your attempt to keep your log growth in check? So, I press on and find just what Egghead has buried:
BACKUP LOG WITH TRUNCATE_ONLY
OK, it’s official, you hosed my backup chain, forcing me to do a full dump to put things back on track. I check the SQL error log and find a nice informational message, saying this statement is deprecated. In fact, this statement is not even in SQL 2008. A little more digging in the 2005 BOL tells me the following nuggets:
“After truncating the log using either NO_LOG or TRUNCATE_ONLY, the changes recorded in the log are not recoverable. For recovery purposes, immediately execute BACKUP DATABASE to take a full or full differential backup.”
Now I am pissed. Not only did they go out of their way to issue this command, but no where else in their application are they even trying to do a full backup of their database. It is as if they are completed fixated on the size of their log (hey now!) and care little about their ability to actually recover. The info in the 2005 BOL has some extra details as well, including this highlight complete with a caution icon:
“Caution: Although this option can be used to manually truncate the transaction log, we strongly recommends that you do not do this, as it breaks the log chain. Until the next full or full differential backup, the database is not protected from media failure. Use manual log truncation in only very special circumstances, and create backups of the data immediately.”
Which is followed by this note:
“Note: If you do not want to take log backups, set the database to the simple recovery model.”
And as if I didn’t have enough to shake my head over, I stumble across the most disgusting thing I think Egghead has ever spewed in my direction:
/* Shrink the log file as much as possible. */
EXEC (N’DBCC SHRINKFILE (‘ + @filename + ‘, TRUNCATEONLY)’)
WTF? Why the case of OCD about the transaction log? What did it ever do to you? It wasn’t enough to do the backup (even if you skipped your cleanup process), you now want to shrink it back down to the original create size? For someone that seemed so obsessed with keeping your system in good working order you have certainly taken steps to prevent that from happening.
Now, to be fair, it would be nice if you wanted to size it properly by using the shrinkfile command, I suppose. But that would require you to shrink it to a specified size, which is not what you are doing here. Oh, and the TRUNCATEONLY command has no meaning when executed against a log file. Nice try. But i like the way you think, you were at least being mindful of not wanting to auto-fragment your database.
I have placed a call to the local authorities and they expect to have Egghead captured (again) before nightfall. Somehow I suspect it will not be long before I find his work lurking elsewhere.
Classic. I bet they had a customer run out of space for the transaction log and that’s why they do the truncate and shrink.
Is the shrink database in there, too? They might as well fragment the database while they are at it.
i looked, but could not find the shrink database statement. they had done some good work, but they fumbled at the goal line on this one.
Reminds me of a company I worked at several years back. I set up some replication to keep a reporting server up to date, but every monday morning the subscription had errors indicating it needed a reinit.
After a couple of weeks of this I dug through the jobs with the DBA (read sysadmin who thought he knew SQL) watching.
I found a job scheduled for midnight on sunday:
sp_repldone
BACKUP LOG … with truncate only
DBCC ShrinkDatabase
Hmmm. I wonder if that could be the problem…
Oh, and this was SQL 2000 where log backups would continue without error after a log truncation. They were useless for restores though.
I think the statement TRUNCATE_ONLY should be replaced by WITHOUT_BACON.