Disaster Planning

For the past few months I have been trying to outline the most effective way for our team to recover a server in the event of a disaster. Say, for example, you have a production box on the SAN and someone accidentally reallocates your server storage, wiping away everything and forcing you to rebuild the box and recover files from tape backups. While I have a rough idea of the steps to follow, I have yet to formalize them. Why? Well, as you know, DR is usually the last thing we tend to think about, even though it is the most important thing we need to do.

Honestly, in the past few months I have been quite busy with other items. I do have an outline, just not a formal one, and since every situation is different I have been struggling to put together a process that applies to everything. So, with that in mind, I have decided to start with a narrower case: a process for when a server needs to be completely rebuilt. There, that should make things easier.

The process I have in mind is as follows, and is particular to our environment.

  • Server team recovers the server from tape backups (so, there should be no need to reinstall SQL).
  • Copy the master database files (master.mdf and mastlog.ldf) from a corresponding box (say, the test or dev server). This should allow SQL to start, and the instance should be at the correct build level.
  • Restore master from the last good backup (the instance has to be started in single-user mode for this step).
  • Restore msdb, model, our LiteSpeed database, and our internal monitoring database, DBA_Perform.
  • Restore the remainder of the user databases.
  • Take full backups when complete.
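
To make the ordering concrete, here is a rough Python sketch that prints the T-SQL for the restore and backup steps above, in order. It is only an illustration under a few assumptions that are not part of our real setup: native backups named `<database>.bak` sitting in one folder, a hypothetical LiteSpeed repository database called `LiteSpeedRepo`, and a hypothetical user database called `Sales`. If the backups were taken with LiteSpeed, its own restore procedures would be needed instead of plain `RESTORE DATABASE`.

```python
from typing import List


def quote_name(name: str) -> str:
    """Bracket-quote a database name, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def quote_str(value: str) -> str:
    """Build a T-SQL Unicode string literal, doubling single quotes."""
    return "N'" + value.replace("'", "''") + "'"


def restore_stmt(db: str, backup_dir: str) -> str:
    # Assumption: one native backup file per database, named <db>.bak.
    path = f"{backup_dir}\\{db}.bak"
    return (f"RESTORE DATABASE {quote_name(db)} "
            f"FROM DISK = {quote_str(path)} WITH REPLACE, RECOVERY;")


def backup_stmt(db: str, backup_dir: str) -> str:
    # Assumption: post-recovery backups go to separate, hypothetical file names.
    path = f"{backup_dir}\\{db}_post_recovery.bak"
    return (f"BACKUP DATABASE {quote_name(db)} "
            f"TO DISK = {quote_str(path)} WITH INIT;")


def build_plan(support_dbs: List[str], user_dbs: List[str],
               backup_dir: str) -> List[str]:
    """Return the statements in the order given in the outline."""
    plan = ["-- Start the instance in single-user mode (-m) before this step"]
    plan.append(restore_stmt("master", backup_dir))
    plan.append("-- The instance stops after master is restored; restart it normally")
    # msdb and model first, then the support databases, then user databases.
    for db in ["msdb", "model"] + support_dbs + user_dbs:
        plan.append(restore_stmt(db, backup_dir))
    # Fresh full backups of everything once the restores are complete.
    for db in ["master", "msdb", "model"] + support_dbs + user_dbs:
        plan.append(backup_stmt(db, backup_dir))
    return plan


if __name__ == "__main__":
    # Hypothetical names and path, for illustration only.
    for line in build_plan(["LiteSpeedRepo", "DBA_Perform"], ["Sales"], "D:\\Backups"):
        print(line)
```

Generating the script rather than typing it during an outage keeps the order fixed (master first, system databases next, user databases last) and makes it easy to review the statements before anything is run.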

Why am I blogging this to you? Well, I am wondering if we have forgotten something important. I have heard that you should reapply service packs manually, and I was wondering if anyone has any experience with this, one way or the other.
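
One reason the service pack question matters: as far as I know, a backup of master can only be restored onto an instance running the same build that produced it. The small Python sketch below compares the two values before attempting the restore. It assumes you have already collected the instance version from `SELECT SERVERPROPERTY('ProductVersion')` and the backup's version from the `SoftwareVersionMajor`, `SoftwareVersionMinor`, and `SoftwareVersionBuild` columns of `RESTORE HEADERONLY`; the version numbers in the example are made up for illustration.

```python
from typing import Tuple


def parse_build(version: str) -> Tuple[int, int, int]:
    """Take 'major.minor.build[.revision]' and keep the first three parts."""
    parts = version.strip().split(".")
    if len(parts) < 3:
        raise ValueError(f"unexpected version string: {version!r}")
    return int(parts[0]), int(parts[1]), int(parts[2])


def builds_match(instance_version: str, backup_build: Tuple[int, int, int]) -> bool:
    """True when the running instance and the backup header report the same build."""
    return parse_build(instance_version) == backup_build


if __name__ == "__main__":
    # Illustrative values only, not taken from a real server.
    instance = "10.50.1600.1"
    backup = (10, 50, 2500)
    if builds_match(instance, backup):
        print("Builds match; the master restore can be attempted.")
    else:
        print("Build mismatch; patch the instance to the backup's build first.")
```

If the builds differ, patching the rebuilt instance to the backup's level before restoring master would be the step to add to the outline.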

I think the outline above will get us back up and running in a minimal amount of time, but before I start making this outline something more formal, I was hoping that I could get some feedback. I would hate to go into battle having missed something, no matter how big or how small.

1 thought on “Disaster Planning”

  1. You should be careful. Your assumption is that the tape will have your entire SQL instance (binaries) on it, when in my experience many of the binaries are locked at backup time because you didn't stop the server (who still does that?), and since those binaries are locked they don't make it to tape. Maybe it's just our backup software, but this one has bitten me before.
