Notes on a SBS Disaster Recovery

On the last Friday of June I came into the office and found myself confronted with one of the more unusual disaster recovery scenarios I have ever encountered. The primary symptom was that no one could get to their email. The cable modem and the router attached to it looked like they were turned off. Turning the power back on got the lights to blink for only a few seconds before they went off again. Our Small Business Server (SBS) appeared to be working, but the lights on the LAN adapters in the back were off. Even when we plugged in a known good LAN connection the lights would not come on. I rebooted the server and it stopped seeing the disk drive array. It was about that time I noticed the aroma of burnt insulation. Ah! The smell of burnt insulation in the morning! The cable modem, router, and server were fried. During the night the RoadRunner cable serving us must have been hit by lightning.

It is not surprising that we were not prepared for this disaster recovery scenario. Here are my notes on how we recovered our original server and migrated it to a new server.

  1. Probably one of the more interesting aspects of this disaster recovery story was that I took half of a mirrored drive pair and used it to create a virtual machine of the SBS server. Since our development server (HP DL380) is similar to the server (HP DL360) that got fried, I was able to put the mirrored drive into the development server chassis. Everything on the drive looked good except for the Exchange partition, which was missing. The lightning strike occurred during our backups, so our backups were not complete. We were looking at losing Thursday’s email. Since the development server had sufficient processing power and disk space, I decided to see if we could bring up the Small Business Server as a virtual machine. Using a virtualized server could allow our office to be fully operational while we worked on getting a new server delivered. It looked like a fast way to recover Active Directory and the office email. So I gave VMware vCenter Converter a try and I was amazed that the Small Business Server came up with only minor errors. The Exchange software complained that it could not find the Exchange partition and the HP diagnostic software complained about the hardware. Other than those problems, Active Directory, the print queues, and the fax server were all operational.
  2. The good news was that I had a virtual server running. The bad news was that I was having problems recovering Exchange. The backup located on an external USB drive was restoring with errors. The first time I tried to restore Exchange I got a file corruption problem. This was probably due to USB problems with virtual servers. The next morning I decided to try something different. I downloaded some partition recovery software from the Internet and to my surprise it found the partition on the mirrored drive. Using the EASEUS Data Recovery Wizard, I was able to recover the Exchange partition. The database had some integrity problems but it looked promising. So I followed the article "Using the Exchange tools ISINTEG and ESEUTIL to Ensure the Health of your Information Store" to repair the Exchange database. My final trick to getting Exchange to mount was to delete the Exchange log files. A little more than 24 hours after the lightning strike, our Small Business Server was operational and we had not lost any emails.
  3. About a week later we had a “new” server delivered. Actually it was an old server we got off of eBay, but it was identical to the server that had failed. Although it was tempting to leave the SBS server in virtualized form, we opted to install the server natively using the SBS Migration procedure. In this case both our SourceDC server and the MigrationDC server were virtualized servers. The first time I tried the SBSMigration procedure I failed. It took me a while to figure out why, but the SYSVOL share was not getting created on the MigrationDC. I traced the problem back to a network configuration problem. The DNS setting on the LAN adapter of the SourceDC was pointing at the office router rather than at the SourceDC itself. Although normal communications with the server appeared to be working fine, the Active Directory communications with the backup domain controller were not working. The domain controller could not find itself. ;(  After I changed the DNS setting, the domain communications and file replication worked correctly. As Jeff Middleton reminded me, a good indication that the backup domain controller is working properly is that the SYSVOL share gets created on the MigrationDC server.
  4. My next mistake was installing the Exchange database to a new drive letter. Exchange is very finicky about this. I had to “repair” Exchange to get it to recognize the database at the new location. Only after I had started the repair operation did I figure out how long the repair was going to take. I ended up running the repair overnight. In hindsight we would have been up and running much earlier if I had restored the database to its original drive letter and moved it to a new drive letter at a later time.
  5. My final mistake was upgrading the NewDC to Windows 2003 SP2 before completing the SBS installation. I had to uninstall SP2 and install SP1 before I could complete the SBS installation.
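
The DNS mistake in note 3 is easy to check for before starting a migration. Here is a minimal sketch of that check in Python, assuming you have already read the adapter settings by hand (for example from ipconfig /all). The function name and the addresses in the usage example are hypothetical, and this is not part of the SBS Migration procedure itself.

```python
import ipaddress


def dns_points_at_self(own_addresses, dns_servers):
    """Return True if the first DNS server configured on the adapter
    is the domain controller itself (one of its own IPs or loopback).

    own_addresses: IP addresses assigned to the domain controller.
    dns_servers:   DNS servers configured on its LAN adapter, in order.
    Both lists are entered by hand; nothing is queried from the system.
    """
    if not dns_servers:
        # No DNS server configured at all: certainly not pointing at itself.
        return False

    primary = ipaddress.ip_address(dns_servers[0])
    if primary.is_loopback:
        return True

    own = {ipaddress.ip_address(addr) for addr in own_addresses}
    return primary in own


if __name__ == "__main__":
    # Hypothetical addresses: the server is 192.168.1.10, the router 192.168.1.1.
    server_ips = ["192.168.1.10"]
    if dns_points_at_self(server_ips, ["192.168.1.1"]):
        print("Primary DNS points at the domain controller.")
    else:
        print("Primary DNS does NOT point at the domain controller.")
```

Only the first DNS entry is checked, on the assumption that it is the one the domain controller uses to locate itself and its replication partners.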