With all the examples of disasters this week, from Sandy to the blizzards, I was reminded of the IT calamities I’ve encountered or discussed with people who were at the sharp end.
Recovery failures happen even in organizations that plan and test their disaster recovery plans regularly. It is not a lack of preparedness (in the traditional sense) that catches organizations off guard, but a lack of imagination about what can really go wrong. Some of the big problems I can recall:
- When Three Mile Island happened, EDS had a large data center downwind in Pennsylvania. What do you do when the National Guard comes in and tells you to drop everything and leave? You can’t get tapes. You can’t start jobs. You just have to leave. An experience like that makes you look at disaster plans differently from then on.
- During Katrina there were situations where an organization’s data center was physically OK, but there was no power and no prospect of any soon, and the off-site backup was underwater. Extreme measures had to be taken to fly someone into the data center, retrieve the latest information, and move it somewhere else so processing could continue. A situation like that makes you think differently about geographic diversity and about how close to real time is close enough.
- And let’s not forget that one of the issues at Fukushima was not insufficient backup cooling, but that the backup pumps were flooded by the tsunami. Sometimes it is not just one disaster you need to deal with.
It is this variety of personal experience, and interaction with a network of experienced people, that is useful in pointing out flaws.
Once I was part of a team brought in to look at a client’s data centers. They were very proud of their disaster recovery center, until we pointed out that the failover site sat in the flood plain below a dam, near an earthquake fault, and close to an area where wildfires regularly occur. It would likely be most vulnerable exactly when they needed it. They’d never thought about it quite that way before.
That’s one reason the disaster scenarios you test in disaster planning should come from someone outside your normal organization. Shake it up, move it around, and get some imagination through diverse perspectives, rather than counting on one person to think up a good case.
I received a notice the other day about an upcoming webcast (April 21st) from Innovation INSIGHT titled: Disasters Happen: Is Your Enterprise Protected in the Cloud? It’s free, so check it out.
We tend not to think about it much, but unpredictable, sometimes even unthinkable, disasters happen. Mission-critical IT systems require mission-critical protection, no matter the platform or the supplier operating the underlying hardware. It is not just the systems that matter, but the network connections and the integrated applications. No one cares if the lights are flashing and the disks are spinning if end-to-end transactions can’t take place.
When moving to the cloud, this level of system interaction needs to be understood, and the failover or business continuity options tested. Cloud vendors will need to participate in these tests, at least to some degree.
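One way to make that point concrete is a "synthetic transaction" smoke test: rather than checking that servers are up, you run one full end-to-end transaction against each site, primary and failover alike. The sketch below is a minimal illustration of the idea, not any vendor's tool; the site names and the transaction callable are assumptions standing in for whatever your environment actually exercises (placing a test order, writing and reading back a record, and so on).

```python
# Minimal sketch of a disaster-recovery smoke test: run one synthetic
# end-to-end transaction against each site and report pass/fail.
# Site names and the transaction function are illustrative assumptions.

def run_dr_smoke_test(endpoints, synthetic_transaction):
    """endpoints: dict of site name -> target handle for that site.
    synthetic_transaction: callable(target) -> bool, True only when the
    full end-to-end transaction completed.
    Returns dict of site name -> "PASS" or "FAIL"."""
    results = {}
    for site, target in endpoints.items():
        try:
            ok = synthetic_transaction(target)
        except Exception:
            ok = False  # a site that hangs or errors counts as a failure
        results[site] = "PASS" if ok else "FAIL"
    return results

if __name__ == "__main__":
    # Stand-in targets: in practice these would be connection details
    # for the primary and failover sites.
    fake_sites = {"primary": True, "failover": False}
    print(run_dr_smoke_test(fake_sites, lambda target: target))
    # {'primary': 'PASS', 'failover': 'FAIL'}
```

The key design point is that the check is the whole transaction, not a ping: a failover site whose disks spin but whose integrated applications or network paths are broken should show up as FAIL here, which is exactly the kind of flaw you want a test to surface before a real disaster does.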
No protection means no chance for quick recovery. The result? Your enterprise’s business will be deeply impacted and in unpredictable ways.
This webcast will likely look at some of these issues and assist those attending in understanding alternatives – before you need them. No one wants to go down with the ship!