How one CTO avoided a Web site disaster after data center fire
Flickr photo via Jamison_Judd
Most Seattle geeks probably didn't think they'd be spending a portion of their 4th of July holiday dealing with broken Web sites, back-up generators and damaged servers. But the small fire at the Fisher Plaza data center in downtown Seattle late last night knocked a number of sites offline for most of Friday, raising questions on TechFlash about how companies handle disaster planning and server co-location.
We first learned of the problem around 1 a.m., when Seattle-based Redfin posted a message on Twitter noting that its real estate site was offline because of problems at the data center. But by 4 a.m. Redfin's site was back online, purring along while other sites struggled.
We asked Redfin CTO Michael Young how they avoided the catastrophic failure that other sites are experiencing today. Turns out, the company learned some important lessons after a similar electrical fire hit the same data center last June.
Here's what Young told TechFlash today.
We were pretty embarrassed last June when Adhost had a similar electrical fire and took our site down for 8 hours (well into our core business hours) with brown-outs a day or two after that had us scrambling. 'Fool me once, shame on you; fool me twice, shame on me' resonated in our brains.
So by October 2008, we basically instituted a disaster avoidance plan where we had redundant-everything for our mission-critical databases, servers and networks in separate buildings.
When the problem happened last night, our beepers went off, we saw what looked like a major outage in one building, and were able to switch to the redundant systems.
Everything was up and running by 4am PST / 7am EST, well before our core business hours. We’re a startup, but we try to maintain high standards in our datacenter operations without spending too much money. The failover didn’t happen at the push-of-a-button, but the disaster planning paid off for us.
Young's explanation is interesting given that many sites -- including high-profile consumer-oriented sites such as AllRecipes, Bing Travel and Big Fish Games -- have been offline most of the day.
I have a feeling there will be some high-level meetings with CTOs, IT administrators and co-location operators on Monday discussing some of the ways to make sure this doesn't happen again.
I asked Young -- who was up at 5 a.m. dealing with the situation -- why other larger companies didn't appear to have a similar plan in place.
"It's hard to get every single point of failure," said Young. "And most people need to be burned once, like us."