Friday, July 03, 2009

FAIL - Fisher Plaza Outage

When I get a phone call at 3am it is one of two things - either a relative has passed away or a server has gone down. Looking at those two options I would always take the latter. Bryan was on Caller ID, so I knew it was the latter. What I didn't realize was that this wasn't the ordinary check, reboot and "all's good" kind of server issue.

I tried to VPN into the machines and nothing was resolving, which is generally a Bad Thing. I dusted off my cardkey (unused for at least a year), grabbed my car keys and headed out the door in that daze that only a mom or dad knows during a 4am feeding. The usual fearful thoughts went through my head about the worst case scenarios.

Once I arrived at the Fisher Plaza building I circled the block and drove to the entrance to the parking garage, passing my keycard and receiving a satisfying beep and green light, but the gate didn't raise to let me enter. Through my dozy haze I realized that the parking garage was a dark maw with only the peek of a car in the shadows. It dawned on me that this was no longer just My Problem but Someone Else's Problem. This is one part frustrating and one part relieving. Although a power outage was a shock to the machines, we've handled this before at Fisher Plaza* and the machines were restarted with some light caresses. So I settled into the parking lot across the street and walked over to the entrance.

Arriving at the entrance I validated my initial observation that, yes, the power was out, so I struck up a conversation with some folks loitering near the entrance. "Is the power out?" I inquired, more as a conversation starter to get some additional information about what happened.

"Duh," a woman responded with a linguistic flourish. Apparently this was a KOMO employee Who Doesn't Speak with Tech Guys, but I persisted and got some general information. Yes, the power was out. It was due to a fire and one of the generators was fried. And it had been down since around 11pm. And they were here since 3 and they were sooo bored.

I realized that I had spent whatever welcome I had and emptied the account, so when I noticed a person looking just as dazed and techie as me I struck up a conversation with him. Andrew was from PopCap Games and had also been called out to check on his servers. Although most of their sites were up, their Facebook games were hosted exclusively at Internap in the Fisher Plaza building.

After a while, more tech folks appeared and congregated, sharing any emails received from Internap, which were sparse and devoid of any real content, but we'd get wisps of information from other folks who "know people" - generally the guys who actually have to get the work done. What I heard was this:
  1. An electrical cabinet set itself on fire between 11pm and midnight
  2. There were 6 inches of water in the parking garage where the generators are. This happened from putting out the fire.
  3. No one was injured (thanks for asking)
  4. One generator is fried and the other is sitting in the water so it is unusable
  5. The Fire Marshal wasn't letting anyone in (until around 8am when I got in)
  6. Rumors stated that the sprinklers did turn on but as alarming as it sounds, they didn't turn on in the colocation facility - so no wet machines
  7. KOMO stations like KOMO Channel 4 in the building weren't broadcasting - instead they were broadcasting content from sister stations.
Once I got in I inspected the machines to make sure that they weren't buried in rubble or water. With nothing else to do, I headed home to update geocachers about the situation and wait for the "experts" to resolve the issue.

Realizing I couldn't get the word out on the site outages via the normal channels, like email and the web sites, I resorted to posting to Twitter, which in turn posted to Facebook. Fortunately the geocachers continued to retweet and repost the updates throughout the day, so hopefully the information is getting out there. We even updated the text for the iPhone application but that won't satisfy many folks who are experiencing network errors when trying to look up caches.

Fortunately we're in good company. Authorize.Net and Bing Travel and other larger clients are at the facility so Internap is being very serious about getting this resolved. Sometimes it is good to ride on the coattails of the bigger guys when you have a big problem to solve.

I'll update when I have more information. In the meantime I'm monitoring emails and Twitter. When the machines come back online I'll get a flurry of emails so I can head down to make sure that the machines get back online. And I'll continue to update via Twitter and Facebook until this gets resolved. You can follow the latest developments with the hashtag #fisherfire or search for Fisher Plaza or follow me @locuslingua on Twitter.

*Around 2006 Fisher Plaza lost power allegedly due to a safety switch meant to shut down power if someone was electrocuted. It's a nice safety feature unless some knucklehead hits it by mistake.

Edited: It was KOMO, Not KIRO. Thanks for the comments!


Unknown said...

Thanx for keeping us in the loop on facebook and here, as to what is going on. :) We're just muttering through the weekend on what we already had in the GPSr. Hope you find some time to catch a nap! lol

Frank Broughton said...

Thanks for the update Jeremy. We have been following the events here in Western NY and documenting them on one of the local boards.

Thank goodness for offline DB's! This is one time I hope ya do not mind us sharing data with each other. I saved one vacationer so far.

Sadly though my son took his iPhone and left his GPS home and is out of luck - haha. Dad's old fashioned OLDB is the cat's meow.

Now can we please get larger allotments of PQ's per day.... haha

Good luck with the reboots when power is restored. Hope all the servers do not need FSCK's and shoot out tons of errors. They did soft off I HOPE!

Unknown said...

Thanks for the update and all that you do!

blorengia said...

Thanks for keeping us informed via twitter and Blogger.

The Blorenges (UK)

Jenn said...

[...] is down too. I bet they have their servers there also.

I can't bake OR geocache.... Guess I HAVE to do that laundry :-)

Karl Witsman said...

Some things are out of our control. Don't ya just hate that? heh heh

Miguel said...

Good luck in quickly getting the site back up.

Unknown said...

KIRO doesn't operate out of Fisher Plaza. KOMO does.

GeoJoe said...

Thanks for keeping us posted on twitter all day long. Much appreciated. Perfect picture on the blog post too!

Larry Robinson said...

thanks for being there.

Unknown said...

I was not in the know until I just saw someone mention it on Facebook. Thanks for the update and hope things get resolved soon!

0ccam said...

You are probably aware of this already, but some sites have a status page that's hosted at a completely different place than the main service.

Like LiveJournal has

Thanks for the update post, though! I like having as much of the whole story as I can get.

Backburner said...

Thanks for keeping us up to date with what is happening champ. Your information is being rebroadcast through Australian Geocaching communication circles.

Glistener said...

I now know I am addicted to caching, officially. Have the shakes of withdrawal!

Thanks for the updates

The Nieborgs said...

Thanks for the hard work Jeremy. If people ever wondered what the size of your 'footprint' in the world is, I think we now know. We appreciate what you have done, both in creating geocaching, and your efforts to sustain it.


The Nieborgs said...

Thanks for the hard work Jeremy, we truly appreciate it. If you ever wondered about the size of your 'footprint' on the world, this episode is certainly a strong indicator.

Hope you can get some rest now, and take care of yourself.


Unknown said...

Just FYI, sprinklers are triggered by heat at the actual sprinkler head, so only the one(s) near the fire will go off. The heat melts a low-temp hunk of metal or bursts a glass bulb to open the valve only where they're needed. Pretty cool, eh?

ericgallant said...

Great posts! Thanks for the info. I referenced you in my blogpost re #fisherfire