Mozilla Network Outage Report – 09/23/2008, 3am PDT – 10am PDT

mrz

(Apologies for the delayed posting… been holding off until I had more authoritative information.)

At around 3:30am PDT, Mozilla’s primary datacenter in San Jose went offline.

From initial conversations with CRG West, we learned they had a catastrophic HVAC failure at 2:46am PDT.  CRG West later reported:

“…service interruption that affected customers at the Market Post Tower data center this morning from approximately 2:46am to 5:30am PST.  After thoroughly looking into this matter, CRG West Operations has identified a tripped utility main input breaker feeding the 3rd, 10th and 16th floors of the data center as the cause of the issue.”

At around 3:17am PDT, many of the servers began automatically shutting themselves down to prevent thermal damage.  Ambient air temperature was close to 42°C.

Mozilla IT was onsite shortly afterwards and, after dealing with a failed out-of-band/NMS switch and blown fuses, had the majority of the infrastructure back online by 9:10am PDT and declared everything “all clear” by 9:47am PDT.

We apologize for any inconvenience this may have caused and will continue to follow up with CRG West.