(Apologies for the delayed posting… been holding off until I had more authoritative information.)
At around 3:30am PDT, Mozilla’s primary datacenter in San Jose went offline.
From initial conversations with CRG West, we learned that they had suffered a catastrophic HVAC failure at 2:46am PDT. CRG West later reported:
“…service interruption that affected customers at the Market Post Tower data center this morning from approximately 2:46am to 5:30am PST. After thoroughly looking into this matter, CRG West Operations has identified a tripped utility main input breaker feeding the 3rd, 10th and 16th floors of the data center as the cause of the issue.”
At around 3:17am PDT, many of the servers began shutting themselves down automatically to prevent thermal damage. Ambient air temperature was close to 42° C.
Mozilla IT was onsite shortly afterwards and, after dealing with a failed out-of-band/NMS switch and blown fuses, had the majority of the infrastructure back online by 9:10am PDT and declared everything “all clear” by 9:47am PDT.
We apologize for any inconvenience this may have caused and will continue to follow up with CRG West.