At around 7:30pm PST on Friday night, Mozilla’s primary data center in San Jose went offline, affecting multiple services.
CoreSite, Mozilla’s San Jose data center provider, indicated that they had lost city water and had suffered site-wide failures of their CRAC (computer room air conditioning) units. These units provide cooling for the data center, and without them the ambient air temperature quickly rose to about 120° F.
To prevent thermal damage, the servers automatically shut themselves down.
Mozilla IT was onsite by 8:30pm PST. By 9:30pm PST, CoreSite had brought the ambient air temperature under 90° F. Mozilla IT had the majority of the infrastructure back online by 11:30pm PST.
We apologize for any inconvenience this may have caused. We are working with CoreSite to better understand the points of failure and how they will work to prevent a recurrence.
Further, we’ll be evaluating our own internal procedures so we can fail production services over to our Phoenix data center more quickly.