Some of the biggest data centers in the UK were forced to partially power down following the recent record high temperatures.
Facilities belonging to Google Cloud and Oracle were among those affected as temperatures in the UK topped the 40C (104F) mark for the first time.
The issues led to outages for customers across the country as both technology giants shut down parts of their systems in order to protect the stability of the entire network.
Heatwave outage
Oracle Cloud was the first to report issues, with a company status alert spotted by The Register reporting a cooling failure that caused “non-critical hardware” to be turned off.
“As a result of unseasonal temperatures in the region, a subset of cooling infrastructure within the UK South (London) Data Centre experienced an issue. This led to a subset of our service infrastructure needed to be powered down to prevent uncontrolled hardware failures,” the report read.
“This step has been taken with the intention of limiting the potential for any long term impact to our customers.”
The outage affected several Oracle Cloud Infrastructure resources including networking, storage, and compute.
Google Cloud later reported a cooling failure in one of the buildings hosting its europe-west2-a zone, which is part of its europe-west2 region.
“There has been a cooling related failure in one of our buildings that hosts zone europe-west2-a for region europe-west2. This caused a partial failure of capacity in that zone, leading to VM terminations and a loss of machines for a small set of our customers,” the Google Cloud incident report noted.
“We’re working hard to get the cooling back online and create capacity in that zone. We do not anticipate further impact in zone europe-west2-a and currently running VMs should not be impacted. A small percentage of replicated Persistent Disk devices are running in single redundant mode.”
“In order to prevent damage to machines and an extended outage, we have powered down part of the zone and are limiting GCE preemptible launches. We are working to restore redundancy for any remaining impacted replicated Persistent Disk devices.”
Both companies were able to repair the failures and reactivate their full networks within a few hours, with customers then able to access the full suite of services shortly after.