Companies with heavy compute workloads are stepping up adoption of liquid cooling techniques and starting to kick the tires on immersion cooling, experts told Silverlinings.
Plenty more deployments could be on the way, as the rise of compute-intensive artificial intelligence (AI) workloads and power constraints increase pressure on air-cooling systems. But adoption of immersion cooling could be tricky for a number of reasons, the experts added.
Liquid cooling is a term that encompasses several different techniques. These include evaporative systems, which can be used to cool the air in data centers; direct-to-chip systems that pipe cool liquid over some of the hottest elements inside a server; and rear door heat exchanger units, which are basically big doors with liquid-filled coils inside that can be attached to the back of server racks to cool them. Immersion cooling is an offshoot of liquid cooling that, as the name implies, involves submerging whole servers or parts of them in a liquid solution in a tank.
Holland Barry, SVP and field CTO at data center company Cyxtera, told Silverlinings it already has customers with direct-to-chip and rear door heat exchanger systems in production in its facilities. He added “the amount of inquiries we’re getting on it [immersion cooling] has gone up in the last six months.”
Maikel Bouricius, CCO at immersion cooling specialist Asperitas, similarly said “interest in immersion cooling has grown exponentially” over the past few years. This, he added, has been driven in part by increased interest in and regulations around sustainability.
Barry said while it’s possible immersion cooling could end up all hype and little substance — like blockchain or Web3 — he noted that immersion vendors were among those that showcased at HPE’s recent Discover conference. “That’s the first time I’ve seen them, this class of partner being at these OEM events. So, something is brewing…I think this one has a little more legs.”
Gartner Senior Director Tony Harvey said issues around heat and power density are driving the need for liquid rather than air cooling.
“At the rack level, as you add more servers to a rack there’s a physical limit of 50 kilowatts per rack after which it becomes difficult to cool with just air,” he explained.
Barry pointed out rear door heat exchanger designs can enable racks to get up to 70 kilowatts. Immersion systems can take things even further, “comfortably” getting to more than 100 kilowatts per rack.
Zooming in on the servers themselves, Harvey noted CPUs and GPUs are consuming more and more power. “I can remember the days when a server CPU was 40 Watts. We now have 400 and 500 Watt CPUs and 750 Watt GPUs.”
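To put the figures Harvey and Barry cite side by side, here is a rough back-of-the-envelope sketch in Python. The thresholds (50 and 70 kilowatts per rack) and the chip wattages (500 Watt CPUs, 750 Watt GPUs) come from their comments; the server configurations, the function names and the decision to ignore memory, storage, fans and power supply losses are assumptions made purely for illustration. Treating everything above 70 kilowatts as immersion territory is also a simplification, since the experts did not give a ceiling for direct-to-chip systems.

```python
# Thresholds quoted by the experts, in kilowatts per rack.
AIR_LIMIT_KW = 50        # Harvey: hard to cool with air alone beyond this
REAR_DOOR_LIMIT_KW = 70  # Barry: rear door heat exchangers reach about this


def rack_power_kw(servers, cpus_per_server, gpus_per_server,
                  cpu_watts=500, gpu_watts=750):
    """Estimate rack power from chip wattage alone (hypothetical model).

    Ignores memory, storage, fans and power supply losses, so real
    racks would draw more than this estimate.
    """
    per_server_watts = cpus_per_server * cpu_watts + gpus_per_server * gpu_watts
    return servers * per_server_watts / 1000


def cooling_tier(rack_kw):
    """Map a rack power estimate onto the ranges cited in the article."""
    if rack_kw <= AIR_LIMIT_KW:
        return "air"
    if rack_kw <= REAR_DOOR_LIMIT_KW:
        return "rear door heat exchanger"
    # Simplification: the article only says immersion goes past 100 kW.
    return "immersion"


# Hypothetical racks of servers with 2 CPUs and 8 GPUs each.
for count in (4, 8, 16):
    kw = rack_power_kw(count, cpus_per_server=2, gpus_per_server=8)
    print(f"{count} servers: {kw:.0f} kW -> {cooling_tier(kw)}")
```

Under these assumptions each server draws 7 kilowatts from its processors alone, so a rack crosses the 50-kilowatt mark somewhere between seven and eight servers.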
As Silverlinings previously noted, AI is increasingly requiring GPU power and is giving rise to GPU data center specialists.
Harvey continued: “The power’s going up and the temperature that the vendors will allow us to keep the silicon at is going down. That has a magnifying effect on that power-density problem.”
He noted liquid is much more efficient at cooling. And if data centers can slash the amount of power they use on cooling, they can dedicate more to computing. That’s a big deal given companies across the board are increasingly focused on sustainability and key markets like Virginia face growing constraints on commercial power availability.
The cooling hurdles
It all sounds great in theory, but Barry said immersion cooling specifically raises questions around drainage, exhaust and where exactly data centers would put immersion cooling tanks.
Indeed, Harvey stated both immersion and direct-to-chip cooling “are not impossible but hard to retrofit” within existing data centers. He added that immersion specifically will likely make server maintenance and upgrades, if not harder, at least much messier.
Though Gartner doesn’t have a market forecast for liquid or immersion cooling, Harvey said his personal view is that “you’ll see most of the big OEMs – HPE, Dell, Lenovo – going for direct-to-chip [cooling].”
“Most of them are evaluating immersion cooling, but the radical difference in the way that immersion cooling works and the retrofit problem makes it more difficult to think about,” he said.
Harvey concluded that the market for liquid cooling is “definitely going to grow,” but he added there will still be plenty of lower-intensity compute systems that don’t actually need to be liquid cooled.
Cooling and the cloud giants
So, what about the big three cloud giants – where do they stand on liquid and immersion cooling?
- AWS – An Amazon Web Services representative told Silverlinings the company primarily uses direct air and evaporative cooling for its data centers. “When possible, we incorporate direct evaporative technology for cooling our data centers, reducing energy and water consumption…During the hottest months of the year, outside air is cooled through an evaporative process using water before being pushed into the server rooms, and we have optimized our cooling systems to minimize water usage,” the representative explained.
- Google Cloud – A Google Cloud spokesperson told Silverlinings it uses water rather than air as a cooling solution in many of its data centers. The company noted in a blog post that it tries to avoid using freshwater wherever possible and utilized reclaimed non-potable water at more than 25% of its data center campuses in 2021. It did not specify, however, how that water is used – whether for evaporative cooling like at AWS or for other things like rear door heat exchanger systems on its server racks.
On the immersion cooling front, the Google Cloud spokesperson said the technology is “an innovative solution that is still in development and has yet to be adopted at scale. As with any cooling solution, we would evaluate the full range of impacts and trade-offs over the product lifecycle” prior to adoption.
- Microsoft – Microsoft demonstrated closed-loop immersion cooling in a production environment at one of its data centers in Washington state in 2021. At the time, it noted it looked into the solution as a way to meet the needs of high-performance compute workloads like those driven by artificial intelligence. It had just one immersion tank running workloads at the time and said it was planning to run additional tests to prove out the viability of the technology.
“This shift to two-phase liquid immersion cooling enables increased flexibility for the efficient management of cloud resources,” the company wrote in a blog post in 2021. “For example, software that manages cloud resources can allocate sudden spikes in data center compute demand to the servers in the liquid cooled tanks.” It is not clear if it has since expanded its use of immersion cooling.