Data Center Servers Suck — But Nobody Knows How Much

Mozilla's servers average around 6 percent CPU utilization, but maybe that's OK. Photo: Ariel Zambelich/Wired

If the computer industry's dirty little secret is that data centers are woefully inefficient, the secret behind the secret is that nobody knows how bad things really are.

On its surface, the issue is simple. Inside the massive data centers that drive today's businesses, technical staffers have a tendency to just throw extra servers at a computing problem. They hope that by piling on the processors, they can keep things from grinding to a halt -- and not get fired. But they don't think much about how efficient those servers are.

The industry talks a lot about the power efficiency of data centers as a whole -- i.e. how much of the data center's total power is used for computing -- but it doesn't look as closely at the efficiency of the servers inside these computing facilities -- how much of the time they're actually doing work. And it turns out that getting a fix on this is pretty hard.
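To make the distinction concrete, here's a rough back-of-the-envelope sketch in Python. The facility-level number (usually reported as PUE, or power usage effectiveness) and the server-level number (average CPU utilization) are computed from entirely different inputs, and a data center can look great on one while looking dismal on the other. The figures below are invented for illustration, not measurements from any real facility.

```python
# Illustrative only -- the numbers below are made up.

# Facility-level efficiency: how much of the building's power reaches IT gear.
total_facility_kw = 1200.0   # power drawn by the whole data center
it_equipment_kw = 800.0      # power drawn by servers, storage, network gear
pue = total_facility_kw / it_equipment_kw
print(f"PUE: {pue:.2f}")     # 1.50 -- lower is better, 1.0 is the theoretical ideal

# Server-level efficiency: how busy the machines actually are.
cpu_utilization_samples = [0.05, 0.07, 0.04, 0.09, 0.06]  # fraction busy, per server
avg_utilization = sum(cpu_utilization_samples) / len(cpu_utilization_samples)
print(f"Average CPU utilization: {avg_utilization:.0%}")  # about 6% -- higher is better

# A facility can score well on the first metric while its servers mostly sit idle.
```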

The folks who run the most efficient data centers in the world -- the Amazons and Googles and Microsofts -- view this information as a competitive secret, so they won't share it. And in the less-efficient enterprise data centers, staffers may not welcome any type of rigorous measurement of their server efficiency. "Think about it -- who would want their boss to know how poorly utilized that incredibly expensive asset was?" says David Cappuccio, a Gartner analyst, in an email interview.

But that keeps the industry from getting a proper fix on things, says Amy Spellmann, a global practice principal with the Uptime Institute. "I think there are good reasons for getting the benchmarks and the analysis out there," she says. "We should be tracking these things and how we are doing as an industry."

When The New York Times ran its recent investigative exposé on data center waste, it had to peg the story on a 4-year-old data center report by McKinsey & Co. -- and a whole lot of anecdotal evidence.

That seems to be the current state of research on data center utilization rates: one report based on data from 20,000 servers that was compiled in 2008. Back then, Amazon's EC2 cloud service was in beta; nowadays, EC2 and its sister Amazon Web Services run as much as one percent of the internet. The industry has changed, but the research has not.

McKinsey spokesman Charles Barthold says that the only systematic study McKinsey has ever done was this 2008 analysis. Back then, it pegged server utilization at 6 percent -- meaning servers in the data center only get used 6 percent of the time. The firm guesses that the rate is now between 6 and 12 percent, based on anecdotal information from customers, Barthold says. McKinsey declined to talk in depth about the report.

And that's too bad. It's not even clear whether this is the best way to measure the efficiency of our data centers.

Over at Mozilla, Datacenter Operations Manager Derek Moore says he probably averages around 6 to 10 percent CPU utilization from his server processors, but he doesn't see that as a problem because he cares about memory and networking. "The majority of our applications are RAM or storage constrained, not CPU. It doesn't really bother us if the CPU is idle, as long as the RAM, storage, or network IO [input-output] is being well-utilized," he says. "CPU isn't the only resource when it comes to determining the effectiveness of a server."

After we contacted him, Moore took a look at the utilization rates of about 1,000 Mozilla servers. Here's what he found: the average CPU utilization rate was 6 percent; memory utilization was 80 percent; network I/O utilization was 42 percent.
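For readers curious how a survey like Moore's might be run, here's a minimal per-host sketch -- an illustration, not Mozilla's actual tooling -- using the third-party psutil library. CPU and memory percentages come straight from psutil; the network figure is an estimate against an assumed 1 Gb/s link, since psutil doesn't report interface capacity. Rolling per-host samples up into fleet-wide averages is left out.

```python
# Minimal per-host utilization sampler (requires: pip install psutil).
# Hypothetical example; link speed is an assumption, not a detected value.
import psutil

LINK_SPEED_BYTES_PER_SEC = 1_000_000_000 / 8  # assume a 1 Gb/s NIC


def sample_utilization(interval=5):
    """Return CPU, memory, and estimated network utilization percentages."""
    net_before = psutil.net_io_counters()
    cpu_pct = psutil.cpu_percent(interval=interval)  # blocks for `interval` seconds
    net_after = psutil.net_io_counters()

    mem_pct = psutil.virtual_memory().percent

    bytes_moved = (net_after.bytes_sent - net_before.bytes_sent +
                   net_after.bytes_recv - net_before.bytes_recv)
    net_pct = 100.0 * bytes_moved / (LINK_SPEED_BYTES_PER_SEC * interval)

    return cpu_pct, mem_pct, net_pct


if __name__ == "__main__":
    cpu, mem, net = sample_utilization()
    print(f"CPU {cpu:.0f}%  memory {mem:.0f}%  network ~{net:.0f}%")
```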

Uptime's Spellmann says that companies that use virtualization software like VMware or that have bought into cloud-based computing have been able to boost their CPU utilization rates into the 20 to 30 percent range. The Googles of the world are probably closer to 50 percent.

But how that affects the overall data center picture is unclear. Is this an industry of wasteful power hogs? Are things improving? Pundits may have their gut feelings on this, but the truth is that the picture really is quite, um, cloudy.

"We know the utilization is low, we think it probably hasn't changed in the past four years," says Jonathan Koomey a consulting professor at Stanford University who studies energy efficiency. "But we really don't have the data to show that."