Data centers are notorious for wasting power, or at least they were. An in-depth study of data center power efficiency by the US Department of Energy’s Berkeley Lab found that the data center industry has made enormous strides in power efficiency since a similar study was conducted in 2007. A leader of the study, research scientist Arman Shehabi, who also participated in the 2007 study, explains the mechanics of the study and shares not only the findings but also his view of the results.
Who did the research team study and how did they extrapolate the results into the future? What’s changed in the data center industry to lead to this vast improvement in efficiency, and how much of an influence did financial self-interest have in the advancements? To what extent did virtualization and the move toward larger data centers improve the results?
You can listen to the full conversation in the player, or read the transcript below the player.
Kevin O’Neill, Data Center Spotlight: This is Kevin O’Neill with Data Center Spotlight, and I have with us today Arman Shehabi. Arman is a research scientist at the Berkeley Lab. Arman was just involved in a study of data center efficiency, and the overall power efficiency and electric use of data centers nationwide and he came up with some interesting stuff. Arman, thanks for joining us today.
Arman Shehabi, Berkeley Laboratory: Hey, glad to be here.
Data Center Spotlight: Arman, tell us about Berkeley lab, what you all do there, and what you do.
Arman Shehabi: Yeah, so the Berkeley Lab, that’s the nickname, or the shorter name, for the longer name, which is the Lawrence Berkeley National Laboratory. We are one of the national laboratories in the country that’s managed by the Department of Energy, federal government. We are also here locally managed by the UC Regents, so we’re connected to UC Berkeley in that way, so we’re just up the hill from UC Berkeley. The history of the lab, it started as a nuclear lab, and so anybody that’s looked at a periodic table, if you look down at the bottom, you’ll see a lot of elements that are named for the Berkeley Lab, so you’ll see Berkelium, and Lawrencium, Californium. But now, there’s a lot of hard core science that goes on here, no more nuclear sciences, it’s a lot of physics and chemistry, but the group that I work with is in the energy technologies area, where we do a lot of analysis to understand how much energy is being used in the country, look at energy efficiency strategies, so a lot of the energy efficiency work that’s been built up since the 70s came out of this section of the lab.
My work is focused, I have a background in life cycle assessment, which has to do with the systems analysis of looking at all the energy that’s associated with certain products and services, and I focus on emerging technologies in general, and a lot of my work is in information communication technologies like data centers, to try to understand how much energy they use, and what kind of environmental impacts they might have.
Data Center Spotlight: Sounds like some interesting work that you do there.
Arman Shehabi: Fun stuff.
Data Center Spotlight: You were new at the Berkeley Lab in 2007 when you folks did a previous study of the data center industry and made some projections about future energy usage. Can you tell us a little bit about that study and maybe roll that into what you found this time around?
Arman Shehabi: Sure, sure, so not only was I new, I was kind of before new, so I was a grad student at the time. I was a grad student at UC Berkeley, and I was working up here at the lab part-time as a grad student in 2006, when the report was being worked on. So, I was involved with it as a student, and the way that report came out was, there was a lot of concern of growth of data centers, just right around the turn of the century, and the amount of electricity they were using, and nobody really had a good idea of how much electricity they were using, other than the fact that when you look at a data center, it’s drawing an enormous amount of electricity, and more and more of these were popping up, like right around the turn of the century with the first dot-com boom, and Congress actually put forward a rule making, asking the EPA, the Environmental Protection Agency, to determine how much electricity is actually being used in data centers, and then the EPA came and tapped us, the National Lab at Berkeley to do that analysis.
So we worked on that in 2006 to be able to publish it in 2007, and the findings were surprising at the time. We saw that the electricity used from 2000 to 2005, in the first half of that decade, essentially doubled and it was continuing to double based on the current trends that we saw for growth and for the amount of electricity that the servers were drawing, and the amount of power that the cooling systems required in the data centers. So, the concern was that, if things stayed on that trend, there was going to be this doubling of electricity in data centers every five years as services for data centers kept on increasing in demand and more and more of these buildings were built.
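To put the "doubling every five years" trend in annual terms, here is a minimal Python sketch. Only the five-year doubling period comes from the conversation; the variable and function names are invented for illustration.

```python
# Illustrative arithmetic only: the five-year doubling period is taken from
# the interview; everything else is a made-up name for this sketch.
DOUBLING_PERIOD_YEARS = 5

# If use doubles every N years, the equivalent compound annual growth rate r
# satisfies (1 + r) ** N == 2.
annual_growth = 2 ** (1 / DOUBLING_PERIOD_YEARS) - 1


def relative_use(years: float) -> float:
    """Electricity use relative to the starting year on the doubling trend."""
    return (1 + annual_growth) ** years


print(f"Implied annual growth: {annual_growth:.1%}")
for years in (5, 10, 15):
    print(f"After {years} years: {relative_use(years):.1f}x the starting level")
```

On that trend, use grows by roughly 15% a year and reaches about four times the starting level after a decade.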
Data Center Spotlight: And so you came up with those findings, and were those released in 2007, or you were doing the research in 2007?
Arman Shehabi: We were doing it in 2006 and 2007, and then we published it in 2007.
Data Center Spotlight: Okay, and a few years after that, there was a lot of publicity in the 2011-2012 range by some media folks who had investigated, I remember the New York Times in 2012 did a series that was highly critical of the data center industry, and in a lot of cases, probably deservedly so. I think back then, there was still a lot of energy waste, zombie servers continued to be a big problem and there was a lot of power inefficiency, but it sounds like from the results of the study you did recently at the Berkeley Lab that the data center industry has become a lot more efficient. Can you tell us what you found this time around?
Arman Shehabi: Yeah, well, we wanted to go back and take a look at that study, and you mentioned that 2011 is when a lot of press kind of built up around data center electricity use. There was generally a little bit of lag time from the time that we published the report to the time that the information got out there, and there were a couple of high profile journal publications that some of the authors from the 2007 report published around 2011, so that helped provide interest in looking at the study, and the New York Times did their write-up, and all that information was also in the original report, and some of that inefficiency that we saw, like when I would go and walk through some data centers, you would see all these servers haphazardly put together in a big room, and there would be some air conditioning in there, and it would all be mixing together, the hot and cold air. Essentially that’s what I call leaving the refrigerator door open type of inefficiency, so it’s just totally unnecessary and very inefficient.
Data Center Spotlight: So the science of airflow and cooling of data center environments hadn’t really caught on at that point, sounds like.
Arman Shehabi: Not really, it was this organic growth where data centers essentially started as though somebody had a server under their desk, and then they had two, and they had maybe three, so they put them in a closet, and then maybe they had four or five, and there was not enough space in the closet, so they put them in a room, and it was getting hot in there, so they put in window air conditioners to cool the room, and then they put it into a bigger room, and it grew like this so that by the time you have an entire floor of the building, it’s just a bunch of servers in there that just naturally grew, and then they just kept adding in more air conditioning power to just cool the area, but yeah, when you think you want to cool your beer in the refrigerator, you want to isolate that cold area. So if you keep the door open, and all that cool air is just flowing with all the warm air that’s outside, it’s going to cause the compressor, cause the refrigerator to work a lot harder.
Data Center Spotlight: Sure, okay, that’s interesting, and so this time around, what did you find?
Arman Shehabi: Yeah, so there was an interest for us to go back and take a look at the report, because a lot of things had changed since 2007 when I was updating my MySpace page back then, and now information communication technology is quite different. We were seeing from our initial analysis before we dived into the actual report, we were seeing IP traffic was increasing by 20% every year, and the amount of data storage had gone up by 20 times, and it’s a whole Internet of Things, and a lot of new equipment that was being put into data centers. So, we felt that it was time to refresh that report and get in another analysis of how much energy we thought was being used, and it was pretty surprising what we saw. Our electricity use matched the old report up till about 2007, 2008, but then there was a sharp change right around 2008, and part of that, you can imagine is because of the economic recession. So when we did the first report, it was right before the economic recession, that nobody saw that coming, or at least we didn’t see it coming.
So, looking at it now, we were seeing that there’s a big change right around 2008, but then after 2008, the increase didn’t really start again. It stayed pretty much at that same level, like a slight increase, 4%, 5% increase every year, beyond that up until now, and when we’re looking a few years into the future, with the model that we’ve built, it’s staying at that very slow growth rate of 4%-5% every year.
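To see how different 4% to 5% annual growth is from doubling every five years, one can compute how long it takes to double at those rates. This is a small Python sketch; the growth rates are the ones quoted above, and the function name is hypothetical.

```python
import math


def doubling_time_years(annual_growth_rate: float) -> float:
    """Years needed to double at a constant compound annual growth rate."""
    if annual_growth_rate <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / math.log(1 + annual_growth_rate)


for rate in (0.04, 0.05):
    years = doubling_time_years(rate)
    print(f"{rate:.0%} per year -> doubles in about {years:.1f} years")
```

At those rates a doubling takes somewhere around 14 to 18 years rather than five.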
Data Center Spotlight: It’s interesting, because if everything had stayed equally efficient, you would expect a big increase. The IT infrastructure industry is growing by leaps and bounds, with all sorts of providers and cloud computing, and people won’t store anything on their phones or on their computers. Everything is a service these days, so I guess it stands to reason that the industry has become a lot more efficient in how they handle their power usage.
Arman Shehabi: Yeah, that’s right. The fact that the amount of electricity is essentially stable during this decade has nothing to do with the amount of service. The services have been increasing drastically, and as a way to compare that, we did essentially a counter-factual, where we imagined if the efficiency measures that we were modeling in 2010, if everything stayed that way, that the type of cooling that was used, and the efficiency of how all the IT equipment was being operated, if it stayed at 2010, but the data center industry grew to meet the services that it did over the decade, the amount of electricity would have doubled by now, and in fact, it would have tripled by 2020, compared to what we’re thinking of it as being in 2020.
Data Center Spotlight: It just stands to reason, because a lot of data centers, for the most part, their largest expense is electricity, so as you referenced earlier, when you did your first study, a lot of the data centers you would go to were just sort of haphazardly put together. Now, you have a lot of purpose-built data centers, or large facilities that have been repurposed for data center usage, and from the get go, they’re designed to be more efficient and to reduce the expenses, because the providers, and then the companies that are running these data centers, they don’t have interest in their costs doubling every few years.
Arman Shehabi: That’s right. A lot of it has to do with the consolidation in those servers, that there’s still a lot of small servers out there in closets and small rooms, like the one, two servers running local email or something like that, but a lot less on a percentage basis than back in 2007. So now you have, like you said, these really large data centers that are being built with the idea of being a data center in mind, and to be as efficient as possible, rather than just being some floor of office space somewhere.
Data Center Spotlight: So you’ve got a lot of these private data centers, people go into cloud computing, to virtualization. You have companies like Amazon and Microsoft, and other major cloud providers building these huge data centers which are certainly more efficient. You have multi-tenant data centers, which are probably because of their size more efficient than if all those companies had their own private centers. It just seems like there are a lot of things, just economically, from a business perspective, it seems there are a lot of drivers that sensibly made the energy consumption a lot more efficient.
Arman Shehabi: Yeah, you think of all the thousands or millions of people that are using Amazon Web Services for example, like if they all had their separate little servers somewhere, and there was something that you could do to make it 20% more efficient, that would come out to being a few pennies you could save every year. There’s not a lot of incentive to do that, even if it doesn’t cost anything, there’s not a lot of incentive to do that. But when you put it all together, and Amazon’s paying millions of dollars every month for electricity, there’s a lot of incentive to make it as efficient as possible.
Data Center Spotlight: Now, Arman, when I was looking through your report, I was just overwhelmed by the enormity of the task that you and your team had undertaken. Can you give us an idea as far as how you gathered the data that you analyzed to develop your results, and then how you extrapolated that data out into the future to come up with an accurate projection for future data center energy use?
Arman Shehabi: One of the big challenges is getting data for data centers, ironically, and it’s because companies that are operating data centers, it’s proprietary information, they want to keep that close to them, and don’t want to let you know how much electricity they’re using, or how big their data centers are, how many servers are there, because they want to keep a competitive advantage. They don’t want everyone to know essentially how good the market might be. So, it’s challenging to figure out what’s out there, so what we have to do is work with different industry groups that collect data on equipment sales and shipments. Essentially, we took a step back, our model is essentially a combination of a bottom-up and a top-down model, and by bottom-up that means looking at all the small pieces and putting them together, and those small pieces include sales of different IT equipment and shipments of where they’re going into the US. And then, a top-down approach is working with the industry to be able to make reasonable assumptions on what the growth might be for certain areas, how things are operated, how those operations might be different in different types of data centers.
And so, coming from a top-down, we put certain assumptions and create certain categories of those assumptions, and then with all that data together, we build different algorithms that allow us to essentially have a trend of where the growth might be based on how much IT equipment is going to be sold in the future, and will be part of the install base, and what kind of data centers those are going to go into, what size, who’s operating them, and based on those characteristics, how they’re operating, what kind of cooling systems they’re using, what their UPS, their uninterruptible power supplies might be. All that has to be put together to come up with an estimate.
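The bottom-up part of such an estimate can be sketched roughly as follows. This is not the report's model: the categories, server counts, wattages, and PUE values below are all hypothetical placeholders, and the sketch covers servers only, ignoring storage and network equipment.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 8760


@dataclass
class DataCenterCategory:
    name: str
    installed_servers: int   # hypothetical installed base
    avg_server_watts: float  # hypothetical average draw per server
    pue: float               # total facility energy / IT energy (assumed)


def annual_twh(category: DataCenterCategory) -> float:
    """Annual facility electricity for one category, in terawatt-hours."""
    it_kw = category.installed_servers * category.avg_server_watts / 1000
    it_kwh = it_kw * HOURS_PER_YEAR
    return it_kwh * category.pue / 1e9  # kWh -> TWh


# Made-up numbers purely to show the shape of the calculation.
categories = [
    DataCenterCategory("server closet", 1_000_000, 200.0, 2.5),
    DataCenterCategory("hyperscale", 2_000_000, 250.0, 1.2),
]

total_twh = sum(annual_twh(c) for c in categories)
for c in categories:
    print(f"{c.name}: {annual_twh(c):.2f} TWh/year")
print(f"total: {total_twh:.2f} TWh/year")
```

The top-down assumptions described above would enter through the per-category values (how equipment is operated, which cooling systems are used) and through how the installed base is projected forward.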
Data Center Spotlight: Now, did you survey any large end users as well? Did you talk to cloud providers, and major enterprise data center users, and multi-tenant colocation data centers, or largely just equipment sales, and dealing with the industry groups that you spoke of?
Arman Shehabi: In two ways, we did that. First, we did an initial draft of our report, and then we sent it out for comment to different industry contacts, and different advocates that we know that could represent certain areas of the data center landscape, so private data center owners were a part of that. We also worked with industry groups like GreenGrid, which has been really helpful in getting information from their members, so GreenGrid is an industry trade group that works to represent the data center industry, and they worked with us to reach out to their members to answer questions that we have, and provide information anonymously about how they operate their data centers.
Data Center Spotlight: Now, we talked about the motivation that these companies have, providers as well as end users have the economic motivation they have to be more efficient. We talked about the fact that services are delivered at scale now, as opposed to a lot of people doing it at a small level individually, and we talked about, you hinted at some hot aisle, cold aisle, and just airflow, and basic energy efficiency that comes with planning of data centers as opposed to them just sort of organically happening in a haphazard manner. Beyond those basic reasons that we’ve already discussed, can you give us some other key reasons that you see for the increase in efficiency, or is there anything out there that would be a little less obvious to people that follow the industry, that you uncovered that surprised you?
Arman Shehabi: Well, the three main drivers that we saw increasing the efficiency, though: one was that we were seeing an overall improvement in infrastructure energy use, and by infrastructure I’m including everything from cooling and the electronics, like uninterruptible power supplies, everything except for the IT equipment, and the industry actually uses PUE as a metric for that, which is power usage effectiveness. It’s just the total amount of electricity in a data center, divided by the IT electricity, so that just means that if you have a PUE of 2, that means that all your infrastructure electricity is equivalent to the amount of electricity that’s used by all the IT equipment. So, with that background, we were seeing in the smaller data centers PUEs of 3, meaning that the data center’s using 3 times the amount of electricity as just the IT equipment itself. But in the larger data centers, they’ve gotten those down to PUEs of 1.1, 1.2, so the cooling equipment’s gotten much more efficient. So, that’s the first part.
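The PUE definition given here (total facility electricity divided by IT electricity) is simple enough to write down directly. A minimal Python sketch follows; the function names and the sample kWh figures are illustrative assumptions, while the PUE values of 2, 3, and 1.1 are the ones mentioned above.

```python
def pue(total_facility_kwh: float, it_kwh: float) -> float:
    """Power usage effectiveness: total facility energy over IT energy."""
    if it_kwh <= 0:
        raise ValueError("IT energy must be positive")
    return total_facility_kwh / it_kwh


def infrastructure_per_it_unit(pue_value: float) -> float:
    """Infrastructure energy (cooling, UPS, etc.) per unit of IT energy."""
    return pue_value - 1.0


# A facility drawing 200 kWh in total for 100 kWh of IT load has a PUE of 2:
# infrastructure uses as much electricity as the IT equipment itself.
print(pue(200.0, 100.0))

for value in (3.0, 2.0, 1.1):
    overhead = infrastructure_per_it_unit(value)
    print(f"PUE {value}: {overhead:.1f} units of infrastructure energy per IT unit")
```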
The second part is that we’ve seen really big improvements in virtualization and consolidation of servers and by that, that’s allowing one server to do the work of many different servers, what would have been many different servers before, because in reality, when you have a server, most of the time it’s not doing much. It’s just sitting there, ready to go, so if you think of the percent usage, it’s in the 1% to 10% of its overall utilization. It’s not being cranked at full speed, most of the time. But, those servers still use almost the same amount of electricity when they’re essentially sitting idle, compared to when they’re running at full speed.
So, we’ve seen that increase in virtualization and consolidation since 2010 which has been a big effect of dropping the overall electricity use.
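Why consolidation saves so much follows from the point that an idle server draws almost as much as a busy one. Here is a rough Python sketch under a simple linear power model; the wattages and the ten-to-one consolidation ratio are hypothetical, and virtualization overhead is ignored.

```python
# Hypothetical server: draws 240 W idle and 300 W flat out, with power
# assumed to rise linearly with utilization in between.
IDLE_WATTS = 240.0
MAX_WATTS = 300.0


def server_watts(utilization: float) -> float:
    """Power draw at a given utilization (0.0 to 1.0) under the linear model."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return IDLE_WATTS + (MAX_WATTS - IDLE_WATTS) * utilization


# Ten lightly used servers at 5% each, versus one virtualized host carrying
# the same total work at 50% utilization.
before_watts = 10 * server_watts(0.05)
after_watts = 1 * server_watts(0.50)
savings = 1 - after_watts / before_watts

print(f"before: {before_watts:.0f} W, after: {after_watts:.0f} W")
print(f"reduction: {savings:.0%}")
```

Under these assumed numbers the same work is done for a small fraction of the power, because nine mostly idle machines are switched off.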
And the third thing is, that there have been improvements with the IT equipment itself. One is, before 2007, there was this general trend of increasing maximum power on volume servers, which is your typical type of server that you see in a data center. The maximum power was increasing every year, since 2000, and you can imagine that they’re bulking up every year, but since then, it’s stabilized, and there really isn’t an increase in maximum power anymore. There are more cores going into the processors, but the power draw’s been staying about the same. Also, there’s a bit of an improvement, though a lot more is still needed, in what we call power proportionality, and that’s the amount of electricity those servers are drawing, how proportional is that to the amount of work it’s doing, the idea that if it’s going full speed, 100% of its utilization, it should be using 100% of its power, and if it’s doing half of its potential work, it should use half the power. That proportionality has improved a bit. It used to be really terrible, now it’s a little bit better.
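Power proportionality can be illustrated with a simple linear model in which a server draws some fixed fraction of its maximum power when idle. The Python sketch below uses hypothetical idle fractions (0.8 for a poorly proportional server, 0.0 for a perfectly proportional one); neither figure comes from the report.

```python
def power_fraction(utilization: float, idle_fraction: float) -> float:
    """Fraction of maximum power drawn at a given utilization (linear model)."""
    return idle_fraction + (1.0 - idle_fraction) * utilization


def energy_per_unit_work(utilization: float, idle_fraction: float) -> float:
    """Energy per unit of work, relative to the same server at full load."""
    if utilization <= 0:
        raise ValueError("utilization must be positive")
    return power_fraction(utilization, idle_fraction) / utilization


# A perfectly proportional server (idle_fraction = 0) uses half its power at
# half load, so energy per unit of work stays at 1.0 at any utilization.
print(energy_per_unit_work(0.5, idle_fraction=0.0))

# A poorly proportional server (hypothetical idle_fraction = 0.8) running at
# 10% utilization spends several times more energy per unit of work.
print(round(energy_per_unit_work(0.1, idle_fraction=0.8), 1))
```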
Data Center Spotlight: Sounds like just a lot of different factors contributing to much greater efficiency.
Arman Shehabi: Right.
Data Center Spotlight: Let me ask you, you touched on it, and this might be a topic for another day, but you folks, you relied on PUE measurements a little bit, but I know a lot of people are moving away from PUE. For instance, ASHRAE, which is the American Society of Heating, Refrigerating and Air-Conditioning Engineers, recently dropped PUE as their metric of choice in favor of design MLC, which is design mechanical load component. Do you guys have any opinion on the accuracy of some of the different metrics as some people are deciding to move away from PUE?
Arman Shehabi: Yeah, it’s understandable that ASHRAE moved away from it, and PUE, its beauty is its simplicity, even though maybe from how we’re talking here on the blog, it didn’t sound that simple, but compared to the other type of metrics, very simple, it’s just a ratio of two values, but by being so simple, you lose a lot of the nuance that could be there in a data center, and whenever you create a certain metric, essentially you create a challenge of how can you game that particular metric, and it being such a simple metric, it allows you to potentially move towards strategies that might not be more efficient, but would improve the PUE. An example of that is the amount of electricity used on all the IT equipment: efficiency in that area is not taken into account in the PUE. So, if you use more electricity in the IT equipment, that doesn’t hurt your PUE. In some ways, it actually helps it, and this can be a problem, as you’re starting to make more efficiency measures within the IT equipment, and especially as what the IT equipment is, what a server is, is changing in some of these larger data centers, like what Google and Facebook are doing. What’s inside the box, and what’s outside the box, those lines are being blurred, so to say, what’s the IT equipment, and what’s all the other equipment in the data center, like comparing those two electricity uses, starts to get confusing.
So I understand them moving away from it, and I think there’s a need to improve the metric, but it’s very challenging to find one that is much more accurate but also very simple.
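The weakness described here, that PUE ignores how efficiently the IT equipment itself runs, can be shown with two made-up snapshots of the same facility. In this Python sketch all kW figures are hypothetical; only the PUE formula comes from the discussion.

```python
def pue(it_kw: float, infrastructure_kw: float) -> float:
    """PUE from IT load and infrastructure load (cooling, UPS, and so on)."""
    return (it_kw + infrastructure_kw) / it_kw


# Hypothetical facility before any IT efficiency work.
it_before, infra_before = 100.0, 50.0

# Same workload after the servers are made more efficient: IT load falls a
# lot, while infrastructure load falls only a little.
it_after, infra_after = 70.0, 45.0

pue_before = pue(it_before, infra_before)
pue_after = pue(it_after, infra_after)
total_before = it_before + infra_before
total_after = it_after + infra_after

print(f"before: total {total_before:.0f} kW, PUE {pue_before:.2f}")
print(f"after:  total {total_after:.0f} kW, PUE {pue_after:.2f}")
```

In this invented case the facility uses less electricity overall, yet its PUE gets worse, which is the kind of nuance a single ratio cannot capture.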
Data Center Spotlight: It’s interesting, because we’ll have another podcast interview going up here where we talked about how an enterprise can project their future data center use and data center sizing and cloud sizing and that sort of thing. You’re talking about some of the same issues. The servers, back when you did your first study, probably had about as much compute power as the cell phones sitting next to us today. What are things going to be like in five years, or ten years, from a compute power perspective?
Arman Shehabi: Right, right.
Data Center Spotlight: Let’s wrap with maybe a final question, Arman, and again, I appreciate your time, it’s really interesting stuff and I’m just thinking about two or three other times I’d like to grab you to get deeper into some of the issues that you’ve brought up. As we project forward, you’re projecting just continued greater efficiency in the data center industry moving forward. Is that based on all the topics, and all the factors that you’ve discussed during this interview, are there some other things on the horizon that factored into your very optimistic projections about the efficiency in the data center industry?
Arman Shehabi: I think we’ve covered all the areas for the projections that we made, but I should remind you, Kevin, that we only projected out to 2020, so we only went out a few years from now, and there’s a reason for that, because it’s really hard to understand how the data center landscape is going to change because the whole IT, communication technology landscape is changing, and there’s a concern I want to end this with: we’re not completely done, there’s not a mission accomplished banner over it. There are efficiency improvements that have been made, but this level of efficiency improvements, can they keep on being made, because the demand for computational power, the demand for storage is only increasing, and it’s going to keep increasing. When we start to think about moving to an Internet of Things, and virtual reality, driverless cars, like the computational demand is going to completely change, and the efficiency measures that we saw in our report, which is just a handful of things, you can bring it down to a handful of things, those can only go so far. We’re going to hit a point where it’s going to be close to 100% efficient, so new efficiency measures are going to have to be found to address this growth in computational demand.
Data Center Spotlight: That’s some good stuff, Arman, I really appreciate it. How can people become more familiar with the work that you folks do at the Berkeley Lab?
Arman Shehabi: So they can at least go to the Lawrence Berkeley Lab website, if they just search for Lawrence Berkeley Lab, that will come up and they can look for me, Arman Shehabi, and go to my website and see the different papers that we’ve been writing and to stay up to date on new information that we’ve put out.
Data Center Spotlight: And what is the URL of your website, Arman?
Arman Shehabi: [LAUGH]
Data Center Spotlight: Trick question. Tell you what, I’ll get it from you and we’ll put it in the introduction to the podcast so people will be able to see it, so that should work out well. Arman, so grateful for your time today, and for your input, and for all the good work that you folks are doing at the Berkeley Lab.
Arman Shehabi: Okay, sounds good, oh, and Kevin, it’s lbl.gov.
Data Center Spotlight: Well, I could see how you couldn’t remember that, Arman.
Arman Shehabi: [LAUGH]
Data Center Spotlight: lbl.gov, terrific. Good stuff, thank you so much, Arman, I appreciate it.
Arman Shehabi: All right, thanks Kevin.