Techzine Talks on Tour

Dynamic AI workloads require dynamic power solutions

Coen or Sander Season 2 Episode 1

Techzine Talks on Tour is back after the holidays with a new episode. This one is about how to power data centers in the AI age. We recently attended the Vertiv: Driving Innovation event and talked about the implications of AI workloads for how data centers are powered.

Data centers play a vital role in the development and the deployment of AI. Most discussions about this topic focus on the IT side of things, but there's also a different side to this story. How do you make sure AI gets the power it wants and needs? We sat down with Arturo Di Filippi, Offering Director Global Large Power at Vertiv, to talk about all things power and AI.

This latest episode of Techzine Talks on Tour focuses on the evolving demands of powering data centers in the age of AI. We talk about the need for dynamic power solutions with Di Filippi, but also about the changing role of UPS systems in data centers. They have become much more than just a fallback when something goes wrong.

We touch on other subjects too, like the integration of renewable energy sources into future infrastructure. Another topic we discuss is the densification of power systems, which has to keep up with the densification of IT systems.

All in all, there's a lot more to powering data centers in the age of AI than you might think at first. Listen to this conversation to learn more about that.

Speaker 1:

Welcome to this new episode of Techzine Talks on Tour. I'm at Vertiv's Driving Innovation event in Bologna, Italy, and I'm here with Arturo Di Filippi. He's the Offering Director, Large Power for Vertiv. Welcome to the show, Arturo. Okay, so we saw a lot of things today about powering the age of AI, right, which is a hot topic at the moment, especially for a company like Vertiv, I would imagine. So one key takeaway that I had, based on the keynotes and the conversations that we had, is "dynamic". That sort of stuck in my head for some reason.

Speaker 2:

Absolutely. So, first of all, hi everyone, and thanks. I'm really delighted to be here. And absolutely, you are spot on. You said the word that we have been mentioning multiple times today, and also at another event that we had just two days ago: dynamic. It's something that we associate with the evolving challenges, but also with the developments that are happening around data centers.

Speaker 2:

So I'm in charge of product management, or offering, for large power, and our customers are mainly those that are deploying data centers, because our products are 500 kVA and up, so 500 kilovolt-ampere, from 500 kVA up to 2.5 megawatts.

Speaker 2:

We currently also have products that go even beyond that, to 3 megawatts, and therefore our target customers are data centers. We have seen some numbers today on the projections of how much power can be absorbed and utilized by data centers and critical infrastructure from now up to 2030, and it's huge. It's massive, and that is why we will need to make sure that our equipment is going to be dynamic enough to support this.

Speaker 1:

I saw some numbers: 300 to 500 kilowatts per rack before 2030 or something like that, and we're at 150 now, or something.

Speaker 2:

We are at even less right now. We're at 50, yes, depending on the area of the world. In America, this, let's call it AI boom, is probably already at its highest level right now, whereas in Asia and EMEA it still needs to get there, and we are currently seeing installations where the rack power density is below 50 kilowatts. But it's coming. It's coming here too, and you're right, the projections that have been made were considering hundreds of kilowatts per rack.

Speaker 1:

So basically what the likes of NVIDIA are doing with higher densities, you basically have to do the same, but from a power and a UPS kind of perspective.

Speaker 2:

Yeah, we're willing to support the critical infrastructure. We, as Vertiv, have always been supporting critical infrastructures and we will continue to do so, not just for the traditional data center but even more for the AI data center. And back to your question about the dynamic load, especially for the AI application.

Speaker 2:

Right now it's much more dynamic than in the traditional data center, where we were more or less set around traditional, stationary loads. Right now it's very dynamic and, as a consequence, we will need to be able to adapt to that. And then, of course, our features, and in some cases our operating modes, also have "dynamic" in there.

Speaker 1:

Without going into detail, can you, concretely, run AI workloads at the moment on existing power and backup server or UPS solutions? I would say yes, right?

Speaker 2:

I'm glad that you asked, because the answer is yes. Of course, this is evolving right now, so probably one week from now we will have some other information about the AI workload profiles, and they are challenging to manage for the powertrain, as we define it, because we can provide all the equipment for the entire.

Speaker 1:

You talk about a powertrain, and all cars have a powertrain. So what you're saying is that you deliver everything that has to do with power, from the grid all the way up to the server? Yes. Pretty funny.

Speaker 2:

You know, we are in the car valley here, so we have a lot of great cars and, of course, our products can be classified the same way. You're right. We are covering the entire powertrain, and I'm now speaking specifically about the power equipment: everything that goes from the grid to the chip. All this equipment will be affected by the variability of the workloads and, as such, we need to be able to support them and, most importantly, to provide guidance to consultants, but also to customers who are basically trying to understand what happens in that case.

Speaker 1:

So, getting back to the running AI workloads on existing solutions, which is possible, obviously, but is it something that customers need to sort of reassess and look at replacing because of evolving needs when it comes to AI workloads, or how does that work?

Speaker 2:

Yeah, good one. We have been, I would say, lucky, not clever: lucky enough to have been designing, for many years already, equipment that is able to withstand the most challenging and demanding load profiles. Our products already have overload capabilities which are, let's say, higher than those of most of our competitors.

Speaker 1:

Yeah, we did a tour of the facilities today and, if I remember correctly, somebody said that they test it at 150%.

Speaker 2:

That's correct. So 50% overload. Instead of 1 megawatt, if you have a megawatt load, we can do 1.5 for one minute, and we have already had this capability on the existing products. So the answer is yes. On the new products that we are launching right now, we are also performing some firmware adjustments so that we can support operating conditions that may arise because, let's be honest, the workload profiles are changing all the time.

Speaker 2:

And there's no one profile that we have to support. We need to be ready to support all of them, also with the help of batteries when it's working.
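
Editor's note: the overload figure quoted above (a 1 megawatt unit carrying 1.5 megawatts for one minute) can be turned into a simple check against a load profile. The following is a minimal Python sketch under stated assumptions, not Vertiv's firmware logic: the function name, the one-sample-per-second profile, and the rule that the overload clock resets as soon as the load drops back to the rating are all hypothetical simplifications.

```python
def within_overload_rating(load_kw, rated_kw=1000.0, overload_factor=1.5,
                           max_overload_s=60):
    """Check a per-second load profile against a simple overload rating.

    Assumes one sample per second (an illustrative choice). Returns False
    if any sample exceeds rated_kw * overload_factor, or if the load stays
    above rated_kw for more than max_overload_s consecutive seconds.
    """
    limit_kw = rated_kw * overload_factor
    consecutive = 0  # seconds spent above the rating without a break
    for kw in load_kw:
        if kw > limit_kw:
            return False  # beyond even the short-term overload limit
        if kw > rated_kw:
            consecutive += 1
            if consecutive > max_overload_s:
                return False  # overload lasted longer than allowed
        else:
            consecutive = 0  # back within rating, reset the clock
    return True


# Hypothetical profile: 800 kW, then one minute at 1.5 MW, then 800 kW again.
spiky_profile = [800.0] * 30 + [1500.0] * 60 + [800.0] * 30
print(within_overload_rating(spiky_profile))
```

A real unit would also account for thermal recovery between overloads, which this sketch ignores.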

Speaker 1:

And just to be clear, it's not only about more power, right? It's also about the form of the workload, or the behavior of the workload, maybe that's a better word. AI workloads usually have lots of spikes. It's very spiky, so sometimes it needs a lot and sometimes it doesn't need all that much.

Speaker 2:

Yeah.

Speaker 1:

But that's something you maybe can't really do very well with the existing solutions, and that's something you're actually working on, right?

Speaker 2:

So right now, with the existing products, we can, because there are mainly two aspects to this spiky, erratic load. One is the overload, which is more power, as we were saying, and the other one is the variability in a very short time. So what is the difference between the existing products and the new products? The existing products can support the overload very well. As for the variability in a very short time, that's something they can do, but of course it's a matter of firmware calibration: how much variation you see and how quickly you see it. On the new products we are certainly emphasizing that aspect, but in the end we have existing products that will be used for AI applications. So eventually the feature that we are driving on the new products, we may need to have on them as well.

Speaker 1:

I was wondering, also based on sort of a parallel with IT, with NVIDIA, because there are many parallels with what you're doing: does it also mean that maybe there's more focus on the software side of things to keep all this running, or is it still mainly a hardware play?

Speaker 2:

So it's both. On the existing units the hardware is already there. The extra hardware that may be needed to support this variability is not in the UPS itself, it's outside of it: for instance, additional energy storage that you may need to provide power for the spiky load. We also touched today on dynamic grid support and peak shaving, where the batteries are expected to give the energy needed when the grid is not there.

Speaker 2:

That's more or less the same for the variability of the AI loads. If the grid operator does not want to see that very large step load, batteries can support that. So the additional hardware will most likely be on that side, whereas the firmware is something that will probably take most of the work on our side, which is the power converter side.

Speaker 1:

And how about this: we talked about UPSs, right? I also got the impression that the role of the UPS is changing. From my experience, following this market for quite some time, UPSs are there for when there's something up and you need power to keep running. But I also heard from the man from Sweden, from the data center, that they're also using their UPSs for sort of a smart grid. How does that work?

Speaker 2:

Yeah, and we are actually embracing this change. In fact, our product line has recently been renamed large power converter, because we are no longer speaking just about UPS: the role of the UPS is changing into that of a power converter. This power converter no longer only provides backup power and power quality, the two main features that the UPS has traditionally provided, but can also provide something more. As we mentioned in the discussion, it can enable the interface with new energy sources, and it can also enable services like fast frequency response or peak shaving. So in that case the power converter, or power center, let's say, will enable some of the services that you won't have on a, let's say, standard UPS. And, like the case in Sweden, they can export power into the grid and get revenues for that, even when they have no load.

Speaker 1:

That's actually quite interesting because, as you also know, data centers sort of have a bad reputation when it comes to power, being very power hungry, and obviously when you're talking about megawatt kind of deployments, they do use a lot of power.

Speaker 2:

I mean, you can't really argue against that.

Speaker 1:

No, but if you do it like this, so if you actually use them to make the grid smarter, then they have sort of an extra reason to exist. So putting a positive spin on it, basically.

Speaker 2:

Absolutely. Let's see the glass as half full, right? Yeah, because I agree. We have been lucky today because one of the panelists was Stefan, and we know that the Nordics are very far ahead in terms of decarbonization and sustainability.

Speaker 1:

I know last year you also had someone from a data center in Sweden, when we talked about the TimberMod.

Speaker 2:

That was awesome.

Speaker 1:

So it sounds like you do have customers outside of Sweden, right?

Speaker 2:

They still need to get there when it comes to decarbonization and sustainability, but they certainly can. Basically, they also want to provide a contribution, so some additional services to the municipality. Stefan was mentioning district heating. He said that, for their data center, the municipality is actually asking them to build more data centers, because they need them: for district heating, also for the energy use and, of course, for the data processing.

Speaker 1:

Yeah. Obviously, I come from the Netherlands and I have district heating at home as well, but there's no room for a data center next to where I live, right? So obviously Sweden has lots of room for extra data centers and they have thermal, hydroelectric stuff.

Speaker 2:

They have a lot of power.

Speaker 1:

And it's relatively cold, so it's easier to cool. So it's not something that you can do everywhere. But I do think, at least from my perspective, and I'm curious to hear what you think about this, that saying no to new data centers, like some countries are doing, and I think in the Netherlands around Amsterdam there's also sort of a... That may make sense in the short run, but in the long run the workloads are not going to go away, right? Especially with AI coming up, they're only going to get more and more intensive. So how do you get the compute for your workloads, as an organization or as whatever, if you're not allowed to build? You're going to do it yourself and build many smaller ones, probably. That doesn't solve anything, does it?

Speaker 2:

No, I fully agree. We know that these data centers are absorbing power, and that the transition to alternative energies, to renewables, is probably proceeding more slowly than we thought, as is the adoption of hydrogen, for instance, for new power. But it's also true that we will need data, and now more and more with AI. Many applications are going to be using AI, and it's not just data centers, it's also a lot of enterprises, CNI, so, Carlos, a lot of different customers that were not engaged there. So we have been discussing power availability a lot. We know that in the US it's probably much easier to get power compared to some areas of Europe where regulations are really demanding.

Speaker 1:

And the US is probably also the region where you see the demand for the highest-capacity units, right? So the 2.5 megawatts and up are usually for the US, right?

Speaker 2:

Not necessarily. For the power, we define the power block. Yeah, over there they are probably already using a 2.5 megawatt power block. Here we are seeing more, 3 megawatt in some cases, but 2 megawatt is most common. We are following this demand, of course. But yes, in the US it's easier to get power for new sites, and there is also a lot more space. Recently, in Ireland, the grid operator has been saying: no, I won't give you any more power. But at the same time there's the use of a lot of gas to buy.

Speaker 1:

Let's just move over to what this means for customers, other than maybe rethinking their power needs and all that. I would imagine that, as a customer, you also need to do some thinking beforehand, before you actually go for this new powering-AI kind of phase. So what are the things that they need to think about?

Speaker 2:

One of the topics that we covered today was the speed of deployment, because it's possible that most data center owners are now designing equipment that will probably see the light in two or three years, and we heard Francesco mention that. This means that they will need to get all the regulatory side and everything in place before they can start doing anything else. But once they have that, they will need to have the equipment fast enough, and for that we are trying to make sure that we, as Vertiv, can also support the fast deployment of the infrastructure through integrated services and integrated equipment. We saw the integration of UPS with switchgear, with power modules. So on our side we will support this large deployment in the short term by being quick to react.

Speaker 1:

So it's going to be more integrated, more converged maybe, for want of a better word.

Speaker 2:

There are two reasons for that. One is the speed of deployment, and the second is densification. We have been hearing that word a lot today, sometimes massification, densification, basically meaning the same as what is happening at the rack level: more equipment in less space, to simplify it a lot. And that is also something that we want to do with this integrated equipment.

Speaker 1:

Well, on the one hand, that makes sense. But on the other hand, if data centers are already densifying, if that's a word, in their server area, so with the GPUs and all that stuff, is it really necessary to also densify the power? They're going to have more room, because they may even just close off half the data center, since everything is so much denser. Then it would also make sense to say: look, let's just do it the old-fashioned way, because we have room to spare now anyway.

Speaker 2:

Yeah, absolutely. That's a very good argument, also because liquid cooling will probably save even more space compared to traditional cooling. That's why, for existing infrastructures, we don't think that this integration and densification will be priority number one. But for new ones, where they probably want to have much more power in the new space, so if you go greenfield, that is where densifying comes in.

Speaker 1:

So that's something that they need to think of. But do they need to prepare their environment for this new type of power solution?

Speaker 2:

Yeah, so we are seeing it ourselves, even if we are not a data center. We supply equipment and electrical infrastructure to data centers. But if you have seen the customer witness test area: we had to expand it a couple of years ago, adding megawatts on top of the existing capacity, and it was very difficult to get that additional power from the grid. So I don't even want to imagine what they may face if they want to start a new campus of 16 or 20 megawatts.

Speaker 2:

So they will for sure need to check that part and probably build in areas where this power is available. But it's also true that we are seeing some cities and countries that are slowing down a little bit. We know that in Ireland, for instance, there are not so many data centers, but we have heard that in Milano, in Italy, they are going to build a lot of data centers. So I guess that's why they are expanding a lot. We have heard about some pipeline, or extended pipeline, from their side as well. Also, Philip was mentioning today that he has been in Asia, so there will be more global customers that are going to be here, and Carlo was also mentioning others.

Speaker 1:

But especially in Europe. In some areas of Europe, this can be quite challenging to prepare for right.

Speaker 2:

It is, it is.

Speaker 1:

Yeah, but on the other hand, we need to do it. But there's also something to be said for, look, most AI workloads aren't necessarily very latency-heavy anyway, so it doesn't really matter where you run them, rules and regulations aside, obviously, which have a big impact as well. So maybe running them somewhere where they can still build data centers makes sense.

Speaker 2:

Yes, because, you know, now we are preparing for the worst. Also, when we speak about workloads, we talk about overload, variability. But we will see in the real installations.

Speaker 1:

From a practical perspective, I've always wondered: when you power down for a test, for example, and then you have to pick it up with your UPSs, does it have an impact on how you do that when you're supporting AI workloads? Especially because I've heard that when powering up a big data center with massive AI workloads and huge numbers of GPUs, you can't just flick a switch and power up everything at the same time, right? So is that something that you need to address in your solutions as well?

Speaker 1:

I've heard the analogy that it's like starting a jet engine, so a little bit at a time.

Speaker 2:

It's more or less like that. We heard about some fuel cells that were running in modules, and they also had to start up one at a time. I guess the main reason for that, based on what we have heard, is this. First, it depends on how many racks you have and how they are synchronized. Again, it's all about synchronization. But of course the load profile, as we are seeing it, may have a peak as soon as they start, and this peak can get up to 150%. This peak may be managed by the PSU or even by the racks themselves, but it may not. That's why, to avoid all these racks drawing the same peak of power at the same time, I agree that having a kind of offset between the starting of all of them may help.

Speaker 1:

Yeah, but that's not something that you can integrate in your products, right? Because that's something that the data centers need to arrange themselves.

Speaker 2:

They will need to, basically. And back to what Carlo was mentioning, they may try to work on the firmware side, like they've done so far, in order to optimize the synchronization between them.
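
Editor's note: the effect of offsetting rack start-ups, as discussed above, can be illustrated with a few lines of Python. This is a minimal sketch with made-up numbers, not a model of any real rack or PSU: the function name, the per-second profile, and the assumption that each rack holds its last (steady-state) value after start-up are all hypothetical.

```python
def aggregate_peak(rack_profile_kw, num_racks, offset_s):
    """Peak of the summed load when each rack starts offset_s seconds
    after the previous one.

    rack_profile_kw is one rack's start-up profile, one sample per second;
    after the profile ends, the rack is assumed to stay at its last value.
    """
    length = len(rack_profile_kw) + offset_s * (num_racks - 1)
    total = [0.0] * length
    for rack in range(num_racks):
        start = rack * offset_s
        for t in range(start, length):
            # Index into the profile, clamped to the steady-state value.
            idx = min(t - start, len(rack_profile_kw) - 1)
            total[t] += rack_profile_kw[idx]
    return max(total)


# Hypothetical rack: 75 kW for two seconds (150% peak), then 50 kW steady.
startup = [75.0, 75.0, 50.0]
print(aggregate_peak(startup, num_racks=4, offset_s=0))  # all racks at once
print(aggregate_peak(startup, num_racks=4, offset_s=2))  # staggered start
```

With these illustrative numbers, starting all four racks together stacks the four peaks on top of each other, while a two-second offset means only one rack is at its start-up peak at any time.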

Speaker 1:

But you can work together on this, right?

Speaker 2:

Absolutely. That's why we are partnering with them.

Speaker 1:

Because you have a big history in power and NVIDIA doesn't necessarily have that, right? So they can learn from your expertise.

Speaker 2:

That's why we are partnering, also because they asked us to indicate how we can support them to manage these workloads, and support means to look at the whole piece of equipment and see what happens, and this is one of the aspects that we are doing.

Speaker 1:

Right, then the last point, and then I think we have to go and visit the Leonardo supercomputer, which I'm quite excited about. It's always nice to visit one of those places. Renewables: I heard some things about it today, but it's not necessarily a big topic at the moment, because obviously the focus is elsewhere. But what's the role of renewables in all of this? Are renewables stable enough to actually play a part in this sort of UPS landscape?

Speaker 2:

We have been debating that. You're right. A couple of years ago that was the main focus. It still is a focus, of course, but now it has been surpassed by AI, and it wasn't the focus of today.

Speaker 1:

That's fine, but they can certainly contribute.

Speaker 2:

I really wish, like everyone in the room today, that they may contribute more. But in the end we will need to accept that you have to add equipment. For instance, if you want to add more solar, you will need to add some energy storage to manage that additional energy. So the problem is the space: you will need to have panels, you need to have a battery energy storage system, you probably need to have some wind. That goes a little bit against the demand for densification and so on, but they don't necessarily need to be in the same area as the data center.

Speaker 2:

We have also spoken about "bring your own power", meaning there should be power at the site. This power generation does not necessarily need to be on the same site, but it should be close enough to it. So my opinion, and that was confirmed by the panelists today, is that yes, they will play a role, a very important role. As for the stability, that is where the grid support that we are promoting comes in, because then you get.

Speaker 1:

Because I heard from, I think, the guy from Sweden, Stefan. He talks about the requirements. I think it was between 49.9 and 50.1.

Speaker 2:

Absolutely. So they can intervene, and they can support the grid in a very short time. And this is something that we can do.

Speaker 1:

But then, obviously, and to your point, renewable or solar is not really fit for that purpose, right? Not yet at least.

Speaker 2:

It is indirect, because the more renewables get into the grid, the more the grid frequency is affected, since they are intermittent. When they are down, the grid is in under-frequency, and that is where the support comes in. But of course that's something that we are seeing now only in a few areas of Europe. It will need to be extended everywhere else in the world.
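
Editor's note: the frequency band mentioned in the conversation (between 49.9 and 50.1, taken here to be hertz on a 50 Hz grid) suggests a simple decision rule for when stored energy would support the grid. The Python sketch below is only an illustration of that idea. The function name and the action labels are hypothetical, and real fast frequency response schemes are more involved than a three-way threshold.

```python
def grid_support_action(frequency_hz, low_hz=49.9, high_hz=50.1):
    """Pick an illustrative action from the measured grid frequency.

    Thresholds default to the 49.9 to 50.1 band quoted in the conversation.
    The action labels are hypothetical, not a vendor or grid-code API.
    """
    if frequency_hz < low_hz:
        # Under-frequency: the grid is short of generation, so stored
        # energy is fed in to support it.
        return "discharge"
    if frequency_hz > high_hz:
        # Over-frequency: there is surplus generation to absorb.
        return "charge"
    return "idle"  # inside the band, no intervention


for measured in (49.85, 50.0, 50.15):
    print(measured, grid_support_action(measured))
```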

Speaker 1:

All right, I think we're out of time. We have to visit the supercomputer, obviously, so thank you very much. I think it was a very interesting conversation. Thank you very much for joining.

Speaker 2:

Thank you very much.