Techzine Talks on Tour
Techzine Talks on Tour is a podcast series recorded on location at the events Coen and Sander attend all over the world. A spin-off of the successful Dutch series Techzine Talks, this new English series aims to reach new audiences.
Each episode is an approximately 30-minute discussion that Coen or Sander has with a high-level executive of a technology company. The episodes are single-take affairs, and we hardly edit them afterwards, apart from polishing up the audio a bit of course. This way, you get an honest, open discussion where everyone speaks their mind on the topic at hand.
These topics vary greatly, as Coen and Sander attend a total of 50 to 60 events each year, ranging from open-source events like KubeCon to events hosted by Cisco, IBM, Salesforce and ServiceNow, to name only a few. With a lot of experience in many walks of IT life, Coen and Sander always manage to produce an engaging, in-depth discussion on general trends, but also on technology itself.
So follow Techzine Talks on Tour and stay in the know. We might just tell you a thing or two you didn't know yet, but which might be very important for your next project or for your organization in general. Stay tuned and follow Techzine Talks on Tour.
NetApp solves new challenges to data infrastructure at the platform level
NetApp is all in on what it calls its Intelligent Data Infrastructure. More than just a fancy-sounding name, it includes various fundamental components that are necessary to support the latest AI workloads and to combat the data silos that continue to plague organizations. Creating a robust platform approach is the way forward for all data and storage vendors. What makes NetApp's platform special? And why should NetApp be in a good position to succeed in this market? Hear all about it in this new episode of Techzine Talks on Tour.
For this episode, we sit down with Krish Vitaldevara. Krish is the SVP and GM for Platform at NetApp. The concept of the platform plays a crucial role in NetApp's approach to data and storage infrastructure. Without it, much of what is needed to stay relevant as a storage vendor is simply impossible. On top of that, NetApp also recently announced that its architecture will become fully disaggregated. That is a major and fundamental change for the company that also impacts its platform story.
Besides the changes in the architecture, AI also has a big effect on data management and storage. Especially when it comes to vectorizing data and the database bloat that comes with it, organizations can run into some issues. It is up to vendors like NetApp to figure out how to help companies limit vectorized data growth as much as possible.
All in all, there is enough to talk about with Krish, that is for sure. Together with his team, he is responsible for the development of the platform at NetApp, and the platform is where virtually all innovations converge. Listen to the conversation we had with him during the most recent edition of NetApp Insight now.
Welcome to this new episode of Techzine Talks on Tour. I am at NetApp Insight in Las Vegas and I'm here with Krish Vitaldevara. You are the SVP and General Manager for Platform at NetApp, so maybe lots of listeners will have this association: when they hear Platform and NetApp, they immediately think ONTAP and Data Fabric, maybe from a long time ago, ten years ago. I mean, you're not allowed to say that anymore nowadays, I would imagine. So that's not what platform means today, right?
Speaker 2:No, I think from NetApp's perspective, we consider NetApp ONTAP as the core platform, because all of our technology is built into ONTAP and everything we do actually starts from there. So once the ONTAP platform is built, you do your engineered hardware using the platform. All our hybrid cloud offerings are built on top of the same ONTAP goodness. The same thing happens when you go to ONTAP Select, which is our software-defined storage platform. So the data services are built on ONTAP.
Speaker 2:The way we think about it is NetApp ONTAP enables us to get the offer architectures right, and that's what we mean. Hybrid cloud is a very good example of that. It helps us actually build consumption models right. Keystone is a very good example of that. It helps us get workloads right. AI is a very good example of that. We kind of want to take a very differentiated view of the role NetApp ONTAP plays, as opposed to some of the general platform definitions. Most companies don't have a technology stack that's built from the bottom up, because what ends up happening is you stitch things together at the top and you start calling that a platform. But in our context, we take platform from the truest engineering definition of a technology platform.
Speaker 1:It's more of a sort of a vertical platform than a horizontal platform.
Speaker 2:It's horizontal, on which multiple workloads run.
Speaker 1:Yeah, but I mean it's very vertical in the sense that it's very deep. So it's not, as you said, just stitching something on top of it.
Speaker 2:Absolutely, because you have the core storage OS at the bottom, you have the data engine on top of that, you have the service delivery engine on top of that, and then you get to the unified control plane. So as a platform, you go both broad and pretty deep, yeah.
Speaker 1:So what does that mean for customers? Why should they know about this? What makes that different from, say, the rest of the industry?
Speaker 2:Yeah, actually, the approach we take for the platform is incredibly important for customers, because this is what helps them future-proof their investment, because everything is built on the same platform. When you go to the cloud using us, there is no silo between on-prem and the cloud. The same things that work on-prem work in the cloud, and we can build technology like FlexCache, so you can burst into the cloud. In the context of AI, it becomes really important because your GPUs are in the cloud.
Speaker 1:So just to round off this part about the platform: it's partly the hardware, obviously it's the software, it's the cloud, it's all the services that you...
Speaker 2:We kind of leave the hardware out of it, because some of the hardware platforms are tied to the consumption models and offers. Okay, but the core software stack is what we're talking about.
Speaker 1:Yeah, okay, that's the platform. That's good to know. And you already mentioned the word silo, which always triggers me, because I have sort of a pet peeve when it comes to silos. I mean, I've been doing this job for I don't know how long already, and we've been talking about breaking down silos for at least 15 years, I bet. I think NetApp has been doing unified storage for about 20 years already, yes, and I don't really see them disappearing. Is it just one long marketing kind of stunt, this breaking down of silos, or how do you see that? Do you see that it's possible to actually break down all those silos?
Speaker 2:I think the way to think about it is: if you are aware of silos, you will be more explicit when you create one. You might be trying to get a proof of concept out and you decided to start something small. You don't want to have all the constraints. That's how the silos usually get created, and then they become real and then they become large, and now you have to deal with two different environments.
Speaker 1:Yeah, because then if you want to break down that silo, you have to more or less physically break it down and move everything towards somewhere else. Right?
Speaker 2:And once you get past the point where the silo is big enough... The way to think about it when I talk about breaking down silos, or about how data silos are a big inhibitor in the context of AI, is that you can have silos because data gravity kind of ends up creating silos, and some of these silos are created over time, like how you were talking about. Right, they're not created today, and nobody wakes up thinking I want to create a silo. Silos come in because that's how teams operate, that's how people think. That's the modus operandi, that's how most people approach solving a technology problem.
Speaker 2:But you don't have to physically break a silo down. You can stitch things together across silos in a way that the friction that's associated with things like moving data between the two silos or two environments disappears. In our context, that's still considered breaking down a silo. It could be one in Amsterdam and one in New York, or they could still be two different data centers, but if they can speak the same language, and you have this global, unified namespace that looks at both of them and stitches them together at a higher level, then we consider that as actually solving the data silo problem, because all of a sudden, for the applications on top, it doesn't look like two silos. In the context of AI, that's what it's going to come down to.
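As a toy illustration of that idea of stitching two locations together at a higher level, the sketch below presents two sites as one logical file listing, so an application sees a single namespace instead of two silos. The site names, paths and the class itself are invented for the example; this is not how ONTAP implements its global namespace.

    # Toy illustration of a unified namespace over two sites. Names and paths
    # are made up; this is not an actual ONTAP interface.
    from typing import Dict, List

    class UnifiedNamespace:
        def __init__(self, sites: Dict[str, List[str]]):
            self.sites = sites  # site name -> paths physically stored there

        def listing(self) -> Dict[str, str]:
            """One logical view: path -> site that actually holds the data."""
            return {path: site for site, paths in self.sites.items() for path in paths}

    if __name__ == "__main__":
        ns = UnifiedNamespace({
            "amsterdam": ["/data/sensors/2024.parquet"],
            "new-york":  ["/data/sensors/2025.parquet"],
        })
        print(ns.listing())  # applications query one namespace, not two silos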
Speaker 1:On the one hand, AI can be a reason to create new silos at first, but I think maybe on the other hand, it can also be a catalyst to break them down, because for AI to be successful, you need to ingest or do something with a lot of unstructured data, which makes you think about where that is and where you want it to be, right?
Speaker 2:Yeah, actually we see both ends right. Like we see in the context of training, some silos are getting created because you are taking these GPUs and you're creating a specialized training environment or a specialized inferencing environment and all of a sudden now you're figuring out how to move the data, which data moves, into this particular environment. So it's creating silos that way, but it's also helping you break it down, because people are quickly realizing hey, my data is spread across these 10 different things and I need all of them to power this one AI model. So all of a sudden they're looking for ways to actually build structure over that unstructured data set or build that unified namespace. And that's kind of what I mean. I haven't really changed the physical attributes of where the silos are, but just by building that unified namespace I take that silo away from the context of AI because it can look at the data across, know which data to use where and train on all of it.
Speaker 1:There is one thing that I've been running into when it comes to unstructured data that you use for AI: you have to vectorize it, right, in order to make it suitable to be queried. That's right, but that means, if I'm not mistaken, that you can get sort of a vector DB bloat, I think it's called.
Speaker 2:Yes.
Speaker 1:That's what we're calling it, for sure? Well, I mean, I'm not sure, but that's what I heard you call it yesterday. Yes, so that could be up to 10x on top of the original kind of size.
Speaker 2:That's the average 8 to 10x, and we saw some customers with 24x.
Speaker 1:Yeah, but that's probably depending on how many dimensions you want. That doesn't sound like a very good plan if you want to be a bit frugal with your storage, or if you don't want to have too much storage. We tend to agree. So is it possible to solve that problem?
Speaker 2:Oh yeah, so let's talk a little bit about how the vector bloat gets created, right? So you have a file, you can take a text file, you can take an image file as an example, and you want to say, hey, I want to go create my embeddings and I want to create them with a set number of dimensions. So these could be 750-plus dimensions, 1,500 dimensions, and there are some customers who want 60,000 dimensions.
Speaker 1:So it all depends on what they think... What's the reason to pick a certain dimension? Is that ignorance, sometimes, because they don't know that they can actually use fewer dimensions, or? No, I would actually say it depends on the problem that you're solving for.
Speaker 2:And just similar to silos, I think some people tend to assume larger is better, even though that particular application might not need that number of dimensions. And the reason why you're doing it is so that at the time of inferencing, when the prompt comes in, your nearest-neighbor search is faster and more efficient. Right, but in some ways, by creating all these dimensions, you're also making the problem harder. Yeah, because if the vector bloat actually gets to a size where you cannot keep it in memory, now all of a sudden you have other problems outside of just the raw space usage, right?
Speaker 2:Your inferencing performance gets lower, latency gets added. So how do we think about that one? We don't think that you need to just deal with the 8 to 10x bloat that we are seeing on average. We think that by actually creating those embeddings and by bringing the vector DB natively into ONTAP, we can take the investments that we have been doing for a long time in compression, compaction and global deduplication to be able to give customers about 5x the efficiency on the data stored in these vector databases.
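To make those multiples a bit more tangible, here is a rough back-of-the-envelope sketch in Python. The corpus size, chunk size, overlap and index overhead are purely hypothetical assumptions for illustration, not NetApp figures; the point is simply that chunk count times dimensions times bytes per value quickly adds up to several times the original data.

    # Rough, hypothetical estimate of vector-store bloat relative to the source data.
    # Every number below is an illustrative assumption, not a NetApp figure.

    def vector_bloat_estimate(corpus_gb: float,
                              chunk_chars: int = 1_000,      # characters per chunk
                              overlap: float = 0.2,          # 20% chunk overlap
                              dimensions: int = 1_536,       # embedding width
                              bytes_per_value: int = 4,      # float32
                              index_overhead: float = 0.5):  # index + metadata on top
        corpus_bytes = corpus_gb * 1024**3
        # Overlapping chunks mean the same text is embedded more than once.
        chunks = corpus_bytes / (chunk_chars * (1 - overlap))
        embedding_bytes = chunks * dimensions * bytes_per_value
        return embedding_bytes * (1 + index_overhead) / corpus_bytes

    if __name__ == "__main__":
        for dims in (768, 1_536, 4_096):
            print(f"{dims} dims -> ~{vector_bloat_estimate(1.0, dimensions=dims):.0f}x bloat")

With these made-up parameters the bloat lands around 6x at 768 dimensions, 12x at 1,536 and over 30x at 4,096, which is roughly the shape of the 8-10x average and 24x outliers mentioned in the conversation.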
Speaker 1:What is it that you do then? Because what happens is, if you convert unstructured data into queryable structured data, which is basically what you're doing with vectorized data, white spaces are added and all sorts of stuff is happening. Yes, can you just then remove all these white spaces, or what's happening behind the scenes?
Speaker 2:scene. You're kind of chunking let's take an image file. You're kind of chunking it, you know, into whatever the number of dimensions are, and every time you chunk a file into dimensions, you are carrying some extra information to be able to stitch it together, right? So there is a lot of deduplication of information between each of these embeddings that you create and the more the dimensions, the more duplication happens with respect to the data that you carry with it right overhead. Basically it's overhead.
Speaker 2:White space is definitely part of it, as are the indexes that go with it, because the more dimensions there are, the larger your indexes have to be to do these searches very efficiently. So we bring everything we have been doing from a storage efficiency perspective, but a lot of the intelligence is not only in the storage efficiency part. It's also in how you write the data, how you deduplicate some of these chunks. So we think this one is a real problem. Every time we talk to customers, the number one thing they complain about is the vector bloat, because many storage infrastructure admins walk into this not expecting the bloat.
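Purely as an illustration of that duplicated-content point, the toy sketch below splits text into overlapping fixed-size windows and measures how much more data is stored than the original; the window and stride sizes are invented for the example and have nothing to do with ONTAP's actual chunking.

    # Toy illustration: overlapping chunks carry duplicated content, which is the
    # kind of redundancy that storage-side deduplication and compression can reclaim.

    def chunk_with_overlap(text: str, window: int = 100, stride: int = 80):
        """Split text into fixed-size windows; stride < window means overlap."""
        return [text[i:i + window] for i in range(0, max(len(text) - window, 0) + 1, stride)]

    def duplication_ratio(text: str, window: int = 100, stride: int = 80) -> float:
        chunks = chunk_with_overlap(text, window, stride)
        stored = sum(len(c) for c in chunks)   # bytes written as chunk payloads
        return stored / max(len(text), 1)      # anything above 1.0 is duplicated content

    if __name__ == "__main__":
        doc = "example text " * 500
        print(f"stored/original: {duplication_ratio(doc):.2f}x")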
Speaker 1:Yeah, because it's not a very well-publicized kind of thing, even though it's very real. And it's not like one customer has lots of trouble with it and another one hasn't. Everybody has the same issue, right? Do you think this could be solved earlier in the process, so with the vectorizing itself? Can that be more efficient? I mean, that's not what you do, all right, but do you think this is a permanent situation where you will always have to think about trying to decrease the vector bloat, or is it something...
Speaker 2:Actually, that's a very good question, right? The way we think about it is by actually bringing the embedding models closer to the storage engine, and that's what we mean by the NetApp AI Data Engine. So we bring the embedding models, we bring the vector database, so now we get to actually control both how the embedding output is written onto the file and how the vector database is constructed. So all of a sudden, we can be significantly more efficient than normal.
Speaker 1:So there is sort of light at the end of the tunnel. Maybe we don't need to do this as radically for a long time?
Speaker 2:Absolutely. I think more and more customers will actually want to solve this one, and I think storage is one thing. Right now, maybe not enough customers are conscious about storage costs, because compared to the overall cost of what they're spending on these training environments, it's not large. But it gets to the point where your latency gets worse and your searches cannot be efficient. Now, all of a sudden, the bloat becomes real.
Speaker 1:Yeah, so just moving along a little bit to another topic that I found quite interesting when I heard you mention it on stage: the disaggregated architecture that you are creating, or that you have created. I would imagine you have it running somewhere already? We definitely have it running in our labs.
Speaker 1:I also attach to that the global metadata namespace that you have also announced. Because it's very early days, it doesn't really get a lot of attention, but it seems to me, from a relative outsider's perspective, that it's a very fundamental change for NetApp, right?
Speaker 2:We think so. We think it's one of the largest innovations that we are doing in the context of ONTAP to be able to bring that disaggregated back-end storage architecture. We think it's incredibly important because, as I was speaking this morning at the keynote, many of the competitors actually prefer to take a different approach to this problem by kind of bifurcating their product lines. Right, so they'll build one for random IO, they'll build one for sequential IO, and we have always optimized for a unified approach and we continue to invest in ways where we can keep the overall approach unified. And a big part of it is how do you scale your compute and capacity independently? Because if you cannot scale that, then that forces you to kind of think about different workloads differently, and for us, disaggregation was the way to do that. It also lets you fully utilize the backend storage shelf.
Speaker 1:So just for people who don't really understand what disaggregation means: in traditional systems, the compute and storage are tightly linked.
Speaker 2:Tightly. If you want to add more capacity, what you end up doing is also adding more compute capacity, because they go as a unit.
Speaker 1:Yeah, so you may end up with way more compute than you need, and you pay for that as well, or vice versa. So the point of a disaggregated architecture is that you can use all your storage as efficiently as possible.
Speaker 2:Yeah, and you get what we call price performance, right? This is how you scale, but you scale in a very cost-effective way. Not only that, you don't have to scale in HA pairs. Every time you add more storage, you're most likely adding two nodes, not one node, because you're adding an HA pair. But in a disaggregated architecture you can get to the same amount of scale with n-1 or n-2 nodes. So all of a sudden, if you have a 16-node cluster, you're taking full advantage of about 14 of those nodes' performance capacity, right? In the traditional one, your load balancing gets to about 50-50.
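A quick, purely illustrative calculation of that point, with the headroom factors as assumptions rather than NetApp-published numbers: under HA-pair scaling roughly half the cluster is held back as failover headroom, while a disaggregated layout only needs to reserve a node or two cluster-wide.

    # Illustrative comparison of usable performance in a 16-node cluster.
    # The headroom assumptions are made up to mirror the conversation, not measured.

    def usable_fraction_ha_pairs() -> float:
        # Each HA pair keeps roughly half its capacity free to absorb a partner failover.
        return 0.5

    def usable_fraction_disaggregated(nodes: int, spares: int = 2) -> float:
        # Compute and capacity scale independently; only a couple of nodes' worth
        # of headroom is reserved across the whole cluster.
        return (nodes - spares) / nodes

    if __name__ == "__main__":
        nodes = 16
        print(f"HA pairs:      ~{usable_fraction_ha_pairs() * nodes:.0f} of {nodes} nodes usable")
        print(f"Disaggregated: ~{usable_fraction_disaggregated(nodes) * nodes:.0f} of {nodes} nodes usable")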
Speaker 1:Yeah, and what does the global metadata namespace have to do with this?
Speaker 2:Oh, okay, that's actually a very good question.
Speaker 2:Once you disaggregate compute and storage, all of a sudden you can do things that you have not traditionally done in the cluster, because I can bring more compute-intensive things into the cluster. One of those compute-intensive things could be building that structure over unstructured data. Another example is that you could bring in classification, so you could bring in the docs, run some intelligence on them and classify them in whichever way you want, or you can anonymize things. So all of a sudden, given that you can bring compute, it's going to be game-changing, because you can bring all the intelligent services closer to the storage, right? The other thing we would talk about from a trend perspective is agentic AI. If you're building multiple workflows that are each answering a very specific question or handling a very specific part of your AI workflow, you can bring as many of these as possible next to the same storage, because they're all operating on the same data set, but they all need their own compute and their own set of resources to scale.
Speaker 1:This is quite a... I mean, I've been following this space for quite some time and I've heard lots of companies talk about global namespaces. There's lots of movement in that area. Is yours different from the others? You hear the likes of Hammerspace and VAST talk about global namespaces and all that stuff. Is this another one of those, or is it different?
Speaker 2:But I think, if you treat it just as a namespace, a namespace is a namespace right.
Speaker 1:Well, what's in a namespace? Yeah, so.
Speaker 2:But I think where we kind of differentiate is in how you use the namespace to do some of the things that we were talking about. If you're creating these vector embeddings we were talking about before and you want to create them only on a set of changes, then having the namespace be updated instantaneously, because we are doing this in the context of actually writing the data into the storage back end, means that all of a sudden you can operate only on the changes, so you don't have to recreate your embeddings for the entire storage. You take the changes to the namespace and you use that to drive the AI workflows. So I think the power of the namespace is actually not the namespace itself, but how you integrate it with the rest of the data engine to be able to drive these AI workflows.
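A minimal sketch of that change-driven idea, assuming a change feed keyed by file path and content hash; the embed() callable and the dictionary shapes are hypothetical placeholders, not an actual NetApp or ONTAP interface.

    # Minimal sketch: re-embed only files whose content changed since the last run.
    # The change-tracking shape and embed() are hypothetical placeholders.
    import hashlib
    from typing import Callable, Dict, List

    def incremental_embed(files: Dict[str, bytes],
                          seen_hashes: Dict[str, str],
                          embed: Callable[[bytes], List[float]]) -> Dict[str, List[float]]:
        updated = {}
        for path, data in files.items():
            digest = hashlib.sha256(data).hexdigest()
            if seen_hashes.get(path) != digest:   # new or modified file only
                updated[path] = embed(data)
                seen_hashes[path] = digest
        return updated                            # embeddings to upsert into the vector DB

    if __name__ == "__main__":
        fake_embed = lambda data: [float(len(data))]   # stand-in for a real embedding model
        state: Dict[str, str] = {}
        print(incremental_embed({"/vol/a.txt": b"v1"}, state, fake_embed))
        print(incremental_embed({"/vol/a.txt": b"v1"}, state, fake_embed))  # unchanged -> {}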
Speaker 1:And what's the differentiating aspect of your deep cloud integration when it comes to this global namespace? Is that also something that you would highlight?
Speaker 2:For one, I would actually say that even without the namespace, what we have from our cloud integrations is incredibly unique.
Speaker 1:You've been talking about that for years already.
Speaker 2:Absolutely right. The namespace now, all of a sudden, becomes even more important in the context of cloud, because a lot of the time, the reason why AI is driving hybrid cloud workflows is this delta in where the GPUs are available, right? So the reason why even companies that have not considered cloud before are considering cloud is so that they can just burst their on-prem data into the cloud, do the training, do the embedding, do whatever requires that GPU compute, do those operations and then bring the models back or bring the embeddings back. Now, all of a sudden, if you can do that and treat it all as a single namespace, that's going to be very powerful.
Speaker 2:So for us, yeah, I think this is the true hybrid cloud workflow and we are very excited, just in general.
Speaker 1:Because you're approaching this from a platform play, and this is not necessarily about namespaces or disaggregated architectures or whatever: do you constrain yourself in product development as well, because you want to do this from a unified kind of approach? Are there things you couldn't do, but would have liked to have done, because they don't fit with the platform or unified platform approach that you're going for?
Speaker 2:I think the good thing is, when we talk about our platform, it's a technology platform, right, which actually means taking that approach helps us, not hurts us. The reason is, if I invent a replication technology, or if I now build a global namespace, I don't have to do it twice. There are no two product lines; I do it once and the innovation gets everywhere. If we do more efficient storage, all of a sudden the storage efficiency helps everything else that's built on top of the platform. So for us, we think this is actually a good thing.
Speaker 1:But is it also possible to...? Because underneath it there are different storage stacks, so different...
Speaker 2:We have a single, singular stack. That's... I know, but I mean, they're different: there's the ASA, there's the AFF, there's the FAS.
Speaker 1:Those are different products. Yes, so if you're building your platform to also partially run on those different platforms, I would imagine that some things you would ideally spread over all those platforms you can't do, because they're maybe not suitable for one of the platforms. That was more the point.
Speaker 2:That was... okay. I think the way we think about that one is that the engineered platforms are on top. So AFF, ASA, and if you're doing the disaggregated ONTAP, they all kind of have the same underlying code base, which actually helps us not have to go redo a lot of work for each of these different platforms.
Speaker 1:Yeah, because, if I remember correctly, ASA is basically built on the same technology as AFF, right? Yeah?
Speaker 2:And the same thing for where we are going with the disaggregated ONTAP, and the same thing for cloud, right? The change in the cloud is because of how each of the hyperscalers actually consumes us. In the context of AWS, FSx for NetApp ONTAP in Amazon is ONTAP, so they get the benefit of all the ONTAP features right away. In the context of Azure and Google Cloud, there is a service delivery engine and we kind of choose which features to expose to the cloud, and we did that based on the customers that are there. But what we are doing as part of the AI push is to also accelerate the delivery of all the ONTAP features in all the clouds, so that they are consistent across all three clouds.
Speaker 1:And maybe to an outsider it could also come across, because you're adding layers more or less, as if it also adds complexity and maybe more management, more things to think about. What would be your reply to that concern?
Speaker 2:Now that's actually a really good question, right? Because I'll give you a perfect example of what you're talking about. In the context of our unified approach, you could argue some of the manageability pieces for block were complicated. That is the reason why, when we actually launched ASA, we spent significant time simplifying these workflows. So when you come in as an ASA customer, you don't get any of the unified complexity. So all of a sudden, the silos actually might make it harder for us as an engineering team, but the consumption model almost always dictates that you cater the simplicity to the consumption model. So from that perspective, maybe there's slightly more work, but what I will tell you is I would rather do that work than have to go build the entire stack for multiple product lines.
Speaker 2:More meaningful work. Yeah, so from an engineering perspective, I think I'm very comfortable with the trade-off.
Speaker 1:Yeah, and I think that leads me to the last question that I have. So for listeners, or for customers or potential customers, what do they need to know about all this? I mean, do they actually need to know anything about this, or is it... About the platform? Yeah, what we've been talking about.
Speaker 2:I think there are two aspects to it. If you're a very tech-savvy customer, you want to understand all the inner workings of ONTAP. This should give you immense confidence that, hey, the same resiliency, same multi-tenancy, same security things that have been hardened over years and years of investment will continue to serve you. You should know that the data management capabilities, all the things we have done with our efficiency and replication engines, you get to take the benefit of them. So even when there is a new consumption model, for example Keystone, it gets to take advantage of all the things that we have worked on. So to an extent, our approach makes us unique, but it also should give you immense confidence that, hey, the technology stack has been hardened over the years. It's ready for prime time. It has been ready for prime time for a long time.
Speaker 1:Yeah, okay Well, thank you. I think I'm out of questions.
Speaker 2:That's actually good news.
Speaker 1:I'm never out of questions, but we need to think about the listeners. We don't want to take more than half an hour of their time.
Speaker 2:I'm happy to come back. I'm actually excited. Thanks for all the questions. They were very insightful.
Speaker 1:I hope listeners also thought that and are a bit wiser about it, because it is actually quite a complex kind of topic, right? I mean, we're way past the time where storage was just storage and that was it, right?
Speaker 2:It's no longer, right? That's why we are an intelligent data infrastructure company. We are no longer a storage company.
Speaker 1:You managed to get that in at the end. That's very good. All right, Krish. Thanks again for joining us. Thanks a lot, Sander, Very nice.