Behind the Scenes Episode 388: StorageGRID 11.8

Welcome to Episode 388, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”


Object storage is gaining more and more of a foothold in an ever-evolving cloud storage world, and NetApp StorageGRID is delivering more functionality to address the growing demands.

Vishnu Vardhan joins us to discuss the latest release of StorageGRID.

For more information:

Finding the Podcast

You can find this week’s episode here:

I’ve also resurrected the YouTube playlist. Now, YouTube has a new podcast feature that uses RSS. Trying it out…

You can find this week’s episode here in the RSS feed:

You can also find the Tech ONTAP Podcast on:

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

Transcription

The following transcript was generated using Descript’s speech to text service and then further edited. As it is AI generated, YMMV.

Tech ONTAP Podcast Episode 388 – StorageGRID 11.8
===

Justin Parisi: This week on the Tech ONTAP Podcast, we dive deep into StorageGRID, as well as the newest release, StorageGRID 11.8.

Podcast Intro/Outro: [intro]

Justin Parisi: Hello and welcome to the Tech ONTAP Podcast. My name is Justin Parisi. I’m here in the basement of my house, and with me today I have a special guest. Well, on the phone, not in my basement. But to talk to us all about StorageGRID. And to do that we have Vishnu Vardhan. So, Vishnu, what do you do here at NetApp, and how do we reach you?

Vishnu Vardhan: Hey Justin, I’m the Director of Product Management for StorageGRID and me and my team decide what we build in StorageGRID next.

Justin Parisi: Alright, excellent. So if we wanted to reach you for feedback or for questions, how do we do that?

Vishnu Vardhan: There’s a DL called ng-StorageGRID-ses. It’s extremely well used by the field. That’s the best way to reach me and my team and all of the other experts at NetApp who understand StorageGRID.

Justin Parisi: Alright, great. So we are here to talk about StorageGRID and we want to kick it off with just the general, what is it? What is StorageGRID? If people have not heard about it, or if they have heard about it and they want a refresher, what is your pitch for StorageGRID?

Vishnu Vardhan: StorageGRID is NetApp’s premier object storage platform. It is the way you get an AWS S3 compatible object store that has an ILM engine that allows you to manage data simply and easily. It allows customers to deploy new age applications that have been built in the cloud, but deploy them on prem, and solve use cases like backup.

I think the simple way to think about it is, if you need object storage, which almost every customer needs today, StorageGRID is the answer.

Justin Parisi: Okay, and object storage, let’s dive into that a little bit more. So, traditionally when you think of shared storage, you think of file based NAS operations such as NFS or SMB, or you think about block based, which is your SAN technologies where you do iSCSI or FCP or NVMe. Object storage, it’s not new, but it’s becoming more and more prevalent because of some of the benefits of object storage. So talk to me about object storage itself, like what is it, how does it work, and why do people want it?

Vishnu Vardhan: I just want to first push back on the assumption that it’s becoming new.

Object storage today is the leading storage in the world, right? As a storage footprint, object storage is the dominant footprint. That dominance is in the cloud and is now shifting on prem. It’s shifting on prem as our customers start to understand how to deploy this on prem and what the use cases are on prem. So it’s really been an application issue, and that is going away rapidly. So I think the question then becomes, why did this become dominant, and how can I understand how to use it better? Why it became dominant is because object storage is the only protocol that lets you do things at a distance. You don’t have to be next to storage for you to store data. NFS, CIFS, all of them have timeouts. All of them have assumptions built deep into the protocols that your application is within the same data center, within a few milliseconds of round-trip time to your data.

And that’s one fundamental assumption built in. The second assumption built in is a read/modify/write workload: that you are going to modify that piece of data. You’re opening a file, you’re editing it, and you’re saving it back, and it overwrites that particular block on the system.

And so these assumptions are inherent in the protocol, and object does not have them. Object never modifies in place; it’s always creating new objects, and it’s working at a distance. And so when you take out these two fundamental assumptions, you enable a lot of things.
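The distinction being described can be sketched in a few lines. This is an illustrative toy model, not StorageGRID code: the object store exposes only whole-object put/get, so a “modify” is really a put of a complete new object, while the file-style path edits bytes in place.

```python
class ObjectStore:
    """Toy flat namespace of blobs; a write always replaces the whole object."""
    def __init__(self):
        self._objects = {}  # key -> bytes

    def put(self, key, data):
        self._objects[key] = data  # a new blob replaces the old one in full

    def get(self, key):
        return self._objects[key]

def edit_file_in_place(buf, offset, patch):
    """File semantics: overwrite a byte range inside existing data."""
    buf[offset:offset + len(patch)] = patch

store = ObjectStore()
store.put("report.txt", b"hello world")
# To change one word, the client uploads the entire new object:
store.put("report.txt", b"hello earth")
print(store.get("report.txt"))  # b'hello earth'

f = bytearray(b"hello world")
edit_file_in_place(f, 6, b"earth")  # read/modify/write, in place
print(bytes(f))  # b'hello earth'
```

Both end up with the same bytes, but the object path never assumed it could touch a sub-range of existing data, which is the property that removes the protocol assumptions discussed above.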

The most simple thing, for example, is QLC. We never overwrite QLC. So in StorageGRID, for example, we will soon have QLC in the platform. We write across a cluster, so we are not bound by an aggregate and do not cause overwrites at an aggregate level. And because we’re striping across a cluster and we never overwrite, we are subject to fewer issues, such as drive wear-out in QLC. So object storage is ideally suited for a QLC platform.

And this is a more current example of how object is really well suited for the new age workloads that we’re starting to see.

Justin Parisi: Another benefit of doing it in this way is the ability to scale out to more immense sizes, right? You can essentially grow much further past what you could with a SAN or NAS, correct?

Vishnu Vardhan: Absolutely. So today, StorageGRID scales to 200 nodes and a 780-petabyte infrastructure in one cluster. And that is us striping data across all those drives in the cluster. There is no logical construct that StorageGRID writes are limited by. We can stripe simultaneously to all of those drives and deliver extreme scale, both in terms of raw capacity and the number of objects, in a single namespace where you can access all of that and get the throughput that you need from all of that.
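A rough sketch of the idea of striping across a large node pool, rather than being bound to one logical construct, might look like the following. The node count, the 4+2 erasure-coding scheme, and the hash-based placement are all hypothetical illustrations, not StorageGRID’s actual placement algorithm:

```python
import hashlib

NODES = [f"node-{i:03d}" for i in range(200)]  # a hypothetical 200-node grid
DATA_FRAGMENTS, PARITY_FRAGMENTS = 4, 2        # hypothetical 4+2 EC scheme

def placement(key, nodes=NODES, width=DATA_FRAGMENTS + PARITY_FRAGMENTS):
    """Deterministically pick `width` distinct nodes for an object's fragments."""
    start = int(hashlib.sha256(key.encode()).hexdigest(), 16) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(width)]

frags = placement("bucket/videos/cam01/2024-01-01.mp4")
print(frags)            # six distinct nodes chosen from the pool
print(len(set(frags)))  # 6
```

The point of the sketch is only that placement is a pure function of the key and the node pool: every object’s pieces can land anywhere in the cluster, so writes parallelize across all drives instead of funneling through one aggregate.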

Justin Parisi: So you mentioned that object storage basically rewrites the whole of whatever you’re trying to access, whereas with NAS and file, you’re modifying existing data in place. How does that impact things like storage efficiencies when you’re dealing with object?

Vishnu Vardhan: No storage efficiencies. But that’s really not a problem, and it’s really not a problem because for most object use cases, the data is either encrypted because it’s secure, or it comes from backup apps that are already deduping data and sending it to us. So you rarely see this use case of, I have this highly compressible data sitting on object, and people are compressing it and getting storage efficiency out of it.

You’re storing videos that are not very easily compressible. You’re storing backup applications. You’re storing encrypted data. And so in those use cases, it is not as valuable for us to be able to do dedupe. Also, the kind of dedupe you would need is very different, because it’s cluster-wide dedupe, so we don’t see that to be a limiting factor. There are some use cases where compression makes sense. StorageGRID supports compression, and you can use that. But typically with compression, there’s also this notion of range reads, where customers want to go in and do a range read. So there’s always a trade-off here between storage efficiency and application compatibility. But the amount of storage efficiency you get from a traditional NFS and CIFS platform is much better, because the workloads are typically different from object.

Justin Parisi: Yeah, and that wasn’t so much to highlight the deficiencies of object storage, but to kind of call out that object storage is not for everything. It’s for certain use cases and certain things that you want to do and there’s always a place for any type of storage here. But what we are seeing is that a lot of applications that were NAS based are moving more towards an object methodology. So why is that? Why are people saying, you know what, I really like file, but object is where I need to be because of X, Y, Z.

Vishnu Vardhan: Yeah. So file has the simplistic notion of, I can browse it, I have a hierarchy, I can easily access it on my laptop and in my dev environments, so developers have always used it. But the cloud has changed that, and with the cloud, developers are always doing object first. Because of that, it is just simpler for an application that is built in the cloud to come on-prem. So it’s really the simplicity of object, which gives up certain file semantics. It gives that up for scale. And because of that, you have had developers adopt more object, and because of that, you’re starting to see the shift on-prem, where now almost every application starts object, and then the question is, does it also support file? I think the world has inverted: for new applications, the question is, do you actually support file? How do you support file? What would you do in this file situation? That’s the new workload. I think there are definitely existing use cases where transactional databases will never run on object, but for most other use cases, it’s really the other way around: does it support file in the new world, right?

Justin Parisi: I would imagine a lot of the move towards object is because it can do native REST, native HTTP calls so you don’t have to deal with, how do I get this file into my web app, right? It’s just already there, right?

Vishnu Vardhan: Right. Effectively, object is a gigantic web server with its own authentication mechanism. That’s exactly what it is. At its heart, it’s a gigantic web server.

Justin Parisi: And it removes that extra hop, that extra detail you need to worry about as an application developer, which I think, with the cloud, like you mentioned, is a couple of the reasons why people are moving more and more towards that.

Vishnu Vardhan: Yep, spot on.

Justin Parisi: Now, as far as performance goes, object is not something that was classically thought of as high performing, but I would imagine StorageGRID has something to say about that. So, tell me about the performance story with StorageGRID.

Vishnu Vardhan: Yeah, so StorageGRID is a highly performant object storage solution, right? There is this notion that object is not performant because it started off in the backup and archive space. But increasingly, applications are asking and demanding the highest levels of performance from object. This is because now the application is sitting in the same data center, and it just wants the highly parallel throughput that object can deliver, and StorageGRID is extremely performant. We had a flash appliance that we released last year with the 6112 that drives significant throughput. We’ll be launching QLC soon, and that will continue on that journey. So StorageGRID is extremely performant, and we’ll continue to become more performant. It’s actually the primary focus for us. As applications move more and more on-prem, there is this need for object stores to become more and more performant. So there’s always this trade-off of how performant you need to be today, and we anticipate that in the next three to five years, performance will be a key differentiator for object storage. And so StorageGRID will continue getting better.

Justin Parisi: So you mentioned that you’re adding QLC support, and I wouldn’t say it flies in the face of performance, but it definitely makes you think about more economics than performance, and I would imagine that’s because you want to try to blend cost with performance and with normal flash, it’s probably way too much for StorageGRID or for object. Is that pretty accurate?

Vishnu Vardhan: No, QLC has a drive wear problem. It does not have a performance problem. I think the reason QLC is viewed negatively is because of drive wear. With object storage, we don’t see a lot of drive wear. The only way to distinguish QLC and TLC for us is your workload. If you have a lot of deletes and you’re going to use the StorageGRID system as a landing pad where you’re constantly rewriting and deleting a lot of data on an hourly basis, maybe then you want to think about TLC. But otherwise, there’s no reason why QLC does not deliver the performance you need from the object storage world. So it is not a poor cousin to TLC. It’s a horses-for-courses conversation.

Justin Parisi: Okay, so basically it’s a little less expensive because of the drive wear aspect of it rather than performance.

Vishnu Vardhan: From an industry perspective, yes. From an object storage perspective, the performance is the same, and because industry prices are lower, we’ll just pass those benefits on to customers.

Justin Parisi: Yeah, because the nature of object storage doesn’t require the same type of flash as something like file would.

Vishnu Vardhan: That’s right. So this is what I find extremely interesting. We initially had this fear with QLC: how would drive wear work out with StorageGRID? And so we did all of the work, and it turns out it’s not a problem. But the other thing that we have found is that we are able to deliver a very highly performant, extremely competitive price-performance solution.

And if you missed it, we have announced 30TB QLC drives with StorageGRID. That, we think, will increasingly replace HDDs over time. So definitely not something that’s going to happen today, but we expect QLC over the next three to five years to be replacing HDDs. And this StorageGRID platform is poised to be the basis on which that happens.

So very excited about QLC, very excited for customers in terms of the density it’s giving and the power and cooling benefits that QLC is able to deliver. I think you take this and you mix it in with the fact that today we can mix different nodes, and I’ll talk more about this when I talk about 11.8. But you think about mixing QLC with HDD systems in a single grid, and I think you start talking about really performant, dollar-per-gig-attractive solutions. So we’re very excited for what QLC is going to do for our growth this year.

Justin Parisi: Now, we touched on some of the use cases or talked about there’s certain use cases for object storage and StorageGRID. So let’s delve into that a little bit more. I know one specific use case that’s kind of a hot button topic and that NetApp has really dived into is the AI ML training piece. So talk to me about how StorageGRID fits in there.

Vishnu Vardhan: So AI is not one thing. I was at NVIDIA before I came to NetApp, and so I have some experience there. There are multiple different use cases. There are the large language models, which are a different kind of use case from training on video data. So we have to really be careful when we talk about AI in terms of what we are trying to say.

The second thing I want to say is AI is effectively pushing the traditional storage paradigms to the limit. And what do I mean by that? It needs extreme performance with an extremely low dollar per gig. And this is true for high capacity use cases.

So large language models that are training on small text files, those are different kinds of use cases. But if you’re training on video and images, these are generally large-capacity storage systems. There are petabytes of data on the systems, but you need access to that extremely fast, and you don’t need access to all of it randomly. You’re training in epochs, so you’re reading the same data again and again repeatedly. So you need access to that really fast, but you have a pool of data that is very large that you also want access to. So you have this large-capacity data, in which there is a small subset that is extremely hot, and you want to read it again and again.

And then tomorrow it may be a different set of the data that you want to train on. So you have a large capacity and you’re trying to train repeatedly on a smaller set that is hot. Object storage is extremely well set to form the basis of that large capacity storage.

In fact, almost any large AI solution will have a high performance file tier to act as a cache for intermediate steps of the training cycle, and then it will use object as the repository in which you store the bulk of your data. There’s also this use case of, am I training for a particular competition?

So, for example, say I’m making a submission to be in the top 500 supercomputers in the world. Those kinds of training use cases are different from me as a developer trying to do training, and those developers typically tend to go straight to object. And so I want to draw this distinction between the use case of, I have all of my data sitting on object, and that is extremely common and available all over and used everywhere.

And the second set is this use case of, where do I train the data from? And 80 percent of your developers will go straight to object and train from there. And then you have the 20 percent of extreme use cases where they will stage the data on a high performance file system and then train from there. Object is the foundation of all of this. But I also want to say that it depends on the use cases.

Justin Parisi: So basically what I’m hearing is if you have a training model, that’s a bunch of text files, like small text files, maybe object isn’t the best, right? But if it’s images, doing like healthcare images where you’re trying to train a model on detecting cancers, then that would make more sense for an object.

Vishnu Vardhan: That’s right.

Justin Parisi: Okay. So, AI/ML, we talked about that use case. What are some other use cases other than backup and archive? Let’s just avoid that because we already know that people look at that for object anyway, and it’s not really the exciting use case. It’s kind of the boring use case, right?

So tell me more about other industries that are looking more towards object.

Vishnu Vardhan: So the other very new use case that we’re starting to see emerge is Hadoop replacement. And with Hadoop replacement, customers are asking themselves, how do I modernize my legacy infrastructures? What happened is Hadoop evolved around the year 2010, when object was very new, and so it really optimized for direct file access. HDFS, for example, which is the file system basis for Hadoop, was effectively a file system, and so the applications in the traditional Hadoop world were optimized for files specifically. Hive, for example, is optimized for file systems.

So, as the object world evolved, Hadoop added an S3A connector, and that S3A connector tries to present a file system view to applications in the Hadoop world. But the Hadoop application ecosystem, Hive and everything else around it, really didn’t evolve, and so Hive does not do very well with the S3A connector.

But the rest of the world didn’t wait, and so now you have a whole set of new alternatives to traditional Hadoop that are extremely performant on object. One of the key innovations there was the Apache Iceberg project with Iceberg tables. The other has been Databricks with Delta Lake. And both of these have really formed the basis of a new set of applications. Snowflake, for example, is really based on the Iceberg table format, and what we have seen is 20x the performance. Snowflake, Dremio, and Databricks are all able to deliver on object, right?

Dremio in particular is a really interesting story because they’re able to be on-prem. So Dremio is an on-prem first kind of a solution. And for on-prem customers trying to replace Hadoop, they’re a great story. We’re starting to find these new use cases where customers are asking themselves, How do I get out of my legacy data lakes into these new data warehouses? We are really participating in that with Dremio and Databricks and Snowflake.

Justin Parisi: So how are people moving from their HDFS to object? What’s the migration process for that look like?

Vishnu Vardhan: So that is a complex, use case by use case story. It’s a journey that customers have to make happen, especially because of authentication and how authentication works in their enterprise. We at NetApp ourselves have actually done that. So we at NetApp have moved away from Hadoop completely to Dremio. There’s probably one use case that still uses something in the old Hadoop infrastructure. But it’s a very interesting story of how we ourselves are today running Dremio on StorageGRID, having gone away from Hadoop. It’s a project you need to consult with the experts in the field on. I’m not going to be in a position to advise you on it here, but from the customer experiences that I’ve had, with adequate planning it can be quite transparent to end users, and you will seamlessly overlay the new infrastructure on your existing infrastructure. So Dremio, for example, can span your existing Hadoop infrastructure with a new object-based infrastructure and gradually migrate the data between the two systems.

There are some user impacts, but this definitely has been done and is definitely what we’re seeing customers look at today.

Justin Parisi: And my understanding is the NetApp project was the auto support data lake, right? Is that accurate?

Vishnu Vardhan: That is right. That is right.

Justin Parisi: Yeah, we spoke to some of those guys about that whole thing as well, like moving that data lake off of HDFS and that sort of stuff.

Vishnu Vardhan: That’s right. 20 times better performance, which I think was astonishing.

Justin Parisi: That’s pretty good. Yeah. I mean, it could be better. That’s another good success story there doing a data lake move away from HDFS. We talked about the AI and training and that sort of thing.

That’s a dependent use case. What about video use cases, like streaming video, I’m thinking of surveillance, archival, that sort of thing?

Vishnu Vardhan: Yeah. I think those use cases are all relevant, but I do want to touch on the AI/ML pieces first, right? So one of the interesting things that we have done on the AI/ML side, which I didn’t touch upon when we discussed object and AI/ML: StorageGRID, for example, has partnered with a company called LakeFS, and what we do there for the AI/ML use case is create versions of data. So think of LakeFS as the Git for data, and they use StorageGRID as the back end store. So if you’re training an AI model that’s based on images, and you’re a developer, you have this raw data, you want to take a snapshot of it, and then you want to make modifications on that.

And then you want to take another snapshot of it, and when I say snapshot, I’m using a term that we would understand, but if you’re a developer, you’re basically having a branch. So you have a branch and you have another branch, and in that new branch, you make some changes, and then you train your model, and then you’re happy, and you merge that back with your master.

And so you do that entire flow on data, because that’s the basis of AI and ML. And so with LakeFS, we have done this whole partnership that lets you do these kinds of workflows at scale. If you have five petabytes of data and you’re trying to create branches of the data and have a complex merge of those branches back, how would you do that in AI and ML? That’s something that StorageGRID supports.
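The branch-modify-merge flow being described can be modeled in a toy sketch. This is a hypothetical, greatly simplified illustration of the “Git for data” idea, not lakeFS or StorageGRID code: a branch here is just a named copy of a key-to-object map, whereas real systems do this with copy-on-write metadata rather than deep copies.

```python
import copy

class DataRepo:
    """Toy 'Git for data': branches are named snapshots of a key -> bytes map."""
    def __init__(self):
        self.branches = {"main": {}}

    def put(self, branch, key, data):
        self.branches[branch][key] = data

    def create_branch(self, name, source="main"):
        # Real systems use copy-on-write metadata; a deep copy keeps the toy simple.
        self.branches[name] = copy.deepcopy(self.branches[source])

    def merge(self, source, target="main"):
        self.branches[target].update(self.branches[source])

repo = DataRepo()
repo.put("main", "images/cat1.png", b"raw-v1")
repo.create_branch("experiment-1")                          # branch the dataset
repo.put("experiment-1", "images/cat1.png", b"cleaned-v2")  # modify on the branch
# main is untouched until the trained model looks good and we merge back:
print(repo.branches["main"]["images/cat1.png"])  # b'raw-v1'
repo.merge("experiment-1")
print(repo.branches["main"]["images/cat1.png"])  # b'cleaned-v2'
```

The value of the pattern is exactly what the toy shows: experiments mutate an isolated view of the dataset, and only a deliberate merge changes what everyone else trains on.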

But we want to spotlight that use case that we are supporting on the AI/ML side. And then I’ll go back to your question about video surveillance and video imaging. These are traditionally very object-friendly use cases. StorageGRID is uniquely placed in that we stream data.

We have multiple customers that serve media in EMEA, in Australia, in Japan, and all of those are being served from StorageGRID. Customers can play a video and then see the video from StorageGRID directly, and this could be a large file. And the way we’re able to do it is because of a streaming architecture where we are not constrained by the size of the file, but as a user seeks into a file, we will go to that point and keep serving data from that point. So very well suited use cases for StorageGRID.
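The seek-and-serve behavior described here maps onto standard HTTP/S3 ranged reads: the client requests only the byte window it needs, so object size never matters. The helper names below are illustrative, not a real client library; the `bytes=start-end` format is the standard HTTP byte-range syntax, where the end offset is inclusive.

```python
def range_header(start, end=None):
    """Build an HTTP byte-range header value, e.g. 'bytes=1024-2047'."""
    return f"bytes={start}-{'' if end is None else end}"

def serve_range(obj, start, end):
    """What a store returns for a Range request (end is inclusive, per HTTP)."""
    return obj[start:end + 1]

video = bytes(range(256)) * 4   # stand-in for a large media object (1024 bytes)

print(range_header(1024))       # bytes=1024-      (seek to offset, stream onward)
print(range_header(512, 1023))  # bytes=512-1023   (a bounded window)

chunk = serve_range(video, 512, 1023)
print(len(chunk))               # 512
```

A player seeking into a file just issues a new ranged GET from the seek point, which is why the architecture is not constrained by file size.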

Justin Parisi: Yeah, and I would imagine beyond video surveillance, you think about your streaming video platforms like Netflix. I’m not saying Netflix is a customer, but a Netflix type of video streaming platform where when you click a video that you want to watch, a movie, that spins up an instance, usually a container on the back end, and then it probably sources an object. I doubt it uses file there.

Vishnu Vardhan: Yeah, Netflix actually is probably based on S3. So, video, images or documents. These are all great use cases for object storage. What you’re also finding is, in the financial industry, for example, there are a bunch of use cases where customers want to store a lot of small objects. And object is a very good framework for that because you don’t have to keep them in a hierarchy, because object is more or less a flat namespace. You want to store an object, it has an ID and you store it and you go and retrieve it. We have customers that have billions of objects sitting in a particular bucket. It’s a very flat hierarchy and it’s very simple, take a picture, you tag it with a timestamp, and you save it. That’s it. You don’t put it in a folder. You’re not managing any other thing here. You timestamp it, save it, go back, read it, and then delete the older timestamps. Simple as that. So we find a lot of these small object use cases also being used with object storage.
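The flat-namespace pattern just described, timestamp-keyed objects with no folders, can be sketched as follows. This is an illustrative example with hypothetical names, using a plain dict as a stand-in for a bucket; “listing” is simply a prefix scan over the flat keyspace.

```python
from datetime import datetime, timezone

def make_key(device, ts):
    """Flat key: device id plus a sortable UTC timestamp, no folder hierarchy."""
    return f"{device}/{ts.strftime('%Y%m%dT%H%M%SZ')}.jpg"

store = {}  # stand-in for a bucket: key -> bytes

for hour in range(3):
    ts = datetime(2024, 5, 1, hour, tzinfo=timezone.utc)
    store[make_key("cam-07", ts)] = b"image-bytes"

# Listing is just a prefix scan; lexicographic order == chronological order here.
keys = sorted(k for k in store if k.startswith("cam-07/"))
print(keys[-1])  # newest: cam-07/20240501T020000Z.jpg

# Retention: delete everything but the latest capture.
for old in keys[:-1]:
    del store[old]
print(len(store))  # 1
```

Because the timestamp format sorts lexicographically, retrieval and expiry need nothing beyond the key itself, which is the simplicity being described for small-object workloads.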

Justin Parisi: Yeah, that’s one of the challenges when you deal with a lot of files in a NAS environment because number one, you have to put it in the right place, you have to remember to do that.

Number two, you have to name it a certain way or else you will never find it. So a good example is, I have a digital camera and those things get named as like D blah blah blah, number number number, and I’m like, I’m never going to find this, right? The other issue is finding it later on, being able to remember what folder you had it in, and if you want to search it, there’s not really good indexing a lot of the time. So finding it is a pain. So how does StorageGRID build its own index? Tell me about the way we make search for those objects easier.

I know we have tags, but what else do we do?

Vishnu Vardhan: This was a very common ask from customers. Hey, I want to search my object. I want to find metadata. I want to find basic characteristics of that file. How do I do it? And we evaluated competition and some of our competitors actually had search in their product. And so we kind of evaluated all of that. And it turns out that customers actually have very different SLAs and requirements for search. Some people want very fast response. Other customers don’t really care about the response, but they care about particular sets of data.

They have very particular queries. The use case space for search is very broad. And so what we finally decided to do was really build the tooling to enable customers to set up their own search infrastructure. And so what StorageGRID does is generate event notifications. Think of it like FPolicy, but simpler. So it’s the FPolicy equivalent for StorageGRID, where we will notify you when things happen. But again, this is HTTP based, and we follow the AWS SNS standard. So we have a standards-based way to actually notify any other application about a change in StorageGRID. So we use that framework to then integrate with Elasticsearch. And so now you can stand up an Elasticsearch cluster. You can size it to your performance requirements. You can size it to your capacity requirements. You can say, I’m going to keep all the metadata since the time I set up my grid and have it searching all the objects.

Or you can say, I’m only going to search this one bucket because that’s really what I care about. And you can set it up to perform at the SLAs you care about. And because we stepped away from the way others were doing it, we have found a lot of resonance from customers. We provide the tooling, and then you can really customize it to the performance levels you care about.

And so it’s been a very successful way in which we’ve been able to provide search for customers.
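The notification-to-search pipeline described above can be sketched as a small consumer that turns an S3-style event into a metadata document for an Elasticsearch-like index. The event shape follows the general AWS S3 event record format; the consumer function, bucket names, and index document fields are hypothetical illustrations.

```python
import json

def event_to_doc(event):
    """Flatten one S3-style event record into a search-index metadata document."""
    rec = event["Records"][0]
    s3 = rec["s3"]
    return {
        "_id": f'{s3["bucket"]["name"]}/{s3["object"]["key"]}',
        "bucket": s3["bucket"]["name"],
        "key": s3["object"]["key"],
        "size": s3["object"]["size"],
        "event": rec["eventName"],
    }

# A sample event payload in the general S3 notification shape:
sample = json.loads("""{
  "Records": [{
    "eventName": "ObjectCreated:Put",
    "s3": {
      "bucket": {"name": "medical-images"},
      "object": {"key": "scans/2024/patient-001.dcm", "size": 5242880}
    }
  }]
}""")

doc = event_to_doc(sample)
print(doc["_id"])   # medical-images/scans/2024/patient-001.dcm
print(doc["size"])  # 5242880
```

A real deployment would POST `doc` to an Elasticsearch index sized to the customer’s own SLAs, which is exactly the customize-it-yourself tooling approach described above.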

Justin Parisi: And do we provide a reference architecture, like a TR or some sort of document to help?

Vishnu Vardhan: Oh, yes, we do. We have TRs and we have best practices on this. And they’re all available on field portal.

Justin Parisi: So the Elasticsearch gets me thinking more about a cloud presence. With object storage, that is a heavy cloud use case. What is StorageGRID doing in the cloud today?

Vishnu Vardhan: So there are two parts to that question. As people start digging into object, the question is going to be, do you run in the cloud? If you ask the question of why, it’s because customers already have an option with AWS S3, or with Google Object Storage, or with Azure Blob. And so, the question of, do you run in the cloud, tends to not be as driven from real core needs.

There are some other customers that are trying to create this global namespace across on prem and the cloud, and those use cases make some sense, and we’re working towards addressing those. But there’s this one use case of, hey, do you run in the cloud like an FSx, and there we don’t see StorageGRID being present, because we think there are already really good alternatives for you in the cloud itself. The second piece is, can you integrate with the cloud? And that’s kind of our focus: we believe in doing the on prem piece really well, but making sure we can integrate that with your cloud infrastructure, assuming that’s a given, that you’re going to have a cloud infrastructure. I think that’s really the story that we are focused on.

And so we start with that notion: hey, Mr. Customer, we believe you are going to have a cloud infrastructure also. So, how do we make sure we can be your best partner there? In that vein, we have integrated StorageGRID notifications. Not only are the notifications standards-based, we can send them to your cloud infrastructure so you can build a workflow from on prem to the cloud.

We also use the cloud as a way to expand your StorageGRID capacity. So if you have 10 petabytes on prem, but you have 40 petabytes in the cloud and you want to use that capacity, we can use that capacity as StorageGRID drives. So we can use the cloud as a drive, whether that is Google or AWS or Azure.

And so that’s the second use case that we support. And the third is where you actually want to replicate data from on prem to the cloud, where you want to use the data in its original format, because your application is doing something on prem and then doing something in the cloud.

And you want the data to be available in the cloud, so just move the data from on prem to the cloud, and we support that use case. So between those three things, where we send notifications, so you can build joint workflows, we use the cloud as a drive, and we replicate to the cloud, we pretty much have a very broad coverage of most use cases where you want to use the two solutions together.
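The three integration modes just described, on-prem placement, cloud capacity used as a drive, and replication to the cloud, amount to per-object placement rules. The sketch below is a hypothetical illustration of that idea, not StorageGRID’s actual ILM rule syntax; the rule predicates, target names, and metadata fields are all made up for the example.

```python
RULES = [
    # (predicate on object metadata, list of placement targets)
    (lambda o: o["bucket"] == "shared-pipeline", ["onprem", "aws-s3-replica"]),  # replicate to cloud
    (lambda o: o["age_days"] > 90,               ["azure-blob-capacity"]),       # cloud as a drive
    (lambda o: True,                             ["onprem"]),                    # default placement
]

def place(obj):
    """Return the placement targets for the first rule the object matches."""
    for predicate, targets in RULES:
        if predicate(obj):
            return targets
    return ["onprem"]

print(place({"bucket": "shared-pipeline", "age_days": 3}))  # ['onprem', 'aws-s3-replica']
print(place({"bucket": "backups", "age_days": 120}))        # ['azure-blob-capacity']
print(place({"bucket": "backups", "age_days": 5}))          # ['onprem']
```

First-match rule evaluation like this is a common way to express tiering and replication policy: each object is checked against ordered conditions, and its copies land wherever the matching rule says.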

Justin Parisi: Yeah, so no StorageGRID as a service, but you can plug an S3 or an Azure Blob in as a part of the grid is what I’m hearing.

Vishnu Vardhan: That’s right. You can kind of replicate to those.

Justin Parisi: Now that leads us into another use case here, and that is integration with existing ONTAP deployments. So StorageGRID does a really good job with that. So talk me through what it can do when you have an ONTAP system in place.

Vishnu Vardhan: So, I think this is a really important question, and I should have addressed it in your use cases question as well. A lot of customers have asked us about getting access to object storage using a file interface. They want to access data either as files or as objects, and that’s called duality. There are many use cases here, and it’s a whole topic on its own, but I want to touch on one important thing.

Among these duality customers, there are some that actually want duality, so there’s that segment. There are some that actually want object, and the file system is just a side business, something they also want but don’t really care about; it’s really object that they care about. And then there’s another set of customers who actually want file and don’t really care about object. They want file, but they want the economics of object.

That last use case, where you want file but you want the economics of object, is a really big one. And there is no better solution in the industry today than what NetApp provides with FabricPool. FabricPool delivers all flash performance with all the data management that we have at NetApp, with all its evolution, history, and maturity. And the FabricPool solution can then tier the data off to object storage, which gives you the economics at scale. You can have a hundred petabyte cluster of data backed by a very small layer of all flash that delivers all the value of NetApp. And I want to do a quick callback to AI, because what does AI want? AI wants the highest performance at the lowest cost.

So if you marry FabricPool with FlexCache and object storage, you have a story of compelling value even in AI/ML use cases. FabricPool has been a phenomenally valuable solution that gets customers to the best dollar per gig, but if you think about the use cases, it’s also very effective in AI/ML and HPC, because when the data is hot it can be served from the all flash tier, and when it’s not, it doesn’t have to live there.

I definitely want to call out FabricPool as the way ONTAP and StorageGRID come together, and I really don’t even think about it as ONTAP and StorageGRID. It’s FabricPool as a solution.

Justin Parisi: One of the biggest benefits to that, aside from the obvious performance you get from using StorageGRID, is the cost: you’re not paying the licensing fees you would pay with other S3 or cloud offerings.

Vishnu Vardhan: That’s right. But to me, that becomes almost incidental, because your overall TCO is so massively impacted by doing this. With the FabricPool solution, StorageGRID and ONTAP together, the economics of comparing an on prem deployment with a cloud deployment are driven not just by that license, but by the whole cloud TCO.

And so to your point, Justin, I think the license is one part of it. There are other parts of the cloud TCO story, including, for example, the number of accesses, because in the cloud you’re charged for every access, and then you’re charged for the cloud storage. So that’s one line item.

As you work through all the line items, the impact is huge. But I also want to say it’s not just a TCO issue. It’s also a performance story: how do I get to higher performance? As an anecdotal story, there was a customer of ours whose own customer was on an HDD platform with regular FAS systems, and they upgraded them to a FabricPool solution, still getting the same dollar per gig, actually a better dollar per gig at scale.

But the performance was so much better that the end customer got hooked on it and started asking for even more, and now the infrastructure teams are challenged with how to deliver that.

So FabricPool actually delivered more performance.

Justin Parisi: So there’s also another part of ONTAP that can tie into StorageGRID and that’s your SnapMirror to, is it S3 or Object?

Vishnu Vardhan: Yes, there are multiple other places where our portfolio is integrated tightly, and we’ll continue to find other ways to integrate it.

So, I’ll talk about two things. One is ONTAP using SnapMirror to the cloud, and therefore being able to SnapMirror to StorageGRID. That’s a use case we’re finding. The other is that ONTAP has S3 too, so ONTAP S3 can also replicate to StorageGRID, and we’re finding customers use that to migrate: they start with ONTAP S3, and when they want to move to StorageGRID, they use SnapMirror S3 to do the migration. So there are definitely many ways in which the two systems are connected. We’re also looking at BlueXP; StorageGRID is in BlueXP, and we’re integrated with that and with almost all the other services, Cloud Insights, for example. We’re figuring out integrations there already, and more is happening across the other platform services that NetApp provides.
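Conceptually, the SnapMirror S3 migration path is a one-way bucket mirror. The real thing is configured inside ONTAP, not hand-rolled; this toy sketch just uses plain dicts standing in for the source (ONTAP S3) and destination (StorageGRID) buckets to show the idea.

```python
def mirror_bucket(source, destination):
    """Copy every object from source to destination, skipping keys the
    destination already holds with identical content; return the count
    of objects copied."""
    copied = 0
    for key, body in source.items():
        if destination.get(key) != body:
            destination[key] = body
            copied += 1
    return copied

# Toy demo: one object already mirrored, one still to copy.
ontap_s3 = {"invoices/1.pdf": b"v1", "invoices/2.pdf": b"v2"}
storagegrid = {"invoices/1.pdf": b"v1"}
copied = mirror_bucket(ontap_s3, storagegrid)
```

After the mirror catches up, the destination holds the same objects and clients can be cut over to the new endpoint.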

Justin Parisi: Cool. So that’s a lot of options there for use cases: your AI/ML, your images, your video, as well as your ONTAP integration. So that leads us into the main topic here, and that is what’s new in the latest release of StorageGRID, and that’s release 11.8. So talk to me about that.

Vishnu Vardhan: Sure. So, you know, if you’re an existing StorageGRID customer, there’s definitely going to be something here for you. The way we prioritize our features, and I think everybody does this, but I think we’re pretty disciplined about asking which features have the highest impact from a cost perspective. We try to do a lot of small things, so as a result we have something for almost every customer, and we’re definitely expecting a lot of upgrades to 11.8. There are probably three major things I want to talk about for this release, and then we can dive deeper into a bunch of others.

The first is ILM policy tags. With ILM policy tags, a customer can have very granular policy. Today, when a customer sets up policies, he has one gigantic policy with many sub rules in it. So when he wants to automate it using an Ansible or Terraform script, he has to go in and change a giant policy, and the problem is that if you make a mistake, you can inadvertently move petabytes of data. Customers were still doing it, but they were wary, especially if they made an error in the way they wrote their policy. So with ILM policy tags, we’ve said you don’t have to have one giant policy.

You can have 25 smaller policies, and with a tag you can associate a policy with a bucket. The policy is the same, with all the flexibility the prior policy had, but now there isn’t just one active policy: you can have 25 of them, all active at the same time. And you can tie them to particular buckets using a tag, say gold, silver, bronze. Now when you change the policy that’s tagged gold, you’re only affecting the buckets attached to that tag. So we have a level of indirection that makes it easier for customers to create many policies and change them without worrying about a blast radius that could effectively be as wide as the grid.
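The gold/silver/bronze example can be sketched as a simple lookup: the tag is the level of indirection between the bucket and the policy. The policy names and contents below are illustrative, not actual StorageGRID rule syntax.

```python
# Each tag names one of several concurrently active ILM policies
# (contents are made up for illustration).
POLICIES = {
    "gold":   {"copies": 3, "placement": "all-sites"},
    "silver": {"copies": 2, "placement": "two-sites"},
    "bronze": {"copies": 1, "placement": "single-site"},
}

# Buckets opt in to a policy via their tag (hypothetical bucket names).
BUCKET_TAGS = {"medical-images": "gold", "build-artifacts": "bronze"}

def policy_for_bucket(bucket, default_tag="silver"):
    """Resolve a bucket's ILM policy through its tag; untagged buckets
    fall back to the default policy."""
    return POLICIES[BUCKET_TAGS.get(bucket, default_tag)]
```

Editing the gold policy only touches buckets tagged gold, which is exactly the reduced blast radius described above.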

And so we really simplify how customers do policy management. The second thing I want to talk about is mixed node support. What mixed node support means is that previously, if you had to mix an SSD node with a regular HDD node in StorageGRID, the best practice was to create a separate site that would have all SSD nodes, ingest data there, and then tier it off to the HDD nodes.

Because of that, you needed a larger number of SSD nodes to be able to tier to HDD. Now with this release, if you have 10 nodes in your site, you can have just two of them be SSD. You can mix SSD and HDD in the same site and effectively get a FabricPool for object, with a much, much lower dollar per gig, and you can hit the performance and capacity mix you want, because you can even have just one SSD node.

I just think you want two, for a little more redundancy, but you can have even one SSD node with 10 HDD nodes in a grid and scale from there. We have a whole TR and a best practice on this, to really help you get to the right price and performance for your object storage.

So that’s the mixed node support. The way we think about it is a FabricPool for object in 11.8.
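The dollar-per-gig argument for mixing node types is simple blended-cost arithmetic. All the node counts, capacities, and costs below are made up for illustration.

```python
def blended_cost_per_tb(ssd_nodes, hdd_nodes, ssd_tb, hdd_tb,
                        ssd_cost_per_tb, hdd_cost_per_tb):
    """Effective cost per TB of a site mixing SSD and HDD nodes."""
    total_tb = ssd_nodes * ssd_tb + hdd_nodes * hdd_tb
    total_cost = (ssd_nodes * ssd_tb * ssd_cost_per_tb
                  + hdd_nodes * hdd_tb * hdd_cost_per_tb)
    return total_cost / total_tb

# Two small SSD nodes fronting ten large HDD nodes (hypothetical figures).
cost = blended_cost_per_tb(ssd_nodes=2, hdd_nodes=10,
                           ssd_tb=100, hdd_tb=300,
                           ssd_cost_per_tb=100, hdd_cost_per_tb=20)
```

With these made-up figures, the blend lands at a quarter of the all-SSD price while keeping an SSD tier for hot data.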

And then the last one is metadata-only nodes. We have found that, especially in financial services and in the public sector, customers have a lot of small objects, and with StorageGRID, you basically grow in units of nodes.

So you buy one node, two nodes, three nodes, 10 nodes. Hidden inside each node are two different software stacks: a data stack and a metadata stack. The reason we package them together is to keep things simple; you don’t want to have to think about data and metadata separately.

You just want it all to scale simply and stay easy for you. But what we have found is that in particular use cases, customers do want to scale the metadata differently. So in 11.8, we have a new metadata-only node, and you can add more metadata capacity to your grid. For example, say you have a three node grid. Today there is a limit on the number of objects you can have on a per node basis, but with metadata nodes, we lift that limit.

So effectively you can have a three node grid, but by adding enough metadata nodes, you can store 300 billion objects on it. You can push it to the max. That’s the third big thing we have done with StorageGRID. And there are a bunch of others. We have done a lot of work on security.
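The metadata-only node effect is easiest to see as capacity arithmetic: the metadata stack bounds how many objects a grid can hold, and adding metadata nodes raises that bound without adding data nodes. The per-node cap below is a hypothetical placeholder, not the product’s actual limit.

```python
PER_NODE_OBJECT_CAP = 1_000_000_000  # hypothetical per-node metadata limit

def grid_object_capacity(storage_nodes, metadata_nodes=0):
    """Toy model: each node's metadata stack bounds the object count, so
    adding metadata-only nodes raises the grid-wide cap without adding
    any data capacity."""
    return (storage_nodes + metadata_nodes) * PER_NODE_OBJECT_CAP
```

So a small-object workload can keep the same three data nodes and grow object count by growing metadata nodes alone.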

We now support HashiCorp Vault as a KMS, and we already support Thales. We support UEFI secure boot. We have local key management for our new 6112s. So we have a bunch of other security items there. One more major thing is 6112 performance. The 6112 was introduced last year as a 1U, all flash appliance, and with this release we have bumped performance up about 40 percent on that platform, which I think is pretty significant.

Justin Parisi: Yeah, it’s pretty big. So it sounds like you’ve got new ILM policies, cold and hot tiering with your mixed nodes, and a new performance platform. And then there was one I’m missing there. What’s that other one that I missed?

Vishnu Vardhan: So: ILM policy tags; mixed node support, where we have a FabricPool for StorageGRID; metadata-only nodes, so we can scale small object capacity; the performance bump for the 6112; and a bunch of security work.

Justin Parisi: Okay, yeah, so the small object support, having the metadata nodes to offload that extra work to make that a more realistic use case.

Vishnu Vardhan: Yep. I have 22 features in this release that I can talk about. I just touched on four.

Justin Parisi: All right. So if I wanted to find out about all those other features, ’cause I would imagine you don’t want to cover them all here, where should I do that?

Vishnu Vardhan: The Field Portal is great. We have a customer deck there where our field team can find all of the details.

And that’s the best place to start.

Justin Parisi: Okay. So that’d be for our field. What about our customers? Where can they find information?

Vishnu Vardhan: We have a blog post that we did on 11.8; they can go and read that. We have our release notes for 11.8 on the download page. That’s a good place to go and find the latest and greatest.

Justin Parisi: All right. Cool. And of course you could always contact your sales representative and they can get you access to those customer decks as well.

Vishnu Vardhan: That’s right. We are also on Discord, and so customers can get to us on Discord. And if you don’t get a response on Discord, please reach out to me personally, and I’ll make sure that your questions on Discord get responded to.

Justin Parisi: All right, Vishnu, sounds like 11.8 has a lot of great stuff in it. Again, if we wanted to find more information, we can always go to your blog, but where could we contact you if we wanted to do that?

Vishnu Vardhan: So I would start with ng-StorageGRID-ses. That has not just me and my team, it has experts across NetApp, and it has an engineering team on it which very proactively responds to questions. So it’s a very good place to go. Please feel free to reach out to me directly at vardhan@netapp.com, V-A-R-D-H-A-N at netapp.com. I’d be happy to direct things and provide appropriate priority if you feel something is not getting addressed quickly enough.

Justin Parisi: All right, well, thanks again for joining us and talking to us all about StorageGRID as well as the latest release, StorageGRID 11.8.

All right. That music tells me it’s time to go. If you’d like to get in touch with us, send us an email to podcast@netapp.com or send us a tweet @NetApp.

As always, if you’d like to subscribe, find us on iTunes, Spotify, Google Play, iHeartRadio, SoundCloud, Stitcher, or via techontappodcast.com. If you liked the show today, leave us a review. On behalf of the entire Tech ONTAP Podcast team, I’d like to thank Vishnu Vardhan for joining us today. As always, thanks for listening.

Podcast Intro/Outro: [Outro]

 
