Behind the Scenes Episode 379: NetApp Astra 23.10 Updates

Welcome to Episode 379, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”

When you have a Kubernetes cluster that your applications depend on to stay online, you have to seriously consider how you plan to recover in case disaster strikes. Ransomware, power failures, and cloud region outages can all wreak havoc on Kubernetes clusters.

NetApp Astra helps mitigate disasters with strong backup, recovery and replication technology backed by NetApp ONTAP and SnapMirror.

Dean Steadman (NetApp Discord) and Luis Rico (luis.rico@netapp.com) from the NetApp Astra team join us to discuss the latest updates for NetApp Astra in version 23.10.

For more information:

Finding the Podcast

You can find this week’s episode here:

I’ve also resurrected the YouTube playlist. You can find this week’s episode here:

You can also find the Tech ONTAP Podcast on:

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

Transcription

The following transcript was generated using Descript’s speech to text service and then further edited. As it is AI generated, YMMV.

Tech ONTAP Podcast Episode 379 – NetApp Astra 23.10 Updates
===

Justin Parisi: This week on the Tech ONTAP podcast we talk about the latest Astra Control updates with Luis Rico and Dean Steadman.

Podcast Intro/outro: [Intro]

Justin Parisi: Hello and welcome to the Tech ONTAP podcast. My name is Justin Parisi. I’m here in the basement of my house and with me today I have a couple of special guests to talk to us all about NetApp’s Astra software. So to do that we have Dean Steadman. Dean, what do you do here at NetApp and how do we reach you?

Dean Steadman: Yeah, I’m a manager on the Astra go to market product management team. And I focus on all things Kubernetes. The easiest way to reach me is on NetApp’s Discord channel, just Dean Steadman on there. I enjoy chatting with folks and interacting online and answering a lot of questions for our customers.

Justin Parisi: Alright, excellent. Also with us today we have Luis Rico. So Luis. What do you do here at NetApp and how do we reach you?

Luis Rico: Thank you, Justin. I’m a principal product manager for Astra. So Astra, as you know, is the data management for Kubernetes piece in the NetApp portfolio. And the way to reach me is luis.rico@netapp.com.

Justin Parisi: So you said that Astra is the data management piece for Kubernetes. Let’s dive into that a little bit more. So there are several aspects of Astra, it’s kind of like an umbrella of products, right? So tell us about Astra, tell us what it consists of and what sort of things we do with it.

Luis Rico: Yeah, sure. So Astra is based on two products. The first one is a very well known product, formerly known as Trident, and is our connector, CSI driver, to connect any Kubernetes distribution to any Astra solution out there on premises or in the public cloud.

The second product is Astra Control. Astra Control is the data management piece itself, so it’s able to provide snapshots, clones, backups, restore, replication, disaster recovery, portability, and migration for applications running in Kubernetes, no? Containerized applications. So these are the two products of the Astra portfolio.

Justin Parisi: My understanding of Astra Control is it’s a backup and recovery module, like you mentioned, but it doesn’t just back up the data. It does a lot of other things as well. Right?

Luis Rico: Yeah, exactly. So we are able to do quick snapshots and clones obviously.

The snapshots are based on our wonderful ONTAP snapshot technology, but also we are able to do backups and restore and quick clones, and we integrate ourselves with our replication technology with SnapMirror. So we are able to provide quick disaster recovery for Kubernetes applications. And when we do that, we are not only covering the storage part of the Kubernetes workload, but also the metadata, the Kubernetes resources that are part of the application, but they are not in the volume that is provided by NetApp.

So, we are also able to orchestrate all that, no? So, metadata and data together. To restore an application or migrate or do portability of an application or disaster recovery of an application to a different Kubernetes cluster or OpenShift cluster. So we extend this capability not only to the storage, but also to the application layer.

Dean Steadman: Yeah, and I think just to hammer home what Luis said there from a Kubernetes administration perspective, that is absolutely critical. And it’s because it is a different way of deploying an application than we’ve done in IT traditionally. We’ve usually thought about applications as monolithic elements within our environments, with the data all living inside of a single set of volumes that all need to be managed separately and together as a group from a backup perspective. In Kubernetes, though, with so much of that metadata living elsewhere inside the system, being able to have the knowledge of how an application is laid out, whether it’s in one namespace or across multiple namespaces, and being able to tie all of those components together makes it so that once you have that understanding, it opens up a lot of different use cases. Data protection absolutely is bread and butter for us, but we also open up use cases that allow folks to clone applications more efficiently and migrate applications more efficiently. It just really changes the way that Kubernetes admins have to track their workloads.

Justin Parisi: Does it also handle the secrets? Does it take care of those? Or do you have to do that on the back end yourself?

Dean Steadman: I’m going to answer yes and no to that one. Yes, we do allow you to grab the Kubernetes secrets and to replicate those as part of the metadata of an application. You also have the ability to exclude different components within an application. We can actually go through the namespace and say, you know what? Back up everything but the secrets. So that way, if you’re replicating those out to somewhere that’s outside of your control you can keep those in house. So we give folks a ton of flexibility in what they back up along with an application. But we also make it super easy in that we make it easy to grab everything.
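
To make the "back up everything but the secrets" idea concrete, here is a minimal sketch of filtering an application's resources by kind before a backup. It is only an illustration: the `Resource` class, the `select_for_backup` function, and the sample application are hypothetical and are not Astra Control's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical stand-in for one Kubernetes resource in an application."""
    kind: str        # e.g. "Deployment", "Secret", "PersistentVolumeClaim"
    namespace: str
    name: str


def select_for_backup(resources, exclude_kinds=frozenset()):
    """Return the resources to back up, skipping any excluded kinds."""
    return [r for r in resources if r.kind not in exclude_kinds]


# Sample application made up for this example.
app = [
    Resource("Deployment", "shop", "web"),
    Resource("Secret", "shop", "db-password"),
    Resource("PersistentVolumeClaim", "shop", "db-data"),
]

everything = select_for_backup(app)                                 # the easy default
without_secrets = select_for_backup(app, exclude_kinds={"Secret"})  # keep secrets in house
print([r.kind for r in without_secrets])
```

The default call grabs everything, which matches the "make it easy" path described above, while the second call leaves secrets out of a backup that will be stored somewhere outside your control.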

Justin Parisi: So I know that the data aspect can be replicated using SnapMirror, and that’s all done over TLS, and it can be secure.

What about the Kubernetes configuration? How is that replicated? Because I would imagine that’s not going to be SnapMirror, because it doesn’t necessarily live in a volume.

Luis Rico: You can use TLS. You can establish a TCP tunnel encrypted with TLS, and all the communication with the Kubernetes API will go through this encrypted channel.

So we are saving all of this information and it’s going to S3 buckets. So obviously the storage, so it’s outside the Kubernetes cluster. So even if you lose your entire Kubernetes cluster, you can recover the data and also the metadata because it’s outside the Kubernetes cluster.

Justin Parisi: And you mentioned orchestration as an aspect. Am I able to basically from start to finish stand up a brand new Kubernetes cluster on a destination site as a disaster recovery plan without actually having to interact with anything but Astra?

Luis Rico: So this is a very good question, because when we were designing Astra, we were thinking about do we add the capability to deploy from scratch a Kubernetes cluster as part of the disaster recovery?

And then we realized that customers do like to create the Kubernetes clusters by themselves with Terraform or any other automation tool. And they customize the Kubernetes cluster. They secure and harden the Kubernetes clusters in their own way, their own manner. So we couldn’t unify the way we automate creating Kubernetes clusters as part of Astra and make everybody happy about the way we create Kubernetes clusters. So, this is not part of Astra. So, for Astra, you need a Kubernetes cluster up and running just to provide the disaster recovery aspect, okay? But customers that are used to working with Kubernetes have their own automation tools to spin up a Kubernetes cluster in a few minutes. So it’s not really a problem for them that the Kubernetes cluster has to be there already to provide disaster recovery.

Justin Parisi: As you know, there’s lots of ways to deploy Kubernetes. There’s Rancher, there’s GKE, there’s AKS. What does Astra support? Does it support anything Kubernetes or is there a specific set of Kubernetes applications or deployment methods that it supports?

Luis Rico: Yeah, we support all of them, to be honest. Okay. So we support all the managed Kubernetes services in the public cloud, so GKE in Google Cloud or AKS in Azure or EKS in AWS, the Elastic Kubernetes Service. So anything that has like a K in the public cloud we are able to support it. And we also support the commercial distributions of Kubernetes. So Red Hat OpenShift, SUSE Rancher, VMware Tanzu, and also upstream Kubernetes. So the open source community Kubernetes, okay. So we in general support them all and we are starting to support also all the on premises part of the hyperscalers like Azure Stack or Google Anthos. So we also support all of them. The idea is to try to provide data protection and data portability and data management to any Kubernetes application out there, on premises or in the public cloud.

Dean Steadman: The majority of the customers that we work with on premises are using Red Hat OpenShift. It is probably the most common distribution or flavor of Kubernetes that we see. And then up in the cloud, I would expect that as we continue to grow, our AWS numbers will increase the most.

But today we’ve got a lot of great customers that are using the Azure Kubernetes Service. So those are our two strongest spots from an install base perspective.

Luis Rico: Yeah. And just to add to what Dean was saying, is in this new release 23.10, we are starting to support ROSA. This is the Red Hat OpenShift service on AWS. So it’s just managed OpenShift available in AWS for easy consumption. And we are starting to support that in this new release of Astra Control 23.10.

Justin Parisi: We’re starting to get into talking about public cloud and on prem. I understand that Astra can support both, where does it actually reside? Is it a cloud resident application or is it installable on prem or does it do both?

Luis Rico: We have two flavors of Astra Control.

We have Astra Control Center. That is our software that can be deployed mainly on premises, but can be also deployed in Kubernetes in the public cloud. And Astra Control Center is just a piece of software. You deploy it in Kubernetes or OpenShift cluster, and you can protect multiple Kubernetes or OpenShift or Rancher or VMware Tanzu clusters out there.

So this is the piece that you have to install and manage by yourself with the help of NetApp, of course. And the second flavor is Astra Control Service. Astra Control Service is all the power of Astra Control, but in a managed service that is managed by NetApp SREs. So you are just consuming data protection for Kubernetes as a service. And this is wonderful for customers. This initially was oriented to providing data protection and data management in the public cloud. So the AKS, EKS, GKE clusters. But now with the 23.10 release, we are going to be able to support on premises clusters with Astra Control Service with a single pane of glass, a single console. We are able to provide data management for clusters in the public cloud and also clusters on premises. So you can do easily data migration from on premises to the public cloud or between different hyperscalers in the public cloud or between the public cloud and on premises.

Justin Parisi: With the cloud and NetApp offerings, there are two different approaches. There’s the managed service and then there’s the self service. And the managed service includes things like CVS and ANF. And then the other stuff is like the Cloud Volumes ONTAP or the Amazon FSx. So with Astra Control, I would imagine it supports both of those, but does it live in any sort of marketplace? Is it available as a first party service or a third party service? How does that all work?

Luis Rico: So we are in the process of making Astra Control and Astra Trident first party players in the hyperscalers. So we are in this process. So right now with Astra Control, we are able to support all of our own managed storage solutions. So as you said, FSx for NetApp ONTAP and Azure NetApp Files and also Cloud Volumes Service. And now Google Cloud NetApp Volumes in the next release and also obviously our Cloud Volumes ONTAP, our CVO. So this is already supported. So we are covering all the backends, all the persistent storage possibilities for the public cloud.

But we also provide data protection if customers are using EKS, AKS, or GKE, the Kubernetes services, with the default persistent storage option in the hyperscaler. So we are able to provide data protection for Azure Disk or Amazon EBS, okay, Elastic Block Store, or for Google Persistent Disk.

So we try to cover all. You can provide this data protection for customers that are using Kubernetes that are not even using a NetApp storage solution. They are just using the hyperscaler storage, but we can provide this help. We can help them and then we can easily convince them to use our own storage that is much better than the hyperscaler’s storage. Okay. About the marketplace thing. So Astra Control Service is available in the AWS marketplace and also in the Azure marketplace. So today it is a third party product. Okay. But even if it’s a third party product, it is a solution that is helping us to win the Kubernetes workload, and if we win the Kubernetes workload, we can win the first party storage in the public cloud, and on premises we can help to win in flash. We are just trying to help NetApp to be the persistent storage solution for Kubernetes workloads.

So this is our mission.

Dean Steadman: And for us, being in those marketplaces really simplifies a lot of the customer experience. It gives them the flexibility to say, I either want to procure this and license Astra Control directly from NetApp.

That’s one route. Or, if they’re already in the marketplace and they have Microsoft credits being able to do that licensing directly through those marketplaces can simplify the customer’s journey from a buying perspective. So we give folks that choice.

Justin Parisi: Yeah. I would imagine that when you’re dealing with an application provider, it’s going to be table stakes to be able to offer disaster recovery, data protection integrated within the product and having an Astra Control, having an Astra Trident as a CSI is going to enable those workloads to consider NetApp seriously because they have those things already built in.

Luis Rico: Exactly. It’s just another value added to the storage solution. Just to provide this disaster recovery capability that is covering not only the storage part, but also the application part.

So we are adding more value to the storage itself. This is the idea, to provide more value added services that help the persistent storage to be more important for customers.

Justin Parisi: All right. Sounds like we know what Astra is. So let’s talk about what’s new in Astra. So what is the latest in the new Astra releases?

Luis Rico: Yes. The latest and greatest release of Astra is the new 23.10. The main exciting new feature is our capability to protect applications based on ONTAP qtrees. Customers have been using Astra Trident to provide persistent storage to the Kubernetes clusters.

Some of them, they found that the limits in the number of volumes for our ONTAP systems were too low for Kubernetes workloads. They needed to provide thousands of volumes from ONTAP to Kubernetes distributions. So they started to use our ONTAP NAS economy backend in Trident.

ONTAP NAS economy backend is based on Qtrees. And this is wonderful, no? Because with Qtrees, you have almost unlimited number of volumes in ONTAP clusters. So, these differences between the ONTAP hardware model related to the number of volumes that we can support, or even the limitation when you are using MetroCluster configuration, for example, that the number of volumes that you can provide with this configuration is more limited.

So, this goes away with Qtrees. And everybody was happy about that, but we cannot do a snapshot with Qtrees. We cannot do a SnapMirror to Qtrees. So you have several limitations with Qtrees. So finally, in this version, we are able to provide backup and restore for Kubernetes applications that are using Qtrees. It has been painful for these customers to try to protect these Kubernetes workloads, because of this lack of capability of having snapshots with Qtrees. So we are using our snapdir access to do a snapshot of the parent volume containing all the qtrees. When we do that, we can provide our execution hooks to guarantee that you can run some scripts, pre snapshot and post snapshot, to provide application consistency to your database or to your application. So we do this snapshot of the parent volume and then we extract, using the snapdir access, the persistent volume that we are protecting. So we can do the backup of this persistent volume and you can restore anytime you want. This temporary snapshot of the parent volume is going to be deleted afterwards. So if you are using 200 persistent volumes with Qtrees in a parent volume, you don’t have to worry about having 200 times snapshots kept for a long time.

You are able to provide backups and restore and the backups are going to be consistent. We are able to help all the customers using Kubernetes that they had to scale to thousands or tens of thousands of persistent volumes with our ONTAP clusters.
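
The sequence Luis describes (pre-snapshot hook, snapshot of the parent volume, post-snapshot hook, extract one qtree through the snapshot directory, delete the temporary snapshot) can be sketched with a toy in-memory model. This is an assumption-laden illustration, not Astra Control's implementation: `ParentVolume`, `backup_qtree`, the hook callables, and the dictionary standing in for an object storage bucket are all hypothetical.

```python
import copy


class ParentVolume:
    """Toy stand-in for a FlexVol holding many qtrees, one per persistent volume."""

    def __init__(self, qtrees):
        self.qtrees = qtrees      # {qtree_name: {file_name: contents}}
        self.snapshots = {}       # {snapshot_name: frozen copy of every qtree}

    def create_snapshot(self, name):
        self.snapshots[name] = copy.deepcopy(self.qtrees)

    def read_from_snapshot(self, name, qtree):
        # Loosely mimics reading one qtree out of the volume's snapshot directory.
        return copy.deepcopy(self.snapshots[name][qtree])

    def delete_snapshot(self, name):
        del self.snapshots[name]


def backup_qtree(volume, qtree, bucket, pre_hook, post_hook):
    """Back up a single qtree via a temporary snapshot of its parent volume."""
    snap = f"tmp-backup-{qtree}"
    pre_hook()                        # e.g. quiesce the database
    try:
        volume.create_snapshot(snap)
    finally:
        post_hook()                   # e.g. resume writes, even if the snapshot failed
    try:
        bucket[qtree] = volume.read_from_snapshot(snap, qtree)
    finally:
        volume.delete_snapshot(snap)  # the parent snapshot is only temporary


events = []
vol = ParentVolume({"pvc-a": {"data.db": "v1"}, "pvc-b": {"log.txt": "x"}})
bucket = {}
backup_qtree(vol, "pvc-a", bucket,
             pre_hook=lambda: events.append("pre"),
             post_hook=lambda: events.append("post"))
vol.qtrees["pvc-a"]["data.db"] = "v2"   # later writes do not change the backup
print(bucket, vol.snapshots, events)
```

Only the protected qtree ends up in the bucket, and no snapshot of the parent volume is left behind, which is the point made above about not keeping 200 snapshots around.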

Justin Parisi: You said nearly unlimited for Qtrees, which isn’t exactly true because every volume has a Qtree limit. Right. I think there’s like 4,000 per volume, which is a lot.

Luis Rico: 4,995.

Justin Parisi: Yes. So yeah, there’s a lot available in the volume. And then you have a lot of volumes. So it’s a multiplication exercise at that point, but it is usually going to be enough for most customers.
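
As a quick worked version of that multiplication, the snippet below uses the 4,995 per-volume figure quoted above. The volume count of 10 is an arbitrary assumption, and real deployments may be configured to place fewer qtrees in each volume.

```python
QTREES_PER_VOLUME = 4995  # per-volume qtree limit quoted in the conversation


def max_qtree_backed_pvs(flexvol_count, qtrees_per_volume=QTREES_PER_VOLUME):
    """Upper bound on qtree-backed persistent volumes across a set of volumes."""
    return flexvol_count * qtrees_per_volume


print(max_qtree_backed_pvs(10))  # 10 volumes x 4,995 qtrees = 49950
```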

So with qtrees and protection, I would imagine that you can replicate them.

Since we can’t really do SnapMirror at the Qtree level, how are we doing the replication? Is it a file copy? How is that being done?

Luis Rico: So we are not doing a storage replication yet for that part. We have a data mover that is able to read only the clone of this snapshot directory that we are extracting.

And we are reading the data, reading the content of the Qtree and sending it to object storage outside the Kubernetes cluster. So normally S3 object storage to a bucket. And if we are trying to provide the restore, we are recreating a Qtree in a parent volume, FlexVol, and then we have this data mover to restore the data inside the Qtree. So we are not yet able to provide this storage replication at Qtree level.

Justin Parisi: But you can replicate at the volume level at least.

Luis Rico: Yeah, the volume level, it’s already part of the Astra Control capabilities. So for FlexVols, we will provide this integration with a SnapMirror that we are not only replicating FlexVols with a SnapMirror, we are also replicating all the Kubernetes resources and orchestrating all the process of disaster recovery. So when the customer has to failover between two data centers, Astra will be able to promote the FlexVolumes with a SnapMirror to break this or reverse the replication relationship and connect these persistent volumes replicated with a SnapMirror to the Kubernetes resources in the recovery cluster. All the disaster recovery is orchestrated like customers are used to with other technologies like VMware site recovery manager.

So we are the site recovery manager for Kubernetes.
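
The failover steps described here (promote the replicated volumes, recreate the Kubernetes resources, reconnect the persistent volumes on the recovery cluster) can be sketched as an ordered procedure. This is a simplified model under stated assumptions: `ReplicatedVolume`, `RecoveryCluster`, `fail_over`, and the resource names are hypothetical, and nothing here calls SnapMirror or Astra Control.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicatedVolume:
    name: str
    state: str = "replicating"   # becomes "promoted" once replication is broken


@dataclass
class RecoveryCluster:
    resources: list = field(default_factory=list)      # Kubernetes resources applied
    bound_volumes: dict = field(default_factory=dict)  # {claim_name: volume_name}


def fail_over(volumes, saved_resources, claims, cluster):
    """Run the three failover steps in order on the recovery cluster."""
    # 1. Break the replication relationship so destination volumes become writable.
    for vol in volumes:
        vol.state = "promoted"
    # 2. Recreate the application's saved Kubernetes resources.
    cluster.resources.extend(saved_resources)
    # 3. Connect each persistent volume claim to its promoted volume.
    for claim, vol_name in claims.items():
        cluster.bound_volumes[claim] = vol_name
    return cluster


volumes = [ReplicatedVolume("pv-db"), ReplicatedVolume("pv-media")]
cluster = fail_over(
    volumes,
    saved_resources=["Deployment/web", "Service/web", "ConfigMap/web-config"],
    claims={"db-data": "pv-db", "media": "pv-media"},
    cluster=RecoveryCluster(),
)
print([v.state for v in volumes], cluster.bound_volumes)
```

The ordering matters in this sketch: the volumes are promoted before any workload resources are recreated, so the application never starts against a read-only replica.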

Justin Parisi: So qtrees can be used with other things like export policy rules as well as quotas. So when you have to recover in a disaster recovery scenario, does Astra take care of the extra configuration pieces like the quotas and the export policy rules, or do you have to do that with something like Terraform on the other end?

Luis Rico: No, today there is no interaction with all these details in the ONTAP configuration. We just use the snapdir access to make sure that we can access the snapshot of the parent volume.

And we are able to extract the data and back up the data and then recover the data later. Okay, but we are not providing data protection for the rest of the capabilities in the ONTAP cluster for Qtrees.

Justin Parisi: All right, so Qtree management and backup is part of the new Astra update. Is there another aspect of Astra that’s new in this release?

Luis Rico: Another interesting one that I mentioned before is our capability to, in Astra Control Service, to protect not only public cloud Kubernetes cluster, but also on premises Kubernetes cluster.

So this is important, especially with OpenShift. So we can now protect OpenShift on prem and OpenShift in the public cloud with ROSA support. So we are covering on prem and public cloud use cases and the migration between on premises and public cloud.

We are also providing ransomware protection for Kubernetes. Astra now is able to recognize if you are using object storage bucket that is immutable, that has a kind of S3 lock in the bucket. So write once, read many bucket and we automatically adapt our protection policies to recognize that this is an immutable bucket.

So we convert the backups that are stored in the bucket to immutable backups. So nobody can delete these backups unless the expiration policy in the bucket has expired. This is good because if for any reason a hacker is able to get the administrator account of Astra Control, he or she won’t be able to delete backups or corrupt backups that are already part of this immutable policy. This is just to complete our strategy in NetApp about ransomware, doing the Kubernetes piece.
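
The behavior of an immutable, write once, read many bucket can be illustrated with a small retention check. This is a deliberately simplified sketch: `WormBucket`, `ObjectLocked`, and the 30-day retention period are assumptions for the example, and real object lock features (for instance versioning) behave in more nuanced ways.

```python
from datetime import datetime, timedelta, timezone


class ObjectLocked(Exception):
    """Raised when an object is modified or deleted while still under retention."""


class WormBucket:
    """Toy write once, read many bucket with a fixed retention period."""

    def __init__(self, retention):
        self.retention = retention   # timedelta applied to every new object
        self._objects = {}           # {key: (data, retain_until)}

    def put(self, key, data, now):
        if key in self._objects:
            raise ObjectLocked(f"{key} already exists and cannot be overwritten")
        self._objects[key] = (data, now + self.retention)

    def delete(self, key, now):
        _, retain_until = self._objects[key]
        if now < retain_until:
            raise ObjectLocked(f"{key} is locked until {retain_until.isoformat()}")
        del self._objects[key]

    def keys(self):
        return sorted(self._objects)


t0 = datetime(2023, 10, 1, tzinfo=timezone.utc)
bucket = WormBucket(retention=timedelta(days=30))
bucket.put("backup-001", b"app data and metadata", now=t0)

try:
    # Even a caller with administrator credentials is refused during retention.
    bucket.delete("backup-001", now=t0 + timedelta(days=1))
    deleted_early = True
except ObjectLocked:
    deleted_early = False

bucket.delete("backup-001", now=t0 + timedelta(days=31))  # allowed after expiry
print(deleted_early, bucket.keys())
```

The check lives in the bucket rather than in the backup tool, which is why a stolen backup administrator account cannot remove backups early in this model.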

Justin Parisi: And that’s powered by the SnapLock replication, right?

Luis Rico: No, we are not using a SnapLock yet. This is a plan in our roadmap to use a SnapLock.

So we are just recognizing if customers are using buckets in the object storage, StorageGRID, for example, with this S3 lock capability. So if they are converting the buckets to WORM or immutable, we acknowledge this and we act accordingly. So making sure that the backups are also immutable on our side. We are doing that with Azure Blob, with AWS S3, and also with our StorageGRID technology.

Justin Parisi: All right. What else we got?

Luis Rico: We also have more security things. For example, we are providing now in flight data encryption with Astra Control between containers and pods that are running applications and the volumes that are holding the data used by applications. So all this traffic can now be encrypted using Kerberos, so you can provide more security and you can comply with regulations just to make sure if applications in Kubernetes are using and accessing sensitive data in the persistent volumes. Traffic between the application, the containers and the volumes is also encrypted using Kerberos. In addition to that, we are also extending Astra Trident with a new protocol. So we are able now to provide NVMe over TCP as a new protocol, as a new driver for Astra Trident. So if you have an all flash array, you can enable NVMe over TCP and Trident is able to automate the provisioning of persistent volumes using NVMe over TCP.

So it’s just another ONTAP SAN backend that is going to provide more performance and especially more performance at the scale, and the scale thing is part of the Kubernetes DNA.

Justin Parisi: Alright, so we just had NetApp Insight and I imagine you did some things there. So tell us about what you did at NetApp Insight and then also tell us about anything else that’s coming up as far as the conferences go.

Dean Steadman: Yeah, we were just out at Insight this last week.

It was a great conference for our team. We met with a lot of customers. We had Paul from Amadeus present a session. Amadeus is a company that does IT services for travel industries. Basically, anytime you book a ticket with any airline carrier, it’s processed somewhere through an Amadeus system, and they’re a great customer of ours.

Very large ANF presence that they’re using. And then Astra provides data protection for all of their Kubernetes workloads in the cloud. So super good session. Really insightful into real world challenges of doing Kubernetes at scale. Looking forward, we’ve got KubeCon coming up November 6th through 9th.

That’s the largest Kubernetes event in the industry. It’s taking place in Chicago this year. We’ll have a team right now just of about 30 folks from all across NetApp. We’ll be a gold sponsor. And we’ll have live demos in our booth of course, of Astra. The FSx team has some really cool demos of FSx use cases for Kubernetes.

And then our friends in the Cloud Insights team will be there showing how Cloud Insights can provide observability and monitoring for Kubernetes workloads. They’ve got some great demos, debugging latency issues real time. So really looking forward to the event. It’s a good show for us. We meet a lot of new customers there and opens up a lot of doors for NetApp by being there.

Then looking out to the future, after the new year there’s an OpenShift Summit event in Zurich that we’ll have a team of folks at, meeting with OpenShift customers. And then, next year, the big event on the horizon is KubeCon Paris, which is in March, and the team is really looking forward to that as well. We’re really engaged in the Kubernetes space. We see Kubernetes as a really new set of workloads for NetApp. A lot of our existing customers are migrating and modernizing applications to Kubernetes. But every new Kubernetes customer we talk to is an opportunity for us to sell a new ONTAP system.

And it’s a new workload. It’s got new data. If there’s new data, they need new storage. So I always stress to our teams that if you want to sell flash, you want to sell ONTAP, talk to Kubernetes teams because they have data.

Luis Rico: Don’t forget, please, that next year’s event that is important for us is the Red Hat Summit that is going to be in your hometown, Denver. So come on. You also have to mention that.

Dean Steadman: Yeah, that’s looking off to May, which seems like a million years from now for me. But yeah, I’ll host folks here in Denver for that summit as well.

Justin Parisi: So you’re not looking forward to KubeCon because it’s in Paris?

Dean Steadman: Yeah, no one wants to go to Paris in March. No, really, I’ve already had about 50 people reach out to me asking for tickets.

Justin Parisi: "Oh I’ll do it!" "Oh I’ll do it!"

Dean Steadman: It’s amazing.

Justin Parisi: What you guys ought to do is like, you should start advertising that these places, these conferences will be like in Hawaii and then really hold them in Kansas. No offense to Kansas, you’re just not Hawaii.

All right, cool. So it sounds like you’re going to be globetrotting here talking all about the goodness of Astra and Kubernetes and NetApp. If I wanted to find more information where would I go to do that?

Dean Steadman: So, of course, for NetApp folks, the NetApp field portal is available. We’ve got a ton of information there for our internal teams as well as for our channel partners.

So that’s always the first spot. For external folks the Astra website or Astra page on the NetApp website has a ton of links to good information. Most importantly, on there is a link to our registration page for a free trial of Astra. We allow customers to start with up to 10 namespaces for free.

No time limit, no capacity limits, nothing like that. Just allows you to get in there, get started with the product, add some Kubernetes workloads and give yourself some free data protection. Great opportunity. And then finally, as I mentioned in the beginning, our team is available on Discord.

We have a team of folks that we call the Astranauts. We’re really clever that way. Our Astranauts are product experts in all things Kubernetes, Trident, ONTAP storage. Love engaging with customers, love hearing the new use cases that they’re having and love coming up with new solutions to those with them.

So it’s a great place to engage with our experts.

Justin Parisi: And you guys have stickers too, right? I mean,

Dean Steadman: And we have stickers, we have all kinds of stuff. I’m going to all these conferences. You have a ton of swag to hand out.

Justin Parisi: Yeah. You got to hand it out or nobody listens. Nobody even stops by if there’s no free stuff. They’re like, nah, whatever.

Dean Steadman: If you have a blank laptop, I am the person you want to come see.

Justin Parisi: Or if you have stickers that you’re tired of, you just want to cover it. All right. Sounds like we got a lot of good things coming up with Astra. I’m sure there’s even more to come in future months and releases. Again, if we wanted to reach you, Dean, how do we do that?

Dean Steadman: The easiest way is to find me on the Astra channel of the NetApp Discord, and it’s just Dean Steadman on there.

Justin Parisi: All right. And Luis.

Luis Rico: Yes, please send me an email anytime. So Luis.Rico, L-U-I-S.R-I-C-O at netapp.com.

Justin Parisi: Excellent. We’ll include links to the emails as well as the Discord in the blog that comes with this podcast. And again, thanks for joining us.

Luis Rico: Thank you Justin.

Justin Parisi: Alright, that music tells me it’s time to go. If you’d like to get in touch with us, send us an email to podcast@netapp.com or send us a tweet @NetApp. As always, if you’d like to subscribe, find us on iTunes, Spotify, Google Play, iHeartRadio, SoundCloud, or via techontappodcast.com. If you liked the show today, leave us a review. On behalf of the entire Tech ONTAP Podcast team, I’d like to thank Luis Rico and Dean Steadman for joining us today. As always, thanks for listening.

Podcast Intro/outro: [Outro]
