Why Is the Internet Broken: Greatest Hits

When I started this site back in October of 2014, it was mainly to drive traffic to my NetApp Insight sessions – and it worked.

(By the way… stay tuned for a blog on this year’s new Insight sessions by yours truly. Now with more lab!)

As I continued writing, my goal was to keep creating content – don’t be the guy who just shows up during conference season.

So far, so good.

But since I create so much content, it can be hard for new visitors to this site to find, and the WordPress archives/table of contents is lacking. So, what I’ve done is create my own table of contents of the top 5 most-visited posts.

Top 5 Blogs (by number of visits)

TECH::Using NFS with Docker – Where does it fit in?

NetApp FlexGroup: An evolution of NAS

ONTAP 9.1 is now generally available (GA)!

TECH::Become a clustered Data ONTAP CLI Ninja

TECH::Data LIF best practices for NAS in cDOT 8.3

 

DataCenterDude

I also write for datacenterdude.com on occasion. To read those posts, go to this link:

My DataCenterDude stuff

How else do I find stuff?

You can also search the site or click through the archives, or subscribe to the RSS feed. If you have questions or want to see something changed or added to the site, follow me on Twitter @NFSDudeAbides or comment on one of the posts here!

You can also email me at whyistheinternetbroken@gmail.com.

Behind the Scenes: Episode 93 – Women in Technology

Welcome to Episode 93, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”

This week on the podcast, we chat with a few of NetApp’s Women in Tech about the Women in Technology program (@NetApp_WIT). In October 2009, NetApp launched the Women in Technology group. Its mission is to support and foster the development of NetApp’s women by providing a forum for mentoring, networking, communication, and professional development. WIT is active in volunteer activities supporting women in our community, encouraging girls to study STEM, and helping those in need. There are currently over 900 members of WIT worldwide.

We invited WIT’s co-chair, Anna Schlegel (@annapapallona), as well as the RTP site leader, Fran Melia, to talk about the program. We also brought in Sam Moulton, leader of the NetApp A-Team (@NetAppATeam), as well as Amy Lewis – the @CommsNinja herself – to discuss their unique experiences and philosophies as women in tech.

Finding the Podcast

The podcast is all finished and up for listening. You can find it on iTunes or SoundCloud or by going to techontappodcast.com.

Also, if you don’t like using iTunes or SoundCloud, we just added the podcast to Stitcher.

http://www.stitcher.com/podcast/tech-ontap-podcast?refid=stpr

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss
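
If you’d rather script against the feed than point a podcast app at it, here’s a minimal sketch in Python (standard library only) that prints each episode’s publish date and title. Treat it as illustrative, not anything official:

    import urllib.request
    import xml.etree.ElementTree as ET

    FEED_URL = "http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss"

    # Fetch and parse the RSS 2.0 feed; episodes live in <channel><item> elements
    with urllib.request.urlopen(FEED_URL) as response:
        root = ET.parse(response).getroot()

    for item in root.iter("item"):
        print(item.findtext("pubDate"), "-", item.findtext("title"))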

You can listen here:

Behind the Scenes: Episode 92 – FabricPool Deep Dive

Welcome to Episode 92, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”

In Episode 63, we spoke with FabricPool PM Arun Raman about what FabricPool is. This week, we invited the FabricPool lifeguard – John Lantz – to give us a deeper look at the cloud-enabling technology. We also welcomed ONTAP Senior Vice President Octavian Tanase (@octav) and ONTAP’s Chief Architect, Ravi Kavuri, to give us the value proposition and the business decisions that went into developing FabricPool.

Finding the Podcast

The podcast is all finished and up for listening. You can find it on iTunes or SoundCloud or by going to techontappodcast.com.

Also, if you don’t like using iTunes or SoundCloud, we just added the podcast to Stitcher.

http://www.stitcher.com/podcast/tech-ontap-podcast?refid=stpr

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

You can listen here:

You can also now find us on YouTube. (The uploads are sporadic and we don’t go back prior to Episode 85):

Introducing: ONTAP Recipes!

Official NetApp ONTAP recipes blog here:

https://newsroom.netapp.com/blogs/recipes-for-ontap-success/

One of NetApp’s key initiatives over the past few years has been driving the simplicity and ease of use of ONTAP, its flagship storage software. Some of that work has gone into the GUIs that manage ONTAP, such as:

  • OnCommand System Manager moved on-box, removing the need to manage it from external systems, starting in ONTAP 8.3
  • Application provisioning templates for NAS and SAN applications starting in ONTAP 8.3.2 (including Oracle, VMware, Hyper-V, SQL, SAP HANA and others)
  • Performance headroom/capacity in System Manager in ONTAP 9.0
  • Top client/performance visibility in OnCommand System Manager via ONTAP 9.0
  • Intelligent, automatic balanced placement of storage objects when provisioning volumes and LUNs in ONTAP 9.2
  • Simplified cluster setup, easier management when adding new nodes, and automated non-disruptive upgrades starting in ONTAP 8.3.2
  • Unification of OnCommand Performance Manager and Unified Manager into a single OVA in OnCommand 7.2
  • Better overall look and feel of the GUIs

There’s plenty more to tout, but this is a blog about NetApp’s newest way to help storage administrators (and reluctant, de facto storage administrators) manage ONTAP via…

ONTAP Recipes!

If you’ve ever watched a cooking show, you’ve seen how the chef shows you the ingredients and how to assemble/mix/prep them. Then, into the oven. Within seconds, through the magic of television, the steaming, hot, fully cooked dish is ready to eat. Super easy, right?

What they don’t show you is the slicing, chopping, cutting and dicing of the ingredients. That’s done ahead of time and measured out into little dishes. They also don’t show you the various times you inevitably forget to add an ingredient, or you add too much, or you have to run to the store to pick up something you forgot.

Then, the ultimate lie – they don’t let on that the perfectly cooked meal was prepared well before the show was filmed, waiting in the oven in all its perfection.

And that’s ok! We don’t want to see “how the sausage is made.”

We just want to consume it. And our storage is not that much different.

That’s the idea behind ONTAP recipes – they are intended to be written in an easy-to-follow order. Easy to read. Easy to consume. The goal is to deliver a new recipe each week. If you have a specific recipe you’d like to see, comment here or on the official NetApp ONTAP recipe page. Happy eating!

Here’s the latest one. The goal was to correspond with MongoDB World in Chicago on June 20-21:

https://community.netapp.com/t5/Data-ONTAP-Discussions/ONTAP-Recipes-Deploy-a-MongoDB-test-dev-environment-on-DP-Secondary/m-p/131941

For all the others, go here:

https://community.netapp.com/t5/user/viewprofilepage/user-id/60363

Behind the Scenes: Episode 91 – Learning to Code, with Ashley McNamara

Welcome to Episode 91, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”

This week on the podcast, we chat with developer advocate Ashley McNamara (@ashleymcnamara) of Pivotal about how storage administrators (and pretty much anyone else) should be learning to code. Ashley also points us to resources that help aspiring developers and scripters succeed. Feel free to check out her GitHub repository here:

http://ashleymcnamara.github.io/learn_to_code/
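
In the spirit of the episode, here’s the kind of small, practical script a storage administrator might write as a first project – a few lines of Python (standard library only) that check whether some management endpoints answer on HTTPS. The hostnames below are hypothetical; the point is how little code a useful first script needs:

    import socket

    # Hypothetical hosts - swap in systems you actually manage
    HOSTS = ["cluster-mgmt.example.com", "vcenter.example.com"]

    def is_listening(host, port=443, timeout=3.0):
        """Return True if the host accepts a TCP connection on the given port."""
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    for host in HOSTS:
        print(host, "is up" if is_listening(host) else "is unreachable")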

And her Gopher work here:

https://gopherize.me/

Finding the Podcast

The podcast is all finished and up for listening. You can find it on iTunes or SoundCloud or by going to techontappodcast.com.

Also, if you don’t like using iTunes or SoundCloud, we just added the podcast to Stitcher.

http://www.stitcher.com/podcast/tech-ontap-podcast?refid=stpr

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

You can listen here:

You can also now find us on YouTube. (The uploads are sporadic and we don’t go back prior to Episode 85):

Behind the Scenes: Episode 90 – ONTAP Performance Enhancements, including QoS Minimums

Welcome to Episode 90, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”

This week on the podcast, we invited SAN and Performance TME Mike Peppers (@NTAPFLIGuy) to discuss the new performance enhancements in ONTAP 9.2!

Join us as we talk about QoS minimums, balanced LUN placement and more!

Finding the Podcast

The podcast is all finished and up for listening. You can find it on iTunes or SoundCloud or by going to techontappodcast.com.

Also, if you don’t like using iTunes or SoundCloud, we just added the podcast to Stitcher.

http://www.stitcher.com/podcast/tech-ontap-podcast?refid=stpr

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

The official NetApp blog is here:

https://newsroom.netapp.com/blogs/tech-ontap-podcast-ontap-performance-enhancements/

You can listen here:

You can also now find us on YouTube. (The uploads are sporadic and we don’t go back prior to Episode 85):

Behind the Scenes: Episode 89 – NetApp HCI: Enterprise-Scale

Welcome to Episode 89, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”

NetApp has a hyperconverged infrastructure platform!

This week on the podcast, we invite Derek Leslie (@derekjleslie) and Gabriel Chapman (@Bacon_Is_King) for the big reveal of NetApp’s hyperconverged infrastructure solution! Come find out what hyperconverged means for NetApp and its customers, as well as what makes NetApp’s HCI solution tick.

Finding the Podcast

The podcast is all finished and up for listening. You can find it on iTunes or SoundCloud or by going to techontappodcast.com.

Also, if you don’t like using iTunes or SoundCloud, we just added the podcast to Stitcher.

http://www.stitcher.com/podcast/tech-ontap-podcast?refid=stpr

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

And now, YouTube!

https://www.youtube.com/playlist?list=PLXo73OlUJ12PLDR-jeko6OwWurESXlsjK

As an added bonus, we’ve started to do transcripts for podcast episodes. You can find this one after the jump.

You can listen here:

You can also now find us on YouTube. (The uploads are sporadic and we don’t go back prior to Episode 85):

Transcript: Episode 89: NetApp HCI

Justin:                   This week on the Tech ONTAP podcast we unveil the latest crown jewel of the NetApp portfolio, NetApp hyperconverged.

Automated:       Welcome to the Tech ONTAP podcast with Justin Parisi, Glenn Sizemore and Sully the Monster. I love [00:00:30] NetApp.

Justin:                   Hello and welcome to the Tech ONTAP podcast. My name is Justin Parisi and in the studio with me today is Mr. Andrew Sullivan. Hi.

Andrew:              Hello.

Justin:                   You are here.

Andrew:              I am here.

Justin:                   Oh, my goodness. You’ve been traveling so much. You were missed.

Andrew:              I know. I only have another week before I travel again.

Justin:                   That’s good. And then I’ll be on vacation that week, so none of us will be here. Also with us today on the phone is Mr. Glenn Sizemore. Hi, Glenn.

Glenn:                  How are you doing, Justin?

Justin:                   I am super. All right, [00:01:00] so we’re going to get right into this because it’s a very special podcast where we are talking about a brand new addition to the NetApp portfolio, and I won’t spoil it right now but we’re going to go ahead and talk to the key speakers here, Mr. Derek Leslie and Mr. Gabe Chapman. Derek, if you could chime in and tell us about what you do at NetApp and I guess what we’re going to be talking about today.

Derek:                  Yeah, so at NetApp I came over as part of the SolidFire acquisition. Really happy to be here. [00:01:30] I get the privilege of working on NetApp HCI, so on the product team. I’m leading out those efforts and I’m really excited to bring a new product to market in such a big company with so much influence, so it’s a really great place to be right now.

Justin:                   All right. And Gabe Chapman.

Gabe:                    Gabriel Chapman here. I am the, for lack of a better term, the HCI whisperer, I whisper to the HCIs. In reality, I too, like Derek, came over with the [00:02:00] SolidFire acquisition and have been focused on bringing HCI to the field for NetApp customers, partners and internal teams.

Justin:                   All right, so there you have it. HCI and we’re done, right guys? I mean, that’s all we gotta talk about? That’s the announcement?

Andrew:              So, we’re going to talk about StorageGRID, is that right? We’re talking …

Justin:                   StorageGRID. No. HCI. Tell us about HCI. Most people are either unfamiliar with the term HCI or hyperconverged, or they are [00:02:30] not familiar enough to know what it actually means. I would challenge one of you to give us the elevator pitch for HCI.

Derek:                  I don’t know, I’m gonna put the sales guy on the spot for this, so Mr. Whisperer.

Gabe:                    Oh, okay. Yeah. So, hyperconverged infrastructure. Let’s do this. We can go back and do a precursor to kind of lay out how the market came about. If you go back to 2009, essentially, you have two companies that came about roughly within a couple month period; Nutanix and SimpliVity [00:03:00] that decided to take the traditional stack of infrastructure components and abstract away the complexity behind them and put it into kind of an all-in-one consumable package. So you had servers and switches and storage arrays and appliances and all good things. Under the covers, those are naturally just commodity x86 components running some form of Linux. There’s no reason why we could not virtualize those constructs, package them together [00:03:30] and essentially sell you a consumable building block for the data center that provisioned storage, memory, and compute, and laid a hypervisor on top of that to simplify the provisioning of virtual assets. Does that make sense?

Justin:                   Yeah, it makes sense. Let’s talk about why someone would want to do that. Why would I want to do something like HCI rather than traditional storage architectures?

Gabe:                    There’s a couple of ways to look at it. If we started to look at the traditional siloed based IT shop, [00:04:00] you know, I have my storage team, my network team, my server team, my virtualization team, the application team, and for those of us who spent a lot of time and worked in that space, a lot of times whenever a new project would come about, it was kind of like a meeting of the five armies. Everyone would come in, they would fight for their bit and piece and things would slow down and they would not respond to the business needs quickly.

You see things like public cloud come about that get a lot more dominance in terms of how the application [00:04:30] developers look at it, and it’s like, “hey, I can just go swipe a credit card at Amazon and provision a workload”. So when the C-suite people come in and go, “why can’t we do that internally?”, well, it’s because of the complexity of that siloed architecture. So, we started to see a shift towards more of an IT generalist mindset for some organizations where the virtualization teams started to take over the provisioning of infrastructure because all the machines they were putting out there were the requests with those credit card swipes.

Consolidating [00:05:00] and simplifying that infrastructure into something that your generalist IT admin could do was a key component of hyper converged infrastructure. So, abstracting away the complexities of the storage array of the provisioning of physical assets and whatnot, and making it a simple to deploy, easy to deploy, one size fits all approach focused more on common denominator, or the lowest common denominator in some respects to workloads was attractive to a lot of people.

Justin:                   [00:05:30] Would you say this also consolidates and simplifies the overall buying of the storage units? So I mean, instead of having to buy separate entities for this, would HCI simplify that?

Gabe:                    Oh, definitely. My background is I was one of the first people that went in as a sales engineer for SimpliVity in the very beginning times before we even understood the term hyperconverged for the most part, before it was kind of commonplace. I could walk in and [00:06:00] say, “hey, you would like infrastructure to run, say, 500 virtual machines, well, here is the two line item quote. Here’s a quote for five units of hyperconverged boxes, and here’s a quote for support”, and that’s it. I know, most of us who’ve bought storage technologies from lots of different companies over the course of our career, sometimes, a quote for something like that, just the storage array itself could be five or six pages. There was a huge ability to walk in and go, “everything you need is just in the box”.

[00:06:30] The analogy I traditionally make is the all-in-one printer. If I go back to 2006 or so, a point in time when I had a flatbed scanner, I had a printer, I had a copy machine, you had a fax machine, you had all these different bits and pieces. Isn’t it easier to consolidate those down into a single device that does all of those things relatively okay, and give it to anybody in your house to manage and leverage? Yes. That was the value that hyperconvergence brought, it was the simplification of the [00:07:00] process of provisioning a bunch of disparate pieces together, but in a consumable package that almost anybody could leverage.

Justin:                   If you were to sum HCI up in one word, wouldn’t you say it’s just simpler, right? Everything about it is easier, more automated, more efficient.

Gabe:                    The major push in the beginning of HCI as it comes to market is the simplification effort. I think there are, I think we’re starting to see a shift away from, “yes, everybody can do the simple part”, to now looking at more of “hey, now we want to start to put more [00:07:30] complex, sophisticated workloads on these platforms. Now we need to make sure that the underlying infrastructure or architecture of that solution can meet those needs”, and that’s one of the things we have gone after. Yes, we can do the simple, day one, day zero provisioning, but the real meat, and the solution is everything that happens after the fact and how well the system can scale to meet the requirements of the customer.

Justin:                   So, borrowing your printer scanner all in one combo example, would you also say that maybe HCI could save us some space, [00:08:00] in terms of rack space, data center utilization, or is that something that just is not really related to this whole solution?

Gabe:                    I think anytime, the concept of the greening of the data center is at the forefront of a lot of people’s minds. Obviously, the fact that I can consolidate storage and compute into a smaller form factor, and because we’ve gotten to specific densities now with solid state drives, because we do data efficiency, like dedupe, compression, and thin provisioning, those types of technologies, [00:08:30] we’re able to simply drop the footprint of the data center in many respects. It really depends on the architecture and design of the HCI solution, depending on how it’s packaged, on whether you can scale those resources independent of each other and get a little more granular level of the scaling of the disparate resources, or if I have to scale all of them all at once. It really is going to depend on what the design point was like and what point you’re wanting [00:09:00] to go to in your journey to what we’re traditionally calling next generation data center.

Justin:                   I’ve heard the term next generation data center quite a bit. Let’s go ahead and knock that one out right now. Give me an example of what you mean by next generation data center so we can educate our listeners about what that might be.

Gabe:                    Sure. The way we tend to present it is such. I kind of, if I go back to, I was [inaudible 00:09:25] for fifteen years, I did storage and virtualization stuff for a long time. Let’s [00:09:30] start at the year 2000. Back then, what was I doing? I was doing server consolidation, I was starting to put things on storage arrays, I was creating SANs, I was starting to implement processes that made my life simpler from a life cycle management standpoint.

Around 2004 – 2005, we start playing around with virtualization, and virtualization simplifies our lives from the standpoint of that common server resource utilization which is sitting at five or six percent. Now I could [00:10:00] scale those resources up, virtualize a lot of things, put a lot more workloads in a denser package, and simplify my life there. The common building block to enable a lot of that was storage area networks, or SAN, so I had a storage place to put all my data, I had a lot of different compute nodes to put to it, and it made for a much simpler, easier provisioning process.

As those spaces started to expand, the customers started to get bigger, as we started to get towards 20%, 50%, 90% [00:10:30] virtualized, we started to realize that there were a lot of challenges around how I operate and orchestrate that from a policy driven data center standpoint. How do I get to IT as a service offering? Or infrastructure as a service.

That’s when we started to see the public cloud giants come in and say, “we’ve got this licked. We know how to do this really, really well”. Like I said, going back to that example earlier when I was talking about it is that, those pressures of what public cloud can do, force a lot of internal IT people [00:11:00] to have to make changes to their infrastructure to support that as well. If you look at it from the Gartner world, it’s mode 1, it’s kind of the old way of doing things. If you’re looking towards the next generation, it’s mode 2, it’s a cloud-first strategy, or it’s IT as a service. All the as-a-services that exist out there. It’s leveraging those models to simplify, automate, or orchestrate and get towards a more cloud-like implementation process.

Justin:                   Okay. So, let’s go back to Andrew’s [00:11:30] original question before we got the overviews here. So, Andrew, if you could re-ask your bespoke question, because I don’t remember what you said.

Andrew:              Sure. What is the difference from an infrastructure architect or infrastructure operator as well as from an application user, application administrator, developer perspective between a bespoke infrastructure. One that has been created, crafted specifically for a particular environment, and [00:12:00] most often this comes down to, “yeah, we’ve got 20 years worth of stuff in the data center that’s all cabled and cobbled together and it works”, versus something like a FlexPod converged infrastructure versus hyperconverged.

Gabe:                    First, we have to thank our friends in the United Kingdom for bringing the term bespoke to the public masses.

Andrew:              Are we thanking Martin Cooper for that specifically?

Gabe:                    We can thank Mr. Coops, or Mr. Pitcher. Reality is, I think a lot of it really boils down [00:12:30] to who the customer is and what they want to do. A bespoke infrastructure, if I look at Twitter, that’s very bespoke. They’ve purposely built and delivered a platform to meet their specific needs, and it’s probably better for them to do that than it is to go out and try to buy, you know, they build versus buy. I think a lot of those bespoke infrastructures are those type of builders; they take an open-source technology, they run with it, they don’t feel like they have to be beholden to any one particular vendor. Or, the more traditional [00:13:00] enterprise space, they have the ability to say, “I can take the best of breed components and put them together. I have internal skill sets that’ll allow me to do that, and by doing that, I can get to a competitive advantage”.

There are other organizations that say, “you know what, we’re just more focused on the operational aspects of this and we want a simplified purchasing model, and we want, we have a specific preference around what those vendors look like”. A converged infrastructure for [00:13:30] them can definitely be of value. They look and say, “we know we can get 6,000 virtual machines in this particular set of infrastructure and this set of outcomes pushed up and running on this particular platform”. Then, obviously, go into HCI, which is my general purpose IT people that go, “you know what, I know virtualization really well, I know a little bit of SAN, but I just want to basically right click, create storage and be done with it, and maybe my environment isn’t that sophisticated, so for me, [00:14:00] an off the shelf appliance model is one that makes sense”.

Justin:                   I think that that aligns, and thus far, we’ve been pretty consistent, I guess we could say that. On the podcast, we’ve traditionally just tried to simplify this problem down to who’s going to use it? Let’s just talk about who’s going to be living with the infrastructure on a day to day basis, what other work that individual is expected to do, and that kind of helps us rationalize where [00:14:30] in that decision tree, as Andrew laid it out, we find customers. If you still have architects and a full-blown ops team with network guys and server guys and virtualization guys, you’re going to be able to design something as good if not better than anything we’re going to give you. It’ll be properly aligned with no waste anywhere inside the stack. It’s also very expensive and you need a lot of very specialized talent to do that.

FlexPod lets you get rid of all those roles. You don’t have to worry about the design, we take care of that, but you still need [00:15:00] all those operations guys, you still need a network team, you still need a storage team, you still need virtualization teams. Today. The tooling to make that easier is getting better but you still need to have those. With the HCI market, typically there, we’re just looking at an admin. They don’t have to know anything beyond that. If they do, great, the system has the knobs that they can get into, but from an organization perspective, it just opens the door right up. You can make the jump to owning your own infrastructure and having some control over your [00:15:30] destiny without also having to jump up 40 – 50 people in staff.

Gabe:                    I think you’re spot on there, and I think one of the things, though, to look at is that there is a little bit of a blurring of lines between those. I think you could apply bespoke practices across all three of those same spectrums and customer types. It just depends on what their comfort level is. Also, really, for me, it’s all about getting to the outcome that the customer really wants. So, we look at NetApp as a portfolio company. [00:16:00] We have all of these technologies, they’re tied together with data fabric that allows for global data management, portability, visibility type of thing. That’s kind of the additional bit of piece of the puzzle that we bring to the market with all of those different consumption routes or consumption methods. I can go the bespoke route, I can go the converged infrastructure route, or packaged solution, I can go cobble together the bits and pieces that I really like or I can go for a simple approach that just lets [00:16:30] me take a building block and scale those resources.

In my viewpoint, I think we’re coming to the market with something fairly unique with a lot of additional value add laid on top of a platform that is one of several consumption model choices. There’s no one size fits all, it’s really up to each one of the individual customers and how they want to approach this, but when it comes to the data management aspect, we have some additional secret sauce on there that really drives a lot of value, drives a lot of benefit to the customer.

Justin:                   [00:17:00] Okay.

Derek:                  So, from [inaudible 00:17:02] perspective, acquisition has to be incredibly simple. If you fall down with a really complicated quote, like Gabe was saying earlier, and make it really hard for the customer to understand, “what am I getting?”, that’s going to be your first hurdle, because I’ve seen where we’ve talked to customers in the field where they had given us the feedback that, “but the other vendor’s solution was just so easy. It was just so easy to understand, I knew exactly what I was getting, I didn’t need to do forty hours of training time. [00:17:30] I could just go”.

That perception right up front needs to be easy. Down to even how we quote it. We’ve done it with six line items, so, six SKUs. There’s three different sizes you can purchase; small, medium, and large for storage, small, medium, and large for compute. Those sizes go up starting at 480 GB for storage all the way up to 1.92 TB, and then from a compute perspective, we start out with a sixteen [00:18:00] core and then go all the way up to 32 cores. Not small, but it’s scale, so, from a scalability perspective, it’s really important to understand that you can mix and match any of these sizes. From the customer perspective, they also have that initial acquisition comfort of, “if I start out with the small, I can never go up to the medium or the large because I’ll already have bought in”. They can have comfort to start where they’re good now, and then our infrastructure [00:18:30] and the way we scale can all be combined with one another later.

Justin:                   Very cool.

Derek:                  Yeah. It’s easy, cause, if you look at some of the other architectures out there, some of the newer flash players, you’ll get bought into a size, and it’s basically throw it away or send it to the DR site if you need to go bigger. The bigger the size is where the cheapest dollar per unit of IT acquisition is going to be, but you don’t necessarily start there. You know, my first car, [00:19:00] I didn’t buy a 15 passenger van, right. I wanted to make sure I had a family first, so, you buy what you need when you need it and then you can grow and expand later on. That’s initial acquisition.

From there, setup needs to be incredibly simple, so, if you take an expert and have them deploy, from scratch, a complete VMware environment, the storage system, do all the networking, deploy [00:19:30] the management VMs, deploy the vCenter, what would you guys say, how long would you guys set aside? How many days or hours would you set aside to complete all that? From scratch?

Glenn:                  Probably somewhere between two days to a week depending on how much experience and how good the team was.

Derek:                  Okay. Any other opinions?

Andrew:              Yeah, I would agree with that assessment, depending on the size of the infrastructure, all of those things.

Derek:                  Would you guys characterize yourselves as experts or beginners? [00:20:00] Generalists or beginners? Where would you put yourself on that spectrum?

Glenn:                  Well, I write FlexPod, so I’d better be an expert.

Derek:                  Okay. There we go. I like it.

Justin:                   I need a month because I’m really dumb.

Derek:                  Okay. So, deployment needs to be able to take a generalist and give them the ability to deploy a complex infrastructure, and we’ve done that in 30 minutes or less, and by no means are we claiming we’re the first people to do that. We just know, to meet the simplicity score expected of HCI, [00:20:30] that’s going to be required. We did deploy VMware, our all flash array, as well as all the VMs, and it took us the better part of a day. We consider ourselves experts in deploying that, and through automation and everything that we’ve included in the HCI project, with 30 inputs, you’re going to deploy the entire infrastructure.

That’s really easy and that reduces the intimidation factor for a customer going from [00:21:00] the bespoke infrastructure or a traditional 3 tier, buy storage from vendor A, compute from vendor B, we’ll get networking off of eBay, and let’s make it work. It makes them really comfortable that, “man, I just saw that this is simple, I don’t have to worry about getting this thing up and running and having my boss breathe down my neck. I bought that, where is it, when can I start using it?” Speed of time to value. How long does it take for me to get value out of what I purchased [00:21:30] and that just has to be really fast for HCI.

Justin:                   So, is that 30 minutes of unboxing, racking and then all the way down to provisioning the storage, or is that like, you know, the cabling, is involved with that? What’s involved with the overall 30 minute guideline there?

Derek:                  Great question. Just because racking and cabling skills vary, we’ve all heard the stories of people racking gear and tipping over the entire row or a portion of that row. We like to not include that in our statements. [00:22:00] The portion that we as a product control is after we’re cabled and powered on. So, from the time we’re cabled up and powered on, you can have the infrastructure up and running in 30 minutes.

Glenn:                  That’s the standard we use for everything. That’s the same measurement we use for FlexPod with infrastructure automation; it goes from power on to usability.

Derek:                  Yeah.

Andrew:              So, I was thinking something more along the lines, when I think of something along the lines of, I don’t know, clustered ONTAP, I think [00:22:30] of all the shelf cables and all the stuff that’s involved behind the scenes there. How does the hardware look from an HCI perspective? Is it looking similar to what we do with the ONTAP cluster, or is it the SolidFire cluster, what’s it looking like?

Derek:                  Great question. It more similarly represents the architecture of SolidFire, so we have nodes. Obviously, we all know that ESX has nodes as well, so they’re both scale out systems, so it’s a really good match. What we’ve done is we’ve taken a generic chassis, a blade [00:23:00] chassis that fits four of these nodes or server blades, and like we said in the beginning, in acquisition, you can put either a small storage or a large compute in any one of those empty blade spots. The hardware in the front is 24 SSD drives, so you put those in. Six are assigned per storage node. We don’t use any local storage for the ESX nodes. [00:23:30] As you scale out, you just add six drives in the front for storage and one sled in the back and add it to your cluster. It’s really simple from a hardware perspective.

On networking, we’ve chosen to go with 25 gig, and that’s an interesting choice because a lot of people initially give us the feedback of, “hey, we’re just not there yet in our data center”. What do you guys think of going straight for 25 gig?

Glenn:                  25 makes a ton of sense man. The [00:24:00] networking world, they doubled and quadded up ten gig and we got forty gig lines, but then they needed 100 gig for the backhauls and then in developing the 100 gig, they discovered it was cheaper to build 25 gig and bond four of them together, so they’ve just skipped an entire generation which is on its way out anyways and they’re on the future bandwagon.

Derek:                  Yeah. I agree. The little known fact about 25 gig, unless you really look into it, is that it’s an SFP28 connector. What that [00:24:30] means is that it takes an SFP+, meaning, to reuse your word, bespoke infrastructure, when you’re plugging in your existing 10 gig stick into our systems, it works, but as soon as you upgrade to 100 gig or 25 gig, because there are 25 gig data switches coming out, there’s a couple out already, it just works. You don’t have to, we’ve given you essentially dual port. It’s 10 or 25. We think that’ll be really good for the future of our customers’ data [00:25:00] centers. Like I said, as they’re going towards the next generation data center, redesigning, deciding, “hey, we skipped 40 gig, what’s next? What are we going to deploy in our infrastructure now?”, we think that’ll be a very popular choice with the customer base.

Gabe:                    I was going to shoot for the Fibre Channel over Token Ring, but I got shot down.

Justin:                   I am the lord of the tooken rings.

Derek:                  We killed your Bluetooth idea too, Gabe, sorry.

Andrew:              Oh, that’d be pretty sweet.

Glenn:                  I was mainly commenting on the 32 gig versus forty gig because 25 gig is not [00:25:30] something you hear about a lot.

Derek:                  It’s become the new ten gig. People are going from figuring out what’s next. I think I’ve just seen people who have, again, chosen to just skip 40 gig altogether because maybe it was too much, too expensive, too proprietary, are settling on 25 gig, and the switch vendors, you’ll see, the major ones are following suit, and that’s the real indicator, I think.

Gabe:                    One of the things behind it is simplification of LAN on motherboard, so ten gig parts are the predominant de facto ones that go on there now along [00:26:00] with the one gigs. It’s easier to take that form factor and uplift it, and also it doesn’t require a huge amount of shift in the optics that a customer has to use, whereas, if you wanted to go 40 gig, sometimes, you had to go to a very, not a cheap set of optics to go pure 40 gig with one cable, or you had to break out into the four, so that caused some challenges, some headaches as well. It really kind of depended on how the implementation was brought to the fold, and forty gig didn’t get quite the adoption I think we were looking [00:26:30] at getting. But the ability of the server manufacturers to get design wins with the common people who make these technologies, it was easier to go with something that customers had already been fairly comfortable with and didn’t require a lot of physical and significant changes to their infrastructure.

Justin:                   It also sounds like it may have been an economics decision, like keeping the price point of this particular product at a place where we don’t have to seem like the most expensive [00:27:00] option by adding something like 40 gig, which is, what I’m hearing, is largely unnecessary.

Gabe:                    40 gig in my view, is no longer necessary now that there are the cheap 100 gig and 25 gig options. As I talk to all of our switch vendors and partners, they’re exactly that. There’s just not the huge price delta. Originally a couple of years ago, if you wanted to go after 40 gig, or we pitched 40 gig as a storage company, the total cost of ownership, ripping out your existing [00:27:30] infrastructure’s a little too high. They were really too high. This is a much easier, no brainer acquisition. Customers that don’t even want to think about 25 gig, they just don’t have to. They can continue to use their 10 gig infrastructure. They can be fully supported, and that’s not an additional set of SKUs or something different we have to support.

That’s another thing, no customer, as they’re moving towards their next generation data center, wants to be the special customer or the only [00:28:00] one running in a certain config, so the fact that we’re helping the entire broad set of our customer base standardize, is really helpful. Even from a networking perspective.

We are choosing to target from a hypervisor perspective on what we’re deploying at first and automatically setting up for them to be VMware. We’re setting up with VMware 6.0 with a fast follow of 6.5, and that’s largely because a huge part of our customer base and a very large part of the HCI customer base is still at VMware. [00:28:30] That’s not the limit of what we support, and this is a really important detail, our HCI storage can be externally mounted by OpenStack, through Cinder, on KVM, Hyper-V, Xen, or even another ESX environment can also mount our storage, so it’s not a proprietary closed storage system. So, we are deploying and automatically set up VMware, but we do support far more than that. So, for that user, what we’ve done is we’ve greatly [00:29:00] expanded our vCenter plugin and you can do 100% of the operations that you’re going to need to on a day to day basis inside of vSphere, so that’s really convenient.

We talked to a lot of our early customers and prospects on this, got some feedback from our strategic partners, and if you’re running a VMware environment and you have those tools set up and you have that automation, you’re not necessarily looking to change that all at once. We’re not saying it’s all in their vision to go to Hyper-V or [00:29:30] try some other hypervisor like OpenStack on KVM, but for the VMware environment, they want to stay within vCenter. Do you guys have any comments or feedback on that?

Glenn:                  No, I think it’s the only sensible approach in 2017. The vast majority of the market is still VMware, so, obviously, if you want to be able to address the largest portion of the problem, you need to service those customers, and what comes after that, that’ll be interesting, but we’ll all wait and see.

Gabe:                    [00:30:00] Yeah, but here’s the unfortunate thing. We’re going to get knocked a little bit about not having support for a vast amount of hypervisors, but we have to realize that we’re definitely a quality-based company and we listen to our customer base and what they want. We could build our own custom hypervisor, or do something completely proprietary. That’s not where the customer base is going. I kind of hinted our direction of, “we’re not going to close you in and lock you [00:30:30] in, we’re going to remain very open minded and focus on migrating workload and really getting into the data fabric on being able to migrate your data and manage your data versus lock you into a big long contract”. That’ll be more of our focus.

Inside of vCenter, you can integrate it into vCenter alarms, so, typically, administrators will have at least a few vCenter alarms set, so when they’re in there they’ll know what to be alerted on. All of our system reports in through [00:31:00] vCenter alarms, so you don’t have to go look to a third party site, or go log into something new, and if you already have a monitoring or network operations center that is integrated to vCenter alarms, it’s automatically integrated with NetApp HCI. So it’ll all just work.

Justin:                   The decision to use VMware, because like you said, they’re the largest, is good. I mean, you have to let the market drive that decision, especially for initial product rollout because if you don’t, you’re just basically playing guesswork, and I [00:31:30] don’t think that’s a good strategy to use for a product you’re trying to make successful. You want to base it on real data, real usage and real feedback from your customers as opposed to just trying to tell them what they’re going to want.

Gabe:                    One of the real important things we’re focusing on is the workload. We look at the market today, they’ve simply said, “I love the simplicity and the ease of use, and how well this HCI infrastructure works together, [00:32:00] but what happens when I try to migrate my database workload? I’ve heard that it can handle databases. I’ve also heard HCI is really good for VDI or end user computing, and I’ll do a little bit of virtualization. What about when I migrate that all into the same infrastructure? And not over provision?”. That, today, is where today’s HCI falls down. They simply can’t handle that.

When we talk to analysts, there’s tons of east west [00:32:30] traffic altercations, where you try to do basically, old school tiering which is labeled data locality, meaning I try to move it to the tier which is closest to the VM to get a local read. That’s not even SAN technology. That’s just attempted local storage at all time technology which is kind of stepping back in time. We’ve combined the guaranteed quality of service that you set up with a single click, with HCI so that you can confidently migrate your workloads and get all of the simplicity [00:33:00] onto NetApp HCI. That will yield great results for our customer base because you wouldn’t imagine how many people are saying, “hey, I know the answer is HCI, let’s go find a problem for it”. We can handle a lot of problems with that storage technology. Being able to give per tenant isolation and guaranteed performance.

Andrew:              A couple of questions for you. The primary interface for the administrator is going to be Vcenter, is that also the user [00:33:30] interface, and does that surface up things like hardware failures, I had a hardware drive fail on the storage system. Those types of things. Are they all through that one interface?

Gabe:                    Yes, everything will come through there, so you can get all the information on logging, current status, any faults within the system, and through vCenter alarms, that’s going to combine both from the ESX host side, from the ESX server side, as well as from anything going on from the NetApp [00:34:00] AFA that is powering that underneath there. AFA, for those of you who don’t know, is all-flash array.

Derek:                  That’ll be a pretty seamless transition for Vcenter administrators already because they’re familiar with the interface so they’ll be able to use it pretty seamlessly when they’re trying to integrate our stuff.

Gabe:                    Yeah, and when you’re trying to sell a guy new technology, that is definitely a win to say, “and it will work exactly like your other things, except you won’t be getting calls in the middle of the night for performance problems because we’ll handle and automate that for [00:34:30] you, but you can manage it the exact same way you manage your other infrastructure”. What we see the market wanting is to make the IT problems go away or make the IT department go away. “Don’t make me put in a ticket to make this change, just make me capable of making this change. Make it easy to understand and be informed on how and why I need to do this”. That’s really resonated well with our early partners and customers.

Glenn:                  Are we going to be using [00:35:00] VVols in order to assign those policies to the virtual machines?

Derek:                  Great question. So, a lot of customers will ask us, “do you rely on VVols? Is that required?”. The answer is no. You can do both. You can use datastores and VVols, or just VVols, or just datastores. We’re compatible with both technologies, and that really is up to your style of infrastructure management and what you want to do, so, if you use VVols today, you’re going to have storage policy based management on a [00:35:30] per VM basis, and one VM has three to four, typically, three or four, virtual volumes on our side, so that’s a dream. That’s almost too easy, it’s like giving an MLB player a tee and just saying, “swing”, and that’s exactly what we want.

With the datastore, we had a few more challenges, and we solved that by partnering with VMware, with Storage IO Control, so it assigns shares on a per-VM [00:36:00] basis. We take those shares, those get interpreted as min IOPS, so no matter where that VM goes, no matter where that datastore goes to, it always has the minimum amount of IOPS it has, and we’ll do a multiplier for max, and then Storage IO Control, it assumes that the underlying storage is consistent, always there and has the IOPS it expects and it can enforce the fairness. Without that, it can’t really enforce the fairness. We’re able to do this per tenant, guaranteed QoS, [00:36:30] or per application QoS either way. It’s good to have options and you can try both if you’re a new customer looking at which option’s best for you.
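
(To make the shares-to-IOPS mapping Derek describes concrete, here’s a toy sketch in Python. The episode doesn’t spell out the actual conversion NetApp HCI uses, so the constants below are invented purely for illustration: shares set a minimum IOPS floor, and a multiplier derives the maximum.)

    def qos_from_shares(shares, iops_per_share=1.0, max_multiplier=4.0):
        """Toy model only: map SIOC-style shares to a (min_iops, max_iops) pair.

        iops_per_share and max_multiplier are made-up constants, not
        NetApp HCI's real conversion.
        """
        min_iops = shares * iops_per_share      # guaranteed floor follows the VM
        max_iops = min_iops * max_multiplier    # ceiling derived via a multiplier
        return min_iops, max_iops

    # A VM granted 500 shares would get a 500 IOPS floor and a 2,000 IOPS ceiling
    print(qos_from_shares(500))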

Glenn:                  For sure man. I love the fact that we’re not forcing customers down a particular path. Would it be safe for me to assume that VVols are a strong preference and we would be looking for customers to try that?

Derek:                  I can’t say that with confidence because I’m not seeing the market evidence. I see the anecdotal [00:37:00] evidence of “wow, this sounds great, really meets our architecture needs”. I wish people would do it. We do have some very large customers that are using VVols, but I’m consistently surprised when I do hear some of our largest partners and customers saying, “you know what, it’s just not for us, we’re just not going to do it”. That could be two reasons. One, “this is the way I’m always used to doing it and I don’t see the advantages of changing”, and it’s a new [00:37:30] way, how many years have we been provisioning our applications on datastores? I don’t think people are just going to switch to VVols immediately.

There’s gonna be some fear there. Fear of not knowing whether this will work in my environment, and then the other thing is, especially in VDI, that’s just too many volumes. So, 10,000 VDI seats times, let’s say, an average of four volumes per VDI instance – that’s 40,000 volumes. No one wants that many volumes to manage [00:38:00] if you’re a VDI administrator. I’d much rather go back to Horizon and have, “hey, here’s my few dozen datastores and I know how to manage it, I know where it is”, so, for a couple reasons, I don’t know how you guys feel about those, but I’m just not seeing it. There’s still a strong push from VMware to make it happen.

Glenn:                  None of those, I would imagine, with our HCI offering, [00:38:30] none of those challenges are there, right? They’re not having to think about or manage the storage, that’s what the platform itself is taking care of, and the VVols provider is just going to abstract that away, so who cares? Yeah, 10,000 volumes, that’s a lot for a person to take care of, but computers don’t care how big the numbers get.

Derek:                  I think we’ll see that mindset come to light more and more with HCI. Because, typically the feedback that we’re hearing [00:39:00] is a storage guy knows that there’s going to be this many volumes and he knows that he’s not going to be in control of it, so as you see the next generation data center coming, I think that’ll be there, but largely people are just a little scared to change I think. I would love to see us be more successful with VVols on the market, because, again, we should want that. We have storage technologies that can take advantage of that like no one else on the market can.

Gabe:                    I think it kind of boils down to customer preference and their comfort level. [00:39:30] You go look and survey the market, there are not a lot of arrays today that really service VVols properly or very well, so I think that people have been kind of gun shy about it because it hasn’t gotten the same level of adoption, but if you look at VMware vSAN technology, you start to see a lot of adoption to it, and it’s built on the same premises of simplifying automation, storage based policy management, software-defined constructs, leveraging [00:40:00] the software infrastructure to provision and manage those things and abstract away the complexity of having to deal with 7,000 volumes.

You can go in there and create tiers, precious metal tiers, and you can align them any way you want and you can kind of automate that process. We saw a lot of that in the OpenStack world being now applied to the traditional VMware-based world where I look at it from the standpoint, as a former VM administrator myself, I would have loved to go in and set per-VM-level QoS to segment [00:40:30] performance between disparate virtual machines and make sure that there was no competition of resource, period.

I think a lot of people would love to have that capability, but they definitely need to test it out to see how it works in practice, and in theory, it sounds great, but in practice it may be a different thing. The beauty of our platform, because it’s based on the SolidFire technology, is we already have about six years worth of proof points around multi-tenancy, disparate workloads leveraging QoS technologies. Now we have about, with [00:41:00] the release of the VVol technology in the last software release, now we have customers at points where we can go back and say, “hey, here in an HCI solution, here’s the first one that actually provides fully functioning VVol support and on top of it we’re going to let you segment workload based on performance characteristics and then we’ll do 100% of it automated as it integrates with the common tools and practices in use today”. I think that’s a compelling argument for a lot of customers to start to take a serious look at it.

Justin:                   I look at it a lot like NFSv3 versus NFSv4. So, [00:41:30] v4’s been out for a while, but there isn’t a huge amount of adoption, and again, it goes back to Derek’s point, “why would I replace something that’s already working for me very well with something that isn’t necessarily compelling for me to move off of and hasn’t been vetted thoroughly by everyone else”. You have to wait for the fear of missing out to overcome the fear of being the first adopter.

Derek:                  Dude, I suffer from FOMO for sure, fear of missing out, and conventional wisdom tells me, go to VVols, but again, like I said, conventional wisdom [00:42:00] is often wrong where you just don’t see, you don’t see the benefits. I think that’s something we’ve been trying to work on with our partnership with VMware is really compelling our customers to understand the value and force them to have a reason to pivot. They need something, like a “my life will be better because…” statement for their individual business to see why it’s worth the effort. Also, migrating from the underlying storage technology, how many customers that [00:42:30] you guys have talked to has that always gone 100% perfectly when you changed something about storage?

Glenn:                  Yeah, I mean, any time you touch the storage, that’s the most terrifying thing you can do in IT every single time, but

Derek:                  Exactly

Glenn:                  I’m a little more bullish than you are. I feel perfectly confident saying, “you guys should try VVols, that should be your first swing. I get it, there is a lot of anxiety and fear associated with change, but [00:43:00] choosing to manually have to micromanage an aspect that has been fully automated is just silly in my opinion”.

Derek:                  Yeah. It is.

Glenn:                  In my professional opinion, you need to make those transitions, they need to be safe, you need to make sure that we test it and all that fun stuff, but with the SolidFire VVols provider, we don’t have fear there. It’s a great platform. I would go into it broadly.

Derek:                  The nice part about us is that we don’t have to make a choice, [00:43:30] we can support both options. It’s a really cool place to enter the market, especially if it’s not decided, A versus B. I think it’s really cool to give them both, run them simultaneously, pick one or the other. It really doesn’t matter to us because we can deliver a similar quality and a similar value proposition to our customer base.

Gabe:                    Fundamentally, if we look at this, to bring it back to what HCI is, it’s not a complex technology. Yes, the most complex [00:44:00] and hard to lick problems are in the storage layer for hyperconverged infrastructure, but it’s the most simple to manage and administer. It’s, how big, how fast, and who should access it? That’s it. Those are really the three points you should actually have to determine when you’re provisioning storage within a hyperconverged environment. The value is in the packaging – the simplification, rapid deployment and scalability of the solution as a common building block approach, and [00:44:30] then additionally to that, how you can scale the resources independently of each other. It’s moving away from just the pure storage talking points of it, it really is more of a comprehensive infrastructure stack solution that we’re providing, and while the storage is important and it’s definitely part of all the decisions you’re going to make, there are a whole bunch of additional decisions that need to be made on top of that.

Andrew:              So, if I’m an application guy, a consumer of HCI, how does this [00:45:00] look to me? When I define things like “I need X number of VMs with Y amount of CPU and RAM, and Z amount of storage with whatever policies,” how do I express my requirements, and how do I get to that end result after it’s been provisioned?

Derek:                  For us, we do have a vRO plugin coming out soon, so hopefully you automate that through one of those workflows. If not, it’s [00:45:30] kind of the same, modeled after what you would do in the GUI: you go and say “create VM,” select the storage policies (say you did gold, silver, bronze), say this one needs gold storage because it’s really fast storage, and you hit provision.

It fits into our goal of really hiding the complexity: you just create the VM and then select what you want in the drop-down box. We’ve also extended [00:46:00] our [inaudible 00:46:01] capabilities so that, if you’re not using VVols, you can define inline, “I want a min, max, and [inaudible 00:46:09] of what, and which size.” It’s really, really simple. There’s nothing to go through about what type of RAID groups; that’s gone, we don’t use that. What type of disk? We’ve separated the media and the performance; you just have a pool of performance and a pool of capacity. It’s really how you would [00:46:30] want it to be if you’re an administrator. You simply decide, “I need CPU, RAM, and storage of X with performance of Y,” and go. Hopefully that answered your question.

Andrew:              Yeah, thank you, I think it makes a lot of sense. In particular, as you said, using vRealize Orchestrator and the tools inside the VMware ecosystem allows teams that already have existing VMware automation in place to simply leverage it as it stands today. [00:47:00] We’re not doing anything fancy; we’re not layering a bunch of different requirements on top.

Derek:                  Yep. People want simple.

Gabe:                    Also, you don’t have a lot of the caveats associated with some of those first-gen technologies around data locality: how does it affect my DRS, and can I scale significantly past six or eight nodes? A lot of those early systems had a challenge with metadata and how to track all those informational [00:47:30] changes. We solved those problems quite some time ago. I think that’s one of the bigger advantages of bringing a very mature scale-out storage platform into hyperconverged infrastructure: it doesn’t have any of the caveats associated with some of those first-generation packaging exercises.

Derek:                  As we’re starting to close, one thing that we haven’t touched on is scaling. A lot of the customers that you talk to still buy in big blocks. I just talked to a huge financial services [00:48:00] firm that still has to buy that way, and these are $3 to $4 million apiece, so they’re spending a lot of money, but they’re buying compute when all they need is storage. They don’t thin-provision compute, right? Everyone gets full, dedicated compute, but they’re always running out of hardware for storage.

Obviously, as we approached this market, we designed this for independently scalable resources, one node at a time: you can buy one node of any type of storage, [00:48:30] one node of any type of compute. That speaks to the financial side of this: you don’t have to waste the money, and there’s no tax or overhead. What you’ve seen the competitors do is say, “oh, yeah, well, if you’re running a VMware environment, you need to deploy our own virtualized storage.” And it works; they’re trying to adjust. But they’re never going to be as strong as our architecture is at getting you the best [00:49:00] bang for your buck without having to use compute resources to run storage and carry that overhead.

Andrew:              Do the day-365 operations follow the same principles as the core SolidFire product, where I can add and remove nodes at will?

Derek:                  Yeah, you can. I think the only people who are not excited about that are the people within NetApp who have to keep track of where the nodes are and who has support where, because we’ve told them, “yeah, [00:49:30] we’ve had customers with two nine-node sites who needed a six-node cluster at site C, so they took three nodes from each cluster and sent them to site C.” Customers love that. They love scale up, scale down, because you not only get the protection of going in at any size, but if you overshot, you can move it. That’s fantastic. So, absolutely, we’re going to support that 100% with NetApp HCI.

Andrew:              Awesome. [00:50:00] Anything else we should touch on in this first episode? I’m sure we’ll get a lot of questions from the listeners and maybe even have to get you guys back on here.

Derek:                  No, I just think that we’re really excited to bring this out and get our feet wet in HCI. We know we’re not the first to market, but we’d like to say we’re coming to market correctly: really listening to our customer base, looking at the market, and seeing what it needs. We hope you guys are as excited about it as we are.

Andrew:              Is your slogan, “HCI, we come correct”?

Derek:                  [00:50:30] It should be, it should be. It’s a cool advantage to be able to learn from the market and from what’s already out there, and not have to spin the cycles of “release a feature, pivot or persevere.” It’s really cool to start with a really clear vision of what we should do. We have a good advantage here.

Justin:                   Gabe, got anything to add?

Gabe:                    Live long and prosper.

Justin:                   Okay. Works for me.

Gabe:                    [00:51:00] No, I’m just really excited about getting the chance to bring this product to market with the NetApp logo on it. I think it’s a long time coming. The NetApp organization has been chomping at the bit to get a product like this out to market, and we’re listening to our customers, we’ve surveyed the marketplace, we’ve seen what’s come before us, and we’ve made some purposeful design decisions that will allow us to bring this product to the core enterprise customer base.

Justin:                   All right, Gabe, thanks. [00:51:30] If we wanted to get in touch with you or Derek on social media, how would we go about doing that?

Gabe:                    For me, I’m on the twitters: I am @baconisking, and some of you may follow me. That’s the best way to reach me. Obviously LinkedIn, Ello, whatever; there are about 50 different ways to stalk people online right now, but the twitter is probably best.

Justin:                   Wait, did you just say Ello?

Gabe:                    Yeah.

Justin:                   Are you, like, one of the five users on that? That’s fantastic.

Gabe:                    No, I signed up for it and I still get the e-mail alerts but I’ve never been on it.

Justin:                   I do too, and I’ve never been there. Excellent. [00:52:00] We should add each other on Ello and that’ll be one of our three.

Gabe:                    Hello.

Justin:                   Derek? How do we contact you?

Derek:                  I’m mostly on AOL Instant Messenger these days, so that’s derekjleslie. Just kidding. It’s Twitter @derekjleslie, as well as LinkedIn, and I’m happy to extend the conversation and interact with you guys. Thanks so much for the time today. Thanks for having us.

Justin:                   All right. Thanks guys.

All right. That music tells me it’s time to go. If you’d like to get [00:52:30] in touch with us, send us an e-mail at podcast@netapp.com or send us a tweet @NetApp. As always, if you’d like to subscribe, find us on iTunes, SoundCloud, and Stitcher, or via techontappodcast.com. We also have a YouTube channel. If you liked the show today, give us a review. On behalf of the entire Tech ONTAP team, I’d like to thank Gabe Chapman and Derek Leslie for talking to us about NetApp HCI. As always, thanks for listening.

Andrew:              What do we put on YouTube?

Justin:                   I take the audio from the podcasts and put a Tech [00:53:00] ONTAP logo on it so people can listen without having to download it. Pretty nifty, huh? Somebody asked for that on the podcast.

Andrew:              I think it’s brilliant. Is it just me that’s [crosstalk 00:53:11]

Justin:                   We’ve also got transcripts coming, so if you want text transcripts, we’ll have those as well. We’re stepping into the 20th century here.

Andrew:              Fantastic.

Justin:                   What we don’t tell you is we have a little monkey that actually transcribes the entire episode.

Andrew:              Can I get the episode on vinyl?

Glenn:                  They’ll figure that out once they read it.

Justin:                   Yeah, [00:53:30] I know, right?

Behind the Scenes: Episode 88 – Migrating to ONTAP, FlexGroup volumes

Welcome to Episode 88, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”

group-4-2016

This week on the podcast, we invited Hadrian Baron of NetApp’s migration team to talk about moving from 7-Mode and competitor storage over to clustered ONTAP, as well as the advancements made in the simplicity and speed of moving there. We also discuss multiprotocol NAS challenges and FlexGroup volumes and their benefits.

Finding the Podcast

The podcast is all finished and up for listening. You can find it on iTunes or SoundCloud or by going to techontappodcast.com.

Also, if you don’t like using iTunes or SoundCloud, we just added the podcast to Stitcher.

http://www.stitcher.com/podcast/tech-ontap-podcast?refid=stpr

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

You can listen here:

You can also now find us on YouTube. (The uploads are sporadic and we don’t go back prior to Episode 85):

Behind the Scenes: Episode 87 – The NetApp A-Team visits RTP!

Welcome to Episode 87, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”

group-4-2016

This week on the podcast, we recap the annual NetApp A-Team meeting with NetApp product owners and executives on campus. This meeting is called the ETL (Extract, Transform, Load). In this meeting, NetApp leadership presents to the A-Team about roadmap items and the A-Team gives their candid feedback to leadership.

The A-Team has grown quite a bit since they started – this year, we had nearly 30 people onsite!

DSC_4444

We did a podcast with them and I tried to get each of them on. If you’ve ever seen the podcast studio, it’s quite literally a storage closet, so fitting 10-15 sweaty IT dudes in there at a time was… interesting.

The end result was a lengthy (nearly 2 hours!) – but great – recap of the 2-3 day event.

If you’re interested in joining the NetApp A-Team, have a look at NetApp United!

http://community.netapp.com/t5/Technology/NetApp-United-2017-Officially-Launches/ba-p/129957

Official NetApp blog:

https://newsroom.netapp.com/blogs/netapp-a-team-visits-rtp/

Finding the Podcast

The podcast is all finished and up for listening. You can find it on iTunes or SoundCloud or by going to techontappodcast.com.

Also, if you don’t like using iTunes or SoundCloud, we just added the podcast to Stitcher.

http://www.stitcher.com/podcast/tech-ontap-podcast?refid=stpr

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

You can listen here:

You can also now find us on YouTube. (The uploads are sporadic and we don’t go back prior to Episode 85):

NFS Kerberos in ONTAP Primer

Fun fact!

Kerberos was named after Cerberus, the hound of Hades, which protected the gates of the underworld with its three heads of gnashing teeth.

cerberos

Kerberos in IT security isn’t a whole lot different; it’s pretty effective at stopping intruders and is literally a three-headed monster.

In my day to day role as a Technical Marketing Engineer for NFS, I find that one of the most challenging questions I get is regarding NFS mounts using Kerberos. This is especially true now, as IT organizations are focusing more and more on securing their data and Kerberos is one way to do that. CIFS/SMB already does a nice job of this and it’s pretty easily integrated without having to do a ton on the client or storage side.

With NFS Kerberos, however, there are a ton of moving parts and not a ton of expertise that spans those moving parts. Think for a moment what all is involved here when dealing with ONTAP:

  • DNS
  • KDC server (Key Distribution Center)
  • Client/principal
  • NFS server/principal
  • ONTAP
  • NFS
  • LDAP/name services

This blog post isn’t designed to walk you through all those moving parts; that’s what TR-4073 was written for. Instead, this blog is going to simply walk through the workflow of what happens during an NFS mount using Kerberos and where things can fail/common failure scenarios. This post will focus on Active Directory KDCs, since that’s what I see most and get the most questions on. Other UNIX-based KDCs are either not as widely used, or the admins running them are ninjas that never need any help. 🙂

Common terms

First, let’s cover a few common terms used in NFS Kerberos.

Storage Virtual Machine (SVM)

This is what clustered ONTAP uses to present NAS and SAN storage to clients. SVMs act as tenants within a cluster. Think of them as “virtualized storage blades.”

Key Distribution Center (KDC)

The Kerberos ticket headquarters. This stores all the passwords, objects, etc. for running Kerberos in an environment. In Active Directory, domain controllers are KDCs and replicate to other DCs in the environment, which makes Active Directory an ideal platform to run Kerberos on due to ease of use and familiarity. As a bonus, Active Directory is already primed with UNIX attributes for Identity Management with LDAP. (Note: Windows 2012 has UNIX attributes by default; prior to 2012, you had to manually extend the schema.)

Kerberos principals

Kerberos principals are objects within a KDC that can have tickets assigned. Users can own principals. Machine accounts can own principals. However, simply creating a user or machine account doesn’t mean you have created a principal. Those are stored within the object’s LDAP schema attributes in Active Directory. Generally speaking, it’s one of either:

  • servicePrincipalName (SPN)
  • userPrincipalName (UPN)

These get set when adding computers to a domain (including joining Linux clients), as well as when creating new users (every user gets a UPN). Principals include three different components.

  1. Primary – this defines the type of principal (usually a service such as ldap, nfs, host, etc.) and is followed by a “/”. Not all principals have a primary component; for example, most users are simply user@REALM.COM.
  2. Secondary – this defines the name of the principal (such as jimbob).
  3. Realm – this is the Kerberos realm, usually written in ALL CAPS, and is the name of the domain the principal was added to (such as CONTOSO.COM).

Keytabs

A keytab file allows a client or server participating in an NFS mount to generate AS (authentication service) ticket requests without typing a password. Think of this as the principal “logging in” to the KDC, similar to what you’d do with a username and password. Keytab files can make their way to clients one of two ways.

  1. Manually creating and copying the keytab file to the client (old school)
  2. Using the domain join tool of your choice (realmd, net ads/samba, adcli, etc.) on the client to automatically negotiate the keytab and machine principals on the KDC (recommended; a minimal example follows below)
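
For reference, a minimal join on a CentOS/RHEL 7 host looks something like the following. Treat it as a sketch: the package list varies by distro, and the domain and account names are just placeholders from my lab.

# yum install realmd sssd adcli oddjob oddjob-mkhomedir samba-common-tools krb5-workstation
# realm join -U Administrator NTAP.LOCAL
Password for Administrator:
# realm list

The join negotiates the machine account, its SPNs, and /etc/krb5.keytab for you.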

Keytab files, when created using the domain join tools, will create multiple entries for Kerberos principals. Generally, this will include a service principal name (SPN) for host/shortname@REALM.COM, host/fully.qualified.name@REALM.COM and a UPN for the machine account such as MACHINE$@REALM.COM. The auto-generated keytabs will also include multiple entries for each principal with different encryption types (enctypes). The following is an example of a CentOS 7 box’s keytab joined to an AD domain using realm join:

# klist -kte
Keytab name: FILE:/etc/krb5.keytab
KVNO Timestamp Principal
---- ------------------- ------------------------------------------------------
 3 05/15/2017 18:01:39 host/centos7.ntap.local@NTAP.LOCAL (des-cbc-crc)
 3 05/15/2017 18:01:39 host/centos7.ntap.local@NTAP.LOCAL (des-cbc-md5)
 3 05/15/2017 18:01:39 host/centos7.ntap.local@NTAP.LOCAL (aes128-cts-hmac-sha1-96)
 3 05/15/2017 18:01:39 host/centos7.ntap.local@NTAP.LOCAL (aes256-cts-hmac-sha1-96)
 3 05/15/2017 18:01:39 host/centos7.ntap.local@NTAP.LOCAL (arcfour-hmac)
 3 05/15/2017 18:01:39 host/CENTOS7@NTAP.LOCAL (des-cbc-crc)
 3 05/15/2017 18:01:39 host/CENTOS7@NTAP.LOCAL (des-cbc-md5)
 3 05/15/2017 18:01:39 host/CENTOS7@NTAP.LOCAL (aes128-cts-hmac-sha1-96)
 3 05/15/2017 18:01:39 host/CENTOS7@NTAP.LOCAL (aes256-cts-hmac-sha1-96)
 3 05/15/2017 18:01:39 host/CENTOS7@NTAP.LOCAL (arcfour-hmac)
 3 05/15/2017 18:01:39 CENTOS7$@NTAP.LOCAL (des-cbc-crc)
 3 05/15/2017 18:01:39 CENTOS7$@NTAP.LOCAL (des-cbc-md5)
 3 05/15/2017 18:01:39 CENTOS7$@NTAP.LOCAL (aes128-cts-hmac-sha1-96)
 3 05/15/2017 18:01:39 CENTOS7$@NTAP.LOCAL (aes256-cts-hmac-sha1-96)
 3 05/15/2017 18:01:39 CENTOS7$@NTAP.LOCAL (arcfour-hmac)

Encryption types (enctypes)

Encryption types (or enctypes) are the level of encryption used for the Kerberos conversation. The client and KDC will negotiate the level of enctype used. The client will tell the KDC “hey, I want to use this list of enctypes. Which do you support?” and the KDC will respond “I support these, in order of strongest to weakest. Try using the strongest first.” In the example above, this is the order of enctype strength, from strongest to weakest:

  • AES-256
  • AES-128
  • ARCFOUR-HMAC
  • DES-CBC-MD5
  • DES-CBC-CRC

The reason a keytab file would add weaker enctypes like DES or ARCFOUR is backwards compatibility; for example, Windows 2003 DCs don’t support AES enctypes. In some cases, the enctypes themselves can cause Kerberos issues due to lack of support: Windows 2008 and later don’t support DES unless you explicitly enable it, and ARCFOUR isn’t supported in clustered ONTAP for NFS Kerberos. In these cases, it’s good to modify the machine accounts to strictly define which enctypes to use for Kerberos.
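
For example, with the Active Directory PowerShell module you can restrict a machine account to AES only. This is a sketch using the CentOS machine account from the earlier keytab example; test it in a lab first, and remember that the client keytab may need to be regenerated afterward so its entries match.

PS C:\> Set-ADComputer CENTOS7$ -KerberosEncryptionType AES128,AES256
PS C:\> Get-ADComputer CENTOS7$ -Properties msDS-SupportedEncryptionTypes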

What you need before you try mounting

This is a quick list of things that have to be in place before you can expect Kerberos with NFS to work properly. If I left something out, feel free to remind me in the comments. There’s so much info involved that I occasionally forget some things. 🙂

KDC and client – The KDC is a given – in this case, Active Directory. The client would need to have some things installed/configured before you try to join it, including a valid DNS server configuration, Kerberos utilities, etc. This varies depending on client and would be too involved to get into here. Again, TR-4073 would be a good place to start.

DNS entries for all clients and servers participating in the NFS Kerberos operation – this includes forward and reverse (PTR) records for the clients and servers. The DNS friendly names *must* match the SPN names. If they don’t, then when you try to mount, the DNS lookup will find the name hostname1 and use it to look up the SPN host/hostname1. If the SPN was actually called nfs/hostname2, the Kerberos attempt will fail with “PRINCIPAL_UNKNOWN.” This is also true for Kerberos in CIFS/SMB environments. In ONTAP, a common mistake people make is naming the CIFS server or NFS Kerberos SPN after the SVM (such as SVM1) while the DNS names are something totally different (such as cifs.domain.com).
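
A quick sanity check from the client before you even attempt the mount is to confirm that forward and reverse lookups agree with your SPNs. The name and IP here are from my lab; substitute your own.

# nslookup demo.ntap.local
# nslookup 10.193.67.219

If the PTR record returns a different name than the one in your nfs/ or host/ SPN, fix DNS (or the SPN) before going any further.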

Valid Kerberos SPNs and UPNs – When you join a Linux client to a domain, the machine account and SPNs are automatically created. However, the UPN is not created. Having no UPN on a machine account can create issues with some Linux services that use Kerberos keytab files to authenticate. For example, RedHat’s LDAP service (SSSD) can fail to bind if using a Kerberos service principal in the configuration via the ldap_sasl_authid option. The error you’d see would be “PRINCIPAL_UNKNOWN” and would drive you batty because it would be using a principal you *know* exists in your environment. That’s because it’s trying to find the UPN, not the SPN. You can manage the SPN and UPN via the Active Directory attributes tab in the advanced features view. You can query whether SPNs exist via the setspn command (use /q to query by SPN name) in the CLI or PowerShell.

PS C:\> setspn /q host/centos7.ntap.local
Checking domain DC=NTAP,DC=local
CN=CENTOS7,CN=Computers,DC=NTAP,DC=local
 HOST/centos7.ntap.local
 HOST/CENTOS7

Existing SPN found!

You can view a user’s UPN and SPN with the following PowerShell command:

PS C:\> Get-ADUser student1 -Properties UserPrincipalName,ServicePrincipalName

DistinguishedName : CN=student1,CN=Users,DC=NTAP,DC=local
Enabled : True
GivenName : student1
Name : student1
ObjectClass : user
ObjectGUID : d5d5b526-bef8-46fa-967b-00ebc77e468d
SamAccountName : student1
SID : S-1-5-21-3552729481-4032800560-2279794651-1108
Surname :
UserPrincipalName : student1@NTAP.local

And a machine account’s with:

PS C:\> Get-ADComputer CENTOS7$ -Properties UserPrincipalName,ServicePrincipalName

DistinguishedName : CN=CENTOS7,CN=Computers,DC=NTAP,DC=local
DNSHostName : centos7.ntap.local
Enabled : True
Name : CENTOS7
ObjectClass : computer
ObjectGUID : 3a50009f-2b40-46ea-9014-3418b8d70bdb
SamAccountName : CENTOS7$
ServicePrincipalName : {HOST/centos7.ntap.local, HOST/CENTOS7}
SID : S-1-5-21-3552729481-4032800560-2279794651-1140
UserPrincipalName : HOST/centos7.ntap.local@NTAP.LOCAL

Network Time Protocol (NTP) – With Kerberos, there is a default 5-minute time skew window. If a client and the server/KDC are outside of that window, Kerberos requests will fail with “Access denied” and you’d see time skew errors in the cluster logs. This KB covers it nicely:

https://kb.netapp.com/support/s/article/ka11A0000001V1YQAU/Troubleshooting-Workflow-CIFS-Authentication-failures?language=en_US

A common issue I’ve seen with this is time zone differences or daylight savings issues. I’ve often seen the wall clock time look identical on server and client, but the time zones or month/date differ, causing the skew.

The NTP requirement is actually a “make sure your time is up to date and in sync on everything” requirement, but NTP makes that easier.
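
To rule out skew quickly, compare the client and cluster clocks side by side. These are standard CentOS 7 and ONTAP commands, but consider the sequence a sketch:

# timedatectl
# chronyc tracking
cluster::> cluster date show

Remember to compare time zones and dates, not just the wall clock time.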

Kerberos to UNIX name mappings – In ONTAP, we authenticate via name mappings not only for CIFS/SMB, but also for Kerberos. When a client attempts to send an authentication request to the cluster for an AS request or ST (service ticket) request, it has to map to a valid UNIX user. The UNIX user mapping will depend on what type of principal is coming in. If you don’t have a valid name mapping rule, you’d see something like this in the event log:

5/16/2017 10:24:23 ontap9-tme-8040-01
 ERROR secd.nfsAuth.problem: vserver (DEMO) General NFS authorization problem. Error: RPC accept GSS token procedure failed
 [ 8 ms] Acquired NFS service credential for logical interface 1034 (SPN='nfs/demo.ntap.local@NTAP.LOCAL').
 [ 11] GSS_S_COMPLETE: client = 'CENTOS7$@NTAP.LOCAL'
 [ 11] Trying to map SPN 'CENTOS7$@NTAP.LOCAL' to UNIX user 'CENTOS7$' using implicit mapping
 [ 12] Using a cached connection to oneway.ntap.local
**[ 14] FAILURE: User 'CENTOS7$' not found in UNIX authorization source LDAP.
 [ 15] Entry for user-name: CENTOS7$ not found in the current source: LDAP. Ignoring and trying next available source
 [ 15] Entry for user-name: CENTOS7$ not found in the current source: FILES. Entry for user-name: CENTOS7$ not found in any of the available sources
 [ 15] Unable to map SPN 'CENTOS7$@NTAP.LOCAL'
 [ 15] Unable to map Kerberos NFS user 'CENTOS7$@NTAP.LOCAL' to appropriate UNIX user

For service principals (SPNs) such as host/name or nfs/name, the mapping tries to default to the primary component (the part before the “/”), so you’d need a UNIX user named host or nfs on the local SVM or in a name service like LDAP. Otherwise, you can create static krb-unix name mappings in the SVM to map to whatever user you like. If you want to use wildcards, regex, etc., you can do that. For example, this name mapping rule will map all SPNs coming in as {MACHINE}$@REALM.COM to root.

cluster::*> vserver name-mapping show -vserver DEMO -direction krb-unix -position 1

Vserver: DEMO
 Direction: krb-unix
 Position: 1
 Pattern: (.+)\$@NTAP.LOCAL
 Replacement: root
IP Address with Subnet Mask: -
 Hostname: -

To test the mapping, use diag priv:

cluster::*> diag secd name-mapping show -node node1 -vserver DEMO -direction krb-unix -name CENTOS7$@NTAP.LOCAL

'CENTOS7$@NTAP.LOCAL' maps to 'root'

You can map the SPN to root, pcuser, etc. – as long as the UNIX user exists locally on the SVM or in the name service.
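
If you need to create the rule shown above from scratch, it’s a single command. The vserver and pattern are from my lab; adjust the regex to match your realm.

cluster::> vserver name-mapping create -vserver DEMO -direction krb-unix -position 1 -pattern (.+)\$@NTAP.LOCAL -replacement root

Then re-run the diag secd name-mapping show test above to confirm the mapping resolves the way you expect.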

The workflow

Now that I’ve gotten some basics out of the way (and if you find that I’ve missed some, add to the comments), let’s look at how the workflow for an NFS mount using Kerberos would work, end to end. This is assuming we’ve configured everything correctly and are ready to mount, and that all the export policy rules allow the client to mount NFSv4 and Kerberos. If a mount fails, always check your export policy rules first.

Some common export policy issues include:

  • The export policy doesn’t have any rules configured
  • The vserver/SVM root volume doesn’t allow read access in the export policy rule for traversal of the / mount point in the namespace
  • The export policy has rules, but they are either misconfigured (clientmatch is wrong, read access disallowed, NFS protocol or auth method is disallowed) or they aren’t allowing the client to access the mount (Run export-policy rule show -instance)
  • The wrong/unexpected export policy has been applied to the volume (Run volume show -fields policy)

What’s unfortunate about trying to troubleshoot mounts with NFS Kerberos involved is that, regardless of the failures happening, the client will report:

mount.nfs: access denied by server while mounting

It’s a generic error and isn’t really helpful in diagnosing the issue.
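
Because the client error is so generic, it helps to turn up client-side RPC logging while you reproduce the failure. This is a sketch; the debug output lands in the kernel log and is verbose, so turn it back off when you’re done.

# rpcdebug -m rpc -s auth call
# mount -v -o sec=krb5 demo:/flexvol /mnt
# journalctl -k | tail -50
# rpcdebug -m rpc -c auth call

A packet trace of the mount attempt (port 2049 to the cluster, plus port 88 to the KDC) is usually even more telling.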

In ONTAP, there is a command in admin privilege to check the export policy access for the client for troubleshooting purposes. Be sure to use it to rule out export issues.

cluster::> export-policy check-access -vserver DEMO -volume flexvol -client-ip 10.193.67.225 -authentication-method krb5 -protocol nfs4 -access-type read-write
 Policy Policy Rule
Path Policy Owner Owner Type Index Access
----------------------------- ---------- --------- ---------- ------ ----------
/ root vsroot volume 1 read
/flexvol default flexvol volume 1 read-write
2 entries were displayed.

The mount command is issued.

In my case, I use NFSv4.x, as that’s the security standard. Mounting without specifying a version will default to the highest NFS version allowed by the client and server, via a client-server negotiation. If NFSv4.x is disabled on the server, the client will fall back to NFSv3.

# mount -o sec=krb5 demo:/flexvol /mnt

Once the mount command gets issued and Kerberos is specified, a few (ok, a lot of) things happen in the background.

While this stuff happens, the mount command will appear to “hang” as the client, KDC and server suss out if you’re going to be allowed access.

  • DNS lookups are done for the client hostname and server hostname (or reverse lookup of the IP address) to help determine what names are going to be used. Additionally, SRV lookups are done for the LDAP service and Kerberos services in the domain. DNS lookups are happening constantly through this process.
  • The client uses its keytab file to send an authentication service request (AS-REQ) to the KDC, along with what enctypes it has available. The KDC then verifies if the requested principal actually exists in the KDC and if the enctypes are supported.
  • If the enctypes are not supported, if the principal does not exist, or if there are DUPLICATE principals, the AS-REQ fails. If the principal exists (and is unique), the KDC will send a successful reply.
  • Then the client will send a Ticket Granting Service request (TGS-REQ) to the KDC for the NFS service principal named nfs/name. The name portion of the ticket is generated either from what was typed into the mount command (i.e., demo) or via reverse lookup (if we typed an IP address into the mount). The TGS-REQ is what allows us to obtain a service ticket (ST). The TGS exchange also negotiates supported enctypes; if the TGS-REQ between the KDC and client negotiates an enctype that ONTAP doesn’t support (for example, ARCFOUR), then the mount will fail later in the process. (See the sketch after this walkthrough for a way to exercise these exchanges manually.)
  • If the TGS-REQ succeeds, a TGS-REP is sent. If the KDC doesn’t support the requested enctypes from the client, we fail here. If the NFS principal doesn’t exist (remember, it has to be in DNS and match exactly), then we fail.
  • Once the TGS is acquired by the NFS client, it presents the ticket to the NFS server in ONTAP via an NFS NULL call. The ticket information includes the NFS service SPN and the enctype used. If the NFS SPN doesn’t match what’s in “kerberos interface show,” the mount fails. If the enctype presented by the client isn’t supported or is disallowed in “permitted enctypes” on the NFS server, the request fails. The client would show “access denied.”
  • The NFS service SPN sent by the client is presented to ONTAP. This is where the krb-unix mapping takes place. ONTAP will first see if a user named “nfs” exists in local files or name services (such as LDAP, where a bind to the LDAP server and lookup takes place). If the user doesn’t exist, it will then check to see if any krb-unix name mapping rules were set explicitly. If no rules exist and mapping fails, ONTAP logs an error on the cluster and the mount fails with “Access denied.” If the mapping works, the mount procedure moves on to the next step.
  • After the NFS service ticket is verified, the client will send SETCLIENTID calls and then the NFSv4.x mount compound call (PUTROOTFH | GETATTR). The client and server are also negotiating the name@domainID string to make sure they match on both sides as part of NFSv4.x security.
  • Then, the client will try to run a series of GETATTR calls to “/” in the path. If we didn’t allow “read” access in the policy rule for “/” (the vsroot volume), we fail. If the ACLs/mode bits on the vsroot volume don’t allow at least traverse permissions, we fail. In a packet trace, we can see that the vsroot volume has only traverse permissions:
    V4 Reply (Call In 268) ACCESS, [Access Denied: RD MD XT], [Allowed: LU DL]

    We can also see that from the cluster CLI (“Everyone” only has “Execute” permissions in this NTFS security style volume):

    cluster::> vserver security file-directory show -vserver DEMO -path / -expand-mask true
    
    Vserver: DEMO
     File Path: /
     File Inode Number: 64
     Security Style: ntfs
     Effective Style: ntfs
     DOS Attributes: 10
     DOS Attributes in Text: ----D---
    Expanded Dos Attributes: 0x10
     ...0 .... .... .... = Offline
     .... ..0. .... .... = Sparse
     .... .... 0... .... = Normal
     .... .... ..0. .... = Archive
     .... .... ...1 .... = Directory
     .... .... .... .0.. = System
     .... .... .... ..0. = Hidden
     .... .... .... ...0 = Read Only
     UNIX User Id: 0
     UNIX Group Id: 0
     UNIX Mode Bits: 777
     UNIX Mode Bits in Text: rwxrwxrwx
     ACLs: NTFS Security Descriptor
     Control:0x9504
    
    1... .... .... .... = Self Relative
     .0.. .... .... .... = RM Control Valid
     ..0. .... .... .... = SACL Protected
     ...1 .... .... .... = DACL Protected
     .... 0... .... .... = SACL Inherited
     .... .1.. .... .... = DACL Inherited
     .... ..0. .... .... = SACL Inherit Required
     .... ...1 .... .... = DACL Inherit Required
     .... .... ..0. .... = SACL Defaulted
     .... .... ...0 .... = SACL Present
     .... .... .... 0... = DACL Defaulted
     .... .... .... .1.. = DACL Present
     .... .... .... ..0. = Group Defaulted
     .... .... .... ...0 = Owner Defaulted
    
    Owner:BUILTIN\Administrators
     Group:BUILTIN\Administrators
     DACL - ACEs
     ALLOW-NTAP\Domain Admins-0x1f01ff-OI|CI
     0... .... .... .... .... .... .... .... = Generic Read
     .0.. .... .... .... .... .... .... .... = Generic Write
     ..0. .... .... .... .... .... .... .... = Generic Execute
     ...0 .... .... .... .... .... .... .... = Generic All
     .... ...0 .... .... .... .... .... .... = System Security
     .... .... ...1 .... .... .... .... .... = Synchronize
     .... .... .... 1... .... .... .... .... = Write Owner
     .... .... .... .1.. .... .... .... .... = Write DAC
     .... .... .... ..1. .... .... .... .... = Read Control
     .... .... .... ...1 .... .... .... .... = Delete
     .... .... .... .... .... ...1 .... .... = Write Attributes
     .... .... .... .... .... .... 1... .... = Read Attributes
     .... .... .... .... .... .... .1.. .... = Delete Child
     .... .... .... .... .... .... ..1. .... = Execute
     .... .... .... .... .... .... ...1 .... = Write EA
     .... .... .... .... .... .... .... 1... = Read EA
     .... .... .... .... .... .... .... .1.. = Append
     .... .... .... .... .... .... .... ..1. = Write
     .... .... .... .... .... .... .... ...1 = Read
    
    ALLOW-Everyone-0x100020-OI|CI
     0... .... .... .... .... .... .... .... = Generic Read
     .0.. .... .... .... .... .... .... .... = Generic Write
     ..0. .... .... .... .... .... .... .... = Generic Execute
     ...0 .... .... .... .... .... .... .... = Generic All
     .... ...0 .... .... .... .... .... .... = System Security
     .... .... ...1 .... .... .... .... .... = Synchronize
     .... .... .... 0... .... .... .... .... = Write Owner
     .... .... .... .0.. .... .... .... .... = Write DAC
     .... .... .... ..0. .... .... .... .... = Read Control
     .... .... .... ...0 .... .... .... .... = Delete
     .... .... .... .... .... ...0 .... .... = Write Attributes
     .... .... .... .... .... .... 0... .... = Read Attributes
     .... .... .... .... .... .... .0.. .... = Delete Child
     .... .... .... .... .... .... ..1. .... = Execute
     .... .... .... .... .... .... ...0 .... = Write EA
     .... .... .... .... .... .... .... 0... = Read EA
     .... .... .... .... .... .... .... .0.. = Append
     .... .... .... .... .... .... .... ..0. = Write
     .... .... .... .... .... .... .... ...0 = Read
  • If we have the appropriate permissions to traverse “/” then the NFS client attempts to find the file handle for the mount point via a LOOKUP call, using the file handle of vsroot in the path. It would look something like this:
    V4 Call (Reply In 271) LOOKUP DH: 0x92605bb8/flexvol
  • If the file handle exists, it gets returned to the client:
    fh.png
  • Then the client uses that file handle to run GETATTRs to see if it can access the mount:
    V4 Call (Reply In 275) GETATTR FH: 0x1f57355e

If all is clear, our mount succeeds!
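
You can also exercise the AS-REQ and TGS-REQ steps by hand from the client, which helps isolate KDC problems from ONTAP problems. The principals below are from my lab.

# kinit -kt /etc/krb5.keytab host/centos7.ntap.local@NTAP.LOCAL
# kvno nfs/demo.ntap.local@NTAP.LOCAL
# klist -e

If kinit fails, look at the client/KDC side (keytab, DNS, time skew); if kvno fails, focus on the NFS SPN; if both succeed but the mount still fails, look at ONTAP (krb-unix mapping, permitted enctypes, export policies).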

If you’re interested, the successful Kerberos mount traces can be found here:

https://github.com/whyistheinternetbroken/TR-4073-setup/blob/master/nfs-krb5-trace-from-client.pcap

https://github.com/whyistheinternetbroken/TR-4073-setup/blob/master/nfs-krb5-trace-from-kdc.pcapng

But we’re not done… now the user that wants to access the mount has to go through another ticket process. In my case, I used a user named “student1.” This is because a lot of the Kerberos/NFSv4.x requests I get are generated by universities interested in setting up multiprotocol-ready home directories.

When a user like student1 wants to get into a Kerberized NFS mount, they can’t just cd into it. That would look like this:

# su student1
sh-4.2$ cd /mnt
sh: cd: /mnt: Not a directory

Oh look… another useless error! If I were to take that error literally, I would think “that mount doesn’t even exist!” But, it does:

sh-4.2$ mount | grep mnt
demo:/flexvol on /mnt type nfs4 (rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=krb5,clientaddr=10.193.67.225,local_lock=none,addr=10.193.67.219)

What that error actually means is that the user requesting access hasn’t performed a Kerberos login (AS exchange), so there is no TGT (ticket-granting ticket) with which to request a service ticket for NFS (nfs/server-hostname). We can see that via the klist -e command.

sh-4.2$ klist -e
klist: Credentials cache keyring 'persistent:1301:1301' not found

Before you can get into a mount that is only allowing Kerberos access, you have to get a Kerberos ticket. On Linux, you can do that via the kinit command, which is akin to a Windows login.

sh-4.2$ kinit
Password for student1@NTAP.LOCAL:
sh-4.2$ klist -e
Ticket cache: KEYRING:persistent:1301:1301
Default principal: student1@NTAP.LOCAL

Valid starting Expires Service principal
05/16/2017 15:54:01 05/17/2017 01:54:01 krbtgt/NTAP.LOCAL@NTAP.LOCAL
 renew until 05/23/2017 15:53:58, Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96

Now that I have my ticket, I can cd into the mount. When I cd into a Kerberized NFS mount, the client makes TGS requests to the KDC (seen in the trace in packet 101) for the service ticket. If that process is successful, we get access:

sh-4.2$ cd /mnt
sh-4.2$ pwd
/mnt
sh-4.2$ ls
c0 c1 c2 c3 c4 c5 c6 c7 newfile2 newfile-nfs4
sh-4.2$ klist -e
Ticket cache: KEYRING:persistent:1301:1301
Default principal: student1@NTAP.LOCAL

Valid starting Expires Service principal
05/16/2017 15:55:32 05/17/2017 01:54:01 nfs/demo.ntap.local@NTAP.LOCAL
 renew until 05/23/2017 15:53:58, Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
05/16/2017 15:54:01 05/17/2017 01:54:01 krbtgt/NTAP.LOCAL@NTAP.LOCAL
 renew until 05/23/2017 15:53:58, Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96

Now we’re done. (at least until our tickets expire…)

 

Behind the Scenes: Episode 86 – Veeam 9.5 Update 2

Welcome to Episode 86, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”

group-4-2016

We wrap up “Release Week” with an episode on Veeam’s new release with Veeam Technical Evangelist/NetApp A-Team member Michael Cade! (@michaelcade1)

ga3dhof9

Find out what new goodness is in Veeam’s latest release and get a rundown of what Veeam actually is. For the official blog:

https://newsroom.netapp.com/blogs/tech-ontap-podcast-episode-86-veeam-9-5-update-2/

Finding the Podcast

The podcast is all finished and up for listening. You can find it on iTunes or SoundCloud or by going to techontappodcast.com.

Also, if you don’t like using iTunes or SoundCloud, we just added the podcast to Stitcher.

http://www.stitcher.com/podcast/tech-ontap-podcast?refid=stpr

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

You can listen here: