Behind the Scenes: Episode 63 – What is FabricPool?

Welcome to Episode 63, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”


This week, we invite NetApp Product Manager Arun Raman (pronounced like the noodles) on to discuss an upcoming ONTAP feature that adds another thread to the Data Fabric – FabricPool!

FabricPool was announced at NetApp Insight 2016 in Las Vegas, and there was even a demo:

There is also a blog on it here:

http://community.netapp.com/t5/Technology/FabricPool-Preview-Building-a-Bridge-from-the-All-Flash-Data-Center-to-the/ba-p/124388

This episode functions more as a sneak preview of the feature, so be sure to stay tuned for updates and an eventual deep dive. If you’re interested in trying the feature out in an Early Access Program, email araman@netapp.com.

Finding the Podcast

The podcast is all finished and up for listening. You can find it on iTunes or SoundCloud or by going to techontappodcast.com.

Also, if you don’t like using iTunes or SoundCloud, we just added the podcast to Stitcher.

http://www.stitcher.com/podcast/tech-ontap-podcast?refid=stpr

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

You can listen here:


Behind the Scenes: Episode 62 – ONTAP 9.1 Hardware Refresh

Welcome to Episode 62, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”


This week, we geek out on the new hardware announcements made at Insight 2016 in Las Vegas to coincide with the ONTAP 9.1 release. We chat with Flash TME Skip Shapiro and PM Mukesh Nigam as we learn about the latest and greatest updates in these beastly – yet densely compact – new hardware platforms.

We also welcomed a customer from the College Board, Deanna McNeill (@deannie) to discuss Insight and her experience with NetApp.


    vSphere 6.5: The NFS edition

    A while back, I wrote up a blog about the release of vSphere 6.0, with an NFS slant to it. Why? Because NFS!

    With VMworld 2016 Barcelona, vSphere 6.5 (where’d .1, .2, .3, .4 go?) has been announced, so I can discuss the NFS impact. If you’d like a more general look at the release, check out Cormac Hogan’s blog on it.

    Whither VMFS?

    Before I get into the new NFS feature/functionality of the release, let’s talk about the changes to VMFS. One of the reasons people use NFS with VMware (other than its awesomeness) is that VMFS had some… limitations.

With the announcement of VMFS-6, some of those limitations have been removed. For example, VMFS-6 includes something called UNMAP, which essentially acts as a garbage collector for unused space inside the VMFS datastore. This provides better space efficiency than previous iterations.
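If you're curious what that looks like operationally, here's a rough sketch from the ESXi command line – the datastore names are made up for illustration, and you should check the vSphere documentation for the exact options in your release:

# VMFS-5: space reclamation is a manual operation you run yourself
esxcli storage vmfs unmap -l my_vmfs5_datastore

# VMFS-6: reclamation runs automatically in the background; you can
# inspect the per-datastore behavior
esxcli storage vmfs reclaim config get -l my_vmfs6_datastore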

Additionally, VMware has added some performance enhancements to VMFS, so it may outperform NFS, especially over Fibre Channel.

Other than that, you still can’t shrink the datastore, it’s not that easy to expand, etc. So, these are minor improvements that shouldn’t impact NFS too terribly. People who love NFS will likely stay on NFS. People who love VMFS will be happy with the improvements. Life goes on…

    What’s new in vSphere 6.5 from the NFS perspective?

    In vSphere 6.0, NFS 4.1 support was added.

However, it was a pretty minimal stack – no pNFS, no delegations, no referrals, etc. They basically added session trunking/multipathing, which is cool – but it still left a lot to be desired. On the downside, that feature isn’t even supported in ONTAP yet. So close, yet so far…
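As a point of reference, mounting an NFSv4.1 datastore with session trunking from the ESXi CLI looks roughly like this – the IP addresses, export path and datastore name here are hypothetical:

# Two server IP addresses = two session trunks to the same datastore;
# ESXi balances traffic across both paths
esxcli storage nfs41 add -H 10.63.3.68,10.63.3.69 -s /vol/datastore1 -v nfs41_ds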

    In vSphere 6.5, the NFS 4.1 stack has been expanded a bit to include hardware acceleration for NFSv4.1. This is actually a pretty compelling addition, as it can help the overall NFSv4.1 performance of the datastore.

    NFSv4.1 also fully supports IPv6. Your level of excitement is solely based on how many people you think use IPv6 right now.

    Kerberos

Perhaps the most compelling NFS change in vSphere 6.5 is how we secure our mounts.

    In 6.0, Kerberos support was added, but you could only do DES. Blah.

    Now, Kerberos support in vSphere 6.5 includes:

    • AES-128
    • AES-256
    • REMOVAL of DES encryption
• Kerberos with integrity checking (krb5i – prevents “man in the middle” attacks)

    Now, while it’s pretty cool that they removed support for the insecure DES enctype, that *is* going to be a disruptive change for people using Kerberos. The machine account/principal will need to be destroyed and re-created, clients will need to re-mount, etc. But, it’s an improvement!
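To give a rough idea of what the end-to-end setup might look like with ONTAP on the backend, here's a sketch – the SVM, volume and host names are hypothetical, so treat this as illustrative rather than gospel:

# ONTAP: limit the NFS Kerberos enctypes to AES (no more DES)
cluster::> vserver nfs modify -vserver svm1 -permitted-enc-types aes-128,aes-256

# ESXi: mount the NFSv4.1 datastore with Kerberos integrity checking (krb5i)
esxcli storage nfs41 add -H nfs.krb.domain.com -s /vol/krbvol -v krb5i_ds -a SEC_KRB5I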

    How vSphere 6.5 personally impacts me

The downside of these changes is that I have to adjust my Insight presentation a bit. If you’re going to Insight in Berlin, check out 60831-2: How Customers and Partners use NFS for Virtualization.

    Still looking forward to pNFS in vSphere, though…

    Behind the Scenes: Episode 61 – Security and Storage

Welcome to Episode 61, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”


This week on the podcast, we discuss security in storage systems with the new security TME Andrae Middleton and NetApp A-Team member Jarett Kulm (@JK47theweapon) of High Availability, Inc. We cover security at rest and in flight, security methodologies, ransomware and much more!

    Also be sure to check out our podcast on NetApp Volume Encryption.


    Behind the Scenes: Tech ONTAP Podcast Live!

    During NetApp Insight 2016 in Las Vegas, Andrew and I had the opportunity to do a podcast live, onstage, during the Tech Team Forum. We invited one of the NetApp A-Team members, Glenn Dekhayser (@gdekhayser) of Red8, to talk DevOps.

We had a short list of questions prepared beforehand, but didn’t rehearse anything until the day before. And even then, it was only a partial rehearsal – they ran out of time to complete it. So, we pretty much ad libbed the whole thing, which is the Tech ONTAP podcast way!

    We knew Glenn was going to be great, as was Andrew. I was the wild card. The guest beard made its triumphant public appearance.

    The story of the guest beard

    Each one of us on the podcast has a beard.


    Well, I have one during the fall and winter. Screw summer beards. However, many of our guests do not. So, we decided we needed a beard for one and all. I went to Amazon and found the rattiest beard I could find. Mission accomplished. Now, it’s the unofficial mascot of the podcast.

    The decision to wear it onstage, in front of co-workers and partners, as well as execs, was kind of a last minute audible. I’m pretty sure it went over like a lead balloon. No one expects a guest beard.

    If you’re interested, here’s the podcast on stage, in its entirety. Let us know what you think!

    Also, be sure to check out techontappodcast.com for more of our episodes!

    ONTAP 9.1 RC1 is now available!

    For info about ONTAP 9.0, see:

    ONTAP 9 RC1 is now available!

    ONTAP 9.0 is now generally available (GA)!

    While many of the features of ONTAP 9.1 were announced at Insight 2016 in Las Vegas, the official release of the software wasn’t scheduled until the first week of October, which was the week after the conference.

    For Insight Las Vegas highlights, see http://www.netapp-insight.com/las-vegas-highlights.

    Get used to more features being released for ONTAP in the coming years. We’ve sped up the release cycle to get more cool stuff out faster!

But now, ONTAP 9.1 RC1 is available!

    That’s right – the next major release of ONTAP is now available. If you have concerns over the “RC” designation, allow me to recap what I mentioned in a previous blog post:

RC versions have completed a rigorous set of internal NetApp tests and are deemed ready for public consumption. Each release candidate provides bug fixes that eventually lead up to the GA edition. Keep in mind that all release candidates are fully supported by NetApp, even if there is a GA version available. However, while RC is perfectly fine to run in production environments, GA is the recommended version of any ONTAP software release.

    For a more official take on it, see the NetApp link:

    http://mysupport.netapp.com/NOW/products/ontap_releasemodel/post70.shtml

    What’s new in ONTAP 9.1?

    At a high level, ONTAP 9.1 brings:

    If you have questions about any of the above, leave a comment and I’ll address them in a future blog post.

    Happy upgrading!


    NetApp FlexGroup: An evolution of NAS


    Check out the official NetApp version of this blog on the NetApp Newsroom!

    I’ve been the NFS TME at NetApp for 3 years now.

    I also cover name services (LDAP, NIS, DNS, etc.) and occasionally answer the stray CIFS/SMB question. I look at NAS as a data utility, not unlike water or electricity in your home. You need it, you love it, but you don’t really think about it too much and it doesn’t really excite you.

    However, once I heard that NetApp was creating a brand new distributed file system that could evolve how NAS works, I jumped at the opportunity to be a TME for it. So, now, I am the Technical Marketing Engineer for NFS, Name Services and NetApp FlexGroup (and sometimes CIFS/SMB). How’s that for a job title?

    We covered NetApp FlexGroup in the NetApp Tech ONTAP Podcast the week of June 30, but I wanted to write up a blog post to expand upon the topic a little more.

Now that ONTAP 9.1 is available, it’s time to update the blog here.

    For the official Technical Report, check out TR-4557 – NetApp FlexGroup Technical Overview.

    For the best practice guide, see TR-4571 – NetApp FlexGroup Best Practices and Implementation Guide.

    Here are a couple videos I did at Insight:

    I also had a chance to chat with Enrico Signoretti at Insight:

    Data is growing.

It’s no secret… we’re leaving behind – some may say we’ve already left – the days when 100TB in a single volume was enough space to accommodate a single file system. Files are getting larger and datasets are increasing. For instance, think about the sheer amount of data needed to keep something like a photo or video repository running. Or a global GPS data structure. Or Electronic Design Automation environments designing the latest computer chipset. Or seismic analysis pinpointing oil and gas locations.

Environments like these require massive amounts of capacity – billions of files in some cases. Scale-out NAS storage devices are the best way to approach these use cases because of their flexibility, but it’s important to be able to scale the existing architecture in a simple and efficient manner.

    For a while, storage systems like ONTAP had a single construct to handle these workloads – the Flexible Volume (or, FlexVol).

    FlexVols are great, but…

For most use cases, FlexVols are perfect. They are large enough (up to 100TB) and can handle enough files (up to 2 billion). For NAS workloads, they can do just about anything. But you start to see issues with a FlexVol when the number of metadata operations in the file system increases. The FlexVol volume serializes these operations and won’t use all possible CPU threads for them. I think of it like a traffic jam caused by lane closures: when a lane is closed, everyone has to merge, causing slowdowns.


    When all lanes are open, traffic is free to move normally and concurrently.


    Additionally, because a FlexVol volume is tied directly to a physical aggregate and node, your NAS operations are also tied to that single aggregate or node. If you have a 10-node cluster, each with multiple aggregates, you might not be getting the most bang for your buck.

    That’s where NetApp FlexGroup comes in.

    FlexGroup has been designed to solve multiple issues in large-scale NAS workloads.

    • Capacity – Scales to multiple petabytes
    • High file counts – Hundreds of billions of files
    • Performance – parallelized operations in NAS workloads, across CPUs, nodes, aggregates and constituent member FlexVol volumes
    • Simplicity of deployment – Simple-to-use GUI in System Manager allows fast provisioning of massive capacity
    • Load balancing – Use all your cluster resources for a single namespace

With FlexGroup volumes, NAS workloads can now take advantage of every resource available in a cluster. Even with a single-node cluster, a FlexGroup can balance workloads across multiple FlexVol constituents and aggregates.

    How does a FlexGroup volume work at a high level?

FlexGroup volumes take the already awesome concept of a FlexVol volume and enhance it by stitching together multiple FlexVol member constituents into a single namespace that acts like a single FlexVol volume to clients and storage administrators.
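To make that concrete, here’s roughly what provisioning a FlexGroup looks like from the ONTAP 9.1 CLI – one command stitches members together across aggregates (the names and sizes here are hypothetical):

# 8 member FlexVols on each of 2 aggregates = 16 members,
# presented to clients as a single 800TB namespace
cluster::> volume create -vserver svm1 -volume fg1 -aggr-list aggr1,aggr2 -aggr-list-multiplier 8 -size 800TB -junction-path /fg1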

    A FlexGroup volume would roughly look like this from an ONTAP perspective:

[Diagram: a FlexGroup volume as a collection of member FlexVol volumes, from the ONTAP perspective]

    Files are not striped, but instead are placed systematically into individual FlexVol member volumes that work together under a single access point. This concept is very similar in function to a multiple FlexVol volume configuration, where volumes are junctioned together to simulate a large bucket.

[Diagram: multiple FlexVol volumes junctioned together under a single access point]

However, multiple-FlexVol configurations add complexity via junctions, export policies and manual decisions for volume placement across cluster nodes, as well as the need to redesign applications to point to a file system structure defined by the storage rather than by the application.

    To a NAS client, a FlexGroup volume would look like a single bucket of storage:

[Diagram: a FlexGroup volume as seen by a NAS client – a single bucket of storage]

When a client creates a file in a FlexGroup, ONTAP decides which member FlexVol volume is the best possible container for that write based on a number of factors, such as capacity across members, throughput and last-accessed times – basically, doing all the hard work for you. The idea is to keep the members as balanced as possible without hurting performance predictability at all – and, in fact, increasing performance for some workloads.

The creates can arrive on any node in the cluster. Once a request arrives at the cluster, if ONTAP chooses a member volume other than the one where the request landed, a hardlink is created within ONTAP (remote or local, depending on the request) and the create is passed on to the designated member volume. All of this is transparent to clients.

Reads and writes after a file is created operate much as they do in ONTAP FlexVols today; the system tells the client where the file lives and points the client at that particular member volume. As such, you see much better gains on initial file ingest than on reads and writes after files have already been placed.
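If you want to see how evenly things are being spread, the member constituents are visible from the cluster shell – this assumes the default member naming convention, and the names here are illustrative:

# Member volumes show up as fg1__0001, fg1__0002, and so on;
# used capacity should stay roughly even across them as files are ingested
cluster::> volume show -vserver svm1 -volume fg1__* -fields size,used,aggregate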

Why is this better?

    When NAS operations can be allocated across multiple FlexVol volumes, we don’t run into the issue of serialization in the system. Instead, we start spreading the workload across multiple file systems (FlexVol volumes) joined together (the FlexGroup volume). And unlike Infinite Volumes, there is no concept of a single FlexVol volume to handle metadata operations – every member volume in a FlexGroup volume is eligible to process metadata operations. As a result, FlexGroup volumes perform better than Infinite Volumes in most cases.

    What kind of performance boost are we potentially seeing?

In preliminary testing of a FlexGroup against a single FlexVol, we’ve seen up to 6x the performance – and that was with simple spinning SAS disk. This was the setup used:

    • Single FAS8080 node
    • SAS drives
    • 16 FlexVol member constituents
    • 2 aggregates
    • 8 members per aggregate

The workload used to test the FlexGroup was a software build using Git. In the graph below, we can see that operations such as checkout and clone show the biggest performance boosts, as they take far less time to run to completion on a FlexGroup than on a single FlexVol.

[Graph: Git build operation completion times – FlexGroup vs. single FlexVol]

Adding more nodes and members can improve performance, and adding AFF into the mix can help latency. Here’s a similar test comparison with an AFF system. This test also used Git, but compiled gcc instead of the Linux source code to give us more files.

[Graph: Git gcc-compile completion times on AFF – single FlexVol vs. junctioned FlexVols vs. FlexGroup]

In this case, we see similar performance between a single FlexVol and a FlexGroup. We do see slightly better performance with multiple junctioned FlexVols, but that approach adds complexity and doesn’t offer a true single namespace of more than 100TB.

We also did some recent AFF testing with a Git workload. This time, the compile was the gcc library rather than the Linux source, which gave us more files and folders to work with. The systems used were an AFF8080 (4 nodes) and an A700 (2 nodes).

[Graph: completion times – AFF8080 (4 nodes) vs. A700 (2 nodes)]

    Simple management

FlexGroup volumes allow storage administrators to deploy multiple petabytes of storage in a single container within a matter of seconds. This provides massive capacity, as well as performance gains similar to what you’d see with multiple junctioned FlexVol volumes. (FYI, a junction is essentially just mounting a FlexVol volume to another FlexVol volume.)
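For comparison, in the multiple-FlexVol approach each junction is a separate step, along the lines of the hypothetical command below – multiply that by every volume, node and export policy decision, and the simplicity argument makes itself:

# Junction vol2 underneath vol1 in the namespace (one of many such steps)
cluster::> volume mount -vserver svm1 -volume vol2 -junction-path /vol1/vol2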

In addition to that, there is compatibility out of the gate with OnCommand products. The OnCommand TME Yuvaraju B has created a video showing this, which you can see here:

    Snapshots

This section was added after the blog post was published, as per one of the blog comments. I simply forgot to mention it. 🙂

    In the first release of NetApp FlexGroup, we’ll have access to snapshot functionality. Essentially, this works the same as regular snapshots in ONTAP – it’s done at the FlexVol level and will capture a point in time of the filesystem and lock blocks into place with pointers. I cover general snapshot technology in the blog post Snapshots and Polaroids: Neither Last Forever.

    Because a FlexGroup is a collection of member FlexVols, we want to be sure snapshots are captured at the exact same time for filesystem consistency. As such, FlexGroup snapshots are coordinated by ONTAP to be taken at the same time. If a member FlexVol cannot take a snapshot for any reason, the FlexGroup snapshot fails and ONTAP cleans things up.
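Operationally, it looks just like a FlexVol snapshot – one command, with ONTAP handling the cross-member coordination behind the scenes (the names here are hypothetical):

# One command; ONTAP coordinates the snapshot across all member volumes
cluster::> volume snapshot create -vserver svm1 -volume fg1 -snapshot fg1_hourly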

    SnapMirror

    FlexGroup supports SnapMirror for disaster recovery. This currently replicates up to 32 member volumes per FlexGroup (100 total per cluster) to a DR site. SnapMirror will take a snapshot of all member volumes at once and then do a concurrent transfer of the members to the DR site.
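Here’s a rough sketch of what that might look like from the CLI, assuming the destination FlexGroup has already been created as a data protection (DP) volume with a matching member layout – the paths and names are hypothetical:

# Create and initialize the FlexGroup SnapMirror relationship
cluster::> snapmirror create -source-path svm1:fg1 -destination-path svm1_dr:fg1_dr -type XDP
cluster::> snapmirror initialize -destination-path svm1_dr:fg1_dr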

    Automatic Incremental Resiliency

Also included in the FlexGroup feature is a new mechanism that seeks out metadata inconsistencies and fixes them when a client requests access, in real time. No outages. No interruptions. The entire FlexGroup remains online while this happens, and clients don’t even notice when a repair takes place. In fact, no one would know if we didn’t trigger a pesky EMS message in ONTAP to ensure a storage administrator knows we fixed something. Pretty underrated new aspect of FlexGroup, if you ask me.

    How do you get NetApp FlexGroup?

NetApp FlexGroup is generally available in ONTAP 9.1. It can be used by anyone, but should only be used for the specific use cases covered in TR-4557. I also cover best practices in TR-4571.

    In ONTAP 9.1, FlexGroup supports:

    • NFSv3 and SMB 2.x/3.x (RC2 for SMB support; see TR-4571 for feature support)
    • Snapshots
    • SnapMirror
    • Thin Provisioning
    • User and group quota reporting
    • Storage efficiencies (inline deduplication, compression, compaction; post-process deduplication)
    • OnCommand Performance Manager and System Manager support
    • All-flash FAS (incidentally, the *only* all-flash array that currently supports this scale)
    • Sharing SVMs with FlexVols
    • Constituent volume moves

    To get more information, please email flexgroups-info@netapp.com.

    What other ONTAP 9 features enhance NetApp FlexGroup volumes?

    While FlexGroup as a feature is awesome on its own, there are also a number of ONTAP 9 features added that make a FlexGroup even more attractive, in my opinion.

I cover ONTAP 9 in “ONTAP 9 RC1 is now available!”, but the features I think benefit FlexGroup right out of the gate include:

    • 15 TB SSDs – once we support flash, these will be a perfect fit for FlexGroup
    • Per-aggregate CPs – never bottleneck a node on an over-used aggregate again
    • RAID Triple Erasure Coding (RAID-TEC) – triple parity to add extra protection to your large data sets

    Be sure to keep an eye out for more news and information regarding FlexGroup. If you have specific questions, I’ll answer them in the comments section (provided they’re not questions I’m not allowed to answer). 🙂

    If you missed the NetApp Insight session I did on FlexGroup volumes, you can find session 60411-2 here:

    https://www.brainshark.com/go/netapp-sell/insight-library.html?cf=12089#bsk-lightbox

    (Requires a login)

    Also, check out my blog on XCP, which I think would be a pretty natural fit for migration off existing NAS systems onto FlexGroup.