ONTAP 9.5 has been announced!

There are a few things in life that are certain… death, taxes and a new ONTAP release every 6 months!

ONTAP 9.5 was just officially announced at Insight 2018, and this blog will give you the technical breakdown of all the new goodness. We’ll have a new podcast up soon to cover it as well.

If you’re going to be at Insight (Las Vegas or Barcelona), or if you want to review sessions after the event, you can check out the following session:

1214-2 – What’s On Tap in the Next Major Release of NetApp ONTAP

What’s new?

Generally speaking, new stuff in ONTAP comes in the following forms:

  • New features
  • Enhanced features
  • Bug fixes

With the six-month cadence, features are often phased in: new features ship with stability as the top priority, and feature parity arrives in chunks across later releases. Bug fixes are a part of every ONTAP release.

So, let’s start with…

New Features

ONTAP 9.5 continues the emphasis on the “modern datacenter” with a slew of new features that help enable higher performance and better resiliency, as well as extending your storage stack beyond on-premises and into a true global architecture.

SnapMirror Synchronous

SnapMirror Synchronous adds the ability to replicate data – at a volume level – across a WAN connection (RTT <10ms, or a distance of roughly 150km) with zero Recovery Point Objective (RPO) and near-zero Recovery Time Objective (RTO). This helps address regulatory and industry mandates for synchronous replication.

[Diagram: SnapMirror Synchronous replication between two sites]

SnapMirror Synchronous will have two different modes available in the initial release.

Full Synchronous

This is the default mode and guarantees zero application data loss between sites by disallowing writes if the SnapMirror Synchronous replication fails for any reason. This provides the “zero RPO” guarantee.

Relaxed Synchronous

Alternatively, relaxed mode allows application writes to continue to a primary site if the SnapMirror Synchronous relationship fails. Once the relationship is able to resume, resync will automatically occur.

In the initial release of SnapMirror Synchronous, NFSv3, iSCSI and FCP will be supported. Licensing will be capacity-based, in addition to the base SnapMirror license.
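To give a sense of the workflow, here's a minimal CLI sketch (the SVM and volume names are hypothetical; the StrictSync policy corresponds to full synchronous mode and Sync to relaxed mode):

::> snapmirror create -source-path DEMO:vol_sync -destination-path DR:vol_sync_dst -policy StrictSync
::> snapmirror initialize -destination-path DR:vol_sync_dst
::> snapmirror show -destination-path DR:vol_sync_dst -fields policy,state,status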

FlexCache Volumes

One thing I’ve heard fairly often is “how can I serve NAS data across multiple sites while still honoring locking mechanisms?” Previously, the only way to accomplish this was with a third-party NAS lock orchestrator. Now, in ONTAP 9.5, FlexCache volumes let you share NAS data across multiple global sites with performance as if the data were local, providing a true global namespace for ONTAP.

[Diagram: FlexCache volumes caching an origin volume across clusters]

FlexCache volumes are sparsely populated volumes, hosted on the same cluster as the origin volume or on a different one, that accelerate data access. FlexCache volumes are built on FlexGroup volumes and can cache reads, writes and metadata. Writes currently use a write-around model, with locking orchestrated at the origin. FlexCache volumes can also offload mount points to help avoid hot spots. Initially, NFSv3 will be the only supported protocol, but future releases will add more data protocols.
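As a rough sketch of what provisioning a cache looks like (the SVM, volume and aggregate names here are made up; check the FlexCache documentation for the exact syntax):

::> volume flexcache create -vserver cache_svm -volume vol1_cache -origin-vserver origin_svm -origin-volume vol1 -aggr-list aggr1,aggr2 -size 100GB -junction-path /vol1_cache
::> volume flexcache show -vserver cache_svm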

BGP routing support

The networking stack in ONTAP gets a bit of a makeover in ONTAP 9.5 as well. Previously, a data LIF in ONTAP was hosted on a single physical port, which lived on a single physical node. Load balancing was done via layer 2 (L2) hashing, which wasn’t especially efficient: hash collisions would leave ports underutilized or even completely unused! When storage nodes have 40Gb and 100Gb ports, that’s an expensive waste of resources. Additionally, the L2 architecture meant that extra layer 3 (L3) switches had to be in place to distribute network traffic properly.

ONTAP 9.5 introduces support for L3 routing via the Border Gateway Protocol (BGP), which allows ONTAP to automatically load balance traffic based on routing metrics, rather than L2 hashes. Additionally, this allows data LIFs to become Virtual IPs (VIPs) that can live anywhere in the network, which adds better redundancy for IP failover events, and avoids inactive links. This also eliminates the need for L3 switching infrastructure, which reduces overall CapEx and OpEx networking costs.

[Diagram: BGP/VIP-based L3 load balancing]

By modernizing its networking stack, ONTAP 9.5 takes another step toward the modern datacenter.

Logical Space Accounting

ONTAP 9.4 introduced a way to report storage efficiency savings to storage administrators while masking those savings from users. For example, if a user has written 6TB of data to a 10TB volume and storage efficiencies have saved 2TB, ONTAP can report the full 6TB of used capacity back to the user rather than the 4TB physically consumed after savings. This gives storage administrators a way to charge back end users properly and helps prevent storage capacity overruns.

ONTAP 9.5 ups the game by integrating logical space accounting with quota enforcement: ONTAP not only displays the logical space used, but can also block new writes once a quota based on logical space has been reached.

[Diagram: logical space accounting and enforcement]
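In CLI terms, these are per-volume settings; as a hedged sketch (option names as I understand them, with a hypothetical volume), reporting arrived in ONTAP 9.4 and enforcement is the new piece in 9.5:

::> volume modify -vserver DEMO -volume vol1 -is-space-reporting-logical true
::> volume modify -vserver DEMO -volume vol1 -is-space-enforcement-logical true
::> volume show -vserver DEMO -volume vol1 -fields logical-used,logical-used-percent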

MAX Data

While this was announced a couple months ago, MAX Data officially makes its debut alongside ONTAP 9.5. This is a server-side software product that lives outside of ONTAP. We covered it on the Tech ONTAP Podcast in Episode 154.

MAX Data offers ultra-low latency (think sub-10-microsecond) and more operations per second through server-side, software-based memory acceleration that leverages persistent memory such as NVDIMMs and Optane memory as they become available. Based on the Plexistor technology that NetApp acquired last year, MAX Data also offers enterprise-class data resiliency with MAX Recovery technology for high availability and faster data recovery.

MAX Data can help accelerate database applications like Oracle, Cassandra, MongoDB and a variety of other Linux-based applications.

[Diagram: MAX Data architecture]

NetApp Data Availability Services (NDAS)

While not technically an ONTAP feature (though there are ONTAP elements, such as the NDAS proxy and the copy-to-cloud APIs), NetApp Data Availability Services is an integral part of the NetApp Data Fabric. It’s a cloud-resident orchestration app that simplifies hybrid cloud data protection workflows behind a single pane of glass. It also provides an intuitive search catalog for easy file and volume restores, and it leverages intelligent S3 object storage in AWS to lower the cost of backing up your ONTAP data. For more information, see https://www.netapp.com/ndas.

[Diagram: NetApp Data Availability Services]

Feature Enhancements

NVMe over FC – the industry’s only HA failover story for NVMe/FC namespaces, via Asymmetric Namespace Access (ANA), an NVMe standard that NetApp helped develop.

Storage efficiencies – up to 15% additional storage efficiency from compression improvements.

FlexGroup volumes – New functionality such as FabricPool support, quota enforcement and qtree statistics open up a whole new set of workloads that can leverage FlexGroup volumes, such as home directories.

SnapLock – adds Unified SnapMirror engine support, resync without data loss, clock synchronization in software-defined ONTAP and support for 1,023 snapshots.

MetroCluster (MCC) – ONTAP 9.5 adds support for SVM-DR and ONTAP Select with MetroCluster, increases the supported distance for MCC IP to 700km(!), and expands the platforms supported for use with MCC IP to the A300 and FAS8200 series.


Where can you find me at #NetAppInsight 2018?

NetApp Insight 2018 in Las Vegas is just a few weeks away and I’m currently in “get ready” mode. Between building sessions, updating docs for the upcoming ONTAP release and putting together podcast promos and episodes, I’ve been a busy little gopher.

For the general NetApp Insight preview, check it out here:

Behind the Scenes: Episode 158 – NetApp Insight 2018 Preview

Speaking of gophers, we’re unveiling a new shirt and sticker design this year, with a variation of the gopher, which I built on gopherize.me!

[Image: 2018 Tech ONTAP Podcast gopher sticker design]

The week before Insight, I’ll also be doing my annual “get in touch with nature before dealing with Las Vegas” side quest. The last 2 years, I did Zion and covered it here:

This year, I plan on driving from Sunnyvale through Yosemite, Sequoia and Death Valley over the course of a few days. Road trip!

[Image: planned road trip route through Yosemite, Sequoia and Death Valley]

I’ll be in Las Vegas on Sunday for a bootcamp on ONTAP AI, a topic I’ll be picking up on a more regular basis after Insight. It ties in nicely with my core competencies of NAS and FlexGroup volumes!

Insight 2018 – The Main Event

Starting Monday, I’ll be all over the place – booth, customer meetings and sessions. At this point, I only know what my session schedule looks like, so feel free to register for a spot in one of them!

Here are the sessions I’ll be presenting…

1214-2 – What’s On Tap in the Next Major Release of NetApp ONTAP

The next major release of NetApp ONTAP is upon us, and it is chock-full of data management goodness. Come and learn about the latest features in the new release, as well as what features came in the spring release, and how those features are enabling storage administrators to harness the power of their ONTAP systems.

Dates/times offered:

  • Monday Oct 22, 10AM PST (currently full – waitlist!)
  • Monday Oct 22, 3PM PST – Japanese translated session (119 seats remain!)
  • Tuesday Oct 23, 3:15PM PST (23 seats remain!)

1255-2 – FlexGroup: The Foundation of the Next-Generation NetApp Scale-Out NAS

Workloads are growing–both in capacity and performance needs. As data becomes the primary factor driving enterprise businesses, storage administrators need a simple and efficient way to store a lot of it in a single place. This session covers NetApp FlexGroup volumes and how their use expands beyond the high-file-count, large-capacity use cases and into others such as home directories, backup and archive, big data, and media/entertainment. Come learn how FlexGroup, the NetApp scale-out NAS solution, accelerates your enterprise file service performance, simplifies this traditional on-premises workload, and shifts it into the cloud-enabled paradigm.

Dates/times offered:

  • Monday Oct 22, 4:30PM PST (47 seats remain!)
  • Wednesday Oct 24, 2PM PST (75 seats remain!)

You can also find a bunch of other sessions in the Insight Event Catalog! If you’re a customer, you can also schedule 1:1 meetings through our EBC. Contact your sales rep for details. You can also tweet me @NFSDudeAbides or email me at whyistheinternetbroken@gmail.com if you just want to sync up or get free stickers!

See you at NetApp Insight 2018!

Docker + NFS + FlexGroup volumes = Magic!


A couple of years ago, I wrote up a blog on using NFS with Docker as I was tooling around with containers, in an attempt to wrap my head around them. Then, I never really touched them again and that blog got a bit… stale.

Why stale?

Well, in that blog, I had to create a bunch of kludgy hacks to get NFS to work with Docker, and honestly, it likely wasn’t even the best way to do it, given my lack of overall Docker knowledge. More recently, I wrote up a somewhat better way to Kerberize NFS mounts in Docker containers.

Luckily, NetApp developers realized that I’m not the only one who wants to use Docker without knowing all the ins and outs, and created a plugin that handles all the volume creation, removal and so on for you. You can then leverage the Docker volume options to mount via NFS. That plugin is named “Trident.”


Trident + NFS

Trident is an open source storage provisioner and orchestrator for the NetApp portfolio.

You can read more about it here:

https://netapp.io/2016/12/23/introducing-trident-dynamic-persistent-volume-provisioner-kubernetes/

You can also read about how we use it for AI/ML here:

https://www.theregister.co.uk/2018/08/03/netapp_a800_pure_airi_flashblade/

When you’re using the Trident plugin, you can create Docker-ready, NFS-exported volumes in ONTAP to provide storage to all of your containers, just by specifying the -v option in your “docker run” commands.

For example, here’s an NFS-exported volume created using the Trident plugin:

# docker volume create -d netapp --name=foo_justin
foo_justin
# docker volume ls
DRIVER          VOLUME NAME
netapp:latest   foo_justin

Here’s what shows up on the ONTAP system:

::*> vol show -vserver DEMO -volume netappdvp_foo_justin -fields policy
vserver volume               policy
------- -------------------- -------
DEMO    netappdvp_foo_justin default

Then, I can just start up the container using that volume:

# docker run --rm -it -v foo_justin:/foo alpine ash
/ # mount | grep justin
10.x.x.x:/netappdvp_foo_justin on /foo type nfs (rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.193.67.237,mountvers=3,mountport=635,mountproto=udp,local_lock=none,addr=10.x.x.x)

Having a centralized NFS storage volume for your containers opens up a vast number of use cases: every container can read and write to the same location across the network, backed by a high-performing storage system with data protection capabilities that ensure high availability and resiliency.

Customization of Volumes

With the Trident plugin, you can modify the config files to change attributes from the defaults, such as custom volume names, sizes, export policies and others. See the full list here:

http://netapp-trident.readthedocs.io/en/latest/docker/install/ndvp_ontap_config.html
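You can also override some of the defaults per volume at creation time with -o options; for example, something like this (the spaceReserve option name comes from the plugin docs, so treat this as an assumption to verify against your version):

# docker volume create -d netapp --name=db_justin -o size=500g -o spaceReserve=none
db_justin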

Trident + NFS + FlexGroup Volumes

Starting in Trident 18.07, a new Trident NAS driver was added that supports creation of FlexGroup volumes with Docker.

To switch drivers, edit the /etc/netappdvp/config.json file to use the FlexGroup driver:

{
    "version": 1,
    "storageDriverName": "ontap-nas-flexgroup",
    "managementLIF": "10.x.x.x",
    "dataLIF": "10.x.x.x",
    "svm": "DEMO",
    "username": "admin",
    "password": "********",
    "aggregate": "aggr1_node1"
}

Then, create your FlexGroup volume. It’s that simple!

A word of advice, though: the FlexGroup driver defaults to 1GB and creates 8 member volumes across your aggregates, which results in 128MB member volumes. That’s problematic for a couple of reasons:

  • FlexGroup volumes should have members that are no less than 100GB in size (as per TR-4571) – small members will affect performance due to member volumes doing more remote allocation than normal
  • Files that get written to the FlexGroup will fill up 128MB pretty fast, causing the FlexGroup to appear to be out of space.

You can fix this either by setting the config.json file to use larger sizes, or specifying the size up front in the Docker volume command. I’d recommend using the config file and overriding the defaults.

To set this in the config file, just specify “size” as a variable (the full list of options can be found here: https://netapp-trident.readthedocs.io/en/latest/kubernetes/operations/tasks/backends/ontap.html):

{
    "version": 1,
    "storageDriverName": "ontap-nas-flexgroup",
    "managementLIF": "10.0.0.1",
    "dataLIF": "10.0.0.2",
    "svm": "svm_nfs",
    "username": "vsadmin",
    "password": "secret",
    "defaults": {
        "size": "800G",
        "spaceReserve": "volume",
        "exportPolicy": "myk8scluster"
    }
}

Since the volumes default to thin provisioning, you shouldn’t worry too much about storage space unless you think your clients will actually fill up 800GB. If that’s the case, you can apply quotas to the volumes to limit how much space can be used. (For FlexGroup volumes, quota enforcement will be available in an upcoming release; FlexVols can use quota enforcement today.)
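For the FlexVol case, a quota sketch might look like the following (the policy name and disk limit are hypothetical): create a tree quota rule against the volume, then switch quotas on.

::> volume quota policy rule create -vserver DEMO -policy-name default -volume netappdvp_foo_justin -type tree -target "" -disk-limit 500GB
::> volume quota on -vserver DEMO -volume netappdvp_foo_justin

Alternatively, here’s what specifying the size up front in the Docker volume command looks like: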

# docker volume create -d netapp --name=foo_justin_fg -o size=1t
foo_justin_fg

And this is what the volume looks like in ONTAP:

::*> vol show -vserver DEMO -volume netappdvp_foo_justin* -fields policy,is-flexgroup,aggr-list,size,space-guarantee 
vserver volume                  aggr-list               size policy  space-guarantee is-flexgroup
------- ----------------------- ----------------------- ---- ------- --------------- ------------
DEMO    netappdvp_foo_justin_fg aggr1_node1,aggr1_node2 1TB  default none            true

Since the FlexGroup is 1TB in size, the member volumes will be 128GB, which fulfills the 100GB minimum. Future releases will enforce this without you having to worry about it.

::*> vol show -vserver DEMO -volume netappdvp_foo_justin_fg_* -fields aggr-list,size -sort-by aggr-list
vserver volume                        aggr-list   size
------- ----------------------------- ----------- -----
DEMO    netappdvp_foo_justin_fg__0001 aggr1_node1 128GB
DEMO    netappdvp_foo_justin_fg__0003 aggr1_node1 128GB
DEMO    netappdvp_foo_justin_fg__0005 aggr1_node1 128GB
DEMO    netappdvp_foo_justin_fg__0007 aggr1_node1 128GB
DEMO    netappdvp_foo_justin_fg__0002 aggr1_node2 128GB
DEMO    netappdvp_foo_justin_fg__0004 aggr1_node2 128GB
DEMO    netappdvp_foo_justin_fg__0006 aggr1_node2 128GB
DEMO    netappdvp_foo_justin_fg__0008 aggr1_node2 128GB
8 entries were displayed.

Practical uses for FlexGroups with containers

It’s cool that we *can* provision FlexGroup volumes with Trident for use with containers, but does that mean we should?

Well, consider this…

In an ONTAP cluster that uses FlexVol volumes for NFS storage presented to containers, I’m bound to a single node’s resources, per the design of a FlexVol. That means that even though I bought a four-node cluster, I can only use one node’s RAM, CPU, network, capacity and so on. If I have a use case where thousands of containers spin up at any given moment and attach themselves to an NFS volume, I might see performance bottlenecks under the increased load. In most cases, that’s fine – but if you could get more out of your storage, wouldn’t you want to?

[Diagram: containers mounting a single FlexVol on one node]

You could add layers of automation into the mix to add more FlexVols to the solution, but then you have new mount points/folders. And what if those containers all need to access the same data?

[Diagram: containers spread across multiple FlexVol mount points]

With a FlexGroup volume presented to those same Docker instances, the containers can now leverage all nodes in the cluster, use a single namespace and simplify the overall automation structure.

[Diagram: containers mounting a single FlexGroup volume across the cluster]

The benefits become even more evident when those containers are constantly writing new files to the NFS mount, such as in an Artificial Intelligence/Machine Learning use case. FlexGroups were designed to handle massive amounts of file creations and can provide 2-6x the performance over a FlexVol in use cases where we’re constantly creating new files.

Stay tuned for more information on how FlexGroups and Trident can bring even more capability to the table for AI/ML workloads. In the meantime, you can learn more about NetApp solutions for AI/ML here:

https://www.netapp.com/us/solutions/applications/ai-deep-learning.aspx

How to find average file size and largest file size using XCP

If you use NetApp ONTAP to host NAS shares (CIFS or NFS) and have too many files and folders to count, then you know how challenging it can be to gather file information about your environment quickly, efficiently and effectively.

This becomes doubly important when you are thinking of migrating NAS data from FlexVol volumes to FlexGroup volumes, because some up-front work is needed to size the capacity of the FlexGroup and its member volumes correctly. TR-4571 covers some of that in detail, but it basically says “know your average file size.” (Average file size is simply total used capacity divided by total file count; for example, 10TB spread across 50 million files works out to roughly 200KB per file.) It currently doesn’t tell you *how* to find that (though it will eventually). This blog attempts to fill that gap.

XCP

I’ve written previously about XCP here:

Generally speaking, it’s been to tout the data migration capabilities of the tool. But, in this case, I want to highlight the “xcp scan” capability.

XCP scan allows you to use multiple, parallel threads to analyze an unstructured NAS share much more quickly than you could with basic tools like rsync, du, etc.

The NFS version of XCP also allows you to output this scan to a file (HTML, XML, etc) to generate a report about the scanned data. It even does the math for you and finds the largest (max) file size and average file size!

[Screenshot: XCP report showing maximum and average file sizes]

The command I ran to get this information was:

# xcp scan -md5 -stats -html SERVER:/volume/path > filename.html

That’s it! XCP will scan and write to a file. You can also get info about the top five file consumers (by number and capacity) by owner, as well as get some nifty graphs. (Pro tip: Managers love graphs!)

[Screenshot: XCP report graphs of top file consumers]

What if I only have SMB/CIFS data?

Currently, XCP for SMB doesn’t support output to HTML files. But that doesn’t mean you can’t have fun, too!

You can stand up a VM using CentOS or whatever your favorite Linux distro is and use XCP for NFS to scan the data – provided the client has the necessary access and you can score an NFS license (even an eval). XCP scans are read-only, so you shouldn’t have issues running them.

Just keep in mind the following:

Shares that have traditionally been SMB/CIFS-only are likely NTFS security style. This means that the UNIX user you access the data as (for example, root) should map to a valid Windows user that has read access to the data. NFS clients that access NTFS security style volumes map to Windows users to figure out permissions. I cover that here:

Mixed perceptions with NetApp multiprotocol NAS access

You can check the volume security style in two ways:

  • CLI with the command
    ::> volume show -volume [volname] -fields security-style
  • OnCommand System Manager under the “Storage -> Qtrees” section (yea, yea… I know. Volumes != Qtrees)

[Screenshot: OnCommand System Manager Qtrees view showing security style]

To check whether the UNIX user you’re accessing the volume with over NFS maps to a valid and expected Windows user, use this CLI command from diag privilege:

::> set diag
::*> diag secd name-mapping show -node node1 -vserver DEMO -direction unix-win -name prof1

'prof1' maps to 'NTAP\prof1'

To see what Windows groups this user would be a member of (and thus which files and folders it would get access to), use this diag privilege command:

::*> diag secd authentication show-creds -node ontap9-tme-8040-01 -vserver DEMO -unix-user-name prof1

UNIX UID: prof1 <> Windows User: NTAP\prof1 (Windows Domain User)

GID: ProfGroup
 Supplementary GIDs:
 ProfGroup
 group1
 group2
 group3
 sharedgroup

Primary Group SID: NTAP\DomainUsers (Windows Domain group)

Windows Membership:
 NTAP\group2 (Windows Domain group)
 NTAP\DomainUsers (Windows Domain group)
 NTAP\sharedgroup (Windows Domain group)
 NTAP\group1 (Windows Domain group)
 NTAP\group3 (Windows Domain group)
 NTAP\ProfGroup (Windows Domain group)
 Service asserted identity (Windows Well known group)
 BUILTIN\Users (Windows Alias)
 User is also a member of Everyone, Authenticated Users, and Network Users

Privileges (0x2080):
 SeChangeNotifyPrivilege

If you want to run XCP as root and want it to have administrator level access, you can create a name mapping. This is what I have in my SVM:

::> vserver name-mapping show -vserver DEMO -direction unix-win

Vserver: DEMO
Direction: unix-win
Position Hostname         IP Address/Mask
-------- ---------------- ----------------
1        -                -
                          Pattern:     root
                          Replacement: DEMO\\administrator

To create a name mapping for root to map to administrator:

::> vserver name-mapping create -vserver DEMO -direction unix-win -position 1 -pattern root -replacement DEMO\\administrator

Keep in mind that backup software often has this level of rights to files and folders, and the XCP scan is read-only, so there shouldn’t be any issue. If you are worried about making root an administrator, create a new Windows user for it to map to (for example, DOMAIN\xcp), add it to the Backup Operators Windows group, and map root to that user instead, as sketched below.
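That alternative uses the same name-mapping syntax shown above; assuming a hypothetical DOMAIN\xcp user that’s already a member of Backup Operators, it would look something like this:

::> vserver name-mapping create -vserver DEMO -direction unix-win -position 1 -pattern root -replacement DOMAIN\\xcp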

In my lab, I ran a scan on a NTFS security style volume called “xcp_ntfs_src”:

::*> vserver security file-directory show -vserver DEMO -path /xcp_ntfs_src

Vserver: DEMO
 File Path: /xcp_ntfs_src
 File Inode Number: 64
 Security Style: ntfs
 Effective Style: ntfs
 DOS Attributes: 10
 DOS Attributes in Text: ----D---
Expanded Dos Attributes: -
 UNIX User Id: 0
 UNIX Group Id: 0
 UNIX Mode Bits: 777
 UNIX Mode Bits in Text: rwxrwxrwx
 ACLs: NTFS Security Descriptor
 Control:0x8014
 Owner:NTAP\prof1
 Group:BUILTIN\Administrators
 DACL - ACEs
 ALLOW-BUILTIN\Administrators-0x1f01ff-OI|CI
 ALLOW-DEMO\Administrator-0x1f01ff-OI|CI
 ALLOW-Everyone-0x100020-OI|CI
 ALLOW-NTAP\student1-0x120089-OI|CI
 ALLOW-NTAP\student2-0x120089-OI|CI

I used this command and nearly 600,000 objects were scanned in 25 seconds:

# xcp scan -md5 -stats -html 10.x.x.x:/xcp_ntfs_src > xcp-ntfs.html
XCP 1.3D1-8ae2672; (c) 2018 NetApp, Inc.; Licensed to Justin Parisi [NetApp Inc] until Tue Sep 4 13:23:07 2018

126,915 scanned, 85,900 summed, 43.8 MiB in (8.75 MiB/s), 14.5 MiB out (2.89 MiB/s), 5s
 260,140 scanned, 187,900 summed, 91.6 MiB in (9.50 MiB/s), 31.3 MiB out (3.34 MiB/s), 10s
 385,100 scanned, 303,900 summed, 140 MiB in (9.60 MiB/s), 49.9 MiB out (3.71 MiB/s), 15s
 516,070 scanned, 406,530 summed, 187 MiB in (9.45 MiB/s), 66.7 MiB out (3.36 MiB/s), 20s
Sending statistics...
 594,100 scanned, 495,000 summed, 220 MiB in (6.02 MiB/s), 80.5 MiB out (2.56 MiB/s), 25s
594,100 scanned, 495,000 summed, 220 MiB in (8.45 MiB/s), 80.5 MiB out (3.10 MiB/s), 25s.

This was the resulting report:

[Screenshot: XCP HTML report for the xcp_ntfs_src volume]

Happy scanning!

Behind the Scenes: Episode 145 – AI, Machine Learning and ONTAP with Santosh Rao

Welcome to Episode 145, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”


This week on the podcast, NetApp Senior Technical Director Santosh Rao (@santorao) joins us to talk about how NetApp and NVIDIA are partnering to enhance AI solutions with the DGX-1, ONTAP and FlexGroup volumes using NFS!

You can find more information in the following links:

Finding the Podcast

The podcast is all finished and up for listening. You can find it on iTunes or SoundCloud or by going to techontappodcast.com.

This week’s episode is here:

Also, if you don’t like using iTunes or SoundCloud, we just added the podcast to Stitcher.

http://www.stitcher.com/podcast/tech-ontap-podcast?refid=stpr

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

Our YouTube channel (episodes uploaded sporadically) is here:

New and updated FlexGroup Technical Reports now available for ONTAP 9.4!

ONTAP 9.4 is now available, so that means the TRs need to get a refresh.


Here’s what I’ve done for FlexGroup in ONTAP 9.4…

New Tech Report!

First, I moved the data protection section of the best practices TR (TR-4571) into its own dedicated backup and data protection TR, which can be found here:

TR-4678: Data Protection and Backup – FlexGroup volumes

Why? Well, that section is going to grow larger and larger as we add more data protection and backup functionality, so it made sense to proactively create a new one.

Updated TRs!

TR-4557 got an update covering mostly what’s new in ONTAP 9.4. That TR is a technical overview, intended to explain how FlexGroup volumes work. The new feature payload for FlexGroup volumes in ONTAP 9.4 included:

  • QoS minimums and Adaptive QoS
  • FPolicy and file audit
  • SnapDiff support

TR-4571 is the best practices TR and took the brunt of the updates. Aside from details about new features, I added:

  • More detailed information about high file count environments and directory structure
  • More information about maxdirsize limits
  • Information on effects of drive failures
  • Workarounds for lack of NFSv4.x ACL support
  • Member volume count considerations when dealing with small and large files
  • Considerations when deleting FlexGroup volumes (and the volume recovery queue)
  • Clarifications on requirements for available space in an aggregate
  • System Manager support updates

Most of these updates came from feedback and questions I received. If you have something you want to see added to the TRs, let me know!

Behind the Scenes: Episode 134 – The Active IQ Story: Building a Data Pipeline for Machine Learning

Welcome to Episode 134, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”


This week on the podcast, Active IQ Technical Director Shankar Pasupathy joins us to tell us how AutoSupport’s infrastructure and backend evolved into Active IQ’s multicloud data pipeline. Learn how NetApp is using big data analytics and machine learning on ONTAP to improve the overall customer experience.

Finding the Podcast

The podcast is all finished and up for listening. You can find it on iTunes or SoundCloud or by going to techontappodcast.com.

This week’s episode is here:

Also, if you don’t like using iTunes or SoundCloud, we just added the podcast to Stitcher.

http://www.stitcher.com/podcast/tech-ontap-podcast?refid=stpr

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

Our YouTube channel (episodes uploaded sporadically) is here:

FlexGroup Technical Reports Updated for ONTAP 9.3

[Diagram: FlexGroup volume]

The latest updates for NetApp FlexGroup volumes for ONTAP 9.3 are available in the following Technical Reports:

Check it out and comment if you have a question!

Also check out previous blogs on FlexGroup volumes:

NetApp FlexGroup: Crazy fast

Tech ONTAP Podcast: Now powered by NetApp FlexGroup volumes!

NetApp FlexGroup: An evolution of NAS

And the lightboard video:

NetApp FlexGroup: Crazy fast

This week, the SPEC SFS®2014_swbuild test results for NetApp FlexGroup volumes submitted for file services were approved and published.

TL;DR – NetApp was the cream of the crop.

You can find those results here:

http://spec.org/sfs2014/results/res2017q3/sfs2014-20170908-00021.html

The testing rig was as follows:

  • Four node FAS8200 cluster (not AFF)
  • 72 4TB 7200 RPM 12Gb SAS drives (per HA pair)
  • NFSv3
  • 20 IBM servers/clients
  • 10GbE network (four connections per HA pair)

Below is a graph that consolidates the results of multiple vendor SPEC SFS®2014_swbuild results. Notice the FlexGroup did more IOPS (around 260k) at a lower latency (sub 3ms):

[Graph: SPEC SFS2014_swbuild IOPS vs. latency across vendors]

In addition, NetApp had the best Overall Response Time (ORT) of the competition:

[Graph: Overall Response Time comparison]

And had the best MBps/throughput:

[Graph: throughput (MBps) comparison]

Full results here:

http://spec.org/sfs2014/results/sfs2014swbuild.html

For more information on the SPEC SFS®2014_swbuild test, see https://www.spec.org/sfs2014/.

Everything but the kitchen sink…

With a NetApp FlexGroup, the more clients and work you throw at it, the better it performs. An example of this is in TR-4571, with a two-node A700 running GIT workload testing. Note how increasing the job count only encourages the FlexGroup.

[Graphs: average IOPS and maximum MBps as GIT job count increases]

FlexGroup Resources

If you’re interested in learning more, see the following resources:

You can also email us at flexgroups-info@netapp.com.

Tech ONTAP Podcast: Now powered by NetApp FlexGroup volumes!

If you’re not aware, I co-host the Tech ONTAP Podcast. I’m also the TME for NetApp FlexGroup volumes. Inexplicably, we weren’t actually storing our podcast files on NetApp storage – instead, we were using the local Mac SSD, which was problematic for three reasons:

  1. It was eventually going to fill up.
  2. If it failed, bye bye files.
  3. It was close to impossible to access unless we were local to the Mac, for a variety of reasons.

So, it finally dawned on me that I had an AFF8040 in my lab, barely being used for anything except testing and TR writing.

At first, I was going to use a FlexVol, out of habit. But then I realized that a FlexGroup volume would provide a great place to write a bunch of 1-400MB files while leveraging all of my cluster’s resources. The whole process (creating the FlexGroup, googling autofs on the Mac, and setting up the NFS mount and Audio Hijack) took me all of maybe 30 minutes, most of which was the googling and autofs setup. Not bad!

The podcast setup

When we record the podcast, we use software called Audio Hijack. This allows us to pipe in sound from applications like WebEx and web browsers, as well as from the in-studio microphones, which all get converted to MP3. This is where the FlexGroup NFS mount comes in – we’ll be pointing Audio Hijack to the FlexGroup volume, where the MP3 files will stream in real time.

I also migrated all the existing data over to the FlexGroup for archival purposes. We do use OneDrive for podcast sharing and such, but I wanted an extra layer of centralized data access, and the NFS-mounted FlexGroup provides that. Setting Audio Hijack up to stream straight to the mount removes an extra step for me when processing the files. But before I could point the software at the NFS mount, I had to configure the Mac to automount the FlexGroup volume on boot.

Creating the FlexGroup volume

Normally, a FlexGroup volume is created with 8 member volumes per node on an AFF (as per best practice). However, my FlexGroup volume was only going to be around 5TB, which means 16 member volumes would be around 350-400GB each. That violates the other best practice of no less than 500GB per member volume, meant to avoid too much remote allocation. While my files weren’t going to be huge, I wanted to avoid issues as the volume filled, so I met in the middle: 8 member volumes total, 4 per node. To do that, you have to go to the CLI; System Manager doesn’t do that kind of customization yet. In particular, you need the -aggr-list and -aggr-list-multiplier options with volume create.

ontap9-tme-8040::*> vol create -vserver DEMO -volume TechONTAP -aggr-list aggr1_node1,aggr1_node2 -aggr-list-multiplier 4
ontap9-tme-8040::*> vol show -vserver DEMO -volume TechONTAP* -sort-by size -fields size,node
vserver volume          size  node
------- --------------- ----- ------------------
DEMO    TechONTAP__0001 640GB ontap9-tme-8040-01
DEMO    TechONTAP__0002 640GB ontap9-tme-8040-02
DEMO    TechONTAP__0003 640GB ontap9-tme-8040-01
DEMO    TechONTAP__0004 640GB ontap9-tme-8040-02
DEMO    TechONTAP__0005 640GB ontap9-tme-8040-01
DEMO    TechONTAP__0006 640GB ontap9-tme-8040-02
DEMO    TechONTAP__0007 640GB ontap9-tme-8040-01
DEMO    TechONTAP__0008 640GB ontap9-tme-8040-02
DEMO    TechONTAP       5TB   -

Automounting NFS on boot with a Mac

When you mount NFS on a Mac, the mount isn’t retained after you reboot. To get it to come back up, you have to configure the autofs service on the Mac. This is different from Linux, where you can simply edit the fstab file. The process is covered very well in this blog post (just be sure to read all the way down to avoid the issue he mentions at the end):

https://coderwall.com/p/fuoa-g/automounting-nfs-share-in-os-x-into-volumes

Here’s my configuration… I disabled “nobrowse” to prevent issues, in case Audio Hijack needed to be able to browse.

autofs.conf

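Mine was essentially the stock file; roughly this (macOS defaults, shown as an approximation rather than my exact setup):

AUTOMOUNT_TIMEOUT=3600
AUTOMOUNTD_MNTOPTS=nosuid,nodev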

auto_master file

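The key piece is the last line, which points at a custom auto_nfs map; the rest is the macOS stock file (note that “nobrowse” is left off the auto_nfs line):

+auto_master            # Use directory service
/net                    -hosts          -nobrowse,hidefromfinder,nosuid
/home                   auto_home       -nobrowse,hidefromfinder
/Network/Servers        -fstab
/-                      -static
/-                      auto_nfs        -nosuid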

auto_nfs

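The map itself is one line per mount; the local path, mount options and server address below are placeholders for my actual setup:

/mnt/TechONTAP -fstype=nfs,noowners,nolockd,resvport,hard,bg,intr,rw,tcp,nc nfs://10.x.x.x:/TechONTAP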

After that was set up, I copied over the existing 50-ish GB of data into the FlexGroup and cleaned up some space on the Mac.

ontap9-tme-8040::*> vol show -vserver DEMO -volume TechONTAP* -sort-by size -fields size,used
vserver volume          size  used
------- --------------- ----- -------
DEMO    TechONTAP__0001 640GB 5.69GB
DEMO    TechONTAP__0002 640GB 8.24GB
DEMO    TechONTAP__0003 640GB 5.56GB
DEMO    TechONTAP__0004 640GB 6.48GB
DEMO    TechONTAP__0005 640GB 6.42GB
DEMO    TechONTAP__0006 640GB 8.39GB
DEMO    TechONTAP__0007 640GB 6.25GB
DEMO    TechONTAP__0008 640GB 6.25GB
DEMO    TechONTAP       5TB   53.29GB
9 entries were displayed.

Then, I configured Audio Hijack to pump the recordings to the FlexGroup volume.

[Screenshot: Audio Hijack session recording to the FlexGroup NFS mount]

Then, we recorded a couple episodes, without an issue!

[Screenshot: podcast episodes recorded to the FlexGroup volume]

As you can see from this output, the FlexGroup volume is relatively evenly allocated:

ontap9-tme-8040::*> node run * flexgroup show TechONTAP
2 entries were acted on.

Node: ontap9-tme-8040-01
FlexGroup 0x80F03817
* next snapshot cleanup due in 2886 msec
* next refresh message due in 886 msec (last to member 0x80F0381F)
* spinnp version negotiated as 4.6, capability 0x3
* Ref count is 8

Idx Member L Used Avail Urgc Targ Probabilities D-Ingest Alloc F-Ingest Alloc
--- -------- - --------------- ---------- ---- ---- --------------------- --------- ----- --------- -----
 1 2044 L 1485146 0% 159376256 0% 12% [100% 100% 79% 79%] 0+ 0 0 0+ 0 0
 2 2045 R 2153941 1% 159376256 0% 12% [100% 100% 98% 98%] 0+ 0 0 0+ 0 0
 3 2046 L 1415120 0% 159339950 0% 12% [100% 100% 76% 76%] 0+ 0 0 0+ 0 0
 4 2047 R 1690392 1% 159376256 0% 12% [100% 100% 98% 98%] 0+ 0 0 0+ 0 0
 5 2048 L 1675583 1% 159376256 0% 12% [100% 100% 98% 98%] 0+ 0 0 0+ 0 0
 6 2049 R 2191360 1% 159376256 0% 12% [100% 100% 98% 98%] 0+ 0 0 0+ 0 0
 7 2050 L 1630946 1% 159376256 0% 12% [100% 100% 87% 87%] 0+ 0 0 0+ 0 0
 8 2051 R 1631429 1% 159376256 0% 12% [100% 100% 87% 87%] 0+ 0 0 0+ 0 0

Node: ontap9-tme-8040-02
FlexGroup 0x80F03817
* next snapshot cleanup due in 3144 msec
* next refresh message due in 144 msec (last to member 0x80F03818)
* spinnp version negotiated as 4.6, capability 0x3
* Ref count is 8

Idx Member L Used Avail Urgc Targ Probabilities D-Ingest Alloc F-Ingest Alloc
--- -------- - --------------- ---------- ---- ---- --------------------- --------- ----- --------- -----
 1 2044 R 1485146 0% 159376256 0% 12% [100% 100% 79% 79%] 0+ 0 0 0+ 0 0
 2 2045 L 2153941 1% 159376256 0% 12% [100% 100% 98% 98%] 0+ 0 0 0+ 0 0
 3 2046 R 1415120 0% 159339950 0% 12% [100% 100% 76% 76%] 0+ 0 0 0+ 0 0
 4 2047 L 1690392 1% 159376256 0% 12% [100% 100% 98% 98%] 0+ 0 0 0+ 0 0
 5 2048 R 1675583 1% 159376256 0% 12% [100% 100% 98% 98%] 0+ 0 0 0+ 0 0
 6 2049 L 2191360 1% 159376256 0% 12% [100% 100% 98% 98%] 0+ 0 0 0+ 0 0
 7 2050 R 1630946 1% 159376256 0% 12% [100% 100% 87% 87%] 0+ 0 0 0+ 0 0
 8 2051 L 1631429 1% 159376256 0% 12% [100% 100% 87% 87%] 0+ 0 0 0+ 0 0

I plan on using this setup when I start writing the new FlexGroup data protection best practice guide, so stay tuned for that…

So, now, the Tech ONTAP podcast is happily drinking the NetApp FlexGroup champagne!

If you’re going to NetApp Insight, check out session 16594-2 on FlexGroup volumes.

For more information on NetApp FlexGroup volumes, see: