Transporte sans bouger (Moving FlexClone volumes without splitting them)

I’m somewhat of a music buff. I spent most of my teenage years with headphones glued to my ears, and eventually made my way into college radio as a DJ. I was even once able to interview Layzie Bone from Bone Thugs-n-Harmony!

(I was DJ Sledgehammer)

https://studentmedia.ncsu.edu/web/?q=training/wknc/how-to-be-a-good-dj

I like most every type of music, from hip hop to rock to indie to bluegrass. For a while, I was really into a band from England called Stereolab. In particular, this song:

It’s a song about technology and how humanity becomes disconnected from one another. The title translates to “Transported without Moving” and its themes still resonate today.

But, for this blog, I wanted to apply “transporte sans bouger” to an ONTAP feature that allows you to transport data without moving it – volume rehost – and the limitation it currently has on moving FlexClone volumes.

cluster::*> vol rehost -vserver DEMO -volume clone -destination-vserver NFS 

Error: command failed: Cannot rehost volume "clone" on Vserver "DEMO" because the volume is a clone volume.

What is a FlexClone volume?

To answer this question, let’s start with “what is a FlexVol volume”?

A FlexVol is NetApp’s way of presenting a file system to clients in ONTAP software. Each FlexVol volume has its own unique file system ID and can host data via NAS (via NFS and/or SMB) or SAN (via FCP/iSCSI with LUNs). It’s a logical container for data that allows storage administrators to carve out space for their end users without provisioning the entire array to everyone.

FlexVol volumes can host some very important production data, such as code repositories and databases. Sometimes, people need to be able to work with that data without impacting the production data or taking up a bunch of extra storage space. A FlexClone provides an instant copy of a FlexVol volume, based on a snapshot, that only takes up space for inode pointers to data blocks. Thus, you could clone a FlexVol volume that has 10TB of data in less than a second and use up less than 1MB of additional space.

The clone can be read-only or read-write, depending on what you use it for. If it’s a read-write clone, then any data that is written/deleted/overwritten in the clone will use space. But the initial clone? Negligible. Check out the space usage on the aggregate before and after the clone (physical used):

cluster::*> aggr show-space -aggregate-name aggr1_node1

Aggregate : aggr1_node1

Feature                          Used       Used%
-------------------------------- ---------- ------
Volume Footprints                278.4GB    3%
Aggregate Metadata               2.65GB     0%
Snapshot Reserve                 0B         0%
Total Used                       281.0GB    3%

Total Physical Used              265.6GB    3%


cluster::*> vol clone create -vserver DEMO -flexclone clone -type RW -parent-vserver DEMO -parent-volume flexvol -junction-active true -foreground true -junction-path /clone
[Job 13563] Job succeeded: Successful


cluster::*> vol show -vserver DEMO -volume flexvol,clone -fields used,aggregate
vserver volume    aggregate used
------- ------    ----------- ------
DEMO    clone     aggr1_node1 1.77GB
DEMO    flexvol   aggr1_node1 1.79GB


cluster::*> aggr show-space -aggregate-name aggr1_node1

Aggregate : aggr1_node1

Feature                          Used       Used%
-------------------------------- ---------- ------
Volume Footprints                283.0GB    4%
Aggregate Metadata               2.64GB     0%
Snapshot Reserve                 0B         0%
Total Used                       285.6GB    4%

Total Physical Used              265.6GB    3%

Some use cases I’ve heard of include things like test/dev environments, database copies for reporting, and DR testing.

That’s not nearly all you can do with clones, though. Leave feedback in the comments with how you use clones!

So what’s the problem?

For all its usefulness, though, a FlexClone isn’t very flexible when it comes to moving it around. In addition to volume rehost being unable to move a FlexClone, a nondisruptive volume move of a clone will split the clone and negate any storage efficiency savings you might have!

cluster::*> volume move start -vserver DEMO -volume clone -destination-aggregate aggr1_node2

Warning: Volume will no longer be a clone volume after the move and any associated space efficiency savings will be lost. Do you want to proceed? {y|n}: n

Yuck.

Fortunately, NetApp has some smart people like Jeff Steiner and Florian Feldhaus, who came up with a way to move a clone around without splitting it – thereby retaining the storage efficiencies!

How? With another cool ONTAP feature – SnapMirror.

What’s a SnapMirror?

SnapMirror is a disaster recovery technology that will replicate – at the block level – an exact copy of data to a destination site. That destination could be remote, or it could be on the same cluster. But, it can also be used to move FlexClone volumes around without losing storage efficiencies!

Wait… why would I want to move a FlexClone?

It might not seem like you’d want to move a FlexClone volume, but there are some reasons. For one, you can’t rehost a clone to a new SVM. So, if you were testing in an SVM and wanted to move your volumes to a new SVM and retain the clones, you’d be stuck.

Another use case would be moving from ONTAP running in 7-Mode to clustered ONTAP. Using SnapMirror to move data is much faster in many cases than a file-based copy.

Great! How do I do it?

The basic steps are:

  1. Create a SnapMirror of the parent volume (requires cluster/SVM peering, which is all super easy to do in System Manager now)
  2. Ensure the clone volume has a common snapshot for a snapmirror resync operation
  3. Clone the destination volume
  4. Issue a snapmirror resync on the clone volumes (via CLI)

That’s it! Once you’ve done that, you can perform a final update, break the SnapMirror, and delete the source volume and clone for a quick cutover to the new destination.

Here’s how I did it:

Create a SnapMirror of the parent volume

Use a “mirror” relationship type and “MirrorAllSnapshots” as the policy.

clone-mirror1

clone-mirror3
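If you’d rather set this step up from the CLI, here’s a rough sketch. The destination SVM and volume names (NFS and flexvol_dst) are just placeholders for this example, and it assumes the cluster and SVM peering is already in place:

cluster::*> vol create -vserver NFS -volume flexvol_dst -aggregate aggr1_node2 -size 10g -type DP

cluster::*> snapmirror create -source-path DEMO:flexvol -destination-path NFS:flexvol_dst -type XDP -policy MirrorAllSnapshots

cluster::*> snapmirror initialize -destination-path NFS:flexvol_dst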

Ensure the clone volume has a common snapshot for a snapmirror resync operation

You can find the snapshot copies under the “Volumes” menu by clicking on the volume for more details and viewing the “Snapshot copies” tab.

clone-mirror-common-snap
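From the CLI, you can compare the snapshot lists on both sides and make sure at least one snapshot name shows up in the source clone and in the mirrored parent (same placeholder names as before):

cluster::*> volume snapshot show -vserver DEMO -volume clone -fields snapshot

cluster::*> volume snapshot show -vserver NFS -volume flexvol_dst -fields snapshot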

Clone the destination volume

Be sure to select the common snapshot.

clone-vol1
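The CLI equivalent is something like the following, where clone_dst is a placeholder name and <common_snapshot> stands in for whatever common snapshot you identified in the previous step:

cluster::*> vol clone create -vserver NFS -flexclone clone_dst -type RW -parent-volume flexvol_dst -parent-snapshot <common_snapshot>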

Issue a snapmirror resync on the clone volumes (via CLI)

Since we’re kind of “cheating” here by resyncing a relationship that doesn’t actually exist, we have to use the CLI.

sm-resync.png
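For reference, the resync is just a standard snapmirror resync pointed at the two clones (same placeholder names as the earlier steps):

cluster::*> snapmirror resync -source-path DEMO:clone -destination-path NFS:clone_dst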

Once this is done, the relationship will show up in System Manager:

sm-resync-ocsm

Once that transfer completes, I can update the mirror when I’m ready to cut over, quiesce and break the mirror, point clients to the new clone, and delete the old one from the source SVM. Or, I can keep both around if I like. Pretty nifty, eh?
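A rough sketch of that cutover from the CLI, still using the placeholder names from the earlier steps (snapmirror delete only removes the relationship; the volumes themselves stay put):

cluster::*> snapmirror update -destination-path NFS:clone_dst
cluster::*> snapmirror quiesce -destination-path NFS:clone_dst
cluster::*> snapmirror break -destination-path NFS:clone_dst
cluster::*> snapmirror delete -destination-path NFS:clone_dst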

Any caveats?

There are a couple that immediately come to mind…

  • FlexClone doesn’t currently work with FlexGroup volumes
  • FlexClone and SnapMirror both require licenses (see the quick check below)
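If you’re not sure whether those licenses are installed, here’s a quick check from the CLI (package names assume a standard ONTAP 9 license bundle and may vary slightly by version):

cluster::*> system license show -package FlexClone
cluster::*> system license show -package SnapMirror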

Questions? Comments? Hit me up in the comments!

Behind the Scenes: Episode 149 – Cloud Volume Services Performance with Oracle Databases

Welcome to Episode 149, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”

This week on the podcast, TME Chad Morgenstern (@sockpupets) joins us to discuss how performance looks in Cloud Volume Services for Oracle Database workloads.

Interested in Cloud Volume Services? You can investigate on your own here:

https://cloud.netapp.com/cloud-volumes

You can also check out Eiki Hrafnsson’s Cloud Field Day presentation on Cloud Volume Services here:

http://techfieldday.com/appearance/netapp-presents-at-cloud-field-day-3/

Finding the Podcast

The podcast is all finished and up for listening. You can find it on iTunes or SoundCloud or by going to techontappodcast.com.

This week’s episode is here:

Also, if you don’t like using iTunes or SoundCloud, we just added the podcast to Stitcher.

http://www.stitcher.com/podcast/tech-ontap-podcast?refid=stpr

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

Our YouTube channel (episodes uploaded sporadically) is here:

Behind the Scenes: Episode 148 – An Introduction to Cloud Volume Services

Welcome to Episode 148, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”

This week on the podcast, we invited Eiki Hrafnsson (@EirikurH), Chief Architect, Data Fabric, and Ingo Fuchs (@IngoFuchs), Sr. Manager, Product Marketing, Cloud, to give us the lowdown on what Cloud Volume Services are and where people are expected to use them.

You can also check out Eiki’s Cloud Field Day presentation on Cloud Volume Services here:

http://techfieldday.com/appearance/netapp-presents-at-cloud-field-day-3/

Also, you can investigate Cloud Volume Services on your own here:

https://cloud.netapp.com/cloud-volumes

Or check out this demo with Ingo.

Finding the Podcast

The podcast is all finished and up for listening. You can find it on iTunes or SoundCloud or by going to techontappodcast.com.

This week’s episode is here:

Also, if you don’t like using iTunes or SoundCloud, we just added the podcast to Stitcher.

http://www.stitcher.com/podcast/tech-ontap-podcast?refid=stpr

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

Our YouTube channel (episodes uploaded sporadically) is here:

 

Behind the Scenes: Episode 147 – SPC-1v3 Results – NetApp AFF A800

Welcome to Episode 147, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”

This week on the podcast, we find out how the new NetApp A800 system fared in the rigorous SPC-1 v3 storage benchmarks. Can NVMe-attached SSDs truly help reduce latency while maintaining a high number of IOPS? Performance TME Dan Isaacs (@danisaacs) and the workload engineering team of Scott Lane, Jim Laing and Joe Scott join us to discuss!

Check out the published results here: 

http://spcresults.org/benchmarks/results/spc1-spc1e#A32007

And the official NetApp blog:

https://blog.netapp.com/nvme-benchmark-spc-1-testing-validates-breakthrough-performance-aff/

Finding the Podcast

The podcast is all finished and up for listening. You can find it on iTunes or SoundCloud or by going to techontappodcast.com.

This week’s episode is here:

Also, if you don’t like using iTunes or SoundCloud, we just added the podcast to Stitcher.

http://www.stitcher.com/podcast/tech-ontap-podcast?refid=stpr

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

Our YouTube channel (episodes uploaded sporadically) is here:

Behind the Scenes: Episode 146 – OpenStack Summit 2018 Recap

Welcome to Episode 146, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”

This week on the podcast, we bring in some of the NetApp OpenStack crew to discuss OpenStack Summit 2018 in Vancouver. Join Product Marketing Manager Pete Brey (@cloudstorageguy) and TMEs Amit Borulkar (@amit_borulkar) and David Blackwell as we discuss all the happenings of the conference, as well as Amit’s new OpenStack FlexPod Cisco Validated Design!

Link to the CVD:

https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/UCS_CVDs/flexpodsf_openstack_osp10_design.html

Finding the Podcast

The podcast is all finished and up for listening. You can find it on iTunes or SoundCloud or by going to techontappodcast.com.

This week’s episode is here:

Also, if you don’t like using iTunes or SoundCloud, we just added the podcast to Stitcher.

http://www.stitcher.com/podcast/tech-ontap-podcast?refid=stpr

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

Our YouTube channel (episodes uploaded sporadically) is here:

How to find average file size and largest file size using XCP

If you use NetApp ONTAP to host NAS shares (CIFS or NFS) and have too many files and folders to count, then you know how challenging it can be to figure out file information in your environment in a quick, efficient and effective way.

This becomes doubly important when you are thinking of migrating NAS data from FlexVol volumes to FlexGroup volumes, because there is some work up front that needs to be done to ensure you size the capacity of the FlexGroup and its member volumes correctly. TR-4571 covers some of that in detail, but it basically says “know your average file size.” It currently doesn’t tell you *how* to do that (though it will eventually). This blog attempts to fill that gap.
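To make “know your average file size” concrete with made-up numbers: it’s just used capacity divided by file (inode) count, so a share holding 10TB across 5 million files averages roughly 10TB / 5,000,000 ≈ 2MB per file. XCP can do that math for you, which is where the rest of this post comes in.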

XCP

I’ve written previously about XCP here:

Generally speaking, it’s been to tout the data migration capabilities of the tool. But, in this case, I want to highlight the “xcp scan” capability.

XCP scan allows you to use multiple, parallel threads to analyze an unstructured NAS share much more quickly than you could with basic tools like rsync, du, etc.

The NFS version of XCP also allows you to output this scan to a file (HTML, XML, etc) to generate a report about the scanned data. It even does the math for you and finds the largest (max) file size and average file size!

xcpfilesize

The command I ran to get this information was:

# xcp scan -md5 -stats -html SERVER:/volume/path > filename.html

That’s it! XCP will scan and write to a file. You can also get info about the top five file consumers (by number and capacity) by owner, as well as get some nifty graphs. (Pro tip: Managers love graphs!)

xcp-graphs

What if I only have SMB/CIFS data?

Currently, XCP for SMB doesn’t support output to HTML files. But that doesn’t mean you can’t have fun, too!

You can stand up a VM using CentOS or whatever your favorite Linux distribution is and use XCP for NFS to scan the data – provided the client has the necessary access to do so and you can score an XCP NFS license (even if it’s just an eval). XCP scans are read-only, so you shouldn’t have issues running them.

Just keep in mind the following:

Shares that have traditionally been SMB/CIFS-only are likely NTFS security style. This means that the user you are accessing the data as over NFS (for example, root) should map to a valid Windows user that has read access to the data. NFS clients that access NTFS security style volumes map to Windows users to figure out permissions. I cover that here:

Mixed perceptions with NetApp multiprotocol NAS access

You can check the volume security style in two ways:

  • CLI, with the command below (a worked example follows the screenshot)
    ::> volume show -volume [volname] -fields security-style
  • OnCommand System Manager under the “Storage -> Qtrees” section (yea, yea… I know. Volumes != Qtrees)

ocsm-qtree
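For example, run against the volume used later in this post, the CLI check would look roughly like this (output approximate):

::> volume show -vserver DEMO -volume xcp_ntfs_src -fields security-style
vserver volume       security-style
------- ------------ --------------
DEMO    xcp_ntfs_src ntfs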

To check whether the user you are accessing the volume with over NFS maps to a valid and expected Windows user, use this CLI command from diag privilege:

::> set diag
::*> diag secd name-mapping show -node node1 -vserver DEMO -direction unix-win -name prof1

'prof1' maps to 'NTAP\prof1'

To see what Windows groups this user would be a member of (and thus would get access to files and folders that have those groups assigned), use this diag privilege command:

::*> diag secd authentication show-creds -node ontap9-tme-8040-01 -vserver DEMO -unix-user-name prof1

UNIX UID: prof1 <> Windows User: NTAP\prof1 (Windows Domain User)

GID: ProfGroup
 Supplementary GIDs:
 ProfGroup
 group1
 group2
 group3
 sharedgroup

Primary Group SID: NTAP\DomainUsers (Windows Domain group)

Windows Membership:
 NTAP\group2 (Windows Domain group)
 NTAP\DomainUsers (Windows Domain group)
 NTAP\sharedgroup (Windows Domain group)
 NTAP\group1 (Windows Domain group)
 NTAP\group3 (Windows Domain group)
 NTAP\ProfGroup (Windows Domain group)
 Service asserted identity (Windows Well known group)
 BUILTIN\Users (Windows Alias)
 User is also a member of Everyone, Authenticated Users, and Network Users

Privileges (0x2080):
 SeChangeNotifyPrivilege

If you want to run XCP as root and want it to have administrator level access, you can create a name mapping. This is what I have in my SVM:

::> vserver name-mapping show -vserver DEMO -direction unix-win

Vserver: DEMO
Direction: unix-win
Position Hostname         IP Address/Mask
-------- ---------------- ----------------
1        -                -                Pattern: root
                                           Replacement: DEMO\\administrator

To create a name mapping for root to map to administrator:

::> vserver name-mapping create -vserver DEMO -direction unix-win -position 1 -pattern root -replacement DEMO\\administrator

Keep in mind that backup software often has this level of rights to files and folders, and the XCP scan is read-only, so there shouldn’t be any issue. If you are worried about making root an administrator, create a new Windows user for it to map to (for example, DOMAIN\xcp) and add it to the Backup Operators Windows Group.
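Here’s a sketch of that alternative, mapping root to a hypothetical DOMAIN\xcp user instead of the domain administrator (you’d then add DOMAIN\xcp to the Backup Operators group in Active Directory):

::> vserver name-mapping create -vserver DEMO -direction unix-win -position 1 -pattern root -replacement DOMAIN\\xcp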

In my lab, I ran a scan on an NTFS security style volume called “xcp_ntfs_src”:

::*> vserver security file-directory show -vserver DEMO -path /xcp_ntfs_src

Vserver: DEMO
 File Path: /xcp_ntfs_src
 File Inode Number: 64
 Security Style: ntfs
 Effective Style: ntfs
 DOS Attributes: 10
 DOS Attributes in Text: ----D---
Expanded Dos Attributes: -
 UNIX User Id: 0
 UNIX Group Id: 0
 UNIX Mode Bits: 777
 UNIX Mode Bits in Text: rwxrwxrwx
 ACLs: NTFS Security Descriptor
 Control:0x8014
 Owner:NTAP\prof1
 Group:BUILTIN\Administrators
 DACL - ACEs
 ALLOW-BUILTIN\Administrators-0x1f01ff-OI|CI
 ALLOW-DEMO\Administrator-0x1f01ff-OI|CI
 ALLOW-Everyone-0x100020-OI|CI
 ALLOW-NTAP\student1-0x120089-OI|CI
 ALLOW-NTAP\student2-0x120089-OI|CI

I used this command and nearly 600,000 objects were scanned in 25 seconds:

# xcp scan -md5 -stats -html 10.x.x.x:/xcp_ntfs_src > xcp-ntfs.html
XCP 1.3D1-8ae2672; (c) 2018 NetApp, Inc.; Licensed to Justin Parisi [NetApp Inc] until Tue Sep 4 13:23:07 2018

126,915 scanned, 85,900 summed, 43.8 MiB in (8.75 MiB/s), 14.5 MiB out (2.89 MiB/s), 5s
 260,140 scanned, 187,900 summed, 91.6 MiB in (9.50 MiB/s), 31.3 MiB out (3.34 MiB/s), 10s
 385,100 scanned, 303,900 summed, 140 MiB in (9.60 MiB/s), 49.9 MiB out (3.71 MiB/s), 15s
 516,070 scanned, 406,530 summed, 187 MiB in (9.45 MiB/s), 66.7 MiB out (3.36 MiB/s), 20s
Sending statistics...
 594,100 scanned, 495,000 summed, 220 MiB in (6.02 MiB/s), 80.5 MiB out (2.56 MiB/s), 25s
594,100 scanned, 495,000 summed, 220 MiB in (8.45 MiB/s), 80.5 MiB out (3.10 MiB/s), 25s.

This was the resulting report:

xcp-ntfs

Happy scanning!