How to Map File and Folder Locations to NetApp ONTAP FlexGroup Member Volumes with XCP

The concept behind a NetApp FlexGroup volume is that ONTAP presents a single large namespace for NAS data and handles the balancing and placement of files and folders across the underlying FlexVol member volumes, so a storage administrator doesn’t need to manage that.

I cover it in more detail in:

There’s also this USENIX presentation:

However, while not knowing/caring where files and folders live in your cluster is nice most of the time, there are occasions where you may need to figure out where a file or folder *actually* lives in the cluster, such as when a member volume has a large capacity imbalance and you need to know which files to delete or move out of that volume. Previously, there was no good way to do that, but thanks to the efforts of one of our global solutions architects (and one of the inventors of XCP), we now have a way, and we don’t even need a treasure map.

What is NetApp XCP?

If you’re unfamiliar with NetApp XCP, it’s NetApp’s FREE copy utility/data mover that can also be used for file analytics. There are other use cases, too:

Using XCP to delete files en masse: A race against rm

How to find average file size and largest file size using XCP

Because XCP can run in parallel from a client, it can perform tasks (such as find) much faster in high file count environments, so you’re not sitting around waiting for a command to finish for minutes/hours/days.

Since a FlexGroup is pretty much made for high file count environments, we’d want a way to quickly find files and their locations.

ONTAP NAS and File Handles

In How to identify a file or folder in ONTAP in NFS packet traces, I covered how to find inode information and a little bit about how ONTAP file handles are created/presented. The deep details aren’t super important here, but the general concept – that FlexGroup member volume information is stored in file handles that NFS can read – is.

Using that information and some parsing, there’s a Python script that can be used as an XCP plugin to translate file handles into member volume index numbers and present them in easy-to-read formats.

That Python script can be found here:

FlexGroup File Mapper

How to Use the “FlexGroup File Mapper” plugin with XCP

First of all, you’ll need a client that has XCP installed. The version isn’t super important, but the latest release is generally the best release to use.

There are two methods we’ll use here to map files to member volumes.

  1. Scan All Files/Folders in a FlexGroup and Map Them All to Member Volumes
  2. Use a FlexGroup Member Volume Number and Find All Files in that Member Volume

To do this, I’ll use a FlexGroup that has ~2 million files.

::*> df -i FGNFS
Filesystem    iused     ifree      %iused  Mounted on  Vserver
/vol/FGNFS/   2001985   316764975  0%      /FGNFS      DEMO

Getting the XCP Host Ready

First, copy the FlexGroup File Mapper plugin to the XCP host. The file name isn’t important, but when you run the XCP command, you’ll either want to specify the plugin’s location or run the command from the folder the plugin lives in.

On my XCP host, I have the plugin named in /testXCP:

# ls -la | grep
-rw-r--r-- 1 502 admin 1645 Mar 25 17:34
# pwd

Scan All Files/Folders in a FlexGroup and Map Them All to Member Volumes

In this case, we’ll map all files and folders to their respective FlexGroup member volumes.

This is the command I use:

xcp diag -run scan -fmt '"{} {}".format(x, fgid(x))'

You can also include -parallel (n) to control how many processes spin up to do this work, and you can add > filename at the end to redirect the output to a file (recommended).

For example, scanning ~2 million files in this volume took just 37 seconds!

# xcp diag -run scan -fmt '"{} {}".format(x, fgid(x))' > FGNFS.txt
402,061 scanned, 70.6 MiB in (14.1 MiB/s), 367 KiB out (73.3 KiB/s), 5s
751,933 scanned, 132 MiB in (12.3 MiB/s), 687 KiB out (63.9 KiB/s), 10s
1.10M scanned, 193 MiB in (12.2 MiB/s), 1007 KiB out (63.6 KiB/s), 15s
1.28M scanned, 225 MiB in (6.23 MiB/s), 1.14 MiB out (32.6 KiB/s), 20s
1.61M scanned, 283 MiB in (11.6 MiB/s), 1.44 MiB out (60.4 KiB/s), 25s
1.91M scanned, 335 MiB in (9.53 MiB/s), 1.70 MiB out (49.5 KiB/s), 31s
2.00M scanned, 351 MiB in (3.30 MiB/s), 1.79 MiB out (17.4 KiB/s), 36s
Sending statistics…

Xcp command : xcp diag -run scan -fmt "{} {}".format(x, fgid(x))
Stats : 2.00M scanned
Speed : 351 MiB in (9.49 MiB/s), 1.79 MiB out (49.5 KiB/s)
Total Time : 37s.
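If you want to turn those progress lines into a scan rate, here is a minimal Python sketch. It only relies on the line format shown in the output above; the function names are my own, and it assumes XCP’s "M" suffix means one million, which I haven’t verified against the XCP source.

```python
import re

# Assumption: XCP abbreviates counts with decimal suffixes (K = 1,000, M = 1,000,000).
_SUFFIX = {"": 1, "K": 1_000, "M": 1_000_000}

# Matches lines such as "1.10M scanned, 193 MiB in (...), 1007 KiB out (...), 15s"
_PROGRESS = re.compile(r"^\s*([\d.,]+)([KM]?) scanned,.*?(\d+)s\s*$")


def parse_progress(line):
    """Return (files_scanned, elapsed_seconds), or None for non-progress lines."""
    m = _PROGRESS.match(line)
    if not m:
        return None
    count = float(m.group(1).replace(",", "")) * _SUFFIX[m.group(2)]
    return int(round(count)), int(m.group(3))


def files_per_second(line):
    """Average scan rate up to the point the progress line was printed."""
    parsed = parse_progress(line)
    if parsed is None or parsed[1] == 0:
        return None
    scanned, seconds = parsed
    return scanned / seconds


last = "2.00M scanned, 351 MiB in (3.30 MiB/s), 1.79 MiB out (17.4 KiB/s), 36s"
print(files_per_second(last))
```

For the last progress line above, that works out to roughly 55,000 files per second on average.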

The file created was 120MB, though… that’s a LOT of text to sort through.

-rw-r--r--. 1 root root 120M Apr 27 15:28 FGNFS.txt

So, there’s another way to do this, right? Correct!

If I know the folder I want to filter on, or even a file name pattern, I can use -match in the command. In this case, I want to find all folders named dir_33.

This is the command:

# xcp diag -run scan -fmt '"{} {}".format(x, fgid(x))' -match "name=='dir_33'" > dir_33_FGNFS.txt

This is the output of the file. Two folders – one in member volume 3, one in member volume 4:

# cat dir_33_FGNFS.txt
x.x.x.x:/FGNFS/files/client1/dir_33 3
x.x.x.x:/FGNFS/files/client2/dir_33 4
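Since every line of that output ends with the member volume index, a few lines of Python can summarize how many entries landed in each member. This is just a sketch based on the "path index" format shown above; count_by_member is a name I made up for illustration.

```python
from collections import Counter


def count_by_member(lines):
    """Count scanned entries per FlexGroup member index.

    Each line is expected to look like "<path> <index>". Paths can contain
    spaces, so only the last whitespace-separated field is treated as the index.
    """
    counts = Counter()
    for line in lines:
        parts = line.rstrip().rsplit(None, 1)
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # skip blank lines and anything that is not "<path> <index>"
        counts[int(parts[1])] += 1
    return counts


sample = [
    "x.x.x.x:/FGNFS/files/client1/dir_33 3",
    "x.x.x.x:/FGNFS/files/client2/dir_33 4",
]
print(count_by_member(sample))
```

To run it against a real output file, pass the open file object instead of the sample list, for example `count_by_member(open("dir_33_FGNFS.txt"))`.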

If I want to use pattern matching for file names (i.e., I know I want all files with “moarfiles3” in the name), then I can do this using regex and/or wildcards. More examples can be found in the XCP user guides.

Here’s the command I used. It found 444,400 files with that pattern in 27s.

# xcp diag -run scan -fmt '"{} {}".format(x, fgid(x))' -match "fnm('moarfiles3*')" > moarfiles3_FGNFS.txt

507,332 scanned, 28,097 matched, 89.0 MiB in (17.8 MiB/s), 465 KiB out (92.9 KiB/s), 5s
946,796 scanned, 132,128 matched, 166 MiB in (15.4 MiB/s), 866 KiB out (80.1 KiB/s), 10s
1.31M scanned, 209,340 matched, 230 MiB in (12.8 MiB/s), 1.17 MiB out (66.2 KiB/s), 15s
1.73M scanned, 297,647 matched, 304 MiB in (14.8 MiB/s), 1.55 MiB out (77.3 KiB/s), 20s
2.00M scanned, 376,195 matched, 351 MiB in (9.35 MiB/s), 1.79 MiB out (48.8 KiB/s), 25s
Sending statistics…

Filtered: 444400 matched, 1556004 did not match

Xcp command : xcp diag -run scan -fmt "{} {}".format(x, fgid(x)) -match fnm('moarfiles3*')
Stats : 2.00M scanned, 444,400 matched
Speed : 351 MiB in (12.6 MiB/s), 1.79 MiB out (65.7 KiB/s)
Total Time : 27s.

And this is a sample of some of those entries (the file is 27MB):

x.x.x.x:/FGNFS/files/client1/dir_45/moarfiles3158.txt 3
x.x.x.x:/FGNFS/files/client1/dir_45/moarfiles3159.txt 3
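With a 27MB file, it can help to roll the entries up by directory. Here’s a small sketch that lists which parent directories contribute the most entries to a given member volume. It assumes the same "path index" line format as the sample above, and top_directories is a hypothetical helper, not part of XCP or the plugin.

```python
import posixpath
from collections import Counter


def top_directories(lines, member, limit=10):
    """Return the directories with the most entries in one member volume."""
    counts = Counter()
    for line in lines:
        parts = line.rstrip().rsplit(None, 1)
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # not a "<path> <index>" line
        path, index = parts[0], int(parts[1])
        if index == member:
            # Group by the parent directory of each file
            counts[posixpath.dirname(path)] += 1
    return counts.most_common(limit)


sample = [
    "x.x.x.x:/FGNFS/files/client1/dir_45/moarfiles3158.txt 3",
    "x.x.x.x:/FGNFS/files/client1/dir_45/moarfiles3159.txt 3",
]
for directory, count in top_directories(sample, member=3):
    print(count, directory)
```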

I can also look for files over a certain size. In this volume, the files are all 4K in size, but in my TechONTAP volume, I have varying file sizes. In this case, I want to find all .wav files greater than 100MB. This command didn’t seem to redirect to a file for me, but the output was only 16 files.

# xcp diag -run scan -fmt '"{} {}".format(x, fgid(x))' -match "fnm('*.wav') and size > 100*M" > TechONTAP_ep.txt
20x - Genomics Architecture/ep20x-genomics-meat.wav 4
Files/ep104-webex.output.wav 5
Files/ep104-mics.output.wav 3
181 - Networking Deep Dive/ep181-networking-deep-dive-meat.output.wav 6
181 - Networking Deep Dive/ep181-networking-deep-dive-meat.wav 2

Filtered: 16 matched, 7687 did not match

Xcp command : xcp diag -run scan -fmt "{} {}".format(x, fgid(x)) -match fnm('*.wav') and size > 100*M
Stats : 7,703 scanned, 16 matched
Speed : 1.81 MiB in (1.44 MiB/s), 129 KiB out (102 KiB/s)
Total Time : 1s.

But what if I know that a member volume is getting full and I want to see what files are in that member volume?

Use a FlexGroup Member Volume Number and Find All Files in that Member Volume

In the case where I know what member volume needs to be addressed, I can use XCP to search using the FlexGroup index number. The index number lines up with the member volume numbers, so if the index number is 6, then we know the member volume is 6.
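To go from an index number to the member volume name you’d use on the cluster CLI, a tiny helper is enough. This sketch assumes the naming pattern seen in this post (FlexGroup name, two underscores, four-digit zero-padded index, as in FGNFS__0006) and that indexes start at 1; member_volume_name is just an illustrative name.

```python
def member_volume_name(flexgroup, index):
    """Build a member volume name such as "FGNFS__0006" from an index.

    Assumption: members are named <flexgroup>__<4-digit index>, starting at 0001.
    """
    if index < 1:
        raise ValueError("member indexes are assumed to start at 1")
    return f"{flexgroup}__{index:04d}"


print(member_volume_name("FGNFS", 6))
```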

In my 2 million file FlexGroup, I want to filter by member 6, so I use this command, which shows there are 95,019 files in member 6:

# xcp diag -run scan -match 'fgid(x)==6' -parallel 10 -l > member6.txt

 615,096 scanned, 19 matched, 108 MiB in (21.6 MiB/s), 563 KiB out (113 KiB/s), 5s
 1.03M scanned, 5,019 matched, 180 MiB in (14.5 MiB/s), 939 KiB out (75.0 KiB/s), 10s
 1.27M scanned, 8,651 matched, 222 MiB in (8.40 MiB/s), 1.13 MiB out (43.7 KiB/s), 15s
 1.76M scanned, 50,019 matched, 309 MiB in (17.3 MiB/s), 1.57 MiB out (89.9 KiB/s), 20s
 2.00M scanned, 62,793 matched, 351 MiB in (8.35 MiB/s), 1.79 MiB out (43.7 KiB/s), 25s

Filtered: 95019 matched, 1905385 did not match

Xcp command : xcp diag -run scan -match fgid(x)==6 -parallel 10 -l
Stats       : 2.00M scanned, 95,019 matched
Speed       : 351 MiB in (12.5 MiB/s), 1.79 MiB out (65.0 KiB/s)
Total Time  : 28s.

When I check against the files-used for that member volume, it lines up pretty well:

::*> vol show -vserver DEMO -volume FGNFS__0006 -fields files-used
vserver volume      files-used
------- ----------- ----------
DEMO    FGNFS__0006 95120
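To put a number on "pretty well," here’s the arithmetic as a quick Python sketch, using the XCP match count and the files-used value shown above.

```python
def percent_difference(matched, files_used):
    """How far the XCP match count is from ONTAP's files-used, as a percentage."""
    return (files_used - matched) / files_used * 100


matched = 95019      # "Filtered: 95019 matched" from the XCP scan
files_used = 95120   # files-used reported for FGNFS__0006

diff = percent_difference(matched, files_used)
print(f"{files_used - matched} inodes apart ({diff:.2f}%)")
```

The two counts differ by 101 inodes, which is about a tenth of a percent.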

And the output file shows not just the file names, but also the sizes!

rw-r--r-- --- root root 4KiB 4KiB 18h22m FGNFS/files/client2/dir_143/moarfiles1232.txt
rw-r--r-- --- root root 4KiB 4KiB 18h22m FGNFS/files/client2/dir_143/moarfiles1233.txt
rw-r--r-- --- root root 4KiB 4KiB 18h22m FGNFS/files/client2/dir_143/moarfiles1234.txt
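If you want to sort or total those entries, the -l lines can be parsed with a bit of Python. This is a sketch that assumes the eight columns are mode, an ACL-style field, owner, group, size, space used, age, and path; that column interpretation is my guess from the sample output, not something taken from the XCP documentation.

```python
import re

_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}
_SIZE = re.compile(r"^([\d.]+)(B|KiB|MiB|GiB|TiB)?$")


def parse_size(text):
    """Convert a size such as "596MiB" into bytes."""
    m = _SIZE.match(text)
    if not m:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(float(m.group(1)) * _UNITS[m.group(2) or "B"])


def parse_listing_line(line):
    """Split one -l output line into a dict, or return None if it doesn't fit."""
    # Split on the first seven gaps only, so spaces inside the path survive
    fields = line.strip().split(None, 7)
    if len(fields) != 8:
        return None
    mode, _acl, owner, group, size, used, age, path = fields
    return {
        "mode": mode,
        "owner": owner,
        "group": group,
        "size": parse_size(size),
        "used": parse_size(used),
        "age": age,
        "path": path,
    }


entry = parse_listing_line(
    "rw-r--r-- --- root root 4KiB 4KiB 18h22m FGNFS/files/client2/dir_143/moarfiles1232.txt"
)
print(entry["size"], entry["path"])
```

From there, something like `sorted(entries, key=lambda e: e["size"], reverse=True)` would put the largest files in a member volume at the top.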

And, if I choose, I can filter further with the sizes. Maybe I just want to see files in that member volume that are 4K or less (in this case, that’s all of them):

# xcp diag -run scan -match 'fgid(x)==6 and size <= 4*K' -parallel 10 -l

In my “TechONTAP” volume, I look for files larger than 500MB in member 6:

# xcp diag -run scan -match 'fgid(x)==6 and size > 500*M' -parallel 10 -l

rw-r--r-- --- 501 games 596MiB 598MiB 3y219d techontap/Episodes/Episode 1/Epidose 1 Files/Tech ONTAP Podcast - Episode 1 - AFF with Dan Isaacs v3_1.aif
rw-r--r-- --- 501 games 885MiB 888MiB 3y219d techontap/archive/Prod - old MacBook/Insight 2016_Day2_TechOnTap_JParisi_ASullivan_GDekhayser.mp4
rw-r--r-- --- 501 games 787MiB 790MiB 1y220d techontap/archive/Episode 181 - Networking Deep Dive/ep181-networking-deep-dive-meat.output.wav

Filtered: 3 matched, 7700 did not match

Xcp command : xcp diag -run scan -match fgid(x)==6 and size > 500*M -parallel 10 -l
Stats : 7,703 scanned, 3 matched
Speed : 1.81 MiB in (1.53 MiB/s), 129 KiB out (109 KiB/s)
Total Time : 1s.

So, there you have it! A way to find files in a specific member volume inside of a FlexGroup! Let me know if you have any comments or questions below.

Using XCP to delete files en masse: A race against rm


XCP has traditionally been thought of as a way to rapidly migrate large amounts of data, or to scan data and generate reports. And those ideas still hold up today….

But what if I told you that you could use XCP to delete millions of files 5-6x faster than running rm on an NFS client?

Wait… why would I delete millions of files?

Normally, you wouldn’t. But in some workflows, such as scratch space, this is what happens. A bunch of small files get generated and then deleted once the work is done.

I ran a simple test in my lab where I had a FlexGroup volume with ~37 million files in it.

::*> vol show -vserver DEMO -volume flexgroup_16 -fields files-used
vserver volume       files-used
------- ------------ ----------
DEMO    flexgroup_16 37356098

I took a snapshot of that data so I could restore it later for XCP to delete and then ran rm -rf on it from a client. It took just over 20 hours:

# time rm -rf /flexgroup/*

real 1213m4.652s
user 1m39.703s
sys 41m16.978s

Then I restored the snapshot and deleted the same ~37 million files using XCP. That took roughly 3.6 hours:

# time xcp diag -rmrf
real 218m17.765s
user 149m16.132s
sys 40m47.427s
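To check the math on the speedup, here’s a short Python sketch that converts the two "real" values from the time output above into seconds and compares them. It assumes the bash-style "<minutes>m<seconds>s" format shown in this post.

```python
def to_seconds(real):
    """Convert a bash `time` value such as "1213m4.652s" into seconds."""
    minutes, seconds = real.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


rm_seconds = to_seconds("1213m4.652s")    # rm -rf run
xcp_seconds = to_seconds("218m17.765s")   # xcp diag -rmrf run

print(f"rm:      {rm_seconds / 3600:.1f} hours")
print(f"xcp:     {xcp_seconds / 3600:.1f} hours")
print(f"speedup: {rm_seconds / xcp_seconds:.1f}x")
```

That comes out to about 20.2 hours versus 3.6 hours, or a speedup of roughly 5.6x.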

So, if you have a workflow that requires you to delete large amounts of data that normally takes you FOREVER, try XCP next time…

These are VMs with limited RAM and 1GB network connections, so I’d imagine with bigger, beefier servers, those times could come down a bit more. But in an apples-to-apples test, XCP wins again!

New Technical Report – Electronic Design Automation (EDA) Best Practices


Since the introduction of FlexGroup volumes in ONTAP 9.1, I’ve mentioned that one of the sweet spots for FlexGroup volume use cases is the EDA space, due to the high ingest and large number of files.

As such, I’ve written up a new TR for EDA best practices that can be found here:

What is EDA?

EDA stands for “Electronic Design Automation.” Essentially, it refers to software tools for designing electronic systems such as integrated circuits and printed circuit boards. The tools work together in a design flow that chip designers use to design and analyze entire semiconductor chips. Since a modern semiconductor chip can have billions of components, EDA tools are essential for their design. Here’s a list of EDA companies for reference:

Feel free to send feedback to the DL in the doc, or post in the comments here.