Windows NFS? WHO DOES THAT???


Believe it or not, Windows NFS is a thing. Microsoft has its own NFS server and client, which can make RFC-compliant NFSv3 calls to a Windows Server running the NFS server or to a third-party NFS server, such as NetApp ONTAP. It’s actually so popular that NetApp had to re-introduce it in clustered ONTAP (it wasn’t there until ONTAP 8.2.3/8.3.1).

While Microsoft currently provides an NFSv3 client, it doesn’t have an NFSv4.1 client – yet. It does provide NFSv4.1 as a server option, though:

https://docs.microsoft.com/en-us/windows-server/storage/nfs/nfs-overview

I cover Windows NFS support in TR-4067 starting on page 116. I am bringing this topic up because it has come up again recently and I wanted to create a quick and easy blog to follow, as well as call out how you can integrate AD LDAP to help identity management.

There are a few things you have to do to get it working in ONTAP.

Specifically:

  • enable -v3-ms-dos-client option on the NFS server
  • enable -showmount on the NFS server – this prevents some weirdness with writing files
  • disable -enable-ejukebox and -v3-connection-drop

The command would look like this:

cluster::> set advanced
cluster::*> nfs server modify -vserver DEMO -v3-ms-dos-client enabled -v3-connection-drop disabled -enable-ejukebox false -showmount enabled
cluster::*> nfs server show -vserver DEMO -fields v3-ms-dos-client,v3-connection-drop,showmount,enable-ejukebox
vserver enable-ejukebox v3-connection-drop showmount v3-ms-dos-client
------- --------------- ------------------ --------- ----------------
DEMO    false           disabled           enabled   enabled

Once that’s done, you can mount via NFS inside Windows clients using the standard “mount” command, provided you’ve enabled the Services for UNIX functionality. There’s plenty of documentation out there for that.

Just by doing the above, here’s an example of a working NFS mount in Windows:

C:\Users\Administrator>mount DEMO:/flexvol X:
X: is now successfully connected to DEMO:/flexvol

The command completed successfully.

Here’s the cluster’s view of that connection:

ontap9-tme-8040::*> network connections active show -node ontap9-tme-8040-0* -service nfs*,mount -remote-ip 10.193.67.236
              Vserver   Interface         Remote
      CID Ctx Name      Name:Local Port   Host:Port            Protocol/Service
--------- --- --------- ----------------- -------------------- ----------------
Node: ontap9-tme-8040-02
2968991376  4 DEMO      data:2049         oneway.ntap.local:931
                                                               TCP/nfs

Writing a file to the mount exposes one potential issue, however: users other than Administrator will write with a UID/GID of 4294967294 (-2).

ontap9-tme-8040::*> vserver security file-directory show -vserver DEMO -path /flexvol/student1-nfs.txt

                Vserver: DEMO
              File Path: /flexvol/student1-nfs.txt
      File Inode Number: 1606599
         Security Style: unix
        Effective Style: unix
         DOS Attributes: 20
DOS Attributes in Text: ---A----
Expanded Dos Attributes: -
           UNIX User Id: 4294967294
          UNIX Group Id: 4294967294
         UNIX Mode Bits: 755
UNIX Mode Bits in Text: rwxr-xr-x
                   ACLs: -

That means users won’t show up properly/as desired in UNIX NFS mounts. For example, this is that same file from CentOS:

[root@centos7 /]# cd flexvol
[root@centos7 flexvol]# ls -la | grep student1-nfs
-rwxr-xr-x 1 4294967294 4294967294 0 Feb 5 09:18 student1-nfs.txt
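As a quick sanity check on why 4294967294 and -2 are the same ID: NFSv3 carries UIDs and GIDs as unsigned 32-bit integers, so -2 wraps around to 2^32 - 2. Here is a minimal Python sketch of that arithmetic (the helper name is made up for illustration):

```python
import struct

def as_unsigned32(value: int) -> int:
    """Reinterpret a signed 32-bit integer as unsigned (hypothetical helper)."""
    # Pack as a signed 32-bit int, then unpack the same bytes as unsigned.
    return struct.unpack("<I", struct.pack("<i", value))[0]

print(as_unsigned32(-2))               # 4294967294
print(as_unsigned32(-2) == 2**32 - 2)  # True
```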

So, how does one fix that?

Configuring Windows NFS clients to negotiate users properly

There are a few ways to have users leverage UID/GID other than -2.

One way is to “squash” every NFS user to the same UID/GID via the old Windows standby – the Windows registry. This is useful if only a single user will be using an NFS client.

This covers how to do that:

https://blogs.msdn.microsoft.com/saponsqlserver/2011/02/03/installation-configuration-of-windows-nfs-client-to-enable-windows-to-mount-a-unix-file-system/

Some of the third party NFS clients (such as Cygwin and Hummingbird/OpenText) will provide local passwd and group file functionality to allow you to leverage more users. In some cases, all this does is add more registry entries.

Another way is to chmod/chown the file after it’s written. But that’s not ideal.

The best way is to leverage an existing name service (such as NIS or LDAP) and have Windows clients query for the UID and GID. If you have one already, great! It’s super easy to set up the client. Just run the following command as an administrator in cmd. My NTAP.LOCAL domain already has an LDAP server set up:

C:\Users\administrator>nfsadmin mapping WIN7-CLIENT config adlookup=yes addomain=NTAP.LOCAL

The settings were successfully updated.

Once I did that, I wrote a new file and the UID/GID was properly represented:

ontap9-tme-8040::*> vserver security file-directory show -vserver DEMO -path /flexvol/prof1-nfs.txt

                Vserver: DEMO
              File Path: /flexvol/prof1-nfs.txt
      File Inode Number: 1606600
         Security Style: unix
        Effective Style: unix
         DOS Attributes: 20
DOS Attributes in Text: ---A----
Expanded Dos Attributes: -
           UNIX User Id: 1100
          UNIX Group Id: 1101
         UNIX Mode Bits: 755
UNIX Mode Bits in Text: rwxr-xr-x
                   ACLs: -

ontap9-tme-8040::*> getxxbyyy getpwbyname -node ontap9-tme-8040-01 -vserver DEMO -username prof1
  (vserver services name-service getxxbyyy getpwbyname)
pw_name: prof1
pw_passwd:
pw_uid: 1100
pw_gid: 1101
pw_gecos:
pw_dir:
pw_shell:

If you’re interested, a packet trace shows that the Windows client will communicate via encrypted LDAP to query the user’s UNIX attribute information:

windows-ldap

An added bonus of having Windows clients query LDAP for UNIX user names and groups for NFS on ONTAP is that if you’re using NTFS security style volumes, you won’t have issues connecting to those mounts.

What breaks when doing NTFS security style?

When a UNIX user attempts to access a volume with NTFS security style ACLs, ONTAP will attempt to map that user to a valid Windows user to make sure Windows ACLs can be calculated. (I cover this in Mixed perceptions with NetApp multiprotocol NAS access)

If a user comes in with the default Windows NFS ID of 4294967294 (which doesn’t translate to a UNIX user), this is what happens.

  • The UNIX user 4294967294 tries to access the mount.
  • ONTAP receives a UID of 4294967294 and attempts to map that to a Windows user
  • That Windows user does not exist, so access is denied. This can manifest as an error (such as when writing a file) or it could just show no files/folder.
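The steps above can be sketched as a toy model in Python. This is not how ONTAP implements it; the two lookup tables are hypothetical stand-ins for the UNIX name service (LDAP/files) and the unix-win name mapping, populated with the prof1 example from this post:

```python
from typing import Optional

# Hypothetical stand-ins for the UNIX name service and the unix-win name mapping.
UNIX_USERS = {1100: "prof1"}
UNIX_TO_WINDOWS = {"prof1": "NTAP\\prof1"}

def windows_user_for_uid(uid: int) -> Optional[str]:
    """Return the mapped Windows user, or None when a step fails (access denied)."""
    unix_name = UNIX_USERS.get(uid)        # step 1: numeric UID -> UNIX user name
    if unix_name is None:
        return None                        # e.g. UID 4294967294 has no entry
    return UNIX_TO_WINDOWS.get(unix_name)  # step 2: UNIX name -> Windows user

print(windows_user_for_uid(1100))        # NTAP\prof1
print(windows_user_for_uid(4294967294))  # None
```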

windows-nfs-ntfs-noaccess.png

windows-nfs-ntfs-noaccess2

That particular folder does have data. It’s just that the user can’t see it:

windows-nfs-ntfs-data-list

In ONTAP, we’d see this error, confirming that the user doesn’t exist:

2/5/2019 14:31:26 ontap9-tme-8040-02
ERROR secd.nfsAuth.problem: vserver (DEMO) General NFS authorization problem. Error: Get user credentials procedure failed
[ 15 ms] Hostname found in Name Service Cache
[ 19] Hostname found in Name Service Cache
[ 23] Successfully connected to ip 10.193.67.236, port 389 using TCP
**[ 28] FAILURE: User ID '4294967294' not found in UNIX authorization source LDAP.
[ 28] Entry for user-id: 4294967294 not found in the current source: LDAP. Ignoring and trying next available source
[ 29] Entry for user-id: 4294967294 not found in the current source: FILES. Entry for user-id: 4294967294 not found in any of the available sources
[ 44] Unable to get the name for UNIX user with UID 4294967294

With LDAP involved, access to the NFS mounted volume with NTFS security works much better, because ONTAP and the client agree that user 1100 is prof1.

windows-nfs-ntfs-data-list-ldap

So, uh… what if I don’t have LDAP or NIS?

Well, in a Windows domain, you ALWAYS have an LDAP server. Active Directory leverages LDAP schemas to store information and any version of Windows Active Directory can be used to look up UNIX users and groups. In fact, the newer versions of Windows make this very easy. In older Windows versions, you had to manually extend the LDAP schema to provide UNIX attributes. Now, UNIX attributes like UID, UIDnumber, etc. are all in LDAP by default. All you have to do is populate these values with information. You can even do it via PowerShell cmdlets!

Once you have a working Active Directory LDAP environment, you can then configure ONTAP to communicate with LDAP for UNIX identities and you’re well on your way to having a scalable, functional multiprotocol NAS environment.

The one downside I’ve found with Windows NFS is that it doesn’t always play nicely when you want to use SMB on the same client. Windows gets a bit… confused. I haven’t dug into that a ton, but I’ve seen it enough to express caution. 🙂


How to find average file size and largest file size using XCP

If you use NetApp ONTAP to host NAS shares (CIFS or NFS) and have too many files and folders to count, then you know how challenging it can be to figure out file information in your environment in a quick, efficient and effective way.

This becomes doubly important when you are thinking of migrating NAS data from FlexVol volumes to FlexGroup volumes, because there is some work up front that needs to be done to ensure you size the capacity of the FlexGroup and its member volumes correctly. TR-4571 covers some of that in detail, but it basically says “know your average file size.” It currently doesn’t tell you *how* to do that (though it will eventually). This blog attempts to fill that gap.

XCP

I’ve written previously about XCP here:

Generally speaking, it’s been to tout the data migration capabilities of the tool. But, in this case, I want to highlight the “xcp scan” capability.

XCP scan allows you to use multiple, parallel threads to analyze an unstructured NAS share much more quickly than you could with basic tools like rsync, du, etc.

The NFS version of XCP also allows you to output this scan to a file (HTML, XML, etc) to generate a report about the scanned data. It even does the math for you and finds the largest (max) file size and average file size!

xcpfilesize

The command I ran to get this information was:

# xcp scan -md5 -stats -html SERVER:/volume/path > filename.html

That’s it! XCP will scan and write to a file. You can also get info about the top five file consumers (by number and capacity) by owner, as well as get some nifty graphs. (Pro tip: Managers love graphs!)
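If you just want to see what “average file size” and “largest file size” mean in practice, here is a minimal Python sketch that computes both for a mounted path. It is single-threaded and will be far slower than XCP on a large tree, so treat it as an illustration of the math rather than a replacement. The function name and the example mount path are my own assumptions.

```python
import os

def file_size_stats(root: str) -> dict:
    """Walk root and return the file count, average size and largest size in bytes."""
    count = total = largest = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.lstat(path).st_size  # lstat: do not follow symlinks
            except OSError:
                continue                       # skip files that vanish or deny access
            count += 1
            total += size
            largest = max(largest, size)
    average = total / count if count else 0.0
    return {"files": count, "average": average, "largest": largest}

# Example with an assumed mount point:
# print(file_size_stats("/mnt/volume"))
```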

xcp-graphs

What if I only have SMB/CIFS data?

Currently, XCP for SMB doesn’t support output to HTML files. But that doesn’t mean you can’t have fun, too!

You can stand up a VM using CentOS or whatever your favorite Linux distribution is and use XCP for NFS to scan data – provided the client has the necessary access to do so and you can score an NFS license (even if it’s eval). XCP scans are read-only, so you shouldn’t have issues running them.

Just keep in mind the following:

NFS to shares that have traditionally been SMB/CIFS-only are likely NTFS security style. This means that the user you are accessing the data as (for example, root) should be able to map to a valid Windows user that has read access to the data. NFS clients that access NTFS security style volumes map to Windows users to figure out permissions. I cover that here:

Mixed perceptions with NetApp multiprotocol NAS access

You can check the volume security style in two ways:

  • CLI with the command
    ::> volume show -volume [volname] -fields security-style
  • OnCommand System Manager under the “Storage -> Qtrees” section (yea, yea… I know. Volumes != Qtrees)

ocsm-qtree

To check whether the NFS user you are accessing the volume as maps to a valid and expected Windows user, use this CLI command from diag privilege:

::> set diag
::*> diag secd name-mapping show -node node1 -vserver DEMO -direction unix-win -name prof1

'prof1' maps to 'NTAP\prof1'

To see what Windows groups this user would be a member of (and thus would get access to files and folders that have those groups assigned), use this diag privilege command:

::*> diag secd authentication show-creds -node ontap9-tme-8040-01 -vserver DEMO -unix-user-name prof1

UNIX UID: prof1 <> Windows User: NTAP\prof1 (Windows Domain User)

GID: ProfGroup
 Supplementary GIDs:
 ProfGroup
 group1
 group2
 group3
 sharedgroup

Primary Group SID: NTAP\DomainUsers (Windows Domain group)

Windows Membership:
 NTAP\group2 (Windows Domain group)
 NTAP\DomainUsers (Windows Domain group)
 NTAP\sharedgroup (Windows Domain group)
 NTAP\group1 (Windows Domain group)
 NTAP\group3 (Windows Domain group)
 NTAP\ProfGroup (Windows Domain group)
 Service asserted identity (Windows Well known group)
 BUILTIN\Users (Windows Alias)
 User is also a member of Everyone, Authenticated Users, and Network Users

Privileges (0x2080):
 SeChangeNotifyPrivilege

If you want to run XCP as root and want it to have administrator level access, you can create a name mapping. This is what I have in my SVM:

::> vserver name-mapping show -vserver DEMO -direction unix-win

Vserver: DEMO
Direction: unix-win
Position Hostname         IP Address/Mask
-------- ---------------- ----------------
1        -                -                Pattern: root
                                           Replacement: DEMO\\administrator

To create a name mapping for root to map to administrator:

::> vserver name-mapping create -vserver DEMO -direction unix-win -position 1 -pattern root -replacement DEMO\\administrator

Keep in mind that backup software often has this level of rights to files and folders, and the XCP scan is read-only, so there shouldn’t be any issue. If you are worried about making root an administrator, create a new Windows user for it to map to (for example, DOMAIN\xcp) and add it to the Backup Operators Windows Group.

In my lab, I ran a scan on a NTFS security style volume called “xcp_ntfs_src”:

::*> vserver security file-directory show -vserver DEMO -path /xcp_ntfs_src

Vserver: DEMO
 File Path: /xcp_ntfs_src
 File Inode Number: 64
 Security Style: ntfs
 Effective Style: ntfs
 DOS Attributes: 10
 DOS Attributes in Text: ----D---
Expanded Dos Attributes: -
 UNIX User Id: 0
 UNIX Group Id: 0
 UNIX Mode Bits: 777
 UNIX Mode Bits in Text: rwxrwxrwx
 ACLs: NTFS Security Descriptor
 Control:0x8014
 Owner:NTAP\prof1
 Group:BUILTIN\Administrators
 DACL - ACEs
 ALLOW-BUILTIN\Administrators-0x1f01ff-OI|CI
 ALLOW-DEMO\Administrator-0x1f01ff-OI|CI
 ALLOW-Everyone-0x100020-OI|CI
 ALLOW-NTAP\student1-0x120089-OI|CI
 ALLOW-NTAP\student2-0x120089-OI|CI

I used this command and nearly 600,000 objects were scanned in 25 seconds:

# xcp scan -md5 -stats -html 10.x.x.x:/xcp_ntfs_src > xcp-ntfs.html
XCP 1.3D1-8ae2672; (c) 2018 NetApp, Inc.; Licensed to Justin Parisi [NetApp Inc] until Tue Sep 4 13:23:07 2018

126,915 scanned, 85,900 summed, 43.8 MiB in (8.75 MiB/s), 14.5 MiB out (2.89 MiB/s), 5s
 260,140 scanned, 187,900 summed, 91.6 MiB in (9.50 MiB/s), 31.3 MiB out (3.34 MiB/s), 10s
 385,100 scanned, 303,900 summed, 140 MiB in (9.60 MiB/s), 49.9 MiB out (3.71 MiB/s), 15s
 516,070 scanned, 406,530 summed, 187 MiB in (9.45 MiB/s), 66.7 MiB out (3.36 MiB/s), 20s
Sending statistics...
 594,100 scanned, 495,000 summed, 220 MiB in (6.02 MiB/s), 80.5 MiB out (2.56 MiB/s), 25s
594,100 scanned, 495,000 summed, 220 MiB in (8.45 MiB/s), 80.5 MiB out (3.10 MiB/s), 25s.
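To turn that final summary line into a rate, divide the object counts by the elapsed seconds. Here is a small Python sketch that pulls the numbers out of the last line of the output above. The regular expression is my own guess at the line format based on this one sample, not a documented XCP format.

```python
import re

final_line = ("594,100 scanned, 495,000 summed, 220 MiB in (8.45 MiB/s), "
              "80.5 MiB out (3.10 MiB/s), 25s.")

# Capture the scanned count, the summed count and the trailing elapsed seconds.
match = re.search(r"([\d,]+) scanned, ([\d,]+) summed, .* (\d+)s\.?$", final_line)
scanned = int(match.group(1).replace(",", ""))
summed = int(match.group(2).replace(",", ""))
seconds = int(match.group(3))

print(scanned / seconds)  # 23764.0 objects scanned per second
print(summed / seconds)   # 19800.0 files checksummed per second
```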

This was the resulting report:

xcp-ntfs

Happy scanning!

Workaround for Mac Finder errors when unzipping files in ONTAP

ONTAP allows you to mount volumes to other volumes in a Storage Virtual Machine, which provides a way for storage administrators to create their own folder structures across multiple nodes in a cluster. This is useful when you want to ensure the workload gets spread across nodes, but you can’t use FlexGroup volumes for whatever reason.

This graphic shows how that can work:

junctioned-volumes.png

In NAS environments, a client will ask for a file or folder location and ONTAP will re-direct the traffic to wherever that object lives. This is supposed to be transparent to the client, provided they follow standard NAS deployment steps.

However, not all NAS clients are created equal. Sometimes, Linux serves up SMB and will do things differently than Microsoft does. Windows also will do NFS, but it doesn’t entirely follow the NFS RFCs. So, occasionally, ONTAP doesn’t expect how a client handles something and stuff breaks.

Mac Finder

If you’ve ever used a Mac, you’ll know that the Finder can do some things a little differently than the Terminal does. In this particular issue, we’ll focus on how Finder unzips files (when you double-click the file) in volumes that are mounted to other volumes in ONTAP.

One of our customers hit this issue, and after poking around a little bit, I figured out how to work around it.

Here’s what they were doing:

  • SMB to Mac clients
  • Shares at the parent FlexVol level (i.e., /vol1)
  • FlexVols mounted to other FlexVols several levels deep (i.e., /vol1/vol2/vol3)

Unzipping in Finder via double-click fails when the user connects to a share at a higher level and then drills down into other folders (which are actually FlexVols mounted to other FlexVols).

When the shares are mounted at the same level as the FlexVol where the unzip is attempted, unzip works. When the Terminal is used to unzip, it works.

However, when your users refuse to use/are unable to use the Terminal and you don’t want to create hundreds of shares just to work around one issue, it’s an untenable situation.

So, I decided to dig into the issue…

Reproducing the issue

The best way to troubleshoot problems is to set up a lab environment and try to recreate the problem. This allows you freedom to gather logs, packet traces, etc. without bothering your customer or end user. So, I brought out my trusty old 2011 MacBook running macOS Sierra and mounted the SMB share in question.

These are the volumes and their junction paths:

DEMO inodes /shared/inodes
DEMO shared /shared

This is the share:

 Vserver: DEMO
 Share: shared
 CIFS Server NetBIOS Name: DEMO
 Path: /shared
 Share Properties: oplocks
 browsable
 changenotify
 show-previous-versions
 Symlink Properties: symlinks
 File Mode Creation Mask: -
 Directory Mode Creation Mask: -
 Share Comment: -
 Share ACL: Everyone / Full Control
 File Attribute Cache Lifetime: -
 Volume Name: shared
 Offline Files: manual
 Vscan File-Operations Profile: standard
 Maximum Tree Connections on Share: 4294967295
 UNIX Group for File Create: -

I turned up debug logging on the cluster (engage NetApp Support if you want to do this), got a packet trace on the Mac and reproduced the issue right away. Lucky me!

finder-error

I also tried a 3rd party unzip utility (Stuffit Expander) and it unzipped fine. So this was definitely a Finder/ONTAP/NAS interaction problem, which allowed me to focus on that.

Packet traces showed that the Finder was attempting to look for a folder called “.TemporaryItems/folders.501/Cleanup At Startup” but couldn’t find it – and apparently couldn’t create it, either. Instead, it would create folders named “BAH.XXXX”, and they wouldn’t get cleaned up.

So, I thought, why not manually create the folder path, since it wasn’t able to do it on its own?

You can do this through Terminal, or via Finder. Keep in mind that the path above has “folders.501” – 501 is my uid, so check your user’s uid on the Mac and make sure the folder path is created using that uid. If you have multiple users that access the share, you may need to create multiple folders.xxx in .TemporaryItems.
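If you’d rather script it, here is a minimal Python sketch that builds the same folder path for a given uid. The function name and the mount point in the usage comment are assumptions on my part, and exactly where under the share the folder needs to live is something to verify against your own packet trace.

```python
import os

def create_cleanup_folder(share_root: str, uid: int) -> str:
    """Create .TemporaryItems/folders.<uid>/Cleanup At Startup under share_root."""
    path = os.path.join(share_root, ".TemporaryItems",
                        f"folders.{uid}", "Cleanup At Startup")
    os.makedirs(path, exist_ok=True)  # no error if the path already exists
    return path

# Example with an assumed mount point, run as the user who will be unzipping:
# create_cleanup_folder("/Volumes/shared", os.getuid())
```

For multiple users, call it once per uid.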

If you do it via Finder, you may want to enable hidden files. I learned how to do that via this article:

https://ianlunn.co.uk/articles/quickly-showhide-hidden-files-mac-os-x-mavericks/

So I did that and then I unmounted the share and re-mounted, to make sure there wasn’t any weird cache issue lingering. You can check CIFS/SMB sessions, versions, etc with the following command, if you want to make sure they are closed:

cluster::*> cifs session show -vserver DEMO -instance

Vserver: DEMO

Node: node1
 Session ID: 16043510722553971076
 Connection ID: 390771549
 Incoming Data LIF IP Address: 10.x.x.x
 Workstation IP Address: 10.x.x.x
 Authentication Mechanism: NTLMv2
 User Authenticated as: domain-user
 Windows User: NTAP\prof1
 UNIX User: prof1
 Open Shares: 1
 Open Files: 1
 Open Other: 0
 Connected Time: 7m 49s
 Idle Time: 6m 2s
 Protocol Version: SMB3
 Continuously Available: No
 Is Session Signed: true
 NetBIOS Name: -
 SMB Encryption Status: unencrypted
 Connection Count: 1

Once I reconnected with the newly created folder path, double-click unzip worked perfectly!

Check it out yourself:

Note: You *may* have to enable the option is-use-junctions-as-reparse-points-enabled on your CIFS server. I haven’t tested with it off and on thoroughly, but I saw some inconsistency when it was disabled. For the record, it’s on by default.

You can check with:

::*> cifs options show -fields is-use-junctions-as-reparse-points-enabled

Give it a try and let me know how it works for you in the comments!

Cache Rules Everything Around Me: New Global Name Service Cache in ONTAP 9.3

cache-rules

In an ONTAP cluster made up of individual nodes with individual hardware resources, it’s useful if a storage administrator can manage the entire cluster as a monolithic entity, without having to worry about what lives where.

Prior to ONTAP 9.3, name service caches were node-centric, for the most part. This could sometimes create scenarios where a cache had become stale on one node but had recently been populated on another. Thus, a client might get different results depending on which physical node its network connection landed on.

The following is pulled right out of the new name services best practices technical report (https://www.netapp.com/us/media/tr-4668.pdf), which acts as an update to TR-4379. I wrote some of this, but most of what’s written here is by the new NFS/Name Services TME Chris Hurley. (@averageguyx) This is basically a copy/paste, but I thought this was a cool enough feature to highlight on its own.

Global Name Services Cache in ONTAP 9.3

ONTAP 9.3 offers a new caching mechanism that moves name service caches out of memory and into a persistent cache that is replicated asynchronously between all nodes in the cluster. This provides more reliability and resilience in the event of failovers, as well as offering higher limits for name service entries due to being cached on disk rather than in node memory.

The name service cache is enabled by default. If legacy cache commands are attempted in ONTAP 9.3 with name service caching enabled, an error will occur, such as the following:

Error: show failed: As name service caching is enabled, "Netgroups" caches no longer exist. Use the command "vserver services name-service cache netgroups members show" (advanced privilege level) to view the corresponding name service cache entries.

The name service caches are controlled in a centralized location, below the name-service cache command set. This provides easier cache management, from configuring caches to clearing stale entries.

The global name service cache can be disabled for individual caches using vserver services name-service cache commands in advanced privilege, but it is not recommended to do so. For more detailed information, please see later sections in this document.

ONTAP also offers the additional benefit of using the caches while external name services are unavailable. If there is an entry in the cache, regardless of whether the entry’s TTL has expired, ONTAP will use that cache entry when external name service servers cannot be reached, thereby providing continued access to data served by the SVM.
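To make the “use an expired entry when the name service is down” behavior concrete, here is a minimal Python sketch of the idea. It is a toy model, not ONTAP’s implementation; the class, the exception and the lookup callable are all hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Tuple

class NameServiceUnavailable(Exception):
    """Raised by the (hypothetical) lookup callable when no server can be reached."""

class StaleTolerantCache:
    def __init__(self, lookup: Callable[[str], str], ttl: float):
        self._lookup = lookup
        self._ttl = ttl
        self._entries: Dict[str, Tuple[str, float]] = {}  # key -> (value, expiry)

    def get(self, key: str, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        cached = self._entries.get(key)
        if cached is not None and now < cached[1]:
            return cached[0]                 # fresh hit, no external query
        try:
            value = self._lookup(key)        # missing or expired: ask the name service
        except NameServiceUnavailable:
            if cached is not None:
                return cached[0]             # expired entry beats losing access
            raise                            # nothing cached, nothing to fall back on
        self._entries[key] = (value, now + self._ttl)
        return value
```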

Hosts Cache

There are two individual host caches, forward-lookup and reverse-lookup, but the hosts cache settings are controlled as a whole. When a record is retrieved from DNS, the TTL of that record is used for the cache TTL; otherwise, the default TTL in the host cache settings is used (24 hours). The default for negative entries (host not found) is 60 seconds. Changing DNS settings does not affect the cache contents in any way.

  • The network ping command does not use the name services hosts cache when using a hostname.
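The TTL rules for the hosts cache boil down to a three-way choice. Here is a tiny Python sketch using the defaults quoted above. The function and constant names are mine, not ONTAP options.

```python
from typing import Optional

DEFAULT_TTL_SECONDS = 24 * 60 * 60  # default positive TTL from the text (24 hours)
NEGATIVE_TTL_SECONDS = 60           # default TTL for "host not found" entries

def host_cache_ttl(found: bool, record_ttl: Optional[int]) -> int:
    """Pick the cache TTL for a host lookup result (illustrative only)."""
    if not found:
        return NEGATIVE_TTL_SECONDS  # negative entry
    if record_ttl is not None:
        return record_ttl            # a DNS record brings its own TTL
    return DEFAULT_TTL_SECONDS       # no record TTL available, use the default
```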

User and Group Cache

The user and group caches consist of three categories: passwd (user), group and group membership.

  • Cluster RBAC access does not use any of the caches.

Passwd (User) Cache

User cache consists of two caches, passwd and passwd-by-uid.  The caches only cache the name, uid and gid aspects of the user data to conserve space since the other data such as homedir and shell are irrelevant for NAS access.  When an entry is placed in the passwd cache, the corresponding entry is created in the passwd-by-uid cache.  By the same token, when an entry is deleted from one cache, the corresponding entry will be deleted from the other cache.  If you have an environment where there are disjointed username to uid mappings, there is an option to disable this behavior.
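The paired behavior of the passwd and passwd-by-uid caches can be sketched in Python like this. It is a toy model with made-up names, where the `linked` flag stands in for the option to disable the pairing.

```python
from typing import Dict, Tuple

class PasswdCache:
    """Toy model of the linked passwd / passwd-by-uid caches."""

    def __init__(self, linked: bool = True):
        self.linked = linked                         # False models disabling the pairing
        self.by_name: Dict[str, Tuple[int, int]] = {}  # name -> (uid, gid)
        self.by_uid: Dict[int, Tuple[str, int]] = {}   # uid -> (name, gid)

    def add(self, name: str, uid: int, gid: int) -> None:
        self.by_name[name] = (uid, gid)
        if self.linked:
            self.by_uid[uid] = (name, gid)           # mirror into the by-uid cache

    def delete_by_name(self, name: str) -> None:
        entry = self.by_name.pop(name, None)
        if entry is not None and self.linked:
            self.by_uid.pop(entry[0], None)          # drop the mirrored entry too

cache = PasswdCache()
cache.add("prof1", 1100, 1101)
print(cache.by_uid[1100])  # ('prof1', 1101)
```

Only the name, uid and gid are stored, matching the fields the text says are cached.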

Group Cache

Like the passwd cache, the group cache consists of two caches, group and group-by-gid. When an entry is placed in the group cache, the corresponding entry is created in the group-by-gid cache. By the same token, when an entry is deleted from one cache, the corresponding entry will be deleted from the other cache. The full group membership is not necessary for NAS data access and is not cached, to conserve space; therefore, only the group name and gid are cached. If you have an environment where there are disjointed group name to gid mappings, there is an option to disable this behavior.

Group Membership Cache

In file and NIS environments, there is no efficient way to gather a list of groups a particular user is a member of, so for these environments ONTAP has a group membership cache to provide these efficiencies.  The group membership cache consists of a single cache and contains a list of groups a user is a member of.

Netgroup Cache

Beginning in ONTAP 9.3, the various netgroup caches have been consolidated into two caches: a netgroup.byhost and a netgroup.byname cache. The netgroup.byhost cache is the first cache consulted for the netgroups a host is a part of. Next, if this information is not available, the query reverts to gathering the full netgroup members and comparing that to the host. If the information is not in the cache, then the same process is performed against the netgroup ns-switch sources. If a host requesting access via a netgroup is found via the netgroup membership lookup process, that ip-to-netgroup mapping is always added to the netgroup.byhost cache for faster future access. This also leads to needing a lower TTL for the members cache so that changes in netgroup membership can be reflected in the ONTAP caches within the TTL timeframe.
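The lookup order can be sketched in Python as follows. This is a simplified model: the ns-switch source lookup is collapsed into a single hypothetical `fetch_members` callable, and storing fetched members in the byname cache is my assumption rather than something the text spells out.

```python
from typing import Callable, Dict, Set

def host_in_netgroup(host_ip: str,
                     netgroup: str,
                     byhost_cache: Dict[str, Set[str]],
                     byname_cache: Dict[str, Set[str]],
                     fetch_members: Callable[[str], Set[str]]) -> bool:
    """Return True if host_ip belongs to netgroup, filling caches along the way."""
    # 1. netgroup.byhost cache: which netgroups is this host already known to be in?
    if netgroup in byhost_cache.get(host_ip, set()):
        return True
    # 2. netgroup.byname cache: the full member list, if we have it.
    members = byname_cache.get(netgroup)
    if members is None:
        # 3. Not cached: go to the ns-switch sources (hypothetical callable).
        members = fetch_members(netgroup)
        byname_cache[netgroup] = members  # assumption: fetched members get cached
    if host_ip in members:
        # A successful membership lookup always populates the byhost cache.
        byhost_cache.setdefault(host_ip, set()).add(netgroup)
        return True
    return False
```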

Viewing cache entries

Each of the above name service caches can be viewed. This can be used to confirm whether the expected results are being returned by the name service servers. Each cache has its own options that you can use to filter the results and find what you are looking for. To view a cache, use the name-service cache <cache> <subcache> show command.

Caches are unique per vserver, so it is suggested to view caches on a per-vserver basis.  Below are some examples of the caches and the options.

ontap9-tme-8040::*> name-service cache hosts forward-lookup show  ?
  (vserver services name-service cache hosts forward-lookup show)
  [ -instance | -fields <fieldname>, ... ]
  [ -vserver <vserver name> ]                                                   *Vserver
  [[-host] <text>]                                                              *Hostname
  [[-protocol] {Any|ICMP|TCP|UDP}]                                              *Protocol (default: *)
  [[-sock-type] {SOCK_ANY|SOCK_STREAM|SOCK_DGRAM|SOCK_RAW}]                     *Sock Type (default: *)
  [[-flags] {FLAG_NONE|AI_PASSIVE|AI_CANONNAME|AI_NUMERICHOST|AI_NUMERICSERV}]  *Flags (default: *)
  [[-family] {Any|Ipv4|Ipv6}]                                                   *Family (default: *)
  [ -canonname <text> ]                                                         *Canonical Name
  [ -ips <IP Address>, ... ]                                                    *IP Addresses
  [ -ip-protocol {Any|ICMP|TCP|UDP}, ... ]                                      *Protocol
  [ -ip-sock-type {SOCK_ANY|SOCK_STREAM|SOCK_DGRAM|SOCK_RAW}, ... ]             *Sock Type
  [ -ip-family {Any|Ipv4|Ipv6}, ... ]                                           *Family
  [ -ip-addr-length <integer>, ... ]                                            *Length
  [ -source {none|files|dns|nis|ldap|netgrp_byname} ]                           *Source of the Entry
  [ -create-time <"MM/DD/YYYY HH:MM:SS"> ]                                      *Create Time
  [ -ttl <integer> ]                                                            *DNS TTL

ontap9-tme-8040::*> name-service cache unix-user user-by-id show
  (vserver services name-service cache unix-user user-by-id show)
Vserver    UID         Name         GID            Source  Create Time
---------- ----------- ------------ -------------- ------- -----------
SVM1       0           root         1              files   1/25/2018 15:07:13
ch-svm-nfs1
           0           root         1              files   1/24/2018 21:59:47
2 entries were displayed.

If there are no entries in a particular cache, the following message will be shown:

ontap9-tme-8040::*> name-service cache netgroups members show
  (vserver services name-service cache netgroups members show)
This table is currently empty.

There you have it! New cache methodology in ONTAP 9.3. If you’re using NAS and name services in ONTAP, it’s highly recommended to go to ONTAP 9.3 to take advantage of this new feature.

Behind the Scenes: Episode 126 – Komprise

Welcome to Episode 126, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”

tot-gopher

This week on the podcast, we bring in Komprise (@Komprise) CEO Kumar Goswami (@KumarKGoswami) to chat about data management and how their software helps get the most out of your NetApp storage systems!

komprise

For more information about Komprise, check out komprise.com!

Finding the Podcast

The podcast is all finished and up for listening. You can find it on iTunes or SoundCloud or by going to techontappodcast.com.

This week’s episode is here:

Also, if you don’t like using iTunes or SoundCloud, we just added the podcast to Stitcher.

http://www.stitcher.com/podcast/tech-ontap-podcast?refid=stpr

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

Our YouTube channel (episodes uploaded sporadically) is here:

XCP SMB/CIFS support available!

If you’re not familiar with what XCP is, I covered it in a previous blog post, Migrating to ONTAP – Ludicrous speed! as well as in the XCP podcast. Basically, it’s a super-fast way to scan and migrate data.

One of the downsides of the tool was the fact that it only supported NFSv3 migrations, which also meant it couldn’t handle NTFS style ACLs. Doing that would require a SMB/CIFS supported version of XCP. Today, we get that with XCP SMB/CIFS 1.0:

https://mysupport.netapp.com/tools/download/ECMLP2357425DT.html?productID=62115&pcfContentID=ECMLP2357425

XCP for SMB/CIFS supports the following:

“show”      Displays information about the CIFS shares of a system
“scan”      Reads all files and directories found on a CIFS share and builds assessment reports
“copy”      Recursively copies everything from source to destination
“sync”      Performs multiple incremental syncs from source to target
“verify”    Verifies that the target state matches the source, including attributes and NTFS ACLs
“activate”  Activates the XCP license on Windows hosts
“help”      Displays detailed information about XCP commands and options

 

Right now, it’s CLI only, but be on the lookout for a GUI version.

“Installing” XCP on Windows

XCP for Windows is a single executable that runs from cmd or a PowerShell window. One prerequisite is the Microsoft Visual C++ Redistributable for Visual Studio 2017. If you don’t install it, trying to run the program results in an error that calls out a specific DLL that isn’t registered.

When I copied the file to my Windows host, I created a new directory called “C:\XCP.” You can put that directory anywhere. To run the utility in CMD, you can either navigate to the directory and run “xcp” or add the directory to your system paths to run from anywhere.

For example:

[Screenshots: the XCP directory added to the Windows Path environment variable]

Once that’s done, run XCP from any location:

[Screenshots: xcp running from cmd and from PowerShell]

Licensing XCP

XCP is a licensed feature. That doesn’t mean you have to pay for it; the license is only used for tracking purposes. But you do have to apply a license. In Windows, that’s pretty easy.

  1. Download a license from xcp.netapp.com
  2. Copy the license into the C:\NetApp\XCP folder
  3. Run “xcp activate”


XCP show

The command “xcp show \\server” can give some useful information for an ONTAP SMB/CIFS server, such as:

  • Available shares
  • Capacity (used and available)
  • Current connections
  • Folder path
  • Share attributes and permissions

This output is a good way to get an overall look at what is available on a server.


XCP scan

XCP has a number of useful scanning features. These include:

PS C:\XCP> xcp help scan

usage: xcp scan [-h] [-v] [-parallel <n>] [-match <filter>] [-preserve-atime]
 [-depth <n>] [-stats] [-l] [-ownership] [-du]
 [-fmt <expression>]
 source

positional arguments:
 source

optional arguments:
 -h, --help show this help message and exit
 -v increase debug verbosity
 -parallel <n> number of concurrent processes (default: <cpu-count>)
 -match <filter> only process files and directories that match the filter
 (see `xcp help -match` for details)
 -preserve-atime restore last accessed date on source
 -depth <n> limit the search depth
 -stats print tree statistics report
 -l detailed file listing output
 -ownership retrieve ownership information
 -du summarize space usage of each directory including
 subdirectories
 -fmt <expression> format file listing according to the python expression
 (see `xcp help -fmt` for details)

I scanned my “shared” directory with the -stats option and it was able to scan over 60,000 files in 31 seconds and gave me the following stats:

== Maximum Values ==
 Size Depth Namelen Dirsize
 2.02KiB 5 15 100

== Average Values ==
 Size Depth Namelen Dirsize
 25.6 5 6 6

== Top File Extensions ==
 .py
 50003 1

== Number of files ==
 empty <8KiB 8-64KiB 64KiB-1MiB 1-10MiB 10-100MiB >100MiB
 3 50001

== Space used ==
 empty <8KiB 8-64KiB 64KiB-1MiB 1-10MiB 10-100MiB >100MiB
 0 1.22MiB 0 0 0 0 0

== Directory entries ==
 empty 1-10 10-100 100-1K 1K-10K >10k
 2 10004 101

== Depth ==
 0-5 6-10 11-15 16-20 21-100 >100
 60111

== Modified ==
 >1 year >1 month 1-31 days 1-24 hrs <1 hour <15 mins future
 60111

== Created ==
 >1 year >1 month 1-31 days 1-24 hrs <1 hour <15 mins future
 60111

Total count: 60111
Directories: 10107
Regular files: 50004
Symbolic links:
Junctions:
Special files:
Total space for regular files: 1.22MiB
Total space for directories: 0
Total space used: 1.22MiB
60,111 scanned, 0 errors, 31s
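
To make the size buckets in that report a bit more concrete, here is a minimal Python sketch that walks a directory tree and counts files into the same ranges. It only illustrates the bucketing idea and is not how XCP is implemented; the function names, the bucket table, and the handling of sizes that land exactly on a boundary are my own assumptions.

```python
import os
from collections import Counter

KIB = 1024
MIB = 1024 * KIB

# Exclusive upper bounds per bucket, mirroring the report headers.
# Where an exact boundary value (e.g. exactly 8KiB) lands is an assumption.
BUCKETS = [
    ("<8KiB", 8 * KIB),
    ("8-64KiB", 64 * KIB),
    ("64KiB-1MiB", 1 * MIB),
    ("1-10MiB", 10 * MIB),
    ("10-100MiB", 100 * MIB),
]


def size_bucket(size):
    """Return the report bucket name for a file size in bytes."""
    if size == 0:
        return "empty"
    for name, upper in BUCKETS:
        if size < upper:
            return name
    return ">100MiB"


def histogram(root):
    """Walk root and count regular files per size bucket."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            counts[size_bucket(os.path.getsize(path))] += 1
    return counts
```

Calling `histogram(r"\\demo\shared")` from a Windows host would walk the share in a single thread, which is exactly the kind of slow scan a parallel tool is meant to avoid.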

When I set -parallel to 8, it finished in 18 seconds:

PS C:\XCP> xcp scan -stats -parallel 8 \\demo\shared

Total count: 60111
Directories: 10107
Regular files: 50004
Symbolic links:
Junctions:
Special files:
Total space for regular files: 1.22MiB
Total space for directories: 0
Total space used: 1.22MiB
60,111 scanned, 0 errors, 18s
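
For a rough sense of what those two runs mean in items per second, here is a quick bit of Python arithmetic using only the numbers from the output above. The variable and function names are just for illustration.

```python
def items_per_second(items, seconds):
    """Average scan rate for a run."""
    return items / seconds


ITEMS = 60111           # files and directories scanned
DEFAULT_SECONDS = 31    # first run, default -parallel
PARALLEL8_SECONDS = 18  # second run, -parallel 8

default_rate = items_per_second(ITEMS, DEFAULT_SECONDS)
parallel_rate = items_per_second(ITEMS, PARALLEL8_SECONDS)
speedup = DEFAULT_SECONDS / PARALLEL8_SECONDS

print(f"default:     {default_rate:,.0f} items/s")
print(f"-parallel 8: {parallel_rate:,.0f} items/s")
print(f"speedup:     {speedup:.2f}x")
```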

XCP copy

With xcp copy, I can copy SMB/CIFS data with or without ACLs at a much faster rate than simple robocopy. Keep in mind that this version of XCP doesn’t use BACKUP OPERATOR rights, so you’d need to run the utility as an admin user on both source and destination.

In the following example, I used robocopy to copy the same dataset as XCP to a NetApp FlexGroup volume.

Robocopy to FlexGroup results (~20-30 minutes)

         Total Copied Skipped Mismatch FAILED Extras
 Dirs :  10107  10106       1        0      0      0
 Files : 50004  50004       0        0      0      0
 Bytes : 1.21m  1.21m       0        0      0      0
 Times : 0:19:01 0:13:11 0:00:00 0:05:50

Speed : 1615 Bytes/sec.
 Speed : 0.092 MegaBytes/min.

UPDATE: Someone asked if the above robocopy run was done with the /MT flag, which would be a more fair apples to apples comparison, since XCP does multithreading. It wasn’t. The syntax used was:

PS C:\XCP> robocopy /S /COPYALL source destination

So, I re-ran it using MT:8 and with an empty FlexGroup after restoring the base snapshot and converting the security style to NTFS to ensure the ACLs come over as well. The multithreading of robocopy cut the time to completion roughly in half.

Robocopy /MT to FlexGroup results (~8-9 minutes)

 PS C:\XCP> robocopy /S /COPYALL /MT:8 \\demo\shared \\demo\flexgroup\robocopyMT

-------------------------------------------------------------------------------
 ROBOCOPY :: Robust File Copy for Windows
-------------------------------------------------------------------------------
Started : Tue Aug 22 20:32:54 2017

Source : \\demo\shared\
 Dest : \\demo\flexgroup\robocopyMT\

Files : *.*

Options : *.* /S /COPYALL /MT:8 /R:1000000 /W:30
------------------------------------------------------------------------------
Total Copied Skipped Mismatch FAILED Extras
 Dirs : 10107 10106 1 0 0 0
 Files : 50004 50004 0 0 0 0
 Bytes : 1.21 m 1.21 m 0 0 0 0
 Times : 0:35:21 0:06:23 0:00:00 0:01:59

Ended : Tue Aug 22 20:41:18 2017

Then I re-ran the XCP copy to the FlexGroup after restoring the baseline snapshot and making sure the security style of the volume was NTFS. (It was UNIX before, which would have affected ACLs and overall speed.) Even so, the run still finished in under 4 minutes. So, we’re looking at roughly 2x as fast as multithreaded robocopy with a small 60k file and folder workload. In addition, the host I’m using is a Windows 7 client VM with a 1GB network connection and not a ton of power behind it. XCP works best with more robust hardware.


XCP to FlexGroup results – NTFS security style (~4 minutes!)

PS C:\XCP> xcp copy -parallel 8 \\demo\shared \\demo\flexgroup\XCP
1,436 scanned, 0 errors, 0 skipped, 0 copied, 0 (0/s), 5s
4,381 scanned, 0 errors, 0 skipped, 507 copied, 12.4KiB (2.48KiB/s), 10s
5,426 scanned, 0 errors, 0 skipped, 1,882 copied, 40.5KiB (5.64KiB/s), 15s
7,431 scanned, 0 errors, 0 skipped, 3,189 copied, 67.4KiB (5.37KiB/s), 20s
8,451 scanned, 0 errors, 0 skipped, 4,537 copied, 96.1KiB (5.75KiB/s), 25s
9,651 scanned, 0 errors, 0 skipped, 5,867 copied, 123KiB (5.31KiB/s), 30s
10,751 scanned, 0 errors, 0 skipped, 7,184 copied, 150KiB (5.58KiB/s), 35s
12,681 scanned, 0 errors, 0 skipped, 8,507 copied, 178KiB (5.44KiB/s), 40s
13,891 scanned, 0 errors, 0 skipped, 9,796 copied, 204KiB (5.26KiB/s), 45s
14,861 scanned, 0 errors, 0 skipped, 11,136 copied, 232KiB (5.70KiB/s), 50s
15,966 scanned, 0 errors, 0 skipped, 12,464 copied, 259KiB (5.43KiB/s), 55s
18,031 scanned, 0 errors, 0 skipped, 13,784 copied, 287KiB (5.52KiB/s), 1m0s
19,056 scanned, 0 errors, 0 skipped, 15,136 copied, 316KiB (5.80KiB/s), 1m5s
20,261 scanned, 0 errors, 0 skipped, 16,436 copied, 342KiB (5.21KiB/s), 1m10s
21,386 scanned, 0 errors, 0 skipped, 17,775 copied, 370KiB (5.65KiB/s), 1m15s
23,286 scanned, 0 errors, 0 skipped, 19,068 copied, 397KiB (5.36KiB/s), 1m20s
24,481 scanned, 0 errors, 0 skipped, 20,380 copied, 424KiB (5.44KiB/s), 1m25s
25,526 scanned, 0 errors, 0 skipped, 21,683 copied, 451KiB (5.35KiB/s), 1m30s
26,581 scanned, 0 errors, 0 skipped, 23,026 copied, 479KiB (5.62KiB/s), 1m35s
28,421 scanned, 0 errors, 0 skipped, 24,364 copied, 507KiB (5.63KiB/s), 1m40s
29,701 scanned, 0 errors, 0 skipped, 25,713 copied, 536KiB (5.70KiB/s), 1m45s
30,896 scanned, 0 errors, 0 skipped, 26,996 copied, 561KiB (5.15KiB/s), 1m50s
31,911 scanned, 0 errors, 0 skipped, 28,334 copied, 590KiB (5.63KiB/s), 1m55s
33,706 scanned, 0 errors, 0 skipped, 29,669 copied, 617KiB (5.52KiB/s), 2m0s
35,081 scanned, 0 errors, 0 skipped, 30,972 copied, 644KiB (5.44KiB/s), 2m5s
36,116 scanned, 0 errors, 0 skipped, 32,263 copied, 671KiB (5.30KiB/s), 2m10s
37,201 scanned, 0 errors, 0 skipped, 33,579 copied, 698KiB (5.48KiB/s), 2m15s
38,531 scanned, 0 errors, 0 skipped, 34,898 copied, 726KiB (5.65KiB/s), 2m20s
40,206 scanned, 0 errors, 0 skipped, 36,199 copied, 753KiB (5.36KiB/s), 2m25s
41,371 scanned, 0 errors, 0 skipped, 37,507 copied, 780KiB (5.39KiB/s), 2m30s
42,441 scanned, 0 errors, 0 skipped, 38,834 copied, 808KiB (5.63KiB/s), 2m35s
43,591 scanned, 0 errors, 0 skipped, 40,161 copied, 835KiB (5.47KiB/s), 2m40s
45,536 scanned, 0 errors, 0 skipped, 41,445 copied, 862KiB (5.31KiB/s), 2m45s
46,646 scanned, 0 errors, 0 skipped, 42,762 copied, 890KiB (5.56KiB/s), 2m50s
47,691 scanned, 0 errors, 0 skipped, 44,052 copied, 916KiB (5.30KiB/s), 2m55s
48,606 scanned, 0 errors, 0 skipped, 45,371 copied, 943KiB (5.45KiB/s), 3m0s
50,611 scanned, 0 errors, 0 skipped, 46,518 copied, 967KiB (4.84KiB/s), 3m5s
51,721 scanned, 0 errors, 0 skipped, 47,847 copied, 995KiB (5.54KiB/s), 3m10s
52,846 scanned, 0 errors, 0 skipped, 49,138 copied, 1022KiB (5.32KiB/s), 3m15s
53,876 scanned, 0 errors, 0 skipped, 50,448 copied, 1.02MiB (5.53KiB/s), 3m20s
55,871 scanned, 0 errors, 0 skipped, 51,757 copied, 1.05MiB (5.42KiB/s), 3m25s
57,011 scanned, 0 errors, 0 skipped, 53,080 copied, 1.08MiB (5.52KiB/s), 3m30s
58,101 scanned, 0 errors, 0 skipped, 54,384 copied, 1.10MiB (5.39KiB/s), 3m35s
59,156 scanned, 0 errors, 0 skipped, 55,714 copied, 1.13MiB (5.57KiB/s), 3m40s
60,111 scanned, 0 errors, 0 skipped, 57,049 copied, 1.16MiB (5.52KiB/s), 3m45s
60,111 scanned, 0 errors, 0 skipped, 58,483 copied, 1.19MiB (6.02KiB/s), 3m50s
60,111 scanned, 0 errors, 0 skipped, 59,907 copied, 1.22MiB (5.79KiB/s), 3m55s
60,111 scanned, 0 errors, 0 skipped, 60,110 copied, 1.22MiB (5.29KiB/s), 3m56s
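
To put the three runs side by side, here is a small Python sketch that converts the times from the outputs above into seconds and compares them. One assumption on my part: for the /MT run I use the wall-clock gap between the “Started” and “Ended” timestamps rather than the “Times” row, since that row reads like time added up across threads.

```python
def hms_to_seconds(text):
    """Convert an H:MM:SS string into seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Single-threaded robocopy: total from its "Times" row.
robocopy_single = hms_to_seconds("0:19:01")

# robocopy /MT:8: wall clock between the Started and Ended timestamps.
robocopy_mt = hms_to_seconds("20:41:18") - hms_to_seconds("20:32:54")

# XCP: final progress line reported 3m56s.
xcp = hms_to_seconds("0:03:56")

for name, seconds in (
    ("robocopy", robocopy_single),
    ("robocopy /MT:8", robocopy_mt),
    ("xcp -parallel 8", xcp),
):
    print(f"{name:16} {seconds:5d}s  {seconds / xcp:.1f}x the XCP time")
```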

XCP sync and verify

Sync and verify can be used during data migrations to ensure the source and target match up before cutting over. These use the same multi-processing capabilities as copy, so this should also be fast. Keep in mind that sync could also potentially be used to do incremental backups using XCP!


Behind the Scenes: Episode 100 – XCP

Welcome to Episode 100, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”


This week is our 100th episode! In true TechONTAP podcast fashion, we didn’t celebrate it at all.

Instead, we stuck to the tech and brought in Bogdan Minciu and Joshey Lazer of the XCP team to discuss XCP and the upcoming release that supports CIFS/SMB.

Also, check out the podcast episode on migrations (where we chat about XCP) and this XCP blog.


SMB1 Vulnerabilities: How do they affect NetApp’s Data ONTAP?

Google SMBv1 vulnerability, and you’ll get a ton of hits. There’s a reason for this.

SMB1 is the devil!


But seriously, there are some major security holes in the protocol.

For a good rundown, check out the new NetApp CIFS/SMB TME Chris Hurley’s blog:

http://averageguyx.blogspot.com/2017/03/smb1-is-baaaaaad.html

This is in addition to the limitations of SMB1, such as no resiliency to network loss, no durable handles, and poor overall performance due to chattiness. There are many good reasons why Microsoft has decided to deprecate SMB1 in favor of newer protocols. Ned Pyle (@NerdPyle), the SMB owner at Microsoft, gives plenty of impassioned reasons in his TechNet blog “Stop using SMB1!”

So, there we are. SMB1 is bad, mmkay?

How does SMB1’s devil status affect NetApp’s ONTAP operating systems?

For the official NetApp statement, see this KB:

https://kb.netapp.com/support/s/article/NTAP-20170515-0001

This question comes up a bit here at NetApp, since security scanners will throw bells, whistles and alarms whenever SMB1 is detected in an environment. What follows is:

  • Does SMB1 in ONTAP have the same vulnerabilities?
  • Can I disable SMB1 in ONTAP?
  • If I can’t disable it, can I block it?

The good news is that the main security vulnerabilities that plague SMB1 in Windows (such as the 0-day exploits) generally don’t affect ONTAP, because ONTAP doesn’t run the Windows SMB code. It uses a proprietary, custom-built CIFS/SMB stack (akin to Samba), so the vulnerabilities in the Windows implementation don’t carry over.

Note: I can’t take all the credit for the information in this blog. That credit goes to John Lantz (CIFS TME at NetApp), as well as various CIFS/SMB engineering resources here.

Can I disable SMB1 in ONTAP?

While the vulnerabilities don’t necessarily affect ONTAP, the security scanners are still triggering alarms and managers still want the red X’s to go away.


As a result, people want to just turn it off in ONTAP, especially since they aren’t currently using it in their environments (hopefully).

The good news is that ONTAP is in the process of deprecating SMB1. The bad news? It’s still there.

However, in ONTAP 9.2, NetApp introduced a new CIFS option to disable SMB1 in advanced privilege!

cluster::> set advanced
cluster::*> cifs options modify -vserver DEMO -smb1-enabled false
 [-smb1-enabled {true|false}] - Enable SMB1 Protocol (privilege: advanced)
 This optional parameter specifies whether the CIFS server negotiates the SMB 1.0 version of the CIFS protocol. The default value for this parameter is true.

If you need to disable SMBv1 in ONTAP, you need to be on ONTAP 9.2 or later.

We also have the ability to control what SMB version is used with domain controllers for authentication. In systems running ONTAP 7-mode, use the following option to enable SMB2.

cifs.smb2.client.enable

In systems running clustered ONTAP, starting in ONTAP 8.3.2P5, you can disable SMB1 for connections to the DC, as well as enable SMB2.

[-smb1-enabled-for-dc-connections {false|true|system-default}] - SMB1 Enabled for DC Connections
 This parameter specifies whether SMB1 is enabled for use with connections to domain controllers. If you do not specify this parameter, the default is system-default.

SMB1 Enabled For DC Connections can be one of the following:
o false - SMB1 is not enabled.
o true - SMB1 is enabled.
o system-default - This sets the option to whatever is the default for the release of Data ONTAP that is running. For this release it is: SMB1 is enabled.

[-smb2-enabled-for-dc-connections {false|true|system-default}] - SMB2 Enabled for DC Connections
 This parameter specifies whether SMB2 is enabled for use with connections to domain controllers. If you do not specify this parameter, the default is system-default.

SMB2 Enabled For DC Connections can be one of the following:
o false - SMB2 is not enabled.
o true - SMB2 is enabled.
o system-default - This sets the option to whatever is the default for the release of Data ONTAP that is running. For this release it is: SMB2 is not enabled.

Use the following command to do that:

cifs security modify -vserver DEMO -smb1-enabled-for-dc-connections false -smb2-enabled-for-dc-connections true

If I can’t disable it in ONTAP, can I block it?

Technically, you *could* block the SMB1 ports. However, if you block ports that SMB2 also needs (such as 445), you’d be in trouble.

The official recommendation from Microsoft is a combination of disabling SMB1 on clients (you could handle this via Group Policy), as well as blocking ports on *external* facing interfaces. In other words, don’t allow SMB outside of the firewall.

Here’s the official link:

https://technet.microsoft.com/en-us/library/cc766392%28v=ws.10%29.aspx?f=255&MSPPError=-2147217396

To disable SMB1 on the client:

https://support.microsoft.com/en-us/kb/2696547

Inside your firewall, you shouldn’t need the following ports, so block away:

  • UDP/137 (NetBIOS name service)
  • UDP/138 (NetBIOS datagram service)
  • TCP/139 (NetBIOS session service)
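
If you want to sanity check a firewall rule list against that guidance, the decision boils down to a small lookup. Here’s a minimal Python sketch of it. The function name and the action strings are made up for illustration, and it only encodes the ports named in this post.

```python
# Legacy NetBIOS ports the post says you shouldn't need inside the firewall.
LEGACY_NETBIOS = {
    ("udp", 137): "NetBIOS name service",
    ("udp", 138): "NetBIOS datagram service",
    ("tcp", 139): "NetBIOS session service",
}

# SMB2 and later also need this port, so blocking it breaks everything.
SMB_DIRECT = ("tcp", 445)


def firewall_action(protocol, port):
    """Suggest an action for a port, following the guidance in the post."""
    key = (protocol.lower(), port)
    if key in LEGACY_NETBIOS:
        return "block"
    if key == SMB_DIRECT:
        return "internal-only"  # keep it, but never expose it externally
    return "not-covered"


for proto, port in (("UDP", 137), ("TCP", 139), ("TCP", 445), ("TCP", 2049)):
    print(proto, port, firewall_action(proto, port))
```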

In some cases, you won’t be able to rid yourself entirely of SMB1. Remember that $30k printer/copier/scanner that you bought 10 years ago that was cool because you could scan directly to a SMB share? Yeah…. that’s probably still using SMB1. Check with your scanner/copier vendor to see if they have any software updates. Otherwise, you may need to disable SMB1 on the copier/scanner, or budget for a new one.


For the official NetApp statement on SMB1, check out this TR, starting on page 4:

http://www.netapp.com/us/media/tr-4543.pdf

Also, check out Episode 95 of the Tech ONTAP podcast, where we discuss WannaCry and Petya!

https://m.soundcloud.com/techontap_podcast/episode-95-quarterly-security-update-wannacry-and-petya

Mixed perceptions with NetApp multiprotocol NAS access

EDIT: As the original post for this was super long, I’ve since broken it up into a 2 part post. I moved the vserver security information to the following post:

Managing ACLs via the ONTAP Command Line

NetApp’s ONTAP operating system is one of the few storage operating systems out there that supports data access from both CIFS/SMB and NFS clients. NetApp’s been doing this for a long time – longer than I’ve been there, and I’m going on 10 years!

Despite the fact that it’s been around so long and is one of *the* core competencies in ONTAP, it’s one of the most frequently misunderstood configurations I see. When I was in support, it was one of the biggest case generators. As the NFS TME, it’s one of the most common topics customers email me about.

I can tell you what it’s not….

Multiprotocol NAS is NOT “Mixed Mode”

Many people use this terminology to describe access from multiple clients. Unfortunately, it only adds to the confusion, because there is also a security style called “mixed” (see below). People associate the two and then start setting mixed security styles when they don’t need to.

So, call it what it is – Multiprotocol NAS. 🙂

What’s so hard about it?

The reason it seems to confound so many people is twofold:

  • Windows administrators are generally not UNIX-savvy
  • UNIX administrators are generally not Windows-savvy

To truly understand multiprotocol NAS, you either have to know both Windows and UNIX file systems/security semantics pretty well, or be open to the fact that Windows and UNIX have similarities and differences.

That said, when you do understand how it works and get it configured properly, it’s a pretty powerful tool for serving data for multiple client types.

There’s currently a Multiprotocol TR in the works, but it will be a ways out. However, I just dealt with a recent multiprotocol NAS issue and wanted to do a brain dump before the information got stale and I had to revisit it. This blog is intended to be a quick hit guide to multiprotocol NAS in ONTAP. Some of the ideas will make their way into official TR format.

What makes multiprotocol NAS possible in ONTAP?

ONTAP is fairly agnostic when it comes to file systems and ACL styles. SMB and NFS clients use different security semantics, but the general concepts of those are the same.

Users, groups, permissions.

From there, things tend to skew a bit. Windows uses NTFS security concepts. NFS clients use mode bits for NFSv3/NFSv4.x or ACLs for NFSv4.x. NFSv3 had the concept of POSIX ACLs, but ONTAP doesn’t support those.

The issue is that NTFS ACLs are more complex than mode bits, but match up pretty nicely with NFSv4.x ACLs. Mode bits only do Read, Write, eXecute (RWX), so Windows ACLs don’t match up 1 to 1, especially when you have “special permissions” in the mix. As a result, when dealing with ONTAP file systems, we have the concept of a security style that helps us choose the style of ACL we want to implement. The choices we have:

  • NTFS – NTFS ACLs only
  • UNIX – UNIX style permissions only
  • Mixed – UNIX or NTFS permissions, depending on who last changed permissions
  • Unified (Infinite Volume only)

To properly address permissions, ONTAP has to pick one security style over the other. This allows the storage system to decide which direction a user will map to determine the correct permissions. After all, what’s the point of permissions if they don’t work properly?

User mapping

ONTAP is not unique in the concept of user mapping, but it is still a concept that gets people confused on occasion.

Essentially, to get the proper permissions on a NetApp storage system, a client must first pass a “test” in the form of initial authentication.

The initial test is “Who are you?”

The storage system needs to know that the user you are claiming to be is actually you. There are varying degrees of how secure this test is, mostly dependent on the protocol you’re using, but the bottom line is this: authentication helps us get a user name. That user name allows us to map to another user name, depending on the volume security style.

In general:

  • SMB clients always map to a UNIX user because ONTAP is UNIX-based, even if NTFS security style is in use
  • If no name mapping rules or 1:1 name mappings exist, SMB users map to a default UNIX user set in CIFS options (pcuser/65534 by default)
  • 65534 is “nobody” or “nfsnobody” in most UNIX clients
  • NFS clients only map to Windows users when the security style is NTFS
  • NFS clients cannot chmod or chown on NTFS style volumes; SMB clients cannot take ownership or change ACLs on UNIX style volumes
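
The rules above can be sketched in a few lines of Python. This is only a toy model of the decision order as I’ve described it (explicit rule, then 1:1 name match, then the default UNIX user). The function names, the shape of the rules dictionary, and the lowercase comparison are assumptions for illustration, not ONTAP’s implementation.

```python
DEFAULT_UNIX_USER = "pcuser"  # default from CIFS options, per the list above


def map_smb_user(windows_user, rules, unix_users):
    """Map a Windows user to a UNIX user: rule, then 1:1 match, then default.

    rules      -- hypothetical dict of explicit win-unix name mapping rules
    unix_users -- hypothetical set of known UNIX user names
    """
    if windows_user in rules:
        return rules[windows_user]
    short_name = windows_user.split("\\")[-1].lower()
    if short_name in unix_users:
        return short_name
    return DEFAULT_UNIX_USER


def nfs_maps_to_windows(effective_style):
    """NFS clients only need a unix-win mapping when the style is NTFS."""
    return effective_style == "ntfs"


unix_users = {"nfsdudeabides"}
rules = {"DOMAIN\\jsmith": "john"}
print(map_smb_user("DOMAIN\\nfsdudeabides", rules, unix_users))
print(map_smb_user("DOMAIN\\jsmith", rules, unix_users))
print(map_smb_user("DOMAIN\\stranger", rules, unix_users))
```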

Once a user has authenticated, the permissions can be discerned based on access control lists. One can see those ACLs via the CLI of the storage system with “vserver security file-directory show.”

cluster::*> vserver security file-directory show -vserver parisi -path /cifs

                 Vserver: parisi
               File Path: /cifs
       File Inode Number: 64
          Security Style: ntfs
         Effective Style: ntfs
          DOS Attributes: 10
  DOS Attributes in Text: ----D---
 Expanded Dos Attributes: -
            UNIX User Id: 0
           UNIX Group Id: 0
          UNIX Mode Bits: 777
  UNIX Mode Bits in Text: rwxrwxrwx
                    ACLs: NTFS Security Descriptor
                          Control:0x8004
                            Owner:BUILTIN\Administrators
                            Group:BUILTIN\Administrators
                            DACL - ACEs
                             ALLOW-Everyone-0x1f01ff
                             ALLOW-Everyone-0x10000000-OI|CI|IO
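
The ACE lines at the bottom of that output pack three or four fields into one string: type, account, access mask and optional inheritance flags. As a rough aid for reading them, here is a small Python sketch that splits an ACE line apart and names the two masks shown above. 0x1f01ff is the standard Windows FILE_ALL_ACCESS value and 0x10000000 is GENERIC_ALL. The parser itself is my own assumption about the text layout, not an ONTAP tool.

```python
# Well-known Windows access masks; only the two from the output are listed.
KNOWN_MASKS = {
    0x001F01FF: "Full Control (FILE_ALL_ACCESS)",
    0x10000000: "GENERIC_ALL",
}

# Inheritance flags as abbreviated in the output.
ACE_FLAGS = {
    "OI": "object inherit",
    "CI": "container inherit",
    "IO": "inherit only",
}


def parse_ace(text):
    """Split 'TYPE-account-0xMASK[-FLAG|FLAG]' into its parts."""
    ace_type, rest = text.split("-", 1)
    head, tail = rest.rsplit("-", 1)
    flags = []
    if not tail.lower().startswith("0x"):
        # The last field is a flag list, so the mask is one field earlier.
        flags = tail.split("|")
        head, tail = head.rsplit("-", 1)
    mask = int(tail, 16)
    return {
        "type": ace_type,
        "account": head,
        "mask": mask,
        "meaning": KNOWN_MASKS.get(mask, "unrecognized mask"),
        "flags": [ACE_FLAGS.get(flag, flag) for flag in flags],
    }


for line in ("ALLOW-Everyone-0x1f01ff", "ALLOW-Everyone-0x10000000-OI|CI|IO"):
    print(parse_ace(line))
```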

User/name mapping is one of the most important pieces of the multiprotocol NAS puzzle. Get that part right and most everything else is easy.

Name mapping can be done either locally (via name mapping rules) or with LDAP. TR-4073 covers this sort of thing in pretty fine detail.

Name services/LDAP

The easiest way to handle name mapping in ONTAP for multiprotocol NAS is to leverage a name service server like LDAP. When dealing with both SMB and NFS, the most logical choice is to use the existing Active Directory infrastructure to host UNIX identities. While you can host name mapping rules for users that don’t have the same UNIX and Windows names, it’s best to try to have UNIX and Windows user names match 1:1. (I.e., DOMAIN\nfsdudeabides == nfsdudeabides in UNIX).

TR-4073 covers LDAP and TR-4379 covers name service best practices for ONTAP 9.2 and prior. TR-4668 covers name services in ONTAP 9.3 and beyond.

Mixed Security Style

Fun fact – Mixed security style isn’t truly “mixed.” When you use mixed security style, it’s always either NTFS or UNIX security style at any given moment. This is known as the “effective” security style, which can be seen in “vserver security file-directory show.”

cluster::*> vserver security file-directory show -vserver parisi -path /cifs

                 Vserver: parisi
               File Path: /cifs
       File Inode Number: 64
          Security Style: ntfs
         Effective Style: ntfs

The “effective” style changes based on the last permission change. If an NFS client does a chmod or chown, the mixed security style volume changes to effective UNIX security style. If an SMB client changes owner or sets an ACL, the effective security style changes to NTFS. When the effective style changes, the way the storage does name mapping changes too (i.e., win-unix to unix-win).
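
One way to picture this is as a tiny state machine where the last permission change wins. The Python sketch below models only that flip. The class and method names are hypothetical and it says nothing about how ONTAP actually stores the style.

```python
class MixedStyleObject:
    """Tracks only the effective security style of a file or folder."""

    def __init__(self, effective="unix"):
        self.effective = effective
        self.history = [effective]

    def _set(self, style):
        # Record a flip only when the effective style really changes.
        if style != self.effective:
            self.effective = style
            self.history.append(style)

    # Permission changes from NFS flip the effective style to UNIX.
    def nfs_chmod(self):
        self._set("unix")

    def nfs_chown(self):
        self._set("unix")

    # Permission changes from SMB flip the effective style to NTFS.
    def smb_set_acl(self):
        self._set("ntfs")

    def smb_take_ownership(self):
        self._set("ntfs")


folder = MixedStyleObject("unix")
folder.smb_set_acl()
folder.nfs_chmod()
print(folder.history)
```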

Is mixed security style recommended?

Generally speaking, you don’t want file systems changing something behind the scenes without the knowledge of the storage administrators. Plus, these changes can affect functionality, and even access. As a result, mixed security style is generally not recommended. The only time you’d want to use mixed security style is if your environment requires the ability for clients or applications to change permissions from both NFS and SMB. And even then, if you do set up mixed security style, consider limiting the ability for regular users to take ownership or change permissions on folders and files via NTFS ACLs.

Otherwise, I personally recommend picking either NTFS or UNIX and sticking with it. That choice would be based on how you want your users to manage their ACLs, as well as how granular you want control to be on those file systems. For example, mode bits in UNIX only allow setting an owner, group and everyone else. There’s no way to set multiple groups with different access on the object unless you use NFSv4 or NTFS ACLs.

I usually prefer NTFS because you get the granularity, as well as the GUI functionality many users are accustomed to.

If you do decide to use mixed security style, keep the following in mind:

  • If a volume is using mixed security style and the effective style gets flipped from NTFS to UNIX and then back to NTFS by way of the clients, the previous NTFS ACLs are lost.
  • When a volume flips from UNIX effective to NTFS effective, you get the mode bit translation. For example, if the UNIX volume was 755, you get “Owner – Full Control” and “Everyone – Read/Execute” as Windows ACLs. 700 gives “Owner – Full Control” only.
  • Administrator always gets added onto the ACL with Read/Write access when we flip to NTFS from UNIX.
  • With mixed security style, there are two types of owners – UNIX owner and Windows owner. When Windows “takes ownership,” the UNIX owner does not change.
  • When the effective style of the volume is NTFS, UNIX clients will see permissions as 777 unless the NFS server option ntacl-display-permissive-perms is set to “disabled.”
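
The mode bit translation in the second bullet can be sketched like this. It is a simplified Python model that reproduces only the two examples above (755 and 700). It looks at the owner and “other” bits and ignores the group bits, because I didn’t describe what happens to those here, and it doesn’t add the Administrator entry. The function names are made up for illustration.

```python
def rights(bits):
    """Describe one rwx triplet (0-7) in Windows-ish terms."""
    if bits == 7:
        return "Full Control"
    names = []
    if bits & 4:
        names.append("Read")
    if bits & 2:
        names.append("Write")
    if bits & 1:
        names.append("Execute")
    return "/".join(names)


def translate_mode(mode):
    """Translate UNIX mode bits into (principal, rights) pairs.

    Only the owner and "other" triplets are considered (an assumption).
    """
    owner = (mode >> 6) & 7
    other = mode & 7
    entries = []
    if owner:
        entries.append(("Owner", rights(owner)))
    if other:
        entries.append(("Everyone", rights(other)))
    return entries


print(translate_mode(0o755))
print(translate_mode(0o700))
```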

For information on how to manage permissions in ONTAP, see the following post:

Managing ACLs via the ONTAP Command Line

Be on the lookout for a multiprotocol TR in the future that covers this and more!

Got any questions? Feel free to post in the comments!

TECH::Multiprotocol NAS, Locking and You


One question I got today, and have gotten a few times in my years as a NAS guy, was “how does locking work when you are sharing files between CIFS/SMB and NFS?”

Essentially, “can you guarantee my data will be safe?”

Well, the first answer to that question is a question: What is a file lock?

File locks explained

Essentially, a file lock is exactly what it sounds like. We all know what locks are; we have them on our doors. They keep unwanted people and things out. A file lock is no different; it’s a way to prevent unwanted people and applications from accessing files while they are in use to prevent the “c” word – CORRUPTION!


File locks are always issued by the requesting client or application – a NAS will only honor or deny the lock. If the client or application does not issue a lock to the file, all bets are off. This is similar to our door locking analogy – a door will not lock unless you turn the key.

Similarly, a lock will only be safely broken or released when the application or client is done with it. If the client or application dies, the lock needs to be cleaned up. Depending on the NAS protocol/protocol version, this may or may not require manual intervention. For example, if NFSv3 locks are left over by an application crash (such as an Oracle database), then the locks must be manually cleared on the server before the application can be restarted to use the same files. This is because locking in NFSv3 is handled via the Network Lock Manager (NLM), which is an ancillary protocol to NFS and doesn’t always play well with others.

Conversely, NFSv4.x locking is integrated with the protocol and is lease-based. With leases, the locks will live for a pre-determined amount of time. If the client doesn’t renew the lock in that amount of time, the locks will expire. If the client or application crashes, the locks will release on their own after the lease period expires. If the NFS server restarts, the locks will remain intact until either a client reclaims them or the lease expires.

From RFC 5661:

Lease: A lease is an interval of time defined by the server for which the client is irrevocably granted locks. At the end of a lease period, locks may be revoked if the lease has not been extended. A lock must be revoked if a conflicting lock has been granted after the lease interval.

A server grants a client a single lease for all state.
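
To make the lease idea concrete, here is a minimal Python sketch of a lease that has to be renewed before it runs out. It is a toy model, not NFSv4.x code. The class name and the 30 second lease in the usage lines are arbitrary, and time is passed in explicitly so the behavior is easy to follow.

```python
class Lease:
    """A single lease covering all of a client's lock state."""

    def __init__(self, client, lease_seconds, now):
        self.client = client
        self.lease_seconds = lease_seconds
        self.expires_at = now + lease_seconds

    def is_expired(self, now):
        return now >= self.expires_at

    def renew(self, now):
        """Extend the lease; fails if it has already expired."""
        if self.is_expired(now):
            return False
        self.expires_at = now + self.lease_seconds
        return True


lease = Lease("client1", lease_seconds=30, now=0)
print(lease.renew(now=20))       # client checks in on time
print(lease.is_expired(now=45))  # still inside the renewed window
print(lease.renew(now=60))       # client crashed and never renewed
```

A crashed client simply stops calling renew, so its locks become revocable once the lease runs out. That is the self-cleaning behavior NFSv3 with NLM lacks.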

Simple enough. But what many people don’t know is that there are also different types of file locks.

File Locking in CIFS/SMB

Special thanks to NetApp CIFS/SMB TME Marc Waldrop (@CIFSorSMB on Twitter) for the CIFS/SMB file lock sanity check.

Locks in CIFS/SMB are done either at a share level or a file level. A share lock will dictate what level of access is allowed to the open file while the original opener of the file has the file open. The share level lock, commonly known as share access mode, will dictate whether additional openers of the file can read or write to the file. The share access mode can lock the entire file from additional clients doing specific read or write operations. A file level lock is what most know as a byte-range lock.

File locking in CIFS/SMB is done via oplocks and share locks. When a file is opened for editing, an oplock is applied and the share-level lock is modified to control what access a client or application has to the file. In some cases, classic byte-range locks are used when portions of a file need to be locked.
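
The share access mode check can be sketched as a simple set comparison: a new open succeeds only if what it asks for is shared by every existing opener, and what existing openers already hold is something the new open is willing to share. The Python below is a simplified model of the usual Windows share mode semantics (it ignores delete access and oplocks), and the function name is my own.

```python
def can_open(existing_opens, access, share):
    """Check a new open against existing opens.

    existing_opens -- list of (access, share) pairs already granted
    access         -- set of "read"/"write" the new opener wants
    share          -- set of "read"/"write" the new opener allows others
    """
    for held_access, held_share in existing_opens:
        if not access <= held_share:
            return False  # we want access an existing opener won't share
        if not held_access <= share:
            return False  # an existing opener holds access we won't share
    return True


# First opener has the file open read/write and only shares reads.
opens = [({"read", "write"}, {"read"})]
print(can_open(opens, {"read"}, {"read", "write"}))   # reader that tolerates writers
print(can_open(opens, {"write"}, {"read", "write"}))  # second writer
print(can_open(opens, {"read"}, {"read"}))            # reader that refuses writers
```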

CIFS/SMB uses three types of “opportunistic locks,” or “oplocks.”

  1. Batch locks
    Created for batch files to help performance issues with files that are opened and closed many times.
  2. Exclusive locks
    Used when an application opens a file in “shared” mode; for example, Microsoft Office. Exclusive locks allow a client to assume they are the only ones using a file and will cache all changes locally. This is where that weird looking ~FILE.doc comes from in Word. If another application requests an exclusive lock on the file, the server will invalidate the original lock and the application will flush the cached changes to the file.
  3. Level 2 Oplocks
    These get issued when multiple clients want to access the same file. When an application gets issued a Level 2 Oplock, multiple clients can read/cache reads of a file. If any client attempts a write, that client gets issued an exclusive lock.

File Locking in NFS

Unlike CIFS/SMB, NFS doesn’t do share-level locking. It only does file locking. NFS locking comes in two flavors:

  1. Shared locks
    Shared locks can be used by multiple processes at the same time and can only be issued if there are no exclusive locks on a file. These are intended for read-only work, but can be used for writes (such as with a database). This is similar to the Level 2 Oplock in CIFS/SMB.
  2. Exclusive locks
    These operate the same as exclusive locks in CIFS/SMB – only one process can use the file when there is an exclusive lock. If any other processes have locked the file, an exclusive lock cannot be issued, unless that process was “forked.” However, unlike CIFS/SMB, there isn’t a notion of “opportunistic” locking, where a file will allow access without outside intervention.
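To make the shared vs. exclusive behavior concrete, here is a minimal Python sketch using the standard fcntl module. It runs against a local temp file rather than an NFS mount, and the helper names (try_lock, demo) are made up for illustration. As far as I know, Linux NFS clients translate flock() calls into whole-file byte-range locks, which would line up with the byte-range locks shown later in this post, but treat that as an assumption about your client.

```python
import fcntl
import os
import tempfile


def try_lock(fd, mode):
    """Try to take an advisory lock without blocking; return True on success."""
    try:
        fcntl.flock(fd, mode | fcntl.LOCK_NB)
        return True
    except OSError:
        # The kernel refused because a conflicting lock is already held.
        return False


def demo():
    """Return the outcome of each lock attempt, keyed by a short description."""
    fd_tmp, path = tempfile.mkstemp()
    os.close(fd_tmp)
    # Each os.open() creates its own open file description, so for flock()
    # these three descriptors behave like three independent lock owners.
    reader_a = os.open(path, os.O_RDONLY)
    reader_b = os.open(path, os.O_RDONLY)
    writer = os.open(path, os.O_RDWR)
    try:
        results = {
            "reader_a_shared": try_lock(reader_a, fcntl.LOCK_SH),
            "reader_b_shared": try_lock(reader_b, fcntl.LOCK_SH),
            "writer_exclusive_while_shared": try_lock(writer, fcntl.LOCK_EX),
        }
        # Drop both shared locks, then retry the exclusive lock.
        fcntl.flock(reader_a, fcntl.LOCK_UN)
        fcntl.flock(reader_b, fcntl.LOCK_UN)
        results["writer_exclusive_after_release"] = try_lock(writer, fcntl.LOCK_EX)
        return results
    finally:
        for fd in (reader_a, reader_b, writer):
            os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    for attempt, granted in demo().items():
        print(attempt, "granted" if granted else "denied")
```

The non-blocking flag (LOCK_NB) is only there so the sketch reports a denial instead of hanging the way a blocking lock request would.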

One of the best analogies I’ve seen for this is a real-world example on Stack Overflow:

I wrote this answer down because I thought this would be a fun (and fitting) analogy:

Think of a lockable object as a blackboard (lockable) in a class room containing a teacher (writer) and many students (readers).

While a teacher is writing something (exclusive lock) on the board:

1. Nobody can read it, because it’s still being written, and she’s blocking your view => If an object is exclusively locked, shared locks cannot be obtained.

2. Other teachers won’t come up and start writing either, or the board becomes unreadable, and confuses students => If an object is exclusively locked, other exclusive locks cannot be obtained.

When the students are reading (shared locks) what is on the board:

1. They all can read what is on it, together => Multiple shared locks can co-exist.

2. The teacher waits for them to finish reading before she clears the board to write more => If one or more shared locks already exist, exclusive locks cannot be obtained.
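The four rules in the blackboard analogy boil down to a tiny compatibility check. Here is a sketch in Python; the names (SHARED, EXCLUSIVE, can_grant) are mine, not from any NFS or SMB implementation.

```python
SHARED = "shared"
EXCLUSIVE = "exclusive"


def can_grant(requested, held):
    """Return True if a `requested` lock is compatible with every lock in `held`."""
    if not held:
        # Empty blackboard: anyone can read or write.
        return True
    if requested == EXCLUSIVE:
        # The teacher waits for every reader (and any other teacher) to finish.
        return False
    # A new reader is fine only if everyone already there is also a reader.
    return all(lock == SHARED for lock in held)


if __name__ == "__main__":
    print(can_grant(SHARED, [SHARED, SHARED]))  # students reading together
    print(can_grant(EXCLUSIVE, [SHARED]))       # teacher must wait
    print(can_grant(SHARED, [EXCLUSIVE]))       # view is blocked
```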

The reason I prefer the CIFS/SMB notion of locking is that it feels a lot less messy in general and there is less overhead/management involved with the locking. This is especially true with NFSv3, where locking is not integrated into the protocol, but instead relies on an ancillary protocol called NLM (as mentioned above).

NFSv4.x fixed this by integrating locking into the protocol itself. Locking is better now in NFS, and applications are starting to adopt NFSv4.x as a standard because of the improved locking mechanisms.

Locking in multiprotocol environments

Multiprotocol in NAS simply means “ability to access the same datasets from multiple protocols.” Thus, CIFS/SMB and NFS can read and write to the same files.

This can be problematic for a few reasons:

  • CIFS/SMB and NFS permissions are not the same
    NFS, especially NFSv3, uses mode bit permissions (such as 777, 755, etc). NFSv4.x implements ACLs, which look and feel an awful lot like Windows NTFS ACLs. They’re so close, in fact, that it solves a lot of the old permissioning issues you would see in multiprotocol environments. But they don’t solve all of our problems.
  • CIFS/SMB and NFS user/group concepts are not the same
    Windows users and groups use a super long Security Identifier (SID) for unique identifiers, which is constructed by leveraging the domain SID and unique user/group Relative ID (RID). NFS users and groups use a numeric UID/GID and/or NFSv4 ID domain string. In order to get a user to leverage the correct security permission structure, a name mapping has to take place, depending on the originating request + the type of ACL on the file or folder. Unified name services, such as LDAP, are often useful in alleviating the pain of this scenario.
  • CIFS/SMB and NFS file locks are not the same
    As described above. Because of this, the underlying file system has to be able to negotiate the file locking. As it so happens, NetApp invented integrated locking in the 1990s.
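To put some shape on the first two bullets, here is a small Python sketch. It decodes mode bits the way ls -l would show them, builds a SID from a domain SID plus a RID, and does a toy name mapping in the reverse direction. Everything in it is hypothetical: the domain SID, the users, the RIDs, and the DIRECTORY table, which stands in for a unified name service such as LDAP. It is not how ONTAP performs name mapping.

```python
import stat

# Made-up domain SID and directory entries, for illustration only.
DOMAIN_SID = "S-1-5-21-1111111111-2222222222-3333333333"
DIRECTORY = {
    "alice": {"uid": 1001, "gid": 100, "rid": 1104},
    "bob": {"uid": 1002, "gid": 100, "rid": 1105},
}


def describe_mode(mode_bits):
    """Render octal permission bits for a regular file, ls -l style."""
    return stat.filemode(stat.S_IFREG | mode_bits)


def windows_sid(username):
    """Build the user's SID: the domain SID followed by the user's RID."""
    return f"{DOMAIN_SID}-{DIRECTORY[username]['rid']}"


def unix_name_for_sid(sid):
    """Map a SID back to a UNIX user name, or None if there is no match."""
    domain, _, rid = sid.rpartition("-")
    if domain != DOMAIN_SID or not rid.isdigit():
        return None
    for name, entry in DIRECTORY.items():
        if entry["rid"] == int(rid):
            return name
    return None


if __name__ == "__main__":
    print(describe_mode(0o755))
    print(windows_sid("alice"))
    print(unix_name_for_sid(windows_sid("bob")))
```

The point of keeping both identities in one record is that either side of the mapping can be answered from the same source, which is the pain a unified name service takes away.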

So, how do NAS vendors (like NetApp) get this to work?

In ONTAP (and, by proxy, WAFL), the file system owns the locks and gets the final say as to who gets what access. When a CIFS/SMB client is granted an exclusive lock on a file, an NFS client that tries to get a lock on that same file will not be granted access until the lock has been released. The same goes for NFS clients that have an exclusive lock on a file; CIFS/SMB can’t do anything with that file until the lock is broken or released. These locks are only released when the original client releases them, either voluntarily or by lease expiration.
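Here is a toy Python model of that idea: one lock table owned by the file system, with every protocol going through the same check. The class and method names are invented for this sketch. It is not ONTAP's implementation, and it ignores oplocks, share modes, byte ranges and leases entirely.

```python
class FileSystemLockTable:
    """Toy model: a single lock table that every protocol must go through."""

    def __init__(self):
        # path -> list of (owner, protocol, mode) tuples
        self._locks = {}

    def request(self, path, owner, protocol, mode):
        """Grant the lock only if no other owner holds a conflicting lock."""
        held = self._locks.setdefault(path, [])
        others = [lock for lock in held if lock[0] != owner]
        if mode == "exclusive" and others:
            return False
        if any(lock[2] == "exclusive" for lock in others):
            return False
        held.append((owner, protocol, mode))
        return True

    def release(self, path, owner):
        """Drop every lock that `owner` holds on `path`."""
        held = self._locks.get(path, [])
        self._locks[path] = [lock for lock in held if lock[0] != owner]


if __name__ == "__main__":
    table = FileSystemLockTable()
    path = "/unix/office.docx"
    print(table.request(path, "smb-client", "cifs", "exclusive"))
    print(table.request(path, "nfs-client", "nlm", "shared"))
    table.release(path, "smb-client")
    print(table.request(path, "nfs-client", "nlm", "exclusive"))
```

Notice that the protocol is recorded but never consulted when deciding whether to grant a lock. That is the whole point: the decision depends on who holds what, not on how they asked.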

The general idea here is that protocols don’t matter: protect the files at all costs. If I buy a deadbolt lock, I should be able to use it on any door I choose and it should keep my house safe.

In ONTAP, you can check to see if a file has a lock via the command line or API calls.

In Data ONTAP operating in 7-Mode, the commands are:

7mode> lock status
lock status -f [file] [-p protocol] [-n]
lock status -h [host [-o owner]] [-f file] [-p protocol] [-n]
lock status -o [-f file] [-p protocol] [-n]
lock status -o owner [-f file] [-p protocol] [-n] (valid for CIFS only)
lock status -p protocol [-n]
lock status -n

In clustered Data ONTAP, the command is:

cluster::> vserver locks show ?
 [ -instance | -smb-attrs | -fields , ... ]
 { [ -vserver  ] Vserver
 [[-volume] ] Volume
 [[-lif] ] Logical Interface
 [[-path] ] Object Path
 | [ -lockid  ] } Lock UUID
 [ -protocol  ] Lock Protocol
 [ -type {byte-range|share-level|op-lock|delegation} ] Lock Type
 [ -node  ] Node Holding Lock State
 [ -lock-state  ] Lock State
 [ -bytelock-offset  ] Bytelock Starting Offset
 [ -bytelock-length  ] Number of Bytes Locked
 [ -bytelock-mandatory {true|false} ] Bytelock is Mandatory
 [ -bytelock-exclusive {true|false} ] Bytelock is Exclusive
 [ -bytelock-super {true|false} ] Bytelock is Superlock
 [ -bytelock-soft {true|false} ] Bytelock is Soft
 [ -oplock-level {exclusive|level2|batch|null|read-batch} ] Oplock Level
 [ -sharelock-mode  ] Shared Lock Access Mode
 [ -sharelock-soft {true|false} ] Shared Lock is Soft
 [ -delegation-type {read|write} ] Delegation Type
 [ -client-address  ] Client Address
 [ -smb-open-type {none|durable|persistent} ] SMB Open Type
 [ -smb-connect-state  ] SMB Connect State
 [ -smb-expiration-time  ] SMB Expiration Time (Secs)
 [ -smb-open-group-id  ] SMB Open Group ID

Testing file locking between protocols

This can be tricky. I’ve seen a lot of people try to test multiprotocol file locking in the following manner:

  • Open a file in a CIFS/SMB share with notepad in Windows.
  • Go to the NFS client. Open the file with vi.
  • Wonder why the heck writes are allowed on both. Bang head repeatedly on desk. Destroy a copier.


The reason why writes are allowed on both clients in this scenario is because NEITHER APPLICATION HAS ISSUED A LOCK. File locks are the responsibility of the client and/or application, not the server. Vi and notepad don’t lock files!
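You can see the same "nobody asked for a lock" effect on a single Linux box. In this Python sketch (local temp file, made-up function name), one descriptor holds an exclusive advisory lock while a second opener writes to the file without ever requesting a lock. This only illustrates advisory locking on a local file system; it says nothing about how a particular NAS enforces SMB share modes.

```python
import fcntl
import os
import tempfile


def write_without_locking():
    """Write to a file while another descriptor holds an exclusive flock()."""
    fd_tmp, path = tempfile.mkstemp()
    os.close(fd_tmp)
    holder = os.open(path, os.O_RDWR)
    try:
        # The well-behaved application takes an exclusive advisory lock.
        fcntl.flock(holder, fcntl.LOCK_EX)
        # The careless one (think vi or notepad) just opens and writes.
        with open(path, "w") as careless:
            careless.write("written without asking for a lock\n")
        with open(path) as check:
            return check.read()
    finally:
        os.close(holder)
        os.unlink(path)


if __name__ == "__main__":
    print(write_without_locking(), end="")
```

The write goes through because the lock is only consulted when somebody else requests a lock. An application that never asks never finds out.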

Testing locks from NFS when CIFS/SMB owns the lock

The easiest way to lock a file in Windows? Microsoft Office.

When I open an MS Word file in a CIFS/SMB share on cDOT, I get some share locks and a batch oplock:

cluster::> vserver locks show -vserver NAS -volume unix 

Vserver: NAS
Volume   Object Path               LIF         Protocol  Lock Type   Client
-------- ------------------------- ----------- --------- ----------- ----------
unix     /unix/                    data1       cifs      share-level 10.228.225.120
               Sharelock Mode: read-deny_none
               Sharelock Mode: read-deny_none
                                                                     10.62.194.166
               Sharelock Mode: read-deny_none
         /unix/office.docx                               op-lock     10.62.194.166
               Oplock Level: batch
                                                         share-level 10.62.194.166
               Sharelock Mode: read_write-deny_write_delete

In some cases, I may see an exclusive oplock, especially during edits. I can generate an exclusive oplock if another CIFS/SMB client attempts to access the doc with something like WordPad and gets denied access:

Vserver: NAS
Volume   Object Path               LIF         Protocol  Lock Type   Client
-------- ------------------------- ----------- --------- ----------- ----------
unix     /unix/                    data1       cifs      share-level 10.228.225.120
               Sharelock Mode: read-deny_none
               Sharelock Mode: read-deny_none
                                                                     10.62.194.166
               Sharelock Mode: read-deny_none
         /unix/office.docx                               op-lock     10.62.194.166
               Oplock Level: exclusive
                                                         share-level 10.62.194.166
               Sharelock Mode: read_write-deny_write_delete

If I use another instance of MS Word on a separate client, I can get a level 2 oplock:

Vserver: NAS
Volume   Object Path               LIF         Protocol  Lock Type   Client
-------- ------------------------- ----------- --------- ----------- ----------
unix     /unix/                    data1       cifs      share-level 10.228.225.120
               Sharelock Mode: read-deny_none
               Sharelock Mode: read-deny_none
                                                                     10.62.194.166
               Sharelock Mode: read-deny_none
         /unix/office.docx                               op-lock     10.62.194.166
               Oplock Level: level2
                                                         share-level 10.62.194.166
               Sharelock Mode: read_write-deny_write_delete

With this Office file open, I want to test to see if my NFS client can access/write to the file. When I look at the share with “ls,” I can see that funny ~filename listed.

# ls | grep office
~$office.docx
office.docx

As I mentioned, vi is a terrible way to test locks. However, Linux clients have utilities to test file locking in NFS. In CentOS/RHEL, one utility to use is flock. With flock, I can run a command to lock a file. To be cute, I use vi. 😉

When I try to get an exclusive lock on that file, it hangs:

# flock --exclusive office.docx vi

Since I’m doing NFSv3, I can check to see if any NLM locks have been issued on my cDOT Storage Virtual Machine. I can see that I have been issued a byte-range lock, but since the command hung, I can’t do any damage to the file:

cluster::> vserver locks show -vserver NAS -volume unix -protocol nlm

Vserver: NAS
Volume   Object Path               LIF         Protocol  Lock Type   Client
-------- ------------------------- ----------- --------- ----------- ----------
unix     /unix/office.docx         data1       nlm       byte-range  10.228.225.140
                Bytelock Offset(Length): 0 (18446744073709551615)
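That giant length value is not random. It is the largest number an unsigned 64-bit integer can hold, and my reading (an assumption, not something the output states) is that offset 0 with that length means "lock the whole file, however big it gets." A quick Python check of the arithmetic, with a made-up helper name:

```python
UINT64_MAX = 2**64 - 1

# The length reported in the lock output above.
reported_length = 18446744073709551615


def covers_whole_file(offset, length):
    """Treat offset 0 plus the maximum 64-bit length as a whole-file lock."""
    return offset == 0 and length == UINT64_MAX


if __name__ == "__main__":
    print(reported_length == UINT64_MAX)
    print(covers_whole_file(0, reported_length))
```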

When I try to get a shared lock on that file instead (flock --shared rather than --exclusive), it lets me in:

~ VIM - Vi IMproved 
~ 
~ version 7.2.411 
~ by Bram Moolenaar et al. 
~ Modified by <bugzilla@redhat.com> 
~ Vim is open source and freely distributable 
~ 
~ Become a registered Vim user! 
~ type :help register<Enter> for information
~
~ type :q<Enter> to exit
~ type :help<Enter> or <F1> for on-line help
~ type :help version7<Enter> for version info

And my SVM grants a byte-range lock:

cluster::> vserver locks show -vserver NAS -volume unix -protocol nlm

Vserver: NAS
Volume   Object Path               LIF         Protocol  Lock Type   Client
-------- ------------------------- ----------- --------- ----------- ----------
unix     /unix/office.docx         data1       nlm       byte-range  10.228.225.140
                Bytelock Offset(Length): 0 (18446744073709551615)

But when I try to write, it fails.

E32: No file name

So, that’s good, right? I close the Word doc out and all the file locks are gone. Only share locks remain:

cluster::> vserver locks show -vserver NAS -volume unix
Vserver: NAS
Volume   Object Path               LIF         Protocol  Lock Type   Client
-------- ------------------------- ----------- --------- ----------- ----------
unix     /unix/                    data1       cifs      share-level 10.228.225.120
               Sharelock Mode: read-deny_none
               Sharelock Mode: read-deny_none
                                                                     10.62.194.166
               Sharelock Mode: read-deny_delete
3 entries were displayed.

Testing locks from CIFS/SMB when NFS owns the lock

From my NFS client, I lock one of the files in the share with an exclusive lock. In this case, I use “newfile,” because vi has no idea what to do with an Office doc.

# flock --exclusive newfile vi

And I can see the new byte-range lock on the SVM:

cluster::> vserver locks show -vserver NAS -volume unix
Vserver: NAS
Volume   Object Path               LIF         Protocol  Lock Type   Client
-------- ------------------------- ----------- --------- ----------- ----------
unix     /unix/                    data1       cifs      share-level 10.228.225.120
               Sharelock Mode: read-deny_none
               Sharelock Mode: read-deny_none
                                                                     10.62.194.166
               Sharelock Mode: read-deny_delete
unix     /unix/office.docx         data1       nlm       byte-range  10.228.225.140
                Bytelock Offset(Length): 0 (18446744073709551615)

The expected behavior here? The file will only allow reads. And I am not disappointed!

[Screenshot: Windows client view of the NFS-locked file]

What happens if I have a stale lock?

If my NFS client dies, for whatever reason, and that byte-range lock is still lingering, I can break it from the storage in advanced privilege (complete with scary/legit warning):

cluster::*> vserver locks break -vserver NAS -volume unix -lif data1 -path /unix/newfile

Warning: Breaking file locks can cause applications to become unsynchronized and may lead to data corruption.
Do you want to continue? {y|n}: y
1 entry was acted on.
cluster::> vserver locks show -vserver NAS -volume unix
Vserver: NAS
Volume   Object Path               LIF         Protocol  Lock Type   Client
-------- ------------------------- ----------- --------- ----------- ----------
unix     /unix/                    data1       cifs      share-level 10.228.225.120
               Sharelock Mode: read-deny_none
               Sharelock Mode: read-deny_none
                                                                     10.62.194.166
               Sharelock Mode: read-deny_delete
3 entries were displayed.

Once that happens, I can issue new locks from other clients, whether they are CIFS/SMB or NFS. Doesn’t matter – multiprotocol locking in ONTAP just works!

Where can I find out more?

For more information on NFS in clustered Data ONTAP, see TR-4067: NFS Best Practice and Implementation Guide.

For more information on assorted aspects of multiprotocol NAS access in clustered Data ONTAP, see TR-4073: Secure Unified Authentication.

Additionally, subscribe to Why is the Internet Broken and follow @NFSDudeAbides on Twitter for more NAS-related information!