Windows NFS? WHO DOES THAT???


Believe it or not, Windows NFS is a thing. Microsoft has its own NFS server and client, which can make RFC-compliant NFSv3 calls to a Windows Server running the NFS server role or to a third-party NFS server, such as NetApp ONTAP. It’s actually so popular that NetApp had to re-introduce it in clustered ONTAP (it wasn’t there until ONTAP 8.2.3/8.3.1).

While Microsoft currently provides an NFSv3 client, there’s no NFSv4.1 client – yet. NFSv4.1 is available as a server option, though:

https://docs.microsoft.com/en-us/windows-server/storage/nfs/nfs-overview

I cover Windows NFS support in TR-4067 starting on page 116. I’m bringing this topic up because it has come up again recently, and I wanted to create a quick and easy blog to follow, as well as call out how you can integrate AD LDAP to help with identity management.

There are a few things you have to do to get it working in ONTAP.

Specifically:

  • enable -v3-ms-dos-client option on the NFS server
  • enable -showmount on the NFS server – this prevents some weirdness with writing files
  • disable -enable-ejukebox and -v3-connection-drop

The command would look like this:

cluster::> set advanced
cluster::*> nfs server modify -vserver DEMO -v3-ms-dos-client enabled -v3-connection-drop disabled -enable-ejukebox false -showmount enabled
cluster::*> nfs server show -vserver DEMO -fields v3-ms-dos-client,v3-connection-drop,showmount,enable-ejukebox
vserver enable-ejukebox v3-connection-drop showmount v3-ms-dos-client
------- --------------- ------------------ --------- ----------------
DEMO false disabled enabled enabled

Once that’s done, you can mount via NFS inside Windows clients using the standard “mount” command, provided you’ve enabled the Services for UNIX/NFS client functionality. There’s plenty of documentation out there for that, but a quick sketch follows.
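Here’s one way to enable the NFS client from an elevated PowerShell prompt. This is just a sketch – the exact feature names vary a bit across Windows versions, so verify what’s available with Get-WindowsFeature or Get-WindowsOptionalFeature first:

# Windows Server:
Install-WindowsFeature NFS-Client

# Windows client editions (feature names may differ by version):
Enable-WindowsOptionalFeature -Online -FeatureName ServicesForNFS-ClientOnly, ClientForNFS-Infrastructure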

Just by doing the above, here’s an example of a working NFS mount in Windows:

C:\Users\Administrator>mount DEMO:/flexvol X:
X: is now successfully connected to DEMO:/flexvol

The command completed successfully.

Here’s the cluster’s view of that connection:

ontap9-tme-8040::*> network connections active show -node ontap9-tme-8040-0* -service nfs*,mount -remote-ip 10.193.67.236
              Vserver   Interface         Remote
      CID Ctx Name      Name:Local Port   Host:Port            Protocol/Service
--------- --- --------- ----------------- -------------------- ----------------
Node: ontap9-tme-8040-02
2968991376  4 DEMO      data:2049         oneway.ntap.local:931
                                                               TCP/nfs

When I write a file to the mount, however, there’s something that can prove to be an issue: users other than Administrator will write with a UID/GID of 4294967294 (-2).

ontap9-tme-8040::*> vserver security file-directory show -vserver DEMO -path /flexvol/student1-nfs.txt

                Vserver: DEMO
              File Path: /flexvol/student1-nfs.txt
      File Inode Number: 1606599
         Security Style: unix
        Effective Style: unix
         DOS Attributes: 20
DOS Attributes in Text: ---A----
Expanded Dos Attributes: -
           UNIX User Id: 4294967294
          UNIX Group Id: 4294967294
         UNIX Mode Bits: 755
UNIX Mode Bits in Text: rwxr-xr-x
                   ACLs: -

That means users won’t show up properly/as desired in UNIX NFS mounts. For example, this is that same file from CentOS:

[root@centos7 /]# cd flexvol
[root@centos7 flexvol]# ls -la | grep student1-nfs
-rwxr-xr-x 1 4294967294 4294967294 0 Feb 5 09:18 student1-nfs.txt

So, how does one fix that?

Configuring Windows NFS clients to negotiate users properly

There are a few ways to have users leverage a UID/GID other than -2.

One way is to “squash” every NFS user to the same UID/GID via the old Windows standby – the Windows registry. This is useful if only a single user will be using an NFS client.

This covers how to do that:

https://blogs.msdn.microsoft.com/saponsqlserver/2011/02/03/installation-configuration-of-windows-nfs-client-to-enable-windows-to-mount-a-unix-file-system/
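The short version of the registry approach: set AnonymousUid and AnonymousGid for the NFS client and restart it. Here’s a sketch from an elevated command prompt – the UID/GID values are just examples, so use whatever identity you want everyone to squash to:

reg add HKLM\SOFTWARE\Microsoft\ClientForNFS\CurrentVersion\Default /v AnonymousUid /t REG_DWORD /d 1100
reg add HKLM\SOFTWARE\Microsoft\ClientForNFS\CurrentVersion\Default /v AnonymousGid /t REG_DWORD /d 1101
nfsadmin client stop
nfsadmin client start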

Some of the third-party NFS clients (such as Cygwin and Hummingbird/OpenText) provide local passwd and group file functionality to allow you to leverage more users. In some cases, all this does is add more registry entries.

Another way is to chmod/chown the file after it’s written, but that’s not ideal.

The best way is to leverage an existing name service (such as NIS or LDAP) and have Windows clients query for the UID and GID. If you have one already, great! It’s super easy to set up the client. Just run the following command as an administrator in cmd. My NTAP.LOCAL domain already has an LDAP server set up:

C:\Users\administrator>nfsadmin mapping WIN7-CLIENT config adlookup=yes addomain=NTAP.LOCAL

The settings were successfully updated.

Once I did that, I wrote a new file and the UID/GID was properly represented:

ontap9-tme-8040::*> vserver security file-directory show -vserver DEMO -path /flexvol/prof1-nfs.txt

                Vserver: DEMO
              File Path: /flexvol/prof1-nfs.txt
      File Inode Number: 1606600
         Security Style: unix
        Effective Style: unix
         DOS Attributes: 20
DOS Attributes in Text: ---A----
Expanded Dos Attributes: -
           UNIX User Id: 1100
          UNIX Group Id: 1101
         UNIX Mode Bits: 755
UNIX Mode Bits in Text: rwxr-xr-x
                   ACLs: -

ontap9-tme-8040::*> getxxbyyy getpwbyname -node ontap9-tme-8040-01 -vserver DEMO -username prof1
  (vserver services name-service getxxbyyy getpwbyname)
pw_name: prof1
pw_passwd:
pw_uid: 1100
pw_gid: 1101
pw_gecos:
pw_dir:
pw_shell:

If you’re interested, a packet trace shows that the Windows client will communicate via encrypted LDAP to query the user’s UNIX attribute information:

[Screenshot: packet trace showing the Windows client’s encrypted LDAP query]

An added bonus of having Windows clients query LDAP for UNIX user names and groups for NFS on ONTAP is that if you’re using NTFS security style volumes, you won’t have issues connecting to those mounts.

What breaks when doing NTFS security style?

When a UNIX user attempts to access a volume with NTFS security style ACLs, ONTAP will attempt to map that user to a valid Windows user to make sure Windows ACLs can be calculated. (I cover this in Mixed perceptions with NetApp multiprotocol NAS access)

If a user comes in with the default Windows NFS ID of 4294967294 (which doesn’t translate to a UNIX user), this is what happens.

  • The UNIX user 4294967294 tries to access the mount.
  • ONTAP receives a UID of 4294967294 and attempts to map it to a Windows user.
  • That Windows user does not exist, so access is denied. This can manifest as an error (such as when writing a file), or it could just show no files/folders.

[Screenshots: the Windows NFS client being denied access to the NTFS security style mount]

That particular folder does have data. It’s just that the user can’t see it:

[Screenshot: directory listing showing the folder’s contents]

In ONTAP, we’d see this error, confirming that the user doesn’t exist:

2/5/2019 14:31:26 ontap9-tme-8040-02
ERROR secd.nfsAuth.problem: vserver (DEMO) General NFS authorization problem. Error: Get user credentials procedure failed
[ 15 ms] Hostname found in Name Service Cache
[ 19] Hostname found in Name Service Cache
[ 23] Successfully connected to ip 10.193.67.236, port 389 using TCP
**[ 28] FAILURE: User ID '4294967294' not found in UNIX authorization source LDAP.
[ 28] Entry for user-id: 4294967294 not found in the current source: LDAP. Ignoring and trying next available source
[ 29] Entry for user-id: 4294967294 not found in the current source: FILES. Entry for user-id: 4294967294 not found in any of the available sources
[ 44] Unable to get the name for UNIX user with UID 4294967294
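You can also confirm the failed lookup from the cluster with the same getxxbyyy tools shown earlier in this post – a sketch, as the exact parameter name can vary by release:

cluster::*> getxxbyyy getpwbyuid -node ontap9-tme-8040-01 -vserver DEMO -userID 4294967294
  (vserver services name-service getxxbyyy getpwbyuid)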

With LDAP involved, access to the NFS-mounted volume with NTFS security works much better, because ONTAP and the client agree that user 1100 is prof1.

[Screenshot: the same directory listing, now visible with LDAP name mapping in place]

So, uh… what if I don’t have LDAP or NIS?

Well, in a Windows domain, you ALWAYS have an LDAP server. Active Directory leverages LDAP schemas to store information, and any version of Windows Active Directory can be used to look up UNIX users and groups. In fact, the newer versions of Windows make this very easy. In older Windows versions, you had to manually extend the LDAP schema to provide UNIX attributes. Now, UNIX attributes like uid, uidNumber and gidNumber are in the schema by default. All you have to do is populate these values with information. You can even do it via PowerShell cmdlets!
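As a sketch of the PowerShell route (run from a host with the ActiveDirectory module; the user name and ID values here are just examples):

# Populate UNIX attributes on an existing AD user:
Set-ADUser -Identity prof1 -Replace @{uidNumber=1100; gidNumber=1101; unixHomeDirectory="/home/prof1"; loginShell="/bin/bash"}

# Verify the attributes were set:
Get-ADUser prof1 -Properties uidNumber,gidNumber | Select-Object Name,uidNumber,gidNumber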

Once you have a working Active Directory LDAP environment, you can then configure ONTAP to communicate with LDAP for UNIX identities and you’re well on your way to having a scalable, functional multiprotocol NAS environment.
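The ONTAP side of that configuration looks roughly like this – a sketch using my lab’s SVM and domain; the schema and bind settings will depend on your environment:

cluster::> vserver services name-service ldap client create -vserver DEMO -client-config AD -ad-domain NTAP.LOCAL -schema AD-IDMU -bind-as-cifs-server true
cluster::> vserver services name-service ldap create -vserver DEMO -client-config AD
cluster::> vserver services name-service ns-switch modify -vserver DEMO -database passwd -sources files,ldap
cluster::> vserver services name-service ns-switch modify -vserver DEMO -database group -sources files,ldap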

The one downside I’ve found with Windows NFS is that it doesn’t always play nicely when you want to use SMB on the same client. Windows gets a bit… confused. I haven’t dug into that a ton, but I’ve seen it enough to express caution. 🙂


Behind the Scenes: Episode 165 – Accelerate your NAS Data with FlexCache

Welcome to Episode 165, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”


This week on the podcast, we talk about the new iteration of ONTAP’s NAS acceleration feature, FlexCache! Join us as we discuss it with NetApp’s Technical Director Pranoop Erasani (pranoop@netapp.com), FlexCache PM Shriya Paramkusam (shriya@netapp.com) and FlexCache TME Chris Hurley (@averageguyx).

Finding the Podcast

You can find this week’s episode here:

Also, if you don’t like using iTunes or SoundCloud, we just added the podcast to Stitcher.

http://www.stitcher.com/podcast/tech-ontap-podcast?refid=stpr

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

Our YouTube channel (episodes uploaded sporadically) is here:

Behind the Scenes: Episode 134 – The Active IQ Story: Building a Data Pipeline for Machine Learning

Welcome to Episode 134, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”


This week on the podcast, Active IQ Technical Director Shankar Pasupathy joins us and tells us how AutoSupport’s infrastructure and backend evolved into Active IQ’s multicloud data pipeline. Learn how NetApp is using big data analytics and machine learning on ONTAP to improve the overall customer experience.

Finding the Podcast

The podcast is all finished and up for listening. You can find it on iTunes or SoundCloud or by going to techontappodcast.com.

This week’s episode is here:

Also, if you don’t like using iTunes or SoundCloud, we just added the podcast to Stitcher.

http://www.stitcher.com/podcast/tech-ontap-podcast?refid=stpr

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

Our YouTube channel (episodes uploaded sporadically) is here:

Behind the Scenes: Episode 126 – Komprise

Welcome to Episode 126, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”


This week on the podcast, we bring in Komprise (@Komprise) CEO Kumar Goswami (@KumarKGoswami) to chat about data management and how their software helps get the most out of your NetApp storage systems!


For more information about Komprise, check out komprise.com!

Finding the Podcast

The podcast is all finished and up for listening. You can find it on iTunes or SoundCloud or by going to techontappodcast.com.

This week’s episode is here:

Also, if you don’t like using iTunes or SoundCloud, we just added the podcast to Stitcher.

http://www.stitcher.com/podcast/tech-ontap-podcast?refid=stpr

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

Our YouTube channel (episodes uploaded sporadically) is here:

Behind the Scenes: Episode 98 – SnapCenter 3.0

Welcome to Episode 98, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”


This week on the podcast, we check in with John Spinks, SnapCenter TME, to find out what’s in SnapCenter 3.0 – just in time for its release!

Finding the Podcast

The podcast is all finished and up for listening. You can find it on iTunes or SoundCloud or by going to techontappodcast.com.

Also, if you don’t like using iTunes or SoundCloud, we just added the podcast to Stitcher.

http://www.stitcher.com/podcast/tech-ontap-podcast?refid=stpr

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

You can listen here:

Managing ACLs via the ONTAP Command Line

In a previous post, I covered multiprotocol NAS in ONTAP, as well as mixed security styles. The following post covers how to manage permissions from the ONTAP CLI, as well as how to centralize permission management from a single Linux client. Some of the following was moved from the previous post to this one to make it easier to read and digest.

Viewing permissions in multiprotocol NAS

There are options to display permissions from both types of clients. For viewing UNIX permissions from Windows property tabs, use the cifs option is-unix-nt-acl-enabled.

cluster::*> cifs option show -vserver parisi -fields is-unix-nt-acl-enabled
vserver is-unix-nt-acl-enabled
----------- ----------------------
parisi     true

When using this option, the Windows clients will show a security tab entry that approximates the UNIX mode bits into ACLs. It will show the owner, group and “other” permissions. It will also attempt to convert the UNIX UID into a Windows-friendly SID so the client can display it. The Windows user will look like this:

[Screenshot: Windows Security tab showing the approximated UNIX permissions]

That user is a “fake SID” that is tied to the cluster’s Storage Virtual Machine. It translates to a SID that ONTAP creates based on the numeric ID of the user or group. The Windows client uses that SID to translate into a name.

For example:

cluster::*> diag secd authentication translate -node node1 -vserver SVM -win-name UNIXPermUid\root
S-1-5-21-2038298172-1297133386-11111-0

cluster::*> diag secd authentication translate -node node1 -vserver SVM -unix-user-name root
0

cluster::*> diag secd authentication translate -node node1 -vserver SVM -win-name UNIXPermUid\user3
S-1-5-21-2038298172-1297133386-11111-703

cluster::*> diag secd authentication translate -node node1 -vserver SVM -unix-user-name user3
703

cluster::*> diag secd authentication translate -node node1 -vserver SVM -win-name UNIXPermGid\homedirs
S-1-5-21-2038298172-1297133386-22222-1002

cluster::*> diag secd authentication translate -node node1 -vserver SVM -unix-group-name homedirs
1002

From Windows, we can see the level of access for the users from the “Change Permissions” window:

[Screenshot: the “Change Permissions” window showing each user’s access level]

On the NFS side, mode bits have no clue how to translate NTFS permission concepts like extended attributes. Instead, the clients only know Read, Write, Execute, Traverse, etc. It’s possible to show an approximation of those mode bits in UNIX for NTFS security style volumes with this option:

cluster::*> nfs server show -fields ntacl-display-permissive-perms
vserver ntacl-display-permissive-perms
----------- ------------------------------
parisi     disabled

When that option is disabled, NTFS ACLs show up as closely to UNIX permissions as they can. In the following example, I have an NTFS security style folder that allows only the owner full control, but allows read to “Everyone.” With the option mentioned, we see that reflected as “755” in the permissions:

[Screenshot: NTFS ACL showing owner Full Control and Everyone Read]

drwxr-xr-x 3 user1 homedirs 4096 Nov 8 14:15 user1

Translating NTFS style DACLs

As previously mentioned, in ONTAP we can view the Windows ACLs on a file, folder or volume using vserver security file-directory show.

cluster::*> vserver security file-directory show -vserver SVM -path /homedir1/user1

Vserver: SVM
 File Path: /homedir1/user1
 File Inode Number: 10363
 Security Style: mixed
 Effective Style: ntfs
 DOS Attributes: 10
 DOS Attributes in Text: ----D---
Expanded Dos Attributes: -
 UNIX User Id: 701
 UNIX Group Id: 1002
 UNIX Mode Bits: 777
 UNIX Mode Bits in Text: rwxrwxrwx
 ACLs: NTFS Security Descriptor
 Control:0x8004
 Owner:CPOC\user1
 Group:CPOC\Domain Users
 DACL - ACEs
 ALLOW-CPOC\Administrator-0xe0000040-OI|IO
 ALLOW-CPOC\Administrator-0x1201ff-CI
 ALLOW-CPOC\user1-0x10000000-OI|IO
 ALLOW-CPOC\user1-0x1f01ff-CI
 ALLOW-Everyone-0xa0000000-OI|IO
 ALLOW-Everyone-0x1200a9-CI

However, as you can see, those ACLs don’t make a ton of sense unless you can read hexadecimal. (If you can, more power to ya.)

Let’s break down the ACLs a bit to understand them better.

  • First, DACL means “Discretionary Access Control List.” From MSDN: “An access control list that is controlled by the owner of an object and that specifies the access particular users or groups can have to the object.”
  • In the DACLs above, we can see whether each ACE is an ALLOW or a DENY. (DENY ACEs override ALLOWs.) We can also see the user or group being granted access. After that, the information isn’t really in a “human readable” format.
  • The CI, IO and OI values are “ACE strings” that tell us how the ACE is inherited – container inherit (CI), object inherit (OI) or inherit only (IO). MSDN has a handy list of those here: ACE Strings

The rest of each ACE is a hexadecimal access mask that translates into the actual permissions that were set.

Expanding ACLs

Rather than try to decode all of those, ONTAP has an option on the file-directory show command that allows you to expand the ACL mask from the CLI (-expand-mask). This actually cracks open the DACLs and shows an expanded view of what actual permissions are allowed.

For example:

cluster::> vserver security file-directory show -vserver parisi -path /cifs -expand-mask true

Vserver: parisi
 File Path: /cifs
 File Inode Number: 64
 Security Style: ntfs
 Effective Style: ntfs
 DOS Attributes: 10
 DOS Attributes in Text: ----D---
Expanded Dos Attributes: 0x10
 ...0 .... .... .... = Offline
 .... ..0. .... .... = Sparse
 .... .... 0... .... = Normal
 .... .... ..0. .... = Archive
 .... .... ...1 .... = Directory
 .... .... .... .0.. = System
 .... .... .... ..0. = Hidden
 .... .... .... ...0 = Read Only
 UNIX User Id: 0
 UNIX Group Id: 0
 UNIX Mode Bits: 777
 UNIX Mode Bits in Text: rwxrwxrwx
 ACLs: NTFS Security Descriptor
 Control:0x8004

1... .... .... .... = Self Relative
 .0.. .... .... .... = RM Control Valid
 ..0. .... .... .... = SACL Protected
 ...0 .... .... .... = DACL Protected
 .... 0... .... .... = SACL Inherited
 .... .0.. .... .... = DACL Inherited
 .... ..0. .... .... = SACL Inherit Required
 .... ...0 .... .... = DACL Inherit Required
 .... .... ..0. .... = SACL Defaulted
 .... .... ...0 .... = SACL Present
 .... .... .... 0... = DACL Defaulted
 .... .... .... .1.. = DACL Present
 .... .... .... ..0. = Group Defaulted
 .... .... .... ...0 = Owner Defaulted

Owner:BUILTIN\Administrators
 Group:BUILTIN\Administrators
 DACL - ACEs
 ALLOW-Everyone-0x1f01ff
 0... .... .... .... .... .... .... .... = Generic Read
 .0.. .... .... .... .... .... .... .... = Generic Write
 ..0. .... .... .... .... .... .... .... = Generic Execute
 ...0 .... .... .... .... .... .... .... = Generic All
 .... ...0 .... .... .... .... .... .... = System Security
 .... .... ...1 .... .... .... .... .... = Synchronize
 .... .... .... 1... .... .... .... .... = Write Owner
 .... .... .... .1.. .... .... .... .... = Write DAC
 .... .... .... ..1. .... .... .... .... = Read Control
 .... .... .... ...1 .... .... .... .... = Delete
 .... .... .... .... .... ...1 .... .... = Write Attributes
 .... .... .... .... .... .... 1... .... = Read Attributes
 .... .... .... .... .... .... .1.. .... = Delete Child
 .... .... .... .... .... .... ..1. .... = Execute
 .... .... .... .... .... .... ...1 .... = Write EA
 .... .... .... .... .... .... .... 1... = Read EA
 .... .... .... .... .... .... .... .1.. = Append
 .... .... .... .... .... .... .... ..1. = Write
 .... .... .... .... .... .... .... ...1 = Read

ALLOW-Everyone-0x10000000-OI|CI|IO
 0... .... .... .... .... .... .... .... = Generic Read
 .0.. .... .... .... .... .... .... .... = Generic Write
 ..0. .... .... .... .... .... .... .... = Generic Execute
 ...1 .... .... .... .... .... .... .... = Generic All
 .... ...0 .... .... .... .... .... .... = System Security
 .... .... ...0 .... .... .... .... .... = Synchronize
 .... .... .... 0... .... .... .... .... = Write Owner
 .... .... .... .0.. .... .... .... .... = Write DAC
 .... .... .... ..0. .... .... .... .... = Read Control
 .... .... .... ...0 .... .... .... .... = Delete
 .... .... .... .... .... ...0 .... .... = Write Attributes
 .... .... .... .... .... .... 0... .... = Read Attributes
 .... .... .... .... .... .... .0.. .... = Delete Child
 .... .... .... .... .... .... ..0. .... = Execute
 .... .... .... .... .... .... ...0 .... = Write EA
 .... .... .... .... .... .... .... 0... = Read EA
 .... .... .... .... .... .... .... .0.. = Append
 .... .... .... .... .... .... .... ..0. = Write
 .... .... .... .... .... .... .... ...0 = Read

This also works with NFSv4 ACLs:

cluster::*> vserver security file-directory show -vserver DEMO -path /shared/unix -expand-mask true

                Vserver: DEMO
              File Path: /shared/unix
      File Inode Number: 20034
         Security Style: unix
        Effective Style: unix
         DOS Attributes: 10
 DOS Attributes in Text: ----D---
Expanded Dos Attributes: 0x10
     ...0 .... .... .... = Offline
     .... ..0. .... .... = Sparse
     .... .... 0... .... = Normal
     .... .... ..0. .... = Archive
     .... .... ...1 .... = Directory
     .... .... .... .0.. = System
     .... .... .... ..0. = Hidden
     .... .... .... ...0 = Read Only
           UNIX User Id: 1100
          UNIX Group Id: 1101
         UNIX Mode Bits: 770
 UNIX Mode Bits in Text: rwxrwx---
                   ACLs: NFSV4 Security Descriptor
                         Control:0x8014
                              1... .... .... .... = Self Relative
                              .0.. .... .... .... = RM Control Valid
                              ..0. .... .... .... = SACL Protected
                              ...0 .... .... .... = DACL Protected
                              .... 0... .... .... = SACL Inherited
                              .... .0.. .... .... = DACL Inherited
                              .... ..0. .... .... = SACL Inherit Required
                              .... ...0 .... .... = DACL Inherit Required
                              .... .... ..0. .... = SACL Defaulted
                              .... .... ...1 .... = SACL Present
                              .... .... .... 0... = DACL Defaulted
                              .... .... .... .1.. = DACL Present
                              .... .... .... ..0. = Group Defaulted
                              .... .... .... ...0 = Owner Defaulted

                         DACL - ACEs
                           ALLOW-OWNER@-0x1601ff
                              0... .... .... .... .... .... .... .... = Generic Read
                              .0.. .... .... .... .... .... .... .... = Generic Write
                              ..0. .... .... .... .... .... .... .... = Generic Execute
                              ...0 .... .... .... .... .... .... .... = Generic All
                              .... ...0 .... .... .... .... .... .... = System Security
                              .... .... ...1 .... .... .... .... .... = Synchronize
                              .... .... .... 0... .... .... .... .... = Write Owner
                              .... .... .... .1.. .... .... .... .... = Write DAC
                              .... .... .... ..1. .... .... .... .... = Read Control
                              .... .... .... ...0 .... .... .... .... = Delete
                              .... .... .... .... .... ...1 .... .... = Write Attributes
                              .... .... .... .... .... .... 1... .... = Read Attributes
                              .... .... .... .... .... .... .1.. .... = Delete Child
                              .... .... .... .... .... .... ..1. .... = Execute
                              .... .... .... .... .... .... ...1 .... = Write EA
                              .... .... .... .... .... .... .... 1... = Read EA
                              .... .... .... .... .... .... .... .1.. = Append
                              .... .... .... .... .... .... .... ..1. = Write
                              .... .... .... .... .... .... .... ...1 = Read

                           ALLOW-user-prof1-0x1601ff
                              0... .... .... .... .... .... .... .... = Generic Read
                              .0.. .... .... .... .... .... .... .... = Generic Write
                              ..0. .... .... .... .... .... .... .... = Generic Execute
                              ...0 .... .... .... .... .... .... .... = Generic All
                              .... ...0 .... .... .... .... .... .... = System Security
                              .... .... ...1 .... .... .... .... .... = Synchronize
                              .... .... .... 0... .... .... .... .... = Write Owner
                              .... .... .... .1.. .... .... .... .... = Write DAC
                              .... .... .... ..1. .... .... .... .... = Read Control
                              .... .... .... ...0 .... .... .... .... = Delete
                              .... .... .... .... .... ...1 .... .... = Write Attributes
                              .... .... .... .... .... .... 1... .... = Read Attributes
                              .... .... .... .... .... .... .1.. .... = Delete Child
                              .... .... .... .... .... .... ..1. .... = Execute
                              .... .... .... .... .... .... ...1 .... = Write EA
                              .... .... .... .... .... .... .... 1... = Read EA
                              .... .... .... .... .... .... .... .1.. = Append
                              .... .... .... .... .... .... .... ..1. = Write
                              .... .... .... .... .... .... .... ...1 = Read

                           ALLOW-GROUP@-0x1201ff-IG
                              0... .... .... .... .... .... .... .... = Generic Read
                              .0.. .... .... .... .... .... .... .... = Generic Write
                              ..0. .... .... .... .... .... .... .... = Generic Execute
                              ...0 .... .... .... .... .... .... .... = Generic All
                              .... ...0 .... .... .... .... .... .... = System Security
                              .... .... ...1 .... .... .... .... .... = Synchronize
                              .... .... .... 0... .... .... .... .... = Write Owner
                              .... .... .... .0.. .... .... .... .... = Write DAC
                              .... .... .... ..1. .... .... .... .... = Read Control
                              .... .... .... ...0 .... .... .... .... = Delete
                              .... .... .... .... .... ...1 .... .... = Write Attributes
                              .... .... .... .... .... .... 1... .... = Read Attributes
                              .... .... .... .... .... .... .1.. .... = Delete Child
                              .... .... .... .... .... .... ..1. .... = Execute
                              .... .... .... .... .... .... ...1 .... = Write EA
                              .... .... .... .... .... .... .... 1... = Read EA
                              .... .... .... .... .... .... .... .1.. = Append
                              .... .... .... .... .... .... .... ..1. = Write
                              .... .... .... .... .... .... .... ...1 = Read

                           ALLOW-EVERYONE@-0x120080
                              0... .... .... .... .... .... .... .... = Generic Read
                              .0.. .... .... .... .... .... .... .... = Generic Write
                              ..0. .... .... .... .... .... .... .... = Generic Execute
                              ...0 .... .... .... .... .... .... .... = Generic All
                              .... ...0 .... .... .... .... .... .... = System Security
                              .... .... ...1 .... .... .... .... .... = Synchronize
                              .... .... .... 0... .... .... .... .... = Write Owner
                              .... .... .... .0.. .... .... .... .... = Write DAC
                              .... .... .... ..1. .... .... .... .... = Read Control
                              .... .... .... ...0 .... .... .... .... = Delete
                              .... .... .... .... .... ...0 .... .... = Write Attributes
                              .... .... .... .... .... .... 1... .... = Read Attributes
                              .... .... .... .... .... .... .0.. .... = Delete Child
                              .... .... .... .... .... .... ..0. .... = Execute
                              .... .... .... .... .... .... ...0 .... = Write EA
                              .... .... .... .... .... .... .... 0... = Read EA
                              .... .... .... .... .... .... .... .0.. = Append
                              .... .... .... .... .... .... .... ..0. = Write
                              .... .... .... .... .... .... .... ...0 = Read

However, with a ton of ACLs on an object, this could get a bit overwhelming. So, translating the hex might be better overall. This blog covers it in a bit more detail:

About the ACCESS_MASK structure

In the above ACL, we see 0x1f01ff for Everyone. That’s Full Control. In addition, 0x10000000 is considered GENERIC_ALL.
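If you want to check that math yourself, 0x1f01ff decomposes into exactly the bits the expanded masks showed above:

0x1f01ff = 0x100000 (Synchronize)
         + 0x080000 (Write Owner)
         + 0x040000 (Write DAC)
         + 0x020000 (Read Control)
         + 0x010000 (Delete)
         + 0x0001ff (Read, Write, Append, Read EA, Write EA, Execute, Delete Child, Read Attributes, Write Attributes)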

Applying ACLs to objects from the storage

In addition to displaying ACLs, vserver security file-directory commands can be used to apply SACLs and DACLs to objects from the cluster’s CLI.

The general steps are covered in this KB article:

https://kb.netapp.com/support/s/article/how-to-modify-permissions-on-files-and-folders-in-clustered-data-ontap-when-there-is-no-permission-to-take-ownership?t=1484836401866

The following shows an example of doing this on a single qtree in ONTAP.

This is a qtree called “mixed.” It has an effective security style of UNIX, UNIX permissions of 770 and root:sharedgroup as the owners.

cluster::*> vserver security file-directory show -vserver DEMO -path /shared/mixed

                Vserver: DEMO
              File Path: /shared/mixed
      File Inode Number: 20035
         Security Style: mixed
        Effective Style: unix
         DOS Attributes: 10
 DOS Attributes in Text: ----D---
Expanded Dos Attributes: -
           UNIX User Id: 0
          UNIX Group Id: 1206
         UNIX Mode Bits: 770
 UNIX Mode Bits in Text: rwxrwx---
                   ACLs: -

To change permissions on this object (or other objects, if desired), first create a security policy:

cluster::*> file-directory policy create -vserver DEMO -policy-name Policy1
  (vserver security file-directory policy create)
 
cluster::*> vserver security file-directory policy show -vserver DEMO -instance
    Vserver: DEMO
Policy Name: Policy1

Then, create a security descriptor, which allows a storage admin to add access control entries (ACEs) to the discretionary access control list (DACL) and the system access control list (SACL). This provides the ability to add, in bulk, an owner, group or control flags in raw hex:

cluster::*> vserver security file-directory ntfs create -vserver DEMO -ntfs-sd sdname -owner ntfsonly

cluster::*> vserver security file-directory ntfs show -instance
                      Vserver: DEMO
NTFS Security Descriptor Name: sdname
                        Owner: NTAP\ntfsonly
                Primary Group: -
            Raw Control Flags: -

Next, create one or more DACLs or SACLs. In this case, I’ve created 2 DACLs. This command allows the following:

cluster::*> vserver security file-directory ntfs dacl add ?
    -vserver                                                   Vserver
   [-ntfs-sd]                                             NTFS Security Descriptor Name
   [-access-type] {deny|allow}                                               Allow or Deny
   [-account]                                                   Account Name or SID
  { [[-rights] {no-access|full-control|modify|read-and-execute|read|write}]  DACL ACE's Access Rights
  | [ -advanced-rights , ... ]                        DACL ACE's Advanced Access Rights
  | [ -rights-raw  ] }                                          *DACL ACE's Raw Access Rights
  [ -apply-to {this-folder|sub-folders|files}, ... ]                         Apply DACL Entry

The users I’m adding are ntfsonly and student1. Ntfsonly gets full control; student1 gets read-only access. I’m applying the DACL to all objects (this-folder, sub-folders, files).

NOTE: If you don’t apply the DACL to the top level folder, you run the risk of denying access to everyone because the owner doesn’t get set properly.

ontap9-tme-8040::*> vserver security file-directory ntfs dacl add -vserver DEMO -ntfs-sd sdname -access-type allow -account ntfsonly -apply-to this-folder,sub-folders,files -advanced-rights full-control

ontap9-tme-8040::*> vserver security file-directory ntfs dacl add -vserver DEMO -ntfs-sd sdname -access-type allow -account student1 -rights read -apply-to this-folder,sub-folders,files

In addition to the ACLs we define, we also get default built-in DACLs. Feel free to delete those as needed.
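For example, removing one of the built-in entries might look like this – a sketch, so check the command reference for your release for the exact parameters:

cluster::*> vserver security file-directory ntfs dacl remove -vserver DEMO -ntfs-sd sdname -access-type allow -account BUILTIN\Users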

ontap9-tme-8040::*> vserver security file-directory ntfs dacl show -vserver DEMO -instance


                      Vserver: DEMO
NTFS Security Descriptor Name: sdname
                Allow or Deny: allow
          Account Name or SID: BUILTIN\Administrators
                Access Rights: full-control
            Raw Access Rights: -
       Advanced Access Rights: -
             Apply DACL Entry: this-folder, sub-folders, files
                Access Rights: full-control

                      Vserver: DEMO
NTFS Security Descriptor Name: sdname
                Allow or Deny: allow
          Account Name or SID: BUILTIN\Users
                Access Rights: full-control
            Raw Access Rights: -
       Advanced Access Rights: -
             Apply DACL Entry: this-folder, sub-folders, files
                Access Rights: full-control

                      Vserver: DEMO
NTFS Security Descriptor Name: sdname
                Allow or Deny: allow
          Account Name or SID: CREATOR OWNER
                Access Rights: full-control
            Raw Access Rights: -
       Advanced Access Rights: -
             Apply DACL Entry: this-folder, sub-folders, files
                Access Rights: full-control

                      Vserver: DEMO
NTFS Security Descriptor Name: sdname
                Allow or Deny: allow
          Account Name or SID: NT AUTHORITY\SYSTEM
                Access Rights: full-control
            Raw Access Rights: -
       Advanced Access Rights: -
             Apply DACL Entry: this-folder, sub-folders, files
                Access Rights: full-control

                      Vserver: DEMO
NTFS Security Descriptor Name: sdname
                Allow or Deny: allow
          Account Name or SID: NTAP\ntfsonly
                Access Rights: -
            Raw Access Rights: -
       Advanced Access Rights: full-control
             Apply DACL Entry: this-folder, sub-folders, files
                Access Rights: full-control

                      Vserver: DEMO
NTFS Security Descriptor Name: sdname
                Allow or Deny: allow
          Account Name or SID: NTAP\student1
                Access Rights: read
            Raw Access Rights: -
       Advanced Access Rights: -
             Apply DACL Entry: this-folder, sub-folders, files
                Access Rights: read
6 entries were displayed.

Now that the policy is created and I have the desired DACLs and SACLs, I can apply them to whatever paths I want. In the above, I’ve set the DACLs to apply to the folder, sub-folders and files. To apply the policy, create a new task and define the path you want to re-ACL. The task will “propagate” by default. You can also specify “replace” if desired.

cluster::*> file-directory policy task add -vserver DEMO -policy-name Policy1 -path /shared/mixed -ntfs-sd sdname
  (vserver security file-directory policy task add)

cluster::*> file-directory policy task show
  (vserver security file-directory policy task show)

Vserver: DEMO
  Policy: Policy1

   Index  File/Folder  Access           Security  NTFS       NTFS Security
          Path         Control          Type      Mode       Descriptor Name
   -----  -----------  ---------------  --------  ---------- ---------------
   1      /shared/mixed
                       file-directory   ntfs      propagate  sdname

Once everything appears in order, apply the policy:

cluster::*> file-directory apply -vserver DEMO -policy-name Policy1
  (vserver security file-directory apply)

[Job 3229] Job is queued: Fsecurity Apply. Use the "job show -id 3229" command to view the status of this operation.

If you want status of the progress, use job show. If you want detailed progress, use job show -instance.

cluster::*> job show -id 3229
                            Owning
Job ID Name                 Vserver    Node           State
------ -------------------- ---------- -------------- ----------
3229   Fsecurity Apply      cluster
                                       cluster2
                                                      Success
       Description: File Directory Security Apply Job

Then, check your ACLs. Note how the effective style of the mixed qtree has changed from UNIX to NTFS:

cluster::*> vserver security file-directory show -vserver DEMO -path /shared/mixed

                Vserver: DEMO
              File Path: /shared/mixed
      File Inode Number: 20035
         Security Style: mixed
        Effective Style: ntfs
         DOS Attributes: 10
 DOS Attributes in Text: ----D---
Expanded Dos Attributes: -
           UNIX User Id: 0
          UNIX Group Id: 0
         UNIX Mode Bits: 777
 UNIX Mode Bits in Text: rwxrwxrwx
                   ACLs: NTFS Security Descriptor
                         Control:0x8014
                         Owner:NTAP\ntfsonly
                         Group:BUILTIN\Administrators
                         DACL - ACEs
                           ALLOW-BUILTIN\Administrators-0x1f01ff-OI|CI
                           ALLOW-BUILTIN\Users-0x1f01ff-OI|CI
                           ALLOW-CREATOR OWNER-0x1f01ff-OI|CI
                           ALLOW-NT AUTHORITY\SYSTEM-0x1f01ff-OI|CI
                           ALLOW-NTAP\ntfsonly-0x1f01ff
                           ALLOW-NTAP\student1-0x120089     

If you want to apply the policy to other paths (or multiple paths at once), create new tasks:

cluster::*> vserver security file-directory show -vserver DEMO -path /shared/security
                Vserver: DEMO
              File Path: /shared/security
      File Inode Number: 96
         Security Style: mixed
        Effective Style: unix
         DOS Attributes: 10
 DOS Attributes in Text: ----D---
Expanded Dos Attributes: -
           UNIX User Id: 0
          UNIX Group Id: 0
         UNIX Mode Bits: 770
 UNIX Mode Bits in Text: rwxrwx---
                   ACLs: -

cluster::*> file-directory policy task add -vserver DEMO -policy-name Policy1 -path /shared/security -ntfs-sd sdname
  (vserver security file-directory policy task add)

cluster::*> file-directory policy task show
  (vserver security file-directory policy task show)
Vserver: DEMO
  Policy: Policy1
   Index  File/Folder  Access           Security  NTFS       NTFS Security
          Path         Control          Type      Mode       Descriptor Name
   -----  -----------  ---------------  --------  ---------- ---------------
   1      /shared/mixed
                       file-directory   ntfs      propagate  sdname
   2      /shared/security
                       file-directory   ntfs      propagate  sdname
2 entries were displayed.

cluster::*> file-directory apply -vserver DEMO -policy-name Policy1
  (vserver security file-directory apply)

[Job 3232] Job is queued: Fsecurity Apply. Use the "job show -id 3232" command to view the status of this operation.

cluster::*> vserver security file-directory show -vserver DEMO -path /shared/security
                Vserver: DEMO
              File Path: /shared/security
      File Inode Number: 96
         Security Style: mixed
        Effective Style: ntfs
         DOS Attributes: 10
 DOS Attributes in Text: ----D---
Expanded Dos Attributes: -
           UNIX User Id: 0
          UNIX Group Id: 0
         UNIX Mode Bits: 777
 UNIX Mode Bits in Text: rwxrwxrwx
                   ACLs: NTFS Security Descriptor
                         Control:0x8014
                         Owner:NTAP\ntfsonly
                         Group:BUILTIN\Administrators
                         DACL - ACEs
                           ALLOW-BUILTIN\Administrators-0x1f01ff-OI|CI
                           ALLOW-BUILTIN\Users-0x1f01ff-OI|CI
                           ALLOW-CREATOR OWNER-0x1f01ff-OI|CI
                           ALLOW-NT AUTHORITY\SYSTEM-0x1f01ff-OI|CI
                           ALLOW-NTAP\ntfsonly-0x1f01ff
                           ALLOW-NTAP\student1-0x120089

Example of a running job with more information:

cluster::*> job show -id 3317 -instance
                      Job ID: 3317
              Owning Vserver: cluster
                        Name: Fsecurity Apply
                 Description: File Directory Security Apply Job
                    Priority: Low
                        Node: cluster02
                    Affinity: Cluster
                    Schedule: @now
                  Queue Time: 01/24 09:45:19
                  Start Time: 01/24 09:45:19
                    End Time: -
              Drop-dead Time: -
                  Restarted?: false
                       State: Running
                 Status Code: 0
           Completion String:
                    Job Type: FSEC_APPLY
                Job Category: FSECURITY
                        UUID: b9e7bf61-e243-11e6-a40c-00a0986b1210
          Execution Progress: Fsecurity Apply processed 46766 files/dirs. Last Processed: /shared/security/files/topdir_77/subdir_81
                   User Name: admin
                     Process: mgwd
  Restart Is or Was Delayed?: false
Restart Is Delayed by Module: -

Centralizing permission management

With multiprotocol NAS, it’s possible to view and manage ACLs from multiple clients, as well as the storage. The way I did this was to set up passwordless SSH on a Linux client and then create simple shell scripts that call SSH commands to the cluster. Another way to do this would be to leverage the ONTAP SDK. I’ll write up a post on the SDK at some point in the future, but for now, we’ll focus on the bash scripts.

To set up passwordless SSH to the cluster, do the following (from TR-4073):

Create the SSH Keypair

In the following example, ssh-keygen is used on a Linux box.

  • If an SSH key pair already exists, there is no need to generate one using ssh-keygen.
monitor@linux:/$ ssh-keygen -q -f ~/.ssh/id_rsa -t rsa
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
monitor@linux:/$ ls -lsa ~/.ssh
total 16
4 drwx------ 2 monitor monitor 4096 2008-08-26 11:47 .
4 drwxr-xr-x 3 monitor monitor 4096 2008-08-26 11:47 ..
4 -rw------- 1 monitor monitor 1679 2008-08-26 11:47 id_rsa
4 -rw-r--r-- 1 monitor monitor 401 2008-08-26 11:47 id_rsa.pub

Create the User with a Public Key Authentication Method

cluster::> security login create -username monitor -application ssh -authmethod publickey -profile admin

Create the Public Key on the Cluster

Copy the public key contents of the id_rsa.pub file and place it between quotes in the security login publickey create command. Take caution not to add carriage returns or other data that modifies the keystring; leave it on one line.

netapp::> security login publickey create -username monitor -index 1 -publickey "ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA5s4vVbwEO1sOsq7r64V5KYBRXBDb2I5mtGmt0+3p1jjPJrXx4/IPHFLalXAQkG7LhV5Dyc5jyQiGKVawBYwxxSZ3GqXJNv1aORZHJEuCd0zvSTBGGZ09vra5uCfxkpz8nwaTeiAT232LS2lZ6RJ4dsCz+GAj2eidpPYMldi2z6RVoxpZ5Zq68MvNzz8b15BS9T7bvdHkC2OpXFXu2jndhgGxPHvfO2zGwgYv4wwv2nQw4tuqMp8e+z0YP73Jg0T3jV8NYraXO951Rr5/9ZT8KPUqLEgPZxiSNkLnPC5dnmfTyswlofPGud+qmciYYr+cUZIvcFaYRG+Z6DM/HInX7w==  monitor@linux"

Alternatively, you can use the load-from-uri function to bring the public key from another source.

cluster::> security login publickey load-from-uri -username monitor -uri http://linux/id_rsa.pub

Verify Creation

netapp::> security login publickey show -username monitor

UserName: monitor Index: 1

Public Key:

ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA5s4vVbwEO1sOsq7r64V5KYBRXBDb2I5mtGmt0+3p1jjPJrXx4/IPHFLalXAQkG7LhV5Dyc5jyQiGKVawBYwxxSZ3GqXJNv1aORZHJEuCd0zvSTBGGZ09vra5uCfxkpz8nwaTeiAT232LS2lZ6RJ4dsCz+GAj2eidpPYMldi2z6RVoxpZ5Zq68MvNzz8b15BS9T7bvdHkC2OpXFXu2jndhgGxPHvfO2zGwgYv4wwv2nQw4tuqMp8e+z0YP73Jg0T3jV8NYraXO951Rr5/9ZT8KPUqLEgPZxiSNkLnPC5dnmfTyswlofPGud+qmciYYr+cUZIvcFaYRG+Z6DM/HInX7w==monitor@linux

Test Access from the Host

monitor@linux:~$ ssh 10.61.64.150
The authenticity of host '10.61.64.150 (10.61.64.150)' can't be established.
DSA key fingerprint is d9:15:cf:4b:d1:7b:a9:67:4d:b0:a9:20:e4:fa:f4:69.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.61.64.150' (DSA) to the list of known hosts.

Once that’s done, you can set up scripts to make SSH calls without having to interact.

Sample scripts

I’ve posted some sample bash scripts on GitHub to open-source these tasks. Essentially, the scripts I created can do the following (a minimal sketch of the “show ACLs” script follows the list):

  • Show ACLs for specified paths
  • Change ACLs en masse for a specified object
  • Clean up policies and DACLs created
  • Be used as a wrapper
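Here’s a minimal sketch of what the “show ACLs” script might look like. The cluster hostname and SVM name are assumptions based on my earlier examples, and the real scripts on GitHub handle errors and options more thoroughly:

#!/bin/bash
# show-ACL.sh - minimal sketch: display ONTAP ACLs for a path over passwordless SSH
CLUSTER="cluster-mgmt"   # cluster management LIF (assumption)
SVM="DEMO"               # SVM name from the earlier examples
FILEPATH="$1"

if [ -z "$FILEPATH" ]; then
  echo "Usage: $0 /path/to/object"
  exit 1
fi

echo "Do you want to expand the ACL masks to show all fields? (enter 1 or 2)"
echo "CAUTION: Output may be lengthy"
select EXPAND in Yes No; do
  case "$EXPAND" in
    Yes) MASK="true"; break ;;
    No)  MASK="false"; break ;;
  esac
done

# Run the file-directory show command on the cluster as the passwordless SSH user
ssh "monitor@$CLUSTER" "vserver security file-directory show -vserver $SVM -path $FILEPATH -expand-mask $MASK"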

Creating a wrapper command

In addition to the scripts above, it’s also possible to create a simple wrapper command in Linux that will call a script to make life easier for an administrator. To do this, modify the .bashrc file in the user’s home directory. In the following example, I created a command called ONTAP_ACL and pointed it to my script.

# cat ~/.bashrc
# .bashrc

# User specific aliases and functions

alias rm='rm -i'
alias cp='cp -i'
alias mv='mv -i'
alias ONTAP_ACL='/scripts/./show-ACL.sh'

Once this is done, you can restart the shell session and run the command to execute the script. In the above, the show-ACL script simply takes a path as input, asks a yes/no question and dumps the output.

# ONTAP_ACL /home
Do you want to expand the ACL masks to show all fields? (enter 1 or 2)
CAUTION: Output may be lengthy

1) Yes
2) No
#? 2


 Vserver: DEMO
 File Path: /home
 File Inode Number: 64
 Security Style: mixed
 Effective Style: ntfs
 DOS Attributes: 10
 DOS Attributes in Text: ----D---
Expanded Dos Attributes: -
 UNIX User Id: 0
 UNIX Group Id: 1
 UNIX Mode Bits: 711
 UNIX Mode Bits in Text: rwx--x--x
 ACLs: NTFS Security Descriptor
 Control:0x9504
 Owner:NTAP\Administrator
 Group:NTAP\Domain Users
 DACL - ACEs
 ALLOW-NTAP\Administrator-0x1f01ff-OI|CI
 ALLOW-Everyone-0x100020-OI|CI

I could also apply a wrapper to other scripts, such as the script to modify ACLs, if I chose to. This allows a simple way to centrally manage your file and folder ACLs rather than having to jump between clients and storage.

Testing the scripts

Be sure to test the scripts only in protected environments, rather than on your production data. Make generous use of ONTAP features such as FlexClone, SnapMirror (to new Storage Virtual Machines) and Snapshots.

Questions? Leave them in the comments!

 

Mixed perceptions with NetApp multiprotocol NAS access

EDIT: As the original post for this was super long, I’ve since broken it up into a 2-part post. I moved the vserver security information to the following post:

Managing ACLs via the ONTAP Command Line

NetApp’s ONTAP operating system is one of the few storage operating systems out there that supports data access from both CIFS/SMB and NFS clients. NetApp’s been doing this for a long time – longer than I’ve been there, and I’m going on 10 years!

Despite the fact that it’s been around so long and is one of *the* core competencies in ONTAP, it’s one of the most frequently misunderstood configurations I see. When I was in support, it was one of the biggest case generators. As the NFS TME, it’s one of the topics customers most frequently email me about for assistance.

I can tell you what it’s not….

Multiprotocol NAS is NOT “Mixed Mode”

Many people use this terminology to describe access from multiple clients. Unfortunately, it only adds to the confusion, because there is also a security style called “mixed” (see below). That makes people associate the two, and then they start setting mixed security style when they don’t need to…

So, call it what it is – Multiprotocol NAS. 🙂

What’s so hard about it?

The reason it seems to confound so many people is twofold:

  • Windows administrators are generally not UNIX-savvy
  • UNIX administrators are generally not Windows-savvy

To truly understand multiprotocol NAS, you either have to know both Windows and UNIX file systems/security semantics pretty well, or be open to the fact that Windows and UNIX have similarities and differences.

That said, when you do understand how it works and get it configured properly, it’s a pretty powerful tool for serving data for multiple client types.

There’s currently a multiprotocol TR in the works, but it will be a ways out. However, I just dealt with a recent multiprotocol NAS issue and wanted to do a brain dump before the information got stale and I had to revisit it. This blog is intended to be a quick-hit guide to multiprotocol NAS in ONTAP. Some of the ideas will make their way into official TR format.

What makes multiprotocol NAS possible in ONTAP?

ONTAP is fairly agnostic when it comes to file systems and ACL styles. SMB and NFS clients use different security semantics, but the general concepts of those are the same.

Users, groups, permissions.

From there, things tend to skew a bit. Windows uses NTFS security concepts. NFS clients use mode bits for NFSv3/NFSv4.x or ACLs for NFSv4.x. NFSv3 had the concept of POSIX ACLs, but ONTAP doesn’t support those.

The issue is that NTFS ACLs are more complex than mode bits, but match up pretty nicely with NFSv4.x ACLs. Mode bits only do Read, Write, eXecute (RWX), so Windows ACLs don’t match up 1 to 1, especially when you have “special permissions” in the mix. As a result, when dealing with ONTAP file systems, we have the concept of a security style that helps us choose the style of ACL we want to implement. The choices we have:

  • NTFS – NTFS ACLs only
  • UNIX – UNIX style permissions only
  • Mixed – UNIX or NTFS permissions, depending on who last changed permissions
  • Unified (Infinite Volume only)

To properly address permissions, ONTAP has to pick one security style over the other. This allows the storage system to decide which direction a user will map to determine the correct permissions. After all, what’s the point of permissions if they don’t work properly?
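Security style is set per volume (or per qtree). Here’s a quick sketch of viewing and changing it, using names from my earlier examples:

cluster::> volume show -vserver DEMO -volume flexvol -fields security-style
cluster::> volume modify -vserver DEMO -volume flexvol -security-style ntfs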

User mapping

ONTAP is not unique in the concept of user mapping, but it is still a concept that gets people confused on occasion.

Essentially, to get the proper permissions on a NetApp storage system, a client must first pass a “test” in the form of initial authentication.

The initial test is “Who are you?”

The storage system needs to know that the user you are claiming to be is actually you. There are varying degrees of how secure this test is, mostly dependent on the protocol you’re using, but the bottom line is this: authentication helps us get a user name. That user name allows us to map to another user name, depending on the volume security style.

In general:

  • SMB clients always map to a UNIX user because ONTAP is UNIX-based, even if NTFS security style is in use
  • If no name mapping rules or 1:1 name mappings exist, SMB users map to a default UNIX user set in CIFS options (pcuser/65534 by default)
  • 65534 is “nobody” or “nfsnobody” in most UNIX clients
  • NFS clients only map to Windows users when the security style is NTFS
  • NFS clients cannot chmod or chown on NTFS style volumes; SMB clients cannot take ownership or change ACLs on UNIX style volumes

Once a user has authenticated, the permissions can be discerned based on access control lists. One can see those ACLs via the CLI of the storage system with “vserver security file-directory show.”

cluster::*> vserver security file-directory show -vserver parisi -path /cifs

                 Vserver: parisi
               File Path: /cifs
       File Inode Number: 64
               Security Style: ntfs
         Effective Style: ntfs
          DOS Attributes: 10
  DOS Attributes in Text: ----D---
 Expanded Dos Attributes: -
            UNIX User Id: 0
           UNIX Group Id: 0
          UNIX Mode Bits: 777
  UNIX Mode Bits in Text: rwxrwxrwx
                    ACLs: NTFS Security Descriptor
                          Control:0x8004
                            Owner:BUILTIN\Administrators
                            Group:BUILTIN\Administrators
                            DACL - ACEs
                             ALLOW-Everyone-0x1f01ff
                             ALLOW-Everyone-0x10000000-OI|CI|IO

User/name mapping is one of the most important pieces of the multiprotocol NAS puzzle. Get that part right and most everything else is easy.

Name mapping can be done either locally (via name mapping rules) or with LDAP. TR-4073 covers this sort of thing in pretty finite detail.

Name services/LDAP

The easiest way to handle name mapping in ONTAP for multiprotocol NAS is to leverage a name service server like LDAP. When dealing with both SMB and NFS, the most logical choice is to use the existing Active Directory infrastructure to host UNIX identities. While you can host name mapping rules for users that don’t have the same UNIX and Windows names, it’s best to try to have UNIX and Windows user names match 1:1. (I.e., DOMAIN\nfsdudeabides == nfsdudeabides in UNIX).
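When names don’t match 1:1, static name mapping rules look like this – a sketch with hypothetical user names:

cluster::> vserver name-mapping create -vserver DEMO -direction win-unix -position 1 -pattern NTAP\\jsmith -replacement john.smith
cluster::> vserver name-mapping create -vserver DEMO -direction unix-win -position 1 -pattern john.smith -replacement NTAP\\jsmith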

TR-4073 covers LDAP and TR-4379 covers name service best practices for ONTAP 9.2 and prior. TR-4668 covers name services in ONTAP 9.3 and beyond.

Mixed Security Style

Fun fact – Mixed security style isn’t truly “mixed.” When you use mixed security style, it’s always either NTFS or UNIX security style at any given moment. This is known as the “effective” security style, which can be seen in “vserver security file-directory show.”

cluster::*> vserver security file-directory show -vserver parisi -path /cifs

                 Vserver: parisi
               File Path: /cifs
       File Inode Number: 64
               Security Style: ntfs
         Effective Style: ntfs

The “effective” style changes based on the last permission change. If an NFS client does a chmod or chown, the mixed security style volume changes to effective UNIX security style. If an SMB client changes the owner or sets an ACL, the effective security style changes to NTFS. When the effective style changes, how the storage does name mapping changes as well (i.e., win-unix to unix-win, etc.).

Is mixed security style recommended?

Generally speaking, you don’t want file systems changing something behind the scenes without the knowledge of the storage administrators. Plus, these changes can affect functionality, and even access. As a result, mixed security style is generally not recommended. The only time you’d want to use mixed security style is if your environment requires the ability for clients or applications to change permissions from both NFS and SMB. And even then, if you do set up mixed security style, consider limiting the ability for regular users to take ownership or change permissions on folders and files via NTFS ACLs.

Otherwise, I personally recommend picking either NTFS or UNIX and sticking with it. That choice would be based on how you want your users to manage their ACLs, as well as how granular you want control to be on those file systems. For example, mode bits in UNIX only allow setting an owner, group and everyone else. There’s no way to set multiple groups with different access on the object unless you use NFSv4 or NTFS ACLs.

I usually prefer NTFS because you get the granularity, as well as the GUI functionality many users are accustomed to.

If you do decide to use mixed security style, keep the following in mind:

  • If a volume is using mixed security style and the effective style gets flipped from NTFS to UNIX and then back to NTFS by way of the clients, the previous NTFS ACLs are lost.
  • When a volume flips from UNIX effective to NTFS effective, you get the mode bit translation. For example, if the UNIX volume was 755, you get “Owner – Full Control” and “Everyone – Read/Execute” as Windows ACLs. 700 gives “Owner – Full Control” only.
  • Administrator always gets added onto the ACL with Read/Write access when we flip to NTFS from UNIX.
  • With mixed security style, there are two types of owners – UNIX owner and Windows owner. When Windows “takes ownership,” the UNIX owner does not change.
  • When the effective style of the volume is NTFS, UNIX clients will see permissions as 777 unless the NFS server option ntacl-display-permissive-perms is set to “disabled” (see the example after this list).
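That last option lives at the advanced privilege level. A minimal sketch of changing and verifying it, assuming a vserver named DEMO (confirm the option name in your ONTAP release):

cluster::> set advanced
cluster::*> nfs server modify -vserver DEMO -ntacl-display-permissive-perms disabled
cluster::*> nfs server show -vserver DEMO -fields ntacl-display-permissive-perms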

For information on how to manage permissions in ONTAP, see the following post:

Managing ACLs via the ONTAP Command Line

Be on the lookout for a multiprotocol TR in the future that covers this and more!

Got any questions? Feel free to post in the comments!

NetApp FlexGroup: An evolution of NAS


Check out the official NetApp version of this blog on the NetApp Newsroom!

I’ve been the NFS TME at NetApp for 3 years now.

I also cover name services (LDAP, NIS, DNS, etc.) and occasionally answer the stray CIFS/SMB question. I look at NAS as a data utility, not unlike water or electricity in your home. You need it, you love it, but you don’t really think about it too much and it doesn’t really excite you.

However, once I heard that NetApp was creating a brand new distributed file system that could evolve how NAS works, I jumped at the opportunity to be a TME for it. So, now, I am the Technical Marketing Engineer for NFS, Name Services and NetApp FlexGroup (and sometimes CIFS/SMB). How’s that for a job title?

We covered NetApp FlexGroup in the NetApp Tech ONTAP Podcast the week of June 30, but I wanted to write up a blog post to expand upon the topic a little more.

Now that ONTAP 9.1 is available, it’s time to update this blog.

For the official Technical Report, check out TR-4557 – NetApp FlexGroup Technical Overview.

For the best practice guide, see TR-4571 – NetApp FlexGroup Best Practices and Implementation Guide.


Data is growing.

It’s no secret: we’re leaving behind – some may say we’ve already left – the days when 100TB in a single volume was enough space for a single file system. Files are getting larger and datasets are growing. Think about the sheer amount of data needed to keep something like a photo or video repository running. Or a global GPS data structure. Or Electronic Design Automation environments designing the latest computer chipset. Or seismic data analysis for oil and gas exploration.

Environments like these require massive amounts of capacity, with billions of files in some cases. Scale-out NAS storage devices are the best way to approach these use cases because of their flexibility, but it’s important to be able to scale the existing architecture simply and efficiently.

For a while, storage systems like ONTAP had a single construct to handle these workloads – the Flexible Volume (or, FlexVol).

FlexVols are great, but…

For most use cases, FlexVols are perfect. They are large enough (up to 100TB) and can handle enough files (up to 2 billion). For NAS workloads, they can do just about anything. Where you start to see issues is when the number of metadata operations in a file system increases. A FlexVol volume serializes those operations and won’t use every available CPU thread to process them. I think of it like a traffic jam caused by lane closures: when a lane is closed, everyone has to merge, causing slowdowns.


When all lanes are open, traffic is free to move normally and concurrently.


Additionally, because a FlexVol volume is tied directly to a physical aggregate and node, your NAS operations are also tied to that single aggregate or node. If you have a 10-node cluster, each with multiple aggregates, you might not be getting the most bang for your buck.

That’s where NetApp FlexGroup comes in.

FlexGroup has been designed to solve multiple issues in large-scale NAS workloads.

  • Capacity – Scales to multiple petabytes
  • High file counts – Hundreds of billions of files
  • Performance – parallelized operations in NAS workloads, across CPUs, nodes, aggregates and constituent member FlexVol volumes
  • Simplicity of deployment – Simple-to-use GUI in System Manager allows fast provisioning of massive capacity
  • Load balancing – Use all your cluster resources for a single namespace

With FlexGroup volumes, NAS workloads can now take advantage of every resource available in a cluster. Even with a single node cluster, a FlexGroup can balance workloads across multiple FlexVol constituents and aggregates.

How does a FlexGroup volume work at a high level?

FlexGroup volumes take the already awesome concept of a FlexVol volume and enhance it by stitching together multiple FlexVol member constituents into a single namespace that acts like a single FlexVol volume to clients and storage administrators.

A FlexGroup volume would roughly look like this from an ONTAP perspective:

[Diagram: a FlexGroup volume presenting a single namespace backed by multiple member FlexVol volumes across the cluster’s nodes and aggregates]

Files are not striped, but instead are placed systematically into individual FlexVol member volumes that work together under a single access point. This concept is very similar in function to a multiple FlexVol volume configuration, where volumes are junctioned together to simulate a large bucket.

[Diagram: multiple FlexVol volumes junctioned together to simulate a single large bucket]

However, a multiple-FlexVol configuration adds complexity: junctions, export policies, and manual decisions about volume placement across cluster nodes, as well as re-designing applications to point to a file system structure that is defined by the storage rather than by the application.

To a NAS client, a FlexGroup volume would look like a single bucket of storage:

[Screenshot: a FlexGroup volume appearing to a NAS client as a single bucket of storage]

When a client creates a file in a FlexGroup, ONTAP decides which member FlexVol volume is the best possible container for that write based on a number of factors, such as free capacity across members, throughput, and last accessed times – basically, doing all the hard work for you. The idea is to keep the members as balanced as possible without hurting performance predictability, and in some workloads, to actually increase performance.

Creates can arrive on any node in the cluster. Once a request arrives at the cluster, if ONTAP chooses a member volume different from the one where the request landed, a hardlink is created within ONTAP (remote or local, depending on the request) and the create is passed on to the designated member volume. All of this is transparent to clients.

Reads and writes after a file is created operate much as they do in FlexVols today: the system tells the client where the file lives and points the client at that particular member volume. As such, you see the biggest gains during initial file ingest, rather than on reads/writes after the files have already been placed.

 

Why is this better?

 

When NAS operations can be allocated across multiple FlexVol volumes, we don’t run into the issue of serialization in the system. Instead, we start spreading the workload across multiple file systems (FlexVol volumes) joined together (the FlexGroup volume). And unlike Infinite Volumes, there is no concept of a single FlexVol volume to handle metadata operations – every member volume in a FlexGroup volume is eligible to process metadata operations. As a result, FlexGroup volumes perform better than Infinite Volumes in most cases.

What kind of performance boost are we potentially seeing?

In preliminary testing of a FlexGroup against a single FlexVol, we’ve seen up to 6x the performance – and that was with simple spinning SAS disk. This was the setup used:

  • Single FAS8080 node
  • SAS drives
  • 16 FlexVol member constituents
  • 2 aggregates
  • 8 members per aggregate

The workload used to test the FlexGroup was a software build using Git. In the graph below, operations such as checkout and clone show the biggest performance boosts, as they take far less time to complete on a FlexGroup than on a single FlexVol.

[Graph: Git operation completion times (checkout, clone, etc.) – FlexGroup vs. single FlexVol]

Adding more nodes and members can improve performance, and adding AFF into the mix can help latency. Here’s a similar test comparison with an AFF system. This test also used Git, but compiled gcc instead of the Linux source code to give us more files.

[Graph: Git compile completion times on AFF – single FlexVol vs. junctioned FlexVols vs. FlexGroup]

In this case, we see similar performance between a single FlexVol and FlexGroup. We do see slightly better performance with multiple FlexVols (junctioned), but doing that creates complexity and doesn’t offer a true single namespace of >100TB.

We also did some more recent AFF testing with a Git workload, again compiling the gcc library rather than the Linux kernel, which gave us more files and folders to work with. The systems used were an AFF8080 (4 nodes) and an A700 (2 nodes).

[Graph: Git workload completion times – AFF8080 (4 nodes) and A700 (2 nodes)]

Simple management

FlexGroup volumes allow storage administrators to deploy multiple petabytes of storage to clients in a single container within a matter of seconds. This provides capacity, as well as performance gains similar to what you’d see with multiple junctioned FlexVol volumes. (FYI, a junction is essentially just mounting a FlexVol to a FlexVol.)
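To give you a feel for that simplicity, here’s a hedged sketch of provisioning a FlexGroup from the CLI. The vserver, aggregate names, and sizing are hypothetical, and parameters such as -aggr-list-multiplier may vary by release, so check the volume create documentation for your version:

cluster::> volume create -vserver DEMO -volume fg1 -aggr-list aggr1,aggr2 -aggr-list-multiplier 4 -size 800TB -junction-path /fg1

In this sketch, that single command would carve out 8 member constituents (4 per aggregate) behind one namespace.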

In addition to that, there is compatibility out of the gate with OnCommand products. The OnCommand TME Yuvaraju B has created a video showing this.

Snapshots

This section was added after the blog post was already published, as per one of the blog comments. I simply forgot to mention it. 🙂

In the first release of NetApp FlexGroup, we’ll have access to snapshot functionality. Essentially, this works the same as regular snapshots in ONTAP – it’s done at the FlexVol level and will capture a point in time of the filesystem and lock blocks into place with pointers. I cover general snapshot technology in the blog post Snapshots and Polaroids: Neither Last Forever.

Because a FlexGroup is a collection of member FlexVols, we want to be sure snapshots are captured at the exact same time for filesystem consistency. As such, FlexGroup snapshots are coordinated by ONTAP to be taken at the same time. If a member FlexVol cannot take a snapshot for any reason, the FlexGroup snapshot fails and ONTAP cleans things up.
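From the CLI, FlexGroup snapshots are managed just like FlexVol snapshots. A minimal sketch, with hypothetical vserver, volume, and snapshot names:

cluster::> volume snapshot create -vserver DEMO -volume fg1 -snapshot fg1_snap1
cluster::> volume snapshot show -vserver DEMO -volume fg1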

SnapMirror

FlexGroup supports SnapMirror for disaster recovery. This currently replicates up to 32 member volumes per FlexGroup (100 total per cluster) to a DR site. SnapMirror will take a snapshot of all member volumes at once and then do a concurrent transfer of the members to the DR site.
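The relationship itself is created much like a FlexVol SnapMirror. A hedged sketch, assuming peered clusters/SVMs and a pre-created destination FlexGroup named fg1_dr on a hypothetical SVM named DR (type, policy, and version requirements vary by release, so check the documentation):

clusterB::> snapmirror create -source-path DEMO:fg1 -destination-path DR:fg1_dr -schedule daily
clusterB::> snapmirror initialize -destination-path DR:fg1_dr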

Automatic Incremental Resiliency

Also included in the FlexGroup feature is a new mechanism that seeks out metadata inconsistencies and fixes them in real time, when a client requests access. No outages. No interruptions. The entire FlexGroup remains online while this happens, and clients don’t even notice when a repair takes place. In fact, no one would know if we didn’t trigger a pesky EMS message in ONTAP to ensure a storage administrator knows we fixed something. A pretty underrated aspect of FlexGroup, if you ask me.

How do you get NetApp FlexGroup?

NetApp FlexGroup is currently available in ONTAP 9.1 for general availability. It can be used by anyone, but should only be used for the specific use cases covered in the FlexGroup TR-4557. I also cover best practices in TR-4571.

In ONTAP 9.1, FlexGroup supports:

  • NFSv3 and SMB 2.x/3.x (RC2 for SMB support; see TR-4571 for feature support)
  • Snapshots
  • SnapMirror
  • Thin Provisioning
  • User and group quota reporting
  • Storage efficiencies (inline deduplication, compression, compaction; post-process deduplication)
  • OnCommand Performance Manager and System Manager support
  • All-flash FAS (incidentally, the *only* all-flash array that currently supports this scale)
  • Sharing SVMs with FlexVols
  • Constituent volume moves

To get more information, please email flexgroups-info@netapp.com.

What other ONTAP 9 features enhance NetApp FlexGroup volumes?

While FlexGroup as a feature is awesome on its own, there are also a number of ONTAP 9 features added that make a FlexGroup even more attractive, in my opinion.

I cover ONTAP 9 in “ONTAP 9 RC1 is now available!” but the features I think benefit FlexGroup right out of the gate include:

  • 15 TB SSDs – once we support flash, these will be a perfect fit for FlexGroup
  • Per-aggregate CPs – never bottleneck a node on an over-used aggregate again
  • RAID Triple Erasure Coding (RAID-TEC) – triple parity to add extra protection to your large data sets

Be sure to keep an eye out for more news and information regarding FlexGroup. If you have specific questions, I’ll answer them in the comments section (provided they’re not questions I’m not allowed to answer). 🙂

If you missed the NetApp Insight session I did on FlexGroup volumes, you can find session 60411-2 here:

https://www.brainshark.com/go/netapp-sell/insight-library.html?cf=12089#bsk-lightbox

(Requires a login)

Also, check out my blog on XCP, which I think would be a pretty natural fit for migration off existing NAS systems onto FlexGroup.

What’s the deal with remote I/O in ONTAP?


I’m sure most of you have seen Seinfeld, so be sure to read the title in your head as if Seinfeld is delivering it.

I used a comedian as a starter because this post is about a question that I get asked – a lot – that is kind of a running joke by now.

The set up…

When Clustered Data ONTAP first came out, there was a pretty big kerfuffle (love that word) about the architecture of the OS. After all, wasn’t it just a bunch of 7-Mode systems stitched together with duct tape?

Actually, no.

It’s a complete re-write of the ONTAP operating system, for one. The NAS stack from 7-Mode was gutted and became a new architecture built for clustering.

Then, in 8.1, the SAN concepts in 7-Mode were re-done for clustering.

So, while a clustered Data ONTAP cluster is, at the hardware level, a series of HA pairs stitched together with a 10Gb network, the operating system has been turned into what I like to call a storage blade center. Your storage system spans clusters of up to 24 physical hardware nodes, effectively abstracting away the hardware and providing a single management plane for the entire subsystem.

Every node in a cluster is aware of every other node, as well as every other storage object. If a volume lives on node 1, then node 20 knows about it and where it lives via the concept of a replicated database (RDB).

Additionally, the cluster also has a clustered networking stack, where an IP address or WWPN is presented via a logical interface (a LIF). While SAN LIFs have to stay put and leverage host-side pathing for data locality, NAS LIFs have the ability to migrate across any node and any port in the cluster.
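For example, here’s a hedged sketch of migrating a NAS LIF non-disruptively and then reverting it to its home port (hypothetical vserver, LIF, node, and port names):

cluster::> network interface migrate -vserver DEMO -lif data1 -destination-node node2 -destination-port e0c
cluster::> network interface revert -vserver DEMO -lif data1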

However, volumes are still located on physical disks and owned by physical nodes, even though you can move them around via volume move or vol rehost. LIFs are still located on physical ports and nodes, even though you can move them around and load balance connections on them. This raises the question…

What is the deal with remote I/O in ONTAP?

Since a volume can exist on only one node in a cluster (well, unless you want to check out FlexGroups), and since data LIFs live on single or aggregated ports on a single node, you are bound to run into scenarios where data operations traverse the backend cluster network. The alternatives are taking on the headache of ensuring every client mounts a specific IP address for data locality, or leveraging one of the NAS data locality features, such as pNFS or node referrals on initial connection (available for NFSv4.x and CIFS/SMB). I cover some of the NFS-related data locality features in TR-4067, and CIFS autolocation is covered in TR-4191.

In SAN, we have ALUA to manage that locality (or optimized paths), but even adding an extra layer of protection in the form of protocol locality can’t avoid scenarios where interfaces go down or volumes move around after a TCP connection has been established.

That backend network? Why, it’s a dedicated 10Gb network with 2-4 dedicated ports per node. No traffic other than cluster operations is allowed on it. Data I/O traverses the network in a proprietary protocol known as SpinNP, which rides on TCP to guarantee the arrival of packets. And with the advent of 40Gb Ethernet and other speedier transports, I’d be shocked if that backend network didn’t improve over the next 5-10 years. The types of operations that traverse the cluster network include:

  • SpinNP for data/local snapmirror
  • ZAPI calls

That’s pretty much it. It’s a beefy, robust backend network that is *extremely* hard to saturate. You’re more likely to bottleneck somewhere else (like your client) before you overload a cluster network.

So now that we’ve established that remote I/O will likely happen, let’s talk about if that matters…

The punchline


Remote I/O absolutely adds overhead to operations. There’s no technical way around saying it. Suggesting there is no penalty would be dishonest. The amount of penalty, however, varies, depending on protocol. This is especially true when you consider that NAS operations leverage a fast path when data access is local.

But the question wasn’t “is there a penalty?” The question is “does it matter?”

I’ll answer with some anecdotal evidence – I spent 5 years in support, working on escalations for clustered Data ONTAP for 3 of those years. I closed thousands of cases over that time period. In that time, I *never* fixed a performance issue by making sure a customer used a local data path.  And believe me, it wasn’t for lack of effort. I *wanted* remote traffic to be the root cause, because that was the easy answer.

Sure, locality could help when dealing with really low latency applications, such as Oracle. But in those cases, you architect the solution with data locality in mind. In the vast majority of other scenarios, the “remote I/O” penalty is pretty much irrelevant and causes more hand-wringing than necessary.

The design of clustered Data ONTAP was intended to help storage administrators stop worrying about the layout of the data. Let’s start allowing it to do its job!

Spreading the love: Load balancing NAS connections in ONTAP


I can be a little thick at times.

I’ll get asked a question a number of times, answer the question, and then forget the most important action item – document the question and answer somewhere to refer people to later, when I inevitably get asked the same question.

Some of the questions I get asked about fairly often as the NetApp NFS Technical Marketing Engineer involve DNS, which is only loosely associated with NFS. Go figure.

But, because I know enough about DNS to have written a blog post on it and a Technical Report on our Name Services Best Practices (and I actually respond to emails), I get asked.

These questions include:

  • What’s round robin DNS?
  • What other load balancing options are there?
  • What is on-box DNS in clustered Data ONTAP?
  • How do I ensure data access is local?
  • How do I set it up?
  • When would I use on-box DNS vs DNS round robin?

So, in this blog, I’ll try to answer most of those at a high level. For more detail, see the new TR-4523: DNS Load Balancing in ONTAP.

What’s round robin DNS?

Remember when you were in school and you played “duck duck goose“? If you didn’t, click the link on the term and read about it.

But essentially, the game is: everyone sits in a circle, someone walks around the circle and taps each person and says “duck” and then when they want to initiate the chase, they yell “GOOSE!” and run around the circle to sit before the person catches them.

That’s essentially round robin DNS.

You create multiple A/AAAA records, associate them with the same hostname, and away you go! The DNS server hands out a different IP address for each request of the hostname, in ABCD/ABCD fashion – no real rhyme or reason, just first come, first served.
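In a BIND zone file, a round robin setup might look like this sketch (hypothetical hostname and IP addresses):

; four A records for the same name; the server cycles through them per query
demo    IN    A    10.193.67.211
demo    IN    A    10.193.67.212
demo    IN    A    10.193.67.213
demo    IN    A    10.193.67.214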

What other DNS load balancing options are there?

There are 3rd party load balancing appliances, such as F5 BIG-IP (not an endorsement, just an example). But those cost money and require administration.

In ONTAP, however, there is a not-so-well-known feature for DNS load balancing called “on-box DNS load balancing” that is intended to incorporate intelligent load balancing for DNS requests into a cluster.

What is on-box DNS load balancing?

On-box DNS load balancing in ONTAP uses a patented algorithm to determine the best possible data LIFs on the best possible nodes to return to clients.

Basically, it looks a bit like this:


  1. The client makes a DNS request to the DNS servers in its configuration.
  2. The DNS server notices that the request is for a specific zone and uses its zone forwarder to pass the request to the cluster data LIFs acting as name servers.
  3. The cluster leverages its DNS application process and a weight file to determine which of the IP addresses configured for that DNS zone should be returned.
  4. The algorithm factors in CPU utilization, throughput, etc. when making the determination.
  5. The chosen data LIF IP address is passed back to the DNS server, and then to the client.

Easy peasy.


How do I ensure data locality?

The short answer: With on-box DNS, you can’t. But does it matter?

In clustered Data ONTAP, if you have multiple nodes and multiple data LIFs, you might end up landing on a node’s data LIF that is not local to the volume being requested. That can incur a slight latency penalty as the request traverses the backend cluster network.

In the majority of cases, this penalty is negligible to clients and applications, but for latency-sensitive applications (especially in flash environments), it can hurt a little. Local NAS connections to data volumes use a “fast path” that bypasses work the remote connections have to do. I cover this in a little more detail in TR-4067 and in TECH::Data LIF best practices for NAS in cDOT 8.3.

In cases where you absolutely *need* data access to be local to the node, you would need to mount those local data LIFs specifically. Create A/AAAA records with node names incorporated to help discern which LIFs are on which nodes.
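A hedged sketch of what that might look like, with hypothetical names and addresses:

; zone file records that encode the owning node in the hostname
node1-data    IN    A    10.193.67.211
node2-data    IN    A    10.193.67.212

# from a Linux client, mount via the LIF on the node that owns the volume
mount -t nfs node1-data.ntap.local:/flexvol /mnt/flexvol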

But in most cases, it doesn’t hurt to have remote traffic – in my 5 years in support, I never fixed a performance issue by making data access local to the node.

How do I set it up?

It’s pretty straightforward. I cover it in detail in TR-4523: DNS Load Balancing in ONTAP. In that TR, I cover Active Directory and BIND environments.

For a simple summary:

  1. Configure data LIFs in your storage virtual machine to use -dns-zone [zone name]
  2. Select data LIFs in your storage virtual machine that will act as name servers and listen for DNS queries on port 53 with “-listen-for-dns-query true”. I’d recommend multiple LIFs to provide fault tolerance. (There’s a sketch of steps 1 and 2 after this list.)
  3. Add a DNS forwarding zone (subdomain in BIND, delegation or conditional forwarder in AD) on the DNS server. Use the data LIFs acting as name servers in the configuration and use the zone specified in -dns-zone.
  4. Add PTR records for the LIFs as needed.
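Here’s a minimal sketch of steps 1 and 2 from the cluster side, with hypothetical vserver, LIF, and zone names (verify the parameter names in your release):

cluster::> network interface modify -vserver DEMO -lif data1 -dns-zone storage.ntap.local -listen-for-dns-query true
cluster::> network interface modify -vserver DEMO -lif data2 -dns-zone storage.ntap.local -listen-for-dns-query true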

That’s about it.

When to use on-box DNS vs Round Robin DNS?

This is one of the trickier questions I get, because it’s ultimately due to preference.

However, there are some guidelines…

  • If the cluster is 1 or 2 nodes in size, it probably makes sense from an administration perspective to simply use round robin DNS.
  • If the cluster is larger than 2 nodes or will eventually scale out to more than 2 nodes, it probably makes sense to get the forwarding zones set up and use on-box DNS.
  • If you require data locality or plan on using features such as NFS node referrals, SMB node referrals or pNFS, then the load balance choice doesn’t matter much – the locality features will override the DNS request.

Conclusion

So there you have it – the quick and dirty rundown of using DNS load balancing for NAS connections. I’m personally a big fan of on-box DNS as a feature because of the notion of intelligent calculation of “best available” IP addresses.

If you have any questions about the feature or the new TR-4523, please comment below.