MacOS NFS Clients with ONTAP – Tips and Considerations

When I’m testing stuff out for customer deployments that I don’t work with a ton, I like to keep notes on the work so I can reference it later for TRs or other things. A blog is a great place to do that, as it might help other people in similar scenarios. This won’t be an exhaustive list, and it is certain to change over time and possibly make its way into a TR, but here we go…

ONTAP and Multiprotocol NAS

Before I get into the MacOS client stuff, we need to understand ONTAP multiprotocol NAS, as it can impact how MacOS clients behave.

In ONTAP, you can serve the same datasets to clients regardless of the NAS protocol they use (SMB or NFS). Many clients can actually do both protocols – MacOS is one of those clients.

The way ONTAP does multiprotocol NAS (and keeps permissions predictable) is via name mappings and volume “security styles,” which control what kind of ACLs are in use. TR-4887 goes into more detail on how all that works, but at a high level:

NTFS security styles use NTFS ACLs

SMB clients will map to UNIX users and NFS clients will require mappings to valid Windows users for authentication. Then, permissions are controlled via ACLs. Chmod/chown from NFS clients will fail.

UNIX security styles use UNIX mode bits (rwx) and/or NFSv4 ACLs

SMB clients will require mappings to a valid UNIX user for permissions; NFS clients will only require mapping to a UNIX user name if using NFSv4 ACLs. SMB clients can do *some* permissions changes, but on a very limited basis.

Mixed security styles always use either UNIX or NTFS effective security styles, based on last ACL change

Basically, if an NFS client chmods a file, it switches to UNIX security style. If an SMB client changes ownership of the file, it flips back to NTFS security style. This allows you to change permissions from any client, but you need to ensure you have proper name mappings in place to avoid undesired permission behavior. Generally, we recommend avoiding mixed security styles in most cases.
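
As a rough mental model, the effective style of a file in a mixed volume just tracks which protocol last changed its permissions. The sketch below is a toy illustration, not ONTAP code; the class and method names are made up, and the starting style of "ntfs" is an arbitrary assumption.

```python
class MixedStyleFile:
    """Toy model of a file in a mixed security style volume (not ONTAP code)."""

    def __init__(self, effective_style="ntfs"):
        # Hypothetical starting point; the real starting style depends on
        # how the file was created.
        self.effective_style = effective_style
        self.mode = None
        self.acl = None

    def nfs_chmod(self, mode):
        # A permission change from NFS switches the file to UNIX effective style.
        self.mode = mode
        self.effective_style = "unix"

    def smb_set_acl(self, acl):
        # An ownership/ACL change from SMB switches it back to NTFS effective style.
        self.acl = acl
        self.effective_style = "ntfs"


f = MixedStyleFile()
f.nfs_chmod(0o755)
print(f.effective_style)
f.smb_set_acl(["ALLOW-Everyone-Read"])
print(f.effective_style)
```

Whichever client touched the permissions last decides which rules apply, which is why name mappings in both directions matter so much with mixed styles.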

MacOS NFS Client Considerations

When using MacOS as an NFS client, there are a few things I’ve run into over the past week or two of testing that you’ll want to know about to avoid issues.

MacOS can be configured to use Active Directory LDAP for UNIX Identities

When you’re doing multiprotocol NAS (even if the clients will only do NFS, your volumes might have NTFS style permissions), you want to try to use a centralized name service like LDAP so that ONTAP, SMB clients and NFS clients all agree on who the users are, what groups they belong to, what numeric IDs they have, etc. If ONTAP thinks a user has a numeric ID of 1234 and the client thinks that user has a numeric ID of 5678, then you likely won’t get the access you expected. I wrote up a blog on configuring MacOS clients to use AD LDAP here:

MacOS clients can also be configured to use single sign on with AD and NFS home directories

Your MacOS clients – once added to AD in the blog post above – can now log in using AD accounts. There’s also an additional tab in the Directory Utility that allows you to auto-create home directories when a new user logs in to the MacOS client.

But you can also configure the auto-created home directories to leverage an NFS mount on the ONTAP storage system. You can configure the MacOS client to automount homedirs and then configure the MacOS client to use that path. (This process varies based on Mac version; I’m on 10.14.4 Catalina)

By default, the homedir path is /home in auto_master. We can use that.

Then, chmod the /etc/auto_home file to 644:

$ sudo chmod 644 /etc/auto_home

Create a volume on the ONTAP cluster for the homedirs and ensure it’s able to be mounted from the MacOS clients via the export policy rules (TR-4067 covers export policy rules):

::*> vol show -vserver DEMO -volume machomedirs -fields junction-path,policy
vserver volume      policy  junction-path
------- ----------- ------- ----------------
DEMO    machomedirs default /machomedirs

Create qtrees for each user and set the user/group and desired UNIX permissions:

qtree create -vserver DEMO -volume machomedirs -qtree prof1 -user prof1 -group ProfGroup -unix-permissions 755
qtree create -vserver DEMO -volume machomedirs -qtree student1 -user student1 -group group1 -unix-permissions 755

(For best results on Mac clients, use UNIX security styles.)

Then modify the automount /etc/auto_home file to use that path for homedir mounts. When a user logs in, the homedir will auto mount.

This is the line I used:

* -fstype=nfs nfs://demo:/machomedirs/&

And I also add the home mount

Then apply the automount change:

$ sudo automount -cv
automount: /net updated
automount: /home updated
automount: /Network/Servers updated
automount: no unmounts

Now, when I cd to /home/username, it automounts that path:

$ cd /home/prof1
$ mount
demo:/machomedirs/prof1 on /home/prof1 (nfs, nodev, nosuid, automounted, nobrowse)
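
The wildcard entry reads as follows: `*` matches whatever name gets looked up under /home, and `&` is replaced with that same name in the mount location. Here is a minimal Python sketch of that substitution. The function name is mine, and it only mimics the text substitution for this simple entry form, not the automounter itself.

```python
def expand_wildcard_entry(map_line, key):
    """Expand a wildcard automount map line for one lookup key.

    Only handles the simple "* options location" form used above.
    """
    map_key, options, location = map_line.split(None, 2)
    if map_key != "*":
        raise ValueError("this sketch only handles wildcard entries")
    # "&" in the location stands for the key that was looked up.
    return options, location.replace("&", key)


line = "* -fstype=nfs nfs://demo:/machomedirs/&"
print(expand_wildcard_entry(line, "prof1"))
```

So one map line covers every user, as long as a qtree with the matching name exists under the volume.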

But if I want that path to be the new homedir path, I would need to log in as that user and then go to “System Preferences -> Users and Groups” and right click the user. Then select “Advanced Options.”

Then you’d need to restart. Once that happens, log in again and when you first open Terminal, it will use the NFS homedir path.

NOTE: You may want to test if the Mac client can manually mount the homedir before testing logins. If the client can’t automount the homedir on login, things will break.

Alternately, you can create a user with the same name as the AD account and then modify the homedir path (this removes the need to log in). The Mac will pick up the correct UID, but the group ID may need to be changed.

If you use SMB shares for your home directories, it’s as easy as selecting “Use UNC path” in the User Experience area of Directory Utility (there’s no way to specify NFS here):

With new logins, the profile will get created in the qtree you created for the homedir (and you’ll go through the typical initial Mac setup screens):

# ls -la
total 28
drwxrwxr-x 6 student1 group1 4096 Apr 14 16:39 .
drwxr-xr-x 6 root root 4096 Apr 14 15:28 ..
drwx------ 2 student1 group1 4096 Apr 14 16:39 Desktop
drwx------ 2 student1 group1 4096 Apr 14 16:35 Downloads
drwxr-xr-x 25 student1 group1 4096 Apr 14 16:39 Library
-rw-r--r-- 1 student1 group1 4096 Apr 14 16:35 ._Library
drwx------ 4 student1 group1 4096 Apr 14 16:35 .Spotlight-V100

When you open Terminal, it automounts the NFS home directory mount for that user and drops you right into your folder!

Mac NFS Considerations, Caveats, Issues

If you’re using NFS on Mac clients, there are two main things to remember:

  • Volumes/qtrees using UNIX security styles work best with NFS in general
  • Terminal/CLI works better than Finder in nearly all instances

If you have to/want to use Finder, or you have to/want to use NTFS security styles for multiprotocol, then there are some things you’d want to keep in mind.

  • If possible, connect the Mac client to the Active Directory domain and use LDAP for UNIX identities as described above.
  • Ensure your users/groups are all resolving properly on the Mac clients and ONTAP system. TR-4887 and TR-4835 cover some commands you can use to check users and groups, name mappings, group memberships, etc.
  • If you’re using NTFS security style volumes/qtrees and want the Finder to work properly for copies to and from the NFS mount, configure the NFS export policy rule to set -ntfs-unix-security-ops to “ignore” – Finder will bail out if ONTAP returns an error, so we want to silently fail those operations (such as SETATTR; see below).
  • When you open a file for reading/writing (such as a text file), Mac creates a ._filename file along with it. Depending on how many files you have in your volume, this can be an issue. For example, if you open 1 million files and Mac creates 1 million corresponding ._filename files, that starts to add up. Don’t worry! You’re not alone: https://apple.stackexchange.com/questions/14980/why-are-dot-underscore-files-created-and-how-can-i-avoid-them
  • If you’re using DFS symlinks, check out this KB: DFS links do not work on MAC OS client, with ONTAP 9.5 and symlinks enabled
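
If you want a feel for how many ._ companion files have piled up on a mount, a short script can count them. This is only a sketch: the function name is mine, and reporting how many ._name files still have a matching name next to them is my own idea of what is useful to know.

```python
import os


def count_appledouble(root):
    """Return (total, paired) counts of ._ files under root.

    "paired" means an entry without the ._ prefix exists in the same directory.
    """
    total = paired = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        # Companions can belong to files or folders (e.g. ._Library and Library).
        names = set(filenames) | set(dirnames)
        for name in filenames:
            if name.startswith("._") and len(name) > 2:
                total += 1
                if name[2:] in names:
                    paired += 1
    return total, paired


# Example call (path is illustrative): count_appledouble("/home/prof1")
```

Keep in mind that walking a very large volume over NFS generates plenty of metadata operations of its own, so point it at a subtree first.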

I’ve also run into some interesting behaviors with Mac/Finder/SMB and junction paths in ONTAP, as covered in this blog:

Workaround for Mac Finder errors when unzipping files in ONTAP

One issue that I did a pretty considerable amount of analysis on was the aforementioned “can’t copy using Finder.” Here are the dirty details…

Permissions Error When Copying a File to a NFS Mount in ONTAP using Finder

In this case, a file copy worked using Terminal, but was failing with permissions errors when using Finder and complaining about the file already existing.

First, it wants a login (which shouldn’t be needed):

Then it says this:

If you select “Replace” this is the error:

If you select “Stop” it stops and you are left with an empty 0 byte “file” – so the copy failed.

If you select “Keep Both” the Finder goes into an infinite loop of 0 byte file creations. I stopped mine at around 2500 files (forced an unmount):

# ls -al | wc -l
1981
# ls -al | wc -l
2004
# ls -al | wc -l
2525

So why does that happen? Well, in a packet trace, I saw the following:

The SETATTR fails on CREATE (expected in NFS operations on NTFS security style volumes in ONTAP, but not expected for NFS clients as per RFC standards):

181  60.900209  x.x.x.x    x.x.x.y      NFS  226  V3 LOOKUP Call (Reply In 182), DH: 0x8ec2d57b/copy-file-finder << Mac NFS client checks if the file exists
182  60.900558  x.x.x.y   x.x.x.x      NFS  186  V3 LOOKUP Reply (Call In 181) Error: NFS3ERR_NOENT << does not exist, so let’s create it!
183  60.900633  x.x.x.x    x.x.x.y      NFS  238  V3 CREATE Call (Reply In 184), DH: 0x8ec2d57b/copy-file-finder Mode: EXCLUSIVE << creates the file
184  60.901179  x.x.x.y   x.x.x.x      NFS  362  V3 CREATE Reply (Call In 183)
185  60.901224  x.x.x.x    x.x.x.y      NFS  238  V3 SETATTR Call (Reply In 186), FH: 0x7b82dffd
186  60.901564  x.x.x.y   x.x.x.x      NFS  214  V3 SETATTR Reply (Call In 185) Error: NFS3ERR_PERM << fails setting attributes, which also fails the copy of the actual file data, so we have a 0 byte file

Then it REMOVES the file (since the initial operation fails) and creates it again, and SETATTR fails again. This is where that “Keep Both” loop behavior takes place.

229 66.995698 x.x.x.x x.x.x.y NFS 210 V3 REMOVE Call (Reply In 230), DH: 0x8ec2d57b/copy-file-finder
233 67.006816 x.x.x.x x.x.x.y NFS 226 V3 LOOKUP Call (Reply In 234), DH: 0x8ec2d57b/copy-file-finder
234 67.007166 x.x.x.y x.x.x.x NFS 186 V3 LOOKUP Reply (Call In 233) Error: NFS3ERR_NOENT
247 67.036056 x.x.x.x x.x.x.y NFS 238 V3 CREATE Call (Reply In 248), DH: 0x8ec2d57b/copy-file-finder Mode: EXCLUSIVE
248 67.037662 x.x.x.y x.x.x.x NFS 362 V3 CREATE Reply (Call In 247)
249 67.037732 x.x.x.x x.x.x.y NFS 238 V3 SETATTR Call (Reply In 250), FH: 0xc33bff48
250 67.038534 x.x.x.y x.x.x.x NFS 214 V3 SETATTR Reply (Call In 249) Error: NFS3ERR_PERM
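
When staring at a long capture, it can help to pull out just the failing replies. The sketch below scans summary lines like the ones above and counts error replies per NFS procedure. The regular expression is tuned to this summary format only, which is an assumption on my part; other export formats would need a different pattern.

```python
import re
from collections import Counter

# Matches e.g. "V3 SETATTR Reply (Call In 185) Error: NFS3ERR_PERM"
ERROR_REPLY = re.compile(r"V3 (\w+) Reply \(Call In \d+\) Error: (\w+)")


def count_error_replies(lines):
    """Count (procedure, error) pairs found in packet summary lines."""
    counts = Counter()
    for line in lines:
        match = ERROR_REPLY.search(line)
        if match:
            counts[match.groups()] += 1
    return counts


trace = [
    "182  60.900558  x.x.x.y  x.x.x.x  NFS  186  V3 LOOKUP Reply (Call In 181) Error: NFS3ERR_NOENT",
    "184  60.901179  x.x.x.y  x.x.x.x  NFS  362  V3 CREATE Reply (Call In 183)",
    "186  60.901564  x.x.x.y  x.x.x.x  NFS  214  V3 SETATTR Reply (Call In 185) Error: NFS3ERR_PERM",
    "250 67.038534 x.x.x.y x.x.x.x NFS 214 V3 SETATTR Reply (Call In 249) Error: NFS3ERR_PERM",
]
for (proc, err), n in count_error_replies(trace).items():
    print(proc, err, n)
```

The NOENT on LOOKUP is normal for a new file; the repeated PERM on SETATTR is the one that matters here.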

With Terminal, it operates a little differently. Rather than bailing out after the SETATTR failure, it just retries it:

11 19.954145 x.x.x.x x.x.x.y NFS 226 V3 LOOKUP Call (Reply In 12), DH: 0x8ec2d57b/copy-file-finder
12 19.954496 x.x.x.y x.x.x.x NFS 186 V3 LOOKUP Reply (Call In 11) Error: NFS3ERR_NOENT
13 19.954560 x.x.x.x x.x.x.y NFS 226 V3 LOOKUP Call (Reply In 14), DH: 0x8ec2d57b/copy-file-finder
14 19.954870 x.x.x.y x.x.x.x NFS 186 V3 LOOKUP Reply (Call In 13) Error: NFS3ERR_NOENT
15 19.954930 x.x.x.x x.x.x.y NFS 258 V3 CREATE Call (Reply In 18), DH: 0x8ec2d57b/copy-file-finder Mode: UNCHECKED
16 19.954931 x.x.x.x x.x.x.y NFS 230 V3 LOOKUP Call (Reply In 17), DH: 0x8ec2d57b/._copy-file-finder
17 19.955497 x.x.x.y x.x.x.x NFS 186 V3 LOOKUP Reply (Call In 16) Error: NFS3ERR_NOENT
18 19.957114 x.x.x.y x.x.x.x NFS 362 V3 CREATE Reply (Call In 15)
25 19.959031 x.x.x.x x.x.x.y NFS 238 V3 SETATTR Call (Reply In 26), FH: 0x8bcb16f1
26 19.959512 x.x.x.y x.x.x.x NFS 214 V3 SETATTR Reply (Call In 25) Error: NFS3ERR_PERM
27 19.959796 x.x.x.x x.x.x.y NFS 238 V3 SETATTR Call (Reply In 28), FH: 0x8bcb16f1 << Hey let's try again and ask in a different way!
28 19.960321 x.x.x.y x.x.x.x NFS 214 V3 SETATTR Reply (Call In 27)

The first SETATTR tries to chmod to 700:

Mode: 0700, S_IRUSR, S_IWUSR, S_IXUSR

The retry uses 777. Since the file already shows as 777, it succeeds (because it was basically fooled):

Mode: 0777, S_IRUSR, S_IWUSR, S_IXUSR, S_IRGRP, S_IWGRP, S_IXGRP, S_IROTH, S_IWOTH, S_IXOTH
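
Those flag lists are just the mode bits spelled out. Here is a quick sketch using Python's standard `stat` module to decode the two modes from the trace (the helper name `mode_flags` is mine):

```python
import stat

FLAGS = ["S_IRUSR", "S_IWUSR", "S_IXUSR",
         "S_IRGRP", "S_IWGRP", "S_IXGRP",
         "S_IROTH", "S_IWOTH", "S_IXOTH"]


def mode_flags(mode):
    """List the permission flag names that are set in a numeric mode."""
    return [name for name in FLAGS if mode & getattr(stat, name)]


for mode in (0o700, 0o777):
    # stat.filemode expects a full st_mode, so add the regular-file type bit.
    print(oct(mode), stat.filemode(stat.S_IFREG | mode), mode_flags(mode))
```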

Since Finder bails on the error, setting the NFS server to return no error here for this export (ntfs-unix-security-ops ignore) on this client allows the copy to succeed. You can create granular rules in your export policy rules to just set that option for your Mac clients.
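
To make the difference concrete, here is a toy simulation of the behavior seen in the traces. It is not how ONTAP or the Mac NFS client are implemented. The function names, the "fail" label for the error-returning behavior, and the rule that a SETATTR matching the displayed mode succeeds are all simplifications drawn from the captures above.

```python
DISPLAYED_MODE = 0o777  # what NFS clients see on this NTFS-style volume


def server_setattr(mode, security_ops):
    """Toy server reply to a chmod-style SETATTR on an NTFS-style file."""
    if mode == DISPLAYED_MODE:
        return "NFS3_OK"       # nothing would change, as in the Terminal retry
    if security_ops == "ignore":
        return "NFS3_OK"       # silently ignored; nothing is applied
    return "NFS3ERR_PERM"      # error returned to the client


def finder_copy(security_ops):
    """Finder-like client: gives up on the first SETATTR error."""
    if server_setattr(0o700, security_ops) != "NFS3_OK":
        return "failed, 0 byte file left behind"
    return "copied"


def terminal_copy(security_ops):
    """Terminal-like client: retries SETATTR with the mode the file already shows."""
    if server_setattr(0o700, security_ops) != "NFS3_OK":
        if server_setattr(DISPLAYED_MODE, security_ops) != "NFS3_OK":
            return "failed"
    return "copied"


for ops in ("fail", "ignore"):
    print(ops, "| Finder:", finder_copy(ops), "| Terminal:", terminal_copy(ops))
```

In this model only the Finder-like client with the error-returning setting fails, which matches what I saw in testing.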

Now, why do our files all show as 777?

Displaying NTFS Permissions via NFS

Because NFS doesn’t understand NTFS permissions, the job to translate user identities into valid access rights falls onto the shoulders of ONTAP. A UNIX user maps to a Windows user and then that Windows user is evaluated against the folder/file ACLs.

So “777” here doesn’t mean we have wide open access; we only have access based on the Windows ACL. Instead, it just means “the NFS client can’t view the access level for that user.” In most cases, this isn’t a huge problem. But sometimes, you need files/folders not to show 777 (like for applications that don’t allow 777).

In that case, you can control somewhat how NFS clients display NTFS ACLs in “ls” commands with the NFS server option ntacl-display-permissive-perms.

[-ntacl-display-permissive-perms {enabled|disabled}] - Display maximum NT ACL Permissions to NFS Client (privilege: advanced)
This optional parameter controls the permissions that are displayed to NFSv3 and NFSv4 clients on a file or directory that has an NT ACL set. When true, the displayed permissions are based on the maximum access granted by the NT ACL to any user. When false, the displayed permissions are based on the minimum access granted by the NT ACL to any user. The default setting is false.

The default setting of “false” is actually “disabled.” When that option is enabled, this is the file/folder view:

When that option is disabled (the default):

This option is covered in more detail in TR-4067, but it doesn’t require a remount to take effect. It may take some time for the access caches to clear to see the results, however.

Keep in mind that these listings are approximations of the access as seen by the current user. If the option is disabled, you see the minimum access; if the option is enabled, you see the maximum access. For example, the “test” folder above shows 555 when the option is disabled, but 777 when the option is enabled.

These are the actual permissions on that folder:

::*> vserver security file-directory show -vserver DEMO -path /FG2/test
Vserver: DEMO
File Path: /FG2/test
File Inode Number: 10755
Security Style: ntfs
Effective Style: ntfs
DOS Attributes: 10
DOS Attributes in Text: ----D---
Expanded Dos Attributes: -
UNIX User Id: 1102
UNIX Group Id: 10002
UNIX Mode Bits: 777
UNIX Mode Bits in Text: rwxrwxrwx
ACLs: NTFS Security Descriptor
Control:0x8504
Owner:BUILTIN\Administrators
Group:NTAP\ProfGroup
DACL - ACEs
ALLOW-Everyone-0x1200a9-OI|CI (Inherited)
ALLOW-NTAP\prof1-0x1f01ff-OI|CI (Inherited)

Here are the expanded ACLs:

                     Owner:BUILTIN\Administrators
                     Group:NTAP\ProfGroup
                     DACL - ACEs
                       ALLOW-Everyone-0x1200a9-OI|CI (Inherited)
                          0... .... .... .... .... .... .... .... = Generic Read
                          .0.. .... .... .... .... .... .... .... = Generic Write
                          ..0. .... .... .... .... .... .... .... = Generic Execute
                          ...0 .... .... .... .... .... .... .... = Generic All
                          .... ...0 .... .... .... .... .... .... = System Security
                          .... .... ...1 .... .... .... .... .... = Synchronize
                          .... .... .... 0... .... .... .... .... = Write Owner
                          .... .... .... .0.. .... .... .... .... = Write DAC
                          .... .... .... ..1. .... .... .... .... = Read Control
                          .... .... .... ...0 .... .... .... .... = Delete
                          .... .... .... .... .... ...0 .... .... = Write Attributes
                          .... .... .... .... .... .... 1... .... = Read Attributes
                          .... .... .... .... .... .... .0.. .... = Delete Child
                          .... .... .... .... .... .... ..1. .... = Execute
                          .... .... .... .... .... .... ...0 .... = Write EA
                          .... .... .... .... .... .... .... 1... = Read EA
                          .... .... .... .... .... .... .... .0.. = Append
                          .... .... .... .... .... .... .... ..0. = Write
                          .... .... .... .... .... .... .... ...1 = Read

                       ALLOW-NTAP\prof1-0x1f01ff-OI|CI (Inherited)
                          0... .... .... .... .... .... .... .... = Generic Read
                          .0.. .... .... .... .... .... .... .... = Generic Write
                          ..0. .... .... .... .... .... .... .... = Generic Execute
                          ...0 .... .... .... .... .... .... .... = Generic All
                          .... ...0 .... .... .... .... .... .... = System Security
                          .... .... ...1 .... .... .... .... .... = Synchronize
                          .... .... .... 1... .... .... .... .... = Write Owner
                          .... .... .... .1.. .... .... .... .... = Write DAC
                          .... .... .... ..1. .... .... .... .... = Read Control
                          .... .... .... ...1 .... .... .... .... = Delete
                          .... .... .... .... .... ...1 .... .... = Write Attributes
                          .... .... .... .... .... .... 1... .... = Read Attributes
                          .... .... .... .... .... .... .1.. .... = Delete Child
                          .... .... .... .... .... .... ..1. .... = Execute
                          .... .... .... .... .... .... ...1 .... = Write EA
                          .... .... .... .... .... .... .... 1... = Read EA
                          .... .... .... .... .... .... .... .1.. = Append
                          .... .... .... .... .... .... .... ..1. = Write
                          .... .... .... .... .... .... .... ...1 = Read

So, prof1 has Full Control (7) and “Everyone” has Read (5). That’s where the minimum/maximum permissions show up. So you won’t get *exact* permissions here. If you want exact permission views, consider using UNIX security styles.
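
The hex masks in those ACEs can be decoded by hand from the bit listing above. The sketch below does that in Python and then squashes each mask down to a single rwx digit. The bit values come from the expanded ACL output; the Read/Write/Execute-to-rwx reduction is my own simplification for illustration and is not ONTAP's actual translation logic.

```python
# Bit values taken from the expanded ACL listing above.
RIGHTS = {
    0x000001: "Read",
    0x000002: "Write",
    0x000004: "Append",
    0x000008: "Read EA",
    0x000010: "Write EA",
    0x000020: "Execute",
    0x000040: "Delete Child",
    0x000080: "Read Attributes",
    0x000100: "Write Attributes",
    0x010000: "Delete",
    0x020000: "Read Control",
    0x040000: "Write DAC",
    0x080000: "Write Owner",
    0x100000: "Synchronize",
}


def decode_mask(mask):
    """Return the names of the rights set in an NTFS access mask."""
    return [name for bit, name in RIGHTS.items() if mask & bit]


def approx_rwx(mask):
    """Very rough rwx digit: Read -> 4, Write -> 2, Execute -> 1 (a simplification)."""
    return (4 if mask & 0x1 else 0) | (2 if mask & 0x2 else 0) | (1 if mask & 0x20 else 0)


aces = {"Everyone": 0x1200A9, "NTAP\\prof1": 0x1F01FF}
digits = [approx_rwx(mask) for mask in aces.values()]
print(decode_mask(aces["Everyone"]))
print("minimum:", min(digits), "maximum:", max(digits))
```

With these two ACEs the minimum works out to 5 and the maximum to 7, which lines up with the 555 and 777 views.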

.DS_Store files

Mac will leave these little files lying around as users browse shares. In a large environment, that can start to create clutter, so you may want to consider disabling the creation of these on network shares (such as NFS mounts), as per this:

http://hints.macworld.com/article.php?story=2005070300463515

If you have questions, comments or know of some other weirdness in MacOS with NFS, comment below!

How to Configure MacOS to Use Active Directory LDAP for UNIX users/groups

In NetApp ONTAP, it’s possible to serve data to NAS clients over SMB and NFS, including the same datasets. This is known as “multiprotocol NAS” and I cover the best practices for that in the new TR-4887:

TR-4887: Multiprotocol NAS Best Practices in ONTAP

When you do multiprotocol NAS in ONTAP (or really, any storage system), it’s usually best to leverage a centralized repository for user names, group names and numeric IDs that the NAS clients and NAS servers all point to (such as LDAP). That way, you get no surprises when accessing files and folders – users and groups get the expected ownership and permissions.

I cover LDAP in ONTAP in TR-4835:

TR-4835: LDAP in NetApp ONTAP

One of the more commonly implemented solutions for LDAP in environments that serve NFS and SMB is Active Directory. In these environments, you can use either native UNIX attributes or a 3rd party utility, such as Centrify. (Centrify basically just uses the AD UNIX attributes and centralizes the management into a GUI – both are covered in the LDAP TR.)

While most Linux clients are fairly straightforward for LDAP integration, MacOS does things slightly differently. However, it’s pretty easy to configure.

Note: The steps may vary based on your environment configs and this covers just AD LDAP; not OpenLDAP/IDM or other Linux-based LDAP.


Step 1: Ensure the Mac is configured to use the same DNS as the AD domain

This is done via “Network” settings in System Preferences. DNS is important here because it will be used to query the AD domain when we bind the Mac for the Directory Services. In the following, I’ve set the DNS server IP and the search domain to my AD domain “NTAP.LOCAL”:

Then I tested the domain lookup in Terminal:

Step 2: Configure Directory Services to use Active Directory

The “Directory Utility” is what we’ll use here. It’s easiest to use the spotlight search to find it.

Essentially, this process adds the Mac to the Active Directory domain (as you would with a Windows server or a Linux box with “realm join”) and then configures the LDAP client on the Mac to leverage specific attributes for LDAP queries.

In the above, I’ve used uidNumber and gidNumber as the attributes. You can also control/view these via the CLI command “dsconfigad”:

Configure domain access in Directory Utility on Mac

I can see in my Windows AD domain the machine account was created:

A few caveats here about the default behavior for this:

  • LDAP queries will be encrypted by default, so if you’re trying to troubleshoot via packet capture, you won’t see a ton of useful info (such as attributes used for queries). To disable this (mainly for troubleshooting purposes):
$ dsconfigad -packetsign disable -packetencrypt disable
  • MacOS uses sAMAccountName as the user name/uid value, so it should work fine with AD out of the gate
  • MacOS adds additional Mac-specific system groups to the “id” output (such as netaccounts:62, and GIDs 701/702); these may need to be added to LDAP, depending on file ownership
  • LDAP queries to AD from Mac will use the Global Catalog port 3268 by default when using Active Directory config (which I was able to see from a packet capture)

This use of the Global Catalog port can be problematic, as standard LDAP configurations in AD don’t set the Global Catalog to replicate attributes, but rather use the standard ports 389/636 for LDAP communication. AD doesn’t replicate the UNIX attributes across the Global Catalog by default for LDAP, so you’d have to configure that manually (covered in TR-4835) or modify the port the Mac uses for LDAP.

My AD domain does have the attributes that replicate via the Global Catalog, so the LDAP lookups work for me from the Mac:

Here’s what the prof1 user looks like from a CentOS client:

# id prof1
uid=1102(prof1) gid=10002(ProfGroup) groups=10002(ProfGroup),10000(Domain Users),48(apache-group),1101(group1),1202(group2),1203(group3)

This is how that user looks from the ONTAP CLI:

cluster::> set advanced; access-check authentication show-creds -node node1 -vserver DEMO -unix-user-name prof1 -list-name true -list-id true
UNIX UID: 1102 (prof1) <> Windows User: S-1-5-21-3552729481-4032800560-2279794651-1110 (NTAP\prof1 (Windows Domain User))
GID: 10002 (ProfGroup)
Supplementary GIDs:
10002 (ProfGroup)
10000 (Domain Users)
1101 (group1)
1202 (group2)
1203 (group3)
48 (apache-group)
Primary Group SID: S-1-5-21-3552729481-4032800560-2279794651-1111 NTAP\ProfGroup (Windows Domain group)
Windows Membership:
S-1-5-21-3552729481-4032800560-2279794651-1301 NTAP\apache-group (Windows Domain group)
S-1-5-21-3552729481-4032800560-2279794651-1106 NTAP\group2 (Windows Domain group)
S-1-5-21-3552729481-4032800560-2279794651-513 NTAP\DomainUsers (Windows Domain group)
S-1-5-21-3552729481-4032800560-2279794651-1105 NTAP\group1 (Windows Domain group)
S-1-5-21-3552729481-4032800560-2279794651-1107 NTAP\group3 (Windows Domain group)
S-1-5-21-3552729481-4032800560-2279794651-1111 NTAP\ProfGroup (Windows Domain group)
S-1-5-21-3552729481-4032800560-2279794651-1231 NTAP\local-group.ntap (Windows Alias)
S-1-18-2 Service asserted identity (Windows Well known group)
S-1-5-32-551 BUILTIN\Backup Operators (Windows Alias)
S-1-5-32-544 BUILTIN\Administrators (Windows Alias)
S-1-5-32-545 BUILTIN\Users (Windows Alias)
User is also a member of Everyone, Authenticated Users, and Network Users
Privileges (0x22b7):
SeBackupPrivilege
SeRestorePrivilege
SeTakeOwnershipPrivilege
SeSecurityPrivilege
SeChangeNotifyPrivilege
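
A quick way to confirm the client and ONTAP agree is to diff the numeric IDs. The sketch below parses the `id` output from the client and compares its group IDs with the supplementary GIDs from ONTAP's show-creds output above. The function name is mine, and the ONTAP GIDs are typed in by hand here rather than parsed from the CLI.

```python
import re


def parse_id_output(text):
    """Parse `id` output into (uid, gid, {gid: group_name})."""
    uid = int(re.search(r"uid=(\d+)", text).group(1))
    gid = int(re.search(r"gid=(\d+)", text).group(1))
    groups_part = text.split("groups=", 1)[1]
    groups = {int(num): name
              for num, name in re.findall(r"(\d+)\(([^)]*)\)", groups_part)}
    return uid, gid, groups


client = ("uid=1102(prof1) gid=10002(ProfGroup) groups=10002(ProfGroup),"
          "10000(Domain Users),48(apache-group),1101(group1),1202(group2),1203(group3)")
# Supplementary GIDs as listed in the ONTAP show-creds output above.
ontap_gids = {10002, 10000, 1101, 1202, 1203, 48}

uid, gid, groups = parse_id_output(client)
print("uid", uid, "gid", gid)
print("only on client:", set(groups) - ontap_gids)
print("only on ONTAP:", ontap_gids - set(groups))
```

Anything that shows up on only one side is a candidate for unexpected access behavior.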

Most people aren’t going to want/be allowed to crack open ADSIEdit and modify schema attributes, so you’d want to change how MacOS queries LDAP to use port 389 or 636. I’m currently waiting on word of how to do that from Apple, so I’ll update when I get that info. If you are reading this and already know, feel free to add to the comments!

Step 3: Mount the NetApp NFS export and test multiprotocol access

NFS mounts to UNIX security style volumes are pretty straightforward, so we won’t cover that here. Where it gets tricky is when your ONTAP volumes are NTFS security style. When that’s the case, a UNIX -> Windows name mapping occurs when using NFS, as we need to make sure the user trying to access the NTFS permissions truly has access.

This is the basic process:

  • MacOS NFS client sends a numeric UID and GID to the NetApp ONTAP system (if NFSv3 is used)
  • If the volume is NTFS security, ONTAP will try to translate the numeric IDs into user names. The method depends on the cluster config; in this case, we’ll use LDAP.
  • If the numeric IDs map to user names/group names, then ONTAP uses those UNIX names and tries to find a valid Windows name with the same names; if none exist, ONTAP looks for explicit name mapping rules and a default Windows user; if none of those work then access is denied.
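
The steps above can be sketched as a small lookup chain. This is a toy model that follows the order described in the list, not ONTAP's implementation. The function name, the plain dicts/sets standing in for name services, and the "pcuser" default user in the usage note are illustrative; the sample users are borrowed from the examples in this post.

```python
def map_nfs_user(uid, unix_users, windows_users,
                 explicit_rules=None, default_windows_user=None):
    """Toy version of the UNIX -> Windows mapping steps described above."""
    explicit_rules = explicit_rules or {}
    unix_name = unix_users.get(uid)
    if unix_name is None:
        return None, f"no UNIX name for UID {uid}"
    # 1. Implicit mapping: a Windows user with the same name.
    if unix_name in windows_users:
        return unix_name, "implicit mapping"
    # 2. Explicit name mapping rule.
    target = explicit_rules.get(unix_name)
    if target in windows_users:
        return target, "explicit rule"
    # 3. Default Windows user, if one is defined.
    if default_windows_user:
        return default_windows_user, "default Windows user"
    # 4. Nothing worked, so access is denied.
    return None, f"name mapping for UNIX user '{unix_name}' failed"


unix_users = {1102: "prof1", 501: "Podcast"}
windows_users = {"prof1", "student1"}
print(map_nfs_user(1102, unix_users, windows_users))
print(map_nfs_user(501, unix_users, windows_users))
```

Passing something like `default_windows_user="pcuser"` to the second call would turn the failure into a successful (if generic) mapping, which is the trade-off a default Windows user brings.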

I have mounted a volume to my MacOS client that uses NTFS security style.

This is the volume in ONTAP:

::*> vol show -vserver DEMO -volume FG2 -fields security-style
vserver volume security-style
------- ------ --------------
DEMO       FG2           ntfs

MacOS user IDs start at 501, so my “admin” user ID is 501. This user doesn’t exist in LDAP.

ONTAP has a local user named “Podcast” but no valid Windows user mapping:

::*> set advanced; access-check authentication show-creds -node ontap9-tme-8040-01 -vserver DEMO -uid 501
(vserver services access-check authentication show-creds)
Vserver: DEMO (internal ID: 10)
Error: Get user credentials procedure failed
[ 33] Determined UNIX id 501 is UNIX user 'Podcast'
[ 34] Using a cached connection to ntap.local
[ 36] Trying to map 'Podcast' to Windows user 'Podcast' using
implicit mapping
[ 36] Successfully connected to ip 10.193.67.236, port 445
using TCP
[ 46] Successfully authenticated with DC oneway.ntap.local
[ 49] Could not find Windows name 'Podcast'
[ 49] Unable to map 'Podcast'. No default Windows user defined.
**[ 49] FAILURE: Name mapping for UNIX user 'Podcast' failed. No
** mapping found
Error: command failed: Failed to get user credentials. Reason: "SecD Error: Name mapping does not exist".

In addition, MacOS disables root by default:

https://support.apple.com/en-us/HT204012

So when I try to access this mount, it will attempt to use UID 501 and translate it to a UNIX user and then to a Windows user. Since ONTAP can’t translate UID 501 to a valid Windows user, this will fail and we’ll see it in the event log of the ONTAP CLI.

Here’s the access failure:

Here’s the ONTAP error:

ERROR secd.nfsAuth.noNameMap: vserver (DEMO) Cannot map UNIX name to CIFS name. Error: Get user credentials procedure failed
[ 33] Determined UNIX id 501 is UNIX user 'Podcast'
[ 34] Using a cached connection to ntap.local
[ 36] Trying to map 'Podcast' to Windows user 'Podcast' using implicit mapping
[ 36] Successfully connected to ip 10.193.67.236, port 445 using TCP
[ 46] Successfully authenticated with DC oneway.ntap.local
[ 49] Could not find Windows name 'Podcast'
[ 49] Unable to map 'Podcast'. No default Windows user defined.
**[ 49] FAILURE: Name mapping for UNIX user 'Podcast' failed. No mapping found

When I “su” to a user that *does* have a valid Windows user (such as prof1), this works fine and I can touch a file and get the proper owner/group:

Note that in the above, we see “root:wheel” owned folders; just because root is disabled by default on MacOS doesn’t mean that MacOS isn’t aware of the user. Those folders were created on a separate NFS client.

Also, note in the above that the file shows 777 permissions; this is because those are the allowed permissions for the prof1 user on that file. The permissions are defined by Windows/NTFS. Here, they are set to “Everyone:Full Control” by way of file inheritance. These are the new permissions. ProfGroup (with prof1 and student2 as members) gets write access. Administrator gets “Full Control.” Group10 (with only student1 as a member) gets read access.

In ONTAP, you can also control the way NTFS security style files are viewed on NFS clients with the NFS server option -ntacl-display-permissive-perms. TR-4887 covers that option in more detail.

Prof1 access view after permissions change (write access):

Student1 access view after permissions change (read access only via group ACL):

Read works, but write does not (by design!)

Student2 access view (write access defined by membership in ProfGroup):

Newuser1 access view (not a member of any groups in the ACL):

Newuser1 can create a new file, however, and it shows the proper owner. The permissions are 777 because of the inherited NTFS ACLs from the share:

As you can see, we will get the expected access for users and groups on Mac NFS using NTFS security styles, but the expected *views* won’t always line up. This is because there isn’t a direct correlation between NTFS and UNIX permissions, so we deliver an approximation. ONTAP doesn’t store ACLs for both NTFS and UNIX on disk; it only chooses one or the other. If you require exact NFS permission translation via Mac NFS, consider using UNIX security style and mode bits.

Addendum: Squashing root

In the event your MacOS users enable the root account and become the “root” user on the client, you can squash the root user to an anonymous user by using ONTAP’s export policies and rules. TR-4067 covers how to do this:

NFS Best Practices in ONTAP

Let me know if you have questions!

Brand new tech report: Multiprotocol NAS Best Practices in ONTAP

I don’t like to admit to being a procrastinator, but…


Four years ago, I said this:

And people have asked about it a few times since then. To be fair, I did say “will be a ways out…”

In actuality, I started that TR in March of 2017. And then again in February of 2019. And then started all over when the pandemic hit, because what else did I have going on? 🙂

And it’s not like I haven’t done *stuff* in that time.

The trouble was, I do multiprotocol NAS every day, so I think I had writer’s block because I didn’t know where to start and the challenge of writing an entire TR on the subject without making it 100-200 pages like some of the others I’ve written was… daunting. But, it’s finally done. And the actual content is under 100 pages!

Topics include:

  • NFS and SMB best practices/tips
  • Name mapping explanations and best practices
  • Name service information
  • CIFS Symlink information
  • Advanced multiprotocol NAS concepts

Multiprotocol NAS Best Practices in ONTAP

If you have any comments/questions, feel free to comment!

New/Updated NAS Technical Reports! – Spring 2020

With the COVID-19 quarantine, stay at home orders and new 1-year ONTAP release cadence, I’m finding I have a lot more spare time, which translates into time to update old, crusty technical reports!


Some of the old TRs hadn’t been updated for 3 years or so. Much of the information in those still applied, but overall, the TR either had to be retired or needed an update – if only to refresh the publish date and apply new templates.

So, first, let’s cover the grandfather TRs.

Updated TRs

TR-4073: Secure Unified Authentication

This TR was a monolith that I wrote when I first started as a TME back in 2015-ish. It covers LDAP, Kerberos and NFSv4.x for a unified security approach to NFS. The goal was to combine everything into a centralized document, but what ended up happening was I now had a TR that was 250+ pages long. Not only is that hard to read, but it’s also daunting enough to cause people not to want to read it at all. As a result, I made it a goal to break the TR up into more manageable chunks. Eventually, this TR will be deprecated in favor of newer TRs that are shorter and more specific.

TR-4616: NFS Kerberos in ONTAP

I created the NFS Kerberos TR in 2017 to focus only on Kerberos with NFS. To streamline the document, I narrowed the focus to only a set of configuration options (AD KDCs, RHEL clients, newest ONTAP version), removed extraneous details and moved examples/configuration steps to the end of the document. The end result – a 42 page document with the most important information taking up around 30 pages.

However, there hasn’t been an updated version since then. I’m currently in the process of updating that TR and was waiting on some other TRs to be completed before I finished this one. The new revision will include updated information and the page count will rise to around 60-70 pages.

TR-4067: NFS Best Practice Guide

This TR is another of the original documents I created and hasn’t been updated since 2017. It’s currently getting a major overhaul, including reorganizing the content to put the more crucial information at the start of the document and reducing the total page count by roughly 20 pages. Examples and advanced topics were moved to the back of the document and the “meat” of the TR is going to be around 90 pages.

Major changes include:

  • New TR template
  • Performance testing for NFSv3 vs. NFSv4.x
  • New best practice recommendations
  • Security best practices
  • Multiprotocol NAS information
  • Removal of Infinite Volume section
  • NFS credential information

As part of the TR-4073 de-consolidation project, TR-4067 will cover the NFSv4.x aspects.

This TR is nearly done and is undergoing some peer review, so stay tuned!

TR-4523: DNS Load Balancing in ONTAP

This TR was created to cover the DNS load balancing approaches for NAS workloads with ONTAP. It’s pretty short – 35 pages or so – and covers on-box and off-box DNS load balancing.

It was updated in May 2020 and was basically a minor refresh.

New TR

TR-4835: How to Configure LDAP in ONTAP

The final part of the TR-4073 de-consolidation effort was creating an independent LDAP TR. Unlike the NFS Kerberos TR, I wanted this one to cover a wide array of configurations and use cases, so the total length ended up being 135 pages, but the “meat” of the document (the most pertinent information) only takes up around 87 pages.

Sections include, in order:

  • LDAP overview
  • Authentication in ONTAP
  • LDAP Components and Considerations
  • Configuration
  • Common Issues and Troubleshooting
  • Best Practices
  • Appendix/Command Examples

Feedback and comments are welcome!

Using Windows Lightweight Directory Services for UNIX Identity Management with ONTAP

Windows Active Directory domains have been the way to leverage UNIX identity management in environments using Windows, given the tight integration with Kerberos, Windows accounts and ease of use. I cover a lot of this in TR-4073 (with a new LDAP-only TR coming out soon).

But, it doesn’t always fit all use cases.

For example, what if:

  • You don’t have a Windows Active Directory domain?
  • You don’t have access or permission to modify Active Directory domain accounts to use UNIX attributes?
  • You don’t need a full-fledged AD domain with hundreds or thousands of users/groups, but only need a handful of users and groups for a single application?

There are likely other use cases that could apply here, but if you need LDAP without dealing with AD domains, we can leverage something called Windows Active Directory Lightweight Directory Services for identity management and LDAP services.


This also illustrates that pretty much *any* LDAP server can be used with ONTAP.

For example, this is a query from ONTAP to a server running AD LDS:

::*> getxxbyy getgrlist -node node1 -vserver NFS -username lds
(vserver services name-service getxxbyyy getgrlist)
pw_name: lds
Groups: 1101 1102


::*> getxxbyyy getpwbyname -node node1 -vserver NFS -username lds
(vserver services name-service getxxbyyy getpwbyname)
pw_name: lds
pw_passwd:
pw_uid: 1001
pw_gid: 1101
pw_gecos:
pw_dir: 
pw_shell:

The configuration from ONTAP is identical for any LDAP server. In my case, I used the read-only LDAP client schema template named “MS-AD-BIS.”

Basic configuration steps:

  • Create the LDAP client (“ldap client create” using the stock LDAP template MS-AD-BIS)
  • Create/enable LDAP on the SVM (ldap create)
  • Configure DNS on the SVM (dns create/modify)
  • Modify ns-switch to use LDAP (ns-switch modify)

This was my LDAP client configuration:

::*> ldap client show -client-config LDS

Vserver: NFS
Client Configuration Name: LDS
LDAP Server List: PARISI-WIN2019
(DEPRECATED)-LDAP Server List: -
Active Directory Domain: -
Preferred Active Directory Servers: -
Bind Using the Vserver's CIFS Credentials: false
Schema Template: MS-AD-BIS
LDAP Server Port: 389
Query Timeout (sec): 3
Minimum Bind Authentication Level: anonymous
Bind DN (User): administrator
Base DN: CN=LDS-LDAP,DC=PARISI-WIN2019
Base Search Scope: subtree
User DN: -
User Search Scope: subtree
Group DN: -
Group Search Scope: subtree
Netgroup DN: -
Netgroup Search Scope: subtree
Vserver Owns Configuration: false
Use start-tls Over LDAP Connections: false
Enable Netgroup-By-Host Lookup: false
Netgroup-By-Host DN: -
Netgroup-By-Host Scope: subtree
Client Session Security: none
LDAP Referral Chasing: false
Group Membership Filter:

The hardest part of this configuration for me was the AD LDS side, as there are a bunch of manual steps involved.

Configuring AD LDS for LDAP Services

The best guide I came across was this one:

AD LDS Identity Mapping for Services for NFS

Even with this guide, however, there were a few tweaks that needed to be made to the steps. For example, in the guide, they used the AD Schema Manager. But since this isn’t a full AD instance, you might be stuck using ADSI Edit.

One place I also got stuck was an RTFM moment: in my initial configuration, I forgot to create an application partition. This is essential, as this is where you will be creating the OUs/Containers, users and groups for the LDAP services.

When creating the new objects, you’ll want to consider populating the following fields in the users/groups for use with ONTAP (based on the MS-AD-BIS default schema).

Already populated at creation

When you create a new object, “cn” gets autopopulated. This is used to pull the Group names. (User names use “uid” by default)

  • cn

Required

“Required” here means that ONTAP won’t be able to query for the object without these attributes populated properly.

  • uid
  • uidNumber
  • gidNumber
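
If you are scripting object creation, a quick sanity check for those three attributes can save some head scratching later. The following is a minimal Python sketch that treats an LDAP entry as a plain dictionary; the function name and the dictionary layout are assumptions for illustration, and the sample values come from the "lds" user shown earlier.

```python
# Attributes the post lists as required for ONTAP to query the object
REQUIRED_ATTRS = ("uid", "uidNumber", "gidNumber")


def missing_required(entry: dict) -> list:
    """Return the required attributes that are absent or empty in an entry."""
    # Compare against None/"" explicitly so a numeric 0 still counts as populated
    return [attr for attr in REQUIRED_ATTRS if entry.get(attr) in (None, "")]


if __name__ == "__main__":
    lds = {"cn": "lds", "uid": "lds", "uidNumber": 1001, "gidNumber": 1101}
    incomplete = {"cn": "newuser", "uid": "newuser"}

    print(missing_required(lds))         # nothing missing
    print(missing_required(incomplete))  # uidNumber and gidNumber not set
```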

Optional, but recommended

These attributes are recommended, but things will mostly work without setting them (unless you require proper group memberships).

  • unixHomeDirectory
  • memberUid or member (for group memberships)

Optional for specific use cases

This is basically only if you need things like netgroup services or name mapping to Windows users.

  • sAMAccountName (for asymmetric name mappings to Windows)
  • nisObject
  • nisMapName
  • nisMapEntry
  • nisNetgroupTriple
  • memberNisNetgroup
  • loginShell (would need to be added manually)

User and Group creation

For user and group management/creation, you can either use ADSI Edit or PowerShell cmdlets. The ADSI Edit method is covered in the linked guide above.

For PowerShell you can use the standard Get-ADuser/Group and New-ADUser/Group. The catch is that you may have to specify the server – especially if your LDS server is a member of the AD domain.

For example:

PS C:\> get-aduser -Identity lds -server PARISI-WIN2019 -Properties uid,uidNumber,gidNumber,gecos,unixHomeDirectory

DistinguishedName : CN=lds,CN=Accounts,CN=LDS-LDAP,DC=PARISI-WIN2019
Enabled : False
gecos : AD LDS User
gidNumber : 1101
GivenName :
Name : lds
ObjectClass : user
ObjectGUID : 087692ff-ae7f-4171-922a-98accfdfdaa8
SID : S-1-396492173-265181619-2703889971-1168526272-875569285-466579950
Surname :
uid : {lds}
uidNumber : 1001
unixHomeDirectory : /u/lds
UserPrincipalName : lds@NTAP.LOCAL

PS C:\> Get-ADGroup -Identity ldsgroup1 -server PARISI-WIN2019 -Properties gidNumber,member,memberUid

DistinguishedName : CN=ldsgroup1,CN=Accounts,CN=LDS-LDAP,DC=PARISI-WIN2019
gidNumber         : 1101
GroupCategory     : Security
GroupScope        : Global
member            : {CN=lds,CN=Accounts,CN=LDS-LDAP,DC=PARISI-WIN2019}
Name              : ldsgroup1
ObjectClass       : group
ObjectGUID        : 5764424e-aba4-4e68-bff5-b7a6989d3d0c
SID               : S-1-396492173-265181619-2048905208-1111646721-233999250-1998119334

PS C:\> Get-ADGroup -Identity ldsgroup2 -server PARISI-WIN2019 -Properties gidNumber,member,memberUid

DistinguishedName : CN=ldsgroup2,CN=Accounts,CN=LDS-LDAP,DC=PARISI-WIN2019
gidNumber         : 1102
GroupCategory     : Security
GroupScope        : Global
member            : {CN=lds,CN=Accounts,CN=LDS-LDAP,DC=PARISI-WIN2019}
Name              : ldsgroup2
ObjectClass       : group
ObjectGUID        : 8aec090c-0865-4b80-bd4a-723884524ff6
SID               : S-1-396492173-265181619-3623710820-1184275431-3423871139-2897951880

Note that in the group examples, I used “member” – this enables ONTAP to use RFC-2307bis. When you add members using “member,” you use the “Add DN” button in ADSI Edit or the Add-ADGroupMember PowerShell cmdlet. Then you use the full DN of the user.

For example:

PS C:\> Add-ADGroupMember -Identity ldsgroup2 -Members "CN=lds2,CN=Accounts,CN=LDS-LDAP,DC=PARISI-WIN2019" -Server PARISI-WIN2019

PS C:\> Get-ADGroup -Identity ldsgroup2 -server PARISI-WIN2019 -Properties member

DistinguishedName : CN=ldsgroup2,CN=Accounts,CN=LDS-LDAP,DC=PARISI-WIN2019
GroupCategory : Security
GroupScope : Global
member : {CN=lds2,CN=Accounts,CN=LDS-LDAP,DC=PARISI-WIN2019,
CN=lds,CN=Accounts,CN=LDS-LDAP,DC=PARISI-WIN2019}
Name : ldsgroup2
ObjectClass : group
ObjectGUID : 8aec090c-0865-4b80-bd4a-723884524ff6
SID : S-1-396492173-265181619-3623710820-1184275431-3423871139-2897951880

Now ONTAP will be able to query for both groups for that user:

::*> getxxbyy getgrlist -node node1 -vserver NFS -username lds2
(vserver services name-service getxxbyyy getgrlist)
pw_name: lds2
Groups: 1101 1102
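
To make the "member" (RFC-2307bis) behavior a bit more concrete, here is a small Python sketch of how a group list can be derived from full member DNs. This is only a model of the lookup, not how ONTAP implements it; the function and dictionary layout are assumptions, while the DNs and GIDs are the ones from the "lds" examples above.

```python
# Groups as they appear in the Get-ADGroup output above (member holds full DNs)
BASE = "CN=Accounts,CN=LDS-LDAP,DC=PARISI-WIN2019"
GROUPS = [
    {"cn": "ldsgroup1", "gidNumber": 1101, "member": ["CN=lds," + BASE]},
    {"cn": "ldsgroup2", "gidNumber": 1102, "member": ["CN=lds2," + BASE, "CN=lds," + BASE]},
]


def group_list(user_dn: str, primary_gid: int, groups: list) -> list:
    """Collect the primary GID plus the GID of every group listing the user's DN."""
    gids = {primary_gid}
    for group in groups:
        if user_dn in group["member"]:
            gids.add(group["gidNumber"])
    return sorted(gids)


if __name__ == "__main__":
    # User "lds" has pw_gid 1101 and is a member of both groups
    print(group_list("CN=lds," + BASE, 1101, GROUPS))
```

With "memberUid" the match would be on the short user name instead of the DN; that is the main practical difference between the two attributes.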

So there it is – ONTAP using AD LDS for UNIX Identity Management.

Windows NFS? WHO DOES THAT???


Believe it or not, Windows NFS is a thing. Microsoft has its own NFS server and client, which can leverage RFC-compliant NFSv3 calls to a Windows Server running NFS server or to a 3rd party NFS server, such as NetApp ONTAP. It’s actually so popular that NetApp had to re-introduce it in clustered ONTAP (it wasn’t there until ONTAP 8.2.3/8.3.1).

While Windows NFS currently provides NFSv3 clients, they don’t have NFSv4.1 clients – yet. They do provide NFSv4.1 as a server option, though:

https://docs.microsoft.com/en-us/windows-server/storage/nfs/nfs-overview

I cover Windows NFS support in TR-4067 starting on page 116. I am bringing this topic up because it has come up again recently and I wanted to create a quick and easy blog to follow, as well as call out how you can integrate AD LDAP to help identity management.

There are a few things you have to do to get it working in ONTAP.

Specifically:

  • enable -v3-ms-dos-client option on the NFS server
  • enable -showmount on the NFS server – this prevents some weirdness with writing files
  • disable -enable-ejukebox and -v3-connection-drop

The command would look like this:

cluster::> set advanced
cluster::*> nfs server modify -vserver DEMO -v3-ms-dos-client enabled -v3-connection-drop disabled -enable-ejukebox false -showmount enabled
cluster::*> nfs server show -vserver DEMO -fields v3-ms-dos-client,v3-connection-drop,showmount,enable-ejukebox
vserver enable-ejukebox v3-connection-drop showmount v3-ms-dos-client
------- --------------- ------------------ --------- ----------------
DEMO    false           disabled           enabled   enabled

Once that’s done, you can mount via NFS inside Windows clients using the standard “mount” command, provided you’ve enabled the Services for UNIX functionality. There’s plenty of documentation out there for that.

Just by doing the above, here’s an example of a working NFS mount in Windows:

C:\Users\Administrator>mount DEMO:/flexvol X:
X: is now successfully connected to DEMO:/flexvol

The command completed successfully.

Here’s the cluster’s view of that connection:

ontap9-tme-8040::*> network connections active show -node ontap9-tme-8040-0* -service nfs*,mount -remote-ip 10.193.67.236
              Vserver   Interface         Remote
      CID Ctx Name      Name:Local Port   Host:Port            Protocol/Service
--------- --- --------- ----------------- -------------------- ----------------
Node: ontap9-tme-8040-02
2968991376  4 DEMO      data:2049         oneway.ntap.local:931
                                                               TCP/nfs

When I write a file to the mount, there is something that can prove to be an issue, however. Users other than Administrator will write as UID/GID of 4294967294 (-2).

ontap9-tme-8040::*> vserver security file-directory show -vserver DEMO -path /flexvol/student1-nfs.txt

                Vserver: DEMO
              File Path: /flexvol/student1-nfs.txt
      File Inode Number: 1606599
         Security Style: unix
        Effective Style: unix
         DOS Attributes: 20
DOS Attributes in Text: ---A----
Expanded Dos Attributes: -
           UNIX User Id: 4294967294
          UNIX Group Id: 4294967294
         UNIX Mode Bits: 755
UNIX Mode Bits in Text: rwxr-xr-x
                   ACLs: -

That means users won’t show up properly/as desired in UNIX NFS mounts. For example, this is that same file from CentOS:

[root@centos7 /]# cd flexvol
[root@centos7 flexvol]# ls -la | grep student1-nfs
-rwxr-xr-x 1 4294967294 4294967294 0 Feb 5 09:18 student1-nfs.txt
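
If the 4294967294 looks odd, it's simply -2 stored as an unsigned 32-bit integer. A quick Python check shows the relationship in both directions (the helper names are mine, purely for illustration):

```python
import struct


def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit integer as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", value))[0]


def to_unsigned32(value: int) -> int:
    """Reinterpret a signed 32-bit integer as an unsigned 32-bit integer."""
    return value & 0xFFFFFFFF


if __name__ == "__main__":
    print(to_signed32(4294967294))  # the UID/GID seen on the file
    print(to_unsigned32(-2))        # what -2 looks like as an unsigned value
```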

So, how does one fix that?

Configuring Windows NFS clients to negotiate users properly

There are a few ways to have users leverage UID/GID other than -2.

One way is to “squash” every NFS user to the same UID/GID via the old Windows standby – the Windows registry. This is useful if only a single user will be using an NFS client.

This covers how to do that:

https://blogs.msdn.microsoft.com/saponsqlserver/2011/02/03/installation-configuration-of-windows-nfs-client-to-enable-windows-to-mount-a-unix-file-system/

Some of the third party NFS clients (such as Cygwin and Hummingbird/OpenText) will provide local passwd and group file functionality to allow you to leverage more users. In some cases, all this does is add more registry entries.

Another way is to chmod/chown the file after it’s written. But that’s not ideal.

The best way is to leverage an existing name service (such as NIS or LDAP) and have Windows clients query for the UID and GID. If you have one already, great! It’s super easy to set up the client. Just run the following command as an administrator in cmd. My NTAP.LOCAL domain already has an LDAP server set up:

C:\Users\administrator>nfsadmin mapping WIN7-CLIENT config adlookup=yes addomain=NTAP.LOCAL

The settings were successfully updated.

Once I did that, I wrote a new file and the UID/GID was properly represented:

ontap9-tme-8040::*> vserver security file-directory show -vserver DEMO -path /flexvol/prof1-nfs.txt

                Vserver: DEMO
              File Path: /flexvol/prof1-nfs.txt
      File Inode Number: 1606600
         Security Style: unix
        Effective Style: unix
         DOS Attributes: 20
DOS Attributes in Text: ---A----
Expanded Dos Attributes: -
           UNIX User Id: 1100
          UNIX Group Id: 1101
         UNIX Mode Bits: 755
UNIX Mode Bits in Text: rwxr-xr-x
                   ACLs: -

ontap9-tme-8040::*> getxxbyyy getpwbyname -node ontap9-tme-8040-01 -vserver DEMO -username prof1
  (vserver services name-service getxxbyyy getpwbyname)
pw_name: prof1
pw_passwd:
pw_uid: 1100
pw_gid: 1101
pw_gecos:
pw_dir:
pw_shell:

If you’re interested, a packet trace shows that the Windows client will communicate via encrypted LDAP to query the user’s UNIX attribute information:

windows-ldap

An added bonus of having Windows clients query LDAP for UNIX user names and groups for NFS on ONTAP is that if you’re using NTFS security style volumes, you won’t have issues connecting to those mounts.

What breaks when doing NTFS security style?

When a UNIX user attempts to access a volume with NTFS security style ACLs, ONTAP will attempt to map that user to a valid Windows user to make sure Windows ACLs can be calculated. (I cover this in Mixed perceptions with NetApp multiprotocol NAS access)

If a user comes in with the default Windows NFS ID of 4294967294 (which doesn’t translate to a UNIX user), this is what happens.

  • The UNIX user 4294967294 tries to access the mount.
  • ONTAP receives a UID of 4294967294 and attempts to map that to a Windows user
  • That Windows user does not exist, so access is denied. This can manifest as an error (such as when writing a file) or it could just show no files/folder.
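
The sequence above can be sketched in a few lines of Python. This is a simplified model, not ONTAP's implementation: the lookup tables, the function names and the Windows name NTAP\prof1 are hypothetical, and real name mapping has more steps (rules, default users and so on).

```python
from typing import Optional

# Hypothetical lookup tables standing in for the name service and name mapping
UNIX_USERS = {1100: "prof1"}           # UID -> UNIX user name (e.g. from LDAP)
NAME_MAP = {"prof1": "NTAP\\prof1"}    # UNIX name -> Windows user (assumed name)


def resolve_windows_user(uid: int) -> Optional[str]:
    """Map an incoming UID to a Windows user, or None if any step fails."""
    unix_name = UNIX_USERS.get(uid)
    if unix_name is None:
        # Same situation as "Unable to get the name for UNIX user with UID ..."
        return None
    return NAME_MAP.get(unix_name)


def ntfs_access_possible(uid: int) -> bool:
    """NTFS ACLs can only be evaluated once a valid Windows user is known."""
    return resolve_windows_user(uid) is not None


if __name__ == "__main__":
    print(ntfs_access_possible(1100))        # prof1 resolves to a Windows user
    print(ntfs_access_possible(4294967294))  # default Windows NFS ID does not
```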

windows-nfs-ntfs-noaccess.png

windows-nfs-ntfs-noaccess2

That particular folder does have data. It’s just that the user can’t see it:

windows-nfs-ntfs-data-list

In ONTAP, we’d see this error, confirming that the user doesn’t exist:

2/5/2019 14:31:26 ontap9-tme-8040-02
ERROR secd.nfsAuth.problem: vserver (DEMO) General NFS authorization problem. Error: Get user credentials procedure failed
[ 15 ms] Hostname found in Name Service Cache
[ 19] Hostname found in Name Service Cache
[ 23] Successfully connected to ip 10.193.67.236, port 389 using TCP
**[ 28] FAILURE: User ID '4294967294' not found in UNIX authorization source LDAP.
[ 28] Entry for user-id: 4294967294 not found in the current source: LDAP. Ignoring and trying next available source
[ 29] Entry for user-id: 4294967294 not found in the current source: FILES. Entry for user-id: 4294967294 not found in any of the available sources
[ 44] Unable to get the name for UNIX user with UID 4294967294

With LDAP involved, access to the NFS mounted volume with NTFS security works much better, because ONTAP and the client agree that user 1100 is prof1.

windows-nfs-ntfs-data-list-ldap

So, uh… what if I don’t have LDAP or NIS?

Well, in a Windows domain, you ALWAYS have an LDAP server. Active Directory leverages LDAP schemas to store information and any version of Windows Active Directory can be used to look up UNIX users and groups. In fact, the newer versions of Windows make this very easy. In older Windows versions, you had to manually extend the LDAP schema to provide UNIX attributes. Now, UNIX attributes like uid, uidNumber, etc. are all in LDAP by default. All you have to do is populate these values with information. You can even do it via PowerShell cmdlets!

Once you have a working Active Directory LDAP environment, you can then configure ONTAP to communicate with LDAP for UNIX identities and you’re well on your way to having a scalable, functional multiprotocol NAS environment.

The one downside I’ve found with Windows NFS is that it doesn’t always play nicely when you want to use SMB on the same client. Windows gets a bit… confused. I haven’t dug into that a ton, but I’ve seen it enough to express caution. 🙂

Behind the Scenes: Episode 137: Name Services in ONTAP

Welcome to Episode 137, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”


This week on the podcast, we talk Name Services in ONTAP and the introduction of the new global name services cache in ONTAP 9.3 with NAS TME, Chris Hurley (@averageguyX)!

We’ll be taking next week off as we record and prepare for some big announcements coming soon!

Finding the Podcast

The podcast is all finished and up for listening. You can find it on iTunes or SoundCloud or by going to techontappodcast.com.

This week’s episode is here:

Also, if you don’t like using iTunes or SoundCloud, we just added the podcast to Stitcher.

http://www.stitcher.com/podcast/tech-ontap-podcast?refid=stpr

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

Our YouTube channel (episodes uploaded sporadically) is here:

Cache Rules Everything Around Me: New Global Name Service Cache in ONTAP 9.3


In an ONTAP cluster made up of individual nodes with individual hardware resources, it’s useful if a storage administrator can manage the entire cluster as a monolithic entity, without having to worry about what lives where.

Prior to ONTAP 9.3, name service caches were node-centric, for the most part. This could sometimes create scenarios where a cache was stale on one node but recently populated on another. Thus, a client might get different results depending on which physical node the network connection landed on.

The following is pulled right out of the new name services best practices technical report (https://www.netapp.com/us/media/tr-4668.pdf), which acts as an update to TR-4379. I wrote some of this, but most of what’s written here is by the new NFS/Name Services TME, Chris Hurley (@averageguyx). This is basically a copy/paste, but I thought this was a cool enough feature to highlight on its own.

Global Name Services Cache in ONTAP 9.3

ONTAP 9.3 offers a new caching mechanism that moves name service caches out of memory and into a persistent cache that is replicated asynchronously between all nodes in the cluster. This provides more reliability and resilience in the event of failovers, as well as offering higher limits for name service entries due to being cached on disk rather than in node memory.

The name service cache is enabled by default. If legacy cache commands are attempted in ONTAP 9.3 with name service caching enabled, an error will occur, such as the following:

Error: show failed: As name service caching is enabled, "Netgroups" caches no longer exist. Use the command "vserver services name-service cache netgroups members show" (advanced privilege level) to view the corresponding name service cache entries.

The name service caches are controlled in a centralized location, below the name-service cache command set. This provides easier cache management, from configuring caches to clearing stale entries.

The global name service cache can be disabled for individual caches using vserver services name-service cache commands in advanced privilege, but it is not recommended to do so. For more detailed information, please see later sections in this document.

ONTAP also offers the additional benefit of using the caches while external name services are unavailable. If there is an entry in the cache, regardless of whether the entry’s TTL has expired, ONTAP will use that cache entry when external name service servers cannot be reached, thereby providing continued access to data served by the SVM.
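
That "use the expired entry if the name service is down" behavior is a fairly common caching pattern. Here is a minimal Python sketch of it. It is a generic illustration, not ONTAP code; the function names, the dictionary layout and the use of ConnectionError to signal an unreachable server are all assumptions.

```python
import time


def lookup(key, cache, fetch, now=None):
    """Return a cached value, refreshing it when expired.

    fetch(key) must return (value, ttl_seconds) or raise ConnectionError
    when the external name service cannot be reached.
    """
    now = time.time() if now is None else now
    entry = cache.get(key)
    if entry is not None and entry["expires_at"] > now:
        return entry["value"]  # fresh entry, no external query needed
    try:
        value, ttl = fetch(key)
    except ConnectionError:
        if entry is not None:
            # TTL has expired, but the name service is unreachable:
            # keep serving the stale entry so access continues
            return entry["value"]
        raise  # nothing cached at all, so the lookup really fails
    cache[key] = {"value": value, "expires_at": now + ttl}
    return value


def name_service_down(key):
    """Stand-in for an unreachable LDAP/NIS/DNS server."""
    raise ConnectionError("name service unreachable")


if __name__ == "__main__":
    cache = {"prof1": {"value": 1100, "expires_at": 0.0}}  # long expired
    print(lookup("prof1", cache, name_service_down, now=100.0))
```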

Hosts Cache

There are two individual host caches, forward-lookup and reverse-lookup, but the hosts cache settings are controlled as a whole. When a record is retrieved from DNS, the TTL of that record is used for the cache TTL; otherwise, the default TTL in the host cache settings (24 hours) is used. The default for negative entries (host not found) is 60 seconds. Changing DNS settings does not affect the cache contents in any way.

  • The network ping command does not use the name services hosts cache when using a hostname.
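
The TTL selection described above boils down to a three-way choice. Here is a short Python sketch using the default values quoted in the text; the function and constant names are hypothetical, and the real defaults are configurable in the hosts cache settings.

```python
DEFAULT_TTL_SECONDS = 24 * 60 * 60  # default positive TTL from the text (24 hours)
NEGATIVE_TTL_SECONDS = 60           # default for "host not found" entries


def host_cache_ttl(found: bool, record_ttl=None) -> int:
    """Pick the cache TTL for a hosts cache entry."""
    if not found:
        return NEGATIVE_TTL_SECONDS
    # A DNS answer carries its own TTL; fall back to the default otherwise
    return record_ttl if record_ttl is not None else DEFAULT_TTL_SECONDS


if __name__ == "__main__":
    print(host_cache_ttl(True, record_ttl=300))  # TTL taken from the DNS record
    print(host_cache_ttl(True))                  # no record TTL available
    print(host_cache_ttl(False))                 # negative entry
```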

User and Group Cache

The user and group caches consist of three categories: passwd (user), group and group membership.

  • Cluster RBAC access does not use any of the caches

Passwd (User) Cache

User cache consists of two caches, passwd and passwd-by-uid.  The caches only cache the name, uid and gid aspects of the user data to conserve space since the other data such as homedir and shell are irrelevant for NAS access.  When an entry is placed in the passwd cache, the corresponding entry is created in the passwd-by-uid cache.  By the same token, when an entry is deleted from one cache, the corresponding entry will be deleted from the other cache.  If you have an environment where there are disjointed username to uid mappings, there is an option to disable this behavior.

Group Cache

Like the passwd cache, the group cache consists of two caches, group and group-by-gid. When an entry is placed in the group cache, the corresponding entry is created in the group-by-gid cache. By the same token, when an entry is deleted from one cache, the corresponding entry will be deleted from the other cache. The full group membership is not cached, both to conserve space and because it is not necessary for NAS data access; only the group name and gid are cached. If you have an environment where there are disjointed group name to gid mappings, there is an option to disable this behavior.
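
The paired behavior of the passwd/passwd-by-uid and group/group-by-gid caches can be modeled with two dictionaries that are kept in sync. The class below is a toy Python sketch of that idea only; the class name, method names and the `linked` switch (standing in for the option to disable the behavior) are assumptions, not ONTAP internals.

```python
class PairedCache:
    """Toy model of a by-name cache mirrored by a by-id cache."""

    def __init__(self, linked: bool = True):
        self.by_name = {}
        self.by_id = {}
        self.linked = linked  # stand-in for the option that disables mirroring

    def add(self, name: str, id_: int) -> None:
        self.by_name[name] = id_
        if self.linked:
            # placing an entry in one cache creates the twin entry
            self.by_id[id_] = name

    def delete(self, name: str) -> None:
        id_ = self.by_name.pop(name, None)
        if self.linked and id_ is not None:
            # deleting from one cache removes the twin as well
            self.by_id.pop(id_, None)


if __name__ == "__main__":
    groups = PairedCache()
    groups.add("ldsgroup1", 1101)
    print(groups.by_name, groups.by_id)
    groups.delete("ldsgroup1")
    print(groups.by_name, groups.by_id)
```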

Group Membership Cache

In file and NIS environments, there is no efficient way to gather a list of groups a particular user is a member of, so for these environments ONTAP has a group membership cache to provide these efficiencies.  The group membership cache consists of a single cache and contains a list of groups a user is a member of.

Netgroup Cache

Beginning in ONTAP 9.3, the various netgroup caches have been consolidated into two caches: a netgroup.byhost and a netgroup.byname cache. The netgroup.byhost cache is the first cache consulted for the netgroups a host is a part of. Next, if this information is not available, the query reverts to gathering the full netgroup members and comparing that to the host. If the information is not in the cache, the same process is performed against the netgroup ns-switch sources. If a host requesting access via a netgroup is found via the netgroup membership lookup process, that ip-to-netgroup mapping is always added to the netgroup.byhost cache for faster future access. This also leads to needing a lower TTL for the members cache so that changes in netgroup membership can be reflected in the ONTAP caches within the TTL timeframe.
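
The lookup order in that paragraph is easier to follow as code. Below is a simplified Python sketch of the flow; it is a model under stated assumptions rather than ONTAP's implementation. The function name, the dictionary-based caches and the `query_sources` callback (standing in for the netgroup ns-switch sources) are hypothetical, and storing the fetched member list in the byname cache is my assumption.

```python
def host_in_netgroup(host, netgroup, byhost_cache, byname_cache, query_sources):
    """Check whether a host belongs to a netgroup, following the described order."""
    # 1. netgroup.byhost is consulted first
    if netgroup in byhost_cache.get(host, set()):
        return True

    # 2. otherwise compare the host against the cached full member list
    members = byname_cache.get(netgroup)

    # 3. not cached: run the same process against the ns-switch sources
    if members is None:
        members = query_sources(netgroup)
        if members is not None:
            byname_cache[netgroup] = set(members)  # assumption: result is cached

    if members is not None and host in members:
        # a successful membership lookup always lands in the byhost cache
        byhost_cache.setdefault(host, set()).add(netgroup)
        return True
    return False


if __name__ == "__main__":
    byhost, byname = {}, {}
    sources = {"eng": {"10.0.0.5", "10.0.0.6"}}  # made-up netgroup data
    print(host_in_netgroup("10.0.0.5", "eng", byhost, byname, sources.get))
    print(byhost)  # the ip-to-netgroup mapping is now cached
```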

Viewing cache entries

Each of the above name service caches can be viewed. This can be used to confirm whether or not the expected results are being returned from name service servers. Each cache has its own individual options that you can use to filter the results of the cache to find what you are looking for. To view a cache, use the name-service cache <cache> <subcache> show command.

Caches are unique per vserver, so it is suggested to view caches on a per-vserver basis.  Below are some examples of the caches and the options.

ontap9-tme-8040::*> name-service cache hosts forward-lookup show  ?
  (vserver services name-service cache hosts forward-lookup show)
  [ -instance | -fields <fieldname>, ... ]
  [ -vserver <vserver name> ]                                                   *Vserver
  [[-host] <text>]                                                              *Hostname
  [[-protocol] {Any|ICMP|TCP|UDP}]                                              *Protocol (default: *)
  [[-sock-type] {SOCK_ANY|SOCK_STREAM|SOCK_DGRAM|SOCK_RAW}]                     *Sock Type (default: *)
  [[-flags] {FLAG_NONE|AI_PASSIVE|AI_CANONNAME|AI_NUMERICHOST|AI_NUMERICSERV}]  *Flags (default: *)
  [[-family] {Any|Ipv4|Ipv6}]                                                   *Family (default: *)
  [ -canonname <text> ]                                                         *Canonical Name
  [ -ips <IP Address>, ... ]                                                    *IP Addresses
  [ -ip-protocol {Any|ICMP|TCP|UDP}, ... ]                                      *Protocol
  [ -ip-sock-type {SOCK_ANY|SOCK_STREAM|SOCK_DGRAM|SOCK_RAW}, ... ]             *Sock Type
  [ -ip-family {Any|Ipv4|Ipv6}, ... ]                                           *Family
  [ -ip-addr-length <integer>, ... ]                                            *Length
  [ -source {none|files|dns|nis|ldap|netgrp_byname} ]                           *Source of the Entry
  [ -create-time <"MM/DD/YYYY HH:MM:SS"> ]                                      *Create Time
  [ -ttl <integer> ]                                                            *DNS TTL

ontap9-tme-8040::*> name-service cache unix-user user-by-id show
  (vserver services name-service cache unix-user user-by-id show)
Vserver    UID         Name         GID            Source  Create Time
---------- ----------- ------------ -------------- ------- -----------
SVM1       0           root         1              files   1/25/2018 15:07:13
ch-svm-nfs1
           0           root         1              files   1/24/2018 21:59:47
2 entries were displayed.

If there are no entries in a particular cache, the following message will be shown:

ontap9-tme-8040::*> name-service cache netgroups members show
  (vserver services name-service cache netgroups members show)
This table is currently empty.

There you have it! New cache methodology in ONTAP 9.3. If you’re using NAS and name services in ONTAP, it’s highly recommended to go to ONTAP 9.3 to take advantage of this new feature.

NFS Kerberos in ONTAP Primer

Fun fact!

Kerberos was named after Cerberus, the hound of Hades, which protected the gates of the underworld with its three heads of gnashing teeth.


Kerberos in IT security isn’t a whole lot different; it’s pretty effective at stopping intruders and is literally a three-headed monster.

In my day to day role as a Technical Marketing Engineer for NFS, I find that one of the most challenging questions I get is regarding NFS mounts using Kerberos. This is especially true now, as IT organizations are focusing more and more on securing their data and Kerberos is one way to do that. CIFS/SMB already does a nice job of this and it’s pretty easily integrated without having to do a ton on the client or storage side.

With NFS Kerberos, however, there are a ton of moving parts and not a ton of expertise that spans those moving parts. Think for a moment what all is involved here when dealing with ONTAP:

  • DNS
  • KDC server (Key Distribution Center)
  • Client/principal
  • NFS server/principal
  • ONTAP
  • NFS
  • LDAP/name services

This blog post isn’t designed to walk you through all those moving parts; that’s what TR-4073 was written for. Instead, this blog is going to simply walk through the workflow of what happens during an NFS mount using Kerberos and where things can fail/common failure scenarios. This post will focus on Active Directory KDCs, since that’s what I see most and get the most questions on. Other UNIX-based KDCs are either not as widely used, or the admins running them are ninjas that never need any help. 🙂

Common terms

First, let’s cover a few common terms used in NFS Kerberos.

Storage Virtual Machine (SVM)

This is what clustered ONTAP uses to present NAS and SAN storage to clients. SVMs act as tenants within a cluster. Think of them as “virtualized storage blades.”

Key Distribution Center (KDC)

The Kerberos ticket headquarters. This stores all the passwords, objects, etc. for running Kerberos in an environment. In Active Directory, domain controllers are KDCs and replicate to other DCs in the environment, which makes Active Directory an ideal platform to run Kerberos on due to ease of use and familiarity. As a bonus, Active Directory is already primed with UNIX attributes for Identity Management with LDAP. (Note: Windows 2012 has UNIX attributes by default; prior to 2012, you had to manually extend the schema.)

Kerberos principals

Kerberos principals are objects within a KDC that can have tickets assigned. Users can own principals. Machine accounts can own principals. However, simply creating a user or machine account doesn’t mean you have created a principal. Those are stored within the object’s LDAP schema attributes in Active Directory. Generally speaking, it’s one of either:

  • servicePrincipalName (SPN)
  • userPrincipalName (UPN)

These get set when adding computers to a domain (including joining Linux clients), as well as when creating new users (every user gets a UPN). Principals include three different components.

  1. Primary – this defines the type of principal (usually a service such as ldap, nfs, host, etc.) and is followed by a “/”. Not all principals have primary components. For example, most users are simply user@REALM.COM.
  2. Secondary – this defines the name of the principal (such as jimbob)
  3. Realm – This is the Kerberos realm and is usually defined in ALL CAPS and is the name of the domain your principal was added into (such as CONTOSO.COM)
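
To make the three components concrete, here is a minimal Python sketch that splits a principal string into the parts described above. The Principal type and the parse_principal function are names made up for illustration; real Kerberos libraries handle more edge cases (escaped characters, multi-part names) than this does.

```python
from typing import NamedTuple, Optional


class Principal(NamedTuple):
    primary: Optional[str]  # service type such as "nfs" or "host"; None for plain users
    secondary: str          # the name part, such as "jimbob" or "demo.ntap.local"
    realm: str              # Kerberos realm, usually upper case


def parse_principal(text: str) -> Principal:
    """Split 'primary/secondary@REALM' (or 'secondary@REALM') into parts."""
    name, sep, realm = text.rpartition("@")
    if not sep or not name or not realm:
        raise ValueError(f"not a principal with a realm: {text!r}")
    primary, slash, secondary = name.partition("/")
    if not slash:
        # No "/" means there is no primary component (typical for users).
        return Principal(None, name, realm)
    return Principal(primary, secondary, realm)


print(parse_principal("nfs/demo.ntap.local@NTAP.LOCAL"))
print(parse_principal("student1@NTAP.LOCAL"))
```

The first call returns a primary of nfs, while the user principal comes back with no primary at all, which is the distinction the name mapping discussion later in this post depends on.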

Keytabs

The keytab file allows a client or server that is participating in an NFS mount to generate AS (authentication service) ticket requests. Think of this as the principal “logging in” to the KDC, similar to what you’d do with a username and password. Keytab files can make their way to clients in one of two ways.

  1. Manually creating and copying the keytab file to the client (old school)
  2. Using the domain join tool of your choice (realmd, net ads/samba, adcli, etc.) on the client to automatically negotiate the keytab and machine principals on the KDC (recommended)

Keytab files, when created using the domain join tools, will create multiple entries for Kerberos principals. Generally, this will include a service principal name (SPN) for host/shortname@REALM.COM, host/fully.qualified.name@REALM.COM and a UPN for the machine account such as MACHINE$@REALM.COM. The auto-generated keytabs will also include multiple entries for each principal with different encryption types (enctypes). The following is an example of a CentOS 7 box’s keytab joined to an AD domain using realm join:

# klist -kte
Keytab name: FILE:/etc/krb5.keytab
KVNO Timestamp Principal
---- ------------------- ------------------------------------------------------
 3 05/15/2017 18:01:39 host/centos7.ntap.local@NTAP.LOCAL (des-cbc-crc)
 3 05/15/2017 18:01:39 host/centos7.ntap.local@NTAP.LOCAL (des-cbc-md5)
 3 05/15/2017 18:01:39 host/centos7.ntap.local@NTAP.LOCAL (aes128-cts-hmac-sha1-96)
 3 05/15/2017 18:01:39 host/centos7.ntap.local@NTAP.LOCAL (aes256-cts-hmac-sha1-96)
 3 05/15/2017 18:01:39 host/centos7.ntap.local@NTAP.LOCAL (arcfour-hmac)
 3 05/15/2017 18:01:39 host/CENTOS7@NTAP.LOCAL (des-cbc-crc)
 3 05/15/2017 18:01:39 host/CENTOS7@NTAP.LOCAL (des-cbc-md5)
 3 05/15/2017 18:01:39 host/CENTOS7@NTAP.LOCAL (aes128-cts-hmac-sha1-96)
 3 05/15/2017 18:01:39 host/CENTOS7@NTAP.LOCAL (aes256-cts-hmac-sha1-96)
 3 05/15/2017 18:01:39 host/CENTOS7@NTAP.LOCAL (arcfour-hmac)
 3 05/15/2017 18:01:39 CENTOS7$@NTAP.LOCAL (des-cbc-crc)
 3 05/15/2017 18:01:39 CENTOS7$@NTAP.LOCAL (des-cbc-md5)
 3 05/15/2017 18:01:39 CENTOS7$@NTAP.LOCAL (aes128-cts-hmac-sha1-96)
 3 05/15/2017 18:01:39 CENTOS7$@NTAP.LOCAL (aes256-cts-hmac-sha1-96)
 3 05/15/2017 18:01:39 CENTOS7$@NTAP.LOCAL (arcfour-hmac)
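
If you want to see which enctypes each principal carries without scrolling through the whole listing, a small script can group them. This is a rough sketch that assumes the klist -kte layout shown above (KVNO, date, time, principal, then the enctype in parentheses); other klist versions or locales may format lines differently. The helper name enctypes_by_principal is hypothetical.

```python
import re
from collections import defaultdict

# One keytab entry line: KVNO, date, time, principal, (enctype)
ENTRY = re.compile(r"^\s*(\d+)\s+\S+\s+\S+\s+(\S+)\s+\(([^)]+)\)\s*$")


def enctypes_by_principal(klist_output: str) -> dict:
    """Group the enctypes listed by `klist -kte` under each principal."""
    grouped = defaultdict(list)
    for line in klist_output.splitlines():
        match = ENTRY.match(line)
        if match:  # header and separator lines simply do not match
            _kvno, principal, enctype = match.groups()
            grouped[principal].append(enctype)
    return dict(grouped)


# A shortened copy of the listing above, used as sample input.
sample = """\
Keytab name: FILE:/etc/krb5.keytab
KVNO Timestamp Principal
---- ------------------- ------------------------------------------------------
 3 05/15/2017 18:01:39 host/centos7.ntap.local@NTAP.LOCAL (des-cbc-crc)
 3 05/15/2017 18:01:39 host/centos7.ntap.local@NTAP.LOCAL (aes256-cts-hmac-sha1-96)
 3 05/15/2017 18:01:39 CENTOS7$@NTAP.LOCAL (arcfour-hmac)
"""

for principal, enctypes in enctypes_by_principal(sample).items():
    print(principal, enctypes)
```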

Encryption types (enctypes)

Encryption types (or enctypes) are the level of encryption used for the Kerberos conversation. The client and KDC will negotiate the level of enctype used. The client will tell the KDC “hey, I want to use this list of enctypes. Which do you support?” and the KDC will respond “I support these, in order of strongest to weakest. Try using the strongest first.” In the example above, this is the order of enctype strength, from strongest to weakest:

  • AES-256
  • AES-128
  • ARCFOUR-HMAC
  • DES-CBC-MD5
  • DES-CBC-CRC

The reason a keytab file would add weaker enctypes like DES or ARCFOUR is for backwards compatibility. For example, DCs older than Windows 2008 don’t support AES enctypes. In some cases, the enctypes can cause Kerberos issues due to lack of support. Windows 2008 R2 and later don’t support DES unless you explicitly enable it. ARCFOUR isn’t supported in clustered ONTAP for NFS Kerberos. In these cases, it’s good to modify the machine accounts to strictly define which enctypes to use for Kerberos.
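
As a mental model for the "strongest first" idea, the sketch below picks the strongest enctype that both sides have in common, using the strength order from the list above. It is a deliberate simplification and not how a KDC actually implements negotiation; strongest_common and the aes_only policy are hypothetical names and values for illustration.

```python
# Strength order taken from the list above, strongest first.
STRENGTH_ORDER = [
    "aes256-cts-hmac-sha1-96",
    "aes128-cts-hmac-sha1-96",
    "arcfour-hmac",
    "des-cbc-md5",
    "des-cbc-crc",
]


def strongest_common(offered, supported):
    """Return the strongest enctype present in both collections, or None."""
    offered, supported = set(offered), set(supported)
    for enctype in STRENGTH_ORDER:
        if enctype in offered and enctype in supported:
            return enctype
    return None


keytab = STRENGTH_ORDER  # the CentOS 7 keytab above carries all five
aes_only = ["aes256-cts-hmac-sha1-96", "aes128-cts-hmac-sha1-96"]  # hypothetical server policy

print(strongest_common(keytab, aes_only))
print(strongest_common(["arcfour-hmac"], aes_only))
```

The second call returns None, which is the situation to avoid: a client that can only offer an enctype the other side refuses has nothing left to negotiate with.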

What you need before you try mounting

This is a quick list of things that have to be in place before you can expect Kerberos with NFS to work properly. If I left something out, feel free to remind me in the comments. There’s so much info involved that I occasionally forget some things. 🙂

KDC and client – The KDC is a given – in this case, Active Directory. The client would need to have some things installed/configured before you try to join it, including a valid DNS server configuration, Kerberos utilities, etc. This varies depending on client and would be too involved to get into here. Again, TR-4073 would be a good place to start.

DNS entries for all clients and servers participating in the NFS Kerberos operation – this includes forward and reverse (PTR) records for the clients and servers. The DNS friendly names *must* match the SPN names. If they don’t, then when you try to mount, the DNS lookup will find the name hostname1 and use that to look up the SPN host/hostname1. If the SPN was called nfs/hostname2, then the Kerberos attempt will fail with “PRINCIPAL_UNKNOWN.” This is also true for Kerberos in CIFS/SMB environments. In ONTAP, a common mistake people make is they name the CIFS server or NFS Kerberos SPN as the SVM name (such as SVM1), but their DNS names are something totally different (such as cifs.domain.com).
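
A quick way to sanity check the "DNS name must match the SPN" requirement is to do the forward and reverse lookups yourself and compare the result to the host part of the SPN. The sketch below uses Python's socket module for the real lookups, but takes the resolvers as parameters so it can be tried without touching DNS. The function name spn_name_matches_dns and the fake lookup tables are assumptions for illustration only.

```python
import socket


def forward_lookup(hostname: str) -> str:
    """Resolve a hostname to an IPv4 address using the system resolver."""
    return socket.gethostbyname(hostname)


def reverse_lookup(ip: str) -> str:
    """Resolve an IP address back to its primary (PTR) name."""
    return socket.gethostbyaddr(ip)[0]


def spn_name_matches_dns(spn, hostname, forward=forward_lookup, reverse=reverse_lookup):
    """Check that the host part of an SPN matches what DNS returns for hostname.

    Returns (ok, resolved_name). The comparison is case-insensitive.
    """
    ip = forward(hostname)
    resolved = reverse(ip)
    # Drop the realm, then drop the service prefix (for example "nfs/").
    spn_host = spn.split("@", 1)[0].split("/", 1)[-1]
    return spn_host.lower() == resolved.lower(), resolved


# Stand-in resolvers so the example runs without a DNS server.
fake_forward = {"demo": "10.193.67.219"}.__getitem__
fake_reverse = {"10.193.67.219": "demo.ntap.local"}.__getitem__

print(spn_name_matches_dns("nfs/demo.ntap.local@NTAP.LOCAL", "demo", fake_forward, fake_reverse))
print(spn_name_matches_dns("nfs/SVM1@NTAP.LOCAL", "demo", fake_forward, fake_reverse))
```

The second call models the common mistake described above: the SPN was named after the SVM, but DNS hands back a different name, so the check reports a mismatch.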

Valid Kerberos SPNs and UPNs – When you join a Linux client to a domain, the machine account and SPNs are automatically created. However, the UPN is not created. Having no UPN on a machine account can create issues with some Linux services that use Kerberos keytab files to authenticate. For example, RedHat’s LDAP service (SSSD) can fail to bind if using a Kerberos service principal in the configuration via the ldap_sasl_authid option. The error you’d see would be “PRINCIPAL_UNKNOWN” and would drive you batty because it would be using a principal you *know* exists in your environment. That’s because it’s trying to find the UPN, not the SPN. You can manage the SPN and UPN via the Active Directory attributes tab in the advanced features view. You can query whether SPNs exist via the setspn command (use /q to query by SPN name) in the CLI or PowerShell.

PS C:\> setspn /q host/centos7.ntap.local
Checking domain DC=NTAP,DC=local
CN=CENTOS7,CN=Computers,DC=NTAP,DC=local
 HOST/centos7.ntap.local
 HOST/CENTOS7

Existing SPN found!

You can view a user’s UPN and SPN with the following PowerShell command:

PS C:\> Get-ADUser student1 -Properties UserPrincipalName,ServicePrincipalName

DistinguishedName : CN=student1,CN=Users,DC=NTAP,DC=local
Enabled : True
GivenName : student1
Name : student1
ObjectClass : user
ObjectGUID : d5d5b526-bef8-46fa-967b-00ebc77e468d
SamAccountName : student1
SID : S-1-5-21-3552729481-4032800560-2279794651-1108
Surname :
UserPrincipalName : student1@NTAP.local

And a machine account’s with:

PS C:\> Get-ADComputer CENTOS7$ -Properties UserPrincipalName,ServicePrincipalName

DistinguishedName : CN=CENTOS7,CN=Computers,DC=NTAP,DC=local
DNSHostName : centos7.ntap.local
Enabled : True
Name : CENTOS7
ObjectClass : computer
ObjectGUID : 3a50009f-2b40-46ea-9014-3418b8d70bdb
SamAccountName : CENTOS7$
ServicePrincipalName : {HOST/centos7.ntap.local, HOST/CENTOS7}
SID : S-1-5-21-3552729481-4032800560-2279794651-1140
UserPrincipalName : HOST/centos7.ntap.local@NTAP.LOCAL

Network Time Protocol (NTP) – With Kerberos, there is a 5 minute default time skew window. If a client and server/KDC’s time is outside of that window, Kerberos requests will fail with “Access denied” and you’d see time skew errors in the cluster logs. This KB covers it nicely:

https://kb.netapp.com/support/s/article/ka11A0000001V1YQAU/Troubleshooting-Workflow-CIFS-Authentication-failures?language=en_US

A common issue I’ve seen with this is time zone differences or daylight savings issues. I’ve often seen the wall clock time look identical on server and client, but the time zones or month/date differ, causing the skew.

The NTP requirement is actually a “make sure your time is up to date and in sync on everything” requirement, but NTP makes that easier.
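
The "wall clock looks identical but the time zone differs" case is easy to show in a few lines. This sketch compares two timezone-aware timestamps as absolute instants against the 5 minute default window; the within_skew function and the UTC-4 client are assumptions made up for the example.

```python
from datetime import datetime, timedelta, timezone

MAX_SKEW = timedelta(minutes=5)  # default Kerberos time skew window


def within_skew(client_time: datetime, server_time: datetime, max_skew=MAX_SKEW) -> bool:
    """Compare two timezone-aware timestamps as absolute instants."""
    if client_time.tzinfo is None or server_time.tzinfo is None:
        raise ValueError("use timezone-aware datetimes; wall clock alone is ambiguous")
    return abs(client_time - server_time) <= max_skew


utc = timezone.utc
utc_minus_4 = timezone(timedelta(hours=-4))  # hypothetical client time zone

server = datetime(2017, 5, 16, 15, 54, 0, tzinfo=utc)
client_same_wall_clock = datetime(2017, 5, 16, 15, 54, 0, tzinfo=utc_minus_4)
client_in_sync = datetime(2017, 5, 16, 11, 56, 0, tzinfo=utc_minus_4)

print(within_skew(client_same_wall_clock, server))  # False: the instants are 4 hours apart
print(within_skew(client_in_sync, server))          # True: the instants are 2 minutes apart
```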

Kerberos to UNIX name mappings – In ONTAP, we authenticate via name mappings not only for CIFS/SMB, but also for Kerberos. When a client attempts to send an authentication request to the cluster for an AS request or ST (service ticket) request, it has to map to a valid UNIX user. The UNIX user mapping will depend on what type of principal is coming in. If you don’t have a valid name mapping rule, you’d see something like this in the event log:

5/16/2017 10:24:23 ontap9-tme-8040-01
 ERROR secd.nfsAuth.problem: vserver (DEMO) General NFS authorization problem. Error: RPC accept GSS token procedure failed
 [ 8 ms] Acquired NFS service credential for logical interface 1034 (SPN='nfs/demo.ntap.local@NTAP.LOCAL').
 [ 11] GSS_S_COMPLETE: client = 'CENTOS7$@NTAP.LOCAL'
 [ 11] Trying to map SPN 'CENTOS7$@NTAP.LOCAL' to UNIX user 'CENTOS7$' using implicit mapping
 [ 12] Using a cached connection to oneway.ntap.local
**[ 14] FAILURE: User 'CENTOS7$' not found in UNIX authorization source LDAP.
 [ 15] Entry for user-name: CENTOS7$ not found in the current source: LDAP. Ignoring and trying next available source
 [ 15] Entry for user-name: CENTOS7$ not found in the current source: FILES. Entry for user-name: CENTOS7$ not found in any of the available sources
 [ 15] Unable to map SPN 'CENTOS7$@NTAP.LOCAL'
 [ 15] Unable to map Kerberos NFS user 'CENTOS7$@NTAP.LOCAL' to appropriate UNIX user

For service principals (SPNs) such as host/name or nfs/name, the mapping would try to default to the primary component (host or nfs), so you’d need a UNIX user named host or nfs on the local SVM or in a name service like LDAP. Otherwise, you can create static krb-unix name mappings in the SVM to map to whatever user you like. If you want to use wild cards, regex, etc., you can do that. For example, this name mapping rule will map all SPNs coming in as {MACHINE}$@REALM.COM to root.

cluster::*> vserver name-mapping show -vserver DEMO -direction krb-unix -position 1

Vserver: DEMO
 Direction: krb-unix
 Position: 1
 Pattern: (.+)\$@NTAP.LOCAL
 Replacement: root
IP Address with Subnet Mask: -
 Hostname: -

To test the mapping, use diag priv:

cluster::*> diag secd name-mapping show -node node1 -vserver DEMO -direction krb-unix -name CENTOS7$@NTAP.LOCAL

'CENTOS7$@NTAP.LOCAL' maps to 'root'

You can map the SPN to root, pcuser, etc. – as long as the UNIX user exists locally on the SVM or in the name service.
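
If you want to reason about a pattern before adding it to the SVM, you can try it offline. The sketch below applies (pattern, replacement) pairs in position order using Python's re module. Keep in mind that ONTAP's regex dialect and matching rules may not be identical to Python's, so treat this as a scratchpad and not a replacement for the diag secd test above. The second rule and the map_krb_to_unix name are hypothetical.

```python
import re

# (pattern, replacement) pairs in position order. The first mirrors the
# krb-unix rule shown above; the second is a hypothetical extra rule.
RULES = [
    (r"(.+)\$@NTAP.LOCAL", "root"),
    (r"nfs/(.+)@NTAP.LOCAL", "pcuser"),
]


def map_krb_to_unix(principal, rules=RULES):
    """Return the UNIX name from the first matching rule, or None."""
    for pattern, replacement in rules:
        match = re.fullmatch(pattern, principal)
        if match:
            # expand() also resolves backreferences such as \1 in the replacement.
            return match.expand(replacement)
    return None


print(map_krb_to_unix("CENTOS7$@NTAP.LOCAL"))
print(map_krb_to_unix("nfs/demo.ntap.local@NTAP.LOCAL"))
print(map_krb_to_unix("student1@NTAP.LOCAL"))
```

The user principal falls through both rules and returns None here, which in this sketch simply means "no explicit rule applied."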

The workflow

Now that I’ve gotten some basics out of the way (and if you find that I’ve missed some, add to the comments), let’s look at how the workflow for an NFS mount using Kerberos would work, end to end. This is assuming we’ve configured everything correctly and are ready to mount, and that all the export policy rules allow the client to mount NFSv4 and Kerberos. If a mount fails, always check your export policy rules first.

Some common export policy issues include:

  • The export policy doesn’t have any rules configured
  • The vserver/SVM root volume doesn’t allow read access in the export policy rule for traversal of the / mount point in the namespace
  • The export policy has rules, but they are either misconfigured (clientmatch is wrong, read access disallowed, NFS protocol or auth method is disallowed) or they aren’t allowing the client to access the mount (Run export-policy rule show -instance)
  • The wrong/unexpected export policy has been applied to the volume (Run volume show -fields policy)

What’s unfortunate about trying to troubleshoot mounts with NFS Kerberos involved is that, regardless of the failures happening, the client will report:

mount.nfs: access denied by server while mounting

It’s a generic error and isn’t really helpful in diagnosing the issue.

In ONTAP, there is a command in admin privilege to check the export policy access for the client for troubleshooting purposes. Be sure to use it to rule out export issues.

cluster::> export-policy check-access -vserver DEMO -volume flexvol -client-ip 10.193.67.225 -authentication-method krb5 -protocol nfs4 -access-type read-write
 Policy Policy Rule
Path Policy Owner Owner Type Index Access
----------------------------- ---------- --------- ---------- ------ ----------
/ root vsroot volume 1 read
/flexvol default flexvol volume 1 read-write
2 entries were displayed.

The mount command is issued.

In my case, I use NFSv4.x, as that’s the security standard. Mounting without specifying a version will default to the highest NFS version allowed by the client and server, via a client-server negotiation. If NFSv4.x is disabled on the server, the client will fall back to NFSv3.

# mount -o sec=krb5 demo:/flexvol /mnt
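
If you script your mounts, it can help to be explicit about whether you are letting the version negotiate or pinning it. The sketch below only builds the command line and does not run anything; build_mount_command is a hypothetical helper, and the vers= option is the standard Linux NFS mount option that also shows up in the mount output later in this post.

```python
import shlex


def build_mount_command(server, export, mountpoint, sec="krb5", vers=None):
    """Return the argv list for an NFS mount; nothing is executed here."""
    options = [f"sec={sec}"]
    if vers is not None:
        options.append(f"vers={vers}")  # pin the version instead of negotiating
    return ["mount", "-o", ",".join(options), f"{server}:{export}", mountpoint]


negotiated = build_mount_command("demo", "/flexvol", "/mnt")
pinned = build_mount_command("demo", "/flexvol", "/mnt", vers="4.0")

print(shlex.join(negotiated))
print(shlex.join(pinned))
```

Pinning the version means a server with NFSv4.x disabled produces a failed mount instead of a silent fallback to NFSv3, which is usually what you want when Kerberos is a requirement.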

Once the mount command gets issued and Kerberos is specified, a few (ok, a lot of) things happen in the background.

While this stuff happens, the mount command will appear to “hang” as the client, KDC and server suss out if you’re going to be allowed access.

  • DNS lookups are done for the client hostname and server hostname (or reverse lookup of the IP address) to help determine what names are going to be used. Additionally, SRV lookups are done for the LDAP service and Kerberos services in the domain. DNS lookups are happening constantly through this process.
  • The client uses its keytab file to send an authentication service request (AS-REQ) to the KDC, along with what enctypes it has available. The KDC then verifies if the requested principal actually exists in the KDC and if the enctypes are supported.
  • If the enctypes are not supported, or if the principal does not exist, or if there are DUPLICATE principals, the AS-REQ fails. If the principal exists, the KDC will send a successful reply.
  • Then the client will send a Ticket Granting Service request (TGS-REQ) to the KDC. This request is an attempt to look up the NFS service ticket named nfs/name. The name portion of the ticket is generated either via what was typed into the mount command (i.e., demo) or via reverse lookup (if we typed in an IP address to mount). The TGS-REQ will be used later to allow us to obtain a service ticket (ST). The TGS will also negotiate supported enctypes for later. If the TGS-REQ between the KDC and client negotiates an enctype that ONTAP doesn’t support (for example, ARCFOUR), then the mount will fail later in the process.
  • If the TGS-REQ succeeds, a TGS-REP is sent. If the KDC doesn’t support the requested enctypes from the client, we fail here. If the NFS principal doesn’t exist (remember, it has to be in DNS and match exactly), then we fail.
  • Once the TGS is acquired by the NFS client, it presents the ticket to the NFS server in ONTAP via an NFS NULL call. The ticket information includes the NFS service SPN and the enctype used. If the NFS SPN doesn’t match what’s in “kerberos interface show,” the mount fails. If the enctype presented by the client isn’t supported or is disallowed in “permitted enctypes” on the NFS server, the request fails. The client would show “access denied.”
  • The NFS service SPN sent by the client is presented to ONTAP. This is where the krb-unix mapping takes place. ONTAP will first see if a user named “nfs” exists in local files or name services (such as LDAP, where a bind to the LDAP server and lookup takes place). If the user doesn’t exist, it will then check to see if any krb-unix name mapping rules were set explicitly. If no rules exist and mapping fails, ONTAP logs an error on the cluster and the mount fails with “Access denied.” If the mapping works, the mount procedure moves on to the next step.
  • After the NFS service ticket is verified, the client will send SETCLIENTID calls and then the NFSv4.x mount compound call (PUTROOTFH | GETATTR). The client and server are also negotiating the name@domainID string to make sure they match on both sides as part of NFSv4.x security.
  • Then, the client will try to run a series of GETATTR calls to “/” in the path. If we didn’t allow “read” access in the policy rule for “/” (the vsroot volume), we fail. If the ACLs/mode bits on the vsroot volume don’t allow at least traverse permissions, we fail. In a packet trace, we can see that the vsroot volume has only traverse permissions:
    V4 Reply (Call In 268) ACCESS, [Access Denied: RD MD XT], [Allowed: LU DL]

    We can also see that from the cluster CLI (“Everyone” only has “Execute” permissions in this NTFS security style volume):

    cluster::> vserver security file-directory show -vserver DEMO -path / -expand-mask true
    
    Vserver: DEMO
     File Path: /
     File Inode Number: 64
     Security Style: ntfs
     Effective Style: ntfs
     DOS Attributes: 10
     DOS Attributes in Text: ----D---
    Expanded Dos Attributes: 0x10
     ...0 .... .... .... = Offline
     .... ..0. .... .... = Sparse
     .... .... 0... .... = Normal
     .... .... ..0. .... = Archive
     .... .... ...1 .... = Directory
     .... .... .... .0.. = System
     .... .... .... ..0. = Hidden
     .... .... .... ...0 = Read Only
     UNIX User Id: 0
     UNIX Group Id: 0
     UNIX Mode Bits: 777
     UNIX Mode Bits in Text: rwxrwxrwx
     ACLs: NTFS Security Descriptor
     Control:0x9504
    
    1... .... .... .... = Self Relative
     .0.. .... .... .... = RM Control Valid
     ..0. .... .... .... = SACL Protected
     ...1 .... .... .... = DACL Protected
     .... 0... .... .... = SACL Inherited
     .... .1.. .... .... = DACL Inherited
     .... ..0. .... .... = SACL Inherit Required
     .... ...1 .... .... = DACL Inherit Required
     .... .... ..0. .... = SACL Defaulted
     .... .... ...0 .... = SACL Present
     .... .... .... 0... = DACL Defaulted
     .... .... .... .1.. = DACL Present
     .... .... .... ..0. = Group Defaulted
     .... .... .... ...0 = Owner Defaulted
    
    Owner:BUILTIN\Administrators
     Group:BUILTIN\Administrators
     DACL - ACEs
     ALLOW-NTAP\Domain Admins-0x1f01ff-OI|CI
     0... .... .... .... .... .... .... .... = Generic Read
     .0.. .... .... .... .... .... .... .... = Generic Write
     ..0. .... .... .... .... .... .... .... = Generic Execute
     ...0 .... .... .... .... .... .... .... = Generic All
     .... ...0 .... .... .... .... .... .... = System Security
     .... .... ...1 .... .... .... .... .... = Synchronize
     .... .... .... 1... .... .... .... .... = Write Owner
     .... .... .... .1.. .... .... .... .... = Write DAC
     .... .... .... ..1. .... .... .... .... = Read Control
     .... .... .... ...1 .... .... .... .... = Delete
     .... .... .... .... .... ...1 .... .... = Write Attributes
     .... .... .... .... .... .... 1... .... = Read Attributes
     .... .... .... .... .... .... .1.. .... = Delete Child
     .... .... .... .... .... .... ..1. .... = Execute
     .... .... .... .... .... .... ...1 .... = Write EA
     .... .... .... .... .... .... .... 1... = Read EA
     .... .... .... .... .... .... .... .1.. = Append
     .... .... .... .... .... .... .... ..1. = Write
     .... .... .... .... .... .... .... ...1 = Read
    
    ALLOW-Everyone-0x100020-OI|CI
     0... .... .... .... .... .... .... .... = Generic Read
     .0.. .... .... .... .... .... .... .... = Generic Write
     ..0. .... .... .... .... .... .... .... = Generic Execute
     ...0 .... .... .... .... .... .... .... = Generic All
     .... ...0 .... .... .... .... .... .... = System Security
     .... .... ...1 .... .... .... .... .... = Synchronize
     .... .... .... 0... .... .... .... .... = Write Owner
     .... .... .... .0.. .... .... .... .... = Write DAC
     .... .... .... ..0. .... .... .... .... = Read Control
     .... .... .... ...0 .... .... .... .... = Delete
     .... .... .... .... .... ...0 .... .... = Write Attributes
     .... .... .... .... .... .... 0... .... = Read Attributes
     .... .... .... .... .... .... .0.. .... = Delete Child
     .... .... .... .... .... .... ..1. .... = Execute
     .... .... .... .... .... .... ...0 .... = Write EA
     .... .... .... .... .... .... .... 0... = Read EA
     .... .... .... .... .... .... .... .0.. = Append
     .... .... .... .... .... .... .... ..0. = Write
     .... .... .... .... .... .... .... ...0 = Read
  • If we have the appropriate permissions to traverse “/” then the NFS client attempts to find the file handle for the mount point via a LOOKUP call, using the file handle of vsroot in the path. It would look something like this:
    V4 Call (Reply In 271) LOOKUP DH: 0x92605bb8/flexvol
  • If the file handle exists, it gets returned to the client.
  • Then the client uses that file handle to run GETATTRs to see if it can access the mount:
    V4 Call (Reply In 275) GETATTR FH: 0x1f57355e
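
The ACE masks in the CLI output above (0x1f01ff and 0x100020) are just bit fields, and the expanded view is ONTAP decoding them for you. Here is a small sketch that does the same decoding, using the bit positions from the expanded output; the decode_mask helper name is an assumption, and only the specific rights shown in that output are covered (the generic bits are left out).

```python
# Specific access-right bits, matching the rows of the expanded mask output above.
ACCESS_BITS = {
    0x000001: "Read",
    0x000002: "Write",
    0x000004: "Append",
    0x000008: "Read EA",
    0x000010: "Write EA",
    0x000020: "Execute",
    0x000040: "Delete Child",
    0x000080: "Read Attributes",
    0x000100: "Write Attributes",
    0x010000: "Delete",
    0x020000: "Read Control",
    0x040000: "Write DAC",
    0x080000: "Write Owner",
    0x100000: "Synchronize",
}


def decode_mask(mask: int) -> list:
    """Return the names of the rights set in an NTFS ACE access mask."""
    return [name for bit, name in ACCESS_BITS.items() if mask & bit]


print(decode_mask(0x100020))        # the "Everyone" ACE
print(len(decode_mask(0x1F01FF)))   # the "Domain Admins" ACE sets every right listed
```

For the Everyone ACE this yields only Execute and Synchronize, which lines up with the traverse-only behavior seen in the packet trace.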

If all is clear, our mount succeeds!
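
Since the client reports the same "access denied" no matter where things break, it helps to walk the stages in order when troubleshooting. The sketch below condenses the mount workflow described above into an ordered checklist and reports the earliest stage that failed. The stage keys and the first_failure function are hypothetical labels for this example; you would fill in the results yourself from packet traces and cluster logs.

```python
# Ordered stages of a Kerberized NFSv4.x mount, condensed from the list above.
# The keys are hypothetical labels used only by this sketch.
STAGES = [
    ("dns", "DNS forward/reverse lookups resolve client and server names"),
    ("as_req", "AS-REQ: client principal exists, is unique, enctype supported"),
    ("tgs_req", "TGS-REQ: nfs/name SPN exists and the KDC supports the enctype"),
    ("nfs_null", "NFS NULL call: SPN and enctype accepted by the NFS server"),
    ("krb_unix", "krb-unix name mapping resolves to a valid UNIX user"),
    ("export_root", "export policy and permissions allow traversal of /"),
    ("lookup", "LOOKUP and GETATTR on the mount point succeed"),
]


def first_failure(results: dict):
    """Return (key, description) of the earliest failed stage, or None.

    A stage missing from `results` is treated as not yet verified (failed).
    """
    for key, description in STAGES:
        if not results.get(key, False):
            return key, description
    return None


checks = {key: True for key, _ in STAGES}
checks["krb_unix"] = False  # for example, the CENTOS7$ mapping error shown earlier
checks["lookup"] = False    # a later stage; never reached if krb_unix fails

print(first_failure(checks))
```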

But we’re not done… now the user that wants to access the mount has to go through another ticket process. In my case, I used a user named “student1.” This is because a lot of the Kerberos/NFSv4.x requests I get are generated by universities interested in setting up multiprotocol-ready home directories.

When a user like student1 wants to get into a Kerberized NFS mount, they can’t just cd into it. That would look like this:

# su student1
sh-4.2$ cd /mnt
sh: cd: /mnt: Not a directory

Oh look… another useless error! If I were to take that error literally, I would think “that mount doesn’t even exist!” But, it does:

sh-4.2$ mount | grep mnt
demo:/flexvol on /mnt type nfs4 (rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=krb5,clientaddr=10.193.67.225,local_lock=none,addr=10.193.67.219)

What that error actually means is that the user requesting access does not have a valid Kerberos TGT (ticket granting ticket, obtained via the AS “login” exchange) to make the TGS (ticket granting service) request for an NFS service ticket (nfs/server-hostname). We can see that via the klist -e command.

sh-4.2$ klist -e
klist: Credentials cache keyring 'persistent:1301:1301' not found

Before you can get into a mount that is only allowing Kerberos access, you have to get a Kerberos ticket. On Linux, you can do that via the kinit command, which is akin to a Windows login.

sh-4.2$ kinit
Password for student1@NTAP.LOCAL:
sh-4.2$ klist -e
Ticket cache: KEYRING:persistent:1301:1301
Default principal: student1@NTAP.LOCAL

Valid starting Expires Service principal
05/16/2017 15:54:01 05/17/2017 01:54:01 krbtgt/NTAP.LOCAL@NTAP.LOCAL
 renew until 05/23/2017 15:53:58, Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
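
If you need to check for a usable ticket from a script (say, before a job touches a Kerberized mount), you can parse the klist output. This is a minimal sketch that assumes the date format shown above, which varies by locale and klist version; parse_tickets and has_valid_tgt are hypothetical helper names.

```python
from datetime import datetime

TIME_FORMAT = "%m/%d/%Y %H:%M:%S"  # matches the klist output above; locale dependent


def parse_tickets(klist_output: str) -> list:
    """Return (valid_from, expires, service_principal) for each ticket line."""
    tickets = []
    for line in klist_output.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue
        try:
            start = datetime.strptime(" ".join(parts[0:2]), TIME_FORMAT)
            end = datetime.strptime(" ".join(parts[2:4]), TIME_FORMAT)
        except ValueError:
            continue  # five words, but not a ticket line (for example the header)
        tickets.append((start, end, parts[4]))
    return tickets


def has_valid_tgt(tickets, now: datetime) -> bool:
    """True if a krbtgt/ ticket is present and valid at `now`."""
    return any(
        service.startswith("krbtgt/") and start <= now < end
        for start, end, service in tickets
    )


sample = """\
Ticket cache: KEYRING:persistent:1301:1301
Default principal: student1@NTAP.LOCAL

Valid starting Expires Service principal
05/16/2017 15:54:01 05/17/2017 01:54:01 krbtgt/NTAP.LOCAL@NTAP.LOCAL
 renew until 05/23/2017 15:53:58, Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
"""

tickets = parse_tickets(sample)
print(has_valid_tgt(tickets, datetime(2017, 5, 16, 16, 0, 0)))  # inside the validity window
print(has_valid_tgt(tickets, datetime(2017, 5, 17, 2, 0, 0)))   # after the ticket expired
```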

Now that I have my ticket, I can cd into the mount. When I cd into a Kerberized NFS mount, the client will make TGS requests to the KDC (seen in the trace in packet 101) for the service ticket. If that process is successful, we get access:

sh-4.2$ cd /mnt
sh-4.2$ pwd
/mnt
sh-4.2$ ls
c0 c1 c2 c3 c4 c5 c6 c7 newfile2 newfile-nfs4
sh-4.2$ klist -e
Ticket cache: KEYRING:persistent:1301:1301
Default principal: student1@NTAP.LOCAL

Valid starting Expires Service principal
05/16/2017 15:55:32 05/17/2017 01:54:01 nfs/demo.ntap.local@NTAP.LOCAL
 renew until 05/23/2017 15:53:58, Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
05/16/2017 15:54:01 05/17/2017 01:54:01 krbtgt/NTAP.LOCAL@NTAP.LOCAL
 renew until 05/23/2017 15:53:58, Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96

Now we’re done (at least until our tickets expire…).