MacOS NFS Clients with ONTAP – Tips and Considerations

When I’m testing stuff out for customer deployments that I don’t work with a ton, I like to keep notes on the work so I can reference it later for TRs or other things. A blog is a great place to do that, as it might help other people in similar scenarios. This won’t be an exhaustive list, and it is certain to change over time and possibly make its way into a TR, but here we go…

ONTAP and Multiprotocol NAS

Before I get into the MacOS client stuff, we need to understand ONTAP multiprotocol NAS, as it can impact how MacOS clients behave.

In ONTAP, you can serve the same datasets to clients regardless of the NAS protocol they use (SMB or NFS). Many clients can actually do both protocols – MacOS is one of those clients.

The way ONTAP does Multiprotocol NAS (and keeps permissions predictable) is via name mappings and volume “security styles,” which control what kind of ACLs are in use. TR-4887 goes into more detail on how all that works, but at a high level:

NTFS security styles use NTFS ACLs

SMB clients will map to UNIX users and NFS clients will require mappings to valid Windows users for authentication. Then, permissions are controlled via ACLs. Chmod/chown from NFS clients will fail.

UNIX security styles use UNIX mode bits (rwx) and/or NFSv4 ACLs

SMB clients will require mappings to a valid UNIX user for permissions; NFS clients will only require mapping to a UNIX user name if using NFSv4 ACLs. SMB clients can do *some* permissions changes, but on a very limited basis.

Mixed security styles always use either UNIX or NTFS effective security styles, based on last ACL change

Basically, if an NFS client chmods a file, it switches to UNIX security style. If an SMB client changes ownership of the file, it flips back to NTFS security style. This allows you to change permissions from any client, but you need to ensure you have proper name mappings in place to avoid undesired permission behavior. Generally, we recommend avoiding mixed security styles in most cases.
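To make the "last change wins" idea concrete, here is a minimal toy model in Python. It is only a sketch of the behavior described above, not how ONTAP implements it; the class name and the operation names ("chmod", "set_owner", "set_acl") are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class MixedStyleFile:
    """Toy model of one file in a mixed security style volume (hypothetical)."""

    effective_style: str = "unix"

    def apply(self, protocol: str, operation: str) -> str:
        """Apply an operation and return the resulting effective style."""
        # An NFS chmod switches the file to UNIX effective style.
        if protocol == "nfs" and operation == "chmod":
            self.effective_style = "unix"
        # An SMB ownership or ACL change flips it to NTFS effective style.
        elif protocol == "smb" and operation in {"set_owner", "set_acl"}:
            self.effective_style = "ntfs"
        # Anything else (reads, writes, etc.) leaves the style untouched.
        return self.effective_style
```

Calling `apply("smb", "set_owner")` and then `apply("nfs", "chmod")` on the same object flips the style to "ntfs" and back to "unix", which is exactly why name mappings in both directions matter on mixed volumes.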

MacOS NFS Client Considerations

When using MacOS as an NFS client, there are a few things I’ve run into over the past week or two of testing that you’ll want to know about to avoid issues.

MacOS can be configured to use Active Directory LDAP for UNIX Identities

When you’re doing multiprotocol NAS (even if the clients will only do NFS, your volumes might have NTFS style permissions), you want to try to use a centralized name service like LDAP so that ONTAP, SMB clients and NFS clients all agree on who the users are, what groups they belong to, what numeric IDs they have, etc. If ONTAP thinks a user has a numeric ID of 1234 and the client thinks that user has a numeric ID of 5678, then you likely won’t get the access you expected. I wrote up a blog on configuring MacOS clients to use AD LDAP here:

MacOS clients can also be configured to use single sign on with AD and NFS home directories

Your MacOS clients – once added to AD in the blog post above – can now log in using AD accounts. There’s also an additional tab in the Directory Utility that allows you to auto-create home directories when a new user logs in to the MacOS client.

But you can also configure the auto-created home directories to leverage an NFS mount on the ONTAP storage system. You can configure the MacOS client to automount homedirs and then configure the MacOS client to use that path. (This process varies based on Mac version; I’m on 10.14.4 Catalina)

By default, the homedir path is /home in auto_master. We can use that.

Then, chmod the /etc/auto_home file to 644:

$ sudo chmod 644 /etc/auto_home

Create a volume on the ONTAP cluster for the homedirs and ensure it’s able to be mounted from the MacOS clients via the export policy rules (TR-4067 covers export policy rules):

::*> vol show -vserver DEMO -volume machomedirs -fields junction-path,policy
vserver volume      policy  junction-path
------- ----------- ------- ----------------
DEMO    machomedirs default /machomedirs

Create qtrees for each user and set the user/group and desired UNIX permissions:

qtree create -vserver DEMO -volume machomedirs -qtree prof1 -user prof1 -group ProfGroup -unix-permissions 755
qtree create -vserver DEMO -volume machomedirs -qtree student1 -user student1 -group group1 -unix-permissions 755

(For best results on Mac clients, use UNIX security styles.)
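If you have more than a handful of users, the qtree commands above are easy to generate. This is a small Python sketch that only builds the command strings in the same form as the two lines above; the function name is made up, and it does not talk to the cluster.

```python
def qtree_create_commands(vserver, volume, users, permissions="755"):
    """Build one 'qtree create' line per (user, group) pair.

    The flags mirror the commands shown above; nothing is executed here.
    """
    return [
        f"qtree create -vserver {vserver} -volume {volume} -qtree {user} "
        f"-user {user} -group {group} -unix-permissions {permissions}"
        for user, group in users
    ]


if __name__ == "__main__":
    pairs = [("prof1", "ProfGroup"), ("student1", "group1")]
    for line in qtree_create_commands("DEMO", "machomedirs", pairs):
        print(line)
```

Review the generated lines before pasting them into the cluster shell, since the user and group names have to resolve in ONTAP for the ownership to be set.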

Then modify the automount /etc/auto_home file to use that path for homedir mounts. When a user logs in, the homedir will auto mount.

This is the line I used:

* -fstype=nfs nfs://demo:/machomedirs/&
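In that map line, the `*` key matches any directory name requested under /home and the `&` is replaced with that name. The Python sketch below just mimics that substitution so you can see what a given login would try to mount; the function name is hypothetical and it handles only simple three-field entries like the one above.

```python
def expand_map_entry(map_line, key):
    """Expand an automounter map entry for one lookup key.

    Returns (options, location) or None if the entry does not apply.
    Only handles simple 'key options location' lines.
    """
    map_key, options, location = map_line.split(None, 2)
    # '*' is the wildcard key; otherwise the key must match exactly.
    if map_key not in ("*", key):
        return None
    # '&' in the location stands for the key that was looked up.
    return options, location.replace("&", key)
```

For the line above and the key `prof1`, this returns the options `-fstype=nfs` and the location `nfs://demo:/machomedirs/prof1`, which lines up with the mount shown later for /home/prof1.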

And I also add the home mount

Then apply the automount change:

$ sudo automount -cv
automount: /net updated
automount: /home updated
automount: /Network/Servers updated
automount: no unmounts

Now, when I cd to /home/username, it automounts that path:

$ cd /home/prof1
$ mount
demo:/machomedirs/prof1 on /home/prof1 (nfs, nodev, nosuid, automounted, nobrowse)

But if I want that path to be the new homedir path, I would need to log in as that user and then go to “System Preferences -> Users and Groups” and right click the user. Then select “Advanced Options.”

Then you’d need to restart. Once that happens, log in again and when you first open Terminal, it will use the NFS homedir path.

NOTE: You may want to test if the Mac client can manually mount the homedir before testing logins. If the client can’t automount the homedir on login, things will break.

Alternately, you can create a user with the same name as the AD account and then modify the homedir path (this removes the need to log in). The Mac will pick up the correct UID, but the group ID may need to be changed.

If you use SMB shares for your home directories, it’s as easy as selecting “Use UNC path” in the User Experience area of Directory Utility (there’s no way to specify NFS here):

With new logins, the profile will get created in the qtree you created for the homedir (and you’ll go through the typical initial Mac setup screens):

# ls -la
total 28
drwxrwxr-x 6 student1 group1 4096 Apr 14 16:39 .
drwxr-xr-x 6 root root 4096 Apr 14 15:28 ..
drwx------ 2 student1 group1 4096 Apr 14 16:39 Desktop
drwx------ 2 student1 group1 4096 Apr 14 16:35 Downloads
drwxr-xr-x 25 student1 group1 4096 Apr 14 16:39 Library
-rw-r--r-- 1 student1 group1 4096 Apr 14 16:35 ._Library
drwx------ 4 student1 group1 4096 Apr 14 16:35 .Spotlight-V100

When you open Terminal, it automounts the NFS home directory mount for that user and drops you right into your folder!

Mac NFS Considerations, Caveats, Issues

If you’re using NFS on Mac clients, there are two main things to remember:

  • Volumes/qtrees using UNIX security styles work best with NFS in general
  • Terminal/CLI works better than Finder in nearly all instances

If you have to/want to use Finder, or you have to/want to use NTFS security styles for multiprotocol, then there are some things you’d want to keep in mind.

  • If possible, connect the Mac client to the Active Directory domain and use LDAP for UNIX identities as described above.
  • Ensure your users/groups are all resolving properly on the Mac clients and ONTAP system. TR-4887 and TR-4835 cover some commands you can use to check users and groups, name mappings, group memberships, etc.
  • If you’re using NTFS security style volumes/qtrees and want the Finder to work properly for copies to and from the NFS mount, configure the NFS export policy rule to set -ntfs-unix-security-ops to “ignore” – Finder will bail out if ONTAP returns an error, so we want to silently fail those operations (such as SETATTR; see below).
  • When you open a file for reading/writing (such as a text file), Mac creates a ._filename file along with it. Depending on how many files you have in your volume, this can be an issue. For example, if you open 1 million files and Mac creates 1 million corresponding ._filename files, that starts to add up. Don’t worry! You’re not alone: https://apple.stackexchange.com/questions/14980/why-are-dot-underscore-files-created-and-how-can-i-avoid-them
  • If you’re using DFS symlinks, check out this KB: DFS links do not work on MAC OS client, with ONTAP 9.5 and symlinks enabled
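To get a feel for how much the ._filename files add up on a given mount, a quick count helps. This is a minimal Python sketch that walks a directory tree and counts names starting with "._"; the function name is made up, and on a very large volume a full walk like this can itself be slow.

```python
import os


def count_appledouble(root):
    """Return (regular_count, appledouble_count) for files under root."""
    regular = appledouble = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Companion files created by the Mac client start with '._'.
            if name.startswith("._"):
                appledouble += 1
            else:
                regular += 1
    return regular, appledouble
```

Run it against a test directory on the NFS mount before and after opening a few files in Finder to see how quickly the second number grows.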

I’ve also run into some interesting behaviors with Mac/Finder/SMB and junction paths in ONTAP, as covered in this blog:

Workaround for Mac Finder errors when unzipping files in ONTAP

One issue that I did a pretty considerable amount of analysis on was the aforementioned “can’t copy using Finder.” Here are the dirty details…

Permissions Error When Copying a File to an NFS Mount in ONTAP using Finder

In this case, a file copy worked using Terminal, but was failing with permissions errors when using Finder and complaining about the file already existing.

First, it wants a login (which shouldn’t be needed):

Then it says this:

If you select “Replace” this is the error:

If you select “Stop” it stops and you are left with an empty 0 byte “file” – so the copy failed.

If you select “Keep Both” the Finder goes into an infinite loop of 0 byte file creations. I stopped mine at around 2500 files (forced an unmount):

# ls -al | wc -l
1981
# ls -al | wc -l
2004
# ls -al | wc -l
2525

So why does that happen? Well, in a packet trace, I saw the following:

The SETATTR fails on CREATE (expected in NFS operations on NTFS security style volumes in ONTAP, but not expected for NFS clients as per RFC standards):

181  60.900209  x.x.x.x    x.x.x.y      NFS  226  V3 LOOKUP Call (Reply In 182), DH: 0x8ec2d57b/copy-file-finder << Mac NFS client checks if the file exists
182  60.900558  x.x.x.y   x.x.x.x      NFS  186  V3 LOOKUP Reply (Call In 181) Error: NFS3ERR_NOENT << does not exist, so let’s create it!
183  60.900633  x.x.x.x    x.x.x.y      NFS  238  V3 CREATE Call (Reply In 184), DH: 0x8ec2d57b/copy-file-finder Mode: EXCLUSIVE << creates the file
184  60.901179  x.x.x.y   x.x.x.x      NFS  362  V3 CREATE Reply (Call In 183)
185  60.901224  x.x.x.x    x.x.x.y      NFS  238  V3 SETATTR Call (Reply In 186), FH: 0x7b82dffd
186  60.901564  x.x.x.y   x.x.x.x      NFS  214  V3 SETATTR Reply (Call In 185) Error: NFS3ERR_PERM << fails setting attributes, which also fails the copy of the actual file data, so we have a 0 byte file

Then it REMOVES the file (since the initial operation fails) and creates it again, and SETATTR fails again. This is where that “Keep Both” loop behavior takes place.

229 66.995698 x.x.x.x x.x.x.y NFS 210 V3 REMOVE Call (Reply In 230), DH: 0x8ec2d57b/copy-file-finder
233 67.006816 x.x.x.x x.x.x.y NFS 226 V3 LOOKUP Call (Reply In 234), DH: 0x8ec2d57b/copy-file-finder
234 67.007166 x.x.x.y x.x.x.x NFS 186 V3 LOOKUP Reply (Call In 233) Error: NFS3ERR_NOENT
247 67.036056 x.x.x.x x.x.x.y NFS 238 V3 CREATE Call (Reply In 248), DH: 0x8ec2d57b/copy-file-finder Mode: EXCLUSIVE
248 67.037662 x.x.x.y x.x.x.x NFS 362 V3 CREATE Reply (Call In 247)
249 67.037732 x.x.x.x x.x.x.y NFS 238 V3 SETATTR Call (Reply In 250), FH: 0xc33bff48
250 67.038534 x.x.x.y x.x.x.x NFS 214 V3 SETATTR Reply (Call In 249) Error: NFS3ERR_PERM

With Terminal, it operates a little differently. Rather than bailing out after the SETATTR failure, it just retries it:

11 19.954145 x.x.x.x x.x.x.y NFS 226 V3 LOOKUP Call (Reply In 12), DH: 0x8ec2d57b/copy-file-finder
12 19.954496 x.x.x.y x.x.x.x NFS 186 V3 LOOKUP Reply (Call In 11) Error: NFS3ERR_NOENT
13 19.954560 x.x.x.x x.x.x.y NFS 226 V3 LOOKUP Call (Reply In 14), DH: 0x8ec2d57b/copy-file-finder
14 19.954870 x.x.x.y x.x.x.x NFS 186 V3 LOOKUP Reply (Call In 13) Error: NFS3ERR_NOENT
15 19.954930 x.x.x.x x.x.x.y NFS 258 V3 CREATE Call (Reply In 18), DH: 0x8ec2d57b/copy-file-finder Mode: UNCHECKED
16 19.954931 x.x.x.x x.x.x.y NFS 230 V3 LOOKUP Call (Reply In 17), DH: 0x8ec2d57b/._copy-file-finder
17 19.955497 x.x.x.y x.x.x.x NFS 186 V3 LOOKUP Reply (Call In 16) Error: NFS3ERR_NOENT
18 19.957114 x.x.x.y x.x.x.x NFS 362 V3 CREATE Reply (Call In 15)
25 19.959031 x.x.x.x x.x.x.y NFS 238 V3 SETATTR Call (Reply In 26), FH: 0x8bcb16f1
26 19.959512 x.x.x.y x.x.x.x NFS 214 V3 SETATTR Reply (Call In 25) Error: NFS3ERR_PERM
27 19.959796 x.x.x.x x.x.x.y NFS 238 V3 SETATTR Call (Reply In 28), FH: 0x8bcb16f1 << Hey let's try again and ask in a different way!
28 19.960321 x.x.x.y x.x.x.x NFS 214 V3 SETATTR Reply (Call In 27)

The first SETATTR tries to chmod to 700:

Mode: 0700, S_IRUSR, S_IWUSR, S_IXUSR

The retry uses 777. Since the file already shows as 777, it succeeds (because it was basically fooled):

Mode: 0777, S_IRUSR, S_IWUSR, S_IXUSR, S_IRGRP, S_IWGRP, S_IXGRP, S_IROTH, S_IWOTH, S_IXOTH

Since Finder bails on the error, setting the NFS server to return no error here for this export (ntfs-unix-security-ops ignore) on this client allows the copy to succeed. You can create granular rules in your export policy rules to just set that option for your Mac clients.
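The difference between the two clients can be summed up in a toy model. The Python sketch below is only an illustration of the traces above, not real NFS code: the function names are made up, the "fail" setting name is assumed to be the counterpart of "ignore", and the retry-with-displayed-mode logic is a simplification of what the Terminal trace shows.

```python
NFS3_OK = "NFS3_OK"
NFS3ERR_PERM = "NFS3ERR_PERM"


def setattr_reply(requested_mode, displayed_mode, security_ops):
    """Toy server reply for a SETATTR on an NTFS security style file."""
    if requested_mode == displayed_mode:
        return NFS3_OK  # nothing would change, like the 0777 retry
    if security_ops == "ignore":
        return NFS3_OK  # request is silently dropped, no error returned
    return NFS3ERR_PERM  # assumed "fail" behavior: mode change rejected


def finder_copy(security_ops, displayed_mode=0o777):
    """Finder-like client: one SETATTR to 0700, gives up on any error."""
    return setattr_reply(0o700, displayed_mode, security_ops) == NFS3_OK


def terminal_copy(security_ops, displayed_mode=0o777):
    """Terminal-like client: retries with the mode the file already shows."""
    if setattr_reply(0o700, displayed_mode, security_ops) == NFS3_OK:
        return True
    return setattr_reply(displayed_mode, displayed_mode, security_ops) == NFS3_OK
```

In this model `finder_copy("fail")` is False while `terminal_copy("fail")` and `finder_copy("ignore")` are both True, which matches the pattern in the packet traces: the only thing that changes for Finder is whether the server reports the error.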

Now, why do our files all show as 777?

Displaying NTFS Permissions via NFS

Because NFS doesn’t understand NTFS permissions, the job to translate user identities into valid access rights falls onto the shoulders of ONTAP. A UNIX user maps to a Windows user and then that Windows user is evaluated against the folder/file ACLs.

So “777” here doesn’t mean we have wide open access; we only have access based on the Windows ACL. Instead, it just means “the Linux client can’t view the access level for that user.” In most cases, this isn’t a huge problem. But sometimes, you need files/folders not to show 777 (like for applications that don’t allow 777).

In that case, you can control somewhat how NFS clients display NTFS ACLs in “ls” commands with the NFS server option ntacl-display-permissive-perms.

[-ntacl-display-permissive-perms {enabled|disabled}] - Display maximum NT ACL Permissions to NFS Client (privilege: advanced)
This optional parameter controls the permissions that are displayed to NFSv3 and NFSv4 clients on a file or directory that has an NT ACL set. When true, the displayed permissions are based on the maximum access granted by the NT ACL to any user. When false, the displayed permissions are based on the minimum access granted by the NT ACL to any user. The default setting is false.

The default setting of “false” is actually “disabled.” When that option is enabled, this is the file/folder view:

When that option is disabled (the default):

This option is covered in more detail in TR-4067, but it doesn’t require a remount to take effect. It may take some time for the access caches to clear to see the results, however.

Keep in mind that these listings are approximations of the access as seen by the current user. If the option is disabled, you see the minimum access; if the option is enabled, you see the maximum access. For example, the “test” folder above shows 555 when the option is disabled, but 777 when the option is enabled.

These are the actual permissions on that folder:

::*> vserver security file-directory show -vserver DEMO -path /FG2/test
Vserver: DEMO
File Path: /FG2/test
File Inode Number: 10755
Security Style: ntfs
Effective Style: ntfs
DOS Attributes: 10
DOS Attributes in Text: ----D---
Expanded Dos Attributes: -
UNIX User Id: 1102
UNIX Group Id: 10002
UNIX Mode Bits: 777
UNIX Mode Bits in Text: rwxrwxrwx
ACLs: NTFS Security Descriptor
Control:0x8504
Owner:BUILTIN\Administrators
Group:NTAP\ProfGroup
DACL - ACEs
ALLOW-Everyone-0x1200a9-OI|CI (Inherited)
ALLOW-NTAP\prof1-0x1f01ff-OI|CI (Inherited)

Here are the expanded ACLs:

                     Owner:BUILTIN\Administrators
                     Group:NTAP\ProfGroup
                     DACL - ACEs
                       ALLOW-Everyone-0x1200a9-OI|CI (Inherited)
                          0... .... .... .... .... .... .... .... = Generic Read
                          .0.. .... .... .... .... .... .... .... = Generic Write
                          ..0. .... .... .... .... .... .... .... = Generic Execute
                          ...0 .... .... .... .... .... .... .... = Generic All
                          .... ...0 .... .... .... .... .... .... = System Security
                          .... .... ...1 .... .... .... .... .... = Synchronize
                          .... .... .... 0... .... .... .... .... = Write Owner
                          .... .... .... .0.. .... .... .... .... = Write DAC
                          .... .... .... ..1. .... .... .... .... = Read Control
                          .... .... .... ...0 .... .... .... .... = Delete
                          .... .... .... .... .... ...0 .... .... = Write Attributes
                          .... .... .... .... .... .... 1... .... = Read Attributes
                          .... .... .... .... .... .... .0.. .... = Delete Child
                          .... .... .... .... .... .... ..1. .... = Execute
                          .... .... .... .... .... .... ...0 .... = Write EA
                          .... .... .... .... .... .... .... 1... = Read EA
                          .... .... .... .... .... .... .... .0.. = Append
                          .... .... .... .... .... .... .... ..0. = Write
                          .... .... .... .... .... .... .... ...1 = Read

                       ALLOW-NTAP\prof1-0x1f01ff-OI|CI (Inherited)
                          0... .... .... .... .... .... .... .... = Generic Read
                          .0.. .... .... .... .... .... .... .... = Generic Write
                          ..0. .... .... .... .... .... .... .... = Generic Execute
                          ...0 .... .... .... .... .... .... .... = Generic All
                          .... ...0 .... .... .... .... .... .... = System Security
                          .... .... ...1 .... .... .... .... .... = Synchronize
                          .... .... .... 1... .... .... .... .... = Write Owner
                          .... .... .... .1.. .... .... .... .... = Write DAC
                          .... .... .... ..1. .... .... .... .... = Read Control
                          .... .... .... ...1 .... .... .... .... = Delete
                          .... .... .... .... .... ...1 .... .... = Write Attributes
                          .... .... .... .... .... .... 1... .... = Read Attributes
                          .... .... .... .... .... .... .1.. .... = Delete Child
                          .... .... .... .... .... .... ..1. .... = Execute
                          .... .... .... .... .... .... ...1 .... = Write EA
                          .... .... .... .... .... .... .... 1... = Read EA
                          .... .... .... .... .... .... .... .1.. = Append
                          .... .... .... .... .... .... .... ..1. = Write
                          .... .... .... .... .... .... .... ...1 = Read

So, prof1 has Full Control (7) and “Everyone” has Read (5). That’s where the minimum/maximum permissions show up. So you won’t get *exact* permissions here. If you want exact permission views, consider using UNIX security styles.
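The hex masks in those ACEs can be decoded by hand using the bit positions from the expanded listing. Here is a Python sketch that does that and then derives a rough rwx digit per ACE. The bit table comes straight from the listing above, but the rwx mapping and the min/max step are a simplified approximation for illustration and should not be read as ONTAP's actual algorithm.

```python
# Bit values read off the expanded ACE listing above.
RIGHTS = {
    0x000001: "Read",
    0x000002: "Write",
    0x000004: "Append",
    0x000008: "Read EA",
    0x000010: "Write EA",
    0x000020: "Execute",
    0x000040: "Delete Child",
    0x000080: "Read Attributes",
    0x000100: "Write Attributes",
    0x010000: "Delete",
    0x020000: "Read Control",
    0x040000: "Write DAC",
    0x080000: "Write Owner",
    0x100000: "Synchronize",
}


def decode_mask(mask):
    """List the rights whose bits are set in an ACE access mask."""
    return [name for bit, name in RIGHTS.items() if mask & bit]


def approx_rwx(mask):
    """Rough rwx digit: r from Read, w from Write, x from Execute (assumption)."""
    return (4 if mask & 0x01 else 0) | (2 if mask & 0x02 else 0) | (1 if mask & 0x20 else 0)


def displayed_mode(masks, permissive):
    """Pick the maximum digit when permissive, otherwise the minimum."""
    digits = [approx_rwx(m) for m in masks]
    digit = max(digits) if permissive else min(digits)
    return str(digit) * 3
```

With the two masks from this folder, `approx_rwx(0x1200a9)` gives 5 and `approx_rwx(0x1f01ff)` gives 7, so `displayed_mode` returns "555" for the minimum view and "777" for the maximum view.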

DS_Store files

Mac will leave these little files lying around as users browse shares. In a large environment, that can start to create clutter, so you may want to consider disabling the creation of these on network shares (such as NFS mounts), as per this:

http://hints.macworld.com/article.php?story=2005070300463515

If you have questions, comments or know of some other weirdness in MacOS with NFS, comment below!

Brand new tech report: Multiprotocol NAS Best Practices in ONTAP

I don’t like to admit to being a procrastinator, but…


Four years ago, I said this:

And people have asked about it a few times since then. To be fair, I did say “will be a ways out…”

In actuality, I started that TR in March of 2017. And then again in February of 2019. And then started all over when the pandemic hit, because what else did I have going on? :)

And it’s not like I haven’t done *stuff* in that time.

The trouble was, I do multiprotocol NAS every day, so I think I had writer’s block because I didn’t know where to start and the challenge of writing an entire TR on the subject without making it 100-200 pages like some of the others I’ve written was… daunting. But, it’s finally done. And the actual content is under 100 pages!

Topics include:

  • NFS and SMB best practices/tips
  • Name mapping explanations and best practices
  • Name service information
  • CIFS Symlink information
  • Advanced multiprotocol NAS concepts

Multiprotocol NAS Best Practices in ONTAP

If you have any comments/questions, feel free to comment!

Behind the Scenes: Episode 137: Name Services in ONTAP

Welcome to Episode 137, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”


This week on the podcast, we talk Name Services in ONTAP and the introduction of the new global name services cache in ONTAP 9.3 with NAS TME, Chris Hurley (@averageguyX)!

We’ll be taking next week off as we record and prepare for some big announcements coming soon!

Finding the Podcast

The podcast is all finished and up for listening. You can find it on iTunes or SoundCloud or by going to techontappodcast.com.

This week’s episode is here:

Also, if you don’t like using iTunes or SoundCloud, we just added the podcast to Stitcher.

http://www.stitcher.com/podcast/tech-ontap-podcast?refid=stpr

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

Our YouTube channel (episodes uploaded sporadically) is here:

Cache Rules Everything Around Me: New Global Name Service Cache in ONTAP 9.3


In an ONTAP cluster made up of individual nodes with individual hardware resources, it’s useful if a storage administrator can manage the entire cluster as a monolithic entity, without having to worry about what lives where.

Prior to ONTAP 9.3, name service caches were node-centric, for the most part. This could sometimes create scenarios where a cache was stale on one node but recently populated on another. Thus, a client might get different results depending on which physical node handled the network connection.

The following is pulled right out of the new name services best practices technical report (https://www.netapp.com/us/media/tr-4668.pdf), which acts as an update to TR-4379. I wrote some of this, but most of what’s written here is by the new NFS/Name Services TME, Chris Hurley (@averageguyx). This is basically a copy/paste, but I thought this was a cool enough feature to highlight on its own.

Global Name Services Cache in ONTAP 9.3

ONTAP 9.3 offers a new caching mechanism that moves name service caches out of memory and into a persistent cache that is replicated asynchronously between all nodes in the cluster. This provides more reliability and resilience in the event of failovers, as well as offering higher limits for name service entries due to being cached on disk rather than in node memory.

The name service cache is enabled by default. If legacy cache commands are attempted in ONTAP 9.3 with name service caching enabled, an error will occur, such as the following:

Error: show failed: As name service caching is enabled, "Netgroups" caches no longer exist. Use the command "vserver services name-service cache netgroups members show" (advanced privilege level) to view the corresponding name service cache entries.

The name service caches are controlled in a centralized location, below the name-service cache command set. This provides easier cache management, from configuring caches to clearing stale entries.

The global name service cache can be disabled for individual caches using vserver services name-service cache commands in advanced privilege, but it is not recommended to do so. For more detailed information, please see later sections in this document.

ONTAP also offers the additional benefit of using the caches while external name services are unavailable. If there is an entry in the cache, regardless of whether the entry’s TTL has expired, ONTAP will use that cache entry when external name service servers cannot be reached, thereby providing continued access to data served by the SVM.

Hosts Cache

There are two individual host caches, forward-lookup and reverse-lookup, but the hosts cache settings are controlled as a whole. When a record is retrieved from DNS, the TTL of that record is used for the cache TTL; otherwise, the default TTL in the host cache settings (24 hours) is used. The default for negative entries (host not found) is 60 seconds. Changing DNS settings does not affect the cache contents in any way.

  • The network ping command does not use the name services hosts cache when using a hostname.
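The TTL rules above, plus the "use the cached entry when name services are down" behavior, can be sketched as a small cache class. This Python example is a toy model under stated assumptions, not ONTAP code: the class name, the resolver callable, and the choice of LookupError for "host not found" and ConnectionError for "server unreachable" are all made up for illustration.

```python
import time


class HostCache:
    DEFAULT_TTL = 24 * 60 * 60  # 24 hours, the default TTL from the text
    NEGATIVE_TTL = 60  # seconds, the negative-entry default from the text

    def __init__(self, resolver, clock=time.monotonic):
        # resolver(name) returns (addresses, ttl_or_None); it raises
        # LookupError for "not found" and ConnectionError if unreachable.
        self._resolver = resolver
        self._clock = clock
        self._entries = {}  # name -> (addresses or None, expires_at)

    def lookup(self, name):
        now = self._clock()
        entry = self._entries.get(name)
        if entry and entry[1] > now:
            return entry[0]  # fresh entry (positive or negative)
        try:
            addresses, ttl = self._resolver(name)
        except LookupError:
            # Host not found: remember that for the negative TTL.
            self._entries[name] = (None, now + self.NEGATIVE_TTL)
            return None
        except ConnectionError:
            # Name service unreachable: fall back to the expired entry.
            return entry[0] if entry else None
        lifetime = ttl if ttl is not None else self.DEFAULT_TTL
        self._entries[name] = (addresses, now + lifetime)
        return addresses
```

The injectable clock is only there so the expiry logic can be exercised without waiting; in real use the default monotonic clock is fine.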

User and Group Cache

The user and group caches consist of three categories: passwd (user), group and group membership.

  • Cluster RBAC access does not use any of the caches

Passwd (User) Cache

User cache consists of two caches, passwd and passwd-by-uid.  The caches only cache the name, uid and gid aspects of the user data to conserve space since the other data such as homedir and shell are irrelevant for NAS access.  When an entry is placed in the passwd cache, the corresponding entry is created in the passwd-by-uid cache.  By the same token, when an entry is deleted from one cache, the corresponding entry will be deleted from the other cache.  If you have an environment where there are disjointed username to uid mappings, there is an option to disable this behavior.

Group Cache

Like the passwd cache, the group cache consists of two caches, group and group-by-gid. When an entry is placed in the group cache, the corresponding entry is created in the group-by-gid cache. By the same token, when an entry is deleted from one cache, the corresponding entry will be deleted from the other cache. To conserve space, the full group membership is not cached, as it is not necessary for NAS data access; only the group name and gid are cached. If you have an environment where there are disjointed group name to gid mappings, there is an option to disable this behavior.
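The paired by-name and by-ID behavior for the passwd and group caches can be pictured as two dictionaries kept in step. This is a minimal Python sketch of that idea only; the class and the `linked` flag (standing in for the "option to disable this behavior") are hypothetical names, and the real caches also carry fields like the gid and source.

```python
class PairedCache:
    """Toy by-name / by-ID cache pair, kept in sync unless unlinked."""

    def __init__(self, linked=True):
        self.by_name = {}
        self.by_id = {}
        self.linked = linked

    def put(self, name, numeric_id):
        self.by_name[name] = numeric_id
        # With linking on, the reverse entry is created at the same time.
        if self.linked:
            self.by_id[numeric_id] = name

    def delete_name(self, name):
        numeric_id = self.by_name.pop(name, None)
        # With linking on, deleting one side also deletes the other.
        if self.linked and numeric_id is not None:
            self.by_id.pop(numeric_id, None)
```

With `linked=False`, a `put` only touches the by-name side, which is the shape you would want when names and numeric IDs do not map one to one.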

Group Membership Cache

In file and NIS environments, there is no efficient way to gather a list of groups a particular user is a member of, so for these environments ONTAP has a group membership cache to provide these efficiencies.  The group membership cache consists of a single cache and contains a list of groups a user is a member of.

Netgroup Cache

Beginning in ONTAP 9.3, the various netgroup caches have been consolidated into two caches: netgroup.byhost and netgroup.byname. The netgroup.byhost cache is the first cache consulted for the netgroups a host is a part of. Next, if this information is not available, then the query reverts to gathering the full netgroup members and comparing that to the host. If the information is not in the cache, then the same process is performed against the netgroup ns-switch sources. If a host requesting access via a netgroup is found via the netgroup membership lookup process, that ip-to-netgroup mapping is always added to the netgroup.byhost cache for faster future access. This also leads to needing a lower TTL for the members cache so that changes in netgroup membership can be reflected in the ONTAP caches within the TTL timeframe.
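The lookup order described above can be sketched in a few lines of Python. This is a simplified illustration, not ONTAP's implementation: the function name is made up, the caches are plain dictionaries, `fetch_members` stands in for the ns-switch sources, and TTL handling is left out entirely.

```python
def host_in_netgroup(ip, netgroup, byhost_cache, members_cache, fetch_members):
    """Check netgroup membership using the cache order described above."""
    # 1. The byhost cache is consulted first.
    if netgroup in byhost_cache.get(ip, set()):
        return True
    # 2. Otherwise use the cached full member list, if there is one.
    members = members_cache.get(netgroup)
    if members is None:
        # 3. Not cached: go to the name service sources and cache the result.
        members = fetch_members(netgroup)
        members_cache[netgroup] = members
    if ip in members:
        # A match is always added to the byhost cache for faster future access.
        byhost_cache.setdefault(ip, set()).add(netgroup)
        return True
    return False
```

Note how a host removed from the netgroup would keep matching from the byhost cache until that entry goes away, which is the reason the text points to TTLs for picking up membership changes.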

Viewing cache entries

Each of the above name service caches can be viewed. This can be used to confirm whether the expected results are being returned from the name service servers. Each cache has its own individual options that you can use to filter the results and find what you are looking for. To view a cache, use the name-service cache <cache> <subcache> show command.

Caches are unique per vserver, so it is suggested to view caches on a per-vserver basis.  Below are some examples of the caches and the options.

ontap9-tme-8040::*> name-service cache hosts forward-lookup show  ?
  (vserver services name-service cache hosts forward-lookup show)
  [ -instance | -fields <fieldname>, ... ]
  [ -vserver <vserver name> ]                                                   *Vserver
  [[-host] <text>]                                                              *Hostname
  [[-protocol] {Any|ICMP|TCP|UDP}]                                              *Protocol (default: *)
  [[-sock-type] {SOCK_ANY|SOCK_STREAM|SOCK_DGRAM|SOCK_RAW}]                     *Sock Type (default: *)
  [[-flags] {FLAG_NONE|AI_PASSIVE|AI_CANONNAME|AI_NUMERICHOST|AI_NUMERICSERV}]  *Flags (default: *)
  [[-family] {Any|Ipv4|Ipv6}]                                                   *Family (default: *)
  [ -canonname <text> ]                                                         *Canonical Name
  [ -ips <IP Address>, ... ]                                                    *IP Addresses
  [ -ip-protocol {Any|ICMP|TCP|UDP}, ... ]                                      *Protocol
  [ -ip-sock-type {SOCK_ANY|SOCK_STREAM|SOCK_DGRAM|SOCK_RAW}, ... ]             *Sock Type
  [ -ip-family {Any|Ipv4|Ipv6}, ... ]                                           *Family
  [ -ip-addr-length <integer>, ... ]                                            *Length
  [ -source {none|files|dns|nis|ldap|netgrp_byname} ]                           *Source of the Entry
  [ -create-time <"MM/DD/YYYY HH:MM:SS"> ]                                      *Create Time
  [ -ttl <integer> ]                                                            *DNS TTL


ontap9-tme-8040::*> name-service cache unix-user user-by-id show
  (vserver services name-service cache unix-user user-by-id show)
Vserver    UID         Name         GID            Source  Create Time
---------- ----------- ------------ -------------- ------- -----------
SVM1       0           root         1              files   1/25/2018 15:07:13
ch-svm-nfs1
           0           root         1              files   1/24/2018 21:59:47
2 entries were displayed.

If there are no entries in a particular cache, the following message will be shown:

ontap9-tme-8040::*> name-service cache netgroups members show
  (vserver services name-service cache netgroups members show)
This table is currently empty.

There you have it! New cache methodology in ONTAP 9.3. If you’re using NAS and name services in ONTAP, it’s highly recommended to go to ONTAP 9.3 to take advantage of this new feature.

TECH::TR-4379 Name Services Best Practices in clustered Data ONTAP updated for 8.3.1!

It’s time for new technical report updates!

Since clustered Data ONTAP 8.3.1 is now available, we are publishing our 8.3.1 updates to our docs.


TR-4379: Name Services Best Practices covers a wide range of considerations when using external name services like LDAP, DNS and NIS with your clustered Data ONTAP storage system. External name services are critical to NAS environments, as they help control identity management, Kerberos authentication, hostname resolution, netgroups and export policy rule access.

What’s new in TR-4379?

  • Dynamic DNS support information for 8.3.1
  • Clarification and updates on existing best practices
  • Improved information on name server best practices
  • Upgrade considerations

Where can I find it?

Technical reports can be found in a variety of ways. Google search works, as does looking in the NetApp library. I cover how to be better at NetApp documentation in a separate blog post.

To make it super easy, just follow this link:

TR-4379: Name Services Best Practices

TECH::Spring cleaning for your NAS environment

The other day, I was on a customer call, helping with some NAS/netgroup configuration. We were running some tests connecting to an LDAP server to fetch netgroups when I noticed that a netgroup of 100 hosts only returned four IP addresses. FOUR!

There wasn’t anything broken from the storage side. Instead, it was the netgroup – of the 100 hosts, only 4 still existed in DNS.


April mis-configurations bring disrupted summer vacations

Every spring, I like to clean out the garage, clean the grill grates, and eventually, spray off the thick coat of North Carolina pollen that has caked itself on. If I were to let this stuff go all year, I’d probably be featured on some reality show like Hoarders.

The same mentality could be applied to your NAS environment maintenance.

  • Removed hosts from the network? Remove them from DNS.
  • Removed hosts from DNS? Remove them from netgroups.
  • Removed netgroups? Remove them from export policies and rules.
  • Changed IP addresses? Make sure those changes are applied everywhere.
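
If you want to automate part of that cleanup, here's a minimal sketch of the idea in Python: pull the host field out of each netgroup triple and report the hosts that no longer resolve. The function names and the sample netgroup line are made up for illustration, and it assumes the classic `(host,user,domain)` triple format. The resolver is passed in as a parameter so you can swap in a different lookup (or a fake one for testing).

```python
import re
import socket

# Matches one netgroup triple: (host,user,domain)
TRIPLE = re.compile(r"\(([^,()]*),([^,()]*),([^,()]*)\)")


def netgroup_hosts(line):
    """Extract the host field from every (host,user,domain) triple on a line."""
    hosts = []
    for host, _user, _domain in TRIPLE.findall(line):
        host = host.strip()
        # An empty host field is a wildcard and "-" means "no host"; skip both.
        if host and host != "-":
            hosts.append(host)
    return hosts


def find_stale_hosts(hosts, resolve=socket.gethostbyname):
    """Return the hosts that fail to resolve (candidates for cleanup)."""
    stale = []
    for host in hosts:
        try:
            resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            stale.append(host)
    return stale
```

Something like `find_stale_hosts(netgroup_hosts("trusted (host1.example.com,,) (host2.example.com,,)"))` would then hand back the entries that need attention. A lookup can also fail because the DNS server itself is unreachable, so treat the result as a list to investigate, not a list to delete.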

If you don’t keep up with your NAS environment, you might be getting calls from your users and/or customers at hours or times you don’t appreciate. Once, on my on-call weekend when I was in support, I had to work a case at 3AM. At the beach. In a hotel parking lot. Stealing wi-fi.

The root cause? Someone mis-configured something.

The fact that I was at the beach and had to be in a hotel parking lot stealing wi-fi was due to my own poor planning. Everyone loses!

Centralize and organize

Some environments still use flat files for hosts, netgroups, etc. While that’s ok, you should start considering consolidating those files into a centralized name service like LDAP, NIS, DNS, etc. After all, it’s a lot easier to make a change on one server than on 600.

If you move away from flat files, you make your life easier, bottom line. And you make your spring cleaning efforts that much more bearable.

What else can you do?

Along with spring cleaning/regular maintenance of your name services, be sure to follow best practices for your NAS environment. For clustered Data ONTAP, I’ve recently published an update to TR-4379: Name Services Best Practices, which covers best practices for many scenarios. For example… Using short hostnames? Don’t. Use FQDNs whenever possible. Short names force the DNS client to figure out the DNS zone, which adds latency to requests.
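
To see where that extra latency comes from, here's a simplified model of what a resolver with a search list does. This is only a sketch: it assumes resolv.conf-style `search` behavior with `ndots:1`, the function name and domains are hypothetical, and real resolvers (including the one in ONTAP) may differ in the details.

```python
def candidate_queries(name, search_domains, ndots=1):
    """List the names a resolver might try, in order, for a given hostname."""
    if name.endswith("."):
        # Trailing dot: the name is already absolute, so there is one query.
        return [name]
    absolute = name + "."
    searched = [f"{name}.{domain.strip('.')}." for domain in search_domains]
    if name.count(".") >= ndots:
        # Enough dots: try the name as given first, then the search list.
        return [absolute] + searched
    # Short name: walk the search list first, then try the bare name.
    return searched + [absolute]
```

With a search list of `eng.example.com` and `example.com`, the short name `filer1` can turn into three queries before anything answers, while the fully qualified `filer1.example.com.` is a single query.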

Also, pro-tip: If you’re on-call, try to remember not to also be on vacation.

TECH::There’s no place like 127.0.0.1. (But for everywhere else, use DNS.)

One of my favorite IT jokes is “there’s no place like 127.0.0.1.” You can get this slogan emblazoned on t-shirts, welcome mats, etc.

127.0.0.1 is, of course, localhost or the loopback address. Every device on a network has one. However, for addresses that need to be resolvable outside of the internal subsystem, we need MAC addresses, IP addresses and in most cases, routing and DNS. Think of it this way – 127.0.0.1 is your bedroom door. That doesn’t help people find your house when you invite them over, however.

Guess who’s coming to dinner?

When you have people over, you need to give them information to get them to your house. In today’s age, that’s as easy as telling someone a street number and name that they can plug into a GPS or Google maps. No more having to give step-by-step directions!

But even giving that much information can be too much, especially if that person comes over a lot (but has a terrible memory). So, in those cases, an address can be saved as a shortcut in a map app or GPS with an alias such as “Justin’s house.”

This is not unlike how MAC and IP addresses work. A MAC address is the physical pavement of the road. An IP is the street number and name. The aliased short cut? That’s the hostname.

The hostname can be served locally via a flat file, or in a database like DNS, LDAP or even NIS. Then clients and servers can query the common database for the information and use that information to find their way around the IT village.

This may all seem rudimentary to you; that’s because it is. 🙂

But you would be surprised how often DNS/hostname resolution comes up in support cases, configuration issues, etc. The reason for that is two-fold.

1) People do not fully understand DNS/hostname resolution

2) People take DNS/hostname resolution for granted

What is DNS?

To cover #1, let’s talk about DNS and what it is/does.

DNS is short for Domain Name System. It’s a centralized database that contains hostnames, IP addresses, service records, aliases, zones… all sorts of things that allow enterprise IT environments to leverage it for day-to-day operations. By default, DNS is included in Active Directory domain deployments. It has to be – otherwise, AD would not function very well/at all. If you want to read more about that, see the following:

How DNS support for Active Directory works

Active Directory-Integrated DNS

Configure a DNS server for use with Active Directory

However, DNS isn’t just used for Active Directory and isn’t isolated to only Windows environments. DNS has been around for a long time and is critical in numerous widely used IT services, including:

  • NAS (NFS and SMB)
  • Kerberos
  • Microsoft Exchange
  • LDAP
  • Various other 3rd party applications

The above list is by no means complete, but gives a general idea of how integral DNS is to day-to-day operations in IT shops.

What is so difficult about DNS?

DNS is not extremely complicated. However, there are general high-level concepts that get mistaken from time to time.

Servers

DNS servers themselves are concepts that can get lost on people. These contain the records, zones, etc. They also may replicate across the network to other DNS servers. They require specific functionality, such as being able to listen for DNS requests on port 53, caching requests, acting as authoritative servers (SOA) for DNS updates, etc.
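
If you’ve never looked at what actually lands on port 53, here’s a rough sketch that builds a standard DNS query message by hand, following the wire format in RFC 1035. The function name and the query ID are arbitrary choices for illustration. It only builds the bytes; sending them to a server over UDP port 53 is left out on purpose.

```python
import struct


def build_dns_query(hostname, query_id=0x1234, qtype=1):
    """Build a DNS query for `hostname` (qtype 1 = A record, class IN)."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question,
    # 0 answers, 0 authority records, 0 additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)

    # Question name: each label is prefixed with its length, and the
    # whole name is terminated by a zero-length label.
    qname = b""
    for label in hostname.rstrip(".").split("."):
        encoded = label.encode("ascii")
        if not 0 < len(encoded) <= 63:
            raise ValueError("DNS labels must be 1 to 63 bytes long")
        qname += bytes([len(encoded)]) + encoded
    qname += b"\x00"

    # Question type and class (1 = IN).
    return header + qname + struct.pack("!HH", qtype, 1)
```

This is the same kind of message you’ll see in a packet trace when a client asks for an A record, which makes it a handy reference when you’re staring at a capture.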

Records

This is one thing that trips a lot of people up, mainly because there are many different types of records. Some of the main/common ones include:

  • A/AAAA records (for IPv4/IPv6 addresses)
  • CNAMEs (aliases)
  • MX (mail exchange)
  • NS (name server)
  • PTR records (pointer/reverse lookup)
  • SOA (start of authority)
  • SRV (service records such as LDAP, Kerberos KDC, etc)

Zones

Zones are used to direct requests from clients to their appropriate locations and/or forward them to other name servers. For example, dns.windows.com might be the name of the Active Directory domain, but you might also have DNS zones in other locations that exist on other name servers. If so, you could add a zone (such as bind.linux.com) and add NS records to forward requests on to the appropriate name servers running BIND. This allows for improved performance of lookups, as well as scalable DNS environments.

NetApp’s clustered Data ONTAP actually allows storage admins to configure individual data LIFs as name servers to act as DNS zones in a Storage Virtual Machine. This comes in handy for intelligent DNS load balancing in clusters and is covered in TR-4073: Secure Unified Authentication on page 27.

Whither DNS?

There is plenty more to DNS than the above. However, if you already know and understand DNS, you can see why it’s easy to overlook it and take it for granted. When configured properly, it just works. It’s not fancy. It’s generally robust and resilient. And with DDNS, you don’t even have to go in and add records to existing DNS servers. Clients do it for you. So when a problem *does* occur, it becomes a “forest for the trees” problem where DNS is one of the last places many admins look. This is a mistake – DNS should be one of the first things checked off the list as “not a problem” when troubleshooting, as it’s so important to so many things in IT.

Best Practices

Most DNS servers out there have documented best practices, and any best practice for a DNS server should come from a vendor. However, there are universal best practices that are pretty much no-brainers when it comes to managing DNS.

  • Use multiple DNS servers: This provides redundancy, eliminates single points of failure, allows load balancing, etc.
  • If using multiple DNS servers, ensure they are all in sync: Replicate all the zones and records at a regular interval. Check error logs to ensure that replication is occurring normally and without error.
  • Be thorough in hostname record creation: Don’t just add a forward lookup record. Add the PTR, too. And don’t create a CNAME unless you have an A/AAAA and PTR record to point it to.
  • Make sure your clients are configured to use the correct DNS servers and zones
  • Avoid using local hosts files if possible: Everyone forgets to update those things. And imagine having to update 1000s of files every time an IP address or hostname changes….
  • Ensure proper service records (SRV) are in place for services.
  • Review the vendor recommendation for enabling recursion. Some vendors want it disabled.
  • Know your DNS port number (53) by heart. This will save you troubleshooting headaches.
  • Learn to love packet traces for troubleshooting, as well as ping, nslookup and dig. Just be careful with ping. General rule of thumb is, if you can ping the IP but not the hostname, check DNS.
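
A couple of those checks are easy to script. Here’s a minimal sketch that verifies a hostname has a forward record and that the PTR for its address points back to the same name. The function name and return strings are my own invention, and it relies on whatever resolver the local OS is configured to use, so it tells you what that client sees and not necessarily what every DNS server holds.

```python
import socket


def check_forward_reverse(hostname,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Check that forward and reverse lookups for a hostname agree."""
    try:
        ip = forward(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return "no forward record"
    try:
        ptr_name, _aliases, _addresses = reverse(ip)
    except OSError:  # socket.herror is a subclass of OSError
        return "no reverse (PTR) record for " + ip
    # Compare case-insensitively and ignore a trailing dot.
    if ptr_name.rstrip(".").lower() != hostname.rstrip(".").lower():
        return "PTR mismatch: " + ip + " points to " + ptr_name
    return "ok"
```

If this reports a missing forward record for a host you can ping by IP, that’s the “check DNS” case from the rule of thumb above.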

There are tons of other best practices out there, including this Cisco doc, this Microsoft doc and this Wikia article. For Name Services Best Practices related to NetApp’s clustered Data ONTAP, see the new TR I wrote on the subject (TR-4379).

TECH:: What’s the latest cDOT 8.2.x release that’s best for NAS?

Data ONTAP 8.2.3P6 has been released. As the NFS TME at NetApp, I am pretty stoked. Get it here:

http://mysupport.netapp.com/NOW/download/software/ontap/8.3.1P6

On that page is a list of bug fixes in the release. This is the recommended release for all NAS environments running 8.2.x.

This is an update to both 7-Mode and clustered Data ONTAP. Keep in mind that minor releases (such as 8.2.x) are immediately considered GA, so there is no RC for minor releases. They are essentially patch rollups with minimal new features.

The update contains:

  • NAS fixes and improvements
  • Name service fixes and improvements
  • Kerberos fixes

If you are running NAS and are already on an 8.2.x release, UPGRADE AS SOON AS POSSIBLE. Lots of good NAS features/fixes for both 7-Mode and cDOT.

If you are on an 8.1.x or prior release, consider/plan on upgrading, as it will be a little more involved than a minor upgrade from 8.2.x.

Windows NFSv3

Included in the 8.2.3 release is support for Windows NFSv3, which, believe it or not, people still use in production. 🙂

7-Mode has supported this for a while, but clustered Data ONTAP did not support it until this release. 8.3 still does not support it, but 8.3.1 does. TR-4067: NFS Best Practice and Implementation Guide contains content regarding Windows NFSv3.

Check it out today!