TECH:: FUD for thought – Don’t be a part of the problem.

I’m fairly new to this “blogging about tech” thing. I’m probably going to sound a bit naive about some of this, but I’m a strong believer in fairness.

In fact, my previous social media experiences were essentially cat videos, memes and snarky comments on news articles. So now that I’ve started paying attention to the tech world and blogs, I’m seeing an alarming trend – the unmitigated rise of FUD.

FUD fights


FUD is an acronym that stands for:

Fear

Uncertainty

Doubt

At its best, it’s lazy marketing. At worst, it’s purposely deceitful. And I’m seeing it a LOT in blogs by some well-known bloggers who claim to be “independent.” But it’s hard to consider anyone “independent” when they hold strong opinions about a technology, especially if those opinions (remember, blogs are opinions, not fact) are laser-focused on specific vendors and contain vague or broad “information” about that tech.

FUD isn’t just used in tech

FUD has always been around. Advertisers use it as a tactic to tear down their competitors to make their own brand look better. Coke vs. Pepsi comes to mind. As does Republican vs. Democrat. I’ve always hated the tactic, however. If your product is so good, why can’t you point to its merits as the selling point rather than your competitor’s faults? And if you’re going to go negative, can’t you at least be honest and forthcoming?

Everyone else does it…


One rationalization of using FUD to promote your brand or product is that “everyone else uses FUD,” particularly against YOU. While it’s valid to dislike people lying about you and your product, or “creatively spinning” your message to one that’s negative, it’s not ok to stoop to their level. Combat the FUD, stay on your message. Stay positive. If you don’t have anything nice to say about your own product, don’t say anything at all.

FUD reeks of desperation


When you resort to using FUD in your messaging, it sounds a lot like one of the following.

  1. You don’t believe in your own product.
  2. Your sales are bad or your message isn’t being absorbed.
  3. You’re losing to your competitors.
  4. You’re just not a good person.

In today’s age, many companies are using social media to create promotion teams on Twitter, Facebook, etc via programs like Cisco Champions, NetApp A-Team, vExpert, etc. For example, I’ve illustrated my own point in this blog by purposely omitting a competitor’s social media program, since I work for NetApp. 🙂

It’s great when these programs result in honest debate or even rah-rah cheerleading. Even better when people posting information disclose their bias. But when these “independent” actors start making broad statements like “don’t use this vendor’s feature; performance is bad!” without backing evidence, or misstating features, or spreading rumors about things that don’t work, or being dishonest about their overall intent, they’re no longer people you should trust. I’ve seen all of these things in recent days. These authors/bloggers are still worth reading and listening to, as they provide an opposing view, but always consider the source.

Do they have a track record of being negative towards a vendor?

Do they ever say bad things about other vendors?

If the answers are “yes” and “no” to those (in order), it might be time to start reading other blogs to get a more balanced tech world view.

When is FUD not FUD?


Another thing I’ve noticed is that FUD sometimes gets called FUD when it’s actually just… honest. FUD is becoming the new “troll.” The word “troll” has become the de facto insult to anyone on the Internet that doesn’t agree with something you say. You think the dress is white and gold? TROLL!

It’s not FUD if you say something “negative” about a vendor as long as you can back it up. Twitter is one thing – you don’t have a lot of room to argue. Just ask vStewed. 😉 (Not to pick on him – he’s generally very fair. But it’s REAL hard to prove a point on Twitter.)

140 characters is really no place to have a technical discussion. It’s hard. It’s easy to misconstrue. But with blogs, there is no excuse. If you make a claim about a vendor, back it up, preferably with links to THEIR OWN documentation. If it’s a perf claim, show your work. WHY was it slow? What made you think that? Were you even using the right solution? Or were you just regurgitating something else you read on a blog somewhere or overheard at a conference?

It’s our responsibility as self-proclaimed “tech leaders” to make sure we are honest and detailed in our statements.

Stop spreading FUD.

Fight fair.

Be honest.

NOTE: Glenn Sizemore just wrote a blog that hammers home what I am saying about believing in what you do. Worth a read!

http://datacenterdude.com/netapp/sitting-on-the-anchor-of-humanity/

TECH::There’s no place like 127.0.0.1. (But for everywhere else, use DNS.)

One of my favorite IT jokes is “there’s no place like 127.0.0.1.” You can get this slogan emblazoned on t-shirts, welcome mats, etc.

127.0.0.1 is, of course, localhost or the loopback address. Every device on a network has one. However, for addresses that need to be resolvable outside of the internal subsystem, we need MAC addresses, IP addresses and in most cases, routing and DNS. Think of it this way – 127.0.0.1 is your bedroom door. That doesn’t help people find your house when you invite them over, however.

Guess who’s coming to dinner?

When you have people over, you need to give them information to get them to your house. In today’s age, that’s as easy as telling someone a street number and name that they can plug into a GPS or Google maps. No more having to give step-by-step directions!

But even giving that much information can be too much, especially if that person comes over a lot (but has a terrible memory). So, in those cases, an address can be saved as a shortcut in a map app or GPS with an alias such as “Justin’s house.”

This is not unlike how MAC and IP addresses work. A MAC address is the physical pavement of the road. An IP is the street number and name. The aliased shortcut? That’s the hostname.

The hostname can be served locally via a flat file, or in a database like DNS, LDAP or even NIS. Then clients and servers can query the common database for the information and use that information to find their way around the IT village.
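To make the "flat file" option concrete, here is a minimal Python sketch of what a hosts-style lookup boils down to. The sample entries (including the justins-house name) are made up for illustration, and real resolvers do a good deal more than this.

```python
def parse_hosts(text):
    """Parse hosts-file style text into a {hostname: ip} mapping."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # keep the first entry seen for a name (a simplifying assumption)
            table.setdefault(name.lower(), ip)
    return table


def lookup(table, hostname):
    """Return the IP for hostname, or None when the flat file has no entry."""
    return table.get(hostname.lower())


# Hypothetical file contents; a real file would live at /etc/hosts or similar.
SAMPLE_HOSTS = """\
127.0.0.1   localhost
10.0.0.5    justins-house.example.com justins-house  # made-up entry
"""

hosts = parse_hosts(SAMPLE_HOSTS)
print(lookup(hosts, "justins-house"))
print(lookup(hosts, "unknown-host"))
```

The second lookup comes back empty, which is exactly the situation where a client would fall through to DNS, LDAP or NIS.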

This may all seem rudimentary to you; that’s because it is. 🙂

But you would be surprised how often DNS/hostname resolution comes up in support cases, configuration issues, etc. The reason for that is two-fold.

1) People do not fully understand DNS/hostname resolution

2) People take DNS/hostname resolution for granted

What is DNS?

To cover #1, let’s talk about DNS and what it is/does.

DNS is short for Domain Name System. It’s a distributed, hierarchical database that contains hostnames, IP addresses, service records, aliases, zones… all sorts of things that allow enterprise IT environments to leverage it for day-to-day operations. By default, DNS is included in Active Directory domain deployments. It has to be – otherwise, AD would not function very well/at all. If you want to read more about that, see the following:

How DNS support for Active Directory works

Active Directory-Integrated DNS

Configure a DNS server for use with Active Directory

However, DNS isn’t just used for Active Directory and isn’t isolated to only Windows environments. DNS has been around for a long time and is critical in numerous widely used IT services, including:

  • NAS (NFS and SMB)
  • Kerberos
  • Microsoft Exchange
  • LDAP
  • Various other 3rd party applications

The above list is by no means complete, but gives a general idea of how integral DNS is to day to day IT shops.

What is so difficult about DNS?

DNS is not extremely complicated. However, there are general high-level concepts that get mistaken from time to time.

Servers

DNS servers themselves are concepts that can get lost on people. These contain the records, zones, etc. They also may replicate across the network to other DNS servers. They require specific functionality, such as being able to listen for DNS requests on port 53, caching requests, acting as authoritative servers (SOA) for DNS updates, etc.
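If you are curious what a client actually sends to port 53, the following Python sketch builds a bare-bones query for an A record using the standard DNS message layout. The query ID and hostname are arbitrary choices of mine, and the code only builds the bytes; it does not contact any server.

```python
import struct


def build_query(hostname, query_id=0x1234, qtype=1):
    """Build a minimal DNS query message (qtype 1 = A record, class 1 = IN)."""
    # Header: ID, flags (0x0100 = recursion desired), QDCOUNT=1, other counts 0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)
    return header + question


DNS_PORT = 53  # the well-known port DNS servers listen on
packet = build_query("example.com")
print(len(packet), packet.hex())
```

To actually send it, you could open a UDP socket with Python's socket module and call sendto() with the packet and a (server address, DNS_PORT) tuple, then compare what you see against a packet trace.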

Records

This is one thing that trips a lot of people up, mainly because there are many different types of records. Some of the main/common ones include:

  • A/AAAA records (for IPv4/IPv6 addresses)
  • CNAMEs (aliases)
  • MX (mail exchange)
  • NS (name server)
  • PTR records (pointer/reverse lookup)
  • SOA (start of authority)
  • SRV (service records such as LDAP, Kerberos KDC, etc)
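As a rough illustration of how a few of these record types relate, here is a toy Python lookup that follows a CNAME alias until it reaches a record of the requested type. The zone contents and names are hypothetical, and a real DNS server handles far more cases (TTLs, delegation, wildcards and so on).

```python
# Toy zone data: (name, record type) -> list of values. All names are made up.
ZONE = {
    ("fileserver.example.com", "A"): ["10.0.0.5"],
    ("fileserver.example.com", "AAAA"): ["2001:db8::5"],
    ("nas.example.com", "CNAME"): ["fileserver.example.com"],
    ("_ldap._tcp.example.com", "SRV"): ["0 100 389 dc1.example.com"],
}


def resolve(name, rtype="A", max_hops=8):
    """Return values for name/rtype, following CNAME aliases along the way."""
    for _ in range(max_hops):
        values = ZONE.get((name, rtype))
        if values:
            return values
        alias = ZONE.get((name, "CNAME"))
        if not alias:
            return []          # no record and no alias: nothing to return
        name = alias[0]        # follow the alias and try again
    raise RuntimeError("CNAME chain too long (possible loop)")


print(resolve("nas.example.com"))
print(resolve("nas.example.com", "AAAA"))
print(resolve("_ldap._tcp.example.com", "SRV"))
```

The hop limit is there because a CNAME that points back at itself would otherwise loop forever.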

Zones

Zones are used to direct requests from clients to their appropriate locations and/or forward them to other name servers. For example, dns.windows.com might be the name of the Active Directory domain, but you might also have DNS zones in other locations that exist on other name servers. If so, you could add a zone (such as bind.linux.com) and add NS records to forward requests on to the appropriate name servers running BIND. This allows for improved performance of lookups, as well as scalable DNS environments.

NetApp’s clustered Data ONTAP actually allows storage admins to configure individual data LIFs as name servers to act as DNS zones in a Storage Virtual Machine. This comes in handy for intelligent DNS load balancing in clusters and is covered in TR-4073: Secure Unified Authentication on page 27.

Whither DNS?

There is plenty more to DNS than the above. However, if you already know and understand DNS, you can see why it’s easy to overlook it and take it for granted. When configured properly, it just works. It’s not fancy. It’s generally robust and resilient. And with DDNS, you don’t even have to go in and add records to existing DNS servers. Clients do it for you. So when a problem *does* occur, it becomes a “forest for the trees” problem where DNS is one of the last places many admins look. This is a mistake – DNS should be one of the first things checked off the list as “not a problem” when troubleshooting, as it’s so important to so many things in IT.

Best Practices

Most DNS servers out there have documented best practices, and any best practice for a DNS server should come from a vendor. However, there are universal best practices that are pretty much no-brainers when it comes to managing DNS.

  • Use multiple DNS servers: This provides redundancy, eliminates single points of failure, allows load balancing, etc.
  • If using multiple DNS servers, ensure they are all in sync: Replicate all the zones and records on a regular interval. Check error logs to ensure that replication is occurring normally and without error.
  • Be thorough in hostname record creation: Don’t just add a forward lookup record. Add the PTR, too. And don’t create a CNAME unless you have an A/AAAA and PTR record to point it to.
  • Make sure your clients are configured to use the correct DNS servers and zones
  • Avoid using local hosts files if possible: Everyone forgets to update those things. And imagine having to update 1000s of files every time an IP address or hostname changes….
  • Ensure proper service records (SRV) are in place for services.
  • Review the vendor recommendation for enabling recursion. Some vendors want it disabled.
  • Know your DNS port number (53) by heart. This will save you troubleshooting headaches.
  • Learn to love packet traces for troubleshooting, as well as ping, nslookup and dig. Just be careful with ping. General rule of thumb is, if you can ping the IP but not the hostname, check DNS.
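The "add the PTR, too" advice from the list above is easy to audit with a script. This is a minimal Python sketch, assuming you have exported forward and reverse records into two dictionaries; the record data here is invented, and the only library call it relies on is the standard ipaddress module's reverse_pointer attribute.

```python
import ipaddress

# Hypothetical record sets standing in for a real zone export.
FORWARD = {"fileserver.example.com": "10.0.0.5", "backup.example.com": "10.0.0.6"}
REVERSE = {"5.0.0.10.in-addr.arpa": "fileserver.example.com"}


def missing_ptrs(forward, reverse):
    """Return hostnames whose A/AAAA record has no matching PTR record."""
    missing = []
    for host, ip in sorted(forward.items()):
        # reverse_pointer builds the in-addr.arpa / ip6.arpa name for the address
        ptr_name = ipaddress.ip_address(ip).reverse_pointer
        if reverse.get(ptr_name) != host:
            missing.append(host)
    return missing


print(missing_ptrs(FORWARD, REVERSE))
```

In this made-up data set, backup.example.com has a forward record but no PTR, so it is the one that gets flagged.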

There are tons of other best practices out there, including this Cisco doc, this Microsoft doc and this Wikia article. For Name Services Best Practices related to NetApp’s clustered Data ONTAP, see the new TR I wrote on the subject (TR-4379).

TECH::Uh, I didn’t put that VM there… #vExpert

Ever find yourself browsing vSphere and seeing a VM show up on a datastore you *know* you didn’t put that VM in? I’ve run into this issue a few times and never have seen a KB or blog post on it. Closest I’ve seen is this one:

In VMware vCenter Server 5.x a virtual machine with a snapshot displays datastores or port groups that are no longer in use

If you’ve ever run into this issue, you know how irritating and maddening it can be. Forehead-smashing even. One of my co-workers/co-lab admins ran into this a couple weeks ago, so I decided to blog it up.

For example, in my vSphere, I have a datastore mounted (via NFS on clustered Data ONTAP, of course!) that contains ISO images for my VMs to use for installs, upgrades, etc. But I don’t ever create VMs on it. In fact, I’ve got roles set to disallow creating VMs. However, when I browse to it…

[Screenshot: vsphere-datastore]

And when I look at the datastore where that VM is *supposed* to exist…

[Screenshot: vsphere-datastore2]

So what gives? Why is my VM, which I am certain only exists once, showing up in multiple places, including a datastore where I can’t/don’t create VMs?

The answer? @#$^! snapshots.

The datastore that is showing an extra VM is my ISO datastore. As I mentioned, it hosts my ISO images that I mount to VMs. In this case, my CentOS 6.5 VM has an ISO mounted.

[Screenshot: snapshot-iso]

When I took a snapshot of the VM in vSphere, that meant it took a snapshot of the VM with an ISO mounted. To do that, it had to include information about the ISO in the snapshot, which means it “added” the VM to the ISO datastore.

[Screenshot: snapshot-manager]

So how do I fix it?

Simple – delete the snapshot. If you want a snapshot of the VM that doesn’t do this, unmount the ISO before taking a snapshot. And, as a best practice, unmount your ISOs when you’re done with them. (Put your toys away where you found them!)

After I delete the snapshot, the VM still shows up in the ISO datastore, because I have not unmounted the ISO yet. Once I unmount the ISO, the VM disappears from the datastore…

[Screenshot: vm-removed]

TECH::Become a clustered Data ONTAP CLI Ninja

From HowStuffWorks.com

Ninjas. They move silently through the night, hunting their prey and striking with a swift ferocity before darting out, unseen and without a trace. Highly skilled and trained, they are efficient and accurate. And yes, I know they are also kind of assholes…

With this blog entry, I hope to arm you, clustered Data ONTAP CLI lovers, with the skills necessary to become CLI Ninjas.

If you attended Insight or have access to the slides from Insight, GV Govindasamy did an excellent session (CR-2-2177) on much of what is covered below. I even borrowed some of his graphics/content :). Follow him on Twitter. Bug him to tweet more.

https://twitter.com/GvgNtap

What is this clustered Data ONTAP you speak of?

Clustered Data ONTAP is NetApp’s scale-out storage operating system.

You can have up to 24 nodes in a NAS cluster, 8 in a SAN cluster. Despite having so many entry points into a cluster, the operating system allows you to use a single point of entry to manage the entire thing.

With Data ONTAP 8.3, there is a management GUI called OnCommand System Manager that is loaded on-box, which means you can access it anytime, anywhere (well, provided you can access that network), with a simple web browser.

But what if you hate GUIs? What if you prefer the cold, dark screen of a PuTTY terminal as you comb your neck beard, drink your Tab cola and listen to your Steely Dan 8-tracks?

There’s good news!

The good news is that clustered Data ONTAP’s CLI is pretty powerful and easy to use.

If you’ve ever used Cisco IOS, you are familiar with the concept of tab completion. If you aren’t familiar, you basically can use the TAB key to help give you hints on what comes next.

It all comes full circle…

Shortcuts!

Additionally, like Cisco IOS, you can use shortcuts for commands. So rather than typing out “volume,” you can use “vol.” Just look out for command ambiguity, where a command with the same few first letters as another command won’t work with a shortcut.

cluster1::*> v
Error: Ambiguous command.  Possible matches include:
      volume
      vserver
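The ambiguity rule is simple enough to model in a few lines of Python. This is only a sketch of the idea (unique prefix wins, anything else is an error), not how Data ONTAP implements it, and the command list is just a small sample of top-level directories.

```python
def expand(prefix, commands):
    """Expand an abbreviated command, or raise if it is unknown or ambiguous."""
    if prefix in commands:
        return prefix                      # exact match always wins
    matches = sorted(c for c in commands if c.startswith(prefix))
    if not matches:
        raise ValueError(f"Unknown command: {prefix}")
    if len(matches) > 1:
        raise ValueError("Ambiguous command. Possible matches include: "
                         + ", ".join(matches))
    return matches[0]


# A small sample of top-level command directories, for illustration only.
TOP_LEVEL = ["cluster", "network", "storage", "volume", "vserver"]

print(expand("vol", TOP_LEVEL))
try:
    expand("v", TOP_LEVEL)
except ValueError as err:
    print(err)
```

Typing "vol" is enough to pick out volume, while a bare "v" trips the same kind of ambiguity error shown above.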

Another way shortcuts work is based on how the CLI in clustered Data ONTAP is architected. Every command is considered a “directory” and as you drill down the command tree, you get deeper and deeper into the tree.

For example:

cluster::> vserver services name-services dns

OR

cluster::> dns

Directories!

Folders!

Commands live in directories in clustered Data ONTAP. As such, you can “cd” into these directories by typing a command. With our DNS command example…

cluster::> dns
cluster::vserver services name-services dns>

We dropped down four directory levels (vserver, services, name-services and dns) with one command! But how do I get back up… Up perhaps?

cluster::vserver services name-services dns> up
cluster::vserver services name-services>

What if I want to go all the way to the top?

cluster::vserver services name-services dns> top
cluster::>
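If it helps to think of the prompt as a path, here is a toy Python model of the navigation shown above. The class and method names are my own invention; it only mimics how the prompt changes with up and top and says nothing about how the real CLI is built.

```python
class CliSession:
    """Toy model of the directory-style prompt; not NetApp's implementation."""

    def __init__(self, cluster="cluster"):
        self.cluster = cluster
        self.path = []                     # current command directory

    def cd(self, *directories):
        self.path.extend(directories)

    def up(self):
        if self.path:
            self.path.pop()                # move up exactly one level

    def top(self):
        self.path.clear()                  # jump straight back to the root

    @property
    def prompt(self):
        location = " ".join(self.path)
        return f"{self.cluster}::{location}>"


session = CliSession()
session.cd("vserver", "services", "name-services", "dns")
print(session.prompt)
session.up()
print(session.prompt)
session.top()
print(session.prompt)
```

The three prompts printed line up with the transcript: the full dns directory, one level up, then back at the root.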

Well, that was easy… what else can we do?

Stuck on a command?

If tab completion isn’t enough for you, or if you want to see all command options, use the ?:

cluster::> vserver ?
 active-directory>           Manage Active Directory
 add-aggregates              Add aggregates to the Vserver
 add-protocols               Add protocols to the Vserver
 audit>                      Manage auditing of protocol requests that the
                             Vserver services
 check>                      The check directory
 cifs>                       Manage the CIFS configuration of a Vserver
 context                     Set Vserver context
 create                      Create a Vserver
 dashboard>                  The dashboard directory
 data-policy>                Manage data policy
 delete                      Delete a Vserver
 export-policy>              Manage export policies and rules
 fcp>                        Manage the FCP service on a Vserver
 fpolicy>                    Manage FPolicy
 group-mapping>              The group-mapping directory
 iscsi>                      Manage the iSCSI services on a Vserver
 locks>                      Manage Client Locks
 modify                      Modify a Vserver
 name-mapping>               The name-mapping directory
 nfs>                        Manage the NFS configuration of a Vserver
 peer>                       Create and manage Vserver peer relationships
 remove-aggregates           Remove aggregates from the Vserver
 remove-protocols            Remove protocols from the Vserver
 rename                      Rename a Vserver
 security>                   Manage ontap security
 services>                   The services directory
 show                        Display Vservers
 show-protocols              Show protocols for Vserver
 smtape>                     The smtape directory
 start                       Start a Vserver
 stop                        Stop a Vserver
 vscan>                      Manage Vscan

Want to know what the command does/is or want to see examples? Use the man command. It will open a “window” similar to vi (ok, a LOT like vi) that allows you to scroll using the space bar and arrow keys, and search using the / key.

cluster::> man volume create

volume create                   Data ONTAP 8.3                   volume create

NAME
    volume create -- Create a new volume

AVAILABILITY
    This command is available to cluster and Vserver administrators at the admin privilege level.

DESCRIPTION
    The volume create command creates a volume on a specified Vserver and storage aggregate. You can optionally specify the following  attributes for the new volume:

Want to see more detailed output from a command?

When you type a command like “volume show” you might see this:

cluster::> volume show
Vserver   Volume       Aggregate    State      Type       Size  Available Used%
--------- ------------ ------------ ---------- ---- ---------- ---------- -----
NAS       mixed        aggr1_node1  online     RW         20MB    18.86MB    5%
NAS       ntfs         aggr1_node1  online     RW         20MB    18.86MB    5%
NAS       unix         aggr1_node1  online     RW         20MB    18.87MB    5%
NAS       unix2        aggr1_node1  online     RW         20MB    18.88MB    5%
NAS       vsroot       aggr1_node1  online     RW         20MB    18.84MB    5%
TRUST     ntfs         aggr1_node2  online     RW         20MB    18.82MB    5%
TRUST     unix         aggr1_node1  online     RW         20MB    18.88MB    5%
TRUST     vsroot       aggr1_node1  online     RW         20MB    18.78MB    6%
parisi-fs-01 vol0      aggr0_root_node1 online RW       2.86GB    560.2MB   80%
parisi-fs-02 vol0      aggr0_root_node2 online RW       2.86GB    650.1MB   77%
10 entries were displayed.

What if you want more? Use -instance!

cluster::> volume show -vserver NAS -volume unix -instance 

                                  Vserver Name: NAS
                                   Volume Name: unix
                                Aggregate Name: aggr1_node1
                                   Volume Size: 20MB
                            Volume Data Set ID: 1036
                     Volume Master Data Set ID: 2147484684
                                  Volume State: online
                                   Volume Type: RW
                                  Volume Style: flex
                        Is Cluster-Mode Volume: true
                         Is Constituent Volume: false
                                 Export Policy: allow_all
                                       User ID: 0
                                      Group ID: 0
                                Security Style: unix
                              UNIX Permissions: ---rwxrwxrwx
                                 Junction Path: /unix
                          Junction Path Source: RW_volume
                               Junction Active: true
                        Junction Parent Volume: vsroot
                                       Comment: 
                                Available Size: 18.87MB
                               Filesystem Size: 20MB
                       Total User-Visible Size: 19MB
                                     Used Size: 132KB
                               Used Percentage: 5%
          Volume Nearly Full Threshold Percent: 95%
                 Volume Full Threshold Percent: 98%
          Maximum Autosize (for flexvols only): 24MB
(DEPRECATED)-Autosize Increment (for flexvols only): 1MB
                              Minimum Autosize: 20MB
            Autosize Grow Threshold Percentage: 85%
          Autosize Shrink Threshold Percentage: 50%
                                 Autosize Mode: off
          Autosize Enabled (for flexvols only): false
           Total Files (for user-visible data): 566
            Files Used (for user-visible data): 100
                         Space Guarantee Style: volume
                     Space Guarantee in Effect: true
             Snapshot Directory Access Enabled: false
            Space Reserved for Snapshot Copies: 5%
                         Snapshot Reserve Used: 71%
                               Snapshot Policy: default
                                 Creation Time: Tue Aug 26 14:13:19 2014
                                      Language: C.UTF-8
                                  Clone Volume: false
                                     Node name: parisi-fs-01
                                 NVFAIL Option: off
                         Volume's NVFAIL State: false
       Force NVFAIL on MetroCluster Switchover: off
                     Is File System Size Fixed: false
                                 Extent Option: off
                 Reserved Space for Overwrites: 0B
                            Fractional Reserve: 100%
             Primary Space Management Strategy: volume_grow
                      Read Reallocation Option: off
              Inconsistency in the File System: false
                  Is Volume Quiesced (On-Disk): false
                Is Volume Quiesced (In-Memory): false
     Volume Contains Shared or Compressed Data: false
             Space Saved by Storage Efficiency: 0B
        Percentage Saved by Storage Efficiency: 0%
                  Space Saved by Deduplication: 0B
             Percentage Saved by Deduplication: 0%
                 Space Shared by Deduplication: 0B
                    Space Saved by Compression: 0B
         Percentage Space Saved by Compression: 0%
           Volume Size Used by Snapshot Copies: 728KB
                                    Block Type: 64-bit
                              Is Volume Moving: false
                Flash Pool Caching Eligibility: read-write
 Flash Pool Write Caching Ineligibility Reason: -
                    Managed By Storage Service: -
Create Namespace Mirror Constituents For SnapDiff Use: -
                       Constituent Volume Role: -
                         QoS Policy Group Name: -
                           Caching Policy Name: -
               Is Volume Move in Cutover Phase: false
       Number of Snapshot Copies in the Volume: 10
VBN_BAD may be present in the active filesystem: false
               Is Volume on a hybrid aggregate: false
                      Total Physical Used Size: 860KB
                      Physical Used Percentage: 4%

Information… overload…

Ok, so -instance is too much. What if you want to know specifics about a volume that aren’t found in volume show, but you don’t want all the noise? Use -fields to filter!

For instance, what if you only care about what space is available and used?

cluster::> volume show -vserver NAS -volume unix -fields used,available,percent-used 
vserver volume available used  percent-used 
------- ------ --------- ----- ------------ 
NAS     unix   18.87MB   132KB 5%
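The compact -fields output is also friendly to scripts. Below is a minimal Python sketch that turns output like the above into dictionaries. It assumes every column has a value and that no value contains spaces, which will not hold for every field, so treat it as a starting point rather than a robust parser.

```python
def parse_fields_output(text):
    """Turn whitespace-separated '-fields' output into a list of dicts."""
    lines = [ln.rstrip() for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        if set(line.strip()) <= {"-", " "}:
            continue                       # skip the dashed separator row
        rows.append(dict(zip(header, line.split())))
    return rows


# Sample text copied from the command output above.
SAMPLE = """\
vserver volume available used  percent-used
------- ------ --------- ----- ------------
NAS     unix   18.87MB   132KB 5%
"""

for row in parse_fields_output(SAMPLE):
    print(row["volume"], row["available"], row["percent-used"])
```

Asking for only the fields you need keeps the parsing simple, since the column order follows what you requested.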

Awesome! Now, how do I *exclude* things? What if I want to see all disks in my system that are *not* spares? Use bang(!):

cluster::> disk show -state !spare
                     Usable           Disk    Container   Container   
Disk                   Size Shelf Bay Type    Type        Name      Owner
---------------- ---------- ----- --- ------- ----------- --------- --------
VMw-1.1              3.55GB     -   0 VMDISK  aggregate   aggr0_root_node1 parisi-fs-01
VMw-1.2              3.55GB     -   1 VMDISK  aggregate   aggr0_root_node1 parisi-fs-01
VMw-1.3              3.55GB     -   2 VMDISK  aggregate   aggr0_root_node1 parisi-fs-01
VMw-1.4             82.75MB     -   3 VMDISK  aggregate   aggr1_node1 parisi-fs-01
VMw-1.5             82.75MB     -   4 VMDISK  aggregate   aggr1_node1 parisi-fs-01
VMw-1.6             82.75MB     -   5 VMDISK  aggregate   aggr1_node1 parisi-fs-01
VMw-1.7             82.75MB     -   6 VMDISK  aggregate   aggr1_node1 parisi-fs-01
VMw-1.8             82.75MB     -   8 VMDISK  aggregate   aggr1_node1 parisi-fs-01
VMw-1.15            82.75MB     -   0 VMDISK  aggregate   aggr1_node1 parisi-fs-01
VMw-1.16            82.75MB     -   1 VMDISK  aggregate   aggr1_node1 parisi-fs-01
Press <space> to page down, <return> for next line, or 'q' to quit...

That’s a lot of output. And I know I have more than 16 disks…

Sick of having to hit the space bar to see more output from a command?

When a command exceeds the default of 24 lines (or rows), the system will ask you to hit the space bar to continue. This is done because, in a 24 node cluster, you can end up with a LOT of objects. Thousands of volumes, potentially.

If you truly want to see them all, use the rows command to set to 0:

cluster::> rows 0

Or, if you’d rather, set the rows to a larger number:

cluster::> rows 50

Tired of typing the same command over and over and over…?

Built into the clustered Data ONTAP CLI is the ability to use !, history and redo commands that Linux admins know and love. This is great for repetitive tasks, or even for commands that get super long.

cluster1::> history
   1  rows
   2  rows 0
   3  rows 24
   4  vol show
   5  vserver
   6  up
   7  vserver
   8  ..
   9  man job
  10  man rows
  11  man top
  12  vserver
  13  top
  14  man man
  15  man redo
  16  history
  17  storage
  18  ..
  19  history

To run a specific command you ran already (such as #4, vol show), use one of the following:

cluster::> redo 4  
cluster::> !4

Wildcards!

Clustered Data ONTAP CLI also supports the use of wildcards with most commands. This allows you to show a filtered set of objects, based on name. For example, if you have multiple volumes with “unix” in the name:

cluster::> vol show -volume *unix*
Vserver   Volume       Aggregate    State      Type       Size  Available Used%
--------- ------------ ------------ ---------- ---- ---------- ---------- -----
NAS       unix         aggr1_node1  online     RW         20MB    18.87MB    5%
NAS       unix2        aggr1_node1  online     RW         20MB    18.88MB    5%
TRUST     unix         aggr1_node1  online     RW         20MB    18.88MB    5%
3 entries were displayed.

You could combine wildcards with excludes as well.

cluster::> vol show -volume *unix*,!unix2
Vserver   Volume       Aggregate    State      Type       Size  Available Used%
--------- ------------ ------------ ---------- ---- ---------- ---------- -----
NAS       unix         aggr1_node1  online     RW         20MB    18.87MB    5%
TRUST     unix         aggr1_node1  online     RW         20MB    18.88MB    5%
2 entries were displayed.
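To reason about what a query like *unix*,!unix2 will return before running it, you can mimic the matching in Python. This sketch assumes that every comma-separated term must hold and that a leading ! negates a term; that reading is inferred from the output above, not from the CLI's actual parser.

```python
from fnmatch import fnmatchcase


def filter_names(names, query):
    """Keep names matching every pattern in query; '!pattern' means exclude."""
    kept = []
    for name in names:
        ok = True
        for term in query.split(","):
            if term.startswith("!"):
                ok = ok and not fnmatchcase(name, term[1:])
            else:
                ok = ok and fnmatchcase(name, term)
        if ok:
            kept.append(name)
    return kept


# Volume names taken from the "volume show" output earlier in this post.
VOLUMES = ["mixed", "ntfs", "unix", "unix2", "vsroot"]

print(filter_names(VOLUMES, "*unix*"))
print(filter_names(VOLUMES, "*unix*,!unix2"))
```

The first query keeps both unix and unix2, and adding the exclude drops unix2, which matches what the cluster displayed.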

Either Or

In addition to excluding using !, you can also use | to include multiple instances of a field. This really comes in handy when trying to look at statistics on a cluster.

Without filters, it becomes a minefield of text:

cluster::*> statistics show-periodic -interval 2 -iterations 0 -summary true -object nfsv3 -instance NAS 
cluster: nfsv3.NAS: 2/13/2015 15:59:46
  access                                      commit                                                         create                                      fsinfo                                      fsstat                                     getattr                                                          link                                      lookup                                       mkdir                                       mknod                                       nfsv3             nfsv3      nfsv3               nfsv3      nfsv3     null                                    pathconf                                                  read                           read_symlink     read    read     read     read           readdir                   readdir                   readdirplus                         readdirplus                           remove                                      rename                                       rmdir                                     setattr                                     symlink                                       write
     avg   access  access   access   access      avg   commit  commit   commit   commit                        avg   create  create   create   create      avg   fsinfo  fsinfo   fsinfo   fsinfo      avg   fsstat  fsstat   fsstat   fsstat      avg  getattr getattr  getattr  getattr instance instance      avg     link    link     link     link      avg   lookup  lookup   lookup   lookup      avg    mkdir   mkdir    mkdir    mkdir      avg    mknod   mknod    mknod    mknod     dnfs    nfsv3     read       read      nfsv3    write      write      avg     null    null     null     null      avg pathconf pathconf pathconf pathconf raidprop      avg     read    read     read          avg  symlink symlink  symlink  symlink     read      avg  readdir readdir   postop  readdir  readdir         avg readdirplus readdirplus      postop readdirplus readdirplus      avg   remove  remove   remove   remove      avg   rename  rename   rename   rename      avg    rmdir   rmdir    rmdir    rmdir      avg  setattr setattr  setattr  setattr      avg  symlink symlink  symlink  symlink      avg    write   write    write    write    Complete    Number of
 latency    error percent  success    total  latency    error percent  success    total   cpu_id cpu_name  latency    error percent  success    total  latency    error percent  success    total  latency    error percent  success    total  latency    error percent  success    total     name     uuid  latency    error percent  success    total  latency    error percent  success    total  latency    error percent  success    total  latency    error percent  success    total      ops      ops      ops throughput throughput      ops throughput  latency    error percent  success    total  latency    error  percent  success    total    error  latency    error percent  success      latency    error percent  success    total    total  latency    error percent    error  success    total     latency       error     percent       error     success       total  latency    error percent  success    total  latency    error percent  success    total  latency    error percent  success    total  latency    error percent  success    total  latency    error percent  success    total  latency    error percent  success    total Aggregation Constituents
-------- -------- ------- -------- -------- -------- -------- ------- -------- -------- -------- -------- -------- -------- ------- -------- -------- -------- -------- ------- -------- -------- -------- -------- ------- -------- -------- -------- -------- ------- -------- -------- -------- -------- -------- -------- ------- -------- -------- -------- -------- ------- -------- -------- -------- -------- ------- -------- -------- -------- -------- ------- -------- -------- -------- -------- -------- ---------- ---------- -------- ---------- -------- -------- ------- -------- -------- -------- -------- -------- -------- -------- -------- -------- -------- ------- -------- ------------ -------- ------- -------- -------- -------- -------- -------- ------- -------- -------- -------- ----------- ----------- ----------- ----------- ----------- ----------- -------- -------- ------- -------- -------- -------- -------- ------- -------- -------- -------- -------- ------- -------- -------- -------- -------- ------- -------- -------- -------- -------- ------- -------- -------- -------- -------- ------- -------- -------- ----------- ------------
     0us        0      0%        0        0      0us        0      0%        0        0 Multiple_Values Multiple_Values 0us 0    0%        0        0      0us        0      0%        0        0      0us        0      0%        0        0      0us        0      0%        0        0      NAS        5      0us        0      0%        0        0      0us        0      0%        0        0      0us        0      0%        0        0      0us        0      0%        0        0        0        0        0          0          0        0          0      0us        0      0%        0        0      0us        0       0%        0        0        0      0us        0      0%        0          0us        0      0%        0        0        0      0us        0      0%        0        0        0         0us           0          0%           0           0           0      0us        0      0%        0        0      0us        0      0%        0        0      0us        0      0%        0        0      0us        0      0%        0        0      0us        0      0%        0        0      0us        0      0%        0        0      Yes      4

Yikes!

If you use | to include only the stats you want:

cluster::*> statistics show-periodic -interval 2 -iterations 0 -summary true -object nfsv3 -instance NAS -counter  nfsv3_read_ops|nfsv3_throughput|nfsv3_write_ops
cluster: nfsv3.NAS: 2/13/2015 16:02:25
   nfsv3               nfsv3
    read      nfsv3    write    Complete    Number of
     ops throughput      ops Aggregation Constituents
-------- ---------- -------- ----------- ------------
       0          0        0      Yes      4
       0          0        0      Yes      4
cluster: nfsv3.NAS: 2/13/2015 16:02:29
   nfsv3               nfsv3
    read      nfsv3    write    Complete    Number of
     ops throughput      ops Aggregation Constituents
-------- ---------- -------- ----------- ------------
Minimums:
        0          0        0       -       - 
Averages for 2 samples:
        0          0        0       -       - 
Maximums:
       0          0        0       -       -

Now that’s nice and readable…

If

(Thanks to Doug Moore, TME for Multi-tenancy at NetApp for assistance with this portion)

In the clustered Data ONTAP CLI, there is also a way to use “if” statements by way of the curly bracket thingies.

{these!}

One use case for them could be during upgrades. Each node has 2 images that can be used to store ONTAP.

cluster::> system image show
                 Is      Is                                Install
Node     Image   Default Current Version                   Date
-------- ------- ------- ------- ------------------------- -------------------
node1
         image1  true    true    8.2.3                     8/27/2014 15:20:52
         image2  false   false   8.2.2                     10/15/2014 12:39:58

When you update an image (upgrade), the current image remains the default image until you change it. Changing the default is what lets the system boot from the new image at the next reboot. If you have 24 nodes, setting the default image on each node one at a time can become tiresome. However, with the use of an “if” statement in our command, we can set every node’s default image to the image we just upgraded to. The command below finds every image where “iscurrent” is false and sets “isdefault” to true on it.

cluster::> system image modify {-iscurrent false} -isdefault true

If you noticed in the “system image show” command, my default image is image1. When I run the command with the “if” statement, it changes my default image to image2.

cluster::> system image modify {-iscurrent false} -isdefault true

After a clean shutdown, image2 will be set as the default boot image on node
node1.

1 entry was modified.

Now image2 is the default image:

cluster::> system image show
                 Is      Is                                Install
Node     Image   Default Current Version                   Date
-------- ------- ------- ------- ------------------------- -------------------
node1
         image1  false   true    8.2.3                     8/27/2014 15:20:52
         image2  true    false   8.2.2                     10/15/2014 12:39:58

Another use case for “if” is modifying snapshot policies. You can run a command to query for all instances of a policy that is enabled and tell the system to disable it.

cluster::> volume snapshot policy modify {-enabled true} -enabled false
2 entries were modified.

cluster::> volume snapshot policy show -enabled true
There are no entries matching your query.
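If it helps to think about what the curly brackets are doing, here is a minimal sketch in Python (not ONTAP code). It assumes the brackets act as a filter that picks the rows, and everything after them is the change applied to those rows. The function name modify_where and the dictionary layout are made up for illustration; the field names mirror the "system image show" columns.

```python
# Simplified model of a "{query} modify" command. This is NOT ONTAP code;
# modify_where and the row layout are hypothetical names for illustration.

def modify_where(rows, query, changes):
    """Apply `changes` to every row whose fields match `query`; return the count."""
    modified = 0
    for row in rows:
        if all(row.get(field) == value for field, value in query.items()):
            row.update(changes)
            modified += 1
    return modified


images = [
    {"node": "node1", "image": "image1", "isdefault": True, "iscurrent": True},
    {"node": "node1", "image": "image2", "isdefault": False, "iscurrent": False},
]

# In spirit: system image modify {-iscurrent false} -isdefault true
count = modify_where(images, {"iscurrent": False}, {"isdefault": True})
print(f"{count} entry was modified.")
```

Note that the sketch only models the filter-then-change part. On the real system, making image2 the default also flips image1 to non-default, as the "system image show" output above shows.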

Sorting your output

So let’s say you want to see a list of your volumes in a cluster but sort them by percent of space used. Guess what? You can do that in the CLI! Use -sort-by.

cluster::*> vol show -fields percent-used -sort-by percent-used
vserver      volume percent-used                                      
------------ ------ ------------ 
NAS          mixed  5%           
NAS          ntfs   5%           
NAS          unix   5%           
NAS          unix2  5%           
NAS          vsroot 5%           
TRUST        ntfs   5%           
TRUST        unix   5%           
TRUST        vsroot 6%

Sort-tastic!

But can I sort stats?

Of course you can! You can use the -sort-key option to do that. Find that problem volume fast!

cluster::> statistics volume show -sort-key latency -max 3

cluster : 9/18/2014 15:13:33
                     Total Read Write Other    Read   Write *Latency
  Volume     Vserver   Ops  Ops   Ops   Ops   (Bps)   (Bps)     (us)
--------- ----------- ----- ---- ----- ----- ------- ------- --------
basevol_5 VS_2_802002   268  166    81    21 3322384 1618744   119387
basevol_5 VS_9_802009   325  177   124    23 3539789 2494725   103094
basevol_5 VS_1_802001   399  215   158    25 4307034 3163792    97657
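If you ever pull stats like these into a script, the same "sort by a key, keep the top few" idea is easy to reproduce. The sketch below is Python, not ONTAP; the rows are copied from the output above (only a few columns), and the names top_n and latency_us are just illustrative.

```python
# Rows taken from the "statistics volume show" output above (subset of columns).
rows = [
    {"volume": "basevol_5", "vserver": "VS_1_802001", "total_ops": 399, "latency_us": 97657},
    {"volume": "basevol_5", "vserver": "VS_2_802002", "total_ops": 268, "latency_us": 119387},
    {"volume": "basevol_5", "vserver": "VS_9_802009", "total_ops": 325, "latency_us": 103094},
]


def top_n(rows, sort_key, n):
    """Return the n rows with the highest value for sort_key (highest first)."""
    return sorted(rows, key=lambda row: row[sort_key], reverse=True)[:n]


# Roughly the idea behind: statistics volume show -sort-key latency -max 2
for row in top_n(rows, "latency_us", 2):
    print(row["vserver"], row["latency_us"])
```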

Viewing event logs

You can use a combination of the tricks above to effectively view event logs to search for errors.

Maybe you want to see only error messages:

cluster::> event log show -severity err
Time                Node             Severity      Event
------------------- ---------------- ------------- ---------------------------
2/13/2015 15:10:07  clusternode-02     ERROR         secd.ldap.connectFailure: vserver (TRUST) could not make a connection over the network to LDAP server (italy) at address (10.228.225.125) and received error (Invalid credentials)
2/13/2015 11:06:33  clusternode-02     ERROR         secd.ldap.connectFailure: vserver (TRUST) could not make a connection over the network to LDAP server (italy) at address (10.228.225.125) and received error (Invalid credentials)
2/13/2015 06:57:58  clusternode-02     ERROR         secd.ldap.connectFailure: vserver (TRUST) could not make a connection over the network to LDAP server (italy) at address (10.228.225.125) and received error (Invalid credentials)
2/13/2015 02:59:22  clusternode-02     ERROR         secd.ldap.connectFailure: vserver (TRUST) could not make a connection over the network to LDAP server (italy) at address (10.228.225.125) and received error (Invalid credentials)

Maybe you want to see messages from a specific event type:

cluster::> event log show -messagename secd*
Time                Node             Severity      Event
------------------- ---------------- ------------- ---------------------------
2/13/2015 15:10:07  clusternode-02     ERROR         secd.ldap.connectFailure: vserver (TRUST) could not make a connection over the network to LDAP server (italy) at address (10.228.225.125) and received error (Invalid credentials)
2/13/2015 11:06:33  clusternode-02     ERROR         secd.ldap.connectFailure: vserver (TRUST) could not make a connection over the network to LDAP server (italy) at address (10.228.225.125) and received error (Invalid credentials)
2/13/2015 06:57:58  clusternode-02     ERROR         secd.ldap.connectFailure: vserver (TRUST) could not make a connection over the network to LDAP server (italy) at address (10.228.225.125) and received error (Invalid credentials)
2/13/2015 02:59:22  clusternode-02     ERROR         secd.ldap.connectFailure: vserver (TRUST) could not make a connection over the network to LDAP server (italy) at address (10.228.225.125) and received error (Invalid credentials)

Or maybe you only want to see errors for the last 2 hours:

cluster::> event log show -time >120m -severity err
Time                Node             Severity      Event
------------------- ---------------- ------------- ---------------------------
2/13/2015 15:10:07  clusternode-02     ERROR         secd.ldap.connectFailure: vserver (TRUST) could not make a connection over the network to LDAP server (italy) at address (10.228.225.125) and received error (Invalid credentials)
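The three filters above (severity, message name wildcard, time window) can be stacked, and the logic is simple enough to sketch outside the CLI. This is a Python illustration only. The timestamps come from the output above, while the "now" value of 16:00 on 2/13/2015 and the function name filter_events are assumptions made for the example.

```python
from datetime import datetime, timedelta
from fnmatch import fnmatch

# (time, severity, message name) tuples taken from the event log output above.
events = [
    (datetime(2015, 2, 13, 15, 10, 7), "ERROR", "secd.ldap.connectFailure"),
    (datetime(2015, 2, 13, 11, 6, 33), "ERROR", "secd.ldap.connectFailure"),
    (datetime(2015, 2, 13, 6, 57, 58), "ERROR", "secd.ldap.connectFailure"),
    (datetime(2015, 2, 13, 2, 59, 22), "ERROR", "secd.ldap.connectFailure"),
]


def filter_events(events, severity=None, name_pattern=None, newer_than=None):
    """Keep events that pass every filter that was supplied."""
    matches = []
    for when, sev, name in events:
        if severity is not None and sev != severity:
            continue
        if name_pattern is not None and not fnmatch(name, name_pattern):
            continue
        if newer_than is not None and when <= newer_than:
            continue
        matches.append((when, sev, name))
    return matches


now = datetime(2015, 2, 13, 16, 0, 0)  # assumed "current time" for the example
recent = filter_events(events, severity="ERROR",
                       newer_than=now - timedelta(minutes=120))
print(recent)
```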

Question marks in LUN serial numbers, etc?

In cDOT, the question mark (?) automatically triggers a help screen whenever you type it. This is useful in many cases, but not when you want to set a LUN serial number with a question mark in it. However, there is a way to disable the help function:

cluster::> set -active-help false

Once you do that, the ? will no longer trigger the help screen!

Lots more you can do from the CLI… But the stuff above should get you more than ready!

To close this out, let’s play a ninja song (NSFW, btw).

TECH::vSphere 6.0 – NFS thoughts

DISCLAIMER: I work for NetApp. However, I don’t speak for NetApp. These are my own views. 🙂

I’m a tad late to the party here, as there have already been numerous blogs about what’s new in vSphere 6.0, etc. I haven’t seen anything regarding what was missing from a NFS perspective, however. So I’m going to attempt to fill that gap.

What new NFS features were added?

Famously, vSphere 6 brings us NFSv4.1. NFSv4.1 is an enhancement of NFSv4.0. Together, the two versions brought the following features:

  • Pseudo/unified namespace
  • TCP only
  • Better security via domain ID string mapping, single firewall port and Kerberos integration
  • Better locking than NFSv3 via a lease-based model
  • Compound NFS calls (i.e., combining multiple NFS operations into a single packet)
  • Better standardization of the protocol, leveraging IETF
  • More granular ACLs (similar to Windows NTFS ACLs)
  • NFS referrals
  • NFS sessions
  • pNFS

I cover NFSv4.x in some detail in TR-4067 and TR-4073. I cover pNFS in TR-4063.

I wrote a blog post a while back on the Evolution of NAS, which pointed out how NFS and CIFS were going all Voltron on us and basically becoming similar enough to call them nearly identical.

vSphere 6.0 also brings the ability to Kerberize NFS mounts, as well as VVOL support. Fun fact: NetApp is currently the only storage vendor with support for VVOLs over NFS. 

Why do these features matter?

As Stephen Foskett correctly pointed out in his blog, adoption of NFSv4.x has been… slow. A lot of reasons for that, in addition to what he said.

  • Performance. NFSv3 is simply faster in most cases now. Though, that narrative is changing…
  • Disruption. NFSv3 had the illusion of being non-disruptive in failover events. NFSv4 is stateful, thus more susceptible to interruptions, but its locking makes it less susceptible to data loss/corruption in failover events (both network and storage).
  • Infrastructure. It’s a pain in the ass to add name services to an existing enterprise environment to ensure proper ID string mapping.
  • Disdain for change. No one wants to be the “early adopter” in a production environment.

However, more and more applications are recommending NFSv4.x. TIBCO is one. IBM MQueue is another. Additionally, there is a greater focus on security with recent data breaches and hacks, so storage administrators will need to start filling check boxes to be compliant with new security regulations. NFSv4.x features (Kerberos, domain ID, limited firewall ports to open) will likely be on that list. And now, vSphere offers NFSv4.1 with some limited features. What this means for the NFS protocol is that more people will start using it. And as more people start using it, the open-source-ness will start to kick in and the protocol will improve.

As for Kerberos, one of the questions you may be asking, or may have heard asked, is, “Why the heck do I want to Kerberize my NFS datastore mount? Doesn’t my export policy rule secure it enough?”

Well, how easy is it to change an IP address of an ESXi server? How easy is it to create a user? That’s really all you need to mount NFSv3. However, Kerberos requires a user name and password, interaction with a KDC, ticket exchange, etc. So, it’s much more secure.

As for VVOLs, they could be a game changer in the world of software-defined storage.

Check out the following:

Virtual Volumes (VVOLs) On Horizon to Deliver Software Defined Storage for vSphere

The official VMware VVOL blog

vMiss also has a great post on VVOLs on her blog.

Also, NetApp’s ESX TME Peter Learmonth (@titaniumlegs on Twitter) has a video on it:

That’s great and all… but what’s missing?

While it’s awesome that VMware is attempting to keep the NFS stack up to date by adding NFSv4.1 and Kerberos, it just felt a little… incomplete.

For one, Kerberos was added, but only with DES support. This is problematic on a few levels. DES is old and laughably weak as far as Kerberos enctypes go. DES was cracked in less than a day… in 2008. If they were going to add Kerberos, why not AES, which is the NIST standard? Were they concerned about performance? AES has been known to be a bit of a hog. If that was a concern, though, why not take advantage of the AES instructions built into Intel CPUs (AES-NI)?

As for NFSv4.1… WHERE IS PNFS?? pNFS is an ideal protocol for what virtual machines do – open once, stream reads and writes. Not a ton of metadata. Mobile and agile with storage VMotion and volume moves in clustered Data ONTAP. No need to use up a ton of IP addresses (one per node, per datastore). Most storage operations via NFS would be simplified and virtually transparent with pNFS. Hopefully they add that one soon.

Ultimately, an improvement

I’m glad that VMware added some NFS improvements. It’s a step in the right direction. And they certainly beefed up the capabilities of vSphere 6 with added hardware support. Some of those numbers… monstrous! Hopefully they continue the dedication to NFS in future releases.

Wait, there’s more?!?

That’s right! In addition to the improvements of vSphere 6.0, there is also VMware Horizon, which integrates with NetApp’s All-Flash FAS solutions. NetApp All-Flash FAS provides the only all-flash NFS support on the market!

To learn more about it, see this video created by NetApp TME Chris Gebhardt.

You can also see Shankay Iyer’s blog post here.

Introducing A New Release of VMWare Horizon!

For more info…

What’s New in the VMware vSphere 6.0 Platform

For a snarky rundown on NFSv4.1 and vSphere 6.0, check out Stephen Foskett’s blog.

For some more information on NFS-specific features, see Cormac Hogan’s post.

TECH::OMFG! Microsoft is killing IDMU???

Yesterday, I wrote a blog post on LDAP. During that, I was researching links to add to it and I came across this gem. I decided to leave it out of yesterday’s post for two reasons:

  1. Getting clarification on what this actually means
  2. Not to let it fall through the cracks

http://blogs.technet.com/b/activedirectoryua/archive/2015/01/25/identity-management-for-unix-idmu-is-deprecated-in-windows-server.aspx

A few users have asked about this recently so I am posting here to help let everyone know that Identity Management for Unix (IDMU) is deprecated and will not ship in future versions of Windows Server. This is documented in a couple places:

Identity Management for UNIX 

Features Removed or Deprecated in Windows Server 2012 R2

All IDMU-related features will go away, including UNIX Attributes tab. This also applies Network Information Service (NIS) and Remote Server Administration Tools (RSAT). Instead of RSAT, you should use native LDAP, Samba Client, Kerberos, or non-Microsoft options. For Network File System (NFS), there is a Windows PowerShell cmdlet that allows you to update the user account with uid/gid: Set-NfsMappedIdentity.

In the future, if you try upgrade a computer that runs IDMU components, the upgrade will stop and you will be prompted to remove IDMU as explained at Installing or removing Identity Management for UNIX by using a command line.

Reading that, I immediately thought… WTF THEY ARE REMOVING UNIX LDAP???

Source: Playbuzz.com, Home Alone

Naturally, since I push people toward the goodness that is Active Directory LDAP (such as the 240+ page TR-4073), I was a little… concerned. If you look at the comments in that MS blog link, I am Justin P.

However, Justin (from Microsoft) responded and it’s not as bad as I initially thought.

This is what is actually happening:

  • Microsoft, for whatever reason (and here’s hoping they reconsider), is removing the Tools for IDMU. So, no more native GUI to manage attributes, and possibly no more UNIX application support.
  • The schema backend, which is what hosts the UNIX-y attributes, will remain intact.
  • LDAP can still be used on AD, but you will need to manage the attributes manually, either via ADSI/Attributes Editor or via PowerShell. Or, use something like Centrify.

If I recall, when I installed Windows 2012 R2, I didn’t need to extend the schema for UNIX attributes. They were already there – just not populated. But it’s still worth talking about. 🙂

LDAP::What the heck is an LDAP anyway? – Part 1: Intro

Where’d I put that UID? Source: ZDNet.com

In my time in technical support and as the Technical Marketing Engineer at NetApp, I’ve come to realize that LDAP is one misunderstood technology, which is unfortunate, because I am seeing a large uptick in people who are using it in enterprise environments. As a result, I plan on starting a series of LDAP related posts based on this one, which was originally posted in February of 2015. These will be listed in TECH and LDAP Categories on this blog.

Some topics I’ll be covering (which will become links as they are written):

NOTE: I’m going to keep this real high-level and not bore you with all the details because we could get into the weeds real fast. You can always find copious amounts of less interesting information out there in my Technical Reports or on your preferred search engine. If you have specific questions or want me to add something to this entry, hit me up on Twitter @NFSDudeAbides.

What is LDAP?

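LDAP stands for “Lightweight Directory Access Protocol.” Strictly speaking, it is the protocol used to talk to a directory, but in practice the name also gets used for the directory itself: essentially, a database that acts as a phone book for all sorts of information for clients and servers in enterprise NAS environments. Some of these include: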

  • Names
  • Addresses
  • Phone numbers
  • Unique numerical identifiers (SID, UID, GID, UUID, etc)
  • And many more!

Is Microsoft Active Directory LDAP?

Yes. And no. And maybe.

Microsoft Active Directory uses LDAP for its backend. It leverages LDAP RFC-2307 schema standards and populates records in its database with all sorts of good information about its objects.

For example, this machine account:

machineaccount

Notice it has all sorts of data about that machine account, including the servicePrincipalName (SPN) that is critical to leveraging Kerberos in both Microsoft and non-Microsoft KDCs.

However, while Microsoft is using LDAP and technically could be called LDAP, it’s not LDAP in the traditional sense. That is reserved for the use of UNIX style attributes for users, groups, netgroups and so on. By default, Microsoft does not apply these schema attributes to its implementation of LDAP. However, the schema can be modified or extended to add attributes to populate things like UID, GID, etc.

How does it work?

LDAP functionality works like this, in a simplistic view.

  • A database is populated with objects and schema attributes.
  • LDAP clients are configured to connect to the LDAP server. This normally means they use port 389 or 636 (LDAP over SSL). Sometimes, the Global Catalog port 3268 can be used when using AD LDAP.
  • When a client connects to LDAP, it tries to bind based on the configuration. This can be SASL (such as GSSAPI, DIGEST-MD5, etc), simple or anonymous, depending on what the client and server are configured for/support.
  • After a bind, a lookup is done via RFC standards using ldapsearch commands and filter strings.
  • If the object and requested attributes exist, they are returned to the client.
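To make the "filter strings" part of the lookup step a little more concrete, here is a small Python sketch that builds a search filter for a UNIX-style user. It assumes an RFC 2307 style schema (the posixAccount object class and uid attribute), escapes special characters per RFC 4515, and uses a made-up user name. It only builds the string; it does not talk to an LDAP server.

```python
# Characters that must be escaped inside an LDAP filter value (RFC 4515).
_FILTER_ESCAPES = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "\x00": r"\00",
}


def escape_filter_value(value: str) -> str:
    """Escape a value so it can be placed safely inside a search filter."""
    return "".join(_FILTER_ESCAPES.get(ch, ch) for ch in value)


def posix_user_filter(uid: str) -> str:
    """Build a filter matching an RFC 2307 posixAccount entry by uid."""
    return f"(&(objectClass=posixAccount)(uid={escape_filter_value(uid)}))"


# "jdoe" is a hypothetical user name.
print(posix_user_filter("jdoe"))
```

A string like this is what you would hand to ldapsearch (or any LDAP client library) after the bind succeeds. The exact object classes and attributes depend on how your directory's schema is set up.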

Pretty straightforward. The hard part is remembering it. 🙂

What is it used for?

LDAP is used for enterprise NAS environments that want to centralize their identity management rather than keeping track of n number of passwd, group, netgroup and other files. Having LDAP service the clients allows standardized and automated client rollout. All atomic updates are done from the server side and replicated across multiple servers. All a client has to do is connect and search.

In TR-4073: Secure Unified Authentication, I cover LDAP as it pertains to NetApp’s clustered Data ONTAP. It focuses mainly on Microsoft Active Directory as an LDAP server, but also includes information on RedHat’s Directory Server (which is THE best UNIX-based LDAP server I’ve used so far).

Additionally, a new name service best practice guide was just released today. Check out TR-4379: Name Services Best Practices for more info.

TECH::What Super Bowl 49 Taught Us About Transitioning from NFSv3 to NFSv4.x

I’m sure most everyone either watched Super Bowl 49 or at least read about it somewhere. The Seattle Seahawks and New England Patriots clashed down to the wire for one of the most thrilling games in recent history. Both sides played fairly well, but not without mistakes. And as most of us know, it all came down to a final decision – do we run or pass from the 1 yard line?

The Seahawks chose to pass and the rest was history.

So what lessons can storage administrators learn about NFS from football (or as some call it, sportball)? In this case, we’ll call NFSv3 “the run” and NFSv4.x? The forward pass.

The forward pass: A little history

American football was not always the game it is today. The sport evolved from rugby and the first American football game was played in 1869. Like rugby, it was a run-heavy, violent game. Keep in mind that these used to be the helmets, and helmets weren’t even worn in the early games:

Yeah… that’s not gonna hurt, right? Source: AntiqueAthlete.com

In 1906, a rule change legalizing the forward pass was added to try to help make the game less… bloody.

Mutant League Football – Source: Game Informer

As a result, the game opened up and became more exciting with higher scores and it added a facet of strategy to the game. Now it wasn’t just about simple brute force – you had to try to guess what the other team was going to do based on the situation.

NFS: Not Football Specific

NFS actually stands for Network File System and is used in enterprise environments across the world for file storage. Everything from music to movies to medical images to seismic data to virtual machines is hosted on NFS storage (preferably NetApp :-P). There are plenty of resources out there on what it is, starting with the Request For Comments (RFC) standards. As for NFS on NetApp, I cover it in some detail in the following technical reports (light reading, especially if you’re having trouble sleeping):

TR-3580: NFSv4 Enhancements and Best Practices Guide: Data ONTAP Implementation

TR-4063: Parallel Network File System Configuration and Best Practices for Clustered Data ONTAP 8.2 and Later

TR-4067: Clustered Data ONTAP NFS Best Practice and Implementation Guide

TR-4073: Secure Unified Authentication

TR-4379: Name Services Best Practices

But NFS, and NAS storage administration, has some things in common with football – you have to decide on what to do based on every situation.

NFSv4.x and the forward pass

If we look at the decision made last night not to run the ball on 2nd and goal from the 1 yard line with a player nicknamed “Beast Mode” (how do you NOT run it??) in terms of transitioning from the generally rock-solid NFSv3 to NFSv4.x, we can make the following analogies.

  • Do I give the ball to my best player (NFSv3) that has been dependable for years and see what happens?
    I know it could backfire on me, and he does have his flaws…
  • Do I throw a forward pass (NFSv4.x) without really thinking about the immediate consequences of just jumping into that decision?
    I could misconfigure the pass and that would be it for my enterprise data storage…
  • Do I call a timeout and re-think the play design?
    We are running out of time. We need to make the best decision for our business!

Planning the transition to NFSv4.x

Like any good coach or storage administrator, we need to come into every game prepared. We need to think out all possible scenarios and plan for them accordingly, but also be flexible enough to adjust on the fly if unforeseen circumstances occur. After all, unforeseen circumstances are, by definition, unforeseen.

In TR-4067, I cover some considerations that need to be made when deciding to move from NFSv3 to NFSv4.x. The following is a sneak preview. Keep in mind that this section may change in the final release of the TR, but the general concepts remain.

Transitioning from NFSv3 to NFSv4.x: Considerations

The following section covers some considerations that need to be addressed when migrating from NFSv3 to NFSv4.x. When choosing to use NFSv4.x after using NFSv3, you cannot simply turn it on and have it work as expected. There are specific items to address, such as:

  • Domain strings/ID mapping
  • Storage failover considerations
  • Name services
  • Firewall considerations
  • Export policy rule considerations
  • Client support
  • NFSv4.x features and functionality

ID Domain Mapping

While customers prepare to migrate their existing setup and infrastructure from NFSv3 to NFSv4, some environmental changes must be made before moving to NFSv4. One of them is “id domain mapping.”

In clustered Data ONTAP 8.1, a new option called v4-id-numerics was added. With this option enabled, even if the client does not have access to the name mappings, numeric IDs can be sent in the user name and group name fields and the server accepts them and treats them as representing the same user as would be represented by a v2/v3 UID or GID having the corresponding numeric value.

Essentially, this makes NFSv4.x behave more like NFSv3. This also removes the security enhancement of forcing ID domain resolution for NFSv4.x name strings, so whenever possible, keep this option as the default of disabled. If a name mapping for the user is present, however, the name string will be sent across the wire rather than the UID/GID. The intent of this option is to ensure the server never sends “nobody” as a response to credential queries in NFS requests.

To access this command prior to clustered Data ONTAP 8.3, you must be in diag mode. Commands related to diag mode should be used with caution.

Some production environments face the challenge of building new naming service infrastructures like NIS or LDAP so that string-based name mapping works before they can move to NFSv4. With the new “numeric_id” option (-v4-id-numerics), setting up name services is no longer an absolute requirement. The “numeric_id” feature must be supported and enabled on the server as well as on the client. With this option enabled, the client and server exchange user and group UIDs/GIDs just as with NFSv3. However, for this option to be enabled and functional, NetApp recommends having a supported version of the client and the server. For client versions that support numeric IDs with NFSv4, please contact the OS vendor.

Note that -v4-id-numerics should be enabled only if the client supports it.
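Here is a deliberately simplified Python model of the behavior described above, just to show where “nobody” comes from. It is not how Data ONTAP implements ID mapping. The function name, the example domain and user, and the nobody UID of 65534 (a common convention on many systems) are all assumptions for illustration.

```python
NOBODY_UID = 65534  # conventional "nobody" UID on many systems; assumed here


def resolve_owner(owner, server_domain, uid_by_name, numeric_ids_enabled):
    """Map an NFSv4 owner string to a UID, falling back to nobody."""
    if "@" in owner:
        name, domain = owner.rsplit("@", 1)
        # The ID domain must match on both sides for the name string to map.
        if domain.lower() == server_domain.lower():
            return uid_by_name.get(name, NOBODY_UID)
        return NOBODY_UID
    # A bare number is accepted only when numeric IDs are enabled,
    # which is what makes NFSv4.x behave more like NFSv3.
    if numeric_ids_enabled and owner.isdigit():
        return int(owner)
    return NOBODY_UID


users = {"alice": 1001}  # hypothetical name service contents
print(resolve_owner("alice@example.com", "example.com", users, False))
print(resolve_owner("alice@other.com", "example.com", users, False))
print(resolve_owner("1001", "example.com", users, True))
```

The second call is the classic misconfiguration: client and server disagree on the ID domain, so the user gets squashed to nobody even though the name exists.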

Storage failover considerations

NFSv4.x uses a completely different locking model than NFSv3. Locking in NFSv4.x is a lease-based model that is integrated into the protocol itself rather than separated out as it is in NFSv3 (NLM). From the ONTAP documentation:

In accordance with RFC 3530, Data ONTAP “defines a single lease period for all state held by an NFS client. If the client does not renew its lease within the defined period, all states associated with the client’s lease may be released by the server.” The client can renew its lease explicitly or implicitly by performing an operation, such as reading a file.

Furthermore, Data ONTAP defines a grace period, which is a period of special processing in which clients attempt to reclaim their locking state during a server recovery.

Term          Definition (as per RFC-3530)
------------  ----------------------------
Lease         The time period in which Data ONTAP irrevocably grants a lock to a client.
Grace period  The time period in which clients attempt to reclaim their locking state from Data ONTAP during server recovery.
Lock          Refers to both record (byte-range) locks as well as file (share) locks unless specifically stated otherwise.

For more information on locking, see the section in this document on NFSv4.x locking. Because of this new locking methodology, as well as the statefulness of the NFSv4.x protocol, storage failover operates differently as compared to NFSv3. For more information, see the section in this document on storage failover in clustered Data ONTAP.
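The lease idea is easy to picture with a toy model. The Python sketch below tracks a single client lease: any operation renews it, and once the lease period passes without a renewal the server is free to release that client's state. The class name and the 30-second lease period are assumptions for the example, and time is passed in explicitly (as plain seconds) to keep the sketch deterministic.

```python
class ClientLease:
    """Toy model of one NFSv4 client's lease on the server side."""

    def __init__(self, lease_seconds, now):
        self.lease_seconds = lease_seconds
        self.last_renewed = now

    def renew(self, now):
        """Called on an explicit renewal or implicitly on any client operation."""
        self.last_renewed = now

    def expired(self, now):
        """True once the lease period has passed without a renewal."""
        return now - self.last_renewed > self.lease_seconds


lease = ClientLease(lease_seconds=30, now=0)  # 30s is an assumed value
print(lease.expired(now=20))   # still within the lease period
lease.renew(now=25)            # e.g. the client read a file
print(lease.expired(now=50))   # renewed at 25, so still valid
print(lease.expired(now=56))   # more than 30s since the last renewal
```

The grace period works alongside this: after a server recovery, clients get a window to reclaim the locks they held before the server went away.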

Name services

If deciding to use NFSv4.x, it is a best practice to centralize the NFSv4.x users in name services such as LDAP or NIS. This allows all clients and clustered Data ONTAP NFS servers to leverage the same resources and guarantees that all names, UIDs and GIDs will be consistent across the implementation. For more information on name services, see TR-4073: Secure Unified Authentication and TR-4379: Name Services Best Practices.

Firewall considerations

NFSv3 requires several ports to be opened for ancillary protocols such as NLM and NSM, in addition to port 2049. NFSv4.x requires only port 2049. If you want to use NFSv3 and NFSv4.x in the same environment, all relevant NFS ports should be opened. These ports are referenced in this document.
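A quick way to sanity check a firewall change is to see whether a TCP connection to port 2049 succeeds from the client. Here is a small Python sketch that does this; the function name is hypothetical and the host is whatever data LIF or server address you use in your own environment. It only confirms that the port is reachable, not that NFSv4.x is enabled or exported.

```python
import socket


def tcp_port_open(host: str, port: int = 2049, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

If this returns False for port 2049, there is no point in troubleshooting mounts or ID mapping until the network path is sorted out.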

Volume language considerations

In NetApp’s Data ONTAP, volumes can have specific languages set. This is intended to be used for internationalization of file names for languages that use characters not common to English, such as Japanese, Chinese, German, etc. When using NFSv4.x, RFC-3530 states that UTF-8 is recommended.

11.  Internationalization

   The primary issue in which NFS version 4 needs to deal with
   internationalization, or I18N, is with respect to file names and
   other strings as used within the protocol.  The choice of string
   representation must allow reasonable name/string access to clients
   which use various languages.  The UTF-8 encoding of the UCS as
   defined by [ISO10646] allows for this type of access and follows the
   policy described in "IETF Policy on Character Sets and Languages",
   [RFC2277].

If you intend to migrate to clustered Data ONTAP from a 7-mode system and use NFSv4.x, you should be using some form of UTF-8 language support, such as C.UTF-8 (which is the default language of volumes in clustered Data ONTAP). If the 7-mode system is not already using a UTF-8 language, it should be converted before transitioning to cDOT or before moving from NFSv3 to NFSv4. The exact UTF-8 language to specify depends on the requirements of the native language, to ensure proper display of character sets.

7-mode allowed volumes that hosted NFSv4.x data to use C language types. Clustered Data ONTAP does not, as it honors the RFC standard recommendation of UTF-8. TR-4160: Secure Multi-tenancy Considerations covers language recommendations in cDOT. When changing a volume’s language, every file in the volume must be accessed after the change to ensure they all reflect the language change. This can be done via a simple “ls -lR” to do a recursive listing of files.
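If you would rather script the "touch every file" step than run it by hand, the recursive listing can be approximated in Python. This is a minimal sketch that assumes the volume is mounted on the client at a path you pass in; the function name is made up for illustration. Like "ls -lR", it walks the tree and stats each entry without reading file contents.

```python
import os


def stat_all(root: str) -> int:
    """Walk root recursively, stat every entry, and return how many were visited."""
    visited = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                os.lstat(os.path.join(dirpath, name))
                visited += 1
            except OSError:
                # Entry vanished or is unreadable; skip it and keep going.
                pass
    return visited
```

On a large volume this can take a while and generates a lot of metadata traffic, so it is probably best run outside of peak hours.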

Export policy rules

In clustered Data ONTAP, it is possible to specify which version of NFS is supported for an exported filesystem. If an environment was configured for NFSv3 and the export policy rule option -protocol was limited to allow NFSv3 only, then it would need to be modified to allow NFSv4. Additionally, policy rules could be configured to allow only NFSv4.x clients access.

Example:

cluster::> export-policy rule modify -policy default -vserver NAS -protocol nfs4

For more information, consult the product documentation for your specific version of clustered Data ONTAP.

Client considerations

When using NFSv4.x, clients are as important to consider as the NFS server. The following client considerations should be followed when implementing NFSv4.x. Other considerations may be necessary. Please contact the OS vendor for specific questions about NFSv4.x configuration.

  • NFSv4.x is supported.
  • The fstab file and NFS configuration files are configured properly. When mounting, the client will negotiate the highest NFS version available with the NFS server. If NFSv4.x is not allowed by the client or fstab specifies NFSv3, then NFSv4.x will not be used at mount.
  • The idmapd.conf file is configured with the proper settings.
  • The client either contains identical users and UID/GID (including case sensitivity) or is using the same name service server as the NFS server/clustered Data ONTAP SVM.
  • If using name services on the client, the client is configured properly for name services (nsswitch.conf, ldap.conf, sssd.conf, etc.) and the appropriate services are started, running and configured to start at boot.
  • The NFSv4.x service is started, running and configured to start at boot.
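The fstab item in the list above is an easy one to get wrong, so here is a small Python sketch that reports which NFS version an fstab entry asks for. It is only a rough helper under a few assumptions: the function name is made up, it looks only at the filesystem type and the `vers=`/`nfsvers=` mount options, and it ignores anything set elsewhere in the client's NFS configuration files.

```python
from __future__ import annotations


def requested_nfs_version(fstab_line: str) -> str | None:
    """Return the NFS version an fstab entry requests.

    Returns the explicit version string (such as "3" or "4.1"), "4" for an
    nfs4 filesystem type with no version option, "negotiate" when nothing is
    pinned, or None if the line is not an NFS entry.
    """
    fields = fstab_line.split()
    if len(fields) < 4 or fields[0].startswith("#"):
        return None
    fstype, options = fields[2], fields[3].split(",")
    if fstype not in ("nfs", "nfs4"):
        return None
    for opt in options:
        key, _, value = opt.partition("=")
        if key in ("vers", "nfsvers") and value:
            return value
    return "4" if fstype == "nfs4" else "negotiate"
```

An entry that comes back as "3" will never mount with NFSv4.x, no matter what the server allows, while "negotiate" leaves the choice to the client and server at mount time.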

NFSv4.x Features and Functionality

NFSv4.x is the next evolution of the NFS protocol and enhances NFSv3 with new features and functionality, such as referrals, delegations and pNFS. These features are covered throughout this document and should be factored into any design decisions for NFSv4.x implementations.

If you do it right and make the right play calls, NFSv4.x could bring home the championship for your NAS storage environment!