Behind the Scenes Episode 290 – NetApp E-Series BeeGFS CSI Driver

Welcome to Episode 290, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”

This week we discuss NetApp E-Series support for BeeGFS and the new CSI driver for containers and Kubernetes deployments. Joining us are Eric Weber (eric.weber2@netapp.com) and Joe McCormick (@iamjoemccormick, https://www.linkedin.com/in/developedbyjoe/) from the NetApp E-Series engineering team.

Podcast Transcriptions

If you want a searchable transcript of the episode, check it out here (just set expectations accordingly):

Episode 290: NetApp E-Series BeeGFS CSI Driver – Transcript

Just use the search field to look for words you want to read more about. (For example, search for “storage”)

Be sure to give us feedback on the transcription in the comments here or via podcast@netapp.com (and reach out if you need a full text transcript, since Gong does not support sharing those yet)! If you have requests for transcriptions of previous episodes, let me know!

Tech ONTAP Community

We also now have a presence on the NetApp Communities page. You can subscribe there to get emails when we have new episodes.

Tech ONTAP Podcast Community

Finding the Podcast

You can find this week’s episode here:

You can also find the Tech ONTAP Podcast on:

I also recently got asked how to leverage RSS for the podcast. You can do that here:

http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss

Behind the Scenes Episode 289 – NetApp and Rubrik: Better Together

Welcome to Episode 289, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”

This week, we discuss the latest news in the NetApp/Rubrik partnership, and how Rubrik works, with Ben Kendall (Alliances Technical Partner Manager, Americas at Rubrik, https://www.linkedin.com/in/ben-kendall-6436609/) and PF Guglielmi (Alliances Field CTO, @pfguglielmi) of Rubrik and Chris Maino (Manager, Americas Solutions Architects, chris.maino@netapp.com) of NetApp.

For more information:

https://www.rubrik.com/en/partners/technology-partners/netapp

https://www.rubrik.com/en/company/newsroom/press-releases/21/rubrik-and-netapp-extend-partnership

Podcast Transcriptions

If you want a searchable transcript of the episode, check it out here (just set expectations accordingly):

Episode 289 – NetApp and Rubrik: Better Together – Transcript

How to see data transfer on an ONTAP cluster network

ONTAP clusters use a backend cluster network to allow multiple HA pairs to communicate and provide more scale for performance and capacity. This lets you nondisruptively add new nodes (and, as a result, capacity and compute) to a cluster. Data is accessible regardless of where you connect in the cluster. You can scale up to 24 nodes for NAS-only clusters and mix different HA pair types in the same cluster if you want to offer different service levels for storage (such as performance tiers, capacity tiers, etc.).

Network interfaces that serve data to clients live on physical ports on nodes and are floating/virtual IP addresses that can move to/from any node in the cluster. File systems for NAS are defined by Storage Virtual Machines (SVMs) and volumes. The SVMs own the IP addresses you would use to access data.

When using NAS (CIFS/SMB/NFS) for data access, you can connect to a data interface in the SVM that lives on any node in the cluster, regardless of where the data volume resides. The following graphic shows how that happens.

When you access a NAS volume on a data interface on the same node as the data volume, ONTAP can “cheat” a little and directly interact with that volume without having to do extra work.

If that data interface is on a different node than where the volume resides, then the NAS packet gets packaged up as a proprietary protocol and shipped over the cluster network backend to the node where the volume lives. This volume/node relationship is stored in an internal database in ONTAP so we always have a map to find volumes quickly. Once the NAS packet arrives on the destination node, it gets unpackaged, processed and then the response to the client goes back out the way it came.
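To make the local-versus-remote distinction concrete, here is a toy Python model of that lookup. It is only a sketch: the volume names, node names and the io_path function are made up for illustration, and ONTAP's internal database and cluster protocol are far more involved than a dictionary.

```python
# Toy model only: a hypothetical map of volume name -> owning node.
VOLUME_LOCATION = {"vol_home": "node1", "vol_scratch": "node2"}


def io_path(lif_node: str, volume: str) -> str:
    """Describe the path a NAS request takes, given the node hosting the data interface."""
    owner = VOLUME_LOCATION[volume]
    if owner == lif_node:
        # Data interface and volume share a node: no cluster network hop.
        return "local"
    # Otherwise the request is forwarded over the cluster network to the owner.
    return f"remote: {lif_node} -> cluster network -> {owner}"


print(io_path("node1", "vol_home"))     # local
print(io_path("node1", "vol_scratch"))  # remote: node1 -> cluster network -> node2
```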

Traversing the cluster network has a bit of a latency cost, however, as the packaging/unpackaging/traversal takes some time (more time than a local request). This manifests as slightly lower performance for those workloads. The impact of that performance hit is negligible in most environments, but latency-sensitive applications might see some noticeable performance degradation.

There are protocol features that help mitigate the remote I/O that can occur in a cluster, such as SMB node referrals and pNFS. But in scenarios where you can’t use either of those (SMB node referrals didn’t use Kerberos in earlier Windows versions; pNFS needs NFSv4.1 or later), you’re likely going to have remote cluster traffic. As mentioned, in most cases this isn’t an issue, but it may be useful to have an easy way to find out whether an ONTAP cluster is doing remote/cluster traffic.

Cluster level – Statistics show-periodic

To get a cluster-wide view of whether there is remote traffic on the cluster, you can use the advanced privilege command “statistics show-periodic.” This command gives a wealth of information by default, such as:

  • CPU average/busy
  • Total ops/NFS ops/CIFS ops
  • FlexCache ops
  • Total data received/sent (Data and cluster network throughput)
  • Data received/sent (Data throughput only)
  • Cluster received/sent (Cluster throughput only)
  • Cluster busy % (how busy the cluster network is)
  • Disk reads/writes
  • Packets sent/received

We also have options to limit the intervals, define SVMs/vservers, etc.

::*> statistics show-periodic ?
[[-object] ] *Object
[ -instance ] *Instance
[ -counter ] *Counter
[ -preset ] *Preset
[ -node ] *Node
[ -vserver ] *Vserver
[ -interval ] *Interval in Seconds (default: 2)
[ -iterations ] *Number of Iterations (default: 0)
[ -summary {true|false} ] *Print Summary (default: true)
[ -filter ] *Filter Data

But for backend cluster traffic, we only care about a few of those, so we can filter the counters down to only what we want to view. In this case, I just want to look at the data sent/received and the cluster busy %.

::*> statistics show-periodic -counter total-recv|total-sent|data-recv|data-sent|cluster-recv|cluster-sent|cluster-busy

When I do that, I get a cleaner, easier-to-read capture. This is what it looks like when we have remote traffic. This is an NFSv4.1 workload without pNFS, using a mount wsize of 64K.

cluster1: cluster.cluster: 5/11/2021 14:01:49
    total    total     data     data cluster  cluster  cluster
     recv     sent     recv     sent    busy     recv     sent
 -------- -------- -------- -------- ------- -------- --------
    157MB   4.85MB    148MB   3.46MB      0%   8.76MB   1.39MB
    241MB   70.2MB    197MB   4.68MB      1%   43.1MB   65.5MB
    269MB    111MB    191MB   4.41MB      4%   78.1MB    107MB
    329MB   92.5MB    196MB   4.52MB      4%    133MB   88.0MB
    357MB    117MB    246MB   5.68MB      2%    111MB    111MB
    217MB   27.1MB    197MB   4.55MB      1%   20.3MB   22.5MB
    287MB   30.4MB    258MB   5.91MB      1%   28.7MB   24.5MB
    205MB   28.1MB    176MB   4.03MB      1%   28.9MB   24.1MB
cluster1: cluster.cluster: 5/11/2021 14:01:57
    total    total     data     data cluster  cluster  cluster
     recv     sent     recv     sent    busy     recv     sent
 -------- -------- -------- -------- ------- -------- --------
Minimums:
    157MB   4.85MB    148MB   3.46MB      0%   8.76MB   1.39MB
Averages for 8 samples:
    258MB   60.3MB    201MB   4.66MB      1%   56.5MB   55.7MB
Maximums:
    357MB    117MB    258MB   5.91MB      4%    133MB    111MB

As we can see, an average of 55.7MB is sent and 56.5MB is received over the cluster network each second. This accounts for an average of 1% of the available bandwidth, which means we have plenty of cluster network bandwidth left over.
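The busy % tells you how loaded the cluster network is, but not how much of the incoming traffic is crossing it. A rough indicator is cluster received divided by total received. Below is a minimal Python sketch of that arithmetic, using two rows copied from the output above. The function names are mine (not part of ONTAP), and the 1024-based unit table is an assumption; the ratio comes out the same either way as long as both values use the same convention.

```python
import re

# Assumption: suffixes are 1024-based. The ratio below does not depend on this.
UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}


def to_bytes(value: str) -> float:
    """Convert a value such as '56.5MB' or '555KB' to bytes."""
    match = re.fullmatch(r"([\d.]+)(B|KB|MB|GB)", value)
    if match is None:
        raise ValueError(f"unrecognized size: {value!r}")
    return float(match.group(1)) * UNITS[match.group(2)]


def cluster_recv_share(row: str) -> float:
    """Fraction of total received throughput that arrived over the cluster network."""
    # Column order matches the filtered output: total recv, total sent,
    # data recv, data sent, cluster busy, cluster recv, cluster sent.
    total_recv, _, _, _, _, cluster_recv, _ = row.split()
    return to_bytes(cluster_recv) / to_bytes(total_recv)


average_row = "258MB   60.3MB    201MB   4.66MB      1%   56.5MB   55.7MB"
peak_row = "329MB   92.5MB    196MB   4.52MB      4%    133MB   88.0MB"

print(f"{cluster_recv_share(average_row):.1%}")  # 21.9%
print(f"{cluster_recv_share(peak_row):.1%}")     # 40.4%
```

So even at 1% busy, roughly a fifth of what this cluster received on average arrived over the backend network.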

When we look at the latency for this workload, this is what we see (using qos statistics latency show):

Policy Group            Latency
-------------------- ----------
-total-                364.00us
extreme-fixed          364.00us
-total-                619.00us
extreme-fixed          619.00us
-total-                490.00us
extreme-fixed          490.00us
-total-                409.00us
extreme-fixed          409.00us
-total-                422.00us
extreme-fixed          422.00us
-total-                474.00us
extreme-fixed          474.00us
-total-                412.00us
extreme-fixed          412.00us
-total-                372.00us
extreme-fixed          372.00us
-total-                475.00us
extreme-fixed          475.00us
-total-                436.00us
extreme-fixed          436.00us
-total-                474.00us
extreme-fixed          474.00us

This is what the cluster network looks like when I use pNFS for data locality:

cluster1: cluster.cluster: 5/11/2021 14:18:19
    total    total     data     data cluster  cluster  cluster
     recv     sent     recv     sent    busy     recv     sent
 -------- -------- -------- -------- ------- -------- --------
    208MB   6.24MB    206MB   4.76MB      0%   1.56MB   1.47MB
    214MB   5.37MB    213MB   4.85MB      0%    555KB    538KB
    214MB   6.27MB    213MB   4.80MB      0%   1.46MB   1.47MB
    219MB   5.95MB    219MB   5.40MB      0%    572KB    560KB
    318MB   8.91MB    317MB   7.44MB      0%   1.46MB   1.47MB
    203MB   5.16MB    203MB   4.62MB      0%    560KB    548KB
    205MB   6.09MB    204MB   4.64MB      0%   1.44MB   1.45MB
cluster1: cluster.cluster: 5/11/2021 14:18:26
    total    total     data     data cluster  cluster  cluster
     recv     sent     recv     sent    busy     recv     sent
 -------- -------- -------- -------- ------- -------- --------
Minimums:
    203MB   5.16MB    203MB   4.62MB      0%    555KB    538KB
Averages for 7 samples:
    226MB   6.28MB    225MB   5.22MB      0%   1.08MB   1.07MB
Maximums:
    318MB   8.91MB    317MB   7.44MB      0%   1.56MB   1.47MB

There is barely any cluster traffic other than the normal cluster operations. The “data” and “total” sent/received values are nearly identical.

And the average latency was about 0.1 ms lower (roughly 330us, compared with roughly 450us for the remote run).

Policy Group            Latency
-------------------- ----------

-total-                323.00us
extreme-fixed          323.00us
-total-                323.00us
extreme-fixed          323.00us
-total-                325.00us
extreme-fixed          325.00us
-total-                336.00us
extreme-fixed          336.00us
-total-                325.00us
extreme-fixed          325.00us
-total-                328.00us
extreme-fixed          328.00us
-total-                334.00us
extreme-fixed          334.00us
-total-                341.00us
extreme-fixed          341.00us
-total-                336.00us
extreme-fixed          336.00us
-total-                330.00us
extreme-fixed          330.00us
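If you want to compare the two runs without eyeballing them, you can average the -total- lines from each capture. This is a small Python sketch, assuming the qos statistics latency show output has been pasted into strings as-is (only the -total- lines are kept here); total_latencies_us is a hypothetical helper name.

```python
from statistics import mean


def total_latencies_us(output: str) -> list[float]:
    """Collect the latency values (in microseconds) from the -total- lines."""
    values = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "-total-" and fields[1].endswith("us"):
            values.append(float(fields[1][:-2]))
    return values


# -total- lines from the run with remote cluster traffic (no pNFS).
remote_run = """
-total-                364.00us
-total-                619.00us
-total-                490.00us
-total-                409.00us
-total-                422.00us
-total-                474.00us
-total-                412.00us
-total-                372.00us
-total-                475.00us
-total-                436.00us
-total-                474.00us
"""

# -total- lines from the pNFS run.
pnfs_run = """
-total-                323.00us
-total-                323.00us
-total-                325.00us
-total-                336.00us
-total-                325.00us
-total-                328.00us
-total-                334.00us
-total-                341.00us
-total-                336.00us
-total-                330.00us
"""

remote_avg = mean(total_latencies_us(remote_run))
pnfs_avg = mean(total_latencies_us(pnfs_run))

print(f"remote: {remote_avg:.1f}us")                 # remote: 449.7us
print(f"pNFS:   {pnfs_avg:.1f}us")                   # pNFS:   330.1us
print(f"difference: {remote_avg - pnfs_avg:.1f}us")  # difference: 119.6us
```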

Try it out and see for yourself! If you have questions or comments, enter them below.

Behind the Scenes – Episode 288 – ONTAP System Manager 9.9.1

Welcome to Episode 288, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”

This week, NetApp Principal TME Chris Gebhardt (@chrisgeb), PM Aniket Singh (aniket.singh@netapp.com) and TME Yizhao Zhuang (yizhao.zhuang@netapp.com) join us to discuss the latest ONTAP System Manager 9.9.1 changes.

For more information about System Manager:

https://docs.netapp.com/us-en/ontap/

Podcast Transcriptions

If you want a searchable transcript of the episode, check it out here (just set expectations accordingly):

Episode 288 – ONTAP System Manager 9.9.1 – Transcript

Running PowerShell from Linux to Query SMB Shares in NetApp ONTAP

I recently got a question about how to perform the following scenario:

  • Run a script from Linux that calls PowerShell on a remote Windows client using Kerberos
  • Remote Windows client uses PowerShell to authenticate against an ONTAP SMB share

That’s some Inception-style IT work.

The issue they were having was that the credentials used to connect to the Windows client weren’t passing through to the ONTAP system. As a result, they’d get “Access Denied” in their script when attempting to access the share. I figured out how to get this working and rather than let that knowledge rot in the far reaches of my brain, I’m writing this up, since in my Google hunt, I found lots of people had similar issues with Linux PowerShell (not necessarily to ONTAP).

This is a known issue with some workarounds listed here:

Making the second hop in PowerShell Remoting

One workaround is to use “Resource-based Kerberos constrained delegation,” where you basically tell the 3rd server to accept delegated credentials from the 2nd server via the PrincipalsAllowedToDelegateToAccount parameter in the ADComputer cmdlets. We’ll cover that in a bit, but first…

WAIT. I can run PowerShell on Linux???

Well, yes! And this article tells you how to install it:

Installing PowerShell on Linux

Now, the downside is that not all PowerShell modules are available from Linux (for example, ActiveDirectory isn’t currently available). But it works!

PS /> New-PSSession -ComputerName COMPUTER -Credential administrator@NTAP.LOCAL -Authentication Kerberos

PowerShell credential request
Enter your credentials.
Password for user administrator@NTAP.LOCAL: **

Id Name     Transport ComputerName ComputerType  State  ConfigurationName    Availability
-- ----     --------- ------------ ------------  ----- --------------------- ------------
9 Runspace9 WSMan     COMPUTER     RemoteMachine Opened Microsoft.PowerShell Available

In that document, they don’t list CentOS/RHEL 8, which can be problematic, as you might run into some issues with the SSL libraries (this blog calls out one of those issues, as well as a few others).

On my CentOS 8.3 box, I ran into this issue:

New-PSSession: This parameter set requires WSMan, and no supported WSMan client library was found. WSMan is either not installed or unavailable for this system.

Using the guidance from the blog listed earlier, I found that there were a couple of files not found:

# ldd /opt/microsoft/powershell/7/libmi.so
…
libssl.so.1.0.0 => not found
libcrypto.so.1.0.0 => not found
…

That blog lists 1.0.2 as what is needed and looks to be using a different Linux flavor. You can find the files you need/where they live with:

# find / -name '*libssl.so.1.*'
/usr/lib64/.libssl.so.1.1.hmac
/usr/lib64/libssl.so.1.1
/usr/lib64/libssl.so.1.1.1g
/usr/lib64/.libssl.so.1.1.1g.hmac
/opt/microsoft/powershell/7/libssl.so.1.0.0

Then you can use the symlink workaround and those files show up properly with ldd:

# cd /usr/lib64
# ln -s libssl.so.1.1 libssl.so.1.0.0
# ln -s libcrypto.so.1.1 libcrypto.so.1.0.0
# ldd /opt/microsoft/powershell/7/libmi.so
...
libssl.so.1.0.0 => /lib64/libssl.so.1.0.0 (0x00007f41ce3fc000)
libcrypto.so.1.0.0 => /lib64/libcrypto.so.1.0.0 (0x00007f41cdf16000)
...
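If you’d rather check for unresolved libraries in a script than read ldd output by eye, something like the following Python sketch would work. It assumes the "name => target" layout shown above; the missing_libraries name and the libc line in the sample are illustrative only.

```python
def missing_libraries(ldd_output: str) -> list[str]:
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        name, arrow, target = line.partition("=>")
        if arrow and target.strip() == "not found":
            missing.append(name.strip())
    return missing


# Sample text in the same layout as the ldd output above.
# The libc line is a made-up example of a library that resolves.
sample = """\
libssl.so.1.0.0 => not found
libcrypto.so.1.0.0 => not found
libc.so.6 => /lib64/libc.so.6 (0x00007f41cd000000)
"""

print(missing_libraries(sample))  # ['libssl.so.1.0.0', 'libcrypto.so.1.0.0']
```

To run it against a real file, you could capture the text with subprocess.run(["ldd", path], capture_output=True, text=True).stdout and pass that in.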

However, authenticating with the server also requires an additional step.

Authenticating Linux PowerShell with Windows

You can authenticate to Windows servers with Linux PowerShell using the following methods:

  • Basic
  • Default
  • Kerberos
  • Credssp
  • Digest
  • Negotiate

Here’s how each auth method works (or doesn’t) without doing anything else.

Auth method  Result
-----------  ------
Basic        New-PSSession: Basic authentication is not supported over HTTP on Unix.
Default      New-PSSession: MI_RESULT_ACCESS_DENIED
CredSSP      New-PSSession: MI_RESULT_ACCESS_DENIED
Digest       New-PSSession: MI_RESULT_ACCESS_DENIED
Kerberos     Authorization failed Unspecified GSS failure.
Negotiate    Authorization failed Unspecified GSS failure.

Auth methods with Linux PowerShell and results with no additional configs

In several places, I’ve seen the recommendation to install gssntlmssp on the Linux client, which works fine for “Negotiate” methods:

PS /> New-PSSession -ComputerName SERVER -Credential administrator@NTAP.LOCAL -Authentication Negotiate

PowerShell credential request
Enter your credentials.
Password for user administrator@NTAP.LOCAL: **

Id Name Transport ComputerName ComputerType State ConfigurationName Availability
-- ---- --------- ------------ ------------ ----- ----------------- ------------
2 Runspace2 WSMan SERVER       RemoteMachine Opened Microsoft.PowerShell Available

But not for Kerberos:

PS /> New-PSSession -ComputerName SERVER -Credential administrator@NTAP.LOCAL -Authentication Kerberos

PowerShell credential request
Enter your credentials.
Password for user administrator@NTAP.LOCAL: **

New-PSSession: [SERVER] Connecting to remote server SERVER failed with the following error message : Authorization failed Unspecified GSS failure. Minor code may provide more information Configuration file does not specify default realm For more information, see the about_Remote_Troubleshooting Help topic.

The simplest way to get around this is to add the Linux client to the Active Directory domain. Then you can use Kerberos for authentication to the client (at least with a user that has the correct permissions, such as a domain administrator).

*Alternately, you could do all this manually, which I don’t recommend.

**For non-AD KDCs, config methods will vary.

# realm join NTAP.LOCAL
Password for Administrator:

# pwsh
PowerShell 7.1.3
Copyright (c) Microsoft Corporation.

https://aka.ms/powershell
Type 'help' to get help.

PS /> New-PSSession -ComputerName SERVER -Credential administrator@NTAP.LOCAL -Authentication Kerberos

PowerShell credential request
Enter your credentials.
Password for user administrator@NTAP.LOCAL: **********


 Id Name            Transport ComputerName    ComputerType    State         ConfigurationName     Availability
 -- ----            --------- ------------    ------------    -----         -----------------     ------------
  1 Runspace1       WSMan     SERVER          RemoteMachine   Opened        Microsoft.PowerShell     Available


So, we know we can establish a session to the Windows server where we want to leverage PowerShell. Now what?

Double-hopping with Kerberos using Delegation

One way to use Kerberos across multiple servers (including NetApp ONTAP) is to leverage the PrincipalsAllowedToDelegateToAccount parameter.

The script I’ll use does a basic “Get-Content” call to a file in an SMB/CIFS share in ONTAP (similar to “cat” in Linux).

If I don’t set the PrincipalsAllowedToDelegateToAccount parameter, a credential passed from Linux PowerShell to a Windows server to ONTAP will use Kerberos -> NTLM (with a NULL user) for the authentication and this is the end result:

# pwsh test.ps1

PowerShell credential request
Enter your credentials.
Password for user administrator@NTAP.LOCAL: **********

Test-Path: Access is denied
False
Get-Content: Access is denied
Get-Content: Cannot find path '\\DEMO\files\file-symlink.txt' because it does not exist.

In a packet capture, we can see the session setup uses NULL with NTLMSSP:

14    0.031496   x.x.x.x   x.x.x.y      SMB2 289  Session Setup Request, NTLMSSP_AUTH, User: \

And here’s what the ACCESS_DENIED looks like:

20    0.043026   x.x.x.x   x.x.x.y      SMB2 166  Tree Connect Request Tree: \\DEMO\files
21    0.043217   x.x.x.y   x.x.x.x      SMB2 131  Tree Connect Response, Error: STATUS_ACCESS_DENIED
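If you export the capture summary as text, you can spot this pattern without scrolling through it. Here is a rough Python sketch that flags the NULL-user NTLMSSP session setup and lists the shares that were denied. It assumes one packet per line in the layout shown above and that each Tree Connect Response follows its own request; the function names are hypothetical.

```python
def is_null_ntlm_setup(line: str) -> bool:
    """True for an NTLMSSP session setup whose user field is empty (a lone backslash)."""
    return (
        "Session Setup Request" in line
        and "NTLMSSP_AUTH" in line
        and line.rstrip().endswith("User: \\")
    )


def denied_shares(lines: list[str]) -> list[str]:
    """Return the share paths whose Tree Connect was answered with STATUS_ACCESS_DENIED."""
    denied = []
    last_tree = None
    for line in lines:
        if "Tree Connect Request" in line and "Tree: " in line:
            # Remember the most recent share path that was requested.
            last_tree = line.split("Tree: ", 1)[1].strip()
        elif "Tree Connect Response" in line and "STATUS_ACCESS_DENIED" in line:
            if last_tree is not None:
                denied.append(last_tree)
    return denied


# The three summary lines from the capture above (backslashes escaped for Python).
capture = [
    "14    0.031496   x.x.x.x   x.x.x.y      SMB2 289  Session Setup Request, NTLMSSP_AUTH, User: \\",
    "20    0.043026   x.x.x.x   x.x.x.y      SMB2 166  Tree Connect Request Tree: \\\\DEMO\\files",
    "21    0.043217   x.x.x.y   x.x.x.x      SMB2 131  Tree Connect Response, Error: STATUS_ACCESS_DENIED",
]

print(any(is_null_ntlm_setup(line) for line in capture))  # True
for share in denied_shares(capture):
    print(share)  # \\DEMO\files
```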

To use Kerberos passthrough/delegation, I run this PowerShell command to set the parameter on the destination (ONTAP) CIFS server:

Set-ADComputer -Identity DEMO -PrincipalsAllowedToDelegateToAccount SERVER$

That allows the SMB session to ONTAP to set up using Kerberos auth:

2603 26.877660 x.x.x.x x.x.x.y SMB2 2179 Session Setup Request
2673 26.909735 x.x.x.y x.x.x.x SMB2 326 Session Setup Response
supportedMech: 1.2.840.48018.1.2.2 (MS KRB5 - Microsoft Kerberos 5)

And the tree connect succeeds (you may need to run klist purge on the Windows client):

2674 26.910117 x.x.x.x x.x.x.y SMB2 154 Tree Connect Request Tree: \\demo\files
2675 26.910630 x.x.x.y x.x.x.x SMB2 138 Tree Connect Response

This is the result from the Linux client:

# pwsh test.ps1

PowerShell credential request
Enter your credentials.
Password for user administrator@NTAP.LOCAL: **********

True
This is a file symlink.

So, how do we work around this issue if we can’t delegate Kerberos?

Using the NULL user and NTLM

Remember when I said the request without Kerberos delegation used the NULL user and NTLMSSP?

14    0.031496   x.x.x.x   x.x.x.y      SMB2 289  Session Setup Request, NTLMSSP_AUTH, User: \ 

The reason we saw “Access Denied” to the ONTAP CIFS/SMB share is that ONTAP disallows the NULL user by default. However, in ONTAP 9.0 and later, you can enable NULL user authentication, as described in this KB article:

How to grant access to NULL (Anonymous) user in Clustered Data ONTAP

Basically, it’s a simple two-step process:

  1. Create a name mapping rule for ANONYMOUS LOGON
  2. Set the Windows default NULL user in the CIFS options

Here’s how I did it in my SVM (address is the Windows client IP):

::*> vserver name-mapping create -vserver DEMO -direction win-unix -position 3 -pattern "ANONYMOUS LOGON" -replacement pcuser -address x.x.x.x/24

The Windows name you set for the NULL user needs to be a valid Windows user.

::*> cifs options modify -vserver DEMO -win-name-for-null-user NTAP\powershell-user

You can verify ONTAP can find it with:

::*> access-check authentication translate -node node1 -vserver DEMO -win-name NTAP\powershell-user
S-1-5-21-3552729481-4032800560-2279794651-1300

Once that’s done, we authenticate with NTLM and get access with the NULL user:

14 0.009012 x.x.x.x x.x.x.y SMB2 289 Session Setup Request, NTLMSSP_AUTH, User: \
27 0.075264 x.x.x.x x.x.x.y SMB2 166 Tree Connect Request Tree: \\DEMO\files
28 0.075747 x.x.x.y x.x.x.x SMB2 138 Tree Connect Response

And the Linux client is able to run the PowerShell calls:

# pwsh test.ps1

PowerShell credential request
Enter your credentials.
Password for user administrator@NTAP.LOCAL: **********

True
This is a file symlink.

Questions? Comments? Add them below!