ONTAP 9.6 is coming soon and I recently posted a sneak peek for REST API support. But REST APIs aren’t the only new feature coming with the release. FlexGroup volumes are getting some new enhancements as well.
- Ability to rename a FlexGroup volume
- Ability to shrink a FlexGroup volume
- Support for MetroCluster with FlexGroup volumes
- SMB CA share support
One of the bigger features (albeit more under the radar) is a way for ONTAP to help FlexGroup volumes avoid failed writes to volumes due to being out of space – elastic sizing!
Prior to ONTAP 9.6, storage administrators had to be a bit more cognizant of member volume capacity, because if a member volume ran out of space in a FlexGroup volume, the file write would fail. Since files do not stripe across member volumes, a single file could grow over time to cause issues with space allocation.
There are a few reasons a member volume in a FlexGroup might fill up.
- A single file that exceeds the available space of a member volume is attempted to be written. For example, a 10GB file is written to a member volume with just 9GB available.
- A file is appended/written to over time and eventually fills up a member volume. For example, if a database resides in a member volume.
- Snapshots eat into the active file system space available.
FlexGroup volumes do a generally good job at allocating space across member volumes, but if a workload anomaly occurs, it can throw things off. (Like if your volume is mostly a bunch of 4K files but then you zip a lot of them up and create a giant single file).
Remediation of this problem is generally growing volumes or deleting data. But usually, admins won’t notice the issue until it’s too late and “out of space” errors have occurred. That’s where Elastic Sizing comes in handy.
Elastic Sizing – An Airbag for your Data
One of our FlexGroup volume developers refers to elastic sizing as an “airbag” in that it’s not designed to stop you from getting into an accident, but it does help soften the landing when it happens.
In other words, it’s not going to prevent you from writing large files or from running out of space, but it is going to provide a way for those writes to complete.
Here’s how it works…
- When a file is written to ONTAP, the system has no idea how large that file will become. The client doesn’t know. The application usually doesn’t know. All that’s known is “hey, I want to write a file.”
- When a FlexGroup volume receives a write request, it will get placed in the best available member based on a variety of factors – such as available capacity, inode count, time since last file creation, member volume performance (new in ONTAP 9.6), etc…
- When a file is placed, since ONTAP doesn’t know how big a file will get, it also doesn’t know if the file is going to grow to a size that’s larger than the available space. So, the write is allowed as long as we have space to allow it.
- If/when the member volume runs out of space, right before ONTAP sends an error to the client that we’ve run out of space, it will query the other member volumes in the FlexGroup to see if there’s any available space to borrow. If there is, ONTAP will add 1% of the volume’s total capacity (in a range of 10MB to 10GB) to the volume that is full (while taking the same amount from another member volume in the same FlexGroup volume) and then the file write will continue.
- During the time ONTAP is looking for space to borrow, that file write is paused – this will appear to the client as a performance issue. But the overall goal isn’t to finish the write fast – it’s to allow the write to finish at all. In most cases, a member volume will be large enough to provide the 10GB increment (1% of 1TB is 10GB), which is often more than enough to allow a file creation to complete. In smaller member volumes, the performance impact could be greater, as the system will need to query to borrow space more often.
- The capacity borrowing will maintain the overall size of the FlexGroup – for example, if your FlexGroup is 40TB in size, it will remain 40TB.
Once files are deleted/volumes are grown and space is available in that member volume again, ONTAP will re-adjust the member volumes back to their original sizes to maintain an evenness in space.
Ultimately, elastic sizing helps remove the admin overhead of managing space, as well as worrying so much about the initial sizing/deployment of a FlexGroup. You can spend less time thinking about how many member volumes you need, what size they should be, etc.
When you combine elastic sizing in ONTAP 9.6 with features like autogrow/shrink, then ONTAP can pretty much manage your capacity in most cases and help avoid emergency space issues.
Elastic sizing = new FlexGroup use cases?
Traditionally, FlexGroup volume use cases have mainly been for unstructured NAS data, high file count environments, small files, etc. and I’ve cautioned people against putting larger files into FlexGroup volumes because of the aforementioned issues with large files/files that grow potentially filling up a member volume.
But now, with elastic sizing to mitigate those issues, along with volume autogrow/shrink, the FlexGroup use cases get a bit more expanded and interesting.
Why not put a workload with large files/files that grow on a FlexGroup now? In fact, with SMB support for Continuously Available shares for Hyper-V and SQL server, there is further proof that FlexGroup volumes are becoming more viable solutions for a variety of workloads.
You can find the latest podcast for FlexGroup volumes here: