Storage Quality of Service (QoS)

Overview

Windows Hyper-V Server allows you to run many virtual machines (VMs) on a single physical server. Scale-Out File Server lets you store server application data, such as Hyper-V virtual machine files, on file shares. While this is a great way to optimise the use of hardware, all these virtual machines need to be managed to ensure that they get a fair share of the storage and server resources. Storage Quality of Service (QoS) is designed to manage storage performance for VMs by setting policies for individual VMs, so that a given VM gets enough resources without hogging them at the expense of others. Don't confuse this with Windows QoS, which controls network resources.

What can QoS do?

  1. You need to be able to monitor the end-to-end storage performance of all your VMs, so you can anticipate problems before your users report them. Storage QoS starts monitoring the performance of virtual machines stored on a Scale-Out File Server as soon as they are started, and all these performance details can be viewed from a single location.
  2. However, monitoring and reporting are not enough; you need to be able to intercept and automatically fix problems as they happen. Storage QoS can throttle an application that is hogging resources at the expense of other workloads. In fact, by default Storage QoS will stop one virtual machine from hogging all the storage resources.
  3. You can tailor policies to manage throughput, in terms of IOPs or MB/second, for individual virtual machines and their virtual hard disks. For example, instead of splitting different types of workload onto different dedicated resources, you can define policy groups for critical production, less critical production, application testing and development, and run all these workloads together. Storage QoS will then ensure that the dev/test workloads do not adversely affect the production workloads.
  4. Within these policies, you can set both maximum and minimum performance levels. Why would you need a maximum limit? Well, imagine that you are implementing a shared cloud environment, where individual customers have SLAs. The first customers onto the cluster would get excellent performance, probably exceeding their SLA. As more customers are added, that performance will drop, and while it will still be within SLA, it could be perceived as an issue by customers who are no longer getting the performance they were used to. If you use maximum performance levels from the start, then you will manage the performance expectations of your customers better.
    Now all this sounds good, but in real life you will be adding more and more applications into the mix, until eventually the environment becomes overprovisioned and SLA performance can no longer be met automatically. When this happens, Storage QoS can send alerts to warn that VMs are out of policy.

If you have configured a new Failover Cluster with a Cluster Shared Volume (CSV) on Windows Server 2016, then the Storage QoS feature is set up automatically. The Storage QoS Resource is displayed as a Cluster Core Resource and is visible in both Failover Cluster Manager and Windows PowerShell. The failover cluster system manages this resource.

QoS Terminology

Here are some definitions of Storage QoS terms:

Normalized IOPs
IO operations to disk move different amounts of data, depending on what the application is doing. Larger IOs take longer, so to compare IOs on a like-for-like basis, normalized IOPs (input/output operations per second) are used. By default, any IO that is 8KB or smaller is counted as one normalized IO. Any IO that is larger than 8KB is broken down into multiples of 8KB, so a 512KB IO counts as 64 normalized IOs.
With Windows Server 2016, you can change this default 8KB value and choose your own size for a normalized IO.
Flow
A flow can be thought of as a storage connection used by a VM. Every VM will have files open on a VHD, and each file handle that a Hyper-V server opens to a VHD or VHDX file is considered a 'flow'.
InitiatorName
This is the name of the virtual machine.
InitiatorID
This is an identifier that matches the virtual machine ID. VMs can have identical Initiator Names, but the Initiator ID is always unique.
Policy
The Storage QoS policies contain the properties that are used to manage VM performance. These include the PolicyId, MinimumIOPS, MaximumIOPS, ParentPolicy, and PolicyType, and they are stored in the cluster database.
PolicyId
This is a unique identifier for a policy.
MinimumIOPS
The minimum number of normalized IOPS that Storage QoS will try to deliver. It is sometimes called the 'Reservation'.
MaximumIOPS
The maximum number of normalized IOPS, or the 'Limit', that Storage QoS will apply to VMs with this policy.
Aggregated
A policy type where the specified MinimumIOPS, MaximumIOPS and Bandwidth are shared among all flows assigned to the policy.
Dedicated
A policy type where the specified MinimumIOPS, MaximumIOPS and Bandwidth are managed separately for each individual VHD/VHDX.
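The normalization rule described under Normalized IOPs can be sketched as a small PowerShell function. This is only an illustration of the arithmetic, not how Storage QoS is implemented: the function name Get-NormalizedIoCount is hypothetical, and rounding a partial block up to a whole normalized IO is an assumption, since the text only gives an example with an exact multiple.

```powershell
# Hypothetical helper: how many normalized IOs a single IO of a given size counts as.
function Get-NormalizedIoCount {
    param(
        [Parameter(Mandatory)] [long] $IoSizeBytes,
        [long] $NormalizationSizeBytes = 8KB   # default normalization size from the text
    )

    # Any IO at or below the normalization size counts as one normalized IO.
    if ($IoSizeBytes -le $NormalizationSizeBytes) { return 1 }

    # Larger IOs count as multiples of the normalization size.
    # Assumption: a partial block is rounded up to a whole normalized IO.
    return [long][math]::Ceiling([double]$IoSizeBytes / $NormalizationSizeBytes)
}

Get-NormalizedIoCount -IoSizeBytes 4KB      # 1
Get-NormalizedIoCount -IoSizeBytes 512KB    # 64
Get-NormalizedIoCount -IoSizeBytes 512KB -NormalizationSizeBytes 32KB   # 16
```

Microsoft's documentation describes Set-StorageQosPolicyStore -IOPSNormalizationSize for changing the normalization size on a cluster; check it against your own build before relying on it.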

Storage QoS Deployment Scenarios

You can use Storage QoS on Hyper-V, either using a Scale-Out File Server or using Cluster Shared Volumes. The clustering requirements are slightly different:
The Scale-Out File Server scenario requires that the storage cluster is a Scale-Out File Server cluster and that the compute cluster has at least one server with the Hyper-V role enabled. The storage servers need to be in a Failover Cluster, but the compute servers do not. All the servers must be running Windows Server 2016.
The Cluster Shared Volumes scenario requires a compute cluster with the Hyper-V role enabled that uses Cluster Shared Volumes for storage. A Failover Cluster is required, and all servers must be running the same version of Windows Server 2016.

The Policy Manager monitors the virtual machines as they are launched by the Hyper-V servers and communicates the Storage QoS policy for each machine back to the Hyper-V server. The Hyper-V server then controls the performance of the virtual machine as appropriate. If the Storage QoS policies are changed, the Policy Manager passes the changed QoS requirements to the Hyper-V servers, so they can ensure each VM gets the performance its policy specifies.
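As a rough sketch of that cycle, the commands below create a policy, attach it to a VM's virtual disks, and later change the limit. The policy name SQLserver and the VM name SQLPWD0032 are the ones used elsewhere in this article; the IOPS figures are made-up illustration values, and the sketch assumes the policy cmdlets are run against the storage cluster while the VM cmdlets are run on the Hyper-V host.

```powershell
# On the storage cluster: create a Dedicated policy, so each virtual disk
# assigned to it gets its own minimum and maximum (values are illustrative).
$policy = New-StorageQosPolicy -Name SQLserver -PolicyType Dedicated -MinimumIops 100 -MaximumIops 500

# On the Hyper-V host: assign the policy to every virtual hard disk of the VM.
Get-VM -Name SQLPWD0032 |
    Get-VMHardDiskDrive |
    Set-VMHardDiskDrive -QoSPolicyID $policy.PolicyId

# Later, on the storage cluster: raise the limit. The Policy Manager passes
# the new value on to the Hyper-V servers that have flows using this policy.
Get-StorageQosPolicy -Name SQLserver | Set-StorageQosPolicy -MaximumIops 1000
```

Choosing Aggregated instead of Dedicated for -PolicyType would make all the assigned disks share the same 100 to 500 IOPS range rather than each getting its own.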

Monitoring Storage QoS

To manage Storage QoS policies and monitor flows from compute hosts or other remote computers, you need to install the Remote Server Administration Tools (RSAT). These are available as optional features on all Windows Server 2016 installations, and can be downloaded separately for Windows 10 from the Microsoft Download Center website.
To use PowerShell to manage Storage QoS, you need the RSAT-Clustering feature, and RSAT-Hyper-V-Tools for remote management of Hyper-V. Start an elevated Windows PowerShell command line, then enter:

  Add-WindowsFeature RSAT-Clustering
  Add-WindowsFeature RSAT-Hyper-V-Tools

To see the status of the Storage QoS Resource, run the following PowerShell cmdlet:

  Get-ClusterResource -Name "Storage Qos Resource"

To see the status of all current flows initiated by Hyper-V servers, use the Get-StorageQosFlow cmdlet, as in the first command below.
The second command limits the output to flows from a single VM. Piping the result to Format-List displays every property of each flow in list form, including its utilisation in normalized IOPs, its latency, and the limits and reservations applied to it.

  Get-StorageQosFlow
  Get-StorageQosFlow -InitiatorName SQLPWD0032 | Format-List

You can also pipe the command output into a sort. The following command lists the flows in descending order of IOPs, so the VMs that are getting the most I/O service appear at the top.

  Get-StorageQosFlow | Sort-Object StorageNodeIOPs -Descending
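Performance figures are also collected per storage volume, not just per flow. A minimal sketch, assuming the Get-StorageQosVolume cmdlet is available on your storage cluster, is:

```powershell
# Show the per-volume view: overall utilisation, latency, and the total
# limits and reservations of the flows on each volume.
Get-StorageQosVolume | Format-List
```

This is useful for spotting a volume whose combined reservations are close to what the underlying storage can actually deliver.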

A physical hard disk drive has a maximum I/O load, and this is shared among the virtual drives that are hosted on it. Every virtual hard disk has a set of performance values, MinimumIOPs, MaximumIOPs and MaximumIobandwidth, which are adjusted based on its load. The values at any time depend on the overall load on all the virtual disks and are shared out according to the policy. If only one virtual disk is active, it could take all of the available bandwidth. The following command lists the virtual hard disks of a VM along with their QoS settings.

  Get-VM -Name SQLPWD0032 | Get-VMHardDiskDrive | Format-List

To list all configured policies and their status on a Scale-Out File Server, use the Get-StorageQosPolicy cmdlet.
You can also pipe the output of Get-StorageQosPolicy into Get-StorageQosFlow to get the status of all flows configured to use a policy. The second command below displays the flows for the 'SQLserver' policy.

  Get-StorageQosPolicy
  Get-StorageQosPolicy -Name SQLserver | Get-StorageQosFlow | ft InitiatorName, *IOPS, Status, FilePath -AutoSize
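To pick out the flows that are out of policy, you can filter on the Status column shown above. The sketch below assumes that a healthy flow reports a status of 'Ok', which is the value Microsoft's documentation uses; anything else, such as a flow not reaching its minimum, is worth a closer look.

```powershell
# List only the flows whose status is something other than 'Ok',
# showing which VM and file they belong to.
Get-StorageQosFlow |
    Where-Object Status -ne 'Ok' |
    Format-Table InitiatorName, Status, MinimumIops, MaximumIops, FilePath -AutoSize
```

If this returns nothing, every flow is currently being served within its policy.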
