Aria Operations – What is left for provisioning?

2023-11-21 By vvanvierzen

Introduction

Customers often ask me how to measure the amount of resources still available for VM provisioning. In this case, the resources are:

  • The number of vCPUs left for provisioning
  • The amount of MEM left for provisioning
  • The amount of vSAN disk space left for provisioning (taking storage policies into account)

We have multiple clusters for multiple purposes and we use Aria Automation (vRA – sorry, couldn’t come up with a better abbreviation yet 😊) to provision workloads. In order to have vRA determine the appropriate cluster for deployment, we want to make those metrics available to vRA as well. This will be done by having vRA collect them through the API.

So to start, how can we achieve this?

Unlocking the “allocation model”

By default, Aria Operations knows how many vCPUs and how much MEM you have available in your clusters, and it can also deduct HA reservations and buffers, but it does not take any overcommit ratios into account. Fortunately, you can now configure these settings in your policies. We already use different policies for our clusters, as they have different SLAs, alerts, etc., so we can simply select the policy of the cluster we want to configure overcommit ratios for and edit it. In the “Capacity” section, we can then unlock and set the “Allocation Model”.

In this example, we have set the overcommit ratio to 6:1 for CPU and to 1:1 for MEM. For now, we don’t set a ratio for our vSAN datastores, and we do not use the option to include powered-off VMs in the calculation.

In addition to the allocation model, you can also configure Custom Profiles and Capacity Buffers. A Custom Profile lets you define a VM profile, which is then used to calculate how many such VMs you can still provision with the remaining resources. A Capacity Buffer reduces the cluster’s usable capacity. It is applied after HA resources have been deducted and can serve as a safeguard for your remaining capacity, for example.
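To make the interplay between HA, buffers and profiles a bit more concrete, here is a minimal Python sketch. The function names, the 10% buffer and the 4 vCPU / 16 GB profile are illustrative assumptions, not values or formulas taken from Aria Operations. The sketch only mirrors the order described above: HA is deducted first, then the buffer is applied.

```python
import math


def usable_capacity(total: float, ha_reserved: float, buffer_pct: int) -> float:
    """Capacity left after HA is deducted and the buffer is applied to the remainder."""
    after_ha = total - ha_reserved
    return after_ha * (100 - buffer_pct) / 100


def vms_that_fit(cpu_remaining: float, mem_remaining_gb: float,
                 profile_vcpus: int, profile_mem_gb: int) -> int:
    """Number of profile-sized VMs that still fit; the scarcest resource decides."""
    by_cpu = math.floor(cpu_remaining / profile_vcpus)
    by_mem = math.floor(mem_remaining_gb / profile_mem_gb)
    return max(0, min(by_cpu, by_mem))


# Hypothetical numbers: 1000 units in total, 100 reserved for HA, 10% buffer.
print(usable_capacity(1000, 100, 10))  # 810.0

# Hypothetical profile of 4 vCPUs and 16 GB with 400 vCPUs and 1000 GB remaining.
print(vms_that_fit(400, 1000, 4, 16))  # 62 (memory is the limiting resource)
```

In this example CPU alone would allow 100 VMs, but memory caps it at 62, which is why a profile-based count is more useful than looking at a single resource.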

New metrics available

After saving the changes to your policy, you should see new metrics being created and collected on your Cluster Compute Objects.

There are multiple new metrics available, but for now we will focus on the “Capacity Remaining” and “Usable Capacity after HA and Buffer” metrics.

Within my “test” cluster, I have 18 hosts, each with 48 physical cores, and we use an “FTT” of 2, meaning we have the resources of approx. 16 hosts available: 16 * 48 = 768 cores. Taking the 6:1 ratio into account, we should have 768 * 6 = 4608 vCPUs available for provisioning. As you can see, the rounding is a tiny bit off, but it’s close enough. This is because the cluster uses a percentage-based calculation instead of a “hard” number.
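The arithmetic can be sketched in a few lines of Python. The first function is the “hard” host-based calculation from the paragraph above. The second is only a possible explanation for the small difference: it assumes HA reserves a whole-number percentage (11%, roughly 2 out of 18 hosts) instead of exactly two hosts. That percentage and both function names are assumptions for illustration, not something read from the cluster configuration.

```python
def host_based_capacity(hosts: int, cores_per_host: int,
                        host_failures_tolerated: int, overcommit_ratio: int) -> int:
    """'Hard' calculation: drop the tolerated hosts, then apply the overcommit ratio."""
    usable_hosts = hosts - host_failures_tolerated
    return usable_hosts * cores_per_host * overcommit_ratio


def percentage_based_capacity(hosts: int, cores_per_host: int,
                              ha_reserved_pct: int, overcommit_ratio: int) -> int:
    """Percentage calculation: reserve a share of the whole cluster for HA (assumed)."""
    total_vcpus = hosts * cores_per_host * overcommit_ratio
    return round(total_vcpus * (100 - ha_reserved_pct) / 100)


print(host_based_capacity(18, 48, 2, 6))         # 4608
print(percentage_based_capacity(18, 48, 11, 6))  # 4614
```

Two hosts out of 18 is about 11.1%, so a reservation rounded down to 11% leaves slightly more capacity than removing two full hosts would.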

So let’s see if things add up here:

We have a total usable capacity of 4614 vCPUs and 1041 vCPUs provisioned to powered-on VMs, which according to Aria Operations leaves 3574 vCPUs for provisioning. The number is actually off by 1 (4614 - 1041 = 3573), but my best guess is that this has to do with rounding. I have chosen to forgive Aria Ops for this 😊
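A quick Python sketch shows how rounding alone can produce such an off-by-one. The unrounded values below are hypothetical, chosen only so that they round to the displayed numbers. I do not know the actual underlying values.

```python
# Hypothetical unrounded values behind the displayed metrics (assumption).
usable_raw = 4614.4
provisioned_raw = 1040.6

# What the UI would display if every metric is rounded on its own.
usable_shown = round(usable_raw)                        # 4614
provisioned_shown = round(provisioned_raw)              # 1041
remaining_shown = round(usable_raw - provisioned_raw)   # 3574

# Subtracting the displayed values gives a result that is one lower.
remaining_from_shown = usable_shown - provisioned_shown  # 3573

print(remaining_shown, remaining_from_shown)
```

Each metric is correct on its own; the difference only appears when you do the subtraction with values that were already rounded.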

As shown below, we have similar metrics for memory:

What to do with it?

As mentioned earlier, vRA uses the collected metrics to make a placement decision for a new VM or group of VMs. Our Automation team has created a script that collects the metric info via an API call. The request URL and response look as follows:
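As a rough illustration of what such a script could do (this is not the Automation team's actual code), here is a minimal Python sketch. It assumes the “latest stats” endpoint of the Aria Operations suite-api and the response layout I would expect from it. The host name, the resource ID, the stat key `cpu|example_remaining` and the cluster names are hypothetical placeholders; look up the real stat keys on your own cluster objects. Authentication and the HTTP call itself are left out on purpose.

```python
from typing import Optional
from urllib.parse import urlencode


def latest_stats_url(ops_host: str, resource_id: str, stat_keys: list) -> str:
    """Build a 'latest stats' URL for one resource (assumed endpoint shape)."""
    query = urlencode([("statKey", key) for key in stat_keys])
    return f"https://{ops_host}/suite-api/api/resources/{resource_id}/stats/latest?{query}"


def parse_latest(payload: dict) -> dict:
    """Map each stat key to its most recent value (assumed response layout)."""
    result = {}
    for entry in payload.get("values", []):
        for stat in entry.get("stat-list", {}).get("stat", []):
            data = stat.get("data", [])
            if data:
                result[stat["statKey"]["key"]] = data[-1]
    return result


def pick_cluster(remaining: dict, need_vcpus: int, need_mem_gb: float) -> Optional[str]:
    """Return the cluster with the most vCPUs left that fits the request, or None."""
    fits = {name: m for name, m in remaining.items()
            if m["vcpus"] >= need_vcpus and m["mem_gb"] >= need_mem_gb}
    if not fits:
        return None
    return max(fits, key=lambda name: fits[name]["vcpus"])


# Hypothetical host, resource ID and stat key.
print(latest_stats_url("ops.example.local", "example-id", ["cpu|example_remaining"]))

# Hand-written sample in the assumed response layout.
sample = {"values": [{"resourceId": "example-id", "stat-list": {"stat": [
    {"statKey": {"key": "cpu|example_remaining"},
     "timestamps": [1700000000000], "data": [3574.0]}]}}]}
print(parse_latest(sample))  # {'cpu|example_remaining': 3574.0}

# Hypothetical clusters and remaining capacity.
clusters = {"cluster-a": {"vcpus": 3574, "mem_gb": 2048},
            "cluster-b": {"vcpus": 900, "mem_gb": 8192}}
print(pick_cluster(clusters, need_vcpus=16, need_mem_gb=64))  # cluster-a
```

Picking the cluster with the most vCPUs left is just one simple strategy; a real placement rule might weigh memory and vSAN capacity as well.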

Apart from using the metrics for placement options in Aria Automation, it will also allow for the creation of some nice dashboards. I will cover the creation of those dashboards in a separate post as they will include the creation of some new scoreboard views, supermetrics and alerts as well.