Using what-if to optimize ARM template deployments

This week we are going to investigate “what-if”, an operation available in Azure PowerShell and the Azure CLI, to preview an ARM template deployment. Infrastructure as code is one of the largest challenges teams face as they adopt DevOps. It’s often a new and uncomfortable skill for both developers and ops team members.

Our infrastructure as code has worked well, but we’ve debated the best strategy to improve the deployment for several months. We’ve experimented with deploying all the infrastructure as one step, deploying the infrastructure on a daily schedule, and more recently, deploying the infrastructure with every deployment, using parallel jobs. Breaking apart our templates into smaller pieces that are easier to maintain and test made our deployments faster and more reliable.

The approach to deploy infrastructure as code with every release was validated by a poll we ran on Twitter last week. While 26 votes is hardly statistically significant, it gives us a clue that many people are using this same approach.

https://twitter.com/samsmithnz/status/1322504729717624832

However, for each ARM template, we couldn’t help but notice we were still pushing the entire template, even when the service did not need updates. The obvious optimization would be to check whether the service needs an update, instead of blindly deploying the ARM template and updating the service to the same settings it had before.

What is “what-if”?

This is where what-if comes in. What-if allows us to compare our ARM template to the deployed service and then, based on the result, decide if we need to deploy. Over the past months, we’ve tried to use the what-if preview a few times, but it didn’t seem quite ready. We are happy to report that what-if is much more reliable now, although it does still need a little help at times.

How “what-if” works

Let’s deploy a key vault ARM template to a new resource group with what-if. Using the following command, what-if compares the ARM template with our target resource in Azure. As the key vault doesn’t exist, it needs to be created.

az deployment group what-if --only-show-errors --resource-group $resourceGroupName --name $keyVaultName --template-file "$templatesLocation\KeyVault.json" --parameters keyVaultName=$keyVaultName 

We then deploy the key vault ARM template to Azure. When we run the same what-if command as above, we see that no additional changes are required.

Running through this exercise on our resources, we found it interesting that many of our resources did have unexpected differences. Some of these were related to changes made in the portal, and some were related to the portal making assumptions and using defaults. What-if captures all of these, and gives us the opportunity to fix them in our ARM templates. This involves a combination of updating the template API version and adding missing properties. All in all, it’s a good process to validate and modernize our infrastructure as code.

Adding logic to our “what-if”

To programmatically analyze the results, we need to add the “--no-pretty-print” parameter, which returns the results as JSON that we can parse. This is a long JSON document, but once we convert it into an object, it’s much easier to process.

$whatifResultsJson = az deployment group what-if --no-pretty-print --only-show-errors --resource-group $resourceGroupName --name $keyVaultName --template-file "$templatesLocation\KeyVault.json" --parameters keyVaultName=$keyVaultName 

$whatifResults = $whatifResultsJson | ConvertFrom-Json  
$ChangeResults = $whatifResults.changes 

From here, we can examine the $whatifResults object to see whether each change type is “Ignore”, “Create”, “Modify”, or “Delete”. “Create”, “Modify”, and “Delete” are the most common indicators that we need to redeploy our ARM template. Let’s look at a few examples, starting with an “Ignore” change type, indicating that we do not need to deploy this change again. It shows what the resource will look like after deployment (after), what it currently looks like in Azure (before), the change type (ignore, meaning we don’t need to do anything), the delta (no difference), and the resource id of the resource affected.
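As a minimal sketch of that check, the function below reports whether any change in a parsed what-if result calls for a deployment. The function name Test-DeploymentRequired and the sample JSON are hypothetical and made up for illustration; only the changes and changeType fields are taken from the what-if output used above.

```powershell
# Hypothetical helper (not part of the Azure CLI): returns $true when at least
# one change has a type that, in this sketch, means "the template must be deployed".
function Test-DeploymentRequired
{
    param([object[]]$Changes)

    # Change types treated as "needs a deployment" in this sketch
    $deployTypes = @("Create", "Modify", "Delete")

    # @() keeps the result an array even when zero or one change matches
    $pending = @($Changes | Where-Object { $deployTypes -contains $_.changeType })
    return ($pending.Count -gt 0)
}

# Illustrative what-if output, trimmed to the fields this sketch reads.
# The resource ids are placeholders, not real Azure resource ids.
$sampleJson = @'
{
  "changes": [
    { "resourceId": "example-key-vault", "changeType": "Ignore", "delta": null },
    { "resourceId": "example-web-app", "changeType": "Modify",
      "delta": [ { "path": "properties.siteConfig.alwaysOn" } ] }
  ]
}
'@

$sampleResults = $sampleJson | ConvertFrom-Json
Test-DeploymentRequired -Changes $sampleResults.changes
```

In a release pipeline, the result of this function could gate the real deployment, for example by only calling az deployment group create with the same resource group, template file, and parameters when it returns $true.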

Now let’s look at a resource that does have differences. Here we see the change type of “Modify”, and more importantly, the delta has details of the specific properties that differ. For example, the first delta entry shows that “Always On” has been enabled in the portal, but is not part of the template. This is a sign our ARM template is not up to date with the portal!
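To see at a glance which properties have drifted, the delta entries of each “Modify” change can be flattened into a short report. This is a sketch only: the function name Get-ModifiedPath and the sample data are hypothetical, and it assumes each delta entry carries a path and a propertyChangeType field.

```powershell
# Hypothetical helper: emits one row per changed property of each "Modify" change
function Get-ModifiedPath
{
    param([object[]]$Changes)

    foreach ($change in $Changes)
    {
        # Only "Modify" changes carry property-level differences we care about here
        if ($change.changeType -ne "Modify") { continue }

        foreach ($property in $change.delta)
        {
            [pscustomobject]@{
                ResourceId         = $change.resourceId
                Path               = $property.path
                PropertyChangeType = $property.propertyChangeType
            }
        }
    }
}

# Illustrative data only: a web app where "Always On" differs from the template.
# The resource ids are placeholders, not real Azure resource ids.
$sampleChanges = @'
[
  { "resourceId": "example-web-app", "changeType": "Modify",
    "delta": [ { "path": "properties.siteConfig.alwaysOn", "propertyChangeType": "Modify" } ] },
  { "resourceId": "example-key-vault", "changeType": "Ignore", "delta": null }
]
'@ | ConvertFrom-Json

Get-ModifiedPath -Changes $sampleChanges | Format-Table
```

Each row in the report is a candidate for either a template fix (add the missing property) or a noise filter like the ones described next.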

In this process we will often see “noise”, and need to decide how to resolve it. There is a GitHub repo where issues can be added for noise. Other times we need to resolve the issue ourselves, by adding additional details to the ARM template or by ignoring certain results. In our Key Vault example, we’ve decided to ignore the access policies. While access policies can technically be set by ARM templates, we dynamically add access policies based on subsequent resources we deploy (usually web apps). Therefore, we need to ignore this entire modify section. We can do this fairly easily by using a “Where-Object” to filter out all access policy changes.

#Filter out access policies from the result - as this is really data, not infrastructure (in my view)
$ChangeResults4 = $ChangeResults4 | Where-Object { $_.delta.path -ne "properties.accessPolicies" }

As we dug deeper, some of the noise was more involved, for example metric alerts, which still can’t be exported to an ARM template (easily). What-if still can’t process these, and they created a LOT of noise for our alerts ARM template. Since these are nested arrays, we need to loop a little bit to clean up various properties. You can see this in the code sample below, where at the end, if there are no “delta” objects remaining, we set the change type to ignore, as we no longer require a modify.

#Filter out metric alert noise
#Note: the loop uses -lt (not -le), so it stops at the last valid index
for ($i = 0; $i -lt $ChangeResults11.Count; $i++) 
{
    if ($ChangeResults11[$i].changeType -eq "Modify")
    {
        #Remove the delta entries that we treat as noise
        $ChangeResults11[$i].delta = ($ChangeResults11[$i].delta | Where-Object { $_.path -ne "properties.targetResourceType" })
        $ChangeResults11[$i].delta = ($ChangeResults11[$i].delta | Where-Object { $_.path -ne "properties.criteria.allOf" })
        $ChangeResults11[$i].delta = ($ChangeResults11[$i].delta | Where-Object { $_.path -ne "properties.profiles" })
        #If no delta entries remain, there is nothing left to modify
        if ($ChangeResults11[$i].delta.Count -eq 0)
        {
            $ChangeResults11[$i].changeType = "Ignore"    
        }
    }
}
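
The same clean-up can be generalized so that each template only has to supply its own list of paths to ignore. The sketch below is one possible shape for that, not the code we run: the function name Remove-WhatIfNoise, its parameters, and the sample data (including the “properties.enabled” path) are hypothetical. Unlike the access policy filter above, it keeps every change in the list and downgrades a “Modify” to “Ignore” when nothing is left in its delta.

```powershell
# Hypothetical helper: strips delta entries whose path is listed in $IgnorePaths
# and marks a "Modify" change as "Ignore" once no delta entries remain.
# Note: it updates the change objects that are passed in.
function Remove-WhatIfNoise
{
    param(
        [object[]]$Changes,
        [string[]]$IgnorePaths
    )

    foreach ($change in $Changes)
    {
        if ($change.changeType -eq "Modify")
        {
            # @() keeps the result an array even when zero or one entry remains;
            # the $null check guards against a change that has no delta at all
            $remaining = @($change.delta | Where-Object { ($null -ne $_) -and ($IgnorePaths -notcontains $_.path) })
            $change.delta = $remaining
            if ($remaining.Count -eq 0)
            {
                $change.changeType = "Ignore"
            }
        }
        $change
    }
}

# Illustrative data only: the first change contains nothing but noise,
# the second keeps one real difference after filtering
$noisyOnly = [pscustomobject]@{
    changeType = "Modify"
    delta      = @([pscustomobject]@{ path = "properties.accessPolicies" })
}
$realChange = [pscustomobject]@{
    changeType = "Modify"
    delta      = @(
        [pscustomobject]@{ path = "properties.criteria.allOf" },
        [pscustomobject]@{ path = "properties.enabled" }
    )
}

$ignorePaths = @("properties.accessPolicies", "properties.criteria.allOf", "properties.profiles")
$filtered = Remove-WhatIfNoise -Changes @($noisyOnly, $realChange) -IgnorePaths $ignorePaths
$filtered | ForEach-Object { $_.changeType }
```

With the sample data, the first change ends up as “Ignore” and the second stays “Modify” with a single remaining delta entry.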

Results

There is a cost to using what-if: roughly 20 seconds per check. According to our data, it takes on average roughly 110 seconds to deploy a new Azure service. (Note: this is based on the resources we deploy to US East, and it may vary greatly depending on time of day, connection, etc.) Clearly, spending 20 seconds to check whether we need to spend 2 minutes on a task is worth it, but if we do need to deploy, it will cost us those extra 20 seconds. In other words, this potentially increases our worst-case scenario by 20 seconds per resource, but this should be a relatively rare occurrence.

While a new resource costs ~110s to deploy, redeploying an existing resource is closer to ~40s. Since a what-if check on that resource costs ~20s, skipping an unnecessary redeployment saves us ~20s per resource.
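
The trade-off can be worked through with the timings above. The sketch below is a back-of-the-envelope model, not a measurement: it assumes every what-if check costs 20s, every redeployment of an existing resource costs 40s, and that a deployment only runs for the share of resources that actually changed. The function name Get-ExpectedSeconds and the change rates in the loop are hypothetical.

```powershell
# Hypothetical helper: expected seconds per resource when a what-if check runs
# first and the deployment only runs for the fraction of resources that changed.
# The defaults are the rough timings quoted in this post.
function Get-ExpectedSeconds
{
    param(
        [double]$ChangeRate,            # fraction of resources that changed, 0.0 to 1.0
        [double]$WhatIfSeconds = 20,    # cost of one what-if check
        [double]$RedeploySeconds = 40   # cost of redeploying an existing resource
    )

    return $WhatIfSeconds + ($ChangeRate * $RedeploySeconds)
}

# Compare against always redeploying (40s per resource) for a few change rates
foreach ($percent in 0, 10, 50, 100)
{
    $expected = Get-ExpectedSeconds -ChangeRate ($percent / 100)
    Write-Output ("{0}% changed: {1}s with what-if, 40s without" -f $percent, $expected)
}
```

Under these assumptions the break-even point is when half of the resources change on every run (20 + 0.5 × 40 = 40). Below that rate the what-if check pays for itself, and above it the check adds time.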

Let’s look at this in aggregate. When we run a deployment to a new region, it takes ~40 minutes. Running the same ARM templates again on those newly created resources takes ~12 minutes. Running what-if checks on these resources takes only ~6 and a half minutes. That is a significant improvement.

Below we have this data in a table, to deploy our core resources (key vault and storage), databases (SQL and Redis), CDN, web hosting, and web app:

Wrap-Up

What-if is still in preview, and while there can be noise for some resources, what-if is coming together quickly, and the benefits are clearly there today. Resolving differences between your ARM templates and your deployed resources takes work, but this has a positive side effect, in that you are forced to upgrade your templates (let’s be honest, everyone has this on their backlog and isn’t doing it). In turn, this helps to make your templates more robust by including information they should already have to match their environments. If you deploy ARM templates, what-if is clearly worth having a look at now.
