Today we are going to create a dynamic, zero-touch PR (Pull Request) environment: a place for us to validate and test our enhancements. Ultimately, this will give us more confidence that changes merged into the master branch are of the highest quality.
In the past, the cost, time, and effort required to set up this kind of temporary environment were prohibitive. Most projects we’ve worked on require a web server and a database server at a minimum. With VM provisioning and approvals averaging several weeks, it has traditionally been difficult to set up an environment quickly.
With the Azure cloud and PaaS, it’s finally attainable: we can automatically create and tear down environments in minutes. It’s also affordable. For roughly $60 a month, we have a full environment with a SQL database, web servers with staging slots, Application Insights, a storage account, Redis, and a CDN. As most branches are short-lived, surviving a few hours or days, the actual cost is a fraction of that monthly total: a few dollars a day.
Our workflow today
In our project to date, we’ve been primarily working with three environments:
- “Dev” or Development: where we actually build and develop our new features. As part of the build that runs in a pull request, we deploy our changes to Dev to validate that the PR won’t break our deployment.
- “QA” or Quality Assurance: Only runs on the master branch, after a successful build and deployment to Dev.
  - This environment is meant to be identical to Prod in vertical scale (same CPU/memory/storage), but with a lower horizontal scale (fewer VM instances running PaaS in the back end).
  - This is also where we can safely run our load and performance tests, knowing it’s similar to Prod.
- “Prod” or Production: The last step in our process, after QA is deployed, where our end users engage with our product.
What happens when our team scales to 10 or 100 developers? Our current setup works until two or more developers start pull request builds at the same time, stomping on each other’s changes in Dev. What is the solution? We need a separate environment for each pull request, giving our developers a place to work and test.
The new workflow
- The developer takes work off the Kanban board, creates a branch, and does “the work”
- The developer commits and pushes this branch, creating a new pull request. This triggers a PR build and a new pull request deployment. For example, if the pull request is #415, a resource group named “SamLearnsAzurePR415” is created, all of the resources are named with PR415, and the DNS for the website is set up as pr415.samlearnsazure.com.
- When the pull request is completed, a webhook that monitors the pull request for changes runs some code to delete the resource group.
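To make the naming convention concrete, here is a minimal Python sketch of how every name in a PR environment can be derived from the pull request ID alone. It is an illustration only, not code from our pipeline: the `PrEnvironment` class and `pr_environment` function are hypothetical names, and the prefix and domain are taken from the examples in this post.

```python
from dataclasses import dataclass

# Values taken from the examples in this post.
RESOURCE_GROUP_PREFIX = "SamLearnsAzure"
DOMAIN = "samlearnsazure.com"


@dataclass(frozen=True)
class PrEnvironment:
    """Names derived from a single pull request ID (hypothetical helper)."""

    environment: str     # e.g. "PR415", used when naming resources
    resource_group: str  # e.g. "SamLearnsAzurePR415"
    hostname: str        # e.g. "pr415.samlearnsazure.com"


def pr_environment(pull_request_id: int) -> PrEnvironment:
    """Build the environment, resource group, and host names for one PR."""
    if pull_request_id <= 0:
        raise ValueError("pull request ID must be a positive integer")
    environment = f"PR{pull_request_id}"
    return PrEnvironment(
        environment=environment,
        resource_group=f"{RESOURCE_GROUP_PREFIX}{environment}",
        hostname=f"{environment.lower()}.{DOMAIN}",
    )


if __name__ == "__main__":
    print(pr_environment(415))
```

Because everything is derived from one ID, the teardown step only needs the pull request number to find the resource group it should delete.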
Updating the YAML pipeline
Our first step is to add a new PR stage to our YAML pipeline. We can see the new stage, as well as the rules surrounding it, highlighted in red below and in our repo. There is a significant amount of trial and error hidden by these 12 lines. Achieving a build that works in both a PR build and a master branch build was a challenge.
Let us walk through this workflow in more detail, reviewing the 5 stages:
- Build: Runs as before, no changes
- DeployPR: Runs if the build stage was successful, the “Build reason” equals “PullRequest”, and the “Pull Request Id” variable is not null.
  - Three variables are set to help create a PR environment. In our other environments we use “Dev”, “QA”, or “Prod” to describe the environment. For PR environments, we will be using the format “PR###”. For example, if the PR Id is 429, a resource group “SamLearnsAzurePR429” will be created, with all of the resources needed to run an environment.
  - To make the variables work, we used “conditional insertion” expressions that load in a value depending on the current source branch. Note that when the branch is master and the PR variable is set to ‘000’, the actual stage is still skipped. We found we needed to set this variable (to ‘000’) to ensure the YAML processing doesn’t throw errors.
- DeployDev, DeployQA, and DeployProd: Run in series if the build stage was successful and the source branch being run equals “master”. Essentially, it’s not a pull request.
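The stage rules above can be sketched in Python to show which stages run for a given build. This is a simplified model, not our YAML: the function names are hypothetical, branch names are reduced to short names such as “master”, and “IndividualCI” is used as an example of a non-PR build reason.

```python
from typing import List, Optional


def pr_variable(source_branch: str, pull_request_id: Optional[str]) -> str:
    """Mimic the conditional insertion: a placeholder on master, the real ID otherwise."""
    if source_branch == "master" or not pull_request_id:
        # Placeholder so the variable always has a value, even when DeployPR is skipped.
        return "000"
    return pull_request_id


def stages_to_run(
    build_succeeded: bool,
    build_reason: str,
    source_branch: str,
    pull_request_id: Optional[str],
) -> List[str]:
    """Return the stages that would run, in order, under the rules described above."""
    stages = ["Build"]  # Build always runs
    if not build_succeeded:
        return stages  # every deploy stage requires a successful build
    if build_reason == "PullRequest" and pull_request_id is not None:
        stages.append("DeployPR")
    if source_branch == "master":
        # The regular environments deploy one after another.
        stages += ["DeployDev", "DeployQA", "DeployProd"]
    return stages


if __name__ == "__main__":
    # A pull request build for PR 429.
    print(stages_to_run(True, "PullRequest", "merge", "429"))
    print("PR" + pr_variable("merge", "429"))
    # A CI build on master after the pull request is completed.
    print(stages_to_run(True, "IndividualCI", "master", None))
    print("PR" + pr_variable("master", None))
```

The two conditions are kept independent on purpose, mirroring the way each stage in the pipeline carries its own condition rather than relying on the others.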
Here is a pull request using this new process. Note the PR ID of “429” in the top left, which will be used to build our “PR429” environment.
Looking at the pipeline process, we can see the build runs as expected, and as we are running a PR build, only the “Deploy PR” stage is triggered. Note that while the first run to create our infrastructure takes 35 minutes, subsequent runs finish in ~10 minutes. This extra time is related to one-off tasks to provision the Azure resources and restore the SQL database.
With the build policy completed, we can browse to pr429.samlearnsazure.com, a subdomain created as part of the deployment, and see that the website loads with data present. This confirms the DNS was configured, and we were able to restore data into the database successfully.
The deployment included dynamic creation of a CNAME DNS record for “pr429”, using the GoDaddy API (as our DNS is hosted by GoDaddy). This gives us a friendly address to browse to (pr429.samlearnsazure.com), instead of having to remember the actual resource name.
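As a rough illustration of what such a call can look like, here is a minimal sketch in Python (rather than the PowerShell used in this project) that builds, but does not send, a request to replace a CNAME record. It assumes GoDaddy’s public v1 endpoint `PUT /v1/domains/{domain}/records/{type}/{name}` with `sso-key` authentication; check the current GoDaddy documentation before relying on it. The function name, the target host, and the credentials are all hypothetical placeholders.

```python
import json
import urllib.request

# Assumed base URL of GoDaddy's public v1 API.
GODADDY_API = "https://api.godaddy.com/v1"


def build_cname_request(
    domain: str,
    name: str,
    target: str,
    api_key: str,
    api_secret: str,
    ttl: int = 3600,
) -> urllib.request.Request:
    """Build (without sending) a request that points name.domain at target."""
    url = f"{GODADDY_API}/domains/{domain}/records/CNAME/{name}"
    # The body is a list of records that replaces any existing CNAME with this name.
    body = json.dumps([{"data": target, "ttl": ttl}]).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Authorization": f"sso-key {api_key}:{api_secret}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    request = build_cname_request(
        domain="samlearnsazure.com",
        name="pr429",
        target="example-pr429-web.azurewebsites.net",  # hypothetical web app host
        api_key="KEY",        # placeholder credentials
        api_secret="SECRET",
    )
    print(request.get_method(), request.full_url)
    # Sending it would be: urllib.request.urlopen(request)
```

Replacing the record rather than appending to it keeps the step safe to re-run on every PR build.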
In the Azure Portal, we can see our new resource group “SamLearnsAzurePR429”.
Examining the contents of the resource group, we can see all of the unique resources that make up an environment, all with the “pr429” naming.
What happens to these resources when the Pull Request is completed?
Once a pull request has been completed, our build will run its regular CI/CD workflow, running all stages except the “Deploy PR” stage.
With the resources in our pull request environment now unneeded, we can tear down this environment – after all, we are paying for it. To achieve this, we use a webhook in Azure DevOps to extend and watch the pull request for updates. When the pull request status changes to “completed”, the webhook logs into Azure and deletes the resource group for us. Here is a screenshot of the service hooks history. The failures in the screenshot represent timeouts we will address in a future post.
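The decision the webhook has to make can be sketched as follows. This is a minimal Python illustration, not the webhook we deployed: it assumes the Azure DevOps “pull request updated” service hook payload, with `eventType` set to `git.pullrequest.updated` and a `resource` object carrying `pullRequestId` and `status`. The function names are hypothetical, and the actual deletion is passed in as a callable so the sketch runs without touching Azure.

```python
from typing import Callable, Optional

RESOURCE_GROUP_PREFIX = "SamLearnsAzure"  # from the examples in this post


def resource_group_to_delete(payload: dict) -> Optional[str]:
    """Return the PR resource group name if the payload reports a completed PR."""
    if payload.get("eventType") != "git.pullrequest.updated":
        return None
    resource = payload.get("resource") or {}
    if resource.get("status") != "completed":
        return None  # still active, so keep the environment
    pull_request_id = resource.get("pullRequestId")
    if pull_request_id is None:
        return None
    return f"{RESOURCE_GROUP_PREFIX}PR{pull_request_id}"


def handle_webhook(
    payload: dict, delete_resource_group: Callable[[str], None]
) -> Optional[str]:
    """Delete the PR resource group when the pull request has completed."""
    name = resource_group_to_delete(payload)
    if name is not None:
        # In a real handler this would call Azure; here it is injected.
        delete_resource_group(name)
    return name


if __name__ == "__main__":
    deleted = []  # stand-in for the real deletion call
    sample = {
        "eventType": "git.pullrequest.updated",
        "resource": {"pullRequestId": 429, "status": "completed"},
    }
    handle_webhook(sample, deleted.append)
    print(deleted)
```

Keeping the payload check separate from the deletion makes the rule easy to test, and a long-running delete can then be handed off without holding the webhook request open.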
We have created an automated process that builds an isolated environment for testing the changes in each pull request. This is an incredible achievement for this project, a feature we had been planning for months. Over the next few weeks, we plan to look more closely at some of the details we’ve glossed over today, including:
- Creating the webhook to extend our pull requests and delete the PR resource group when the PR is complete. (blog post) (code)
- Creating a PowerShell script to create GoDaddy CNAMEs (blog post) (code)
- Adding a database restore step during the PR creation if the tables haven’t been populated (our PR environment needs data to test!) (code)
As we complete these future posts, we will link to them here to keep the series together.
References
- Using null in conditions: https://stackoverflow.com/questions/56875665/how-to-deal-with-null-for-custom-condition-in-azure-pipeline
- Passing variables between stages: https://stefanstranger.github.io/2019/06/26/PassingVariablesfromStagetoStage/
- YAML Variables: https://docs.microsoft.com/en-us/azure/devops/pipelines/process/variables?view=azure-devops&tabs=yaml%2Cbatch
- YAML Expressions: https://docs.microsoft.com/en-us/azure/devops/pipelines/process/expressions?view=azure-devops#functions