Change control in DevOps
Containing risk through continuous delivery.
How can you prove that changes are under control if developers are pushing changes out to production 10 or 50 times each day? How does a Change Advisory Board (CAB) function in DevOps? How and when are change control and authorization performed in an environment where developers push changes directly to production? How can you prove that management was aware of all these changes before they were deployed?
Information Technology Infrastructure Library (ITIL) change management and the associated paperwork and meetings were designed to deal with big changes that were few and far between. Big changes require you to work out operational dependencies in advance and to understand operational risks and how to mitigate them, because big, complex changes done infrequently are risky. In ITIL, smaller changes were the exception and flowed under the bar.
DevOps reverses this approach to change management by optimizing for small, frequent changes: it breaks big changes down into small incremental steps, and it streamlines and automates how those small changes are managed. Compliance and risk management need to adapt to this new reality.
Iterative, incremental change to contain risks
DevOps and Continuous Delivery reduce the risk of change by making many small, incremental changes instead of a few “big bang” changes.
Changing more often exercises and proves out your ability to test and successfully push out changes, enhancing your confidence in your build and release processes. It also forces you to automate and streamline these processes, including configuration management, testing, and deployment, which makes them more repeatable, reliable, and auditable.
Smaller, incremental changes are safer by nature. Because the scope of any change is small and generally isolated, the “blast radius” of each change is contained. Mistakes are also easier to catch because incremental changes made in small batches are easier to review and understand upfront, and require less testing.
When something does go wrong, it is also easier to understand what happened and to fix it, either by rolling back the change or pushing a fix out quickly using the Continuous Delivery pipeline.
It’s also important to understand that in DevOps many changes are rolled out dark. That is, they are disabled by default, using runtime “feature flags” or “feature toggles.” These features are usually switched on only for a small number of users at a time or for a short period, and in some cases only after an “operability review” or pre-mortem review to make sure that the team understands what to watch for and is prepared for problems.
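As an illustration only, a dark launch behind a runtime feature flag might look something like the following Python sketch. The `FeatureFlag` class, its fields, and the hash-based bucketing are assumptions made for this example, not a description of any particular feature-flag product.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class FeatureFlag:
    """Hypothetical runtime feature flag; names and fields are illustrative."""

    name: str
    enabled: bool = False      # dark by default: deployed code stays switched off
    rollout_percent: int = 0   # share of users (0-100) who see it once enabled

    def is_on_for(self, user_id: str) -> bool:
        """Return True if this user should see the feature."""
        if not self.enabled:
            return False
        # Hash the flag name and user ID so that each user lands in a stable
        # bucket from 0 to 99 and keeps the same answer between requests.
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode()).hexdigest()
        bucket = int(digest, 16) % 100
        return bucket < self.rollout_percent
```

In this sketch, the new code path ships to production with `enabled=False`. After an operability review, the team could set `enabled=True` with a small `rollout_percent`, watch the results, and raise the percentage gradually or switch the flag off again without another deployment.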
Another way to minimize the risk of change in Continuous Delivery or Continuous Deployment is canary releasing, named after the “canary in a coal mine.” A change is rolled out to a single node first and automatically checked to ensure that there are no errors or negative trends in key metrics (for example, conversion rates). If problems are found with the canary system, the change is rolled back, the deployment is canceled, and the pipeline is shut down until a fix is ready to go out. If the canary is still healthy after a specified period of time, the change is rolled out to more servers, and then eventually to the entire environment.
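The canary flow can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions: the `Metrics` fields, the thresholds, and the `deploy`, `observe`, and `rollback` callables are hypothetical stand-ins for whatever your deployment tooling and monitoring provide.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Metrics:
    """Hypothetical key metrics sampled from a group of nodes."""

    error_rate: float        # fraction of requests that failed
    conversion_rate: float   # fraction of sessions that converted


def canary_healthy(baseline: Metrics, canary: Metrics,
                   max_error_increase: float = 0.01,
                   max_conversion_drop: float = 0.05) -> bool:
    """Compare canary metrics to the baseline; the thresholds are assumptions."""
    if canary.error_rate > baseline.error_rate + max_error_increase:
        return False
    if canary.conversion_rate < baseline.conversion_rate * (1 - max_conversion_drop):
        return False
    return True


def canary_rollout(stages: Sequence[Sequence[str]],
                   baseline: Metrics,
                   deploy: Callable[[str], None],
                   observe: Callable[[Sequence[str]], Metrics],
                   rollback: Callable[[str], None]) -> bool:
    """Deploy stage by stage; roll everything back if any stage looks unhealthy."""
    deployed: List[str] = []
    for stage in stages:
        for node in stage:
            deploy(node)
            deployed.append(node)
        # observe() is assumed to wait out the soak period before it returns
        # the metrics collected from the nodes in this stage.
        if not canary_healthy(baseline, observe(stage)):
            for node in reversed(deployed):
                rollback(node)
            return False  # deployment canceled; the pipeline stops here
    return True
```

A call might pass `stages=[["node-1"], ["node-2", "node-3"]]`, so that a single canary node receives the change first and the remaining nodes follow only if the canary stays healthy. A `False` return value is the signal to halt the pipeline until a fix is ready.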