Pitfalls in deployment automation

There is no doubt that organizations with many manual IT tasks should automate at least some of them. As automation tools such as Ansible and Puppet mature, the skills needed to use them are becoming common as well. However, automation that is not carried out properly can cost the organization an arm and a leg. The ultimate goal of automation is to create value for business delivery.

This article discusses some common pitfalls in automation initiatives.

Automate without justification

If automating a task takes more time than the automation will ever save, the automation has no value. This is particularly true for long but one-off processes. The equation is not the same in every case, however, because of the productivity difference between senior and junior engineers: a senior engineer may need much less time to build the automation, which can make it worthwhile.
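The trade-off above can be sketched as a simple break-even calculation. This is a minimal illustration in Python, not a formal cost model: the function names and all the hour figures are hypothetical, and it assumes the time saved per run is constant.

```python
def break_even_runs(build_hours: float,
                    manual_hours_per_run: float,
                    automated_hours_per_run: float = 0.0) -> float:
    """Return how many runs it takes for the automation to pay for itself.

    All inputs are estimates in hours (hypothetical figures supplied by the caller).
    """
    saved_per_run = manual_hours_per_run - automated_hours_per_run
    if saved_per_run <= 0:
        # The automated run is no faster than the manual one: it never pays off.
        return float("inf")
    return build_hours / saved_per_run


def is_worth_automating(build_hours: float,
                        manual_hours_per_run: float,
                        expected_runs: int,
                        automated_hours_per_run: float = 0.0) -> bool:
    """True when the expected number of runs exceeds the break-even point."""
    return expected_runs > break_even_runs(
        build_hours, manual_hours_per_run, automated_hours_per_run
    )


# Hypothetical example: a task that takes 2 hours by hand and is expected
# to run 10 times. A senior engineer needs 8 hours to automate it, a junior
# engineer needs 40 hours.
senior_ok = is_worth_automating(build_hours=8, manual_hours_per_run=2, expected_runs=10)
junior_ok = is_worth_automating(build_hours=40, manual_hours_per_run=2, expected_runs=10)
```

With these assumed numbers the senior engineer breaks even after 4 runs, so the automation pays off within the 10 expected runs, while the junior engineer would need 20 runs and it does not. For a one-off process (a single expected run), almost any build time fails the test.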

Automating a bad process

An IT operation that is full of manual processes is also full of automation opportunities. Before acting on one, it is important to assess whether the underlying manual process is itself sound and validated. A manual process that is complex and fragile is not a great candidate for automation, and trying to automate such a process tends to create more issues. In general, processes that are stable, repeatable within a certain set of components, widely used in different scenarios and have few dependencies are good opportunities for automation.

Automate without managing the system state

Successful automation requires that the system remain in an automatable state. For example, configuration files need to be categorized and placed in the correct locations. Configuration files specific to the local server environment should not be accidentally overwritten. Conversely, configuration files that are in the scope of automation should live in a location managed exclusively by automation, and no manual process should manage those same files. Immutable infrastructure is highly recommended for automation. Where immutable infrastructure is not feasible, governance over manual practices should be in place to prevent the server from becoming a snowflake system through configuration drift.
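One way to notice configuration drift in a directory that is supposed to be managed exclusively by automation is to compare file digests against the values the automation last wrote. The following Python sketch assumes such a record of expected digests exists; the function names, the directory layout and the three drift labels are hypothetical and not part of any particular tool.

```python
import hashlib
from pathlib import Path
from typing import Dict


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_drift(managed_dir: Path, expected: Dict[str, str]) -> Dict[str, str]:
    """Compare files under managed_dir with the digests automation recorded.

    `expected` maps a path relative to managed_dir to its SHA-256 digest.
    Returns a mapping of relative path to one of:
      "missing"   - recorded by automation but no longer on disk
      "modified"  - on disk, but the content differs from the record
      "unmanaged" - on disk, but never written by automation
    """
    actual = {
        p.relative_to(managed_dir).as_posix(): sha256_of(p)
        for p in managed_dir.rglob("*")
        if p.is_file()
    }
    drift: Dict[str, str] = {}
    for name, digest in expected.items():
        if name not in actual:
            drift[name] = "missing"
        elif actual[name] != digest:
            drift[name] = "modified"
    for name in actual.keys() - expected.keys():
        drift[name] = "unmanaged"
    return drift
```

An empty result means the managed location still matches what automation deployed. Any other result is a hint that a manual change touched files that should belong to automation alone.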

Automation without verification

Immutable infrastructure is a good friend of automation. If automation does not yield a consistent environment, the automation process erodes the very environment that allows it to be efficient. Verification is critical to ensure that the system state is consistent after the automation has run. Many automation technologies, such as Ansible and Puppet, use declarative languages to make operations idempotent. However, idempotence does not replace system-level verification, which remains a best practice for automation developers.
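As an illustration of system-level verification, the sketch below runs a set of named checks after an automation run and reports which ones failed. It is a minimal Python example under stated assumptions: the helper names `port_is_open` and `verify` are hypothetical, and a real deployment would choose checks that reflect what "consistent" means for its own systems.

```python
import socket
from typing import Callable, Dict, List


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def verify(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every check and return the names of the checks that failed.

    A check that raises an exception is counted as a failure, so one
    broken check does not hide the results of the others.
    """
    failed = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            ok = False
        if not ok:
            failed.append(name)
    return failed
```

A deployment job could call `verify({"service port": lambda: port_is_open("127.0.0.1", 8080)})` as its last step (the host and port here are placeholders) and fail the run if the returned list is not empty, instead of trusting that an idempotent task reporting success means the system actually works.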

Automation without managing complexity

Automation simplifies manual steps, but it also creates a layer of abstraction and potentially more problems at that layer. If a component is not well designed for automation, efforts to automate it tend to cover only a very simplistic scenario. For example, NiFi is neither a systemd service nor well supported by Ansible’s service module. Trying to automate a NiFi service restart results in unnecessarily complicated playbooks with additional verification steps, which are not easy for other developers to maintain. This is also referred to as external automation, where automation is introduced as an adaptation to a bad process (see the section on automating a bad process above). As a result, whenever the object being automated changes in the future, the automation developer has to update the tooling to keep the automation working. This is very common with components that were not designed for automation at all or are not automation-friendly.

Automation in silos

The organization should have oversight of the consistency of automation tools, approaches and documentation. Separate individual efforts can result in inconsistencies, which can reduce automation adoption. These efforts must be integrated into the organization's automation strategy and comply with the organization's governance rules. It should be made clear who the customer of the automation is, and that customer should get a consistent experience across all automation processes.

Over-automation

It is important to be aware of the limitations of automation and to withdraw from it whenever appropriate. For example, if the process involves customer collaboration and the customer's operations cannot guarantee immutable servers, then the deployment process becomes customer-specific and can be automated only partially, if at all. This is perfectly okay, because the level of automation only needs to match the delivery model. Over-automation can lead either to an unreliable process or to one that systematically damages the system.

Even well-designed automation has its limitations. It increases people’s reliance on it. Although it may work 99% of the time, the remaining 1% of scenarios present huge technical challenges to a user who operates it without understanding how it works. This is seen in many other industries, such as aviation.