Problem
Our expanding cloud platform had so far been orchestrated manually via Amazon’s web interface. We identified risks in how its architecture was defined and provisioned: high cognitive load, critical dependencies on individual members of staff, and reliance on our provider. Despite it being a fledgling platform, it was already becoming harder to train people and harder to understand the extent of our services, and if the configuration were to be lost, we’d have to start again!
This is a problem solved by Infrastructure-as-Code (IaC), something that had been on our radar for a while but had never presented itself as a necessity.
Solution
We earmarked some time to make significant progress in learning the core concepts of Terraform, supported by some external consulting. We chose Terraform after some investigation, as it is the most popular declarative framework for this kind of lifecycle management.
I manually transcribed a section of our architecture (a static website in an “environment” with correct network access) into Terraform configuration using the Terraform documentation, learning some AWS intricacies along the way. I ran daily peer reviews to share knowledge and produce thorough documentation. I also communicated with new DevOps hires to ensure parity.
As IaC produces reusable components, I was able to use what I had built to kick-start the next project.
Learnings
- Basic principles of IaC, in addition to some networking principles such as CIDR blocks and Internet Gateways
- IAM Role management
- Using the AWS CLI in a GitOps workflow with Bitbucket Cloud
- It is important to ask the right questions of contractors to reduce wasted effort
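
To illustrate the CIDR blocks mentioned above, here is a minimal sketch using Python’s standard ipaddress module. The 10.0.0.0/16 range and the public/private naming are hypothetical examples chosen for illustration, not our actual configuration.

```python
import ipaddress

# Hypothetical network range, chosen for illustration only.
vpc = ipaddress.ip_network("10.0.0.0/16")

# Split the /16 into /24 subnets and take the first two,
# for example one public and one private subnet.
subnets = list(vpc.subnets(new_prefix=24))
public_subnet, private_subnet = subnets[0], subnets[1]

print(public_subnet)                # 10.0.0.0/24
print(private_subnet)               # 10.0.1.0/24
print(public_subnet.num_addresses)  # 256

# Check which subnet a given address falls into.
print(ipaddress.ip_address("10.0.1.25") in private_subnet)  # True
```

The prefix length (the number after the slash) sets how many leading bits are fixed, so a /24 leaves 8 bits, or 256 addresses, per subnet. AWS reserves five addresses in every subnet, so slightly fewer are usable in practice.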