Terraform vs CloudFormation – A comparison of Infra Management tools

A few months back, we had to migrate a client’s infrastructure from their data centre to Amazon Web Services (AWS). The requirements were to use a VPC and the standard security measures that Amazon offers around it. While deciding which tool to use for setting up the infrastructure, we initially evaluated CloudFormation, Elastic Beanstalk, OpsWorks and Terraform. Let’s not delve into why we rejected Beanstalk and OpsWorks (that could be a different post altogether); in the end, the decision was between CloudFormation and Terraform. We went ahead with Terraform, but in this post I will try to give an unbiased view of how the two tools compare as of today and which one might suit which situation.

1. Vendor Neutrality

Terraform is vendor neutral. It can describe resources for multiple popular providers such as AWS, DigitalOcean, Google Cloud, CloudFlare, Heroku, Consul and a few more. CloudFormation, on the other hand, is an Amazon product, so vendor lock-in comes with it. If vendor neutrality is what you need, Terraform can be the answer.

2. Support for standard AWS components

Currently, Terraform does not support all AWS resources. For example, resources for VPN Gateway setup and VPC peering are missing. However, Terraform is open source and written in Go, so you can always contribute to it. CloudFormation comes with all the resources you will need.

3. State management

State management is currently the “chink in the armor” for Terraform, and I would like to elaborate on this. Terraform manages state via a JSON file. This file serves as the source of truth about what the actual environment contains. The problem is that Terraform cannot uniquely identify the resources it has created without that file.

This is how Terraform works: it maintains a local state file in which it keeps track of every CRUD operation it performs. Whenever Terraform is invoked, it compares:
a) the resources defined in the template files by the developer, and
b) the local state file.
Take the state file away and that knowledge is lost: Terraform will try to create all the resources defined in the .tf files again. This means the state file has to be shared between developers, and that can lead to chaos!
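The comparison described above can be pictured with a small Python sketch. This is only an illustration of the idea, not Terraform’s actual implementation; the `plan` function and the resource names and attributes are made up for the example.

```python
def plan(desired, state):
    """Compare the desired config with the recorded state and list the actions.

    Both arguments are hypothetical dicts mapping a resource name to its
    attributes; real Terraform state files are considerably richer.
    """
    actions = []
    for name, attrs in sorted(desired.items()):
        if name not in state:
            actions.append(("create", name))
        elif state[name] != attrs:
            actions.append(("update", name))
    for name in sorted(state):
        if name not in desired:
            actions.append(("destroy", name))
    return actions


# What the developer wrote in the template files (illustrative values).
desired = {
    "aws_vpc.main": {"cidr_block": "10.0.0.0/16"},
    "aws_subnet.web": {"cidr_block": "10.0.1.0/24"},
}

# What the state file says already exists.
state = {
    "aws_vpc.main": {"cidr_block": "10.0.0.0/16"},
    "aws_subnet.old": {"cidr_block": "10.0.9.0/24"},
}

with_state = plan(desired, state)
without_state = plan(desired, {})  # the state file has been lost

print(with_state)
print(without_state)
```

With the state file present, only the difference is acted upon. With an empty state, every resource in the config shows up as a "create", which is the recreation problem described above.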

CloudFormation, on the other hand, has no state file for you to manage; AWS keeps track of each stack on its side. Multiple developers can work on the same environment / stack without worrying about polluting it.

4. “Infrastructure As Code”

Terraform provides a DSL to set up the different resources. However, it lacks the common logical constructs found in most programming and scripting languages: it has no looping constructs and does not support conditional statements. CloudFormation templates are written in JSON, so there are no such constructs there either, but there are some popular Ruby gems like cfndsl which provide a better abstraction over the JSON.
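To show what such an abstraction buys you, here is a minimal sketch that generates a CloudFormation template from a loop. It uses plain Python and the standard `json` module rather than cfndsl itself; the `build_template` helper, the VPC ID and the CIDR blocks are placeholders invented for the example.

```python
import json


def build_template(vpc_id, cidrs):
    """Build a CloudFormation template dict with one subnet per CIDR block."""
    resources = {}
    for index, cidr in enumerate(cidrs, start=1):
        # A loop replaces the copy-pasted JSON blocks a hand-written
        # template would need.
        resources[f"Subnet{index}"] = {
            "Type": "AWS::EC2::Subnet",
            "Properties": {"VpcId": vpc_id, "CidrBlock": cidr},
        }
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


# "vpc-12345678" is a placeholder ID, not a real VPC.
template = build_template("vpc-12345678", ["10.0.1.0/24", "10.0.2.0/24"])
print(json.dumps(template, indent=2))
```

The printed JSON can then be handed to CloudFormation like any hand-written template. The logic lives in the generator, not in the template.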

5. Infrastructure Updates

This is the absolute killer feature of Terraform. Terraform has separate planning and execution phases. The planning phase shows which resources will be created, modified and destroyed, so you can see how your changes will affect the existing environment before anything is touched, which is quite crucial. This was one of the main reasons why we went ahead with Terraform. CloudFormation does not show you what changes it is going to make to the environment, and at times it might end up deleting resources without prior notice. As a result, developers are forced to reason mentally about the effects of a change, which quickly becomes unmanageable in large infrastructures.

6. Roll back Mechanism

If a resource is successfully created but fails during provisioning, Terraform will report an error and mark the resource as “tainted”. In the next execution plan, Terraform will remove the tainted resources and attempt to provision them again. It does not roll anything back automatically, because it follows the execution plan very strictly. In CloudFormation, if a resource creation fails, the entire stack is rolled back by default. Sometimes the rollback fails too, resulting in frozen stacks which require Amazon technical support.
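The tainted-resource behaviour can be sketched roughly as follows. This is a simplified Python model of the idea, not Terraform code; `apply`, `next_plan`, `flaky_provision` and the resource names are all hypothetical.

```python
def apply(resources, provision):
    """Create each resource, then provision it; mark it tainted on failure."""
    state = {}
    for name in resources:
        state[name] = {"tainted": False}  # creation itself succeeded
        try:
            provision(name)
        except RuntimeError:
            # The resource exists but is in an unknown condition.
            state[name]["tainted"] = True
    return state


def next_plan(state):
    """Tainted resources are destroyed and created again on the next run."""
    actions = []
    for name, entry in sorted(state.items()):
        if entry["tainted"]:
            actions.append(("destroy", name))
            actions.append(("create", name))
    return actions


def flaky_provision(name):
    # Pretend provisioning fails for one instance only.
    if name == "aws_instance.web":
        raise RuntimeError("provisioner failed")


state = apply(["aws_instance.db", "aws_instance.web"], flaky_provision)
actions = next_plan(state)
print(actions)
```

Note that the healthy resource is left alone. Only the tainted one is scheduled for replacement, in contrast to a whole-stack rollback.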

7. Stability and Community Support

Terraform recently released version 0.3.5. It is not yet completely mature or stable, but it is moving in that direction. It is an open source tool managed by HashiCorp, the renowned creators of Vagrant, Packer and Consul. CloudFormation is a stable tool that has been in use for quite some time and is managed entirely by Amazon.

Conclusion

As you can see, overall both Terraform and CloudFormation have their pros and cons. The reason we went ahead with Terraform was that the planning phase was quite important for our workflow. Also, we had a workaround in mind to solve the state management problem. Regarding the missing resources, we started contributing whatever was essential for us.

So, that’s all folks! Based on your requirements and priorities, you will need to decide which of these tools is more suitable for you. I hope this post helps you make that choice. Do post your suggestions and feedback. Happy coding!!

17 Comments

  1. What are your thoughts for the workaround of the lack of shared state?

    I want to use terraform, but I’m not the only one in my org making changes. It seems like there needs to be a transactional service on top of terraform to ensure the state changes are ordered and only made one at a time.

  2. Funny thing, a bit after I wrote my reply I stumbled across this new feature in version 0.3.5 that does just that, remote hosting of the tfstate file: https://www.terraform.io/docs/commands/remote.html

  3. Have you looked at Fugue for “machine as code” immutable state? https://fugue.it/

  4. I’ve been having fun with Terraform recently and also struggled with state management. I wanted version-controlled state management but didn’t want to adopt Atlas (not yet anyway). Currently I’m using Go for deployments, so I have pipelines set up to apply the state. To tackle state management, I created a bash script which pulls the latest state from a GitHub repo before terraforming, runs the terraform commands, then pushes the state back to the repo if the state has changed. This is not without risks, and I had to make sure I was pushing only the changed .tfstate files from a clean copy of my repo which was in line with my remote origin master, but for now it seems to do the job. It would be nice if Terraform included an option for remote state management with git out of the box. They do support S3, but I really wanted the version control and tracking you get from git.

  5. We performed a similar analysis and due to the lack of required resources supported by Terraform went with CloudFormation. However, for similar criticisms that you outlined above, we created an open source Scala DSL to generate CloudFormation templates (and a companion project to administer them): http://engineering.monsanto.com/2015/07/10/cloudformation-template-generator/

  6. Great blog! This is just what I was looking for to decide between Terraform and troposphere. I tried the troposphere Python library to create CloudFormation templates and it works great. You can use all the Python features: your own Python modules to generate different parts of the template, conditions, loops, etc.

    As I understand it, Terraform will be the right choice if we have to consider a cloud migration in future; otherwise, for AWS, stick to CloudFormation (preferably using troposphere). Please correct me if I am wrong.

    • Hi Chetan, thanks for the compliment.
      In my opinion, Terraform is a perfect tool if you want to avoid vendor lock-in and be able to work with different cloud providers. Also, Terraform’s “plan” action is really helpful. For AWS specifically, plain CF is not the best solution; something like troposphere (I haven’t used it) on top of CF would work well.

  7. Great article! It explains the difference between the two tools in a simple way.
