Infrastructure as Code: An Introduction to Terraform & Pulumi

Advances in cloud technology and virtualization have introduced a new paradigm for managing infrastructure. The manual, tedious, and error-prone process of going to a cloud service provider (CSP) portal, selecting resources you want to create, and clicking through workflow steps has been replaced with a more repeatable and elegant solution: managing resources using infrastructure as code (IaC).

IaC is the practice of provisioning and managing cloud infrastructure and resources using code. By describing cloud resources in code, we can make infrastructure deployments more repeatable and consistent. This gives teams the confidence to quickly stand up new environments, modify existing ones, and eliminate configuration drift. Adding version control (e.g., Git) on top lets you tie updates and rollbacks to commit hashes. In other words, every single change to your infrastructure can be tracked via a commit.
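To make the idea of configuration drift concrete, here is a minimal sketch of comparing what the code declares against what actually exists. It is an illustration only, not how Terraform or Pulumi are implemented; the `Resource` and `Action` types, the `plan` function, and the sample resources are all hypothetical.

```typescript
// Hypothetical, simplified model: a resource is a name plus flat string properties.
type Resource = { name: string; props: Record<string, string> };

type Action =
  | { kind: "create"; name: string }
  | { kind: "update"; name: string; changed: string[] }
  | { kind: "delete"; name: string };

// Compare what the code declares (desired) with what exists (actual)
// and return the actions needed to make the two match.
function plan(desired: Resource[], actual: Resource[]): Action[] {
  const actions: Action[] = [];

  for (const want of desired) {
    const have = actual.filter(r => r.name === want.name)[0];
    if (have === undefined) {
      actions.push({ kind: "create", name: want.name });
      continue;
    }
    // Any property that differs, or exists on only one side, counts as drift.
    const changed = Object.keys(want.props)
      .concat(Object.keys(have.props))
      .filter((key, i, all) => all.indexOf(key) === i) // de-duplicate keys
      .filter(key => want.props[key] !== have.props[key])
      .sort();
    if (changed.length > 0) {
      actions.push({ kind: "update", name: want.name, changed });
    }
  }

  // Resources that exist but are no longer declared in code get removed.
  for (const have of actual) {
    if (desired.filter(r => r.name === have.name).length === 0) {
      actions.push({ kind: "delete", name: have.name });
    }
  }
  return actions;
}

const desired: Resource[] = [
  { name: "web", props: { instanceType: "t2.micro" } },
  { name: "db", props: { engine: "postgres" } },
];
const actual: Resource[] = [
  { name: "web", props: { instanceType: "t2.small" } }, // changed by hand in the portal
  { name: "old-cache", props: { engine: "redis" } },    // no longer declared in code
];

// Yields: update "web" (instanceType), create "db", delete "old-cache".
const actions = plan(desired, actual);
```

Because the declared state lives in version control, rolling back amounts to checking out an earlier commit and computing this kind of comparison again.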

Among the many tools used to implement IaC, two major ones take the spotlight: Terraform and Pulumi. Terraform is a powerful tool that abstracts cloud APIs behind a domain-specific language (DSL) called HashiCorp Configuration Language (HCL), a declarative language that is fairly easy to read and write. A code snippet for creating an AWS EC2 instance in Terraform is shown below.

data "aws_ami" "awslinux" {
  most_recent = true
  owners      = ["137112412989"] # Amazon

  filter {
    name   = "name"
    values = ["amzn-ami-hvm-*-x86_64-ebs"]
  }
}

resource "aws_security_group" "allow_tls" {
  name        = "allow_tls"
  description = "Allow TLS inbound traffic"
  vpc_id      = aws_vpc.main.id # assumes an aws_vpc.main resource is defined elsewhere

  # Nested blocks let omitted optional arguments fall back to their defaults.
  ingress {
    description      = "TLS from VPC"
    from_port        = 443
    to_port          = 443
    protocol         = "tcp"
    cidr_blocks      = [aws_vpc.main.cidr_block]
    ipv6_cidr_blocks = [aws_vpc.main.ipv6_cidr_block]
  }

  # Allow all outbound traffic.
  egress {
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }
}

resource "aws_instance" "web" {
  ami                    = data.aws_ami.awslinux.id
  instance_type          = "t2.micro"
  vpc_security_group_ids = [aws_security_group.allow_tls.id]
}

As you can see, HCL is easy to read and understand, especially if you have prior experience with JSON or YAML, since most of its data structures are key-value pairs, lists, or some combination of both. For more examples and a more in-depth tutorial for using Terraform with AWS, see here.

Terraform is relatively intuitive and its learning curve is gentle. However, HCL lacks the flexibility of a general-purpose programming language, which is where Pulumi steps in. Pulumi offers a complete software development kit (SDK) and lets you write IaC in a variety of languages (think TypeScript, JavaScript, Python, Go, or even C#). This flexibility allows for writing loops, conditionals, functions, classes, and more. It also lets developers use familiar integrated development environment (IDE) features to validate code and reduce errors. The code can be linted, and tests can be written and measured for coverage, all of which raises confidence that the code performs as intended. Languages such as TypeScript are type-checked at compile time, so minor mistakes such as a misspelled property name can be caught and fixed before anything is deployed.
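As a rough illustration of that flexibility, the sketch below uses an ordinary function, a loop, and a conditional to generate security group rules. The `IngressRule` interface and the `buildIngress` helper are hypothetical names invented for this example. The interface mirrors the fields of the ingress arguments accepted by `aws.ec2.SecurityGroup` in Pulumi, but is redeclared here so the snippet runs without the Pulumi SDK installed.

```typescript
// Mirrors the ingress fields used with aws.ec2.SecurityGroup in Pulumi;
// redeclared locally (an assumption of this sketch) to stay self-contained.
interface IngressRule {
  protocol: string;
  fromPort: number;
  toPort: number;
  cidrBlocks: string[];
  description: string;
}

// Hypothetical helper: one TCP rule per port. In "prod" traffic is limited
// to the VPC's CIDR range; in "dev" the ports are open to any address.
function buildIngress(
  ports: number[],
  env: "dev" | "prod",
  vpcCidr: string,
): IngressRule[] {
  const rules: IngressRule[] = [];
  for (const port of ports) {
    if (port < 1 || port > 65535) {
      throw new Error(`invalid port: ${port}`);
    }
    rules.push({
      protocol: "tcp",
      fromPort: port,
      toPort: port,
      cidrBlocks: [env === "prod" ? vpcCidr : "0.0.0.0/0"],
      description: `tcp/${port} (${env})`,
    });
  }
  return rules;
}

// Two rules, both restricted to the example VPC range 10.0.0.0/16.
const prodRules = buildIngress([443, 8443], "prod", "10.0.0.0/16");
```

In a real Pulumi program, an array like `prodRules` could be passed as the `ingress` argument when constructing the security group. Because `buildIngress` is plain TypeScript, it can also be unit-tested without provisioning anything.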

A code snippet for creating an AWS EC2 instance in TypeScript using Pulumi is shown below:

import * as aws from "@pulumi/aws";

// Look up the most recent Amazon Linux 2 AMI published by Amazon.
const amiId = aws.ec2.getAmi({
  owners: ["amazon"],
  mostRecent: true,
  filters: [{
    name: "name",
    values: ["amzn2-ami-hvm-2.0.????????-x86_64-gp2"],
  }],
}).then(ami => ami.id);

const group = new aws.ec2.SecurityGroup("allow-tls", {
  ingress: [
    { protocol: "tcp", fromPort: 443, toPort: 443, cidrBlocks: ["0.0.0.0/0"]},
  ],
});

const server = new aws.ec2.Instance("web", {
  ami: amiId,
  instanceType: aws.ec2.InstanceType.T2_Micro,
  vpcSecurityGroupIds: [ group.id ],
});

By default, Pulumi uses the hosted Pulumi Service backend, which requires a Pulumi account. However, alternative backends (e.g., the local filesystem or AWS S3) are available that eliminate the dependence on Pulumi's managed offering.

See here for a more in-depth tutorial on using Pulumi with AWS.

Both tools serve their purpose in implementing IaC, and the decision between them will come down to your specific needs. When choosing a tool, it’s important to consider how much experience the team has with full-fledged programming languages, as well as the longer-term maintainability of the code and the tool. Apart from these two cloud-agnostic tools, there are also CSP-specific IaC tools and templates (AWS CloudFormation, Azure ARM templates, Google Cloud Deployment Manager). Each CSP has its own format and rules for writing templates, typically in JSON or YAML, and these templates are not portable across CSPs. One advantage of using a CSP-specific template is that many CSPs have prepared templates readily available for use. However, be aware that this may come at a cost when clients decide to switch to a new CSP and you have to start again from scratch.

At CTG, we use IaC across many of our client projects. By managing resources with Terraform, we can quickly upgrade, patch, modify, and destroy common resources (EC2 instances, S3 buckets, IAM policies, relational databases, etc.) in a declarative fashion. We also use Terraform to manage our company’s own internal Kubernetes infrastructure, on which we host many of our public projects. As strong proponents of IaC, we believe that what can be easily seen can also be easily tracked, and is therefore easier to understand and manage. Infrastructure as Code gives us the confidence and speed needed to quickly and uniformly manage robust cloud deployments for our work.