1. Overview

Previously, we covered Terraform’s basic concepts and usage. Now, let’s dig deeper and cover some best practices for using this popular DevOps tool.

2. Resource Files Organization

When we start using Terraform, it’s not uncommon to put every resource definition, variable, and output in a single file. However, this approach quickly leads to code that is hard to maintain and reuse.

A better approach is to take advantage of the fact that, within a module, Terraform reads every “.tf” file and processes its contents. The order in which we declare resources across those files is irrelevant; figuring out dependencies is Terraform’s job, after all. Still, we should keep the files organized so it’s easier to understand what’s going on.

Here, consistency matters more than the particular way we choose to organize resources in our files. A common practice is to use a standard file layout per module:

  • variables.tf: All of the module’s input variables go here, along with their default values where applicable
  • main.tf: This is where we’ll put our resource definitions. Assuming that we’re following the Single Responsibility principle, its size should stay under control
  • modules: If our module contains any sub-modules, this directory is where they’ll go
  • outputs.tf: Exported data items should go here
  • providers.tf: Used only in the top-level directory, this declares which providers we’ll use in the project, including their versions

This organization allows any team member who wants to use our module to locate the required variables and output data quickly.

Also, as our module evolves, we must keep an eye on the size of its main.tf file. A good sign that we should consider refactoring is when the file grows so large that it becomes hard to follow at a glance. At that point, we should move tightly-coupled resources, such as an EC2 instance and an attached EBS volume, into nested modules. In the end, chances are our top-level main.tf file will contain only module references stitched together.
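As a rough sketch of what that end state could look like, here’s a hypothetical web_server module grouping an EC2 instance with its EBS volume, and a top-level main.tf that only references it (all names, variables, and values below are illustrative, not taken from a real project):

# modules/web_server/main.tf: tightly-coupled resources live together in a nested module
# (the variable declarations would live in modules/web_server/variables.tf)
resource "aws_instance" "this" {
  ami               = var.ami_id
  instance_type     = var.instance_type
  availability_zone = var.availability_zone
}

resource "aws_ebs_volume" "data" {
  availability_zone = var.availability_zone
  size              = var.data_volume_size
}

resource "aws_volume_attachment" "data" {
  device_name = "/dev/sdf"
  volume_id   = aws_ebs_volume.data.id
  instance_id = aws_instance.this.id
}

# top-level main.tf: only module references stitched together
module "web_server" {
  source            = "./modules/web_server"
  ami_id            = "ami-12345678"
  instance_type     = "t3.micro"
  availability_zone = "us-east-1a"
  data_volume_size  = 20
}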

3. Modules Usage

Modules are a powerful tool, but, as in any larger software project, it takes some time to get the level of abstraction right and maximize reuse across projects. Given that Terraform, like the whole infrastructure-as-code practice, is relatively new, this is an area where we see many different approaches.

That said, we can still apply some lessons learned from application codebases to help with proper module organization. Among them, the Single Responsibility Principle from the S.O.L.I.D. set of principles is quite useful.

In our context, a module should focus on a single aspect of the infrastructure, such as setting up a VPC or creating a virtual machine – and just that.

Let’s take a look at a sample Terraform project directory layout that uses this principle:

$ tree .
.
├── main.tf
├── modules
│   ├── ingress
│   │   └── www.petshop.com.br
│   │       ├── main.tf
│   │       ├── outputs.tf
│   │       └── variables.tf
... other services omitted
│   └── SvcFeedback
│       ├── main.tf
│       ├── outputs.tf
│       └── variables.tf
├── outputs.tf
├── terraform.tfvars
└── variables.tf

Here, we’ve used modules for each significant aspect of our infrastructure: database, ingress, messaging, external services, and backend services. In this layout, each folder containing .tf files is a module containing three files:

  • variables.tf – Input variables for the module
  • main.tf – Resource definitions
  • outputs.tf – Output attributes definitions

This convention has the benefit that module consumers can go straight to the module’s “contract”, its variables and outputs, and skip the implementation details if they want to.
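For instance, the contract of the SvcFeedback module from the layout above could look something like this (the variable, output, and resource names are hypothetical):

# modules/SvcFeedback/variables.tf
variable "instance_type" {
  type        = string
  description = "EC2 instance type for the feedback service"
  default     = "t3.micro"
}

# modules/SvcFeedback/outputs.tf
# (aws_instance.feedback would be defined in the module's main.tf)
output "service_address" {
  description = "Private address where the feedback service can be reached"
  value       = aws_instance.feedback.private_ip
}

A consumer only needs those two files to use the module: it sets the input variables in a module block and reads the results through an expression like module.svc_feedback.service_address, never touching the module’s main.tf.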

4. Provider Configuration

Most providers in Terraform require us to supply valid configuration parameters so they can manipulate resources. For instance, the AWS provider needs an access key, a secret key, and a region to access our account and execute tasks.

Since those parameters contain sensitive and deployment-target-specific information, we should avoid including them as part of our project’s code. Instead, we should use variables or provider-specific methods to configure them.

4.1. Using Variables to Configure a Provider

In this approach, we define a project variable for every required provider parameter:

variable "aws_region" {
  type = string
}
variable "aws_access_key" {
  type = string
}
variable "aws_secret_key" {
  type = string
}

Now, we use them in our provider declaration:

provider "aws" {
  region     = var.aws_region
  access_key = var.aws_access_key
  secret_key = var.aws_secret_key
}

Finally, we provide actual values using a .tfvars file:

aws_access_key = "xxxxx"
aws_secret_key = "yyyyy"
aws_region     = "us-east-1"

We can also combine .tfvars files and environment variables when running Terraform commands such as plan or apply:

$ export TF_VAR_aws_region="us-east-1"
$ terraform plan -var="aws_access_key=xxxx" -var-file=./aws.tfvars

We’ve used a mix of environment variables and command-line arguments to pass variable values. In addition to those sources, Terraform will also look at variables defined in a terraform.tfvars file and any file with the “.auto.tfvars” extension in the project’s folder.

4.2. Using Provider-Specific Configuration

In many cases, Terraform providers can pick up credentials from the same places used by the corresponding native tools. A typical example is the Kubernetes provider: if our environment already has the native utility kubectl configured to point to our target cluster, we don’t need to provide any extra information.
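For example, assuming kubectl is already configured through the default kubeconfig file, a provider block can stay as minimal as this sketch (the config_path argument is optional here and only makes the source of the credentials explicit):

provider "kubernetes" {
  # Reuse the cluster and credentials already configured for kubectl
  config_path = "~/.kube/config"
}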

5. State Management

Terraform state files usually contain sensitive information, so we must take proper measures to secure them. Let’s take a look at a few of those measures:

  • Always use an exclusion rule for *.tfstate files in our VCS configuration. For Git, this can go in a global exclusion rule or in our project’s .gitignore file, as sketched right after this list.
  • Adopt a remote backend instead of the default local backend as soon as possible. Also, double-check access restrictions to the chosen backend.
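
For the first point, a minimal Git exclusion file for a Terraform project typically looks like this (the .terraform entry also keeps locally downloaded providers and modules out of version control):

# .gitignore
*.tfstate
*.tfstate.*
.terraform/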

Moving from the default local state backend to a remote one is simple: we just have to add a backend definition to one of our project’s files:

terraform {
  backend "pg" {}
}

Here, we’re informing Terraform that it will use the PostgreSQL backend to store state information. Remote backends usually require additional configuration, but since backend blocks can’t reference variables, the recommended approach is to pass the needed parameters through environment variables or an extra configuration file supplied to terraform init with the -backend-config option.
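
For the pg backend above, for instance, we can keep the connection string out of the code and supply it at initialization time (the connection string below is just a placeholder):

$ export PG_CONN_STR="postgres://user:password@dbhost:5432/terraform_backend"
$ terraform init

# ... or, equivalently, as a command-line argument:
$ terraform init -backend-config="conn_str=postgres://user:password@dbhost:5432/terraform_backend"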

The main reason to adopt a remote backend is to enable multiple collaborators/tools to run Terraform against the same target environment. In those scenarios, what we must avoid is having more than one Terraform run happening at the same time against that environment: that can cause all sorts of race conditions and conflicts and will likely create havoc.

By adopting a remote backend, we can avoid those issues, as most remote backends support state locking: only one collaborator at a time can run commands such as terraform plan or terraform apply against a given state.

Another way to enforce proper management of state files is to use a dedicated server to run Terraform. We can use any CI/CD tool, such as Jenkins, GitLab CI, and others. For small teams and organizations, we can also use the free tier of Terraform Cloud, HashiCorp’s SaaS offering.

6. Workspaces

Workspaces allow us to store multiple state files for a single project, much like branches let us keep multiple lines of work in a single repository. We should start using them as soon as we have to deal with multiple target environments. This way, we can use a single codebase to recreate the same resources no matter which environment we point Terraform at.

Of course, environments can and will vary in one way or another, for example, in machine sizing or count. Even so, we can address those differences with input variables passed at apply time.

With those points in mind, a common practice is to name workspaces after environment names. For instance, we can use names such as DEV, QA, and PRD, so they match our existing environments.

If we had multiple teams working on the same project, we could also include their names. For instance, we could have a DEV-SQUAD1 workspace for a team working on new features and a DEV-SUPPORT workspace for another team to reproduce and fix production issues.
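Workspace management happens entirely through the CLI. As a quick sketch, this is how we could create the environment workspaces above and check which one is active:

$ terraform workspace new DEV
$ terraform workspace new QA
$ terraform workspace new PRD
$ terraform workspace select DEV
$ terraform workspace list
  default
* DEV
  PRD
  QA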

7. Testing

As we start adopting standard coding practices to deal with our infrastructure, it’s natural that we also adopt one of their hallmarks: automated testing. Those tests are particularly useful in the context of modules, as they enhance our confidence that the modules will work as expected in different scenarios.

A typical test consists of deploying a test configuration into a temporary environment and running a series of checks against it. What should those checks cover? Well, that largely depends on the specifics of what we’re creating, but some are quite common:

  • Accessibility: Did we create our resources correctly? Are they reachable?
  • Security: Did we leave any non-essential network ports open? Did we disable default credentials?
  • Correctness: Did our module use its parameters correctly? Did it flag any missing parameters?

As of this writing, Terraform testing is still an evolving topic. We can write our tests using whatever framework we want, but those geared toward integration testing are generally better suited for this task. Some examples include FitNesse, Spock, and Protractor, among others. We can also create tests using regular shell scripts and add them to our CI/CD pipeline.
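
As a minimal sketch of that last approach, the script below deploys a test configuration into a disposable workspace, probes a hypothetical service_url output for accessibility, and tears everything down afterwards (the output name, the test.tfvars file, and the /health endpoint are assumptions, not part of any real project):

#!/usr/bin/env bash
set -euo pipefail

# Deploy the test configuration into a temporary workspace
terraform workspace new ci-test
terraform apply -auto-approve -var-file=./test.tfvars

# Accessibility check: the endpoint exposed by the (hypothetical) service_url output must answer
SERVICE_URL=$(terraform output -raw service_url)
curl --fail --silent --max-time 10 "$SERVICE_URL/health"

# Clean up the temporary environment and the workspace itself
terraform destroy -auto-approve -var-file=./test.tfvars
terraform workspace select default
terraform workspace delete ci-test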

8. Conclusion

In this article, we’ve covered some best practices for using Terraform. Given that this is still a relatively new field, we should take them as a starting point. As more people adopt infrastructure-as-code tools, we’ll likely see new practices and tools emerge.

As usual, all code is available on GitHub.
