Introduction to Terraform
Archie To
Terraform is a very popular technology used both in the industry and within the Research Computing Services (RCS) Team. We at ARC Software have used Terraform to build up STRAP and implemented scripts for Strapper to deploy applications onto STRAP (these apps are called Strapplications). So what is Terraform? What makes it so popular?
What is Terraform?
Terraform is an infrastructure as code tool that allows users to define and provision infrastructure using a high-level configuration language.
So, for example, if you want to set up a database with a cloud service, e.g. AWS, instead of having to log into AWS and make a bunch of clicks on the UI, you would write some code and run a few commands.
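To make this concrete, here is a minimal sketch of what "writing some code" could look like for a managed PostgreSQL database on AWS. It assumes the hashicorp/aws provider and working AWS credentials. The region, identifier, instance size, username and the db_password variable are all illustrative choices, not values from this post.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "eu-west-2" # assumed region, pick your own
}

# Hypothetical variable so the password is not hard-coded
variable "db_password" {
  type      = string
  sensitive = true
}

# A small managed PostgreSQL instance (all values are illustrative)
resource "aws_db_instance" "example" {
  identifier          = "example-postgres"
  engine              = "postgres"
  instance_class      = "db.t3.micro"
  allocated_storage   = 20 # GiB
  username            = "example_admin"
  password            = var.db_password
  skip_final_snapshot = true # convenient for throwaway test instances
}
```

The point is only that the clicks in the web console are replaced by a file that can be reviewed, versioned and re-run.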
Why Terraform?
There are multiple reasons why we should use Terraform:
1. Version control
Since it is infrastructure as code, we can version control the code with Git, share the code with others and have multiple developers working on it.
2. Consistency and reproducibility
You can use the code to create multiple deployments and each deployment will provide exactly the same infrastructure as long as the code remains unchanged. Better yet, you can pull the code from Git and run it on multiple machines and still produce the same result.
3. Automation
Terraform allows you to automate your deployments through code, reducing the need for repeated manual intervention. To run Terraform automatically, you can write scripts in the language of your choice, e.g. shell, Python, or Go. Fun fact: Strapper actually does this!
Plus, the Terraform language also provides some “programming features”, such as for expressions, the count and for_each meta-arguments, and conditional expressions, which save us from repetitive tasks and give us some flexibility.
4. Reusability
Terraform modules are developed and shared widely within the Terraform community. We can take advantage of this to set up infrastructure that has been well planned out by experts. Similarly, we can also share our dedicated infrastructure with the community if we think someone might find that useful.
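As a small illustration of the “programming features” mentioned under Automation, the sketch below uses a for expression and a conditional expression. It needs no provider at all. The variable names (environments, is_production) and the numbers are made up for the example.

```hcl
variable "environments" {
  type    = list(string)
  default = ["dev", "staging", "prod"]
}

variable "is_production" {
  type    = bool
  default = false
}

locals {
  # for expression: build one database name per environment
  database_names = [for env in var.environments : "my_db_${env}"]

  # conditional expression: pick a value based on a flag
  connection_limit = var.is_production ? 100 : 10
}

output "database_names" {
  value = local.database_names
}

output "connection_limit" {
  value = local.connection_limit
}
```

With the defaults above, the outputs should be the three names my_db_dev, my_db_staging and my_db_prod, and a connection limit of 10.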
Core concepts of Terraform
Providers
Terraform relies on plugins called providers to interact with cloud providers, SaaS providers, and other APIs. Each provider adds a set of resource types and/or data sources that Terraform can manage. Providers are distributed separately from Terraform itself, and each provider has its own release cadence and version numbers.
Resources
Resources are the most important element in the Terraform language. Each resource block describes one or more infrastructure objects, such as virtual networks, compute instances, or higher-level components such as DNS records.
Input variables
Input variables let you customize aspects of Terraform modules without altering the module’s own source code. This functionality allows you to share modules across different Terraform configurations, making your module composable and reusable.
State
Terraform must store state about your managed infrastructure and configuration. This state is used by Terraform to map real-world resources to your configuration, keep track of metadata, and improve performance for large infrastructures.
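By default, Terraform keeps this state in a local file called terraform.tfstate in the working directory. As a rough sketch, the location can be made explicit with a backend block. The path below is an assumption for illustration only.

```hcl
terraform {
  # The "local" backend stores state on disk; the path is illustrative
  backend "local" {
    path = "state/terraform.tfstate"
  }
}
```

Teams that share infrastructure typically switch to a remote backend so that everyone works from the same state, but the idea is the same: one block tells Terraform where the state lives.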
Modules
Modules are containers for multiple resources that are used together. A module consists of a collection of .tf and/or .tf.json files kept together in a directory.
Modules are the main way to package and reuse resource configurations with Terraform.
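For illustration, calling a module might look like the sketch below. The module path ./modules/postgres_database and its two inputs are hypothetical. They stand for a directory of .tf files that declares matching input variables.

```hcl
module "app_database" {
  # Hypothetical local module directory containing its own .tf files
  source = "./modules/postgres_database"

  # Hypothetical input variables declared by that module
  database_name = "my_db"
  owner         = "my_role"
}
```

The caller only sets the inputs. The resources themselves are defined once inside the module and can be reused from any configuration.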
Basic commands
Here are the main commands:
terraform init: Prepare your working directory for other commands
terraform validate: Check whether the configuration is valid
terraform plan: Show changes required by the current configuration
terraform apply: Create or update infrastructure
terraform destroy: Destroy previously-created infrastructure
Example - Use Terraform to deploy a PostgreSQL database
What better way to learn Terraform than to get our hands dirty? Let’s use Terraform to deploy a simple PostgreSQL database, a database that we’re all familiar with.
First, let’s specify a provider. Recall that a provider supplies a set of resources that we can use to define our infrastructure. Create a main.tf:
terraform {
  required_providers {
    postgresql = {
      source  = "cyrilgdn/postgresql"
      version = "~> 1.22.0"
    }
  }
}
As you can see, we are selecting the cyrilgdn/postgresql provider. The constraint ~> 1.22.0 accepts version 1.22.0 and any later patch release (1.22.x), but not 1.23.0 or above. With the provider specified like this, running terraform init makes Terraform download the provider plugin, which supplies the resource types we need.
Next, let’s configure the provider using the provider block:
provider "postgresql" {
  host            = "postgres_server_ip"
  port            = 5432
  database        = "postgres"
  username        = "postgres_user"
  password        = "postgres_password"
  sslmode         = "require"
  connect_timeout = 15
}
Here, we specify the PostgreSQL server where we will create our database, along with the database to connect to, the username and the password. We require SSL connections to this server, and a connection attempt times out if the server cannot be reached within 15 seconds.
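Hard-coding the password is fine for a quick local test, but it ends up in version control. One possible alternative, sketched here under the assumption that we introduce a variable named postgres_password (not part of the original example), is to pass it in as a sensitive input variable:

```hcl
# Hypothetical variable; "sensitive" hides the value in plan/apply output
variable "postgres_password" {
  type      = string
  sensitive = true
}

provider "postgresql" {
  host            = "postgres_server_ip"
  port            = 5432
  database        = "postgres"
  username        = "postgres_user"
  password        = var.postgres_password
  sslmode         = "require"
  connect_timeout = 15
}
```

The value can then be supplied at run time, for example through the TF_VAR_postgres_password environment variable, instead of living in main.tf.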
Now, let’s configure our database:
resource "postgresql_database" "my_db" {
  name  = "my_db"
  owner = "my_role"
}
We create a database named “my_db” with the owner set to “my_role”.
Lastly, we have to create the owner role, which can log in to the server and run commands on the database:
resource "postgresql_role" "my_role" {
  name     = "my_role"
  login    = true
  password = "mypass"
}
We create a role named “my_role” with password “mypass”.
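One detail worth knowing: because the database’s owner is written as the literal string "my_role", Terraform sees no link between the two resources and may try to create them in either order. A possible variation, shown as a sketch rather than a required change, is to reference the role resource so that Terraform infers the dependency and creates the role first:

```hcl
resource "postgresql_role" "my_role" {
  name     = "my_role"
  login    = true
  password = "mypass"
}

resource "postgresql_database" "my_db" {
  name = "my_db"
  # Referencing the role (instead of the string "my_role") creates an
  # implicit dependency: the role is created before the database
  owner = postgresql_role.my_role.name
}
```

The resulting database and role are the same. Only the ordering guarantee changes.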
To wrap up, our main.tf looks as follows:
terraform {
  required_providers {
    postgresql = {
      source  = "cyrilgdn/postgresql"
      version = "~> 1.22.0"
    }
  }
}

provider "postgresql" {
  host            = "postgres_server_ip"
  port            = 5432
  database        = "postgres"
  username        = "postgres_user"
  password        = "postgres_password"
  sslmode         = "require"
  connect_timeout = 15
}

resource "postgresql_database" "my_db" {
  name  = "my_db"
  owner = "my_role"
}

resource "postgresql_role" "my_role" {
  name     = "my_role"
  login    = true
  password = "mypass"
}
To actually create the database, first we have to make sure we have a PostgreSQL server running on postgres_server_ip, whatever that’s set to. Next, we move to the same directory as main.tf, then run:
$ terraform init # Download the provider plugins declared in main.tf
$ terraform plan # Preview the changes Terraform would make, without applying them
$ terraform apply # Deploy the "my_db" database and the "my_role" role
$ terraform destroy # Once we're done testing, remove the database and the role
Note: We didn’t use input variables in this example. Terraform allows you to use variables to define resources dynamically without changing the source code. So for example, if we want our database’s name to be arbitrary, we can do:
# Define the variable
variable "database_name" {
  type     = string
  nullable = false # Reject null; with no default set, the user must provide a name
}

# Define the database
resource "postgresql_database" "my_db" {
  name  = var.database_name
  owner = "my_role"
}
And provide a myvars.tfvars:
database_name = "my_db"
Then if we run:
$ terraform apply -var-file='myvars.tfvars'
A database named “my_db” will be created, just as in the example above.
Conclusion
TLDR: Terraform is slick. Period.
Terraform provides a powerful way for us to create infrastructure using code. This is extremely beneficial thanks to better reproducibility, automation and reusability. Terraform is widely used within the industry and in big companies such as Uber, Udemy, Instacart and Slack.
Terraform used to be open-source under the Mozilla Public License v2.0. However, on August 10th, 2023, HashiCorp made a transition to the Business Source License for all of their products, including Terraform. The main objective is assumed to be to prevent the creation of products that directly compete with HashiCorp’s.
OpenTofu, a fork of Terraform, was born in response to the change in Terraform’s license. On the technical level, OpenTofu 1.6.x is very similar feature-wise to Terraform 1.6.x. In the future, the projects’ feature sets will diverge. The other main difference is that OpenTofu is open-source, and its goal is to be driven in a collaborative way with no single company being able to dictate the roadmap.