As discussed in part one of the Intro to Terraform series, Terraform is a declarative language which can be used to manage cloud resources in an Infrastructure-as-Code approach. For someone like myself – coming from a traditional IT background with no true software dev experience – the idea of merging IT infrastructure management with code can be daunting. Fear not! I have great news: Terraform is easy to learn, and I’ll help you get started.
NOTE: For the purposes of simplicity, we’re going to focus on Terraform in an AWS environment. Terraform, however, is compatible with all major IaaS providers and generally works the same across all apart from using resources terms specific to certain providers.
To begin, let’s briefly wrap our minds around the high-level concepts. Firstly, don’t think of Terraform as a programming language – because it’s not. As described in part one, Terraform is more like an architectural blueprint depicting how your infrastructure is designed. What you put into your Terraform code corresponds directly to infrastructure created in the cloud on basically a 1-to-1 basis (notwithstanding sections of Terraform code which create multiple pieces of the same infrastructure).
This makes for a very straightforward development experience in which keeping track of your cloud resources is very simple provided you structure you code repository in a manner which makes sense. If you come from a traditional IT background, then think of Terraform in a traditional sense when building your code: servers, routers, network concepts (VLANs, route tables, etc), and I guarantee it will improve your ability to envision your code in deployment.
THE BASICS OF BUILDING
Terraform code is meant to be stored in repo (short for “repository”) in the form of *.tf files. In general, almost every Terraform repo will consist of – at the very least – a main.tf and variables.tf file as the core building blocks of code. The Main file will outline the cloud resources to be deployed, while the Variables file will define the attributes of said resources – such as name, storage size, etc.
The core component of Terraform code is the Resource block; resource blocks in TF code are the 1-to-1 assets to which I referred earlier. You put a resource in your TF code, and you’ll get a piece of cloud infrastructure from your IaaS provider on the other side (generally speaking). Below is a resource block creating a simple VPC in AWS:
Let’s dissect this piece by piece:
resource – this declares that we are standing up a singular cloud asset. After reading this piece, Terraform knows that we want to build infrastructure – and it prepares the execution engine to ingest resource details.
“aws_vpc” – This is the “resource type”, lists of which can be found for each major cloud provider on the Terraform website. Now that Terraform knows we want to stand up a resource, this segment tells it what specifically to create. So by inserting “aws_vpc”, we’re telling Terraform that we want to create a VPC within AWS.
“vpc” – This is the “resource name”. This name is used to identify the resource within the Terraform code, but not on the actual IaaS side. It’s good to give resources the most straightfoward names possible, which is why I simply named this resource “vpc”. If we need to refer to this resource elsewhere in our Terraform code, we’ll do so by using the name “vpc”. Note, though, that the resource has a name defined later on in the resource block which will actually show up within the IaaS provider.
“cidr_block” – This is an example of a “resource attribute”, which could be any number of things depending on the resource. Attributes are what define the details of your infrastructure and allow you to control the specifics of deployment. In this particular case, we’re using the cidr_block attribute to define the IP address block allocated to the VPC. Again, the Terraform registry entry for any given resource will list all the attributes available for modification.
NOTE: See how the attribute is defined with a variable? We could simply use an actual IP block here – “10.0.0.0/16”, for example – but we’re using a variable in this case. More on that later.
NOTE #2: Some attributes are required, meaning that you MUST define them in order to deploy the resource – while other resources are optional and include default values. If you don’t specifically define an optional attribute, then it will deploy with the default value. An example of this for an “aws_vpc” resource is the attribute “instance_tenancy”. An explanation of instance tenancy can be found here. If you don’t define “instance_tenancy”, then it defaults to a “shared” state, meaning there is no dedicated instance tenancy. You can enforce instance tenancy thought by defining the attribute as “dedicate” or “host”.
“tags” – Every resource is able to receive “tags”, which are AWS-side identifiers used to group resources. In this case, we nest the “Name” tag within the tags section – which will show up at the literal resource name on the AWS-side. As you can see, the name will be a combination of the variable “name” with “-VPC” appended to the end of the variable.
…and that’s the simple explanation of a resource block. In a true enterprise setting, resource blocks will often be far more meaty than this – taking care to define every variable so as to ensure that all possible variables are under control. For simple purposes, however, we can lean on default values and define only what we need.
As seen in the previous resource block, we can use variables to substitute attribute values into our Terraform code. While you can hardcode attribute values into the resources living in your main.tf file – and there are some rare cases in which this is a good move – we typically don’t want to hardcode these values. What if you need to change them for whatever reason? Maybe you’re deploying a new environment, or simply modifying aspects of the existing environment. Changing source code for simple variations in attribute values is a bad practice, as it increases the total amount of work you have to do and enlarges the possible margin of coding error.
To resolve this, we use variables as a store of attribute values. Rather than modifying source code in our main.tf file, we modify variable values living in our variables.tf file – thereby leaving our main resource code alone.
Recall our VPC resource block above:
See the “var.cidr” next to the cidr_block attribute? Rather than hardcoding our IP block into the main.tf file, we instead have it referencing a variable. The variables live in our variables.tf file and look like this:
You might be wondering, “How does Terraform know where the look for variable vaues outside of the main.tf file?” In Terraform, all *.tf files living in the same directory level are able to reference each other; so if we have a main.tf and variables.tf file living the same directory level, we can declare variables in one file and reference them in the other. In a later article, we’ll cover advanced Terraform techniques in which you can declare variables in a multi-leveled directory and reference them elsewhere – the most well-known technique being the use of modules. But like I said, we’ll cover that in a future article.
NOTE: Aha! See how this variable has no value declared even within itself? The default attribute simply has a set of empty quotes. We could include an actual IP block, in which case it would look like this:
...but as mentioned previously, an advanced technique using what are called “modules” allows us to even leave the variables untouched, which increases the reusability and integrity of the source code. More on that in another article – just know that the variables.tf file, and the variables contained therein, keep our main.tf source code clean and increase code integrity/reusability.
What if one resource depends on information from another before it can be deployed? For example, a subnet within a VPC – subnets can only be deployed within VPCs, and that value needs to be specified in the Terraform resource block even though the VPC ID has not yet been created. How do we deal with this chicken-or-the-egg paradox?
Thankfully, Terraform has a very simple method of dealing with this: named values. For our specific example above, we want resource values – but there are a few different types of named values of which we’ll go over in more detail in a future article. We’re going to focus on the most commonly-used named value, which are resource values.
Let’s look at a subnet resource block, shown here:
As mentioned, we’re required to declare a VPC ID for a subnet resource – as seen in the vpc_id attribute. But what’s this in the attribute value field? We could have hardcoded in a VPC ID if one was already existing, or used a variable to pass that same value – but either of these options would require either creating the VPC first in a separate template or using data output values to get the VPC ID (which would be overcomplicated and completely unnecessary).
Instead, we use a resource reference pointing at the VPC ID. Let’s break this line down:
vpc_id – This is the attribute for the subnet which specifies the VPC ID. Whatever value gets placed after the equals sign will tell the Terraform engine where to deploy the subnet.
aws_vpc – Surely you recognize this, the resource type. We already explained it above! This tells the resource block that we’re going to pull a value from an existing resource block – in this case, a VPC block.
vpc – The resource name for the type which we specified. This tells us which specific resource is having a value pulled. If we had specified multiple VPCs in our template, then we would use this name to specify which exact VPC is having this subnet added to it.
id – Finally, this specified the atrribute which is being pulled. There are a number of different attributes which can be pulled from any given resource – but only specific ones will give us the info we need. In this case, we want the resulant VPC ID (as opposed to VPC name, or CIDR block).
When used together – aws_vpc.vpc.id – you can see how the line uses a “funnel” approach to narrow down exactly what we need. There’s no need to worry about ordering resource blocks in your code either – e.g. putting subnets after VPC. The Terraform engine builds it’s own dependency tree during the planning phase and handles all of that for you – so all you need to worry about is specifying the correct resource values. This ability to automate the retrieval of values and orchestrate deployment is one of the key advantages to using Terraform.
These are the basics of getting started with Terraform. In the part three of my series, we’ll discuss modules and how they can be used to further enhance Terraform automation and flexibility.