Deleting infrastructure is perfectly normal during development, but once you are in production it is important to protect some instances and resources from being accidentally destroyed by Terraform.
Some might say that, since we are using Terraform and infrastructure as code, the infrastructure can be recreated in minutes (depending on its size and the cloud provider's speed), so deleting an instance, a database or a VPN is not a problem. That is true, but I believe that once you are in production it is not acceptable to have your systems and applications down.
Recreating an instance with AWS and Terraform is easy and fast, but then you have to install software, configure and deploy the application, and maybe load data from a backup, all of which can be really time consuming.
In a world of micro-services and cloud-native applications, adding instances and removing them from the pool is fast and desirable, but not everything is a micro-service or has many clones for high availability and fail-over.
Think, for example, of a Cassandra node, an Elasticsearch node, or a traditional database node.
Your responsibility is to protect those nodes and replace them only in the case of a hard failure or planned maintenance.
How do you protect AWS instances from being destroyed by Terraform, or by a user through the AWS control panel or other tools?
AWS instances have some configuration properties for this:
By default, all AWS EBS root device volumes are deleted when the instance terminates. EC2 and RDS instances can be terminated using the AWS API or the AWS control panel.
To change this behavior, in Terraform I like to include a global variable that indicates if the infrastructure is in production or not.
terraform.tfvars
is_production = false
The variable is_production will be false during the development of Terraform plans and will be changed to true as soon as the infrastructure is in production or live.
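For the value in terraform.tfvars to be accepted, the variable also has to be declared somewhere in the configuration. A minimal sketch of that declaration, assuming Terraform 0.12 or later and a file named variables.tf (the file name and the description text are my own choices, not part of the original setup):

```hcl
# variables.tf (hypothetical file name)

# Global switch: false while developing, true once the infrastructure is live.
variable "is_production" {
  description = "Set to true once the infrastructure is in production to enable deletion safeguards."
  type        = bool
  default     = false # assumed default; keeps development environments easy to destroy
}
```

With a default of false, the safeguards stay off unless the value is overridden, for example in terraform.tfvars or with `terraform apply -var="is_production=true"`.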
When creating instances, the value of is_production is used to set the disable_api_termination and delete_on_termination arguments, as shown in the example.
aws_ec2_pro_srv1234.tf
resource "aws_instance" "srv1234" {
  ....
  disable_api_termination              = var.is_production ? true : false
  instance_initiated_shutdown_behavior = "stop"
  ...
  root_block_device {
    ...
    delete_on_termination = var.is_production ? false : true
  }
  ...
}
If this configuration is in place, is_production is true, and the infrastructure has been deployed, you will have to use the AWS console or edit your Terraform resources to set disable_api_termination to false before you can destroy the instances.
A third configuration property, instance_initiated_shutdown_behavior, sets what happens when a shutdown is initiated from within the instance. The default behavior is to stop the instance, but it can be changed to terminate if needed.
It is also possible to prevent the deletion of a database instance by using the deletion_protection property. If unspecified, the property defaults to false.
aws_rds_pro_rds789.tf
resource "aws_db_instance" "rds789" {
  ....
  deletion_protection = var.is_production ? true : false
  ...
}
The new functionality to prevent database deletion was included by AWS on Sep 26, 2018.
Controlling these properties can avoid many problems. One example is a Multi-AZ RDS database whose availability zone, after maintenance or an upgrade, no longer matches what is specified in the Terraform local state. In this case, Terraform thinks that the database needs to be recreated in the "configured" availability zone and proceeds to destroy it if the plan is applied. Having deletion_protection enabled will prevent the deletion.
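The availability zone drift described above can also be addressed on the Terraform side. As a minimal sketch, not part of the original configuration, and assuming availability_zone is the attribute that differs from the state, a lifecycle block can tell Terraform to ignore that change so the plan no longer proposes a replacement. The engine, instance class, storage size, user name and the db_password variable are hypothetical placeholders.

```hcl
# Hypothetical variable for the master password (not in the original article).
variable "db_password" {
  description = "Master password for the example database."
  type        = string
}

resource "aws_db_instance" "rds789" {
  # Placeholder values for illustration only.
  engine            = "postgres"
  instance_class    = "db.t3.medium"
  allocated_storage = 20
  username          = "dbadmin"
  password          = var.db_password

  # Same safeguard as in the article; assumes var.is_production is declared elsewhere.
  deletion_protection = var.is_production

  lifecycle {
    # Do not plan a replacement when only the availability zone has changed.
    ignore_changes = [availability_zone]
  }
}
```

This complements deletion_protection rather than replacing it: ignore_changes keeps the destructive plan from being generated, while deletion_protection still blocks the deletion on the AWS side if such a plan is ever applied.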
IT Wonder Lab tutorials are based on the diverse experience of Javier Ruiz, who founded and bootstrapped a SaaS company in the energy sector. His company, later acquired by a NASDAQ traded company, managed over €2 billion per year of electricity for prominent energy producers across Europe and America. Javier has over 25 years of experience in building and managing IT companies, developing cloud infrastructure, leading cross-functional teams, and transitioning his own company from on-premises, consulting, and custom software development to a successful SaaS model that scaled globally.