Terraform is a powerful tool on its own, but what if you could harness the power of this Infrastructure-as-Code platform from the comfort of Slack? With the right connecting pieces, you can — and it might be easier than you think. Let’s take a look at how Riskalyze uses Hubot and Terraform to create development environments with simple chat commands.
Planning the Implementation
To get started, you'll need some Terraform code. It definitely helps to structure your code in a way that lends itself to automation. Our Terraform repo looks something like this:
We put primitive resources, like VPC definitions, into tf-infra, while definitions of applications or services live in a separate group. tf-modules is a central repository of Terraform modules that are used by both groups.
When laying out your infrastructure code, think about what you want your chatbot to manage. Should it manage all of your infrastructure components, or just some of them? What about environments? Does it make sense for your chatbot to manage production, testing, or both? The answers to these questions will depend heavily on your development processes and architecture. We chose to use a chatbot for managing development services, and CI for everything else. This allows our operations team to maintain a static infrastructure (things like VPCs, subnets, security groups, and some shared servers) while giving engineers the autonomy to create and destroy groups of services, on demand, in Slack.
I’m going to walk through an overview of the data flow in our Slack + IaC setup, and then dig into our Hubot and Terraform configurations in a little more detail.
Piecing The Data Flow Together
So before we get into specifics, here’s an overview of the components involved:
The data flow begins in Slack, where a user issues a command like:
create series core workflow pro pronode
chatops (a Hubot-based chatbot) parses and validates the command before calling terraform-service to create the actual resources. Here’s that original command again, and how chatops would interpret it:
create series core workflow pro pronode
“Create a new set of servers running each of these apps: core, workflow, pro, and pronode.”
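The parse-and-validate step could be sketched roughly as follows. This is a minimal illustration, not the actual chatops code: the function name parseCreateSeries, the allow-list of apps, and the shape of the returned object are all assumptions.

```javascript
// Hypothetical allow-list; the real set of deployable apps is not given here.
const KNOWN_APPS = new Set(['core', 'workflow', 'pro', 'pronode']);

// Parse a chat message such as "create series core workflow pro pronode".
// Returns { action, apps } on success, or { error } when validation fails.
function parseCreateSeries(text) {
  const [verb, noun, ...apps] = text.trim().split(/\s+/);

  if (verb !== 'create' || noun !== 'series') {
    return { error: 'expected a command starting with "create series"' };
  }
  if (apps.length === 0) {
    return { error: 'no apps were requested' };
  }

  const unknown = apps.filter((app) => !KNOWN_APPS.has(app));
  if (unknown.length > 0) {
    return { error: `unknown apps: ${unknown.join(', ')}` };
  }

  // Drop duplicates while keeping the order the user typed.
  return { action: 'create', apps: [...new Set(apps)] };
}
```

Rejecting unknown app names before anything reaches terraform-service keeps typos from turning into failed Terraform runs.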
terraform-service (a Node.js wrapper around Terraform) runs terraform apply, creating the AWS resources necessary for this request (in this case, EC2 instances and RDS databases) before sending a status webhook back to chatops.
chatops sends a confirmation message back to the user.
After Terraform creates a new server, Salt takes over to configure and deploy an app. Like terraform-service, Salt sends webhooks to chatops as it completes various actions.
The last piece is state storage, and this workflow handles that in a couple of ways. chatops uses an ElastiCache Redis instance to store information like server status and user permissions, while terraform-service uses an S3 bucket to store working directories for each set of servers it creates. Each of these S3 working directories contains a Terraform configuration, its modules, its state, and logs.
Delving Into Hubot Details
For chatops, we started with Hubot's Yeoman scaffolding, then added scripts with functions to handle each command: create series, list series, etc. Functions are written in Node.js and typically consist of four sections:
- Check if the user has permission to execute the command
- Parse and validate inputs
- Perform the specified action(s)
- Send feedback to user
One such command sends a request to terraform-service to run a terraform destroy (for example, when an engineer is finished using a series of servers and wants to terminate it).
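A minimal sketch of what such a handler might look like, organized around the four sections listed earlier. The endpoint path, the role name, the command wording, and the request payload are assumptions for illustration only; robot.respond and robot.http are Hubot's own APIs, and robot.auth.hasRole is provided by hubot-auth.

```javascript
// Hypothetical values; the real URL and role name are not given in this article.
const TERRAFORM_SERVICE_URL =
  process.env.TERRAFORM_SERVICE_URL || 'http://terraform-service.internal';
const REQUIRED_ROLE = 'engineer';

function registerDestroySeries(robot) {
  robot.respond(/destroy series (\S+)$/i, (res) => {
    // 1. Check if the user has permission (robot.auth comes from hubot-auth).
    if (!robot.auth.hasRole(res.message.user, REQUIRED_ROLE)) {
      res.reply('Sorry, you are not allowed to destroy a series.');
      return;
    }

    // 2. Parse and validate inputs.
    const series = res.match[1];
    if (!/^[a-z0-9-]+$/i.test(series)) {
      res.reply(`"${series}" is not a valid series name.`);
      return;
    }

    // 3. Perform the action: ask terraform-service to run terraform destroy.
    robot
      .http(`${TERRAFORM_SERVICE_URL}/series/${series}/destroy`)
      .header('Content-Type', 'application/json')
      .post(JSON.stringify({ requestedBy: res.message.user.name }))((err, response) => {
        // 4. Send feedback to the user.
        if (err || response.statusCode !== 202) {
          res.reply(`Could not start destroying ${series}.`);
          return;
        }
        res.reply(`Destroying ${series}. I will let you know when it is done.`);
      });
  });
}

// Hubot loads each script by calling its exported function with the robot.
module.exports = registerDestroySeries;
```

The handler only acknowledges that the destroy has started; the final outcome arrives later through a webhook.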
A separate webhook handler sends a confirmation message to the user once the termination is complete.
One of the best parts about using an open-source framework like Hubot is the library of ready-to-use scripts. These scripts have saved us significant development time:
- hubot-auth: Allows us to implement role-based access control to our chatbot commands.
- hubot-redis-brain: We use this to keep track of server metadata (like who created each server, when it was created, which branch it’s tracking, etc.), among other things. It provides a simple way to persist data without the complexity of database queries.
- hubot-schedule: Lets us send messages according to a schedule. We use this plugin to schedule termination of servers after they have reached a set TTL.
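As a rough sketch of how server metadata and TTLs might fit together, the functions below store a record per series and report which ones have expired. The key name, record shape, and eight-hour TTL are assumptions; the brain argument is anything with get(key) and set(key, value) methods, which is the interface robot.brain exposes.

```javascript
// Hypothetical TTL; the article does not say how long a series may live.
const SERIES_TTL_MS = 8 * 60 * 60 * 1000;

// Record who created a series, which branch it tracks, and when it was created.
function recordSeries(brain, name, owner, branch, now = Date.now()) {
  const all = brain.get('series') || {};
  all[name] = { owner, branch, createdAt: now };
  brain.set('series', all);
}

// Return the names of all series whose TTL has elapsed.
function expiredSeries(brain, now = Date.now()) {
  const all = brain.get('series') || {};
  return Object.keys(all).filter(
    (name) => now - all[name].createdAt >= SERIES_TTL_MS
  );
}
```

A scheduled job could call expiredSeries(robot.brain) periodically and issue a destroy for each name it returns.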
Adding Terraform to the Mix
When we first wrote chatops, we were using Salt for the entire process, from creating servers and databases to deploying individual services. At the time, chatops interacted with a Salt API that we wrote to facilitate running Salt commands through a REST API. Over time, we began to encounter more and more problems creating resources with Salt. It became apparent that Salt’s strengths lay in configuration management, not AWS resource orchestration. With this realization, we began the search for a better solution, eventually settling on Terraform.
Transitioning to Terraform meant that we needed a new API to interact with Terraform from tools like chatops. We started with an Express API framework developed internally over the last couple of years, then added endpoints to interact with Terraform configurations and invoke basic Terraform commands.
Terraform API Workflow
The first request to the API generates a JSON Terraform configuration.
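To illustrate, a generated configuration might contain one module block per requested app. The sketch below assumes a module path under tf-modules and a series_name input variable, neither of which is confirmed by this article; only the overall shape (a top-level "module" object keyed by module name) follows Terraform's JSON configuration syntax.

```javascript
// Hypothetical module location; the real layout of tf-modules is not shown.
const MODULE_BASE = '../tf-modules';

// Build a Terraform JSON configuration with one module block per app.
function buildSeriesConfig(seriesName, apps) {
  const modules = {};
  for (const app of apps) {
    modules[app] = {
      source: `${MODULE_BASE}/${app}`,
      series_name: seriesName, // hypothetical module input variable
    };
  }
  return { module: modules };
}

// Terraform reads JSON configuration from files whose names end in .tf.json,
// so this string could be written to a file such as main.tf.json.
const exampleConfigJson = JSON.stringify(
  buildSeriesConfig('dev-1', ['core', 'pro']),
  null,
  2
);
```

Generating JSON rather than HCL means the API can build the configuration with ordinary object manipulation instead of string templating.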
With the configuration in place, chatops can invoke a plan or apply against it. For relatively quick commands like plan, we use child_process.exec() and return the results right in the response body.
For longer-running commands, like apply, we return a 202 immediately, spawn() the Terraform command, then send a webhook with the results after it finishes.
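The two patterns might look something like the sketch below. It leaves out Express entirely: runPlan hands its output to a callback that a route handler could write into the response body, and startApply returns an object the handler could turn into a 202 response. The function names, the notify callback (standing in for the webhook), and the specific Terraform flags are assumptions, not the actual terraform-service code.

```javascript
const { exec, spawn } = require('child_process');

// Quick command: run to completion and pass all output to the caller.
function runPlan(workDir, callback) {
  exec('terraform plan -no-color -input=false', { cwd: workDir },
    (err, stdout, stderr) => callback(err, { stdout, stderr }));
}

// Long command: start it, return at once, and report the result via notify().
function startInBackground(command, args, workDir, notify) {
  const child = spawn(command, args, {
    cwd: workDir,
    detached: true, // run in its own process group
    stdio: ['ignore', 'pipe', 'pipe'],
  });

  let output = '';
  let reported = false;
  const report = (result) => {
    // 'error' and 'close' can both fire; only notify once.
    if (!reported) {
      reported = true;
      notify(result);
    }
  };

  if (child.stdout) child.stdout.on('data', (chunk) => { output += chunk; });
  if (child.stderr) child.stderr.on('data', (chunk) => { output += chunk; });
  child.on('error', (err) => report({ ok: false, output: String(err) }));
  child.on('close', (code) => report({ ok: code === 0, output }));

  // The HTTP handler can respond with this status before the command finishes.
  return { status: 202, pid: child.pid };
}

function startApply(workDir, notify) {
  const args = ['apply', '-auto-approve', '-no-color', '-input=false'];
  return startInBackground('terraform', args, workDir, notify);
}
```

exec() buffers the whole output in memory, which is fine for a short plan; spawn() streams it, which suits a long apply.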
To increase resiliency, spawn() is invoked with detached: true so Terraform processes will continue running, even if the API dies. In a future iteration, we may maintain a process table and check for orphaned processes on startup. That would allow terraform-service to consistently send webhooks for completed processes even if it is restarted mid-process.
Writing your own chatbot is surprisingly easy with the help of Hubot. Writing your own API for Terraform might be harder, but is definitely doable. When combined, you'll have a powerful tool for spinning up and tearing down datacenter resources that can be used by your entire Engineering team.