The main purpose of this mini camp is to build a GitOps pipeline that uses GitHub Actions to deploy Terraform-managed resources to AWS.
Requirements and their self-reported status:
| Section | Task | Self-Reported Status | Notes |
|---|---|---|---|
| Setup | | | |
| | Main branch is protected | ✅ | |
| | Cannot merge to main with failed checks | ✅ | |
| | State is stored remotely | ✅ | |
| | State Locking mechanism is enabled | ✅ | |
| Design and Code | | | |
| | Confirm Account Number | ✅ | `allowed_account_ids` provider argument |
| | Confirm Region | ✅ | variable validation |
| | Add Default Tags | ✅ | added to provider block |
| | Avoid Hardcoded Values | ✅ | |
| | No plaintext credentials | ✅ | Environment variables set by OIDC |
| | Pipeline in GitHub Actions only | ✅ | |
| Validate | | | |
| | terraform fmt pre-commit hook | ✅ | Git Hooks managed by trunk-io |
| | pre-commit hooks are in repo | ✅ | Git Hooks managed by trunk-io |
| Test and Review | | | |
| | Pipeline works on every PR | ✅ | `on: pull_request` trigger |
| | Linter | ✅ | TFLint configured with aws plugin and deep check |
| | terraform fmt | ✅ | See PR #5 |
| | terraform validate | ✅ | See PR #5 |
| | terraform plan | ✅ | See PR #5 |
| | Infracost with comment | ✅ | See PR #4 |
| | Open Policy Agent fail if cost > $10 | ✅ | See PR #6 |
| Deploy | | | |
| | terraform apply with human intervention | ✅ | Applied when PR is merged |
| | Deploy to production environment | ✅ | Matrix strategy |
| Operate and Monitor | | | |
| | Scheduled drift detection | ✅ | |
| | Scheduled port accessibility check | | |
| Readme | | | |
| | Organized Structure | ✅ | |
| | Explains all workflows | ✅ | |
| | Link to docs for each action | ✅ | |
| Contribution Instructions | | | |
| | Explains merging strategy | ✅ | |
| Bonus | | | |
| | Deploy to multiple environments | ✅ | See PR #35 |
| | Ignore non-terraform changes | ✅ | Workflow trigger uses paths filter for tf and tfvars files. |
| | Comment PR with useful plan information | ✅ | See PR #7 |
| | Comment PR with useful Linter information | ✅ | See PR #5 |
| | Open an Issue if Drifted | ✅ | See Issue #20 |
| | Open an issue if port is inaccessible | | |
| | Comment on PR to apply | ✅ | See PR #32 |
- Create a feature branch off main.
- Commit the change locally and push it to the remote.
- Create a draft pull request that targets the main branch: `gh pr create --draft --base main`

  **Important:** Pull requests must be set to draft to prevent CODEOWNER reviewers being assigned until the pull request is ready. This cannot be set by default (see open discussion). Unfortunately, it also cannot be automated, because action runners that use `GITHUB_TOKEN` for authentication are unable to run `gh pr ready --undo` as the integration is unavailable (see open discussion).
- The workflow runs through the tests (fmt, validate, TFLint), then runs `terraform plan` and posts the plan to the pull request and the workflow job summary.
- To approve the plan, approve the pull request and add it to the merge queue.
The debate over whether to apply before or after merging rumbles on. The merge queue does a pretty good job of addressing this. If `apply` is triggered using the `merge_group` event, the workflow attempts to apply the plan from the PR and then merges the PR. If the apply fails for any reason, the PR is not merged.
Another debate: directories versus workspaces. The best argument I have heard for directories was in the Q&A session on 19/10/2024:
"anyone should be able to `cd` into a terraform working directory and simply run `terraform plan` without having to worry about workspaces and variable files"
The workflow uses changed-files to find the directories containing terraform changes. The output of this job is used to define the matrix strategy for the terraform workflow.
Each directory is mapped to an environment, which achieves two things:
- Secrets (in this case, the AWS roles) are stored in the environment, not the repository.
- Deployments to production require additional approval.
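
The step from changed files to a job matrix, and the directory-to-environment mapping, can be sketched roughly as below. This is an illustration in Python of the idea only, not the workflow's actual implementation: the directory names (`terraform/dev`, `terraform/prd`), the `ENVIRONMENTS` mapping and both function names are hypothetical.

```python
import json
from pathlib import PurePosixPath

# Hypothetical mapping of Terraform working directories to GitHub environments.
ENVIRONMENTS = {
    "terraform/dev": "development",
    "terraform/prd": "production",
}

# Only these file types count as Terraform changes.
TERRAFORM_SUFFIXES = {".tf", ".tfvars"}


def terraform_directories(changed_files):
    """Return the sorted, de-duplicated directories that contain Terraform changes."""
    directories = {
        str(PurePosixPath(path).parent)
        for path in changed_files
        if PurePosixPath(path).suffix in TERRAFORM_SUFFIXES
    }
    return sorted(directories)


def build_matrix(changed_files):
    """Build a job matrix with one entry per changed directory and its environment."""
    return {
        "include": [
            {"directory": directory, "environment": ENVIRONMENTS[directory]}
            for directory in terraform_directories(changed_files)
            if directory in ENVIRONMENTS
        ]
    }


if __name__ == "__main__":
    changed = [
        "README.md",
        "terraform/dev/main.tf",
        "terraform/dev/variables.tf",
        "terraform/prd/terraform.tfvars",
    ]
    print(json.dumps(build_matrix(changed)))
```

A change that touches only `README.md` produces an empty matrix, which corresponds to the "No Changes" path where no Terraform jobs run.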
---
config:
theme: base
---
gitGraph
commit id: "prev" tag: "v1.0.0"
branch feature
switch feature
commit id: "Terraform Changes"
commit id: "Bug Fix"
commit id: "Plan Diff fix"
switch main
merge feature
commit id: "new" tag: "v1.1.0"
---
config:
look: handDrawn
theme: neo
---
flowchart LR
subgraph Fail Checks
direction LR
Fail("`**Fail Required Checks**
PR Cannot be merged`")
end
subgraph Pass Checks
direction LR
noTFPass("`**Met Required Checks**`") -->merge(Merge PR to main branch)
end
subgraph Pass Terraform Checks
direction LR
TFPass("`**Met Required Checks**
Add to Merge Queue`") -->apply{terraform apply}
apply -->dev(Development) -->prd(Production) -->tfMerge(Complete Merge)
tfMerge -->docs(Run terraform-docs) -->rel(Generate a release)
apply -->|Fail|Fail
end
subgraph Infracost
direction LR
ic{"`**Infracost**
Infracost fail if > $10`"} -->|Fail|Fail
end
subgraph Targets
direction LR
target{"`**Terraform Targets**
Search for terraform changes and output the directory name(s)`"} -->|No Changes|noTFPass
end
subgraph Deploy Development
direction LR
devSetup("`**Setup**
AWS Credentials
Install and Initialise TFLint
with AWS Plugin`") -->
devValidate{"`**Validate**
terraform fmt
terraform validate
tflint`"} -->|Fail|Fail
devValidate -->|Pass|devPlan(terraform plan)
end
subgraph Deploy Production
direction LR
prdSetup("`**Setup**
AWS Credentials
Install and Initialise TFLint
with AWS Plugin`") -->
prdValidate{"`**Validate**
terraform fmt
terraform validate
tflint`"} -->|Fail|Fail
prdValidate -->|Pass|prdPlan(terraform plan)
end
PR(Draft Pull Request) -->target & ic
target -->|Job Matrix|devSetup
devPlan -->prdSetup
prdPlan -->|Approve PR|TFPass
- changed-files
- TF-via-PR
- Infracost
- Setup Terraform
- Setup TFLint
- Terraform Docs
- Semantic Release
- Wait for Status Checks
Infracost runs on pull requests when they are opened or synchronized. The workflow generates a cost difference of the resources between the main branch and the proposed changes on the feature branch.
This workflow also flags any policy violations defined in `infracost-policy.rego`. See an example in this pull request.
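
The contents of `infracost-policy.rego` are not reproduced here, so the following is only a Python approximation of the "fail if cost > $10" rule from the requirements table. It assumes the rule compares the monthly cost difference, read from the `diffTotalMonthlyCost` field of Infracost's JSON output; the function name and message text are hypothetical.

```python
from decimal import Decimal

# Threshold from the requirements table: fail if cost is greater than $10.
MAX_MONTHLY_COST_INCREASE = Decimal("10")


def cost_violations(infracost_output, limit=MAX_MONTHLY_COST_INCREASE):
    """Return a list of violation messages; an empty list means the policy passes."""
    # Assumption: the cost difference is read from "diffTotalMonthlyCost",
    # which is a string in the JSON output. A missing value is treated as zero.
    diff = Decimal(infracost_output.get("diffTotalMonthlyCost") or "0")
    if diff > limit:
        return [f"Monthly cost increase ${diff} exceeds the ${limit} limit"]
    return []
```

Using `Decimal` rather than `float` avoids rounding surprises when a cost sits exactly on the threshold.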
The initial job of the workflow uses changed-files to output the directories where terraform changes have been made. This output is used as the matrix strategy for the deploy job.
Uses a matrix strategy to run in each directory identified in the targets job.
Important
The strategy has a max-parallel value of 1, which means the jobs are run sequentially.
- Set up AWS credentials using configure-aws-credentials, which uses OIDC to assume a role and sets the authentication parameters as environment variables on the runner. This step is required when TFLint deep checking for the AWS rule plugin is enabled.
- Install terraform using setup-terraform. Although terraform is already installed on the runners, `apply` jobs were failing due to version differences between the apply runner and the plan runner.
- Run `terraform fmt`
- Run `terraform init`
- Run `terraform validate`
- Install TFLint using setup-tflint
- Initialise TFLint to download the AWS plugin rules.
- Run `tflint`
- Update the PR comments if any of the steps fail and exit the workflow on failure.
When the validation steps have succeeded, a `terraform plan` is run. The conditional statement runs `plan` on a `pull_request` event. The workflow uses TF-via-PR. This action adds a high-level plan and a detailed, drop-down style plan to the workflow summary and updates the pull request with a comment.
After `terraform plan` has been run, and assuming the plan is accurate, approve the PR and click "Merge when ready". This adds the pull request to the merge queue. The conditional statement in the workflow runs `terraform apply` on a `merge_group` event.
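
The event-based switch between plan and apply amounts to a small lookup. The sketch below shows that decision in Python for illustration only; in the workflow it is a conditional expression, and the function and dictionary names here are hypothetical.

```python
# Which Terraform command each triggering event should run.
COMMAND_FOR_EVENT = {
    "pull_request": "plan",   # propose changes and comment on the PR
    "merge_group": "apply",   # PR is in the merge queue: apply, then merge
}


def terraform_command(event_name):
    """Return the Terraform command for an event, or None if nothing should run."""
    return COMMAND_FOR_EVENT.get(event_name)
```

Any other event (a scheduled run, for example) maps to `None` here, so neither plan nor apply would be triggered by this switch.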
The only required check for the pull request.
Uses Wait for Status Checks to poll the checks API for the status of the other running checks. This helps to overcome the situation where a required check may not run. For example, we could make Terraform CI a required check, but that workflow may not run (it is skipped), and consequently the required check is never met. This workflow detects that Terraform CI has been skipped and returns a successful outcome for itself, so the required check passes.
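
The decision this gate makes can be sketched as follows. This is a simplified illustration in Python, not the Wait for Status Checks action's implementation: the function name is hypothetical, and each check run is assumed to be a dict with the `status` and `conclusion` fields found in GitHub's check runs API.

```python
# Conclusions treated as acceptable. Counting "skipped" as a pass is what lets
# a workflow that never ran (for example, no Terraform changes) satisfy the gate.
PASSING_CONCLUSIONS = {"success", "skipped"}


def gate_outcome(check_runs):
    """Summarise the other checks as 'failure', 'pending' or 'success'."""
    completed = [run for run in check_runs if run["status"] == "completed"]
    # Fail as soon as any finished check has an unacceptable conclusion.
    if any(run["conclusion"] not in PASSING_CONCLUSIONS for run in completed):
        return "failure"
    # Otherwise keep polling while anything is still queued or running.
    if len(completed) < len(check_runs):
        return "pending"
    return "success"
```

With this shape, a skipped Terraform CI run and a successful linter run together yield `success`, which is why only this one check needs to be marked as required.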
Terraform docs will run when the pull request is merged. This only needs to run once, following the apply, and not on every commit to a pull request. Updating the README on every commit generates a lot of unnecessary commits, and you have to pull the updated README before the next push to avoid conflicts.
I use my own Terraform Docs reusable workflow which adds job summaries and verified commits to the terraform-docs gh-action.
Generate a CHANGELOG and version tag using semantic release
- Grafana Port Check
- Fix drift detection for multiple environments
- Special mention to the maintainer of TF-via-PR for responding to queries quickly and proactively suggesting workflow improvements.