Opinionated rules for writing secure, maintainable Infrastructure-as-Code (IaC) that supports enterprise cloud-migration initiatives across AWS, Azure and GCP.
Enterprise cloud migrations fail 60% of the time—not because of cloud platforms, but because of inconsistent infrastructure provisioning, security drift, and manual processes that don't scale across multiple clouds and hundreds of workloads.
You're not just moving servers to the cloud. You're orchestrating complex, multi-wave migrations across AWS, Azure, and GCP while maintaining:
Traditional migration approaches—manual Terraform configs, one-off scripts, and cloud-specific tooling—create more problems than they solve. Infrastructure drift, security misconfigurations, and cost overruns compound as your migration scales.
These opinionated Terraform rules transform your migration from a risky, manual process into a predictable, automated workflow that scales across clouds and teams.
The core premise: Every infrastructure change maps to a specific migration objective (one of the 7 Rs), follows cloud-agnostic patterns, and includes built-in security, compliance, and cost controls.
Here's what changes in your workflow:
Before: Manual Terraform files per cloud, ad-hoc security reviews, post-deployment cost surprises
# Typical migration Terraform - cloud-specific, no governance
resource "aws_instance" "web" {
ami = "ami-12345" # Hardcoded
instance_type = "t3.large" # No cost optimization
# No security scanning, no compliance tags
}
After: Governed, multi-cloud modules with automated validation
# Migration-aware, compliant infrastructure
module "web_tier" {
source = "../modules/compute/v2.1.0"
migration_wave = "wave-2"
migration_pattern = "replatform" # Maps to 7 Rs
providers = {
primary = aws.us-east-1
secondary = azurerm.east-us
}
# Automated cost optimization
enable_rightsizing = true
cost_threshold_usd = 1000
# Built-in compliance
compliance_framework = ["SOX", "PCI-DSS"]
tags = local.migration_tags
}
Your teams provision infrastructure consistently across AWS, Azure, and GCP using the same Terraform modules. No more cloud-specific learning curves or governance gaps.
# Single module, multiple clouds
module "database_tier" {
count = length(var.target_clouds)
source = "../modules/database/v1.8.0"
cloud_provider = var.target_clouds[count.index]
# AWS RDS, Azure SQL, or GCP Cloud SQL - same interface
}
Track progress through your 7 Rs strategy with built-in wave management. Each Terraform workspace maps to specific migration objectives and business KPIs.
# Automated wave progression
terragrunt run-all plan --terragrunt-include-dir wave-2/
# Validates dependencies, cost impact, security posture before apply
Every resource includes security scanning, compliance validation, and evidence generation. No more manual security reviews or audit prep.
# Built into every module
resource "aws_s3_bucket" "data" {
# Automatically encrypted, versioned, compliant
# Security scanned in CI pipeline
# Compliance evidence exported to audit bucket
}
Infrastructure costs are validated before deployment with automatic rightsizing and optimization flags built into every module.
Old way: Weeks of manual discovery, dependency mapping, and risk assessment New approach: Automated asset inventory with dependency graphs generated from your existing Terraform state
# Generate migration readiness report
terraform graph | migration-analyzer --wave wave-1 --pattern rehost
# Outputs: dependency conflicts, cost projections, security gaps
Old way: Cloud-specific Terraform files, manual security reviews, inconsistent tagging New approach: Standardized modules with built-in governance and multi-cloud support
# Provision identical infrastructure across clouds
module "app_infrastructure" {
for_each = var.deployment_regions
source = "../modules/app-stack/v3.2.1"
region = each.value
# Automatic: security scanning, cost analysis, compliance validation
}
Old way: Manual cutover procedures, limited rollback options, monitoring gaps New approach: Blue/green deployments with automated health checks and rollback triggers
# Built-in migration safety
resource "aws_route53_record" "failover" {
# Automatic rollback on health check failure
failover_routing_policy {
type = "PRIMARY"
}
health_check_id = aws_route53_health_check.primary.id
}
# Clone the enterprise migration framework
git clone <migration-rules-repo>
cd enterprise-cloud-migration
# Install required tools
brew install terragrunt tflint checkov infracost
# Configure remote state for all environments
terragrunt hclfmt --terragrunt-diff environments/
# environments/common.hcl
locals {
migration_waves = {
wave-1 = {
pattern = "rehost"
target_clouds = ["aws"]
risk_level = "low"
}
wave-2 = {
pattern = "replatform"
target_clouds = ["aws", "azure"]
risk_level = "medium"
}
}
}
# Plan wave 1 infrastructure
cd environments/prod/wave-1/
terragrunt run-all plan
# Automatic validations run:
# - Security scanning (tfsec, checkov)
# - Cost impact analysis (infracost)
# - Compliance validation (Sentinel policies)
# - Dependency conflict detection
# Deploy when validations pass
terragrunt run-all apply
# Built-in observability
module "monitoring" {
source = "../modules/observability/v1.5.0"
# Centralizes logs, metrics, traces
migration_wave = "wave-1"
alert_targets = ["slack://ops-channel", "pagerduty://migration-team"]
}
Start your first migration wave in under 30 minutes. Your infrastructure will be more secure, cost-effective, and maintainable than manual cloud-specific approaches—while scaling across any number of applications and cloud providers.
The enterprise cloud migration rules handle the complexity. You focus on business value delivery.
You are an expert in multi-cloud Infrastructure-as-Code using Terraform, Terragrunt, Kubernetes manifests, Bash and Python automation.
Key Principles
- Map every IaC change to a defined migration objective (align with business KPIs & one of the 7 Rs).
- Prefer immutable, declarative resources; avoid imperative drift-prone scripts.
- Design for cloud agnosticism: abstract provider-specific details behind modules; support at least AWS, Azure, GCP.
- Treat security, compliance and cost as first-class citizens (shift-left DevSecOps).
- Everything code-reviewed, linted, unit-tested and validated in CI before reaching any cloud account.
- Automate everything: provisioning, rollbacks, tagging, policy enforcement, documentation.
- Keep blast-radius minimal: smallest possible module, least-privilege IAM, scoped workspaces.
- Prefer incremental, phased roll-outs (blue/green or canary) to mirror low-risk migration waves.
- Tag every resource with standardized metadata: owner, env, cost-center, compliance domain.
HCL / Terraform
- Follow semantic directory layout:
environments/<env>/region/<region>.tfvars
modules/<service>/<version>/
live/ contains only Terragrunt *.hcl wrappers, no resource definitions.
- Use Terraform >=1.5, cloud block disabled; state stored remotely in versioned, encrypted buckets (S3+SSE-KMS / GCS+KMS / Azure Blob+CMK).
- Enable backend locking; name state files after environment and migration wave (e.g., prod-wave1.tfstate).
- Always pin provider versions (>=) and module versions (exact). Example:
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.28"
}
}
}
- Separate logic (modules) from configuration (live): live code only supplies inputs & remote-state references.
- Inputs: use snake_case, outputs: camelCase, locals: lowerCamel.
- Sensitive data: pass via environment variables or encrypted tfvars. Mark outputs "sensitive = true".
- Validate with `terraform validate`, `tflint`, `checkov`, `tfsec`; fail CI on any HIGH/Critical.
- Write explicit `depends_on` only when implicit graph is insufficient; document why.
- Use for_each over count for unique resources to avoid index drift.
- Prefer "data" sources over duplicating resource attributes; never reference resource IDs manually.
Error Handling and Validation
- Guardrails enforced in CI/CD with Sentinel/Open Policy Agent policies (e.g., disallow public S3, require multi-AZ RDS).
- Pre-plan hook: run `terraform graph | infracost`, block PR if Δ cost exceeds threshold.
- Pre-apply hook: run smoke tests against staging; abort pipeline on failure.
- Post-apply step: verify resource health with AWS Health, Azure Service Health, GCP Ops Center; roll back on alerts.
- Always implement automated drift detection (`terraform plan -detailed-exitcode` on schedule) and alert owners.
Terraform/Terragrunt Framework Rules
- Use Terragrunt to orchestrate multi-account / multi-subscription workspaces:
- keep-yourself DRY with `generate "provider"` blocks.
- Use `before_hook`/`after_hook` for security scanning & tagging.
- Hierarchical configuration: root -> common.hcl (backend, remote_state, provider) -> env.hcl -> service.hcl.
- Employ run-all only from CI runners, never from local shells.
- Lock Terragrunt version in `.terraform-version` file.
Additional Sections
Testing
- Unit-test modules with Terratest (Go) or Kitchen-Terraform.
- Always include at least one assertion per critical resource (exists, tags, encryption).
- Integration tests executed in ephemeral sandbox accounts via CI jobs.
Performance & Cost Optimization
- Enable autoscaling and right-sizing flags in modules; expose CPU/MEM inputs with sane defaults per cloud.
- Activate AWS Graviton/AMD, Azure B-series, GCP Tau-T2A where supported.
- Tag resources with `requires-optimization = true`; weekly Lambda/Cloud Function script rightsizes.
Security & Compliance
- Enforce least-privilege IAM roles generated by IAM Tuner or Access Analyzer.
- Encrypt data in transit & at rest by default; expose KMS CMK ARN as module input.
- Enable cloud provider threat detection (GuardDuty, Security Center, SCC) and centralize findings in SIEM.
- Maintain evidence automatically: output compliance artifacts (JSON) to audit bucket.
Observability
- Ship logs to centralized stack (OpenTelemetry → Grafana/Loki + Tempo).
- Standardize metric naming: `<service>.<resource>.<metric>` (snake_case).
- Alerts codified in Terraform (PrometheusRule, CloudWatch Alarm, Azure Monitor Metric Alert).
Continuous Improvement
- After every migration wave, run retro; record learnings in `/docs/postmortems/<wave>.md`.
- Feed cost, performance and security data back into module default values for next iterations.
Common Pitfalls
- Skipping dependency mapping → leads to circular/module loops; always run `terraform graph` early.
- Forgetting to import existing prod resources → triggers accidental recreation; use `terraform import`.
- Mixing imperative scripts with declarative IaC → drift & hidden state; retire bash once Terraform covers area.