Actionable Rules for designing, provisioning, and operating resilient, secure, and cost-optimized multi-cloud environments.
Managing workloads across AWS, Azure, GCP, and private clouds shouldn't feel like juggling flaming torches. You're dealing with inconsistent APIs, fragmented tooling, spiraling costs, and compliance nightmares that slow your team down. These Cursor Rules transform multi-cloud chaos into a streamlined, automated workflow that actually works.
Your team is burning cycles on problems that shouldn't exist:
You need infrastructure that provisions consistently, scales automatically, and optimizes continuously—without requiring a PhD in each cloud provider's quirks.
These Cursor Rules establish a governance-first approach to multi-cloud infrastructure that treats consistency, automation, and cost control as first-class concerns. Instead of fighting each cloud's differences, you get:
Unified Infrastructure Language: Write Terraform once, deploy everywhere with provider-specific optimizations built in Automated Governance: Policy-as-code that prevents costly mistakes before they hit production Intelligent Cost Management: Right-sizing, scheduling, and commitment management that adapts to usage patterns Zero-Trust Security: Consistent encryption, identity management, and compliance across all environments
Here's what changes in your daily workflow:
# Before: Cloud-specific, brittle configurations
resource "aws_instance" "web" {
ami = "ami-0123456789abcdef0" # Hardcoded
instance_type = "t3.large" # Oversized
# Missing tags, security groups, monitoring
}
# After: Governed, portable, optimized
module "compute_instance" {
source = "./modules/compute"
workload_profile = "web_server"
environment = terraform.workspace
cost_center = var.cost_center
# Auto-selects AMI, right-sizes, applies security policies
tags = merge(local.common_tags, {
compliance_scope = "pci-dss"
backup_schedule = "daily"
})
}
Cut Infrastructure Lead Time by 60%: Standardized modules and automated validation eliminate the research and testing cycles for each cloud deployment.
Reduce Security Incidents: Policy-as-code catches misconfigurations before they deploy. No more security groups open to the world or unencrypted databases.
Slash Unplanned Costs: Automated right-sizing and commitment management typically reduce cloud spend by 20-30% without impacting performance.
Eliminate Configuration Drift: Immutable infrastructure patterns mean your production environment matches your code, every time.
Before: Copy-paste Terraform across regions, manually adjust availability zones, hope the networking works.
After:
module "app_deployment" {
source = "./modules/resilient-app"
regions = ["us-west-2", "eu-west-1", "ap-southeast-1"]
# Automatic: VPC peering, load balancer configuration, DNS failover
disaster_recovery = {
rpo_minutes = 15
rto_minutes = 5
}
}
One module call provisions your entire global infrastructure with built-in resilience patterns.
Before: Monthly cost review meetings where you discover surprise bills and scramble to identify the culprits.
After: Real-time cost attribution and automated optimization:
# Automated cost optimization pipeline
@click.command()
def optimize_resources():
"""Daily cost optimization scan"""
recommendations = cloudzeero.get_rightsizing_recommendations()
for rec in recommendations:
if rec.potential_savings > 100: # $100+ monthly
apply_optimization(rec)
slack.notify(f"💰 Saved ${rec.potential_savings}/month on {rec.resource}")
Your infrastructure self-optimizes based on usage patterns and cost thresholds.
Before: Manual security reviews, spreadsheet-based compliance tracking, hoping auditors don't find issues.
After: Continuous compliance with automated remediation:
# Security policy enforcement
resource "aws_config_configuration_recorder" "compliance" {
# Automatically flags and fixes common violations
recording_group {
all_supported = true
# Custom rules for your compliance requirements
include_global_resource_types = true
}
}
Compliance violations get caught and fixed before auditors even know they existed.
Clone the rules and establish your Terraform workspace structure:
# Set up your multi-cloud workspace
terraform workspace new dev
terraform workspace new staging
terraform workspace new production
# Configure remote state with locking
# (S3 + DynamoDB for AWS, Azure Blob for Azure, GCS for GCP)
Create your organization's standard modules:
# modules/compute/main.tf
locals {
# Automatically select appropriate instance types per cloud
instance_types = {
aws = var.workload_profile == "web_server" ? "t3.medium" : "c5.large"
azure = var.workload_profile == "web_server" ? "Standard_B2s" : "Standard_D2s_v3"
gcp = var.workload_profile == "web_server" ? "e2-medium" : "c2-standard-4"
}
}
Set up governance policies that run before any infrastructure changes:
# policy_validator.py
def validate_terraform_plan(plan_file):
"""Pre-deployment validation"""
violations = []
# Check for required tags
if not has_required_tags(plan_file):
violations.append("Missing required tags")
# Validate security groups
if has_overpermissive_access(plan_file):
violations.append("Security group allows unrestricted access")
if violations:
raise PolicyViolationError(violations)
Connect everything through your CI/CD pipeline:
# .github/workflows/infrastructure.yml
- name: Terraform Plan
run: |
terraform plan -out=plan.tfplan
python scripts/policy_validator.py plan.tfplan
python scripts/cost_estimator.py plan.tfplan
Week 1: Consistent infrastructure provisioning across all clouds. No more environment-specific debugging sessions.
Month 1: 40% reduction in deployment failures due to automated validation and standardized configurations.
Quarter 1: Cost optimization kicks in—typically 20-30% reduction in cloud spend through right-sizing and commitment management.
Ongoing: Security incident reduction, faster compliance audits, and infrastructure that scales with your business instead of against it.
Your team stops fighting infrastructure and starts shipping features. The complexity doesn't disappear—it gets managed automatically through code, policies, and intelligent automation that works across every cloud environment you use.
Ready to stop wrestling with multi-cloud complexity? These rules give you the governance, automation, and optimization patterns that enterprise infrastructure teams rely on. Your infrastructure becomes a competitive advantage instead of a bottleneck.
You are an expert in Terraform (HCL), Kubernetes, Python (automation/CLI), Bash, AWS, Azure, GCP, IBM Cloud, VMware Aria, Nutanix Cloud Manager, Azure Arc, Dynatrace, Lacework, CloudZero AnyCost, ProsperOps.
Key Principles
- Define measurable objectives for multi-cloud adoption: resilience, feature leverage, cost control, vendor independence.
- Governance first: codify workload-placement, security, compliance, and cost policies before provisioning resources.
- Infrastructure as Code (IaC) is mandatory; treat every environment as disposable and reproducible.
- Prefer immutable infrastructure; use containerization (Kubernetes) for workload portability.
- Enforce consistent naming & tagging to enable automation, observability, and chargeback.
- Automate everything: provisioning, scaling, policy enforcement, remediation, and documentation.
- Zero-Trust security: continuous identity verification, least-privilege, encryption everywhere.
- Continuous optimisation loop: Measure → Analyse (AI-driven) → Optimise → Re-provision.
Terraform (HCL)
- Organise code into versioned, reusable modules; publish to a private registry.
- Follow `snake_case` for variables, `kebab-case` for resources: `aws_instance.app_server`.
- Mandatory tags block in every resource:
```hcl
tags = merge(local.common_tags, {
cost_center = var.cost_center,
env = terraform.workspace,
})
```
- Use `locals` for computed values; avoid repetition.
- Keep provider configuration per cloud in `providers.tf`; pin versions with `~>`.
- Enable `terraform fmt -recursive` & `terraform validate` in pre-commit.
- Add explicit `depends_on` only when implicit dependencies fail; keep DAG clean.
- Use `for_each` over `count` for clarity and easier deletion.
- Leverage `data` sources to pull cross-cloud artefacts (e.g., AMI ids) instead of hard-coding.
- Place input validation in `variables.tf`:
```hcl
variable "region" {
type = string
description = "Deployment region"
validation {
condition = contains(["us-east-1","us-west-2"], var.region)
error_message = "Unsupported region."
}
}
```
- Separate state per workload & environment; back up in remote back-ends with locking (e.g., S3 + DynamoDB, Azure Blob, GCS).
- Enable drift detection via Terraform Cloud or Atlantis CI.
Python (Automation / CLI)
- Use Python ≥3.10, PEP-8 + black; type-hint everything (`mypy --strict`).
- Wrap Terraform and Kubernetes operations behind idempotent click/Typer CLI commands.
- Handle errors with rich tracebacks; exit non-zero on failure for CI gates.
- Store secrets in environment or Vault; never commit to VCS.
Error Handling and Validation
- Fail fast: validate inputs, policies, and quotas before apply.
- Use Sentinel/OPA policies to block non-compliant plans (e.g., open‐to‐world SG, untagged resources).
- Integrate Dynatrace/Lacework alerts → auto-remediation Lambdas / Functions triggered via EventBridge/Event Grid/PubSub.
- Centralise logging (CloudWatch, Azure Monitor, Cloud Logging) routed to Elastic/Splunk with context-rich metadata (tags, traceId).
- Implement AI-driven anomaly detection (Dynatrace Davis, Lacework Polygraph) to pre-empt incidents.
Framework-Specific Rules
Terraform Cloud / Enterprise
- Use workspaces to map 1-to-1 with each environment (`dev`, `stage`, `prod`).
- Policy Sets enforce company-wide Sentinel policies.
- Set run-triggers so that module changes fan-out to dependent workspaces.
Kubernetes (multi-cloud GKE/EKS/AKS)
- Use GitOps (Argo CD or Flux) for cluster configuration.
- Standardise Helm charts; lint with `helm lint` & `ct`.
- Enforce PSP/OPA Gatekeeper policies for security.
- Horizontal Pod Autoscaler (HPA) and Cluster Autoscaler configured per cloud provider.
Azure Arc / VMware Aria / Nutanix CM
- Register all non-native clusters for unified policy & inventory.
- Enable consistent RBAC mapping to corporate IdP (AAD/LDAP).
Additional Sections
Testing
- PR pipeline stages: `terraform fmt → tflint → tfsec → terrascan → unit tests (kitchen-terraform) → terraform plan`.
- For Python, run `pytest -q`, minimum 90 % coverage, contract tests using `pytest-terraform`.
Performance & Cost Optimisation
- Right-size via CloudZero AnyCost & ProsperOps recommendations; commit/purchase RIs & Savings Plans automatically.
- Schedule non-prod workloads off-hours using AWS Instance Scheduler, Azure Automation, or Google Cloud Scheduler.
- Use spot/preemptible nodes for stateless jobs; fallback to on-demand.
- Add CDN + geo-distributed caches for latency-sensitive content.
Security
- Encrypt at rest (KMS, Azure Key Vault, CMEK) and in transit (TLS 1.2+).
- Rotate IAM keys automatically; prohibit long-lived user keys.
- Enforce SAST/DAST scans on container images (`trivy`, `grype`).
Naming & Tagging Convention
- `<env>-<workload>-<region>-<resource>` e.g., `prod-payment-usw2-vpc`.
- Mandatory tag keys: `env`, `owner`, `cost_center`, `compliance`, `retention`.
CI/CD Pipeline Blueprint
1. Trigger on PR → lint & validate.
2. Create ephemeral test environment (Terraform workspace + namespace).
3. Integration tests run; on success merge to main.
4. Auto-promote to stage; manual approval to prod.
5. Post-deploy: run cost & compliance scans.
Documentation
- Auto-generate module docs via terraform-doc-s.
- Keep runbooks in Markdown in `/docs` with standardized template (purpose, inputs, remediation steps).
Edge Cases & Pitfalls to Avoid
- Avoid inter-cloud data transfer fees by localising traffic; replicate data strategically.
- Prevent state file drift due to manual changes; forbid console modifications via SCP / Azure Policy / Org Policy.
- Watch provider default limits (e.g., VPC quota) and include `terraform pre-flight` checks.