Opinionated rules for designing, implementing, and operating horizontally-sharded databases at scale.
Your application is scaling. User growth is accelerating. Your monolithic database is becoming the chokepoint that keeps you awake at 3 AM. Sound familiar?
While your competitors are adding read replicas and calling it "scaling," you're about to implement the horizontal sharding architecture that powers systems handling millions of transactions per second.
Every high-growth engineering team hits this wall. Your PostgreSQL instance is maxed at 32 cores and 1TB RAM. Read replicas help with queries, but write throughput is still bottlenecked on a single primary. Adding more vertical scale is expensive and temporary.
The real problem? Your data architecture wasn't designed for horizontal scale from day one.
Traditional scaling approaches fail because they scale the hardware, not the data model: bigger boxes and more read replicas still funnel every write through a single primary.
These Cursor Rules implement battle-tested sharding patterns used by teams at Netflix, Uber, and Discord to scale from thousands to billions of operations per day. Instead of fighting your database's limitations, you're architecting around them.
The ruleset transforms your approach to:
Eliminate Analysis Paralysis: Stop debating shard key selection. The rules provide decision frameworks based on cardinality, access patterns, and hotspot prevention.
Reduce Production Incidents: Automated health checks detect shard skew before it impacts users. Circuit breakers and retry policies handle transient failures gracefully.
Accelerate Feature Development: Clear patterns for cross-shard operations mean your team isn't reinventing distributed query logic for every new feature.
Compress Operational Overhead: Standardized monitoring, backup, and deployment procedures across all shard technologies reduce the learning curve for new team members.
-- This query scans 50M+ rows on a single instance
SELECT COUNT(*) FROM orders
WHERE created_at > '2024-01-01'
AND customer_region = 'us-west';
-- Team dreads this conversation:
-- "The analytics query is blocking production writes again"
-- Same query, automatically routed to the relevant shards only:
-- the router rewrites it into per-shard counts that run in parallel
SELECT SUM(cnt) FROM (
    SELECT COUNT(*) AS cnt FROM orders_s1 WHERE customer_region = 'us-west' AND created_at > '2024-01-01'
    UNION ALL
    SELECT COUNT(*) AS cnt FROM orders_s2 WHERE customer_region = 'us-west' AND created_at > '2024-01-01'
    UNION ALL
    SELECT COUNT(*) AS cnt FROM orders_s3 WHERE customer_region = 'us-west' AND created_at > '2024-01-01'
) AS per_shard;
-- Three shards scan in parallel instead of one instance blocking on 50M rows
-- Smart composite shard key prevents hotspots (MySQL syntax shown)
CREATE TABLE orders_s1 (
    order_id BIGINT NOT NULL,
    tenant_id VARCHAR(32) NOT NULL,
    -- tenant_id plus a hash bucket spreads one tenant's orders across 16 buckets;
    -- always include tenant_id (or shard_key) in WHERE clauses
    shard_key VARCHAR(64) GENERATED ALWAYS AS (
        CONCAT(tenant_id, ':', MOD(CRC32(order_id), 16))
    ) STORED,
    PRIMARY KEY (shard_key, order_id)
);
// Health check endpoint automatically generated
GET /internal/shards
{
"shard_1": {
"replication_lag_ms": 45,
"storage_free_gb": 127,
"qps": 2847,
"p95_latency_ms": 12
},
"shard_2": {
"replication_lag_ms": 1200, // Alert: lag > 1000ms
"storage_free_gb": 8, // Alert: storage < 10GB
"qps": 4923,
"p95_latency_ms": 89
}
}
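
A minimal sketch of such an endpoint, Flask-based; the collect_shard_stats() helper and the thresholds are illustrative, and in practice it would read replication and disk metrics from each shard:

# health_endpoint.py - /internal/shards health check sketch (assumes Flask)
from flask import Flask, jsonify

app = Flask(__name__)
LAG_ALERT_MS = 1000
STORAGE_ALERT_GB = 10

def collect_shard_stats():
    # Placeholder: query replication lag, free storage, QPS and p95 latency per shard.
    return {
        "shard_1": {"replication_lag_ms": 45, "storage_free_gb": 127, "qps": 2847, "p95_latency_ms": 12},
        "shard_2": {"replication_lag_ms": 1200, "storage_free_gb": 8, "qps": 4923, "p95_latency_ms": 89},
    }

@app.route("/internal/shards")
def shard_health():
    stats = collect_shard_stats()
    for metrics in stats.values():
        metrics["alerts"] = []
        if metrics["replication_lag_ms"] > LAG_ALERT_MS:
            metrics["alerts"].append("replication_lag")
        if metrics["storage_free_gb"] < STORAGE_ALERT_GB:
            metrics["alerts"].append("low_storage")
    return jsonify(stats)
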
# .cursorrules configuration
database_sharding:
shard_count: 16
replica_factor: 3
shard_key_pattern: "tenant_id"
rebalance_threshold: 1.3 # Trigger when max/avg > 1.3
The rules automatically adapt to your stack:
MongoDB Sharding:
// Enable sharding on the database, then shard the collection on a hashed tenant_id
sh.enableSharding("ecommerce");
sh.shardCollection("ecommerce.orders", {
"tenant_id": "hashed"
});
Citus (PostgreSQL):
-- Generates distribution and co-location logic
SELECT create_distributed_table('orders', 'tenant_id');
SELECT create_reference_table('lookup_tenants');
Vitess (MySQL):
{
"sharded": true,
"vindexes": {
"tenant_hash": {
"type": "hash"
}
},
"tables": {
"orders": {
"column_vindexes": [
{
"column": "tenant_id",
"name": "tenant_hash"
}
]
}
}
}
# Auto-generated operational scripts
./scripts/add_shard.sh --replica-count=3 --zone=us-east-1a
./scripts/rebalance_check.sh --threshold=25 # Run every 6h via cron
./scripts/backup_all_shards.sh --retention=30d
# Prometheus rules generated automatically
- alert: ShardSkewDetected
expr: (max(shard_row_count) / avg(shard_row_count)) > 1.3
for: 5m
- alert: ReplicationLagHigh
expr: shard_replication_lag_seconds > 5
for: 2m
Query Performance: Teams report 60-80% reduction in query latency after implementing proper shard key selection and query routing patterns.
Operational Stability: Automated failover and rebalancing reduce database-related incidents by 75% within the first quarter.
Development Velocity: Standardized patterns for cross-shard operations mean new features involving distributed data ship 3x faster.
Cost Optimization: Horizontal scaling on commodity hardware typically reduces database infrastructure costs by 40-60% compared to continued vertical scaling.
These rules handle the complex scenarios that break most sharding implementations: hotspot shard keys, cross-shard transactions and joins, resharding under live traffic, and replica failover.
Your database architecture becomes a competitive advantage instead of a scaling limitation. While other teams are still debating whether to shard, you're already running distributed queries across dozens of nodes with sub-50ms latency.
Start implementing production-grade sharding patterns that scale with your business growth, not against it.
You are an expert in distributed SQL, NoSQL, and data-plane orchestration (PostgreSQL/Citus, MySQL/Vitess, MongoDB, Cassandra, ProxySQL, Apache ShardingSphere, Kubernetes).
Key Principles
- Prefer horizontal sharding when write throughput is the bottleneck; read replicas only scale reads, not writes.
- Pick the shard key to maximize uniform cardinality and minimize cross-shard traffic.
- Shared-nothing: each shard owns data + WAL/commit-log + local indexes; no foreign-key constraints across shards.
- Always plan for resharding; build indirection (lookup table or logical router) so keys can migrate without app changes (see the router sketch after this list).
- Automate everything: provisioning, rebalancing, fail-over, backup.
- Replicate every shard 3× (minimum) across fault domains; design for one replica loss without service impact.
- Keep operational metadata (topology, health, metrics) in a strongly-consistent store (e.g., etcd, ZooKeeper, Consul).
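
A minimal sketch of that indirection, assuming the bucket-to-shard topology is loaded from a consistent store such as etcd; the class, field, and constant names are illustrative, not from any specific library:

# shard_router.py - logical-router indirection sketch (illustrative names)
# The bucket->shard map lives in a strongly consistent store (etcd/Consul/ZooKeeper)
# and is refreshed via a watch, so buckets can move between shards without app changes.
import zlib
from dataclasses import dataclass

@dataclass
class Shard:
    shard_id: int
    dsn: str            # connection string of the shard primary

NUM_BUCKETS = 256       # fixed logical buckets; each physical shard owns a set of buckets

def bucket_for(tenant_id: str) -> int:
    # Deterministic hash; never Python's built-in hash(), which is seeded per process.
    return zlib.crc32(tenant_id.encode("utf-8")) % NUM_BUCKETS

class ShardRouter:
    def __init__(self, bucket_map: dict):
        self.bucket_map = bucket_map    # {bucket: Shard}, loaded from the metadata store

    def shard_for(self, tenant_id: str) -> Shard:
        return self.bucket_map[bucket_for(tenant_id)]

Because buckets, not tenants, are the unit of ownership, resharding becomes a metadata change (move a bucket to another shard) rather than an application change.
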
SQL / Query Rules
- NEVER run SELECT *; project only required columns to reduce network fan-out.
- All WHERE clauses on sharded tables MUST include the shard key (= or IN) to ensure single-shard routing (see the query sketch after this list).
- Use prepared statements with bind variables; embed shard-key as first parameter for deterministic routing.
- Avoid multi-shard transactions. If unavoidable, use two-phase commit with strict timeout (<500 ms) and compensate on failure.
- Keep cross-shard JOINs in the service layer; do not rely on the database unless using a distributed SQL engine that supports it (e.g., Citus).
- When performing resharding, disable adaptive query plans; pin execution plans to avoid oscillation.
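
A hedged example combining these rules, built on the router sketch above; psycopg2 is assumed as the client library, and the table and column names are illustrative:

# single_shard_query.py - shard key as the routing parameter (assumes psycopg2)
import psycopg2

def recent_orders(router, tenant_id: str, since: str):
    # The shard key (tenant_id) is always present in the WHERE clause and known
    # before execution, so the statement is routed to exactly one shard.
    shard = router.shard_for(tenant_id)
    sql = """
        SELECT order_id, total_amount            -- project only required columns
        FROM orders
        WHERE tenant_id = %s AND created_at > %s
    """
    with psycopg2.connect(shard.dsn) as conn, conn.cursor() as cur:
        cur.execute(sql, (tenant_id, since))     # bind variables, shard key first
        return cur.fetchall()

Cross-shard aggregations follow the same pattern: run the per-shard query through the router for each affected shard and merge the partial results in the service layer.
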
Schema & DDL Conventions
- Sharded tables use suffix _s<cluster_id> (e.g., orders_s1). Global lookup tables use prefix glb_.
- Never include monotonic keys (AUTO_INCREMENT, timestamp) alone as shard keys ➜ hotspot risk.
- Composite key pattern: (tenant_id, entity_id) where tenant_id is hashed modulo N shards (see the sketch after this list).
- Keep primary-key width ≤ 24 bytes to minimize secondary-index bloat.
- Create local secondary indexes only when they include (ideally are prefixed by) the shard key, so index lookups stay single-shard.
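
A small helper illustrating the composite-key and naming conventions above; the shard count, bucket count, and table names are examples, not fixed values:

# naming.py - composite shard key and _s<cluster_id> table naming (illustrative)
import zlib

SHARD_COUNT = 16   # physical clusters; matches shard_count in the .cursorrules config

def cluster_for(tenant_id: str) -> int:
    # tenant_id hashed modulo the shard count selects the physical cluster
    return zlib.crc32(tenant_id.encode("utf-8")) % SHARD_COUNT

def physical_table(logical_table: str, tenant_id: str) -> str:
    # Sharded tables carry the _s<cluster_id> suffix (orders -> orders_s<cluster_id>);
    # glb_-prefixed lookup tables are global and never routed this way.
    return f"{logical_table}_s{cluster_for(tenant_id)}"

def composite_shard_key(tenant_id: str, entity_id: int, buckets: int = 16) -> str:
    # (tenant_id, hash bucket of entity_id) mirrors the generated shard_key column shown earlier
    return f"{tenant_id}:{zlib.crc32(str(entity_id).encode()) % buckets}"
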
Error Handling & Validation
- Detect shard skew: if (max_shard_size / avg_shard_size) > 1.3 ⇒ trigger auto-rebalance job.
- Build health-check endpoint /internal/shards returning replication lag, free storage, QPS, and p95 latency per shard.
- Retry policy: idempotent reads 3× at 200 ms back-off; non-idempotent writes escalate to circuit-breaker (see the sketch after this list).
- All client libraries must raise ShardRoutingError when the key cannot be mapped; NEVER default to broadcast query.
- Fail-over: replica promotion completes <15 s; client retries with exponential back-off (100, 200, 400 ms… up to 5 s).
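
A sketch of the retry and routing-error rules above; the exception names and the transient-error class are illustrative:

# retry_policy.py - routing errors and idempotent-read retries (illustrative names)
import time

class ShardRoutingError(Exception):
    """Raised when a key cannot be mapped to a shard; never fall back to broadcast."""

class TransientShardError(Exception):
    """Connection reset, failover in progress, replica promotion under way, etc."""

def route_or_fail(bucket_map: dict, bucket: int):
    try:
        return bucket_map[bucket]
    except KeyError:
        raise ShardRoutingError(f"no shard owns bucket {bucket}") from None

def read_with_retry(fn, attempts: int = 3, backoff_s: float = 0.2):
    # Idempotent reads only: up to 3 attempts with a 200 ms pause between them.
    # Non-idempotent writes are not retried here; they escalate to the circuit breaker.
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except TransientShardError:
            if attempt == attempts:
                raise
            time.sleep(backoff_s)
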
Framework-Specific Rules
MongoDB
- Enable sharding via enableSharding(dbName) before collection-level shardCollection.
- Use hashed shard key for high-cardinality collections; range key only when you need efficient pagination.
- Balancer settings: keep chunk size ≤ 128 MB; restrict the balancer's activeWindow to off-peak hours.
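
A sketch of that setup through pymongo, connected to a mongos router; the URI and balancing window are illustrative, and chunk size and balancer settings are written to the config database as they would be from the mongo shell:

# mongo_sharding_setup.py - sharding, chunk size and balancer window (assumes pymongo)
from pymongo import MongoClient

client = MongoClient("mongodb://mongos.internal:27017")   # illustrative mongos endpoint

# Shard the database, then the collection on a hashed, high-cardinality key.
client.admin.command("enableSharding", "ecommerce")
client.admin.command("shardCollection", "ecommerce.orders", key={"tenant_id": "hashed"})

# Chunk size <= 128 MB and an off-peak balancing window, stored in config.settings.
client.config.settings.update_one(
    {"_id": "chunksize"}, {"$set": {"value": 128}}, upsert=True)
client.config.settings.update_one(
    {"_id": "balancer"},
    {"$set": {"activeWindow": {"start": "23:00", "stop": "06:00"}}},
    upsert=True)
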
Cassandra
- Token-aware driver MUST be enabled; set consistency LOCAL_QUORUM for reads/writes (see the driver sketch after this list).
- Set num_tokens = 256 for large clusters; review load distribution with nodetool status on a regular cadence and run nodetool cleanup after topology changes.
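
A sketch of that driver configuration with the Python cassandra-driver; the contact point, data-center, and keyspace names are illustrative:

# cassandra_profile.py - token-aware routing with LOCAL_QUORUM (python cassandra-driver)
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
from cassandra.policies import TokenAwarePolicy, DCAwareRoundRobinPolicy

profile = ExecutionProfile(
    # Token awareness sends each statement to a replica that owns the partition.
    load_balancing_policy=TokenAwarePolicy(DCAwareRoundRobinPolicy(local_dc="dc1")),
    consistency_level=ConsistencyLevel.LOCAL_QUORUM,
)
cluster = Cluster(
    ["cassandra-seed.internal"],                        # illustrative contact point
    execution_profiles={EXEC_PROFILE_DEFAULT: profile},
)
session = cluster.connect("orders_ks")                  # illustrative keyspace
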
Vitess (MySQL)
- VSchema must declare shard-key column and routing rules.
- Use vtgate for all client connections; never bypass.
- Resharding: vtctl Reshard --source ks/-80,80- --dest ks/-40,40-80,80-.
Citus (PostgreSQL)
- SELECT create_distributed_table('table', 'shard_key'); use co-location groups for cross-table foreign keys.
- Use the adaptive executor (citus.task_executor_type = 'adaptive') for low-latency single-shard queries.
ProxySQL (MySQL)
- Configure mysql_query_rules to match on the shard key (e.g., a routing comment or tenant prefix in the statement) and send each query to the hostgroup of the owning shard.
- Monitor stats_mysql_connection_pool for per-shard saturation; rebalance when >80 % utilisation.
Testing
- Unit: verify the router spreads random keys uniformly across shards (e.g., on a 100k-key sample no shard deviates more than 5 % from the mean; see the sketch after this list).
- Integration: spin 3-shard docker-compose; run TPC-C 1000 warehouses; assert p95 latency < 50 ms.
- Chaos: kill-9 primary process; ensure replica promotion + app retries succeed <30 s.
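
A sketch of the distribution check as a pytest-style unit test; the sample size and 5 % bound are illustrative, and it reuses the router sketch above:

# test_router_uniformity.py - shard distribution unit test (bounds illustrative)
import uuid
from collections import Counter

from shard_router import bucket_for   # from the router sketch above

def test_keys_spread_evenly_across_shards():
    shard_count = 16
    sample_size = 100_000
    histogram = Counter(
        bucket_for(str(uuid.uuid4())) % shard_count for _ in range(sample_size)
    )
    expected = sample_size / shard_count
    for shard in range(shard_count):
        deviation = abs(histogram[shard] - expected) / expected
        assert deviation < 0.05, f"shard {shard} deviates {deviation:.1%} from the mean"
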
Performance
- Batch writes in 500-row chunks to amortize network latency (see the sketch after this list).
- Enable parallel query execution (SET max_parallel_workers_per_gather = 4) for analytics on Citus.
- Keep p99 replica lag <2 s; throttle writes when lag >5 s.
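
A sketch of the batching rule, assuming psycopg2 and an illustrative orders table; execute_values flushes rows to the shard in 500-row pages:

# batch_insert.py - 500-row batched writes to a single shard (assumes psycopg2)
import psycopg2
from psycopg2.extras import execute_values

def insert_orders(shard_dsn: str, rows):
    # rows: iterable of (tenant_id, order_id, total_amount) tuples, all owned by this shard
    with psycopg2.connect(shard_dsn) as conn, conn.cursor() as cur:
        execute_values(
            cur,
            "INSERT INTO orders (tenant_id, order_id, total_amount) VALUES %s",
            rows,
            page_size=500,   # amortizes round-trips without oversized transactions
        )
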
Security
- Encrypt data in flight with mTLS between router and shards (see the connection sketch after this list).
- Obfuscate shard-key in URLs; never leak physical shard ids.
- Grant only EXECUTE on routing stored procs; no raw SELECT on sharded tables for application roles.
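
A sketch of the router-to-shard connection with mutual TLS, using libpq/psycopg2 SSL parameters; the hostname and certificate paths are illustrative:

# mtls_connection.py - router->shard connection with mutual TLS (libpq parameters)
import psycopg2

conn = psycopg2.connect(
    host="shard-03.db.internal",         # resolved by the router; never exposed to clients
    dbname="orders",
    user="shard_router",
    sslmode="verify-full",               # verify the shard's certificate and hostname
    sslrootcert="/etc/pki/db/ca.pem",
    sslcert="/etc/pki/db/router.crt",    # client certificate presented to the shard
    sslkey="/etc/pki/db/router.key",
)
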
Backup & DR
- Each shard performs incremental backups every 15 min and base backup nightly.
- Store WAL / commit-logs on object storage in another region; retention 30 days.
- Run cross-shard restore drill quarterly.
Operational Playbooks
- Add shard: provision VM, restore base backup, register in topology, trigger chunk-move scheduler.
- Remove shard: drain traffic, verify lag 0, copy residual chunks, decommission replica set.
- Rebalance: monitor_chunk_distribution job every 6 h; threshold 25 % deviation.
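
A sketch of the 6-hourly distribution check behind rebalance_check.sh; the input format is illustrative:

# rebalance_check.py - chunk-distribution deviation check (25 % threshold from the playbook)
def needs_rebalance(chunk_counts: dict, max_deviation: float = 0.25) -> bool:
    # chunk_counts: {shard_name: chunks (or rows/bytes) currently owned by that shard}
    avg = sum(chunk_counts.values()) / len(chunk_counts)
    worst = max(abs(count - avg) / avg for count in chunk_counts.values())
    return worst > max_deviation

# needs_rebalance({"shard_1": 100, "shard_2": 100, "shard_3": 160}) -> True
# shard_3 owns a third more chunks than the average, so the chunk-move scheduler is triggered.
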
Documentation & Naming
- docs/sharding/ contains ADRs, shard-key rationale, migration run-books.
- Use Terraform module db_shard_* for infra; variables: shard_count, replica_factor, az_map.