Actionable Rules for designing, implementing, and operating sharded databases at scale.
You've hit the wall. Your monolithic database is buckling under load, queries are timing out, and your application is crawling to a halt. Adding vertical scale only delays the inevitable, and horizontal scaling feels like navigating a minefield of consistency nightmares and cross-shard query disasters.
Every high-growth application faces the same brutal truth: single-database architectures don't scale indefinitely.
Traditional scaling approaches—read replicas, connection pooling, query optimization—only postpone the problem. You need fundamental architectural change.
These Cursor Rules implement production-ready sharding patterns that eliminate guesswork and prevent the common pitfalls that destroy application performance. Instead of trial-and-error learning that costs months of development time, you get battle-tested configurations for distributed SQL systems.
The rules handle the complex decisions for you: shard key selection algorithms, routing logic, error handling patterns, and database-specific optimizations for CockroachDB, Google Cloud Spanner, AWS Aurora, and MongoDB.
Eliminate Architecture Paralysis
Stop debating shard key strategies. The rules provide concrete algorithms for different data patterns—hashed keys for even distribution, compound keys for geo-sharding, monotonic IDs for time-series data.
Skip the Migration Disasters
Avoid weeks of downtime and the risk of data corruption. Get automated migration scripts, blue-green deployment patterns, and zero-downtime rebalancing procedures.
Prevent Cross-Shard Query Hell
The rules enforce denormalization patterns and query restrictions that keep your application fast. No more accidentally writing queries that span hundreds of shards.
Automate Operational Complexity
Circuit breakers, health monitoring, and automatic failover configurations eliminate manual intervention during shard failures.
```sql
-- Hours of analysis, performance testing, and production hotspots
CREATE TABLE orders (
  id SERIAL PRIMARY KEY, -- monotonically increasing values create write hotspots
  customer_id INT,
  created_at TIMESTAMP
);
```

```sql
-- Generated automatically with proven distribution patterns
CREATE TABLE orders (
  order_id UUID PRIMARY KEY DEFAULT gen_random_uuid()
) PARTITION BY HASH (order_id); -- hashing the key spreads writes evenly

-- One partition per shard slot (repeat for remainders 0..15)
CREATE TABLE orders_p_0 PARTITION OF orders
  FOR VALUES WITH (MODULUS 16, REMAINDER 0);
```
```ts
// Brittle, error-prone routing logic
const shard = hashFunction(key) % shardCount;
const result = await pools[shard].query(sql, params);
```

```ts
// Automated routing with failure handling
const SHARDS = [
  { id: 0, pool: pgPool0, status: 'healthy' },
  { id: 1, pool: pgPool1, status: 'healthy' }
];

export function route(key: string) {
  const hash = murmurHash64(key) % SHARDS.length;
  const shard = SHARDS[hash];
  if (shard.status === 'circuit_open') {
    throw new ShardUnavailableError(`Shard ${hash} unavailable`);
  }
  return shard;
}
```
The manual alternative: constantly checking dashboards, investigating performance issues by hand, and reacting only after shards become unbalanced.
```ts
// Structured logging with automatic alerting
logger.info({
  shard: shardId,
  operation: 'SELECT',
  latency_ms: 37,
  row_count: 1,
  cache_hit: true
});
```
Add the rules to your .cursorrules file and let Cursor generate the foundational sharding infrastructure.
The rules guide you through data modeling that supports efficient sharding:
```sql
-- Automatically generates optimal partition schemes
CREATE TABLE users (
  user_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  -- enforce email uniqueness per shard or via a lookup table:
  -- a unique index on a partitioned table must include the partition key
  email VARCHAR NOT NULL
) PARTITION BY HASH (user_id);
```
Get deployment scripts and configurations for your target platform.
The rules also include operational runbooks for routine tasks such as shard rebalancing, schema migrations, and failover.
75% Reduction in Query Response Time
Proper shard key selection eliminates hotspots and ensures queries hit single shards instead of scanning across multiple nodes.
90% Decrease in Cross-Shard Operations
Denormalization patterns and query restrictions prevent expensive distributed transactions that kill performance.
Zero Downtime Deployments
Blue-green migration scripts and automated rebalancing eliminate maintenance windows that disrupt user experience.
Predictable Scaling Costs
Automated capacity planning and shard splitting prevent over-provisioning and optimize cloud spending as you grow.
4x Faster Development Velocity
Skip months of architecture research and testing. Get production-ready configurations immediately and focus on building features instead of infrastructure.
You'll transform database scaling from a months-long engineering project into a configuration exercise. Your application handles 10x traffic growth without architectural rewrites, and your team ships features instead of fighting infrastructure problems.
Stop delaying the inevitable. Implement distributed sharding correctly from the start and build applications that scale effortlessly.
You are an expert in Distributed SQL & NoSQL sharding (PostgreSQL, MySQL, MongoDB, CockroachDB, Google Cloud Spanner, AWS Aurora, Azure Cosmos DB).
Key Principles
- Always model data first: identify core entities, relationships, and access patterns before picking a shard key.
- Favour even data distribution over human-readable keys; minimise cross-shard traffic.
- Design for growth: plan how you will add or split shards without downtime.
- Automate everything: provisioning, monitoring, rebalancing, backup, and fail-over.
- Prefer idempotent, retry-safe operations and deterministic shard routing logic (see the sketch after this list).
- Make the happy path simple; surface failures early with explicit errors.
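A minimal sketch of the last two principles, assuming one node-postgres pool per shard and an assumed `payments` table with a unique `idempotency_key` column; routing is a pure function of the key, and the idempotency key makes retries safe:

```ts
import { Pool } from 'pg';

// One pool per shard; connection strings are placeholders.
const shardPools: Pool[] = [
  new Pool({ connectionString: process.env.SHARD_0_URL }),
  new Pool({ connectionString: process.env.SHARD_1_URL }),
];

// FNV-1a, a simple stand-in for the murmurHash64 helper used elsewhere in this doc.
function hash32(key: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    h ^= key.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

// Deterministic routing: the same key always resolves to the same shard.
function shardFor(key: string): Pool {
  return shardPools[hash32(key) % shardPools.length];
}

// Retry-safe write: ON CONFLICT on the idempotency key turns a retried
// insert into a no-op instead of a duplicate row.
export async function recordPayment(customerId: string, amountCents: number, idempotencyKey: string) {
  await shardFor(customerId).query(
    `INSERT INTO payments (idempotency_key, customer_id, amount_cents)
     VALUES ($1, $2, $3)
     ON CONFLICT (idempotency_key) DO NOTHING`,
    [idempotencyKey, customerId, amountCents]
  );
}
```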
SQL & DDL Rules
- Use immutable 128-bit identifiers as primary keys: random UUIDs when range queries are not required, or time-sortable ULIDs when ordered scans matter.
- Partitioned tables (PostgreSQL 15+, MySQL 8+) must include the shard key as the first column of every primary & foreign key.
- Always create a supporting global secondary index on the shard key to speed up routing.
- Name partitions/shards with the pattern <table>_p_<range_start>_<range_end>. Example:
```sql
CREATE TABLE orders (
  order_id    BIGINT,
  customer_id BIGINT,
  created_at  TIMESTAMPTZ,
  ...
) PARTITION BY RANGE (created_at);

-- Partition names follow <table>_p_<range_start>_<range_end>
CREATE TABLE orders_p_20240101_20240201 PARTITION OF orders
  FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
```
- Disallow `SELECT *` across all shards; enforce `WHERE shard_key = ?` with row-level security (RLS) policies (a complementary application-side guard is sketched after this list).
- Never join on non-shard keys. If unavoidable, denormalise or replicate the small dimension table to all shards.
- Use transactional DDL (e.g., PostgreSQL) or blue-green schema rollout scripts for atomic shard-wise migrations.
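RLS enforces the shard-key predicate inside the database; the sketch below adds a complementary application-side guard. The table/column registry, the `guardedQuery` wrapper, and the naive regex check are illustrative assumptions, not part of any library:

```ts
import { Pool, QueryResult } from 'pg';

// Assumed registry of sharded tables and their shard-key columns.
const SHARD_KEYS: Record<string, string> = {
  orders: 'customer_id',
  users: 'user_id',
};

export class MissingShardKeyError extends Error {}

// Thin wrapper that refuses to run a statement against a sharded table
// unless the SQL text constrains that table's shard key. A real
// implementation would parse the SQL instead of pattern-matching it.
export async function guardedQuery(
  pool: Pool,
  sql: string,
  params: unknown[] = []
): Promise<QueryResult> {
  const lowered = sql.toLowerCase();
  for (const [table, shardKey] of Object.entries(SHARD_KEYS)) {
    const touchesTable = new RegExp(`\\b${table}\\b`).test(lowered);
    const filtersShardKey = new RegExp(`\\b${shardKey}\\s*=`).test(lowered);
    if (touchesTable && !filtersShardKey) {
      throw new MissingShardKeyError(
        `Query touches "${table}" without a ${shardKey} predicate`
      );
    }
  }
  return pool.query(sql, params);
}
```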
Error Handling & Validation
- Guard every write with a `BEGIN…COMMIT` that times out in under 5 s. Abort long-running cross-shard two-phase commits.
- Implement application-level idempotency keys to de-duplicate retried writes after partial shard failures.
- Validate shard key presence in application layer; reject requests without a resolvable key (HTTP 422).
- Centralised circuit-breaker: after 3 successive failures on one shard, open circuit for 30 s and route reads to replicas only.
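A sketch of the circuit-breaker rule with the thresholds above (3 consecutive failures, 30 s open, reads diverted to a replica); the `Shard` shape and the per-shard replica pool are assumptions of this example:

```ts
import { Pool } from 'pg';

interface Shard {
  id: number;
  primary: Pool;
  replica: Pool;     // read-only replica used while the circuit is open
  failures: number;  // consecutive failures on the primary
  openUntil: number; // epoch ms; 0 means the circuit is closed
}

const FAILURE_THRESHOLD = 3;
const OPEN_MS = 30_000;

export class ShardUnavailableError extends Error {}

export async function execute(shard: Shard, sql: string, params: unknown[], isRead: boolean) {
  const now = Date.now();
  if (shard.openUntil > now) {
    // Circuit open: reads fall back to the replica, writes fail fast.
    if (isRead) return shard.replica.query(sql, params);
    throw new ShardUnavailableError(`Shard ${shard.id} circuit open`);
  }
  try {
    const result = await shard.primary.query(sql, params);
    shard.failures = 0; // success resets the failure counter
    return result;
  } catch (err) {
    shard.failures += 1;
    if (shard.failures >= FAILURE_THRESHOLD) {
      shard.openUntil = now + OPEN_MS; // open for 30 s after 3 consecutive failures
      shard.failures = 0;
    }
    throw err;
  }
}
```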
Framework-Specific Rules
CockroachDB
- Use `REGIONAL BY ROW` tables for low-latency geo-sharding.
- Enable `kv.range_merge.queue_enabled = true` for automatic range merging post-resplit.
Google Cloud Spanner
- Pick compound keys whose leading column is high-cardinality and evenly distributed (not monotonically increasing) to avoid hotspotting.
- Use `INTERLEAVE IN PARENT` to colocate child tables and remove cross-shard joins.
AWS Aurora (MySQL)
- Use `aurora_lab_mode=1` to unlock parallel query for fan-out selects.
- Prefer hash-partitioned Aurora Serverless v2 with auto-scaling capacity units.
MongoDB
- Always shard collections before they reach 256 GB.
- Restrict the `balancer` window to hours outside business peaks; set `chunksize` ≤ 128 MB.
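A hedged sketch of applying these MongoDB settings from a Node.js script; `app.orders`, the `customerId` key, and the balancer window times are placeholders, while `shardCollection` and the `config.settings` documents are the standard mechanisms:

```ts
import { MongoClient } from 'mongodb';

interface ConfigSetting {
  _id: string;
  value?: number;
  activeWindow?: { start: string; stop: string };
}

async function applyShardingPolicy(uri: string) {
  const client = new MongoClient(uri);
  await client.connect();
  try {
    const admin = client.db('admin');

    // Shard the collection on a hashed key well before it grows large.
    await admin.command({ enableSharding: 'app' });
    await admin.command({ shardCollection: 'app.orders', key: { customerId: 'hashed' } });

    // Restrict the balancer to an off-peak window and cap chunk size at 128 MB.
    const settings = client.db('config').collection<ConfigSetting>('settings');
    await settings.updateOne(
      { _id: 'balancer' },
      { $set: { activeWindow: { start: '22:00', stop: '06:00' } } },
      { upsert: true }
    );
    await settings.updateOne(
      { _id: 'chunksize' },
      { $set: { value: 128 } },
      { upsert: true }
    );
  } finally {
    await client.close();
  }
}
```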
Testing
- Synthetic data generator must mimic real cardinality; verify `EXPLAIN` shows single-shard routing for ≥ 95 % of queries (see the sketch after this list).
- Chaos test every quarter: randomly kill a shard node and assert < 30 s recovery.
- Load test re-shard procedure; target < 5 % tail latency degradation.
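A sketch of the single-shard routing check for PostgreSQL, assuming partition names follow the `_p_` convention from the DDL rules and that a representative query sample is available:

```ts
import { Pool } from 'pg';

// Count how many distinct partitions a plan touches by walking the
// EXPLAIN (FORMAT JSON) tree and collecting scanned relation names.
function partitionsTouched(plan: any, partitions = new Set<string>()): Set<string> {
  const rel = plan['Relation Name'];
  if (typeof rel === 'string' && /_p_/.test(rel)) partitions.add(rel);
  for (const child of plan['Plans'] ?? []) partitionsTouched(child, partitions);
  return partitions;
}

export async function singleShardRatio(
  pool: Pool,
  sampleQueries: { sql: string; params: unknown[] }[]
): Promise<number> {
  let singleShard = 0;
  for (const q of sampleQueries) {
    const { rows } = await pool.query(`EXPLAIN (FORMAT JSON) ${q.sql}`, q.params);
    // Some drivers return the plan pre-parsed, others as a JSON string.
    const raw = rows[0]['QUERY PLAN'];
    const [{ Plan }] = typeof raw === 'string' ? JSON.parse(raw) : raw;
    if (partitionsTouched(Plan).size <= 1) singleShard += 1;
  }
  return singleShard / sampleQueries.length; // assert >= 0.95 in CI
}
```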
Performance
- Cache shard-mapping metadata in-memory (TTL 60 s) to avoid directory lookups (see the sketch after this list).
- Use client-side parallel scatter-gather only for read-only analytics and cap concurrency to 8 shards/request.
- Monitor: p99 latency, row count per shard, chunk imbalance ratio (< 1.3 ideal).
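A sketch of the shard-map cache; `fetchShardMap()` stands in for whatever directory lookup your deployment uses:

```ts
// Hypothetical shard directory lookup, e.g. a query against a metadata table.
async function fetchShardMap(): Promise<Map<string, number>> {
  // ... load key-range/tenant -> shard_id assignments from the directory ...
  return new Map();
}

const TTL_MS = 60_000;
let cachedMap: Map<string, number> | null = null;
let fetchedAt = 0;

// Returns the in-memory shard map, refreshing it only when the 60 s TTL expires.
export async function shardMap(): Promise<Map<string, number>> {
  const now = Date.now();
  if (!cachedMap || now - fetchedAt > TTL_MS) {
    cachedMap = await fetchShardMap();
    fetchedAt = now;
  }
  return cachedMap;
}
```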
Security
- Encrypt data in transit with mTLS between app ↔ shard (client configuration sketched after this list).
- Use per-shard KMS keys; rotate every 90 days.
- Grant `SELECT` on remote shards only to analytics role.
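For the mTLS rule, a client-side sketch using node-postgres; certificate paths and host names are placeholders, and the shard itself must be configured to require client certificates:

```ts
import { readFileSync } from 'node:fs';
import { Pool } from 'pg';

// Mutual TLS: the client presents its own certificate and verifies the
// shard's certificate against a private CA.
const shardPool = new Pool({
  host: 'shard-00.db.internal',
  database: 'app',
  user: 'app_rw',
  ssl: {
    ca: readFileSync('/etc/ssl/shards/ca.pem'),
    cert: readFileSync('/etc/ssl/shards/client.pem'),
    key: readFileSync('/etc/ssl/shards/client-key.pem'),
    rejectUnauthorized: true,
  },
});
```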
Observability
- Log shard key + latency in structured JSON: `{shard:12, op:"SELECT", ms:37}`.
- Alert if any shard’s disk utilisation > 80 % or latency > 2× cluster median for 10 min.
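A sketch of the log shape and the latency alert predicate; the per-shard p99 samples and the 10-minute hold are assumed to come from your metrics pipeline:

```ts
// Structured per-query log line matching the rule above.
export function logQuery(logger: { info: (fields: object) => void }, shard: number, op: string, ms: number) {
  logger.info({ shard, op, ms });
}

function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Returns the shards whose p99 latency exceeds 2x the cluster median;
// the caller should only page once the condition has held for 10 minutes.
export function latencyAlerts(p99ByShard: Record<number, number>): number[] {
  const clusterMedian = median(Object.values(p99ByShard));
  return Object.entries(p99ByShard)
    .filter(([, p99]) => p99 > 2 * clusterMedian)
    .map(([shard]) => Number(shard));
}
```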
Directory & Repository Conventions
- `database/ddl/` – declarative CREATE/ALTER statements per table with shard metadata.
- `database/shard-router/` – library exposing `route(key) -> shard_id`.
- `ops/scripts/` – automated rebalancing & migration scripts.
Examples
Hashed Shard Key in Postgres 15
```sql
CREATE EXTENSION IF NOT EXISTS pgcrypto;

CREATE TABLE users (
  user_id UUID PRIMARY KEY DEFAULT gen_random_uuid()
) PARTITION BY HASH (user_id); -- a generated column cannot be a partition key,
                               -- so hash the primary key directly

-- One partition per shard slot (repeat for remainders 0..15)
CREATE TABLE users_p_0 PARTITION OF users
  FOR VALUES WITH (MODULUS 16, REMAINDER 0);
```
Application-side Routing (TypeScript)
```ts
// One pg pool per shard; murmurHash64 is any stable 64-bit string hash
const SHARDS = [
  { id: 0, pool: pgPool0 },
  { id: 1, pool: pgPool1 },
  // ...
];

export function route(key: string) {
  // Deterministic: the same key always resolves to the same shard
  const hash = murmurHash64(key) % SHARDS.length;
  return SHARDS[hash];
}
```
Always document shard key rationale and link to `decision-records/ADR-042-shard-key.md`.