Opinionated guidelines for designing, implementing, and operating data-partitioning (sharding, range/hash partitioning, etc.) across SQL, NoSQL, and big-data platforms.
Your application is buckling under data growth. Query times are climbing. Specific tables are becoming bottlenecks. Your monitoring dashboards are lighting up red, and you're manually scaling resources just to keep up with basic operations.
Most developers hit the same wall: monolithic data structures that can't scale with real-world access patterns. The resulting slow queries, write hotspots, and manual firefighting aren't just performance problems; they're architectural debt that compounds over time.
These Cursor Rules implement a battle-tested partitioning strategy that transforms how your data scales. Instead of fighting growth, you'll design systems that thrive on it.
What makes this different:
Transform query latencies from seconds to milliseconds through intelligent partition pruning:
```sql
-- Before: Full table scan across 500M rows
SELECT * FROM events WHERE event_time >= '2024-01-01';

-- After: Partition pruning hits only relevant monthly partitions
-- Query time: 2.3s → 45ms (98% improvement)
```
Distribute write load across partitions to eliminate bottlenecks:
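One way to get there on PostgreSQL is hash partitioning on a stable, high-cardinality key, so inserts spread evenly across partitions instead of piling onto one. A minimal sketch (table and column names are hypothetical):

```sql
CREATE TABLE clicks (
    user_id    bigint      NOT NULL,
    clicked_at timestamptz NOT NULL,
    url        text        NOT NULL
) PARTITION BY HASH (user_id);

-- Four hash partitions; rows are routed by hash(user_id) modulo 4.
CREATE TABLE clicks_h00 PARTITION OF clicks FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE clicks_h01 PARTITION OF clicks FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE clicks_h02 PARTITION OF clicks FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE clicks_h03 PARTITION OF clicks FOR VALUES WITH (MODULUS 4, REMAINDER 3);
```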
Partition failures become isolated incidents instead of system-wide outages:
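For example, on PostgreSQL a suspect partition can be detached, repaired, and reattached while the parent table keeps serving the remaining partitions. A sketch (the partition name is hypothetical; `CONCURRENTLY` requires PostgreSQL 14+):

```sql
-- Take the damaged partition out of the table without blocking the healthy ones.
ALTER TABLE events DETACH PARTITION events_p202401 CONCURRENTLY;

-- ...repair or restore events_p202401 from backup, then reattach it:
ALTER TABLE events ATTACH PARTITION events_p202401
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
```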
Before: Your events table hits 100M rows and every query crawls:
```sql
-- 45-second query on growing table
SELECT COUNT(*) FROM events
WHERE created_at >= now() - interval '7 days';
```
After: Monthly partitioning with automated pruning:
```sql
-- Same query, 200ms response time
-- Hits only current + previous month partitions
-- Automatic partition creation via cron job
```
Implementation: The rules provide ready-to-use PostgreSQL partition templates that handle monthly rotation automatically.
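As one illustration of that automation (not the exact template shipped with the rules), a helper function can create next month's partition ahead of time and be scheduled with the pg_cron extension, assuming pg_cron is installed:

```sql
-- Create next month's partition for events if it does not exist yet.
CREATE OR REPLACE FUNCTION create_next_month_partition() RETURNS void AS $$
DECLARE
    start_month date := date_trunc('month', now() + interval '1 month');
BEGIN
    EXECUTE format(
        'CREATE TABLE IF NOT EXISTS events_p%s PARTITION OF events FOR VALUES FROM (%L) TO (%L);',
        to_char(start_month, 'YYYYMM'),
        start_month,
        start_month + interval '1 month');
END;
$$ LANGUAGE plpgsql;

-- Run on the 25th of every month at 03:00 so the partition exists before it is needed.
SELECT cron.schedule('create-next-month-partition', '0 3 25 * *',
                     'SELECT create_next_month_partition()');
```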
Before: Tenant isolation through WHERE clauses kills performance at scale:
```sql
-- Every query scans all tenant data
SELECT * FROM orders WHERE tenant_id = 'acme-corp'
  AND created_at > '2024-01-01';
```
After: Tenant-based partitioning with predictable performance:
```sql
-- Direct partition routing, consistent sub-50ms queries
-- Each tenant gets dedicated partition(s)
-- No cross-tenant data leakage possible
```
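One way to realise this on PostgreSQL is LIST partitioning on the tenant key, so queries that filter on `tenant_id` route straight to a single partition. A minimal sketch (table, column, and tenant names are hypothetical):

```sql
CREATE TABLE orders (
    tenant_id  text        NOT NULL,
    order_id   bigint      NOT NULL,
    created_at timestamptz NOT NULL,
    total      numeric     NOT NULL,
    PRIMARY KEY (tenant_id, order_id)
) PARTITION BY LIST (tenant_id);

-- Large tenants get dedicated partitions; a DEFAULT partition catches the long tail.
CREATE TABLE orders_acme_corp PARTITION OF orders FOR VALUES IN ('acme-corp');
CREATE TABLE orders_default   PARTITION OF orders DEFAULT;
```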
Before: Spark jobs spend 70% of time on data shuffling:
```scala
// Massive shuffle operation across all partitions
df.groupBy("user_id").agg(sum("revenue"))
```
After: Pre-partitioned data eliminates shuffle overhead:
```scala
// Optimized partitioning strategy reduces job time 4x
df.repartitionByRange("user_id")
  .write.partitionBy("date", "region")
```
Copy the partitioning rules into your .cursor-rules file in your project root.
Run this diagnostic query to identify your dominant patterns:
```sql
-- PostgreSQL: Find your hot query patterns
SELECT query, calls, mean_exec_time
FROM pg_stat_statements
ORDER BY calls DESC LIMIT 10;
```
The rules provide decision trees for each platform:
- Time-based data? → Range partitioning on timestamp columns
- Even distribution needed? → Hash partitioning on stable keys
- Multi-tenant architecture? → Composite partitioning (tenant + time)
Implement the built-in health checks:
```sql
-- Automated skew detection (alert when skew_ratio > 2)
WITH sizes AS (
    SELECT c.relname AS partition_name,
           pg_total_relation_size(c.oid) AS bytes
    FROM pg_inherits i
    JOIN pg_class c ON c.oid = i.inhrelid
    WHERE i.inhparent = 'orders'::regclass
)
SELECT max(bytes)::numeric / avg(bytes) AS skew_ratio
FROM sizes;
```
Use the synthetic workload generators to validate your strategy:
```bash
# Chaos testing script included in rules
./test-partition-failover.sh --kill-random-leader
# Asserts: <30s recovery, <5% error rate
```
Teams using these partitioning patterns report gains on all three fronts above: faster reads through pruning, evenly distributed write load, and failures contained to individual partitions.
Handle complex access patterns with multi-level strategies:
```sql
-- Partition by date, sub-partition by tenant (MySQL/Oracle-style SUBPARTITION syntax)
PARTITION BY RANGE (created_at)
SUBPARTITION BY HASH (tenant_id);
```
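PostgreSQL has no `SUBPARTITION` keyword; the equivalent is to declare each range partition as itself partitioned by hash. A sketch of the same date-plus-tenant layout (table and partition names are hypothetical):

```sql
CREATE TABLE activity (
    created_at timestamptz NOT NULL,
    tenant_id  uuid        NOT NULL,
    payload    jsonb       NOT NULL
) PARTITION BY RANGE (created_at);

-- Each monthly range partition is itself hash-partitioned by tenant.
CREATE TABLE activity_p202401 PARTITION OF activity
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01')
    PARTITION BY HASH (tenant_id);

CREATE TABLE activity_p202401_h00 PARTITION OF activity_p202401
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);
-- ...repeat for remainders 1 through 3.
```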
Automated partition splitting when hotspots emerge:
```javascript
// MongoDB: pre-split a chunk boundary ahead of heavy ingest to prevent hotspots
sh.splitAt("events.logs", { "user_id": "user_50000" })
```
Maintain partitioning logic across your entire stack:
```python
# Spark + BigQuery alignment
df.write.partitionBy("date", "region") \
    .mode("append") \
    .format("bigquery") \
    .save("analytics.events")
```
You're not just implementing partitioning—you're architecting for the next phase of your application's growth. These rules give you the playbook that scales from startup to enterprise without the typical growing pains.
Start with your biggest bottleneck table. Apply the appropriate partitioning strategy. Watch your performance problems disappear while your system gains the headroom to handle whatever growth comes next.
You are an expert in:
- SQL RDBMS: PostgreSQL ≥14, MySQL ≥8, Google Spanner
- Distributed NoSQL: MongoDB ≥6, AWS DynamoDB, Azure Cosmos DB
- Big-data engines & warehouses: Apache Spark 3+, Google BigQuery, Snowflake
Key Principles
- Model partitioning after the dominant data-access pattern; optimise for the 95 % path, not edge cases.
- Start with the simplest viable strategy (single-level range/hash) and evolve via split/merge when data growth or hotspots appear.
- Favour even data distribution and minimal cross-partition traffic; every remote hop adds latency.
- Partition keys must be immutable, deterministic, and appear in the majority of WHERE / JOIN / ROUTING clauses.
- Treat partitions as independent failure domains; isolate faults, upgrades, and backups per partition whenever the platform permits.
- Automate monitoring & rebalancing: human-free partition management is the only sustainable model at scale.
SQL (PostgreSQL / MySQL / Spanner)
- Use native partition syntax (e.g. PostgreSQL: `PARTITION BY RANGE`, MySQL: `PARTITION BY HASH`).
- Naming convention: `<table>_p<YYYYMM>` for range, `<table>_h<#>` for hash (two-digit zero-padded).
- Keep partition count per table ≤10 000; beyond this, query-planner overhead dominates.
- Always declare `PRIMARY KEY` or `UNIQUE` that includes the partition key; prevents cross-partition duplicates.
- Apply CHECK constraints mirroring the partition boundary; enables pruning in older planners.
- Never UPDATE a column that participates in the partition key; use INSERT + DELETE pattern instead.
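The last rule has a simple shape in practice: re-insert the row under its new key and delete the old row in one transaction, so other sessions never observe a half-moved row. A sketch against the `events` table defined at the end of this document (key values are hypothetical):

```sql
BEGIN;

-- Re-insert the row under the corrected partition key value...
INSERT INTO events (event_time, tenant_id, payload)
SELECT '2024-02-01 00:00:00+00', tenant_id, payload
FROM events
WHERE event_time = '2024-01-31 23:59:59+00'
  AND tenant_id  = '00000000-0000-0000-0000-000000000001';

-- ...then remove the old row; both changes become visible atomically at COMMIT.
DELETE FROM events
WHERE event_time = '2024-01-31 23:59:59+00'
  AND tenant_id  = '00000000-0000-0000-0000-000000000001';

COMMIT;
```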
NoSQL (MongoDB)
- Choose shard key types:
• Range key → targeted range queries, risk of hotspot; mitigate with prefix hash (e.g. `md5(user_id)+ts`).
• Hashed key → uniform writes at cost of scatter-gather reads.
- Pre-split chunks when ingesting >50 GB/hour to avoid jumbo chunks.
- Enable the balancer only during write-light windows; set `maxChunkSize` ≤ 1 GB.
- Monitor `chunksImbalance` (< 10 %) and `moveChunk.totalTimeMillis` (< 5 min) per shard.
AWS DynamoDB / Cosmos DB
- Design `partitionKey` + `sortKey` to avoid hot partitions: single partition should carry < 1 000 WCU/RCU on average.
- Use adaptive capacity metrics: alarm when `ConsumedReadCapacityUnits > 0.8 * Provisioned` for any keyspace.
- For unpredictable traffic, enable on-demand autoscaling and set `maxCapacityMultiplier` ≤ 4 to cap cost spikes.
Apache Spark 3+
- Repartition after heavy filters: `df.repartitionByRange("date")` for range or `hashPartitioning(cols, n)` for hash.
- Keep partition file size 100–512 MiB in Parquet/ORC to balance parallelism vs. overhead.
- Avoid `coalesce(1)` in production pipelines; instead, `orderBy(...).write.partitionBy("dt")`.
- Use `spark.sql.files.maxPartitionBytes=134217728` (128 MiB) and `spark.sql.shuffle.partitions` = `totalInputSize / 256 MiB`.
BigQuery / Snowflake
- Prefer time-based partitioning (ingestion time or a timestamp/date column); clustering on secondary fields reduces post-read shuffle (see the DDL sketch after this list).
- Do not exceed 2 000 partitions per table in BigQuery; queries that touch > 2 000 partitions incur slot penalties.
- Materialise heavily accessed partition subsets into clustered materialised views.
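A minimal BigQuery DDL sketch of the partition-plus-cluster layout above (dataset, table, and column names are hypothetical):

```sql
CREATE TABLE analytics.events_bq (
    event_time TIMESTAMP NOT NULL,
    region     STRING,
    user_id    STRING,
    revenue    NUMERIC
)
PARTITION BY DATE(event_time)
CLUSTER BY region, user_id
OPTIONS (partition_expiration_days = 395);  -- keep roughly 13 months of daily partitions
```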
Error Handling and Validation
- Edge-case first: validate partition key presence, type, and nullability at the entry point of the ingest pipeline; reject invalid records early.
- During rebalance operations:
• Employ exponential back-off and idempotent retries (`moveChunk`, `splitPartition`, `ALTER TABLE ... DETACH PARTITION`).
• Throttle: limit concurrent moves to `ceil(shardCount / 4)` to minimise write-latency spike.
- Health-check script (PostgreSQL):
```sql
WITH sizes AS (
    SELECT c.relname AS partition_name,
           pg_total_relation_size(c.oid) AS bytes
    FROM pg_inherits i
    JOIN pg_class c ON c.oid = i.inhrelid
    WHERE i.inhparent = 'orders'::regclass
)
SELECT max(bytes)::numeric / avg(bytes) AS skew_ratio
FROM sizes;
```
Alert if `skew_ratio > 2`.
Testing
- Use synthetic workload generator mirroring production key distribution; verify 99th-percentile latency improvement ≥ 20 % after any partition-scheme change.
- Chaos test: randomly kill partition leaders; assert automatic fail-over ≤ 30 s and < 5 % error rate.
- Include migration tests: `oldKey -> newKey` back-fill script must be idempotent and resumable.
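A sketch of what idempotent and resumable can mean in SQL: back-fill in bounded key ranges and make each batch safe to re-run (table names and the batch window are hypothetical):

```sql
-- One batch of the oldKey -> newKey back-fill; re-running it is a no-op because
-- duplicates are skipped, and the window bounds make the job resumable per batch.
INSERT INTO events_new (event_time, tenant_id, payload)
SELECT event_time, tenant_id, payload
FROM events_old
WHERE event_time >= '2024-01-01' AND event_time < '2024-01-08'
ON CONFLICT DO NOTHING;
```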
Performance & Observability
- Metrics to collect per partition: read/write IOPS, p99 latency, disk utilisation, partition size, hot-key frequency.
- Dashboard heat-map: partition vs. ops / s; red ≥ 80 % of throttle limit.
- Verify partition pruning in query plans (e.g., PostgreSQL `EXPLAIN` on the partitioned table), as sketched below.
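A sketch of that check against the `events` example at the end of this document: only the partitions that can match the predicate should appear in the plan, and run-time pruning additionally reports `Subplans Removed`.

```sql
EXPLAIN (ANALYZE, COSTS OFF)
SELECT count(*)
FROM events
WHERE event_time >= now() - interval '7 days';
-- Expect scans on only the current and previous monthly partitions,
-- plus a "Subplans Removed: N" line when pruning happens at run time.
```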
Security & Compliance
- Encrypt per-partition backups; store encryption keys in KMS with partition-level access policies.
- Redact sensitive fields before cross-region partition move unless target complies with same data residency standard.
Documentation Checklist
☑ Diagram of partition boundaries and routing logic
☑ Rotation runbook covering split, merge, rebalance, and rollback
☑ SLA matrix: expected RPS, storage per partition, migration windows
Common Pitfalls
- Choosing monotonically increasing keys (e.g., timestamp) without range-splitting ⇒ write hotspot.
- Forgetting to update partitioning logic in ORMs ⇒ cross-partition full table scans.
- Oversharding early ⇒ operational overhead without performance benefit.
Example: PostgreSQL Time-range Partition
```sql
CREATE TABLE events (
    event_time timestamptz NOT NULL,
    tenant_id  uuid        NOT NULL,
    payload    jsonb       NOT NULL,
    PRIMARY KEY (event_time, tenant_id)
) PARTITION BY RANGE (event_time);

-- Monthly partitions for the current year
DO $$
DECLARE
    d date := date_trunc('month', now());
BEGIN
    FOR i IN 0..11 LOOP
        EXECUTE format('CREATE TABLE IF NOT EXISTS events_p%s PARTITION OF events
                        FOR VALUES FROM (%L) TO (%L);',
                       to_char(d + (i||' month')::interval, 'YYYYMM'),
                       d + (i||' month')::interval,
                       d + ((i+1)||' month')::interval);
    END LOOP;
END $$;
```