Comprehensive rules for writing and maintaining high-performance SQL against very large databases (>1 TB). Covers indexing, partitioning, statistics, execution-plan analysis, monitoring tooling, and engine-specific features (SQL Server 2025, PostgreSQL 16, MySQL 8.0).
You're managing terabyte-scale databases, and every query optimization decision impacts millions of users. One poorly written query can cascade into system-wide performance degradation, turning your elegant application into a sluggish nightmare.
The difference between a 10-second query and a 100-millisecond query isn't just about user experience—it's about infrastructure costs, system stability, and your ability to scale. But here's the reality: most developers are optimizing blind, relying on intuition instead of data-driven decisions.
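Every day, applications worldwide burn through compute resources because of avoidable query mistakes such as SELECT *, functions wrapped around indexed columns, and implicit type conversions.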
These aren't just performance issues—they're architectural debt that compounds over time. A single missing index can cost thousands in cloud compute. A poorly partitioned table can make your entire analytics pipeline unusable.
These Cursor Rules transform your database development approach from reactive troubleshooting to proactive performance engineering. Instead of hunting down slow queries after they've already impacted users, you'll write optimized SQL from the start.
What you get:
This isn't about memorizing SQL syntax—it's about understanding how your database engine actually processes your queries at scale.
-- Typical problematic query
SELECT * FROM Orders O
JOIN Customers C ON O.CustomerID = C.ID
WHERE YEAR(O.OrderDate) = 2024;
-- Results in:
-- - Full table scan on Orders (45 million rows)
-- - Function on indexed column prevents index usage
-- - Implicit conversion between CustomerID types
-- - 30-second execution time in production
-- Optimized with strategic indexing
ALTER TABLE Orders ADD order_year AS (YEAR(OrderDate)) PERSISTED;
CREATE INDEX IX_Orders_OrderYear ON Orders(order_year);
-- Column names follow the Orders/Customers schema used in the problematic query
SELECT o.OrderID, o.Total, c.CustomerName
FROM Orders o
JOIN Customers c ON c.ID = o.CustomerID
WHERE o.order_year = 2024;
-- Results in:
-- - Index seek on IX_Orders_OrderYear
-- - 150ms execution time (200x faster)
-- - Predictable performance as data grows
SQL Server 2025 Optimization:
-- Automatic batch mode processing for large analytics
SELECT
YEAR(order_date) as order_year,
SUM(total_amount) as revenue
FROM orders
WHERE order_date >= '2024-01-01'
GROUP BY YEAR(order_date)
OPTION (USE HINT('ALLOW_BATCH_MODE'));
PostgreSQL 16 Parallel Processing:
-- Declarative partitioning with parallel workers
CREATE TABLE orders_2024 PARTITION OF orders
FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
SET max_parallel_workers_per_gather = 4;
SET work_mem = '256MB';
Challenge: Customer analytics queries timing out during peak traffic
Implementation:
-- Partition orders by month for time-series analysis
CREATE TABLE orders_partitioned (
order_id BIGINT,
customer_id INT,
order_date DATE,
total_amount DECIMAL(10,2)
) PARTITION BY RANGE (order_date);
-- Strategic index for dashboard queries
CREATE INDEX IX_Orders_Customer_Date_Amount
ON orders_partitioned (customer_id, order_date)
INCLUDE (total_amount);
Result: Dashboard load time reduced from 45 seconds to 1.2 seconds
Challenge: Monthly reconciliation queries consuming entire database resources
Implementation:
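-- Persisted computed column holding the month as yyyy-mm
-- FORMAT() is treated as nondeterministic and cannot back a PERSISTED column;
-- CONVERT with an explicit style such as 126 is deterministic.
ALTER TABLE transactions
ADD fiscal_month AS (CONVERT(char(7), transaction_date, 126)) PERSISTED;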
CREATE INDEX IX_Transactions_FiscalMonth
ON transactions (fiscal_month)
INCLUDE (amount, account_id);
Result: Reconciliation runtime reduced from 6 hours to 12 minutes
-- Enable Query Store for plan regression analysis
ALTER DATABASE YourDB SET QUERY_STORE = ON;
ALTER DATABASE YourDB SET QUERY_STORE (
OPERATION_MODE = READ_WRITE,
DATA_FLUSH_INTERVAL_SECONDS = 60,
INTERVAL_LENGTH_MINUTES = 5
);
-- Analyze your most expensive queries (Query Store reports CPU time in microseconds)
SELECT TOP 10
qt.query_sql_text,
SUM(rs.count_executions) AS execution_count,
SUM(rs.avg_cpu_time * rs.count_executions) / SUM(rs.count_executions) AS avg_cpu_time,
SUM(rs.avg_logical_io_reads * rs.count_executions) / SUM(rs.count_executions) AS avg_logical_reads
FROM sys.query_store_query_text qt
JOIN sys.query_store_query q ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan p ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats rs ON p.plan_id = rs.plan_id
GROUP BY q.query_id, qt.query_sql_text
ORDER BY SUM(rs.avg_cpu_time * rs.count_executions) DESC;
SQL Server 2025:
-- Enable Intelligent Query Processing
ALTER DATABASE SCOPED CONFIGURATION
SET BATCH_MODE_ADAPTIVE_JOINS = ON;
ALTER DATABASE SCOPED CONFIGURATION
SET BATCH_MODE_MEMORY_GRANT_FEEDBACK = ON;
PostgreSQL 16:
-- Optimize parallel processing
SET max_parallel_workers_per_gather = 4;
SET parallel_leader_participation = on;
SET work_mem = '128MB';
MySQL 8.0:
-- Enable histogram statistics
ANALYZE TABLE orders UPDATE HISTOGRAM ON customer_id WITH 100 BUCKETS;
SET SESSION optimizer_switch = 'condition_fanout_filter=on';
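-- Create performance baseline
-- Assumes a permanent table dbo.wait_stats_baseline already exists;
-- a #temp table would be dropped as soon as the procedure returns.
CREATE PROCEDURE sp_capture_performance_baseline
AS
BEGIN
-- Capture wait statistics with a timestamp so snapshots can be compared
INSERT INTO dbo.wait_stats_baseline
(captured_at, wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms)
SELECT
SYSUTCDATETIME(),
wait_type,
waiting_tasks_count,
wait_time_ms,
signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN ('CLR_SEMAPHORE', 'LAZYWRITER_SLEEP');
-- Alert on significant changes
-- Implementation depends on your monitoring tool
END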
These Cursor Rules don't just optimize individual queries—they establish a systematic approach to database performance that scales with your application. You'll move from reactive troubleshooting to proactive performance engineering, writing SQL that performs consistently from development through production.
The difference between a database that struggles under load and one that scales effortlessly isn't just about hardware—it's about understanding how your queries actually execute and optimizing accordingly. With these rules, you'll have that understanding built into every query you write.
Stop fighting slow queries. Start engineering fast ones.
You are an expert in:
- ANSI-SQL, T-SQL (SQL Server 2025), PL/pgSQL (PostgreSQL 16), MySQL 8.0
- Indexing, partitioning, statistics maintenance, execution-plan analysis
- Tooling: SSMS, pgAdmin, MySQL Workbench, SQL Diagnostic Manager, PMM, Redgate SQL Monitor, Datadog DBM
Key Principles
- Optimize for I/O first: aim to minimize logical reads and network round-trips.
- Treat execution plans as the single source of truth; never guess.
- Favour set-based operations; avoid RBAR (“row-by-agonizing-row”).
- Write idempotent, side-effect-free queries wherever possible.
- Solve the top 20 % of wait types that cause 80 % of runtime.
Language-Specific Rules (ANSI-SQL + vendor dialects)
- Always list required columns; never use SELECT *.
- Alias every table and prefix columns (e.g., o.order_id) for clarity.
- For filters, keep WHERE predicates sargable (column op constant); move non-sargable expressions (e.g., functions on columns) into computed columns or indexed views.
- Prefer EXISTS over IN for semi-joins when the sub-set is large.
- Use UNION ALL unless deduplication is explicitly required.
- Parameterize any literal that takes more than 5 distinct values to promote plan reuse.
- Keep JOIN order aligned with estimated row counts: smallest first for nested-loop joins.
- Limit CTE chaining depth to 5; materialize long CTEs into temp tables when reused.
- For pagination, use keyset (WHERE id > last_id ORDER BY id ASC LIMIT n) instead of OFFSET n.
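The keyset pagination rule above can be sketched as follows. This is a minimal illustration, assuming Python's built-in sqlite3 module as a stand-in engine; the orders table, its columns, and the fetch_page helper are hypothetical names chosen for the demo, not part of these rules.

```python
import sqlite3


def fetch_page(conn, last_id, page_size):
    """Return up to page_size rows whose id is greater than last_id, in id order."""
    return conn.execute(
        "SELECT id, total FROM orders WHERE id > ? ORDER BY id ASC LIMIT ?",
        (last_id, page_size),
    ).fetchall()


# Hypothetical demo data: ten orders with ids 1..10.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.executemany(
    "INSERT INTO orders (id, total) VALUES (?, ?)",
    [(i, i * 1.5) for i in range(1, 11)],
)

pages = []
last_id = 0
while True:
    page = fetch_page(conn, last_id, 4)
    if not page:
        break
    pages.append([row[0] for row in page])
    last_id = page[-1][0]  # the last key seen becomes the cursor for the next page

print(pages)  # [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10]]
```

Each page is located by seeking on the key rather than by counting and discarding skipped rows, which is why the cost per page stays roughly flat as the cursor moves deeper into the table.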
Error Handling and Validation
- Validate all parameters for range and existence before executing the main query.
- In stored procedures, check @@ERROR (T-SQL) or GET DIAGNOSTICS (PL/pgSQL) immediately after each DML statement; @@ERROR is reset by the next statement.
- Compare estimated vs. actual row counts; if they differ by 10× or more, update statistics or rewrite the predicate.
- Capture deadlock graphs; add appropriate index or rewrite concurrency hotspot.
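The estimated-versus-actual comparison above could be automated along these lines. This is a sketch only: the operator names and row counts are made-up sample values, and extracting them from a real plan (for example from an actual execution plan or EXPLAIN ANALYZE output) is left out.

```python
def misestimate_ratio(estimated_rows, actual_rows):
    """Return how far apart the two row counts are, as a factor >= 1."""
    est = max(estimated_rows, 1)  # clamp to 1 to avoid division by zero
    act = max(actual_rows, 1)
    return max(est / act, act / est)


def needs_attention(estimated_rows, actual_rows, threshold=10.0):
    """True when the estimate is off by the threshold factor or more, in either direction."""
    return misestimate_ratio(estimated_rows, actual_rows) >= threshold


# Hypothetical (operator, estimated rows, actual rows) triples taken from a plan.
operators = [
    ("Index Seek on orders", 120, 95),
    ("Hash Match", 1000, 48000),
    ("Sort", 50000, 4000),
]

flagged = [name for name, est, act in operators if needs_attention(est, act)]
print(flagged)  # ['Hash Match', 'Sort']
```

The ratio is symmetric, so both underestimates and overestimates are flagged.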
Framework-Specific Rules
SQL Server 2025
- Enable Intelligent Query Processing (IQP) but evaluate plan regression in Query Store first.
- Use Horizontal Fusion hints only when plan has ≥3 remote exchanges.
- In DirectQuery models (SSAS), use composite models partitioned by fiscal year.
- Apply OPTIMIZE_FOR_SEQUENTIAL_KEY on heavy insert tables with clustered identity PK.
PostgreSQL 16
- Implement declarative partitioning (RANGE on created_at) and attach indexes locally.
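- Run VACUUM ANALYZE after ≥10 % row churn; set autovacuum_vacuum_scale_factor ≤0.05 for time-series tables.
- Set parallel_leader_participation = on; start work_mem at ≥64 MB, keeping in mind that it applies per sort or hash operation in each process, not per core.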
MySQL 8.0
- Rely on InnoDB primary-key clustering; secondary index leaves already store the PK columns, so keep the PK narrow rather than appending it to secondary indexes.
- Use EXPLAIN FORMAT=JSON; inspect "used_columns" to remove dead weight.
- Enable histogram statistics on high-cardinality, non-indexed columns.
Additional Sections
Testing & Continuous Tuning
- Maintain a reproducible workload script (TPC-DS subset) for regression tests.
- Gate every migration with before/after execution-plan diff using pt-query-digest or SQL Server DMVs.
Performance Patterns
- Strategic Indexing
• Single-column non-clustered indexes on high-selectivity predicates.
• Composite indexes following equality-columns first, then range.
• Refresh index statistics WITH FULLSCAN after large bulk-loads.
- Partitioning
• Use RANGE partitions on temporal data; keep active window (≤3 months) on fast storage.
• Align partition key with most-common filter to guarantee partition elimination.
- Parallelism
• Max DOP = (#cores ÷ 2) for OLTP; #cores for DSS.
• For long-running reports, add OPTION (MAXDOP 8) to avoid worker starvation.
- Query Hints
• Use hints sparingly and document the symptom, not the solution.
• Re-evaluate every hint during the quarterly review and remove those no longer needed; prefer schema changes.
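The "equality columns first, then range" guideline for composite indexes can be observed in a query plan. The sketch below assumes SQLite (via Python's sqlite3) purely as a convenient stand-in; SQL Server, PostgreSQL, and MySQL report plans differently, and the table and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two identical hypothetical tables that differ only in index column order.
for name, cols in (
    ("orders_a", "customer_id, order_date"),  # equality column first
    ("orders_b", "order_date, customer_id"),  # range column first
):
    conn.execute(
        f"CREATE TABLE {name} (order_id INTEGER PRIMARY KEY, "
        "customer_id INTEGER, order_date TEXT, total REAL)"
    )
    conn.execute(f"CREATE INDEX ix_{name} ON {name} ({cols})")


def plan(table):
    """Return SQLite's plan detail for an equality-plus-range filter on the table."""
    sql = (
        f"EXPLAIN QUERY PLAN SELECT total FROM {table} "
        "WHERE customer_id = 42 AND order_date >= '2024-01-01'"
    )
    return " | ".join(row[3] for row in conn.execute(sql))


equality_first = plan("orders_a")
range_first = plan("orders_b")
print(equality_first)
print(range_first)
```

With the equality column leading, both predicates take part in the index seek. With the range column leading, the customer_id filter cannot narrow the seek and has to be applied to every row in the date range.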
Security
- Enforce least-privilege roles; grant EXECUTE on stored procs, not table SELECT.
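- Sanitize dynamic SQL: quote identifiers with QUOTENAME (T-SQL) or format() with %I / %L (PostgreSQL), and pass values as parameters (e.g., via sp_executesql) to mitigate injection.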
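On the application side, the same goal is usually reached with bound parameters for values and an allow-list for identifiers. Note that this is a complementary technique, not the QUOTENAME / format() approach named above. The sketch assumes Python's sqlite3 module, and the table, column names, and helper function are hypothetical.

```python
import sqlite3

# Identifiers cannot be bound as parameters, so restrict them to a known set.
ALLOWED_SORT_COLUMNS = {"order_id", "total"}


def orders_for_customer(conn, customer_name, sort_column="order_id"):
    """Fetch a customer's orders; values are bound, identifiers are allow-listed."""
    if sort_column not in ALLOWED_SORT_COLUMNS:
        raise ValueError(f"unsupported sort column: {sort_column!r}")
    sql = (
        "SELECT order_id, total FROM orders "
        f"WHERE customer_name = ? ORDER BY {sort_column}"
    )
    return conn.execute(sql, (customer_name,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_name TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "alice", 30.0), (2, "alice", 10.0), (3, "bob", 20.0)],
)

by_total = orders_for_customer(conn, "alice", sort_column="total")
injected = orders_for_customer(conn, "alice' OR '1'='1")  # treated as a plain string
try:
    orders_for_customer(conn, "alice", sort_column="total; DROP TABLE orders")
    rejected = False
except ValueError:
    rejected = True

print(by_total, injected, rejected)
```

The injection attempt in the value is compared literally and matches no rows, while the attempt in the identifier is rejected before any SQL is built.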
Monitoring & Tooling
- Capture wait stats every 1 min; alert when single wait type >50 % of total over 10 min.
- Track top 10 most-expensive plans by total worker_time daily.
- Annotate deployments in monitoring dashboards to correlate spikes.
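The wait-stats alert described above might be evaluated like this. The sketch assumes two cumulative snapshots of wait_time_ms keyed by wait type (as sys.dm_os_wait_stats would provide); the sample numbers and the dominant_waits helper are illustrative, and how snapshots are collected and alerts delivered depends on your monitoring tool.

```python
def dominant_waits(previous, current, share_threshold=0.5):
    """Return (wait_type, share) pairs whose share of new wait time exceeds the threshold.

    Both arguments map wait type to cumulative wait_time_ms, so the time spent
    inside the window is the difference between the two snapshots.
    """
    deltas = {w: current[w] - previous.get(w, 0) for w in current}
    deltas = {w: d for w, d in deltas.items() if d > 0}
    total = sum(deltas.values())
    if total == 0:
        return []
    return [(w, d / total) for w, d in deltas.items() if d / total > share_threshold]


# Hypothetical snapshots taken at the start and end of a 10-minute window.
window_start = {"PAGEIOLATCH_SH": 1000, "CXPACKET": 500, "WRITELOG": 200}
window_end = {"PAGEIOLATCH_SH": 7000, "CXPACKET": 1500, "WRITELOG": 1200}

alerts = dominant_waits(window_start, window_end)
print(alerts)  # [('PAGEIOLATCH_SH', 0.75)]
```

Working on deltas rather than raw counters matters because the counters accumulate from instance start, so long-past activity would otherwise mask a new bottleneck.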
Common Pitfalls & Anti-Patterns
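- Cartesian joins due to a missing join predicate → use explicit JOIN ... ON syntax instead of comma-separated tables, check plans for join operators without a predicate, and set row-count alarms.
- Functions on indexed columns (e.g., WHERE YEAR(order_date)=2024) → rewrite as a range predicate (order_date >= '2024-01-01' AND order_date < '2025-01-01') or create a persisted computed column and index it.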
- Implicit conversions (varchar to nvarchar) → ensure types match exactly; check CONVERT_IMPLICIT in plan.
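The effect of wrapping an indexed column in a function can be seen directly in a plan. The sketch below uses SQLite through Python's sqlite3 module only because it is runnable anywhere; strftime stands in for YEAR(), the table and index names are hypothetical, and the plan wording of SQL Server, PostgreSQL, or MySQL will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_date TEXT, total REAL)"
)
conn.execute("CREATE INDEX ix_orders_order_date ON orders (order_date)")


def plan(sql):
    """Return the plan detail strings SQLite reports for a statement."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


# Function applied to the indexed column: the index cannot be used for a seek.
non_sargable = plan(
    "SELECT order_id, total FROM orders WHERE strftime('%Y', order_date) = '2024'"
)

# Equivalent half-open range on the bare column: the index can be searched.
sargable = plan(
    "SELECT order_id, total FROM orders "
    "WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01'"
)

print(non_sargable)
print(sargable)
```

The first plan reports a scan and the second a search on ix_orders_order_date, which is the same distinction the engines covered here draw between an index scan and an index seek.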
Examples
-- Bad
SELECT * FROM Orders O JOIN Customers C ON O.CustomerID = C.ID WHERE YEAR(O.OrderDate) = 2024;
-- Good
ALTER TABLE Orders ADD order_year AS (YEAR(OrderDate)) PERSISTED;
CREATE INDEX IX_Orders_OrderYear ON Orders(order_year);
-- Column names follow the Orders/Customers schema used in the bad example
SELECT o.OrderID, o.Total, c.CustomerName
FROM Orders o
JOIN Customers c ON c.ID = o.CustomerID
WHERE o.order_year = 2024;
Follow these rules rigorously to sustain sub-second query performance on multi-terabyte databases.