Opinionated Rules for designing, coding and operating highly-scalable, cloud-native back-end services with TypeScript, Node.js, Docker and Kubernetes.
Building scalable applications shouldn't feel like walking through a minefield. Yet most Node.js developers spend weeks debugging race conditions, wrestling with Docker configurations, and scrambling to add monitoring after their service crashes in production. You know the drill—what works perfectly on localhost becomes a nightmare at scale.
Your current development setup is optimized for building features, not for building systems that survive production traffic.
The result? You spend more time firefighting production issues than shipping features. Your velocity drops as complexity grows, and scaling becomes a technical debt nightmare instead of a growth opportunity.
These Cursor Rules transform your development workflow into a systematic approach for building production-ready, scalable Node.js services. Instead of learning harsh lessons in production, you'll build fault-tolerance, observability, and scalability patterns directly into your daily development process.
What makes this different? These aren't just coding standards—they're a complete system for building services that can handle real-world complexity from day one. Every line of code you write will follow battle-tested patterns used by teams running services at massive scale.
```typescript
// Before: Mixed concerns, no fault tolerance, poor observability
app.get('/users/:id', async (req, res) => {
  const user = await db.query('SELECT * FROM users WHERE id = ?', [req.params.id]);
  res.json(user);
});
```
Problems: No input validation, direct database coupling, no error handling, no metrics, no caching.
```typescript
// After: Clean architecture, fault tolerance, full observability
import { Controller, Get, Headers, HttpException, Param } from '@nestjs/common';

// UserService, PrometheusService and GetUserParams are project types.
@Controller('/users')
export class UserController {
  constructor(
    private userService: UserService,
    private metrics: PrometheusService
  ) {}

  @Get('/:id')
  async getUser(@Param() params: GetUserParams, @Headers('x-request-id') requestId: string) {
    const timer = this.metrics.httpRequestDuration.startTimer();
    try {
      const user = await this.userService.findById(params.id, { requestId });
      timer({ method: 'GET', route: '/users/:id', status_code: '200' });
      return { data: user };
    } catch (error) {
      // `error` is `unknown` under strict mode, so derive the status explicitly.
      const status = error instanceof HttpException ? error.getStatus() : 500;
      timer({ method: 'GET', route: '/users/:id', status_code: String(status) });
      throw error;
    }
  }
}
```
Immediate benefits: Type-safe parameters, automatic metrics collection, structured error handling, request tracing, dependency injection for testability.
```typescript
// Before: No connection pooling, no retry logic, no monitoring
const user = await db.query('SELECT * FROM users WHERE id = ?', [userId]);
```
Problems: Single point of failure, no connection management, database errors crash the service.
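```typescript
// After: Connection pooling, circuit breaker, retry logic, metrics
import { Injectable } from '@nestjs/common';

// DatabasePool, CircuitBreaker, Logger, RequestContext, User and UserSchema are
// project types. CircuitBreaker is assumed to be a wrapper whose fire() runs the
// callback it receives; Opossum's own fire() instead forwards its arguments to
// the action passed to the constructor.
@Injectable()
export class UserRepository {
  constructor(
    private db: DatabasePool,
    private circuitBreaker: CircuitBreaker,
    private logger: Logger
  ) {}

  async findById(id: string, context: RequestContext): Promise<User | null> {
    return this.circuitBreaker.fire(async () => {
      const result = await this.db.query(
        'SELECT * FROM users WHERE id = $1',
        [id],
        { timeout: 5000, retries: 3 } // per-query timeout (ms) and retry budget
      );

      this.logger.info('User query executed', {
        userId: id,
        requestId: context.requestId,
        executionTime: result.duration
      });

      // Validate the row shape before it leaves the infrastructure layer.
      return result.rows[0] ? UserSchema.parse(result.rows[0]) : null;
    });
  }
}
```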
Immediate benefits: Automatic connection management, fault tolerance, performance monitoring, structured logging, type safety.
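The repository above hands a callback to `circuitBreaker.fire(...)`. As a rough illustration of what such a callback-style breaker could look like, here is a minimal sketch. The class name, the thresholds and the injectable clock are all assumptions made for this example; a production service would more likely rely on a maintained library.

```typescript
// Minimal callback-style circuit breaker. Every name and default below is an
// illustrative assumption, not part of the rules or of any library.
type BreakerState = 'closed' | 'open' | 'half-open';

class CircuitOpenError extends Error {
  constructor() {
    super('Circuit is open; call rejected without touching the dependency');
    this.name = 'CircuitOpenError';
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

class SimpleCircuitBreaker {
  private state: BreakerState = 'closed';
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;
  private readonly now: () => number;

  constructor(failureThreshold = 3, resetTimeoutMs = 10_000, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
  }

  getState(): BreakerState {
    return this.state;
  }

  async fire<T>(action: () => Promise<T>): Promise<T> {
    if (this.state === 'open') {
      // Fail fast until the reset timeout has elapsed.
      if (this.now() - this.openedAt < this.resetTimeoutMs) {
        throw new CircuitOpenError();
      }
      this.state = 'half-open'; // let one trial call through
    }
    try {
      const result = await action();
      this.failures = 0;
      this.state = 'closed';
      return result;
    } catch (error) {
      this.failures += 1;
      // A failed trial call, or too many consecutive failures, opens the circuit.
      if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
        this.state = 'open';
        this.openedAt = this.now();
      }
      throw error;
    }
  }
}
```

The clock is injected so the open and half-open transitions can be exercised in tests without real waiting.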
```typescript
// Before: Mocked everything, no integration testing, brittle tests
test('should return user', async () => {
  const mockDb = { query: jest.fn().mockResolvedValue({ rows: [{ id: '1', name: 'John' }] }) };
  const result = await getUserById('1', mockDb);
  expect(result.name).toBe('John');
});
```
Problems: Tests don't catch integration issues, database logic isn't tested, contract changes break silently.
```typescript
// After: Real database testing, contract validation, comprehensive coverage
describe('UserService Integration', () => {
  let app: TestApp;
  let db: TestDatabase;

  beforeAll(async () => {
    db = await createTestDatabase();
    app = await createTestApp({ database: db });
  });

  test('should return user with proper caching', async () => {
    // Arrange: Real data in real database
    const user = await db.users.create({ name: 'John', email: 'john@example.com' });

    // Act: Hit the actual endpoint
    const response = await app.request()
      .get(`/users/${user.id}`)
      .expect(200);

    // Assert: Validate response contract and side effects.
    // toMatchSchema is not built into Jest; register it via expect.extend.
    expect(response.body).toMatchSchema(UserResponseSchema);
    expect(response.body.data.name).toBe('John');

    // Verify caching worked
    const cacheHit = await app.redis.get(`user:${user.id}`);
    expect(cacheHit).toBeTruthy();
  });
});
```
Immediate benefits: Tests catch real integration issues, database migrations are tested, caching behavior is verified, API contracts are validated.
Add the rules to a `.cursor-rules` file in your project root.
Create a new service using the recommended folder structure:
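```bash
mkdir my-scalable-service && cd my-scalable-service
npm init -y
npm install express
# Build-time tooling belongs in devDependencies; ts-node-dev backs the "dev" script
npm install --save-dev typescript ts-node ts-node-dev @types/express @types/node
mkdir -p src/{domain,application,infrastructure/{http,db}}
```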
Cursor will now suggest the complete service architecture based on the rules.
Create docker-compose.yml for local development:
```yaml
version: '3.8'
services:
  postgres:
    image: postgres:15
    environment:
      POSTGRES_DB: myservice
      POSTGRES_USER: dev
      POSTGRES_PASSWORD: dev
    ports:
      - "5432:5432"
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
  prometheus:
    image: prom/prometheus
    ports:
      - "9090:9090"
```
Set up your package.json scripts:
```json
{
  "scripts": {
    "dev": "ts-node-dev src/index.ts",
    "build": "tsc",
    "test": "jest",
    "test:integration": "jest --config jest.integration.config.js",
    "lint": "eslint \"src/**/*.ts\"",
    "security": "npm audit && snyk test",
    "docker:build": "docker build -t my-service .",
    "k8s:deploy": "helm upgrade --install my-service ./charts/my-service"
  }
}
```
Cursor will automatically suggest the proper configurations for each tool based on the rules.
Run through the complete development workflow:
```bash
# Start dependencies
docker-compose up -d

# Run tests (should pass with generated examples)
npm test

# Start development server
npm run dev

# Check health endpoints
curl http://localhost:3000/health/ready
curl http://localhost:3000/health/live

# View metrics
curl http://localhost:3000/metrics
```
If everything works, you now have a production-ready development environment.
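The two health endpoints probed in the workflow serve different purposes: liveness only says the process is running, while readiness says its dependencies are reachable. A minimal, framework-free sketch of that split follows. The function names, the result shape and the choice of 503 for "not ready" are assumptions made for this example.

```typescript
// Hypothetical health-check helpers; wire them to /health/live and /health/ready
// in whichever HTTP framework the service uses.
type DependencyCheck = () => Promise<boolean>;

interface HealthResult {
  status: 200 | 503;
  body: { status: 'ok' | 'unavailable'; checks: Record<string, boolean> };
}

// Liveness: the process is up and able to answer. No dependency calls here,
// so a slow database never causes the orchestrator to restart the pod.
function liveness(): HealthResult {
  return { status: 200, body: { status: 'ok', checks: {} } };
}

// Readiness: every dependency check must pass before traffic is accepted.
async function readiness(checks: Record<string, DependencyCheck>): Promise<HealthResult> {
  const results: Record<string, boolean> = {};
  await Promise.all(
    Object.keys(checks).map(async (name) => {
      const check = checks[name];
      try {
        results[name] = check ? await check() : false;
      } catch {
        results[name] = false; // a throwing check counts as a failed check
      }
    }),
  );
  const ready = Object.keys(results).every((name) => results[name] === true);
  return {
    status: ready ? 200 : 503,
    body: { status: ready ? 'ok' : 'unavailable', checks: results },
  };
}
```

In practice the checks would be cheap calls such as a `SELECT 1` against Postgres or a `PING` against Redis, each with a short timeout.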
These rules don't just improve your code—they transform how you think about building scalable systems. Stop learning scalability lessons the hard way in production. Start building services that are ready for real-world complexity from the first commit.
Your next microservice will be production-ready, horizontally scalable, and fully observable. The question isn't whether you can afford to implement these patterns—it's whether you can afford not to.
You are an expert in TypeScript • Node.js • Docker • Kubernetes • Serverless (AWS Lambda / Azure Functions) • Microservices • Redis • PostgreSQL • Prometheus • GitHub Actions.
Key Principles
- Design for elasticity, horizontal scaling and fault-tolerance from day 0.
- Prefer small, independently deployable microservices with clear bounded contexts.
- API-first thinking: versioned, self-describing contracts (OpenAPI ≥3.0).
- Stateless services; persist state in external datastores or caches.
- Code must be simple, modular and observable by default (metrics, logs, traces).
- Infrastructure is code; every change is tracked, reviewed and repeatable.
TypeScript
- Enable "strict", "noUncheckedIndexedAccess", "exactOptionalPropertyTypes".
- Use `interface` for public contracts; use `type` for internal aliases & unions.
- Functions over classes unless polymorphism is unavoidable.
- Pure functions in `/lib` or `/domain`; side-effects only in `/infrastructure`.
- Folder naming: `kebab-case`; file naming: `snake_case.ts` for pure util, `PascalCase.tsx` for React, `*.controller.ts` for HTTP handlers.
- Never commit `any`; if migration requires it, wrap in `// TODO(ts-strict): remove`.
- Prefer ES modules and top-level `await` where supported (Node >=18 LTS).
Error Handling & Validation
- Validate all external input at the edge with Zod or Yup; reject 4xx early.
- Use typed error classes (`DomainError`, `InfrastructureError`, `ValidationError`).
- For remote calls: retry (≤3) with exponential backoff + jitter; break after 30 s.
- Wrap critical dependencies (DB, Redis, HTTP) in circuit breakers (Opossum).
- `async/await` only; never mix with `.then()` chains. Always `await` Promises.
- Log level convention: `error › warn › info › debug › trace`. Never log PII.
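The remote-call rule above (at most 3 retries, exponential backoff with jitter, give up after 30 s) can be sketched roughly as follows. The function and option names are hypothetical, and the "full jitter" variant, a uniform random delay below the exponential ceiling, is one choice among several.

```typescript
// Hypothetical retry helper; option names and the jitter variant are assumptions.
interface RetryOptions {
  maxRetries: number;   // retries after the first attempt (rule: at most 3)
  baseDelayMs: number;  // backoff ceiling for the first retry
  maxDelayMs: number;   // cap on any single delay
  deadlineMs: number;   // overall time budget (rule: 30 s)
  random?: () => number;                  // injectable for deterministic tests
  sleep?: (ms: number) => Promise<void>;  // injectable for deterministic tests
  now?: () => number;                     // injectable for deterministic tests
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Full jitter: uniform in [0, min(maxDelayMs, baseDelayMs * 2^attempt)).
function backoffDelay(
  attempt: number,
  baseDelayMs: number,
  maxDelayMs: number,
  random: () => number,
): number {
  const ceiling = Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  const random = options.random ?? Math.random;
  const sleep = options.sleep ?? defaultSleep;
  const now = options.now ?? Date.now;
  const startedAt = now();

  for (let attempt = 0; ; attempt += 1) {
    try {
      return await operation();
    } catch (error) {
      const delay = backoffDelay(attempt, options.baseDelayMs, options.maxDelayMs, random);
      const outOfRetries = attempt >= options.maxRetries;
      // Stop if waiting for the next attempt would exceed the overall budget.
      const outOfTime = now() - startedAt + delay > options.deadlineMs;
      if (outOfRetries || outOfTime) throw error;
      await sleep(delay);
    }
  }
}
```

A call might look like `retryWithBackoff(() => fetchUser(id), { maxRetries: 3, baseDelayMs: 100, maxDelayMs: 2000, deadlineMs: 30_000 })`, where `fetchUser` stands in for any idempotent remote call. Non-idempotent operations should not be retried blindly.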
Node.js Framework Rules (Express & NestJS)
- Controllers are thin: parse / validate request, call service, map response.
- Use dependency injection (NestJS providers / Awilix for Express) for testability.
- Configuration in 12-factor style: env vars centralised in `config/*.schema.ts`.
- Do NOT share mutable singletons across requests; prefer scoped instances.
- Readiness (`/health/ready`) and liveness (`/health/live`) endpoints mandatory.
- Always propagate `X-Request-Id`; generate UUIDv4 if header missing.
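For the `X-Request-Id` rule above, a small framework-agnostic sketch may help. It reuses the incoming ID when present, otherwise generates a UUIDv4 with Node's built-in `randomUUID`, and attaches the ID to outbound calls. The helper names are assumptions made for this example.

```typescript
import { randomUUID } from 'node:crypto';

// Node's http module lowercases incoming header names, hence the lowercase key.
const REQUEST_ID_HEADER = 'x-request-id';

type HeaderValue = string | string[] | undefined;

// Reuse the caller's request ID when present, otherwise mint a UUIDv4.
function resolveRequestId(headers: Record<string, HeaderValue>): string {
  const raw = headers[REQUEST_ID_HEADER];
  const candidate = Array.isArray(raw) ? raw[0] : raw;
  const trimmed = candidate?.trim();
  return trimmed ? trimmed : randomUUID();
}

// Attach the ID to every outbound call so downstream logs can be correlated.
function outboundHeaders(
  requestId: string,
  extra: Record<string, string> = {},
): Record<string, string> {
  return { ...extra, [REQUEST_ID_HEADER]: requestId };
}
```

In Express this would typically sit in a middleware that stores the resolved ID on the request and echoes it on the response. In NestJS an interceptor or middleware can play the same role.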
Testing
- 100 % of the code that reaches production must be executed by at least one automated test layer.
• Unit: Jest, ts-jest, aim ≥90 % branch coverage.
• Integration: spin up lightweight Docker services via Testcontainers (Postgres, Redis).
• Contract: Pact to ensure API backward compatibility.
• E2E: Playwright hitting deployed preview in Kubernetes namespace.
- Use synthetic transactions (cron job) in production for smoke coverage.
Performance & Scalability
- Cache read-heavy endpoints in Redis (TTL ≤300 s). Use cache-aside strategy.
- Use the Horizontal Pod Autoscaler targeting CPU ≤70 % and p95 latency ≤250 ms (the latency target requires a custom or external metrics adapter).
- Build Docker images with multi-stage, distroless base; image <200 MB, layers ≤5.
- Read/write separation for DB; prefer logical replication for scale-out.
- Use async message queues (NATS, Kafka, SQS) for heavy or long-running tasks.
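The cache-aside strategy named in the first rule of this list can be sketched as follows: read from the cache, fall back to the source of truth on a miss, then populate the cache with a bounded TTL. The `KeyValueCache` interface and the in-memory implementation are stand-ins invented for this example. In production a Redis client adapter would implement the same two methods.

```typescript
// Hypothetical cache port; a Redis-backed adapter would implement it in production.
interface KeyValueCache {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

const MAX_TTL_SECONDS = 300; // upper bound taken from the rule above

// Cache-aside: the caller owns the lookup, the fallback load and the write-back.
// `load` must return a JSON-serializable value.
async function cacheAside<T>(
  cache: KeyValueCache,
  key: string,
  ttlSeconds: number,
  load: () => Promise<T>,
): Promise<T> {
  const cached = await cache.get(key);
  if (cached !== null) {
    return JSON.parse(cached) as T;
  }
  const fresh = await load();
  await cache.set(key, JSON.stringify(fresh), Math.min(ttlSeconds, MAX_TTL_SECONDS));
  return fresh;
}

// In-memory stand-in for local experiments and tests; the clock is injectable.
class InMemoryCache implements KeyValueCache {
  private readonly entries = new Map<string, { value: string; expiresAt: number }>();
  private readonly now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  async get(key: string): Promise<string | null> {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= this.now()) {
      this.entries.delete(key);
      return null;
    }
    return entry.value;
  }

  async set(key: string, value: string, ttlSeconds: number): Promise<void> {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }
}
```

Writes to the underlying record should delete or overwrite the cached key. Otherwise readers can see stale data for up to the TTL.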
Monitoring & Observability
- Emit Prometheus metrics: `http_server_requests_seconds`, `process_*`, custom business counters.
- Structured JSON logs (`@serilog/ts` or pino) with request id & trace id.
- Enable OpenTelemetry auto-instrumentation; export traces to Jaeger / Tempo.
- Alerting: p95 latency, error rate >1 %, CPU >85 %, long GC pauses.
Security
- Enforce HTTPS / TLS 1.2+. Redirect HTTP → HTTPS at ingress.
- Use JWT (RS256) with rotating keys; expiry ≤15 min; refresh tokens in Redis.
- Secrets via Kubernetes Secrets or AWS Secrets Manager; never in env files.
- Dependabot + `npm audit` blocking step; severity ≥high fails pipeline.
- SAST (Semgrep) and DAST (OWASP ZAP) run nightly.
CI/CD & Deployment
- GitHub Actions workflow:
1. `lint` → ESLint, prettier-check;
2. `test` → unit + integration;
3. `build` → docker build --target prod;
4. `scan` → Trivy; fail >medium;
5. `deploy` → ArgoCD auto-sync to staging.
- Use semantic-release; tags trigger production deployment after green staging.
Infrastructure as Code
- Kubernetes manifests in Helm charts (`charts/<service>`). Values in `values.yaml`.
- Terraform for cloud resources; state stored in remote backend (S3 + DynamoDB lock).
- Every PR touching IaC must execute `terraform plan`/`helm template` preview.
Common Pitfalls & Anti-Patterns
- Fat services that mix multiple domains → split on bounded context.
- Synchronous chaining of >2 microservices → introduce async messaging.
- Persisting session in app memory → store session in Redis / JWT.
- Over-relying on vertical scaling → design for horizontal from start.
Example Skeleton
```
src/
├─ domain/
│ ├─ user.ts # pure domain logic
│ └─ errors.ts # custom error classes
├─ application/
│ ├─ user-service.ts # orchestrates domain + repos
│ └─ dtos.ts # validated input/output shapes (Zod)
├─ infrastructure/
│ ├─ http/
│ │ ├─ index.ts # Express app
│ │ └─ user.controller.ts
│ └─ db/
│ └─ user.repository.ts
└─ index.ts # bootstrap & server start
```
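The `errors.ts` file in the skeleton is a natural home for the typed error classes named under Error Handling & Validation. A minimal sketch of what it might contain follows. The shared `AppError` base class and the specific HTTP status codes are assumptions chosen for illustration, not something the rules prescribe.

```typescript
// domain/errors.ts (illustrative). The base class and status mapping are assumptions.
export abstract class AppError extends Error {
  abstract readonly statusCode: number;

  constructor(message: string) {
    super(message);
    this.name = new.target.name;
    // Keeps `instanceof` working when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Input rejected at the edge (malformed or missing fields).
export class ValidationError extends AppError {
  readonly statusCode = 400;
}

// A business rule was violated by an otherwise well-formed request.
export class DomainError extends AppError {
  readonly statusCode = 422;
}

// A dependency (DB, cache, remote service) failed or timed out.
export class InfrastructureError extends AppError {
  readonly statusCode = 503;
}

// Anything that is not a known AppError is treated as an internal error.
export function toHttpStatus(error: unknown): number {
  return error instanceof AppError ? error.statusCode : 500;
}
```

A controller or global error handler can then call `toHttpStatus(error)` inside a `catch` block instead of reading an untyped `statusCode` property.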
Follow these rules to produce scalable, maintainable services ready for rapid growth and global distribution.