Advanced Rules for building resilient, observable and standardized error-handling flows in TypeScript/Node.js back-end services.
You're tired of debugging production issues at 2 AM because another uncaught exception brought down your API. You're spending more time investigating cryptic error logs than building features. And your monitoring dashboard looks like a Christmas tree because every failure cascades into multiple alerts.
Most Node.js APIs fail catastrophically because developers treat error handling as an afterthought. You end up with:
- `catch (err)` blocks that swallow context and make debugging impossible
The result? Your API becomes unreliable, your team burns out from constant firefighting, and you lose trust from frontend teams and customers.
These Cursor Rules implement a comprehensive error handling system that catches problems early, provides rich diagnostic context, and maintains API reliability under failure conditions. Instead of generic JavaScript error handling, you get:
```ts
// Before: Generic error with no context
catch (err) {
  console.log('Something failed:', err.message);
  res.status(500).json({ error: 'Internal Server Error' });
}
// After: Rich, traceable error with diagnostic context
catch (err: unknown) {
  if (err instanceof DatabaseError) {
    logger.error({
      traceId: res.locals.traceId,
      operation: 'user.findById',
      userId: req.params.id,
      error: err,
      dbPool: err.poolStats
    });
    return next(new ServiceUnavailableError('Database temporarily unavailable', err));
  }
  // Forward every other error to the error middleware instead of swallowing it
  return next(err);
}
```
Result: Reduce debugging time from hours to minutes with precise error context and automatic correlation.
```ts
// Automatic retry with exponential backoff
const result = await retry(
  () => externalService.fetchUserData(userId),
  {
    retries: 5,
    factor: 2,
    minTimeout: 100,
    maxTimeout: 2000,
    onRetry: (error, attempt) => {
      logger.warn({ traceId, attempt, error: error.message });
    }
  }
);
```
Result: Transform transient network failures from service outages into temporary delays.
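To make the retry parameters concrete, here is a minimal hand-rolled sketch of exponential back-off. It is not the `async-retry` implementation (which also applies random jitter by default); `backoffDelays`, `retryWithBackoff` and the injectable `sleep` parameter are hypothetical names used only for illustration.

```ts
interface BackoffOptions {
  retries: number;    // retries after the first attempt
  factor: number;     // multiplier applied per retry
  minTimeout: number; // delay before the first retry, in ms
  maxTimeout: number; // upper bound for any single delay, in ms
}

// Delay before each retry: minTimeout * factor^n, capped at maxTimeout (no jitter).
export function backoffDelays(o: BackoffOptions): number[] {
  return Array.from({ length: o.retries }, (_, n) =>
    Math.min(o.minTimeout * Math.pow(o.factor, n), o.maxTimeout)
  );
}

// Run fn, retrying on rejection with the delays above; rethrow the last error.
export async function retryWithBackoff<T>(
  fn: (attempt: number) => Promise<T>,
  o: BackoffOptions,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<T> {
  const delays = backoffDelays(o);
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn(attempt);
    } catch (err: unknown) {
      const delay = delays[attempt];
      if (delay === undefined) throw err; // retries exhausted
      await sleep(delay);
    }
  }
}
```

With `retries: 5`, `factor: 2`, `minTimeout: 100` and `maxTimeout: 2000`, the un-jittered schedule is 100, 200, 400, 800 and 1600 ms, so the 2000 ms cap is never reached.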
```jsonc
// Consistent RFC 9457 error format across all endpoints
{
  "type": "https://api.yourservice.com/errors/validation",
  "title": "Validation Failed",
  "status": 400,
  "detail": "The request body contains invalid data",
  "instance": "/api/users/123",
  "traceId": "550e8400-e29b-41d4-a716-446655440000",
  "errors": [
    {
      "field": "email",
      "code": "VAL_4001",
      "message": "Must be a valid email address"
    }
  ]
}
```
Result: Frontend teams get predictable error structures they can handle programmatically.
When your PostgreSQL connection pool is exhausted:
```ts
// Automatic pool management with graceful degradation
async function getUserById(id: string): Promise<User> {
  let client: PoolClient | undefined;
  try {
    client = await pool.connect();
    const result = await client.query('SELECT * FROM users WHERE id = $1', [id]);
    if (result.rows.length === 0) {
      throw new NotFoundError(`User ${id} not found`, 'USER_404');
    }
    return mapToUser(result.rows[0]);
  } catch (err: unknown) {
    if (err instanceof PoolError && err.code === 'POOL_EXHAUSTED') {
      // Fail fast (shed load) instead of queuing behind an exhausted pool
      throw new ServiceUnavailableError('Database overloaded', err);
    }
    throw err;
  } finally {
    // Always return the connection, on success and on failure
    client?.release();
  }
}
```
Before: Connection timeouts cascade into multiple failed requests.
After: Fast failure with clear diagnosis and automatic recovery.
When third-party services become unreliable:
```ts
// Resilient external service calls
class PaymentService {
  @traced('payment.process')
  async processPayment(paymentData: PaymentRequest): Promise<PaymentResult> {
    const span = trace.getActiveSpan();
    try {
      const result = await circuitBreaker.execute(() =>
        paymentProvider.charge(paymentData)
      );
      span?.setStatus({ code: SpanStatusCode.OK });
      return result;
    } catch (err: unknown) {
      span?.recordException(err as Error);
      if (err instanceof CircuitBreakerOpenError) {
        // Graceful degradation - queue for later processing
        await paymentQueue.add('retry-payment', paymentData);
        throw new ServiceUnavailableError('Payment service temporarily unavailable');
      }
      throw err;
    }
  }
}
```
Before: Payment failures cause checkout abandonment.
After: Automatic queuing with user notification of delayed processing.
```bash
npm install winston @sentry/node @opentelemetry/api async-retry opossum uuid
npm install --save-dev @types/uuid
```
```ts
// src/errors/base.ts
export interface AppError extends Error {
  code: string;
  status: number;
  details?: unknown;
  cause?: Error;
}
export class BaseError extends Error implements AppError {
  constructor(
    message: string,
    public code: string,
    public status: number = 500,
    public details?: unknown,
    public cause?: Error
  ) {
    super(message);
    this.name = this.constructor.name;
    Error.captureStackTrace(this, this.constructor);
  }
}
export class ValidationError extends BaseError {
  constructor(message: string, details?: unknown, cause?: Error) {
    super(message, 'VALIDATION_ERROR', 400, details, cause);
  }
}
```
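The snippets in this document reference `NotFoundError` and `ServiceUnavailableError` without defining them. The following is one possible sketch, with constructor signatures inferred from the call sites (`new NotFoundError(msg, 'USER_404')`, `new ServiceUnavailableError(msg, err)`). The default codes `NOT_FOUND` and `SERVICE_UNAVAILABLE` are assumptions, and `BaseError` is repeated in compact form only so the block compiles on its own.

```ts
// Compact copy of BaseError so this sketch is self-contained.
class BaseError extends Error {
  constructor(
    message: string,
    public code: string,
    public status: number = 500,
    public details?: unknown,
    public cause?: Error
  ) {
    super(message);
    this.name = this.constructor.name;
  }
}

// 404: the requested resource does not exist. Default code is an assumption.
export class NotFoundError extends BaseError {
  constructor(message: string, code: string = 'NOT_FOUND', cause?: Error) {
    super(message, code, 404, undefined, cause);
  }
}

// 503: a dependency is down or overloaded; the original failure is kept as `cause`.
export class ServiceUnavailableError extends BaseError {
  constructor(message: string, cause?: Error) {
    super(message, 'SERVICE_UNAVAILABLE', 503, undefined, cause);
  }
}
```

Keeping the original failure in `cause` lets the logger record the low-level reason while the client only sees the 503.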
```ts
// src/middleware/error-handler.ts
import type { ErrorRequestHandler } from 'express';

export const errorHandler: ErrorRequestHandler = (err, _req, res, _next) => {
  const traceId = res.locals.traceId;
  // Log technical details
  logger.error({
    traceId,
    error: err.message,
    stack: err.stack,
    code: err.code,
    cause: err.cause?.message
  });
  // Send user-friendly response
  const response = toRfc9457(err, traceId);
  res.status(err.status || 500).json(response);
};
```
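The handler calls `toRfc9457`, which this document does not define. Below is a minimal sketch of what such a mapper might look like. The `type`, `title`, `status`, `detail` and `instance` members come from RFC 9457, and `traceId` and `errors` are extension members as in the JSON example; the `TYPE_BASE` URL, the title table and the generic 5xx wording are assumptions.

```ts
interface ProblemDetails {
  type: string;
  title: string;
  status: number;
  detail?: string;
  instance?: string;
  traceId?: string;
  errors?: unknown;
}

// Structural view of the AppError fields this mapper reads.
interface ErrorLike {
  message: string;
  code?: string;
  status?: number;
  details?: unknown;
}

// Hypothetical base URL for problem types; replace with your own.
const TYPE_BASE = 'https://api.yourservice.com/errors';

const TITLES: Record<number, string> = {
  400: 'Bad Request',
  401: 'Unauthorized',
  403: 'Forbidden',
  404: 'Not Found',
  409: 'Conflict',
};

export function toRfc9457(err: ErrorLike, traceId?: string, instance?: string): ProblemDetails {
  const status = err.status ?? 500;
  if (status >= 500) {
    // Server faults: never echo internal messages, codes or details to the client.
    return {
      type: `${TYPE_BASE}/internal`,
      title: 'Server Error',
      status,
      detail: 'An unexpected error occurred',
      instance,
      traceId,
    };
  }
  return {
    type: `${TYPE_BASE}/${(err.code ?? 'unknown').toLowerCase()}`,
    title: TITLES[status] ?? 'Request Error',
    status,
    detail: err.message,
    instance,
    traceId,
    errors: err.details,
  };
}
```

The optional second and third parameters allow both call shapes used in this document, `toRfc9457(err, traceId)` and `toRfc9457(err)`.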
```ts
import type { RequestHandler } from 'express';

// Automatic error forwarding
export const asyncHandler = <T extends RequestHandler>(fn: T): T =>
  ((req, res, next) =>
    Promise.resolve(fn(req, res, next)).catch(next)
  ) as unknown as T;
// Usage in routes
router.get('/users/:id', asyncHandler(async (req, res) => {
  const user = await userService.findById(req.params.id);
  res.json(user);
}));
```
Your error dashboard transforms from chaos to clarity:
```ts
// Automatic metrics collection
const errorMetrics = {
  totalErrors: counter('api_errors_total', { labels: ['code', 'endpoint'] }),
  errorDuration: histogram('error_resolution_duration_seconds'),
  circuitBreakerTrips: counter('circuit_breaker_trips_total')
};
```
Before: Alert fatigue from duplicate, unclear notifications.
After: Actionable alerts with precise error classification and automatic correlation.
Stop treating errors as edge cases. These rules make error handling a competitive advantage that keeps your API running smoothly and your team focused on building features instead of fixing production fires.
Your future self (and your on-call rotation) will thank you.
You are an expert in Node.js, TypeScript, Express, PostgreSQL, Redis, Docker, Kubernetes and modern observability stacks (Winston, Sentry, OpenTelemetry).
Key Principles
- Fail fast, surface early: detect and report anomalies as close to the source as possible.
- Prefer explicit, typed errors over generic Error instances; never swallow exceptions.
- Keep `try` blocks minimal; validate inputs before entering them.
- Return structured, RFC 9457-compatible error objects to clients.
- Separate technical logging (for operators) from functional error messages (for users).
- Correlate every error with a `traceId`/`requestId` to enable distributed tracing.
- Graceful degradation > uncontrolled termination: fall back or partially succeed when possible.
- Write tests that assert error paths with the same rigour as the happy path.
- Build observability in: log, metric, and trace every unhandled exception automatically.
TypeScript (Node.js)
- Always use `unknown` for caught errors and narrow their type:
```ts
try { /* … */ }
catch (err: unknown) {
if (err instanceof DatabaseError) { … }
}
```
- Create domain-level error classes that extend `BaseError` (which extends `Error`) and carry `code`, `status`, `details`, `cause`.
- Enforce an error interface:
```ts
export interface AppError extends Error { code: string; status: number; details?: unknown; cause?: Error; }
```
- Prefer `async/await`; handle promise rejections with `.catch()` only when chaining:
```ts
service.do().catch(handleAsyncError)
```
- Use `never` return type for functions that intentionally terminate the process.
- Narrow switch statements exhaustively and use the `assertUnreachable(_x: never): never` helper.
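As an illustration of the exhaustive-switch rule, here is a small sketch. The `ErrorKind` union and `statusFor` function are hypothetical; only the `assertUnreachable` signature comes from the rule above.

```ts
type ErrorKind = 'validation' | 'auth' | 'database';

// Accepts only `never`, so it type-checks only where every case has been handled.
export function assertUnreachable(_x: never): never {
  throw new Error(`Unhandled case: ${String(_x)}`);
}

export function statusFor(kind: ErrorKind): number {
  switch (kind) {
    case 'validation':
      return 400;
    case 'auth':
      return 401;
    case 'database':
      return 503;
    default:
      // Adding a member to ErrorKind without a case makes this line a compile error.
      return assertUnreachable(kind);
  }
}
```

The runtime `throw` is a backstop for values that bypass the type system, for example unvalidated JSON.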
Error Handling and Validation
- Catch specific exceptions (`DatabaseError`, `ValidationError`); avoid `catch (e) {}`.
- Validation first: run Joi/Zod schema validation or class-validator before business logic.
- Use early returns to avoid nested `if` ladders:
```ts
if (!isValid) return next(new ValidationError("Invalid body"));
```
- Standardise error codes: `VAL_4001`, `AUTH_4012`, `DB_5003`.
- Map internal errors → HTTP status:
- 400–499 for client issues (missing data, pre-condition failures).
- 500–599 for server issues; hide internals.
- Implement exponential back-off + circuit-breaker for transient external faults.
- Always release/rollback resources in `finally` or via async disposers (e.g., `pool.connect().then(client => work(client).finally(() => client.release()))`).
- Log at appropriate level: `error` (unexpected), `warn` (recoverable), `info` (controlled failures).
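The resource-release rule can be wrapped in a small disposer helper. This is a sketch only: `withClient`, `Releasable` and `PoolLike` are hypothetical names, and the shapes are modelled on a pool whose `connect()` resolves to a client with a `release()` method.

```ts
interface Releasable {
  release(): void;
}

interface PoolLike<C extends Releasable> {
  connect(): Promise<C>;
}

// Acquire a client, run the work, and always release the client afterwards.
export async function withClient<C extends Releasable, T>(
  pool: PoolLike<C>,
  work: (client: C) => Promise<T>
): Promise<T> {
  const client = await pool.connect(); // if this rejects there is nothing to release
  try {
    return await work(client);
  } finally {
    client.release(); // runs on success and on failure
  }
}
```

Callers then cannot forget the `finally`: `await withClient(pool, (c) => c.query(...))` releases the client even when the query rejects.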
Express Framework Rules
- Register a single error-handling middleware last:
```ts
app.use((err: AppError, _req, res, _next) => {
logger.error({ traceId: res.locals.traceId, err });
const payload = toRfc9457(err);
res.status(err.status || 500).json(payload);
});
```
- Wrap every route handler with an `asyncHandler` utility to forward errors to the middleware:
```ts
export const asyncHandler = <T extends RequestHandler>(fn: T): T =>
((req, res, next) => Promise.resolve(fn(req, res, next)).catch(next)) as unknown as T;
```
- Add a request-ID middleware that generates an ID with `uuidv4()` and attaches it to `res.locals.traceId` and to the logger's default meta.
- Use `http-errors` or custom factory for frequent HTTP errors (`createError(404, "Not Found")`).
- Document error responses in OpenAPI under `components.responses` and reference across endpoints.
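A request-ID middleware along the lines described above might look like the sketch below. To stay dependency-free it uses Node's built-in `crypto.randomUUID()` in place of `uuidv4()` from the `uuid` package, and declares minimal structural types instead of importing Express. Reusing an incoming `x-request-id` header is an assumed convention, not something this document prescribes.

```ts
import { randomUUID } from 'node:crypto';

// Minimal structural types so the sketch runs without Express installed.
interface Req {
  headers: Record<string, string | string[] | undefined>;
}
interface Res {
  locals: Record<string, unknown>;
  setHeader(name: string, value: string): void;
}
type Next = (err?: unknown) => void;

export function requestId(req: Req, res: Res, next: Next): void {
  const incoming = req.headers['x-request-id'];
  // Reuse a caller-supplied ID so traces correlate across services; otherwise mint one.
  const traceId =
    typeof incoming === 'string' && incoming.length > 0 ? incoming : randomUUID();
  res.locals.traceId = traceId;
  res.setHeader('x-request-id', traceId);
  next();
}
```

Register it before any route so that every log line and error response can carry `res.locals.traceId`.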
Logging & Observability
- Use Winston with JSON transport; include `timestamp`, `level`, `message`, `traceId`, `stack`, `code`.
- Forward unhandled rejections and uncaught exceptions to the same logger and exit with `process.exit(1)`.
- Instrument with OpenTelemetry: create spans around business actions; record exceptions on span.
- Integrate Sentry/Rollbar/Bugsnag for alerting. Always add `user.id` and release version tags.
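One way to route process-level failures through the same logger is sketched below. `FatalLogger`, `makeFatalHandler` and `installFatalHandlers` are hypothetical names, and `exit` is injectable so the handler can be unit-tested without killing the test process.

```ts
interface FatalLogger {
  error(entry: Record<string, unknown>): void;
}

type FatalKind = 'uncaughtException' | 'unhandledRejection';

// Build a handler that logs the failure in structured form and then exits with code 1.
export function makeFatalHandler(
  kind: FatalKind,
  logger: FatalLogger,
  exit: (code: number) => void = (code) => process.exit(code)
): (reason: unknown) => void {
  return (reason: unknown): void => {
    // Rejections may carry non-Error values; normalise before logging.
    const err = reason instanceof Error ? reason : new Error(String(reason));
    logger.error({ kind, message: err.message, stack: err.stack });
    exit(1); // process state may be corrupt; let the supervisor restart it
  };
}

export function installFatalHandlers(logger: FatalLogger): void {
  process.on('uncaughtException', makeFatalHandler('uncaughtException', logger));
  process.on('unhandledRejection', makeFatalHandler('unhandledRejection', logger));
}
```

A logger with asynchronous transports may need to flush before the process exits; that step is left out of this sketch.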
Testing
- Write unit tests that intentionally throw each custom error and verify:
- correct HTTP status
- RFC 9457 body shape
- side-effects (e.g., rollback).
- Use `Promise.allSettled()` in tests covering multiple async branches.
- Simulate network faults with Nock or MSW and assert exponential back-off behaviour.
- Track metrics like MTTR via CI dashboards; fail PR if error coverage < 90 %.
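The error-path testing rules can be sketched without committing to a test framework. Below, `HttpError`, `findUser` and `testErrorPaths` are hypothetical stand-ins for your own error classes and service functions; `Promise.allSettled()` lets one test observe the success branch and both failure branches together.

```ts
import { strict as assert } from 'node:assert';

// Hypothetical error type and service function used only as the test subject.
class HttpError extends Error {
  constructor(message: string, public status: number, public code: string) {
    super(message);
  }
}

async function findUser(id: string): Promise<{ id: string }> {
  if (!/^\d+$/.test(id)) throw new HttpError('Invalid id', 400, 'VAL_4001');
  if (id === '404') throw new HttpError('User not found', 404, 'USER_404');
  return { id };
}

// Fail loudly unless the branch rejected with the expected error class.
function expectHttpError(result: PromiseSettledResult<unknown>): HttpError {
  if (result.status !== 'rejected') throw new Error('expected a rejection');
  if (!(result.reason instanceof HttpError)) throw new Error('expected an HttpError');
  return result.reason;
}

export async function testErrorPaths(): Promise<void> {
  const [ok, invalid, missing] = await Promise.allSettled([
    findUser('1'),
    findUser('abc'),
    findUser('404'),
  ]);
  assert.equal(ok.status, 'fulfilled');
  const invalidErr = expectHttpError(invalid);
  assert.equal(invalidErr.status, 400);
  assert.equal(invalidErr.code, 'VAL_4001');
  const missingErr = expectHttpError(missing);
  assert.equal(missingErr.status, 404);
  assert.equal(missingErr.code, 'USER_404');
}
```

In a real suite the body of `testErrorPaths` would sit inside the test runner's `it`/`test` callback.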
Performance & Resilience
- Use `async-retry` or `p-retry` with `factor: 2`, `minTimeout: 100`, `maxTimeout: 2000`, `retries: 5`.
- Apply bulkhead pattern: limit parallel DB queries to pool size.
- Downstream circuit breaker (e.g., `opossum`): trip after 50 % failures over 10 requests, 30 s cooldown.
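To show what "trip after 50 % failures over 10 requests, 30 s cooldown" means mechanically, here is a deliberately small circuit breaker. It is a teaching sketch, not the `opossum` API, and its recovery step is simplified (after the cooldown it closes fully rather than probing in a half-open state). All names and the injectable clock are hypothetical.

```ts
export class CircuitOpenError extends Error {}

export class CircuitBreaker {
  private results: boolean[] = []; // rolling window, true = failure
  private openedAt: number | null = null;

  constructor(
    private readonly windowSize: number = 10,
    private readonly failureRatio: number = 0.5,
    private readonly cooldownMs: number = 30000,
    private readonly now: () => number = () => Date.now()
  ) {}

  async execute<T>(fn: () => Promise<T>): Promise<T> {
    if (this.openedAt !== null) {
      if (this.now() - this.openedAt < this.cooldownMs) {
        throw new CircuitOpenError('Circuit is open'); // fail fast, do not call fn
      }
      // Cooldown elapsed: close the circuit and start a fresh window.
      this.openedAt = null;
      this.results = [];
    }
    try {
      const value = await fn();
      this.record(false);
      return value;
    } catch (err: unknown) {
      this.record(true);
      throw err;
    }
  }

  private record(failed: boolean): void {
    this.results.push(failed);
    if (this.results.length > this.windowSize) this.results.shift();
    const failures = this.results.filter(Boolean).length;
    const windowFull = this.results.length >= this.windowSize;
    if (windowFull && failures / this.results.length >= this.failureRatio) {
      this.openedAt = this.now();
    }
  }
}
```

A production breaker would normally let a single trial request through after the cooldown and reopen immediately if it fails.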
Security
- Never leak stack traces or internal codes to clients; map them to generic 500 messages.
- Sanitize user-provided strings before logging.
- Store logs in secure, append-only storage; enforce 30-day retention and GDPR redaction APIs.
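The log-sanitising rule can be sketched as two small helpers. `sanitizeForLog`, `redact` and the `SENSITIVE_KEYS` list are hypothetical; adjust the key list and length cap to your own data.

```ts
// Replace control characters (including CR/LF) so user input cannot forge log lines,
// and cap the length to keep entries bounded.
export function sanitizeForLog(input: string, maxLength: number = 200): string {
  const cleaned = input.replace(/[\u0000-\u001f\u007f]/g, ' ');
  return cleaned.length > maxLength ? `${cleaned.slice(0, maxLength)}...` : cleaned;
}

// Keys whose values commonly hold secrets or PII (an assumed, incomplete list).
const SENSITIVE_KEYS = new Set(['password', 'token', 'authorization', 'email']);

// Return a shallow copy of the log metadata with sensitive values masked.
export function redact(meta: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(meta)) {
    out[key] = SENSITIVE_KEYS.has(key.toLowerCase()) ? '[REDACTED]' : value;
  }
  return out;
}
```

For example, `logger.warn(redact({ userId, email, note: sanitizeForLog(req.body.note) }))` keeps the entry useful for operators without storing the address or raw control characters.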
Deployment & Ops
- Health check (`/healthz`) returns 200 only if dependencies reachable; otherwise 503.
- Use PM2 with `--max-memory-restart` to auto-restart on leaks, and forward logs to centralized store.
- Monitor key error rates (`5xx`, `DB_500*`) and page on SLO breaches.
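The `/healthz` rule reduces to a function that probes each dependency and maps the outcome to 200 or 503. The sketch below assumes each dependency exposes an async probe that rejects when unreachable; `checkHealth`, `Check` and the 1000 ms default timeout are hypothetical.

```ts
type Check = () => Promise<void>; // resolves when the dependency is reachable

// Reject if the probe does not settle within `ms` milliseconds.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
    p.then(resolve, reject).finally(() => clearTimeout(timer));
  });
}

export async function checkHealth(
  checks: Record<string, Check>,
  timeoutMs: number = 1000
): Promise<{ status: 200 | 503; body: Record<string, 'ok' | 'fail'> }> {
  const entries = Object.entries(checks);
  // Probe all dependencies in parallel; one slow probe must not hide the others.
  const results = await Promise.allSettled(
    entries.map(([, check]) => withTimeout(Promise.resolve().then(check), timeoutMs))
  );
  const body: Record<string, 'ok' | 'fail'> = {};
  entries.forEach(([name], i) => {
    body[name] = results[i]?.status === 'fulfilled' ? 'ok' : 'fail';
  });
  const healthy = Object.values(body).every((state) => state === 'ok');
  return { status: healthy ? 200 : 503, body };
}
```

A route handler can then call `checkHealth` with probes such as `db: () => pool.query('SELECT 1').then(() => undefined)` and respond with `res.status(status).json(body)`.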
Common Pitfalls & Anti-Patterns
- Swallowing errors in empty `catch`.
- Logging sensitive PII.
- Throwing non-Error values (`throw "string"`).
- Relying on process-wide `unhandledRejection` as main mechanism; always catch locally first.
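Because third-party code can still throw non-`Error` values, a small normaliser at catch sites keeps downstream logging and mapping uniform. `toError` is a hypothetical helper, shown as a sketch:

```ts
// Convert any thrown value into an Error so `.message` and `.stack` are always available.
export function toError(value: unknown): Error {
  if (value instanceof Error) return value;
  if (typeof value === 'string') return new Error(value);
  try {
    // JSON.stringify returns undefined for values such as undefined or functions.
    const json: string | undefined = JSON.stringify(value);
    return new Error(json ?? String(value));
  } catch {
    return new Error(String(value)); // e.g. circular structures
  }
}
```

Usage: `catch (err: unknown) { next(toError(err)); }`.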