Opinionated rules for designing, implementing, and operating production-grade stateless backend services in Go.
You're tired of services that break when traffic spikes, deployment rollbacks that lose user sessions, and debugging state corruption across distributed instances. Traditional stateful Go services create operational nightmares that wake you up at 3 AM.
Every stateful decision compounds your operational complexity. Your current Go services probably violate statelessness without you realizing it: global caches, in-memory user sessions, local file writes, and connection handling that assumes instances live forever.
These aren't generic Go guidelines—they're battle-tested patterns for building services that scale horizontally without operational drama. Every rule enforces true statelessness while maintaining Go's performance characteristics.
Core Transformation:
```go
// ❌ Before: Stateful nightmare
var userSessions = make(map[string]*Session) // Global mutable state
var cache = NewInMemoryCache()               // Instance-specific cache

func GetUser(w http.ResponseWriter, r *http.Request) {
    session := userSessions[getSessionID(r)] // Breaks with multiple instances
    if cached := cache.Get(session.UserID); cached != nil {
        // Cache hit only on the instance that populated it;
        // every other instance misses
    }
}
```
```go
// ✅ After: Stateless excellence
func (s *Server) GetUser(ctx context.Context, req *api.GetUserRequest) (*api.GetUserResponse, error) {
    // Self-contained request with all context
    if err := validateGetUser(req); err != nil {
        return nil, status.Error(codes.InvalidArgument, err.Error())
    }
    // External state store, not instance memory
    user, err := s.repo.FetchUser(ctx, req.Id)
    if errors.Is(err, repository.ErrNotFound) {
        return nil, status.Error(codes.NotFound, "user not found")
    }
    if err != nil {
        return nil, status.Errorf(codes.Internal, "fetch user: %v", err)
    }
    return &api.GetUserResponse{User: toProto(user)}, nil
}
```
Deployment Velocity: Zero-downtime rolling updates become standard. No more coordinating session drainage or maintaining sticky connections.
Scaling Response Time: Auto-scaling from 2 to 20 instances in under 60 seconds instead of manual capacity planning weeks in advance.
Debug Time Reduction: Whole classes of state-related bugs disappear. No more "works on my instance" or distributed state synchronization issues.
Infrastructure Cost: Eliminate over-provisioning for peak loads. True horizontal scaling means paying only for actual usage.
```bash
# Old process - 2-hour maintenance window
kubectl drain node-1 --grace-period=300  # Wait for sessions to expire
kubectl apply -f deployment.yaml         # Hope nothing breaks
# Monitor for 30 minutes, rollback if issues

# New process - 2-minute rolling update
kubectl apply -f deployment.yaml
# Done. Any instance can serve any request immediately
```
Without statelessness, you need traffic predictions, sticky-session configuration, session-replication strategies, and complex health checks that understand instance state. With stateless services, scaling is just configuration:
```yaml
# Kubernetes HPA just works
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: user-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: user-service
  minReplicas: 2
  maxReplicas: 100
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
```
Stateful services make integration tests painful: complex state setup, database seeding with user sessions, and coordination of multiple instances sharing state. Stateless handlers keep every test isolated and deterministic:
```go
func TestGetUser(t *testing.T) {
    server := newTestServer(t) // fakes injected via constructor; no shared state
    tests := []struct {
        name string
        req  *api.GetUserRequest
        want *api.GetUserResponse
    }{
        // Every test is isolated and deterministic
    }
    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            // No state cleanup needed between tests
            got, err := server.GetUser(context.Background(), tt.req)
            assert.NoError(t, err)
            assert.Equal(t, tt.want, got)
        })
    }
}
```
Run this checklist against your existing Go services:
```bash
# Find global mutable state
grep -r "var.*=.*make\|var.*=.*\[\]" --include="*.go" .

# Find filesystem writes
grep -r "os\.Create\|ioutil\.WriteFile\|os\.OpenFile" --include="*.go" .

# Find in-memory caches
grep -r "sync\.Map\|map\[.*\]\|\*cache\." --include="*.go" .
```
Then externalize what you find:

```go
// Move session data to Redis/database
type UserService struct {
    sessionStore SessionStore // Interface to external store
    repo         UserRepository
}

func (s *UserService) GetUserProfile(ctx context.Context, token string) (*User, error) {
    // JWT contains all necessary context
    claims, err := s.validateJWT(token)
    if err != nil {
        return nil, fmt.Errorf("invalid token: %w", err)
    }
    // Fetch from external store, not local memory
    return s.repo.GetUser(ctx, claims.UserID)
}
```
Keep startup fast by deferring expensive initialization until first use (`sync.OnceValue` provides the lazy value):

```go
func main() {
    // Lazy initialization - defer expensive operations until first use
    srv := &Server{
        dbPool: sync.OnceValue(newDBPool), // pool is dialed on first request, not at boot
    }
    // Fast startup < 250ms
    slog.Info("server starting", "port", 8080)
    if err := srv.Listen(":8080"); err != nil {
        slog.Error("server failed", "error", err)
        os.Exit(1)
    }
}
```
Pair fast startup with graceful shutdown:

```go
srv := &http.Server{Handler: r, Addr: ":8080"}
go func() { _ = srv.ListenAndServe() }()

// Wait for shutdown signal
ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
defer stop()
<-ctx.Done()

// Graceful shutdown < 5s
shutdownCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
_ = srv.Shutdown(shutdownCtx)
```
```yaml
# deploy/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: user-service
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 30%
  template:
    metadata:
      labels:
        app: user-service
    spec:
      containers:
        - name: user-service
          image: user-service:latest
          resources:
            requests:
              cpu: 100m      # 60% of average load
              memory: 128Mi
            limits:
              cpu: 150m      # 150% of requests
              memory: 192Mi  # 150% of requests
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
            initialDelaySeconds: 5
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8080
            initialDelaySeconds: 15
          securityContext:
            runAsNonRoot: true
            readOnlyRootFilesystem: true
```
Every request gets a correlation ID so logs from any instance can be stitched together:

```go
// Structured logging with correlation ID
type ctxKey string

const correlationIDKey ctxKey = "correlation_id"

func (s *Server) middleware(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        correlationID := uuid.New().String()
        ctx := context.WithValue(r.Context(), correlationIDKey, correlationID)
        logger := slog.With("correlation_id", correlationID)
        logger.Info("request started",
            "method", r.Method,
            "path", r.URL.Path,
        )
        start := time.Now()
        next.ServeHTTP(w, r.WithContext(ctx))
        logger.Info("request completed",
            "duration_ms", time.Since(start).Milliseconds(),
        )
    })
}
```
Week 1: Eliminate deployment-related outages. Rolling updates become routine operations instead of coordinated events.
Week 2: Auto-scaling starts working predictably. CPU and memory-based scaling triggers work correctly without sticky session complications.
Month 1: Development velocity increases 40% due to simplified local testing and deterministic integration tests.
Month 3: Infrastructure costs decrease 25-40% through efficient resource utilization and elimination of over-provisioning for peak loads.
Ongoing: Operations team spends time building features instead of debugging state synchronization issues and session-related failures.
Your Go services will handle traffic spikes gracefully, deploy with confidence, and scale automatically. Most importantly, you'll sleep through the night knowing your stateless architecture won't create 3 AM emergencies.
The rules enforce these practices automatically in your Cursor IDE, preventing stateful patterns before they reach production. Every suggestion guides you toward truly stateless, production-ready Go services.
You are an expert in Go, REST/HTTP, gRPC, Kubernetes, Docker, AWS Lambda, PostgreSQL, Redis, and modern CI/CD pipelines.
# Key Principles
- Each request is self-contained; never depend on in-memory or local filesystem state.
- Externalize mutable data to purpose-built stores (PostgreSQL, Redis, S3).
- Design for horizontal scaling: any instance can serve any request at any time.
- Prefer immutable infrastructure; replace rather than patch running artifacts.
- Fast startup (< 250 ms) and graceful shutdown (< 5 s) are mandatory.
- Emit structured JSON logs with a correlation/request ID on every line (logger setup sketched after this list).
- Treat infrastructure (Kubernetes manifests, Terraform, Helm) as version-controlled code.
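A minimal sketch of the logging principle above, using the standard `log/slog` package; the correlation ID itself is attached per request by the middleware shown earlier:

```go
package main

import (
    "log/slog"
    "os"
)

func init() {
    // One structured JSON object per log line, written to STDOUT.
    slog.SetDefault(slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{
        Level: slog.LevelInfo,
    })))
}
```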
# Go
- Always accept a `context.Context` as the first param in exported funcs.
- Prohibit global mutable state; use dependency injection via constructor funcs (see the sketch at the end of this section).
- Keep handlers pure: decode → validate → business logic → encode.
- Use idiomatic error handling: wrap with context via `fmt.Errorf("load user: %w", err)`.
- Return slices, not pointers to slices; prefer empty (non-nil) maps/slices over `nil` where callers or JSON encoding distinguish them.
- Keep binaries small: `go build -trimpath -ldflags "-s -w"` in CI.
- Directory layout:
- `cmd/<service>`: service entry point
- `internal/`: non-exported packages
- `pkg/`: reusable libraries
- `api/`: protobuf or OpenAPI definitions
- `deploy/`: Kubernetes/Helm charts
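As referenced above, a minimal sketch of constructor injection with a context-first method; the `pgxpool` dependency and `users` query are illustrative assumptions:

```go
package user

import (
    "context"

    "github.com/jackc/pgx/v5/pgxpool"
)

// Service carries only injected dependencies - no package-level state.
type Service struct {
    pool *pgxpool.Pool
}

// NewService is the single construction path; the caller owns the pool's lifecycle.
func NewService(pool *pgxpool.Pool) *Service {
    return &Service{pool: pool}
}

// GetEmail accepts context.Context as its first parameter, per the rule above.
func (s *Service) GetEmail(ctx context.Context, id string) (string, error) {
    var email string
    err := s.pool.QueryRow(ctx, "SELECT email FROM users WHERE id = $1", id).Scan(&email)
    return email, err
}
```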
# Error Handling & Validation
- Validate payloads immediately after decode; respond `400` on first failure.
- Wrap all errors with context; surface only sanitized messages to clients.
- Use typed errors (`var ErrNotFound = errors.New("not found")`) for flow control.
- Reject requests > 5 MB or longer than 30 s via middleware (see the sketch after this list).
- Always honor client-cancellation through `ctx.Done()`.
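A minimal standard-library sketch of the limits above; `http.MaxBytesReader` makes oversized reads fail inside the handler, and `http.TimeoutHandler` cancels the request context at the deadline so `ctx.Done()` fires downstream:

```go
package middleware

import (
    "net/http"
    "time"
)

// Limits caps request bodies at 5 MB and cuts whole requests off at 30 s.
func Limits(next http.Handler) http.Handler {
    sized := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        r.Body = http.MaxBytesReader(w, r.Body, 5<<20) // 5 MB
        next.ServeHTTP(w, r)
    })
    return http.TimeoutHandler(sized, 30*time.Second, "request timed out")
}
```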
# Framework-Specific Rules
## Kubernetes
- Deploy as Deployment + Service; set `readinessProbe` and `livenessProbe`.
- `resources.requests` ≤ 60 % average load; `resources.limits` ≤ 150 %.
- Use rolling updates; `maxUnavailable: 0`, `maxSurge: 30 %`.
- Mount config via ConfigMap; mount secrets via Secret/CSI driver; never bake them into images.
## AWS Lambda (optional build)
- Handler must be stateless, single binary; cold-start target < 400 ms (see the sketch after this list).
- Use provisioned concurrency for predictable latency spikes.
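A minimal stateless handler sketch using `github.com/aws/aws-lambda-go`; the event shape is an assumption:

```go
package main

import (
    "context"

    "github.com/aws/aws-lambda-go/lambda"
)

type Event struct {
    UserID string `json:"user_id"`
}

// Everything the invocation needs arrives via ctx and the event - no
// package-level caches or warm state that a cold start could lose.
func handle(ctx context.Context, e Event) (string, error) {
    return "processed " + e.UserID, nil
}

func main() {
    lambda.Start(handle)
}
```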
## gRPC / REST Gateway
- Expose protobuf definitions under `api/` and generate both gRPC and REST gateway stubs.
- Enable reflection and health checks (`grpc-health-probe`).
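A minimal sketch wiring the stock health service (the endpoint `grpc-health-probe` queries) and server reflection:

```go
package main

import (
    "google.golang.org/grpc"
    "google.golang.org/grpc/health"
    healthpb "google.golang.org/grpc/health/grpc_health_v1"
    "google.golang.org/grpc/reflection"
)

func newGRPCServer() *grpc.Server {
    s := grpc.NewServer()
    healthpb.RegisterHealthServer(s, health.NewServer()) // serves grpc.health.v1.Health
    reflection.Register(s)                               // enables grpcurl and service discovery
    return s
}
```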
# Testing
- Unit tests for every exported function (≥ 90 % coverage); no external network calls.
- Table-driven tests; seed RNG with constant to keep determinism.
- Use golden files for HTTP/gRPC contract validation (sketch after this list).
- End-to-end tests spin up service + dependencies via Docker Compose.
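A minimal golden-file sketch; `callGetUser` and the testdata path are hypothetical:

```go
package user_test

import (
    "bytes"
    "os"
    "path/filepath"
    "testing"
)

func TestGetUserContract(t *testing.T) {
    got := callGetUser(t) // hypothetical helper returning the serialized response
    golden := filepath.Join("testdata", "get_user.golden.json")
    want, err := os.ReadFile(golden)
    if err != nil {
        t.Fatalf("read golden file: %v", err)
    }
    if !bytes.Equal(got, want) {
        t.Errorf("response does not match %s\ngot:  %s\nwant: %s", golden, got, want)
    }
}
```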
# Performance
- Avoid scattering global connection pools across packages; construct a single pool per process and inject it.
- Prefer `sync.Pool` for hot structs; benchmark before merging (sketch after this list).
- Profile regularly (`pprof`) in CI; block commits that raise p95 latency > 20 %.
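A minimal `sync.Pool` sketch for a hot encode path; treat it as a candidate to benchmark, not a default:

```go
package encode

import (
    "bytes"
    "sync"
)

var bufPool = sync.Pool{
    New: func() any { return new(bytes.Buffer) },
}

// Marshal reuses pooled buffers to cut allocations on hot paths.
func Marshal(fill func(*bytes.Buffer)) []byte {
    buf := bufPool.Get().(*bytes.Buffer)
    defer func() {
        buf.Reset()
        bufPool.Put(buf)
    }()
    fill(buf)
    // Copy out: the buffer returns to the pool and must not escape.
    return append([]byte(nil), buf.Bytes()...)
}
```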
# Security
- Require JWT or mTLS on every endpoint.
- Validate JWT signature & expiration; reject unsigned or expired tokens (sketch after this list).
- Rotate secrets using Kubernetes Secrets + external secret manager.
- Run containers as non-root; set `readOnlyRootFilesystem: true`.
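A minimal signature-and-expiry check sketch using `github.com/golang-jwt/jwt/v5` (one library option among several); the HMAC key source is an assumption:

```go
package auth

import (
    "errors"
    "fmt"

    "github.com/golang-jwt/jwt/v5"
)

// ValidateJWT rejects unsigned, tampered, or expired tokens; jwt/v5
// checks exp/nbf on RegisteredClaims during parsing.
func ValidateJWT(tokenString string, key []byte) (*jwt.RegisteredClaims, error) {
    claims := &jwt.RegisteredClaims{}
    tok, err := jwt.ParseWithClaims(tokenString, claims, func(t *jwt.Token) (any, error) {
        if _, ok := t.Method.(*jwt.SigningMethodHMAC); !ok {
            return nil, fmt.Errorf("unexpected signing method %v", t.Header["alg"])
        }
        return key, nil
    })
    if err != nil {
        return nil, fmt.Errorf("invalid token: %w", err)
    }
    if !tok.Valid {
        return nil, errors.New("invalid token")
    }
    return claims, nil
}
```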
# Observability
- Use OpenTelemetry for traces & metrics; export OTLP to collector (setup sketched after this list).
- Correlate logs, metrics, and traces using `trace_id`.
- Record RED metrics: Requests, Errors, Duration.
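A minimal trace-pipeline sketch with the official SDK, assuming an OTLP/gRPC collector reachable at the default endpoint:

```go
package telemetry

import (
    "context"

    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
    sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

// InitTracer wires spans to the collector; call the returned func during
// graceful shutdown to flush pending spans.
func InitTracer(ctx context.Context) (func(context.Context) error, error) {
    exp, err := otlptracegrpc.New(ctx)
    if err != nil {
        return nil, err
    }
    tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(exp))
    otel.SetTracerProvider(tp)
    return tp.Shutdown, nil
}
```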
# CI/CD
- Multi-stage Dockerfile: build → scratch/distroless runtime.
- Run `go vet`, `staticcheck`, `golangci-lint` on every PR.
- Automatically build and push image tags `v<semver>` and latest commit SHA.
- Deploy to staging on merge; promote to prod only after smoke tests pass.
# Common Pitfalls & Anti-Patterns
- Caching in memory → breaks statelessness. Use Redis or CDN instead.
- Sticky sessions in load balancer → disable; use round-robin.
- Ignoring `SIGTERM` → implement graceful shutdown: stop accepting, drain, exit.
- Writing to local filesystem → direct logs to STDOUT and persistent data to object storage.
# Example: Minimal Handler Skeleton
```go
func (s *Server) GetUser(ctx context.Context, req *api.GetUserRequest) (*api.GetUserResponse, error) {
    if err := validateGetUser(req); err != nil {
        return nil, status.Error(codes.InvalidArgument, err.Error())
    }
    user, err := s.repo.FetchUser(ctx, req.Id)
    if errors.Is(err, repository.ErrNotFound) {
        return nil, status.Error(codes.NotFound, "user not found")
    }
    if err != nil {
        return nil, status.Errorf(codes.Internal, "fetch user: %v", err)
    }
    return &api.GetUserResponse{User: toProto(user)}, nil
}
```
# Fast Startup Checklist
- Compile with `-race` only in test stages, not prod.
- Defer heavy initializations (e.g., warm caches) until first request.
- Lazy-load TLS certs via server option.
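A minimal lazy-TLS sketch for the last item: the certificate is read on the first handshake via `tls.Config.GetCertificate` rather than at boot; the file paths are assumptions:

```go
package main

import (
    "crypto/tls"
    "net/http"
    "sync"
)

// loadCert runs once, on the first TLS handshake, and caches the result.
var loadCert = sync.OnceValues(func() (*tls.Certificate, error) {
    cert, err := tls.LoadX509KeyPair("/etc/tls/tls.crt", "/etc/tls/tls.key") // assumed mount paths
    return &cert, err
})

func newTLSServer(h http.Handler) *http.Server {
    return &http.Server{
        Addr:    ":8443",
        Handler: h,
        TLSConfig: &tls.Config{
            GetCertificate: func(*tls.ClientHelloInfo) (*tls.Certificate, error) {
                return loadCert()
            },
        },
    }
}
```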
# Graceful Shutdown Template
```go
srv := &http.Server{Handler: r, Addr: ":8080"}
go func() { _ = srv.ListenAndServe() }()
ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
defer stop()
<-ctx.Done()
shutdownCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
_ = srv.Shutdown(shutdownCtx)
```