Comprehensive Coding Rules for building, operating and maintaining event-driven systems on Apache Kafka with Java.
You know the pain. You've started building with Kafka, but every sprint brings new complexities: schema evolution breaking downstream services, consumer lag spiraling out of control, mysterious rebalancing issues killing performance, and security configurations that make you question your life choices.
The truth? Most teams spend 60-80% of their Kafka development time wrestling with operational concerns instead of building features. These Cursor Rules change that equation completely.
Building production-ready event-driven systems isn't just about publishing and consuming messages. You're juggling schema design and evolution, consumer group tuning, security hardening, and day-to-day operations all at once.
Meanwhile, your product team keeps asking why "simple" event-driven features take three sprints to deliver.
These Cursor Rules transform Cursor into your Kafka architecture expert. Instead of googling "Kafka consumer group rebalancing" for the hundredth time, you get contextually perfect code that follows enterprise patterns from day one.
What you get:
Real example: Instead of spending two days debugging why your consumer group keeps rebalancing, Cursor generates properly configured consumers with cooperative sticky assignment and tuned heartbeat intervals. Problem solved before it starts.
Skip the research phase. Get production-ready Kafka code patterns instantly, complete with proper error handling and schema validation.
```java
// Before: hours of research plus trial and error.
// After: Cursor generates this immediately:
@Component
public class UserEventConsumer {

    @KafkaListener(topics = "user.profile.updated",
                   groupId = "notification-service",
                   errorHandler = "kafkaErrorHandler")
    public void handleUserProfileUpdated(
            @Payload UserProfileUpdated event,
            @Headers MessageHeaders headers,
            Acknowledgment ack) {
        try {
            validateEvent(event);
            notificationService.sendProfileUpdateNotification(event);
            ack.acknowledge();
        } catch (ValidationException e) {
            // Park the invalid event on a dead-letter topic and move on.
            deadLetterService.send(event, e);
            ack.acknowledge();
        }
    }
}
```
Automatic Schema Registry integration means breaking changes get caught at build time, not in production.
Every generated component comes with proper metrics, logging, and health checks. No more flying blind.
Generated code includes comprehensive test suites with Testcontainers, contract tests, and topology testing for Kafka Streams.
Before these rules: 8-16 hours to research, configure, and debug a single consumer.
After these rules: 15-30 minutes to review and customize the generated code.
Before: Days of reading documentation, understanding topology patterns, configuring state stores, implementing exactly-once semantics, and debugging serialization issues.
After: Cursor generates complete Kafka Streams applications with exactly-once processing, state store configuration, and topology tests included.
Before: Manual schema updates, deployment coordination nightmares, broken consumers discovering incompatibilities in production.
After: Cursor generates schema evolution patterns with automatic compatibility checks, versioning strategies, and migration plans.
Ask Cursor to create a producer:
"Create a Kafka producer for user registration events with Schema Registry integration, proper error handling, and comprehensive tests"
You'll get a complete implementation: producer configuration, Schema Registry serialization, error handling, and tests.
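A sketch of the shape of that output, assuming an Avro-generated `UserRegistered` class and local broker/Schema Registry endpoints (all names here are illustrative):

```java
// Sketch only: the kind of producer the rules aim to generate.
// UserRegistered is assumed to be Avro-generated; endpoints and topic name are illustrative.
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class UserRegistrationEventProducer {

    private static final String TOPIC = "user.registration.completed"; // illustrative

    private final KafkaProducer<String, UserRegistered> producer;

    public UserRegistrationEventProducer() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                  io.confluent.kafka.serializers.KafkaAvroSerializer.class);
        props.put("schema.registry.url", "http://localhost:8081"); // Confluent serializer setting
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
        this.producer = new KafkaProducer<>(props);
    }

    public void publish(UserRegistered event) {
        // Async send; the callback surfaces broker or serialization failures
        // so they can be logged and routed to retry/dead-letter handling.
        producer.send(new ProducerRecord<>(TOPIC, event.getUserId().toString(), event),
                (metadata, ex) -> {
                    if (ex != null) {
                        // hand off to error handling here
                    }
                });
    }
}
```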
"Create a Kafka Streams application that processes user events to maintain a real-time user preferences view"
Cursor generates the full stream topology, state store configuration, and accompanying topology tests.
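A sketch of what that topology might look like, with placeholder topic names and String values standing in for Avro types:

```java
// Sketch: Streams topology maintaining a user-preferences view from user events.
// Topic names and String values are placeholders; real code would use Avro SerDes.
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Materialized;

public class UserPreferencesTopology {

    public static KafkaStreams build() {
        StreamsBuilder builder = new StreamsBuilder();

        // Fold each preference-changed event into the latest view per user,
        // backed by a RocksDB state store that can be queried interactively.
        builder.stream("user.preferences.changed",
                       Consumed.with(Serdes.String(), Serdes.String()))
               .groupByKey()
               .reduce((current, update) -> update,
                       Materialized.as("user-preferences-store"));

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "user-preferences-view");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE_V2);
        return new KafkaStreams(builder.build(), props);
    }
}
```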
"Add monitoring, alerting, and deployment configuration for my Kafka cluster"
You get the monitoring dashboards, alert rules, and deployment configuration you asked for, ready to adapt.
These Cursor Rules don't just generate Kafka code—they encode years of production experience into every suggestion. You get enterprise-grade patterns, proper error handling, comprehensive testing, and operational excellence built into every component.
Stop spending your sprints on Kafka plumbing. Start building the event-driven features your users actually want.
The rules handle the complexity. You focus on the business value.
Ready to transform your Kafka development workflow? Install these rules and watch your next sprint velocity double.
You are an expert in Java 17+, Apache Kafka (brokers, KRaft, Streams, Connect), Confluent Schema Registry, Docker/Kubernetes, AWS Lambda, Testcontainers, Prometheus/Grafana.
Key Principles
- Favor an event-driven, asynchronous mindset; every service publishes events that describe facts, not commands.
- Events are immutable; new facts are appended, never updated in place.
- Loose coupling: producers never know the identity or quantity of consumers.
- First-class schemas: every event type is versioned and validated through Schema Registry.
- Security by default: enforce TLS in transit and ACL-based authorization on every cluster.
- Operability: ship metrics, logs and traces; expose health endpoints; automate scaling and rebalancing.
- Reliability: minimum three brokers, replication factor ≥ 3, rack-aware allocation.
Java
- Use Java 17+ and the official `org.apache.kafka` 3.x client libraries.
- Prefer Gradle (Kotlin DSL) or Maven with declarative dependency management.
- Model events as `record` types (immutable) and generate Avro/Protobuf classes from schema (see the sketch after this list).
- Enable compiler warnings: `-Werror -Xlint:all`.
- Package layout: `com.<company>.<domain>.<service>[.stream|.consumer|.producer]`.
- Use slf4j + log4j2; never use `System.out.println`.
- Configure `ExecutorService` with bounded queues; avoid blocking in consumer threads.
- Use try-with-resources for producer/consumer lifecycle:
```java
try (KafkaProducer<String, UserCreated> producer = new KafkaProducer<>(props)) {
producer.send(new ProducerRecord<>(TOPIC, key, event)).get();
}
```
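As an example of the record-based event modeling mentioned above, a hand-written sketch (in practice the class is generated from the Avro/Protobuf schema; field names are illustrative):

```java
// Sketch: an immutable event modeled as a Java record.
// In real projects this class is generated from the schema rather than written by hand.
import java.time.Instant;
import java.util.UUID;

public record UserCreated(
        UUID eventId,      // unique per event, used for deduplication downstream
        String userId,     // aggregate identifier, also used as the Kafka message key
        String email,
        Instant occurredAt) {
}
```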
Error Handling and Validation
- Validate events against schema on both producer & consumer using Confluent serializers/deserializers.
- Producers: `acks=all`, `enable.idempotence=true`, `retries=Integer.MAX_VALUE`, `delivery.timeout.ms=120000` (see the sketch after this list).
- Consumers: `max.poll.interval.ms` tuned to processing time; commit offsets after successful processing.
- Use early returns for failed validation; wrap deserialization errors in `BadEventException` and forward to a dead-letter topic (DLT).
- Implement global uncaught-exception handler that logs, forwards to DLT, and triggers graceful shutdown.
- Idempotency: add `eventId` UUID in every event and maintain a deduplication store (e.g., Redis) when side-effects are non-idempotent.
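A minimal sketch of the producer reliability settings above, expressed as client properties (the broker address is illustrative):

```java
// Sketch: reliability-focused producer configuration mirroring the settings above.
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public final class ReliableProducerConfig {

    public static Properties properties() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // illustrative
        props.put(ProducerConfig.ACKS_CONFIG, "all");                  // wait for full ISR acknowledgment
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);     // no duplicates on retry
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);   // retry until the delivery timeout
        props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120_000); // overall delivery budget
        return props;
    }

    private ReliableProducerConfig() { }
}
```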
Apache Kafka
- Cluster
• Minimum 3 brokers, enable KRaft; place each broker in a different rack/zone.
• Retention: log compaction for key-based facts, time-based retention for streams.
• Default replication factor = 3; min ISR = 2.
- Topics
• Naming: `<domain>.<bounded_context>.<entity>.<event>` (snake_case prohibited).
• One topic per event type; detach command topics from event topics.
• Partitions = max(expected_throughput / 10 MBps, number_of_consumers × 2).
- Security
• `listener.name.external.ssl` with TLS 1.2+, mTLS optional.
• Define ACLs per principal (`User:service-a`) per topic/cluster action.
- Producers
• Use `batch.size` ≤ 256 KB, `linger.ms` 5–20 ms for throughput.
• Compress with `compression.type=zstd`.
- Consumers
• Use cooperative rebalancing (`partition.assignment.strategy=org.apache.kafka.clients.consumer.CooperativeStickyAssignor`); see the consumer sketch after this list.
• Set `heartbeat.interval.ms` to no more than one third of `session.timeout.ms`.
- Kafka Streams
• Always enable exactly-once v2 (`processing.guarantee=exactly_once_v2`).
• State stores backed by RocksDB; checkpoint to durable volumes in K8s.
• Use `TopologyTestDriver` for unit tests.
- Kafka Connect
• Declaratively manage connectors via Git-ops (YAML) + REST API.
• All connectors run with `errors.tolerance=all` and dead-letter queues.
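A minimal sketch of a consumer configured along the lines above: cooperative-sticky assignment, heartbeat at one third of the session timeout, and manual offset commits (group id, timeouts, and deserializers are illustrative):

```java
// Sketch: consumer configuration following the guidance above; timeout values are illustrative.
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.CooperativeStickyAssignor;
import org.apache.kafka.common.serialization.StringDeserializer;

public final class NotificationConsumerConfig {

    public static Properties properties() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "notification-service");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        // Cooperative rebalancing avoids stop-the-world partition revocation.
        props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
                  CooperativeStickyAssignor.class.getName());
        // Heartbeat at roughly one third of the session timeout.
        props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 45_000);
        props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, 15_000);
        // Size the poll interval to worst-case batch processing time.
        props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 300_000);
        // Commit offsets explicitly, only after successful processing.
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
        return props;
    }

    private NotificationConsumerConfig() { }
}
```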
Additional Sections
Testing
- Use Testcontainers’ `KafkaContainer` for integration tests (see the sketch after this list); spin up Schema Registry alongside it when schemas are involved.
- Use `EmbeddedKafkaBroker` only for unit tests; prefer real containers for CI.
- Adopt contract tests: producer publishes sample event, consumer test suite verifies compatibility using Schema Registry’s compatibility API.
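A minimal sketch of such an integration test, assuming JUnit 5 and the classic Testcontainers Kafka module; the image tag and assertion are illustrative:

```java
// Sketch: integration test against a real broker via Testcontainers (JUnit 5).
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;
import org.junit.jupiter.api.Test;
import org.testcontainers.containers.KafkaContainer;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;
import org.testcontainers.utility.DockerImageName;
import static org.junit.jupiter.api.Assertions.assertEquals;

@Testcontainers
class UserEventRoundTripIT {

    @Container
    static final KafkaContainer KAFKA =
            new KafkaContainer(DockerImageName.parse("confluentinc/cp-kafka:7.6.1")); // illustrative tag

    @Test
    void publishedEventIsConsumed() {
        Properties producerProps = new Properties();
        producerProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, KAFKA.getBootstrapServers());
        producerProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        producerProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

        Properties consumerProps = new Properties();
        consumerProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, KAFKA.getBootstrapServers());
        consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "it-test");
        consumerProps.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        consumerProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        consumerProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps);
             KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
            // Publish one event, then verify a fresh consumer group reads it back.
            producer.send(new ProducerRecord<>("user.profile.updated", "user-1", "{\"name\":\"Ada\"}"));
            producer.flush();

            consumer.subscribe(List.of("user.profile.updated"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(10));
            assertEquals(1, records.count());
        }
    }
}
```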
Performance
- Profile consumer lag with `kafka-consumer-groups --describe` or programmatically via the Admin client (see the sketch after this list); alert on lag > (partition_count × poll_interval).
- Turn on `leader.replication.throttled.replicas` during broker replacement.
- Monitor disk utilization and throttle producers when ≥ 80 %.
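Lag can also be computed directly with the Admin client, as in this sketch (group id and bootstrap address are illustrative):

```java
// Sketch: compute per-partition consumer lag as log-end offset minus committed offset.
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ListOffsetsResult;
import org.apache.kafka.clients.admin.OffsetSpec;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public final class ConsumerLagProbe {

    public static Map<TopicPartition, Long> lagFor(String groupId) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // illustrative

        try (Admin admin = Admin.create(props)) {
            // Offsets the group has committed per partition.
            Map<TopicPartition, OffsetAndMetadata> committed =
                    admin.listConsumerGroupOffsets(groupId)
                         .partitionsToOffsetAndMetadata()
                         .get();

            // Latest (log-end) offsets for the same partitions.
            Map<TopicPartition, OffsetSpec> latestSpec = new HashMap<>();
            committed.keySet().forEach(tp -> latestSpec.put(tp, OffsetSpec.latest()));
            Map<TopicPartition, ListOffsetsResult.ListOffsetsResultInfo> latest =
                    admin.listOffsets(latestSpec).all().get();

            Map<TopicPartition, Long> lag = new HashMap<>();
            committed.forEach((tp, meta) ->
                    lag.put(tp, latest.get(tp).offset() - meta.offset()));
            return lag;
        }
    }

    private ConsumerLagProbe() { }
}
```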
Security
- Rotate client certificates every 90 days; automate via cert-manager.
- Store SASL/SCRAM credentials in Kubernetes secrets with RBAC.
- Forward broker and audit logs to SIEM (e.g., Splunk) and alert on anomalous ACL denials.
Deployment & Operations
- Run brokers as StatefulSets with PodDisruptionBudgets (`minAvailable: 2`).
- Mount fast SSD/NVMe for log dirs; one disk per broker.
- Use Prometheus JMX exporter; dashboards: throughput, P99 latency, under-replicated partitions, consumer lag.
- Automate rolling restarts via Ansible/ArgoCD; verify no under-replicated partitions before next pod.
Documentation & Versioning
- Store event schemas in `schemas/` folder; version directories `v1`, `v2`.
- Document each event with purpose, example JSON/Avro, producer(s) and consumer(s).
- ADRs (Architecture Decision Records) for topic naming, retention changes, and partition strategy.