Actionable, end-to-end guidelines for designing, operating, and scaling Apache NiFi data flows with high performance, reliability, and security.
Building reliable, high-performance data flows shouldn't require months of trial-and-error tuning. These comprehensive Cursor Rules transform your NiFi development from reactive troubleshooting to proactive engineering excellence.
Every day, data engineers burn hours on the same preventable issues: debugging back-pressure, cleaning up repositories by hand, scrambling to patch audit findings, and untangling monolithic flows.
These aren't "learning experiences" – they're productivity killers that these rules eliminate entirely.
These rules provide battle-tested configurations and patterns for every aspect of NiFi development:
- **Performance Architecture:** Repository separation strategies, JVM tuning profiles, and OS-level optimizations that handle terabyte-scale daily throughput
- **Production Security:** HTTPS-only configurations, LDAP integration patterns, and certificate rotation automation that satisfy enterprise security requirements
- **Operational Excellence:** Monitoring dashboards, alerting thresholds, and automated health checks that catch issues before they impact data delivery
- **Development Workflows:** Modular flow design patterns, Git-versioned templates, and comprehensive testing frameworks
```text
# Typical daily fire-fighting scenarios
- 2+ hours debugging back-pressure issues
- Manual repository cleanup every week
- Security audit findings requiring urgent patches
- Difficult troubleshooting in monolithic flows
```

```text
# Proactive, predictable operations
- <5 minutes MTTR with proper monitoring dashboards
- Automated repository management with optimal performance
- Security-compliant by default with zero findings
- Modular flows enabling 10x faster debugging
```
**Previous Approach:** Single massive flow processing everything

```text
# Monolithic design causing frequent issues
IngestEverything-Processor → TransformEverything → OutputEverything
├── Debugging requires stopping entire pipeline
├── One component failure affects all data types
└── Performance tuning impacts unrelated processes
```
**Rules-Based Approach:** Modular, domain-specific flows

```text
Fetch-S3-Orders → Validate-Order-Schema → Route-Priority-Orders
Fetch-Kafka-Events → Transform-Event-JSON → Write-HDFS-Events
Monitor-Flow-Health → Alert-Slack-Channel
```

**Impact:** 75% faster troubleshooting, independent scaling per data type
**Previous Setup:** All repositories on a single disk

```text
# Common performance killer
/nifi/data/
├── flowfile_repository/    # competing for I/O
├── content_repository/     # with content storage
└── provenance_repository/  # and provenance writes
```
**Rules-Based Setup:** Isolated high-speed storage

```text
# Optimal I/O distribution
/nifi/flowfile/   # SSD  - frequent small reads/writes
/nifi/content/    # NVMe - large data blocks
/nifi/prov/       # SSD  - write-heavy provenance data
```

**Impact:** 3-5x throughput improvement, eliminated I/O bottlenecks
**Default Configuration:** HTTP development setup in production

```properties
# Audit nightmare waiting to happen
nifi.web.http.port=8080
nifi.web.https.port=            # empty - no encryption
nifi.security.user.authorizer=  # no access controls
```
**Rules-Based Security:** Enterprise-ready from day one

```properties
# Secure-by-default configuration (nifi.properties uses key=value syntax)
nifi.web.https.host=your-nifi.company.com
nifi.web.https.port=8443
# managed-authorizer pairs with the ldap-user-group-provider defined in authorizers.xml
nifi.security.user.authorizer=managed-authorizer
# Injected from Vault at deploy time
nifi.sensitive.props.key=${VAULT_SENSITIVE_KEY}
```

**Impact:** Zero security findings, automated compliance reporting
```bash
# Apply JVM optimizations (env vars read by the apache/nifi container image;
# on bare metal, set the equivalent java.arg.* lines in conf/bootstrap.conf,
# e.g. java.arg.13=-XX:+UseG1GC and java.arg.14=-XX:MaxGCPauseMillis=200)
export NIFI_JVM_HEAP_INIT=8g NIFI_JVM_HEAP_MAX=8g

# Configure repository separation
sudo mkdir -p /nifi/{flowfile,content,prov}
sudo chown nifi:nifi /nifi/{flowfile,content,prov}

# Generate certificates and configure HTTPS (NiFi Toolkit)
./bin/tls-toolkit.sh standalone -n nifi-node1.company.com
# Then update nifi.properties with the HTTPS settings from the rules
```

```yaml
# Prometheus metrics configuration (illustrative values;
# the PrometheusReportingTask itself is created via the NiFi UI or REST API)
reporting.task.PrometheusReportingTask:
  endpoint: /metrics
  port: 9092
```
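To confirm the reporting task is publishing, scrape the endpoint directly. The port matches the configuration above; the `nifi_` metric prefix is what recent NiFi versions emit, but verify against your own /metrics output:

```bash
# Sanity-check that NiFi metrics are being exposed
curl -s http://localhost:9092/metrics | grep -m 5 '^nifi_'
```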
Follow the `<Verb>-<Object>-<Qualifier>` naming convention:
- Fetch-S3-CustomerData
- Validate-CSV-Schema
- Route-Error-Records

What to expect as you adopt the rules:
- Week 1: Immediate stability improvements
- Week 2-4: Operational transformation
- Long-term: Engineering excellence achieved
- Cluster Architecture: 3-node minimum with external Zookeeper, dedicated network VLANs
- Container Deployment: Kubernetes Helm charts with production-ready resource limits
- Observability Integration: Prometheus/Grafana dashboards, Slack alerting webhooks
- Development Lifecycle: Git-versioned flow templates, automated testing frameworks
These rules represent thousands of hours of production NiFi experience distilled into immediately actionable configuration patterns. Stop fighting default settings and start building data infrastructure that scales reliably from day one.
Your data engineering team deserves tools that enhance their expertise rather than create unnecessary complexity. These rules deliver exactly that transformation.
You are an expert in Apache NiFi, JVM tuning, Linux operations, and observability tooling (Prometheus, Grafana, Zabbix).
Technology Stack Declaration
- Apache NiFi 1.22+ (stand-alone & clustered)
- Java 17 JVM & Linux (Ubuntu 20.04+/RHEL 8+)
- Python/Groovy/SQL for ExecuteScript & ExecuteStreamCommand
- Apache Kafka, S3/HDFS, relational DBs via JDBC
- Kubernetes & Helm for containerised deployments
- Prometheus/Grafana for metrics; Zabbix for infra alerts
Key Principles
- Build small, modular flows; never monolithic mega-canvases.
- Favour stateless, idempotent processors; keep state in external stores.
- Throughput first: isolate repositories on independent SSD/NVMe.
- Secure by default: TLS-only UI/API; RBAC via NiFi Registry/LDAP.
- Document every change in flow comments + Git-versioned templates.
- Tune OS/JVM/NiFi per workload; never rely on defaults.
Language-Specific Rules (NiFi Flow DSL & Scripting)
- Processor Naming
• <Verb>-<Object>-<Qualifier>, e.g. Fetch-S3-Images, Route-CSV-Errors.
• Use camelCase for processor ID-level variables; snake_case for FlowFile attributes (e.g. file_type, retry_count).
- Attribute Keys
• Prefix domain-specific keys: kafka.partition, s3.etag.
• Reserve nifi.* prefix strictly for NiFi internals.
- ExecuteScript / ExecuteStreamCommand
• Always declare `@Grapes` dependencies at top for Groovy.
• In Python, wrap script in `def onTrigger(context):` for TestRunner.
• Transfer to `REL_SUCCESS` last; exit early on failure/penalise paths (see the Groovy sketch after this list).
- Connection Settings
• Default back-pressure: object count ≤ 10 000 or data size ≤ 2 GB.
• Enable the `PriorityAttributePrioritizer` (reads the `priority` FlowFile attribute) when SLAs exist.
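A minimal ExecuteScript (Groovy) sketch following these rules — early exits on failure paths, `REL_SUCCESS` transferred last. The `file_type` and `retry_count` attributes are the illustrative names from the conventions above:

```groovy
// ExecuteScript binds session, log, REL_SUCCESS and REL_FAILURE automatically
def flowFile = session.get()
if (!flowFile) return  // nothing queued

try {
    // Early exit: penalise and fail fast when a required attribute is missing
    if (!flowFile.getAttribute('file_type')) {
        session.transfer(session.penalize(flowFile), REL_FAILURE)
        return
    }
    flowFile = session.putAttribute(flowFile, 'retry_count', '0')
    session.transfer(flowFile, REL_SUCCESS)  // success path last
} catch (Exception e) {
    log.error('Scripted processing failed', e)
    session.transfer(session.penalize(flowFile), REL_FAILURE)
}
```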
Error Handling and Validation
- Begin each flow segment with `ValidateRecord`/`ValidateCsv` to drop bad data early.
- Split error relationships explicitly:
• REL_FAILURE → Put to Dead-Letter Queue (DLQ)
• REL_RETRY → penalise, then route back to the upstream queue
- Enable `penalize` on failure relationships to introduce delay (prevents hot-looping).
- Implement back-pressure on every fan-in/fan-out connection; cap total queued data size at roughly half the available RAM.
- Use `MonitorActivity` plus a Slack webhook for flow-stalled alerts (inactivity threshold of 5 minutes or less).
Framework-Specific Rules (Apache NiFi)
Repositories
- FlowFile, Content, Provenance on separate mount points: /nifi/flowfile, /nifi/content, /nifi/prov (each XFS/ext4 on SSD/NVMe).
- Set `nifi.provenance.repository.max.storage.time=48 hours` (cluster) or 24 h (stand-alone).
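A minimal sketch of the matching `nifi.properties` entries, assuming the mount points above:

```bash
# Point each repository at its dedicated mount (paths from the rules above)
cat >> conf/nifi.properties <<'EOF'
nifi.flowfile.repository.directory=/nifi/flowfile
nifi.content.repository.directory.default=/nifi/content
nifi.provenance.repository.directory.default=/nifi/prov
nifi.provenance.repository.max.storage.time=48 hours
EOF
```

Appending works because the last occurrence of a key wins when the properties file is loaded; for production, edit the existing keys in place.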
Concurrency & Scheduling
- Increase `nifi.queue.swap.threshold=50000` for high-queue workloads.
- Processor scheduling: use Event-Driven only for Kafka/MQ consumers; Timer-Driven for CPU-bound tasks.
- Set `nifi.flowservice.writedelay.interval=1 min` to reduce disk thrash.
Cluster Configuration
- Minimum 3-node, odd number for Zookeeper quorum.
- Use embedded ZK only for POC; in prod, external ZK ensemble.
- Configure `nifi.cluster.node.protocol.port` on dedicated VLAN.
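A sketch of the per-node cluster entries in `nifi.properties`; hostnames, ports, and the ZooKeeper connect string are placeholders:

```bash
# Per-node cluster settings (example values)
cat >> conf/nifi.properties <<'EOF'
nifi.cluster.is.node=true
nifi.cluster.node.address=nifi-node1.company.com
nifi.cluster.node.protocol.port=11443
nifi.zookeeper.connect.string=zk1.company.com:2181,zk2.company.com:2181,zk3.company.com:2181
EOF
```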
DevOps & Deployment
- Container image env var overrides: NIFI_WEB_HTTPS_PORT, NIFI_JVM_HEAP_INIT=8g, NIFI_JVM_HEAP_MAX=8g.
- Use Helm chart values: `properties.isSecure=true`, `properties.writeAheadLog.enable=true`.
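Outside Helm, the same overrides can be passed straight to the apache/nifi image for a quick single-node check; the image tag is illustrative:

```bash
# Single secured node with the env-var overrides above
docker run -d --name nifi \
  -p 8443:8443 \
  -e NIFI_WEB_HTTPS_PORT=8443 \
  -e NIFI_JVM_HEAP_INIT=8g \
  -e NIFI_JVM_HEAP_MAX=8g \
  apache/nifi:1.23.2
# Default single-user credentials are printed in the container logs
docker logs nifi 2>&1 | grep -i 'generated'
```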
Additional Sections
Testing
- Use the NiFi mock framework (`nifi-mock`) and its `TestRunner` to unit-test custom processors.
- Maintain a dedicated DEV NiFi with identical configs; auto-deploy branches via GitOps.
- Smoke test: ingest 1 GB synthetic data, expect <2% error, <5 min lag.
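The smoke test can be driven from the shell, assuming the DEV flow picks files up from a watched ingest directory (the path is a placeholder):

```bash
# Feed 1 GB of synthetic data into the DEV flow's watched directory
dd if=/dev/urandom of=/nifi/ingest/synthetic_$(date +%s).bin bs=1M count=1024
# Then confirm via provenance/dashboards: <2% routed to failure, <5 min end-to-end lag
```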
Performance
- JVM: `-Xms8g -Xmx8g -XX:+UseG1GC -XX:MaxGCPauseMillis=200`.
- OS: `vm.swappiness=1`, `fs.file-max=262144`, `noatime` mount option.
- Increase `nifi.max.concurrent.write.streams=64` on NVMe arrays.
- Run `nifi.sh diagnostics` weekly; archive logs older than 14 days (shell sketch below).
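A shell sketch of the OS tuning and weekly maintenance above; the fstab entry is an illustrative template:

```bash
# Persist the kernel tuning from the rules above
printf 'vm.swappiness=1\nfs.file-max=262144\n' | sudo tee /etc/sysctl.d/99-nifi.conf
sudo sysctl --system

# Example /etc/fstab entry with noatime (UUID is a placeholder)
# UUID=<uuid>  /nifi/content  xfs  defaults,noatime  0 2

# Weekly diagnostics snapshot and archival of rolled logs
./bin/nifi.sh diagnostics /tmp/nifi-diagnostics-$(date +%F).txt
find ./logs -name '*.log' -mtime +14 -exec gzip {} \;
```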
Security
- Mandatory HTTPS: set `nifi.web.https.host`, `nifi.web.https.port`.
- Enable sensitive props key in `nifi.properties` and store in Vault.
- Map LDAP groups → NiFi policies via `authorizers.xml`.
- Regularly rotate certs; use ACME for automation.
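A quick expiry check supporting the rotation rule; hostname and port are the HTTPS values from the security section:

```bash
# Print the NiFi certificate's expiry date
echo | openssl s_client -connect your-nifi.company.com:8443 -servername your-nifi.company.com 2>/dev/null \
  | openssl x509 -noout -enddate
```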
Observability
- Expose metrics via `PrometheusReportingTask` on /metrics.
- Dashboards: CPU, heap, FlowFiles queued, JVM GC time, back-pressure ratio.
- Alert thresholds: Heap > 85 %, queue > 80 %, repo FS > 70 %.
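A minimal Prometheus alert rule matching the heap threshold above. The metric name is an assumption — names vary by NiFi version, so verify against your /metrics output first:

```bash
# Hypothetical alert rule; confirm the metric name on your /metrics endpoint
cat > /etc/prometheus/rules/nifi.yml <<'EOF'
groups:
  - name: nifi
    rules:
      - alert: NiFiHeapHigh
        expr: nifi_jvm_heap_usage > 0.85   # assumed metric name
        for: 5m
        labels:
          severity: warning
EOF
```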
Documentation & Governance
- All canvas changes saved to NiFi Registry; commit messages: <JIRA-ID>: <summary>.
- Tag flows with labels: team-name, data-domain, SLA.
- Store versioned flows in Git main branch; protect with PR reviews.
Common Pitfalls & Remedies
- Problem: UI freezes under heavy Provenance load → reduce `max.storage.time`, move the provenance repo to a faster disk.
- Problem: Back-pressure never clears → Check downstream penalised queue, increase concurrent tasks.
- Problem: Cluster node offloads constantly → network MTU mismatch; set MTU to 9000 (jumbo frames) consistently across all nodes.
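To confirm the MTU remedy, compare interface settings on every node; the interface name is a placeholder:

```bash
# Verify, then set, jumbo frames on each node
ip link show eth0 | grep -o 'mtu [0-9]*'
sudo ip link set dev eth0 mtu 9000
```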
Quick-Reference Commands
```bash
# Start NiFi
./bin/nifi.sh start

# On-demand JVM heap dump (pair with -XX:+HeapDumpOnOutOfMemoryError in bootstrap.conf for OOM capture)
jmap -dump:live,file=/tmp/heap.hprof $(pgrep -f 'org.apache.nifi.NiFi')

# Export a versioned flow from NiFi Registry (NiFi Toolkit CLI; flow ID is a placeholder)
./bin/cli.sh registry export-flow-version -f <flow-id> -ot json > order_ingest.json
```