Comprehensive coding & architectural guidelines for building secure, privacy-preserving federated-learning systems with Python and leading FL frameworks.
Building privacy-preserving federated learning systems requires juggling multiple frameworks, complex security patterns, and distributed architectures. These Cursor Rules eliminate the cognitive overhead and boilerplate that slow down FL development, letting you focus on model performance and privacy guarantees instead of framework syntax.
You're building systems where data never leaves client devices, model updates must be differentially private, and aggregation happens across heterogeneous, intermittently connected nodes. But current development workflows force you to research framework-specific APIs, hand-configure secure aggregation and privacy accounting, and rebuild the same distributed plumbing for every project.
The result? Weeks spent on infrastructure instead of improving model accuracy and privacy guarantees.
These Cursor Rules provide instant access to battle-tested FL patterns that handle the complexity for you. Instead of remembering TFF's `tff.learning.algorithms.build_weighted_fed_avg` syntax or configuring Flower's secure aggregation, you get complete, validated implementations with privacy-by-design built in.
When you need a federated averaging setup with differential privacy:
```python
# Instead of researching and configuring this manually:
aggregator = tff.aggregators.DifferentiallyPrivateFactory.gaussian_fixed(
    noise_multiplier=1.1,
    clients_per_round=10,
    clip=1.0,
)

# You get complete implementations with validation:
def create_dp_federated_process(
    model_fn: Callable[[], tff.learning.models.VariableModel],
    client_data: tff.simulation.datasets.ClientData,
    noise_multiplier: float = 1.1,
    clipping_norm: float = 1.0,
) -> tff.templates.IterativeProcess:
    """Creates a differentially private federated averaging process."""
    # Complete implementation with error handling, validation, and monitoring
```
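For reference, a minimal end-to-end sketch of what such an implementation can look like. It assumes TFF's `DifferentiallyPrivateFactory.gaussian_fixed` helper and the unweighted FedAvg builder (DP aggregation is unweighted); the optimizer choice and the validation shown are illustrative, not prescriptive:

```python
import tensorflow as tf
import tensorflow_federated as tff


def create_dp_federated_process(
    model_fn,
    noise_multiplier: float = 1.1,
    clients_per_round: int = 10,
    clipping_norm: float = 1.0,
):
    """Builds federated averaging with fixed-clip Gaussian DP."""
    # Fail fast on invalid privacy parameters (fail fast & loud).
    if noise_multiplier <= 0 or clipping_norm <= 0:
        raise ValueError("noise_multiplier and clipping_norm must be positive")

    # Clip each client update to `clipping_norm`, then add Gaussian noise
    # calibrated by `noise_multiplier` and the expected cohort size.
    dp_aggregator = tff.aggregators.DifferentiallyPrivateFactory.gaussian_fixed(
        noise_multiplier=noise_multiplier,
        clients_per_round=clients_per_round,
        clip=clipping_norm,
    )

    # DP aggregation is unweighted, so use the unweighted FedAvg builder.
    return tff.learning.algorithms.build_unweighted_fed_avg(
        model_fn,
        client_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=0.01),
        model_aggregator=dp_aggregator,
    )
```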
Switch between TensorFlow Federated, Flower, PySyft, and OpenFL implementations without looking up APIs. Each pattern includes framework-specific implementations with consistent interfaces.
Every code pattern includes differential privacy and secure aggregation by default. No more manually implementing privacy budgets or forgetting to validate client payloads.
Built-in validation for model shapes, privacy epsilon tracking, client timeout handling, and poisoned update detection. Your FL systems fail fast with actionable error messages.
Complete implementations for client sampling, non-IID data handling, secure transport, and horizontal scaling. Focus on model architecture instead of infrastructure complexity.
You're prototyping with TensorFlow Federated but need to deploy with Flower for production scalability:
```python
# TFF prototype becomes Flower production code automatically
class TFFToFlowerAdapter:
    def __init__(self, tff_model_fn: Callable):
        self.tff_model = tff_model_fn()

    def to_flower_client(self) -> fl.client.NumPyClient:
        """Convert TFF model to Flower client with privacy preservation."""
        # Complete adapter implementation with differential privacy
```
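A sketch of what the adapter body can look like, under the assumption that the TFF prototype wraps a plain Keras model we can reach (here passed in directly); `x_train`/`y_train` are placeholder client data:

```python
import flwr as fl


class TFFToFlowerAdapter:
    """Reuses a Keras model from a TFF prototype as a Flower client."""

    def __init__(self, keras_model, x_train, y_train):
        self.model = keras_model
        self.x_train, self.y_train = x_train, y_train

    def to_flower_client(self) -> fl.client.NumPyClient:
        model, x, y = self.model, self.x_train, self.y_train

        class _Client(fl.client.NumPyClient):
            def get_parameters(self, config):
                return model.get_weights()

            def fit(self, parameters, config):
                model.set_weights(parameters)
                model.fit(x, y, epochs=config.get("local_epochs", 1), verbose=0)
                return model.get_weights(), len(x), {}

        return _Client()
```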
Instead of manually tracking epsilon consumption across training rounds:
```python
class PrivacyLedger:
    def __init__(self, epsilon_budget: float, delta: float = 1e-5):
        self.budget = epsilon_budget
        self.consumed = 0.0
        self.delta = delta

    def validate_round(self, noise_multiplier: float, clients: int) -> bool:
        """Validate privacy budget before training round."""
        # Automatic epsilon accounting with early termination
```
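One way to fill in `validate_round`, assuming Google's `dp-accounting` package for RDP accounting and Poisson subsampling from a fixed client population (the `population` parameter is an added assumption):

```python
import dp_accounting


class PrivacyLedger:
    """Tracks cumulative (epsilon, delta) across training rounds."""

    def __init__(self, epsilon_budget: float, delta: float = 1e-5,
                 population: int = 1_000):
        self.budget = epsilon_budget
        self.delta = delta
        self.population = population
        self._accountant = dp_accounting.rdp.RdpAccountant()

    def validate_round(self, noise_multiplier: float, clients: int) -> bool:
        """Compose one round's Gaussian mechanism; False once budget exceeded."""
        event = dp_accounting.PoissonSampledDpEvent(
            sampling_probability=clients / self.population,
            event=dp_accounting.GaussianDpEvent(noise_multiplier),
        )
        self._accountant.compose(event)
        # Caller should terminate training when this returns False.
        return self._accountant.get_epsilon(self.delta) <= self.budget
```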
Get complete mTLS setup with client authentication instead of configuring gRPC manually:
```python
async def create_secure_fl_server(
    strategy: fl.server.strategy.Strategy,
    server_cert: Path,
    client_ca: Path,
) -> None:
    """Launch Flower server with mTLS and client validation."""
    # Complete secure server implementation
```
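A blocking sketch using Flower's `certificates` tuple (CA certificate, server certificate, server private key as raw bytes); the `server_key` parameter and the address are additions here, and full mutual-TLS client verification may need extra gRPC configuration depending on your Flower version:

```python
from pathlib import Path

import flwr as fl


def create_secure_fl_server(
    strategy: fl.server.strategy.Strategy,
    server_cert: Path,
    server_key: Path,
    client_ca: Path,
) -> None:
    """Launch a Flower server over TLS."""
    fl.server.start_server(
        server_address="0.0.0.0:8443",
        config=fl.server.ServerConfig(num_rounds=50),
        strategy=strategy,
        # (CA certificate, server certificate, server private key) as bytes.
        certificates=(
            client_ca.read_bytes(),
            server_cert.read_bytes(),
            server_key.read_bytes(),
        ),
    )
```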
Automatic client sampling and FedProx configuration for heterogeneous data:
```python
def create_heterogeneous_fl_process(
    model_fn: Callable,
    mu: float = 0.1,  # FedProx regularization strength
    client_sample_rate: float = 0.1,
) -> tff.templates.IterativeProcess:
    """Federated process optimized for non-IID data distributions."""
    # FedProx implementation with adaptive client sampling
```
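A sketch of the FedProx half plus a simple uniform sampler, assuming TFF's `build_weighted_fed_prox` builder (its `proximal_strength` argument is the FedProx μ); the sampler is illustrative rather than adaptive:

```python
import numpy as np
import tensorflow as tf
import tensorflow_federated as tff


def create_heterogeneous_fl_process(model_fn, mu: float = 0.1):
    """FedProx: the proximal term (mu) limits client drift on non-IID data."""
    return tff.learning.algorithms.build_weighted_fed_prox(
        model_fn,
        proximal_strength=mu,
        client_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=0.01),
    )


def sample_clients(client_ids, rate: float = 0.1, seed: int | None = None):
    """Uniformly sample a fraction of clients for the next round."""
    rng = np.random.default_rng(seed)
    k = max(1, int(rate * len(client_ids)))
    return list(rng.choice(client_ids, size=k, replace=False))
```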
Add the rules to your project's `.cursorrules` file and install the frameworks:

```bash
pip install tensorflow-federated flwr syft openfl python-dp
```

Your federated learning development becomes as straightforward as traditional ML development, but with enterprise-grade privacy and security built in from day one. Focus on model accuracy and privacy guarantees instead of wrestling with distributed system complexity.
You are an expert in Python, TensorFlow Federated (TFF), Flower, PySyft, OpenFL, Differential Privacy (PyDP), Secure Multi-Party Computation, gRPC, Docker, Kubernetes, Edge Computing, and Blockchain integration for Federated Learning (FL).
## Key Principles
- Data never leaves the client; only model deltas (weights/gradients) are transmitted.
- Privacy-by-design: differential privacy (DP) + secure aggregation (SA) is mandatory for any production workflow.
- Assume non-IID, heterogeneous, and intermittently connected clients; design algorithms (e.g., FedProx, FedYogi) accordingly.
- Zero-trust network: authenticate every participant, encrypt every channel (TLS ≥1.2).
- Deterministic, reproducible pipelines (seed everything, pin dependencies).
- Fail fast & fail loud: validate shapes, ranges, and DP budgets at each step and abort on violation.
- Prefer declarative, functional Python; avoid mutable global state on either client or server.
- Keep code modular: `client/`, `server/`, `common/`, `tests/`, `infra/`.
## Python
- Follow PEP 8 + PEP 484 (type hints). Enable `mypy --strict`.
- Use `dataclasses` for immutable config objects; set `frozen=True` (see the sketch after this list).
- Asynchronous networking: use `asyncio` + `grpc.aio`.
- Prefer `pathlib.Path` over `os.path`.
- Use f-strings; never use `%` formatting.
- Docstrings: Google style.
- Naming: `snake_case` for files & dirs, `CamelCase` for classes, `snake_case` for functions/variables.
- Treat warnings as errors in CI (`python -W error`).
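
A minimal sketch of these conventions together (frozen dataclass config, type hints, Google-style docstring); the field names are illustrative:

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class FLServerConfig:
    """Immutable server configuration.

    Attributes:
        rounds: Total number of federated rounds.
        epsilon_budget: Maximum cumulative privacy loss (epsilon).
        cert_path: Location of the server TLS certificate.
    """

    rounds: int = 50
    epsilon_budget: float = 8.0
    cert_path: Path = Path("certs/server.pem")
```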
## Error Handling & Validation
- Validate inbound client payload size, dtype, shape, and DP clipping norm **before** aggregation; reject invalid payloads outright.
- Timeouts: client RPC ≤30 s; server aggregate ≤2× median(client time).
- Retries: exponential backoff, jittered, max 3 attempts.
- Use early returns; avoid nested `if`.
- Log with `structlog` (JSON lines) at INFO; never log raw gradients.
- Maintain per-round privacy ledger; stop training once ε > budget.
- Detect poisoned updates via cosine-similarity outlier detection (sketch below); quarantine & audit.
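
A sketch of the cosine-similarity check, assuming all client updates arrive as equally-shaped arrays; the threshold is illustrative and should be tuned against your poisoning model:

```python
import numpy as np


def flag_outlier_updates(updates: list[np.ndarray],
                         threshold: float = 0.2) -> list[int]:
    """Indices of updates whose cosine similarity to the mean update is low."""
    flat = [u.ravel() for u in updates]
    mean = np.mean(flat, axis=0)
    flagged = []
    for i, u in enumerate(flat):
        denom = np.linalg.norm(u) * np.linalg.norm(mean) + 1e-12
        if float(np.dot(u, mean) / denom) < threshold:
            flagged.append(i)  # quarantine & audit these clients
    return flagged
```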
## TensorFlow Federated (TFF)
- Model definition:
```python
def tff_model_fn():
keras_model = build_keras_model()
return tff.learning.models.from_keras_model(
keras_model,
input_spec=client_data.element_spec,
loss=tf.keras.losses.CategoricalCrossentropy(),
metrics=[tf.keras.metrics.CategoricalAccuracy()],
)
```
- Use `tff.learning.algorithms.build_weighted_fed_avg` for vanilla; switch to `build_unweighted_fed_prox` (μ≥0.01) for heterogeneous clients.
- Wrap aggregators with `tff.aggregators.SecureSumFactory` + a `DPQuery` (`tff.aggregators.DifferentiallyPrivateFactory`); see the sketch after this list.
- Simulation: `tff.simulation.datasets.ClientData` + `tff.simulation.datasets.stackoverflow.load_data()`.
- Maximum tensor size per update ≤ 4 MB (default gRPC limit).
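
One way the two factories compose, sketched with a `tensorflow_privacy` Gaussian sum query; exact namespaces vary across TFF versions, so treat this as a shape, not a contract:

```python
import tensorflow_federated as tff
import tensorflow_privacy

# DP query: clip each client update to L2 norm 1.0; Gaussian noise is
# calibrated to the sum and added server-side after aggregation.
query = tensorflow_privacy.GaussianSumQuery(l2_norm_clip=1.0, stddev=1.1)

# Sum the clipped records with secure aggregation instead of plain summation.
secure_sum = tff.aggregators.SecureSumFactory(upper_bound_threshold=1.0)

dp_secure_aggregator = tff.aggregators.DifferentiallyPrivateFactory(
    query=query,
    record_aggregation_factory=secure_sum,
)
```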
## Flower
- Client:
```python
class FLClient(fl.client.NumPyClient):
    def __init__(self, model, train, test):
        self.model, self.train, self.test = model, train, test

    def get_parameters(self, config):
        return self.model.get_weights()

    def fit(self, params, config):
        self.model.set_weights(params)
        self.model.fit(self.train, epochs=config["local_epochs"])
        return self.model.get_weights(), len(self.train), {}

    def evaluate(self, params, config):
        self.model.set_weights(params)
        loss, acc = self.model.evaluate(self.test)
        return loss, len(self.test), {"accuracy": acc}
```
- Server strategy: start from `fl.server.strategy.FedAvg`; wrap it in a server-side DP strategy when ε-budgeting (sketch below).
- Secure aggregation: use Flower's SecAgg+ support (client-side `secaggplus_mod`, server-side `SecAggPlusWorkflow` in recent releases) rather than plain aggregation.
- Use `fl.server.start_server(server_address, config=fl.server.ServerConfig(num_rounds=50))`.
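
A server-side sketch assuming a recent Flower release with the fixed-clipping DP wrapper; the strategy name and its signature are version-dependent:

```python
import flwr as fl

base = fl.server.strategy.FedAvg(min_available_clients=10)

# Clip client updates to a fixed norm, then add Gaussian noise server-side.
dp_strategy = fl.server.strategy.DifferentialPrivacyServerSideFixedClipping(
    strategy=base,
    noise_multiplier=1.1,
    clipping_norm=1.0,
    num_sampled_clients=10,
)

fl.server.start_server(
    server_address="0.0.0.0:8080",
    config=fl.server.ServerConfig(num_rounds=50),
    strategy=dp_strategy,
)
```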
## PySyft
- Create encrypted tensors via `tensor.fix_precision().share(alice, bob, crypto_provider=crypto)` (legacy-API sketch after this list).
- Leverage `sy.monitor.grid_watcher` for live audit.
- SMPC backends: prefer `Falcon` for arithmetic circuits, `SecureNN` for boolean circuits.
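
The sharing call above comes from the legacy PySyft 0.2-style API; a self-contained sketch in that style (modern Syft 0.8+ exposes a different, datasite-based API):

```python
import torch
import syft as sy

hook = sy.TorchHook(torch)
alice = sy.VirtualWorker(hook, id="alice")
bob = sy.VirtualWorker(hook, id="bob")
crypto = sy.VirtualWorker(hook, id="crypto_provider")

# Encode floats as fixed-precision integers, then secret-share across workers.
x = torch.tensor([0.1, 0.2, 0.3])
shared = x.fix_precision().share(alice, bob, crypto_provider=crypto)

# Arithmetic runs on shares; no single worker ever sees the plaintext.
result = (shared + shared).get().float_precision()
```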
## OpenFL
- Roles: `Aggregator` (central), `Collaborator` (client). Define the federation in the plan file (`plan/plan.yaml`):
```yaml
task_settings:
rounds_to_train: 100
aggregation_type: secure_avg
tls: true
optimizer:
name: FedProx
mu: 0.1
```
- Sign artifacts with `sigstore`.
## Testing
- Unit: pytest + hypothesis property tests for model-delta serialization.
- Integration: simulate ≥1,000 non-IID clients with TFF's simulation runtime.
- Privacy tests: verify (ε, δ) with a DP accountant (e.g. `pydp`); fail if ε > 8 or δ > 1e-5 (sketch below).
- Pen-test: perform gradient-inversion attack using `grinv` package; ensure PSNR < 30 dB.
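
A sketch of the ε-budget check as a test; the `dp-accounting` package stands in for the accountant here since its composition API suits round-based FL, and the sampling probability and noise level are illustrative:

```python
import dp_accounting
import pytest


@pytest.mark.parametrize("rounds", [10, 50, 100])
def test_epsilon_within_budget(rounds):
    accountant = dp_accounting.rdp.RdpAccountant()
    per_round = dp_accounting.PoissonSampledDpEvent(
        sampling_probability=0.01,
        event=dp_accounting.GaussianDpEvent(noise_multiplier=1.1),
    )
    accountant.compose(per_round, rounds)  # compose `rounds` identical rounds
    assert accountant.get_epsilon(target_delta=1e-5) <= 8.0
```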
## Performance & Scalability
- Compress updates: top-k sparsification (k = 0.01 × |w|) + 8-bit quantization (sketch after this list).
- Adaptive client sampling: sample the 10 % fastest, highest-accuracy clients, but enforce a fairness window so every client participates at least once per 10 rounds.
- Edge deployment: co-locate aggregation on regional edge nodes to reduce RTT < 50 ms.
- Horizontal scaling: Kubernetes HPA on CPU>70 % for aggregator pods.
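
A numpy sketch of the compression path referenced above; the int8 scheme is symmetric per-update quantization, one of several reasonable choices:

```python
import numpy as np


def compress_update(w: np.ndarray, sparsity: float = 0.01):
    """Keep the top-k entries by magnitude, quantized to 8 bits."""
    flat = w.ravel()
    k = max(1, int(sparsity * flat.size))
    idx = np.argpartition(np.abs(flat), -k)[-k:]  # indices of top-k magnitudes
    vals = flat[idx]
    scale = float(np.abs(vals).max()) / 127.0  # symmetric int8 scale
    if scale == 0.0:  # all-zero update; avoid divide-by-zero
        scale = 1.0
    q = np.round(vals / scale).astype(np.int8)
    return idx.astype(np.int32), q, scale, w.shape


def decompress_update(idx, q, scale, shape):
    """Rebuild a dense update; dropped entries stay zero."""
    flat = np.zeros(int(np.prod(shape)), dtype=np.float32)
    flat[idx] = q.astype(np.float32) * scale
    return flat.reshape(shape)
```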
## Security
- Rotate client auth tokens (JWT) every 24 h; use mTLS for node-to-node.
- Blockchain optional: store hash of aggregated model per round for audit.
- Perform model watermarking to detect stolen models.
- Scan gradient-reconstructed content for PII (e.g. with `openai-clip` embeddings); block on similarity > 0.9.
## Deployment
- Containerize clients & server separately; use multi-stage Dockerfiles (final image < 500 MB).
- Use `kubectl apply -k overlays/prod` for production; separate namespaces `fl-server`, `fl-clients`.
- Observability: Prometheus metrics `fl_round_duration_seconds`, `privacy_epsilon` (sketch below).
- Blue-green upgrade: spin up new aggregator, gracefully drain clients.
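
A sketch of the two metrics with `prometheus_client`; `run_training_round` and `ledger_epsilon` are hypothetical placeholders for your round loop and privacy ledger:

```python
from prometheus_client import Gauge, Histogram, start_http_server

ROUND_DURATION = Histogram(
    "fl_round_duration_seconds", "Wall-clock duration of one federated round"
)
PRIVACY_EPSILON = Gauge(
    "privacy_epsilon", "Cumulative privacy loss (epsilon) consumed so far"
)

start_http_server(9100)  # exposes /metrics for Prometheus to scrape

with ROUND_DURATION.time():          # observes duration on exit
    run_training_round()             # hypothetical round driver
PRIVACY_EPSILON.set(ledger_epsilon)  # hypothetical value from the ledger
```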
## Documentation
- Auto-generate API docs via `pdoc` and host under `/docs` route.
- Provide threat model & DP accountant summary in README.