Opinionated coding & architectural rules for building production-grade neuro-symbolic AI systems that combine neural perception with symbolic reasoning.
Traditional deep learning burns through thousands of hours and terabytes of data to achieve what symbolic reasoning can solve with a handful of rules. If you're building AI systems that need to explain their decisions, handle edge cases gracefully, and learn from minimal examples, you need a neuro-symbolic approach.
You've been there: your neural network performs beautifully on benchmark datasets but falls apart the moment it encounters scenarios slightly outside its training distribution. Meanwhile, you're burning through GPU hours and labeled data just to handle basic reasoning tasks that humans solve with simple logical rules.
The core inefficiency: Pure neural approaches treat reasoning as pattern matching, requiring exponentially more data to learn logical relationships that can be expressed in a few symbolic rules. Your model needs 10,000 examples to learn "if A and B, then C" when you could encode this directly.
The interpretability gap: When your model makes a wrong decision, you get attention maps and gradient visualizations—not the logical reasoning chain that would actually help you debug and improve the system.
These rules implement a hybrid-first architecture that combines neural perception modules with symbolic reasoning layers, giving you the pattern recognition power of deep learning with the interpretability and data efficiency of symbolic AI.
Here's what changes in your development workflow:
```python
# Instead of this black-box approach:
prediction = model(input_data)  # No explanation possible
confidence = torch.softmax(logits, dim=-1)

# You build this:
perception_output = scene_parser_net(image)
reasoning_trace = knowledge_graph.infer(perception_output)
prediction = reasoning_trace.conclusion
explanation = reasoning_trace.explain()  # Full logical derivation
```
The architecture enforces explainability: Every public API must expose an explain() or trace() method, making interpretability a first-class feature rather than an afterthought.
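One way to make that contract explicit is a structural protocol that every public result must satisfy. The `Explainable` name below is a hypothetical, minimal sketch, not an API from any of the frameworks mentioned here:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Explainable(Protocol):
    """Anything returned by a public API must be able to justify itself."""

    def explain(self) -> str:
        ...


def publish(result: object) -> str:
    # Refuse to surface results that cannot produce a reasoning chain.
    if not isinstance(result, Explainable):
        raise TypeError("Public API results must expose explain()")
    return result.explain()
```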
Before: Train on 100,000 medical images to learn basic diagnostic rules.
After: Encode medical knowledge in ontologies, train perception on 5,000 images, and let symbolic reasoning handle diagnostic logic.
```python
# Neural perception extracts symptoms
symptoms = symptom_detector_net(medical_image)

# Symbolic reasoning applies medical knowledge
diagnosis = medical_kb.diagnose(symptoms)
print(diagnosis.explain())  # "Patient has condition X because symptoms A+B+C match rule R1"
```
Before: Fine-tune large language models on thousands of legal documents.
After: Use neural networks for entity extraction and symbolic reasoning for legal precedent application.
Before: End-to-end neural networks making unexplainable driving decisions.
After: Neural perception for scene understanding, symbolic reasoning for traffic rule compliance.
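As a hypothetical sketch of that split, the symbolic layer can stay a small, hand-auditable rule table while perception only supplies scene facts. Every name below is illustrative and not taken from any framework:

```python
from dataclasses import dataclass

# Hypothetical rule table standing in for the symbolic traffic-rule layer.
TRAFFIC_RULES = {
    "red-light-stop": lambda scene: scene.get("traffic_light") != "red"
    or scene.get("ego_speed", 0.0) == 0.0,
}


@dataclass(frozen=True)
class ComplianceResult:
    permitted: bool
    violated: tuple[str, ...]

    def explain(self) -> str:
        return "compliant" if self.permitted else f"violated rules: {', '.join(self.violated)}"


def check_compliance(scene: dict) -> ComplianceResult:
    """Apply every traffic rule to the scene facts produced by neural perception."""
    violated = tuple(name for name, rule in TRAFFIC_RULES.items() if not rule(scene))
    return ComplianceResult(permitted=not violated, violated=violated)


# `scene` would come from something like scene_parser_net(camera_frame).
print(check_compliance({"traffic_light": "red", "ego_speed": 12.0}).explain())
```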
Data Efficiency: Symbolic priors reduce training data requirements by >50%. Instead of learning traffic rules from millions of driving examples, you encode them once and let the neural network focus on perception.
Robust Reasoning: When your neural perception makes errors, symbolic reasoning can detect inconsistencies and flag them rather than propagating errors silently through the system.
Real Interpretability: Not just "this pixel was important" but "the system concluded X because rule Y fired when conditions A and B were detected."
```
src/
  perception/    # PyTorch neural networks
  logic/         # RDFlib symbolic rules & Prolog reasoning
  integration/   # Pipeline orchestration
  ontologies/    # Domain knowledge as RDF/OWL files
```
This separation forces you to think about what belongs in neural vs symbolic components, preventing the common mistake of trying to learn everything end-to-end.
```python
from dataclasses import dataclass
from typing import Any

# ReasoningTrace lives in the logic/ package (one possible shape is sketched below).


@dataclass(frozen=True, slots=True)
class ReasoningResult:
    conclusion: Any
    confidence: float
    trace: ReasoningTrace

    def explain(self) -> str:
        return self.trace.generate_explanation()
```
Every inference must produce a trace. This isn't optional debugging output—it's core functionality that forces you to build interpretable systems from the ground up.
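These rules do not pin down `ReasoningTrace` itself. A minimal sketch of one possible shape, assuming each inference step records the rule that fired and the premises it consumed:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True, slots=True)
class TraceStep:
    rule_id: str
    premises: tuple[str, ...]
    conclusion: str


@dataclass(slots=True)
class ReasoningTrace:
    steps: list[TraceStep] = field(default_factory=list)

    def record(self, rule_id: str, premises: tuple[str, ...], conclusion: str) -> None:
        self.steps.append(TraceStep(rule_id, premises, conclusion))

    def generate_explanation(self) -> str:
        # One line per inference step, in firing order.
        return "\n".join(
            f"{step.conclusion} because rule {step.rule_id} fired on "
            + ", ".join(step.premises)
            for step in self.steps
        )
```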
Instead of training on massive datasets:
```python
# Define domain knowledge declaratively
knowledge_graph.add_rule("""
    IF ?person has_symptom fever AND ?person has_symptom cough
    THEN ?person likely_has respiratory_infection
""")

# Train neural perception on minimal labeled data
perception_model.train(small_labeled_dataset)
```
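The rule string above is pseudocode. With RDFlib, the same knowledge can be expressed as a SPARQL `CONSTRUCT` query and forward-chained into the graph; the namespace and property names here are illustrative:

```python
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/clinic#")

g = Graph()
g.bind("ex", EX)
g.add((EX.patient_1, EX.has_symptom, EX.fever))
g.add((EX.patient_1, EX.has_symptom, EX.cough))

RULE = """
PREFIX ex: <http://example.org/clinic#>
CONSTRUCT { ?person ex:likely_has ex:respiratory_infection }
WHERE {
    ?person ex:has_symptom ex:fever .
    ?person ex:has_symptom ex:cough .
}
"""

# Forward-chain once: add every triple the rule derives back into the graph.
for triple in g.query(RULE):
    g.add(triple)

print((EX.patient_1, EX.likely_has, EX.respiratory_infection) in g)  # True
```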
At inference time, validate the neural output against the symbolic constraints before reasoning over it:

```python
def infer_with_validation(inputs):
    try:
        neural_output = perception_net(inputs)

        # Validate neural output against symbolic constraints
        if not knowledge_graph.is_consistent(neural_output):
            raise InconsistencyError("Neural output violates domain constraints")

        reasoning_result = symbolic_reasoner.infer(neural_output)
        return reasoning_result
    except Exception as e:
        # Attach reasoning trace to every exception
        e.trace = current_reasoning_trace
        raise
```
Training Time: Reduce model training time by 60-80% through symbolic priors and smaller required datasets.
Debugging Efficiency: Instead of analyzing attention maps and trying to reverse-engineer model behavior, you get logical traces that explain exactly why each decision was made.
Edge Case Handling: Symbolic reasoning catches and handles scenarios that would break pure neural approaches, reducing production failures.
Domain Expert Integration: Subject matter experts can directly contribute domain knowledge through ontologies rather than requiring ML engineering to encode their expertise.
Regulatory Compliance: Built-in explainability satisfies audit requirements in regulated industries without retrofitting interpretability tools.
IBM Neuro-Symbolic Toolkit: Production-ready pipelines with built-in explanation generation
VAILS: Graph-oriented DSL for complex reasoning workflows
Concordia: Component-based architecture for dialogue and planning systems
These rules configure your development environment to leverage these frameworks effectively while maintaining code quality and interpretability standards.
These Cursor Rules transform your development process from training black-box models to building interpretable AI systems that combine the best of neural and symbolic approaches. You'll write less debugging code, need less training data, and ship AI systems that can actually explain their decisions.
The future of AI isn't just about bigger models—it's about smarter architectures that reason explicitly. These rules help you build that future today.
You are an expert in Python • PyTorch • RDFlib • SPARQL • Prolog • IBM Neuro-Symbolic AI Toolkit (NSTK) • VAILS • Concordia
Key Principles
- Hybrid first: always pair neural perception modules with symbolic reasoning layers.
- Modular graph design: keep neural, symbolic, and integration code in separate packages.
- Explanation as a feature: every public API must expose an `explain()` or `trace()` method.
- Declarative knowledge: store domain rules in ontologies (RDF/OWL) instead of hard-coding.
- Data efficiency: inject symbolic priors to cut training-data requirements by >50%.
- Fail loudly, fail early: detect concept drift & reasoning contradictions during runtime.
- Reproducibility: seed everything, version datasets, and snapshot knowledge graphs (see the seeding sketch after this list).
- Security & privacy by design: never log raw personal data—hash or redact first.
Python
- Use Python 3.11+ with `pyproject.toml` & PEP 582 (no virtualenv folder in repo).
- Mandatory static typing (`mypy --strict`). Use `typing.Annotated` for units.
- Functional core, OO shell: pure functions inside `logic/` & `perception/`; thin façade classes in `services/`.
- Use `dataclass(frozen=True, slots=True)` for immutable knowledge entities.
- Naming:
• Neural nets: snake_case ending in `_net` (e.g., `scene_parser_net`).
• Symbolic rules: kebab-case RDF URIs (e.g., `ns:has-parent`).
- All modules must export `__all__` and include a docstring with references to ontology IRIs (see the sketch below).
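A short module skeleton illustrating these conventions; the ontology IRI, unit annotation, and function body are placeholders:

```python
"""Perception entry points.

Ontology: <http://example.org/ontologies/scene#> (placeholder IRI).
"""
from dataclasses import dataclass
from typing import Annotated

__all__ = ["BoundingBox", "scene_parser_net"]

Millimeters = Annotated[float, "unit:millimetre"]


@dataclass(frozen=True, slots=True)
class BoundingBox:
    """Immutable knowledge entity emitted by the perception layer."""
    x: Millimeters
    y: Millimeters
    width: Millimeters
    height: Millimeters


def scene_parser_net(image: object) -> list[BoundingBox]:
    """Neural façade; snake_case name ending in `_net` per the naming rule."""
    raise NotImplementedError
```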
Error Handling and Validation
- Validate tensor shapes & dtypes at function entry using `torchtyping`.
- Validate RDF triples via SHACL before ingestion (see the sketch after this list).
- Use early returns for error branches; keep the happy path last.
- Attach a `ReasoningTrace` object to every raised exception:
```python
raise InferenceError("Unsatisfied pre-conditions", trace=trace)
```
- On contradiction detection, log the minimal unsat core, not the full KB.
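For the SHACL gate, a minimal sketch using the `pyshacl` package; the shapes path is a placeholder:

```python
from pyshacl import validate
from rdflib import Graph


def ingest(candidate_triples: Graph, shapes_path: str = "ontologies/shapes.ttl") -> Graph:
    """Validate incoming triples against SHACL shapes before they enter the KB."""
    shapes = Graph().parse(shapes_path, format="turtle")
    conforms, _report_graph, report_text = validate(candidate_triples, shacl_graph=shapes)
    if not conforms:
        # Fail loudly with the validator's report instead of silently ingesting bad data.
        raise ValueError(f"SHACL validation failed:\n{report_text}")
    return candidate_triples
```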
Framework-Specific Rules
VAILS
- Store VAILS graphs under `models/vails/` with versioned subfolders (`v1/`, `v2/`).
- Use graph-oriented DSL; never embed raw SQL—translate to graph patterns.
- Keep integration layer thin: limit to I/O marshaling & execution orchestration.
Concordia
- Organize Concordia components by capability (`learning/`, `reasoning/`, `dialogue/`).
- Configure deterministic planners first; fallback planners second for robustness.
IBM Neuro-Symbolic Toolkit (NSTK)
- Prefer `nstk.pipeline.Pipeline` for end-to-end flows; avoid custom schedulers.
- Register symbolic modules with explicit provenance metadata (`source`, `license`).
- Always expose `generate_explanation=True` in pipeline config.
Additional Sections
Testing
- Unit: pytest with parameterized cases generated from symbolic rules.
- Property: Hypothesis strategies seeded by ontology constraints (see the sketch after this list).
- Integration: end-to-end Jupyter notebooks that compare model & symbolic outputs.
- Benchmarks: use `neurosym-bench` ≥0.4 scoring (accuracy, reasoning depth, explainability).
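A sketch of the property-testing idea with Hypothesis, using a toy in-memory rule base as a stand-in for the real reasoner and a hand-written symptom vocabulary in place of ontology-derived strategies:

```python
from hypothesis import given, strategies as st

# Vocabulary that would normally be drawn from the ontology.
SYMPTOMS = ["fever", "cough", "rash", "fatigue"]


def diagnose(symptoms: frozenset[str]) -> set[str]:
    """Toy symbolic rule base standing in for the real reasoner."""
    conclusions: set[str] = set()
    if {"fever", "cough"} <= symptoms:
        conclusions.add("respiratory_infection")
    return conclusions


@given(st.frozensets(st.sampled_from(SYMPTOMS)), st.sampled_from(SYMPTOMS))
def test_reasoning_is_monotonic(symptoms, extra):
    # Adding a fact must never retract a previously derived conclusion.
    assert diagnose(symptoms) <= diagnose(symptoms | {extra})
```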
Performance
- Cache reasoning results in RedisGraph keyed by hash(neural_output) (see the caching sketch after this list).
- Quantize models to INT8; validate that the accuracy drop on symbolic tasks is <1%.
- Use curriculum learning starting with symbolic priors, then fine-tune end-to-end.
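A sketch of the caching idea, using a plain Redis key-value store as a stand-in for RedisGraph; the serialization, key prefix, and TTL are assumptions, and `symbolic_reasoner` refers to the reasoner from the `infer_with_validation` example above:

```python
import hashlib
import pickle

import redis

cache = redis.Redis()


def cached_infer(neural_output):
    """Reuse the symbolic reasoning result for identical neural outputs."""
    key = "reasoning:" + hashlib.sha256(pickle.dumps(neural_output)).hexdigest()
    hit = cache.get(key)
    if hit is not None:
        return pickle.loads(hit)
    result = symbolic_reasoner.infer(neural_output)
    cache.set(key, pickle.dumps(result), ex=3600)  # 1-hour TTL; tune per workload
    return result
```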
Security
- Enforce RDF access control via SHACL ACL shapes.
- Sign model artifacts with Sigstore; verify at load time.
Documentation
- Each public function has: short summary, input/output types, failure modes, example, ontology links.
- Auto-publish docs with MkDocs-Material; embed live SPARQL queries.
Directory Layout
```
src/
perception/ # PyTorch nets
logic/ # Prolog/RDFlib symbolic rules
integration/ # Glue code (pipeline, adapters)
ontologies/ # *.ttl files
tests/
models/
vails/
checkpoints/
```