Comprehensive Rules for building, querying, and maintaining graph-based applications using Cypher and modern graph database engines.
Your application is drowning in JOIN hell. You're writing increasingly complex SQL queries to traverse relationships, your API responses are getting slower with each new feature, and your recommendation engine is struggling to surface meaningful connections. Sound familiar?
Graph databases fundamentally change how you think about and work with connected data. Instead of forcing relationships into rigid table structures, you model your domain as it actually exists: nodes, edges, and properties that mirror real-world connections.
Every modern application deals with connected data:
You're probably handling these with:
These Cursor Rules transform your development workflow by providing battle-tested patterns for:
Query Performance: Automatic traversal depth limits prevent runaway queries, strategic indexing rules ensure fast lookups, and declarative query patterns avoid common performance pitfalls.
// Instead of this risky pattern:
MATCH (u:User)-[:FOLLOWS*]->(recommendation)
RETURN recommendation
// You get this protected pattern:
MATCH (u:User)-[:FOLLOWS*1..3]->(recommendation:User)
WHERE recommendation.isActive = true
RETURN recommendation
Schema Design: Node and relationship modeling patterns that scale, property naming conventions that prevent confusion, and constraint strategies that maintain data integrity.
Multi-Database Integration: Proven patterns for combining graph databases with OLAP stores, search engines, and traditional databases using polyglot persistence.
Cut Query Development Time by 60%: Pre-built patterns for common graph traversals, relationship queries, and aggregations mean you're not starting from scratch.
Eliminate Performance Debugging: Built-in query profiling workflows, automatic traversal bounds, and indexing strategies prevent most performance issues before they happen.
Reduce Context Switching: Unified patterns across Neo4j, Neptune, and TigerGraph mean your skills transfer between platforms.
Accelerate Testing: TestContainer patterns and synthetic graph generation let you test complex relationship scenarios without production data dependencies.
You're building a financial fraud detection system. Traditional approaches require complex SQL across multiple tables to track transaction patterns.
Before: Writing 200+ line SQL queries with multiple CTEs to find suspicious transaction chains, taking hours to develop and minutes to execute.
With Graph Nexus:
MATCH path = (suspicious:Transaction)-[:FOLLOWS*1..4]->(target:Transaction)
WHERE suspicious.riskScore > 0.8
AND target.amount > 10000
AND target.timestamp < suspicious.timestamp + duration('PT1H')
RETURN path, length(path) as hops
Result: 15-minute implementation, sub-second execution, and clear relationship visualization.
You need to surface relevant content based on user behavior and social connections.
Traditional approach: Multiple database calls to get user preferences, friend lists, and content ratings, then complex application logic to score recommendations.
Graph approach:
MATCH (user:User {id: $userId})-[:FOLLOWS]->(friend:User)
MATCH (friend)-[:LIKED]->(content:Content)
WHERE NOT (user)-[:VIEWED]->(content)
WITH content, count(friend) as friendLikes
ORDER BY friendLikes DESC
LIMIT 10
RETURN content
Impact: Single query execution, real-time results, and inherently explainable recommendations.
You're managing service dependencies across a complex microservices architecture.
Without graphs: Static configuration files, manual dependency tracking, and cascade failure surprises.
With Graph Nexus:
MATCH (service:Service {name: $serviceName})
MATCH path = (service)-[:DEPENDS_ON*1..5]->(dependency:Service)
WHERE dependency.status = 'DOWN'
RETURN path, dependency.name as failedService
Outcome: Instant impact analysis, automated incident response, and visual dependency mapping.
Install the rules: Copy the Graph Nexus configuration into your Cursor Rules settings.
Choose your database:
Set up your project structure:
your-project/
├── queries/ # Prepared Cypher files
├── migrations/ # Schema changes
├── tests/fixtures/ # Sample graph data
└── docs/diagrams/ # Graph model visuals
Start with your core domain entities and their relationships:
// Create constraints first
CREATE CONSTRAINT user_id IF NOT EXISTS FOR (u:User) REQUIRE u.id IS UNIQUE;
CREATE CONSTRAINT product_id IF NOT EXISTS FOR (p:Product) REQUIRE p.id IS UNIQUE;
// Create indexes for frequent queries
CREATE INDEX user_email IF NOT EXISTS FOR (u:User) ON (u.email);
CREATE INDEX product_category IF NOT EXISTS FOR (p:Product) ON (p.category);
User recommendations:
MATCH (user:User {id: $userId})-[:PURCHASED]->(product:Product)
MATCH (product)<-[:PURCHASED]-(otherUser:User)
MATCH (otherUser)-[:PURCHASED]->(recommendation:Product)
WHERE NOT (user)-[:PURCHASED]->(recommendation)
RETURN recommendation, count(*) as strength
ORDER BY strength DESC
LIMIT 5
Fraud detection:
MATCH (account:Account)-[:TRANSACTION*1..3]-(suspicious:Account)
WHERE account.riskScore < 0.3 AND suspicious.riskScore > 0.8
RETURN account, suspicious,
count(*) as connectionStrength
Node.js with Neo4j:
const neo4j = require('neo4j-driver');
// uri, user, and password are expected to come from your own configuration
const driver = neo4j.driver(uri, neo4j.auth.basic(user, password));
async function getRecommendations(userId) {
  const session = driver.session();
  try {
    const result = await session.run(
      `MATCH (u:User {id: $userId})-[:FOLLOWS*1..2]->(friend:User)
       MATCH (friend)-[:LIKED]->(content:Content)
       WHERE NOT (u)-[:VIEWED]->(content)
       RETURN content LIMIT 10`,
      { userId }
    );
    return result.records.map(record => record.get('content'));
  } finally {
    // Always release the session, even when the query throws
    await session.close();
  }
}
Integration testing:
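// Using Testcontainers (assumes the @testcontainers/neo4j and neo4j-driver packages;
// run inside an async test or setup hook so that await is allowed)
const { Neo4jContainer } = require('@testcontainers/neo4j');
const neo4j = require('neo4j-driver');
const neo4jContainer = await new Neo4jContainer()
  .withPassword('password')
  .start();
// Connect the driver to the throwaway container
const driver = neo4j.driver(
  neo4jContainer.getBoltUri(),
  neo4j.auth.basic(neo4jContainer.getUsername(), neo4jContainer.getPassword())
);
const session = driver.session();
// Create test data
await session.run(`
  CREATE (u1:User {id: 'user1'})
  CREATE (u2:User {id: 'user2'})
  CREATE (u1)-[:FOLLOWS]->(u2)
`);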
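When a few hand-written `CREATE` statements are not enough, a small deterministic generator can produce larger fixtures. The following is only a sketch: `seededRandom`, `buildSyntheticGraph`, the two query constants, and their parameters are hypothetical names introduced for illustration, and the code produces plain data rather than talking to a database.

```javascript
// Hypothetical test helper: builds a small, reproducible follower graph.
// It only returns plain arrays; loading them into Neo4j is left to the caller.

// Tiny seeded PRNG (mulberry32) so the same seed always yields the same graph.
function seededRandom(seed) {
  let state = seed >>> 0;
  return function next() {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function buildSyntheticGraph({ userCount = 50, followsPerUser = 3, seed = 1 } = {}) {
  // Keep fixtures small, in line with the "< 500 nodes" testing rule.
  if (!Number.isInteger(userCount) || userCount < 2 || userCount >= 500) {
    throw new RangeError('userCount must be an integer between 2 and 499');
  }
  const random = seededRandom(seed);
  const users = Array.from({ length: userCount }, (_, i) => ({ id: `user${i + 1}` }));
  const follows = [];
  const seen = new Set();
  const wanted = Math.min(followsPerUser, userCount - 1);
  for (const user of users) {
    let created = 0;
    while (created < wanted) {
      const target = users[Math.floor(random() * userCount)];
      const key = `${user.id}->${target.id}`;
      if (target.id === user.id || seen.has(key)) continue; // no self-loops or duplicate edges
      seen.add(key);
      follows.push({ from: user.id, to: target.id });
      created += 1;
    }
  }
  return { users, follows };
}

// Cypher statements that could load the generated data (assumes the User/FOLLOWS model).
const LOAD_USERS_QUERY = 'UNWIND $users AS user MERGE (:User {id: user.id})';
const LOAD_FOLLOWS_QUERY = `
  UNWIND $follows AS edge
  MATCH (a:User {id: edge.from}), (b:User {id: edge.to})
  MERGE (a)-[:FOLLOWS]->(b)
`;
```

A test could call `buildSyntheticGraph({ userCount: 100, seed: 7 })` and pass the resulting `users` and `follows` arrays as parameters to the two statements in turn.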
Performance monitoring:
// Check query performance
PROFILE MATCH (u:User)-[:FOLLOWS*1..3]->(friend:User)
RETURN count(friend);
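// Monitor slow queries (Neo4j 5 syntax; on 4.x use CALL dbms.listQueries() instead)
SHOW TRANSACTIONS
YIELD transactionId, currentQuery, elapsedTime
WHERE elapsedTime.milliseconds > 1000
RETURN transactionId, currentQuery, elapsedTime;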
Development Velocity: Teams report 40-70% faster feature development for relationship-heavy features after implementing these patterns.
Query Performance: Proper indexing and traversal patterns typically reduce query times from seconds to milliseconds for complex relationship queries.
Code Maintainability: Declarative graph queries are significantly more readable and maintainable than equivalent SQL with multiple JOINs.
Feature Capabilities: Teams can implement features like recommendation engines, fraud detection, and social features that were previously too complex or slow.
System Reliability: Built-in query bounds and error handling patterns prevent the runaway queries and cascading failures common in complex relationship systems.
The Graph Nexus Rules don't just help you write better graph database code—they fundamentally change how you approach and solve connected data problems. Your next recommendation engine, fraud detection system, or social feature will be faster to build, more performant, and easier to maintain.
Time to stop fighting your data's natural structure and start building with it.
You are an expert in Graph Databases (Neo4j, Amazon Neptune, TigerGraph), declarative query languages (Cypher, SPARQL, Gremlin), GraphQL APIs, and Graph Machine-Learning (GNN).
Key Principles
- Model the domain as first-class nodes, edges, and properties; relationships drive query design.
- Prefer declarative, set-based queries; avoid imperative iterations inside the application layer.
- Index every frequently filtered property (uniqueness constraints, `RANGE` indexes in Neo4j 5, `BTREE` in 4.x) and the lookup properties of high-degree relationship endpoints.
- Cap traversal depth explicitly (`..3`) unless business logic requires unbounded walks.
- Keep the graph database as the source of truth for relationships; use polyglot persistence (OLAP store, search engine) only for read-heavy analytics.
- Design with eventual graph analytics/ML in mind: keep schema stable, version relationships, store lightweight embeddings.
- Automate everything: migrations, seed data, performance dashboards, backups.
Cypher Rules
- Naming
• Nodes: `PascalCase` labels (`User`, `Transaction`).
• Relationships: uppercase with underscores (`PURCHASED`, `FOLLOWS`).
• Properties: camelCase (`createdAt`, `isActive`).
- Query Structure
• Put MATCH-WHERE block first, then WITH, finally RETURN.
• Alias long path patterns (`p`) and reuse them in subsequent clauses.
• Always include an upper bound on variable-length patterns: `()-[:FOLLOWS*1..4]->()`.
- Mutations
• Use `MERGE` for idempotent writes; follow with `ON CREATE SET` / `ON MATCH SET`.
• Collect writes in a single transaction where feasible – minimizes network hops.
- Performance
• Use `EXPLAIN`/`PROFILE` before committing complex queries.
• Prefer `OPTIONAL MATCH` over subqueries for outer joins.
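Variable-length bounds are generally written as literals in Cypher rather than passed as parameters, so application code that lets callers choose a depth has to build that part of the pattern itself. The sketch below shows one way to enforce the upper-bound rule in that case. `boundedPattern`, `followersQuery`, and the `MAX_HOPS` value of 5 are assumptions made for illustration, not part of any driver API.

```javascript
// Hypothetical helper names; the hop limit of 5 is an assumed project default.
const MAX_HOPS = 5;
// Matches the uppercase-with-underscores relationship naming convention.
const REL_TYPE_PATTERN = /^[A-Z][A-Z0-9_]*$/;

// Returns a relationship pattern such as "[:FOLLOWS*1..3]" with a clamped upper bound.
function boundedPattern(relType, minHops, maxHops) {
  if (!REL_TYPE_PATTERN.test(relType)) {
    throw new Error(`Invalid relationship type: ${JSON.stringify(relType)}`);
  }
  if (!Number.isInteger(minHops) || !Number.isInteger(maxHops) || minHops < 0 || maxHops < minHops) {
    throw new RangeError('Hop bounds must be integers with 0 <= minHops <= maxHops');
  }
  if (minHops > MAX_HOPS) {
    throw new RangeError(`minHops must not exceed ${MAX_HOPS}`);
  }
  const cappedMax = Math.min(maxHops, MAX_HOPS); // never traverse deeper than the project limit
  return `[:${relType}*${minHops}..${cappedMax}]`;
}

// Property values still travel as query parameters ($userId); only the bounds are interpolated.
function followersQuery(maxHops) {
  return `MATCH (u:User {id: $userId})-${boundedPattern('FOLLOWS', 1, maxHops)}->(friend:User)
RETURN DISTINCT friend LIMIT 25`;
}
```

Clamping rather than rejecting an oversized `maxHops` is a design choice; a stricter variant could throw instead so that callers notice the limit.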
SPARQL Rules
- Default to `PREFIX` declarations for readability.
- Use `VALUES` for small IN-lists; avoid `FILTER IN` on large sets.
- Paginate results with `LIMIT`/`OFFSET`; never pull unbounded result sets in production.
Gremlin Rules
- Compose traversals with `g.V()` chaining; avoid side-effect steps (`sideEffect`, `aggregate`) unless necessary.
- Use `by(coalesce(...))` to guard missing properties.
- Inject traversal-level timeouts (`with('evaluationTimeout',5000)`).
GraphQL over Graph DB
- Keep resolver body a thin pass-through to Cypher/Gremlin; no business logic.
- Use the `@cypher` directive (Neo4j-GraphQL) for custom fields; never build queries by raw string concatenation.
Error Handling and Validation
- Wrap every mutation in an explicit BEGIN/COMMIT; roll back on a caught exception.
- Validate node ID format and relationship direction before executing a write query.
- Map DB-specific errors (e.g., `Neo.ClientError.Schema.ConstraintValidationFailed`) to application-level 4xx/5xx responses.
- Enforce uniqueness with constraints, not only in code.
- Log full query text and parameters on failure; scrub PII.
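As a rough illustration of the error-mapping and logging rules above, the sketch below maps a Neo4j status code to an HTTP status and redacts sensitive parameters before logging. The function names, the chosen HTTP statuses, and the `SENSITIVE_KEYS` list are assumptions for this example; only the `code` property on driver errors and the `Neo.ClientError` / `Neo.TransientError` classification prefixes come from Neo4j itself.

```javascript
// Hypothetical mapping from Neo4j status codes to HTTP statuses; adjust to your API's conventions.
const EXACT_CODE_STATUS = new Map([
  ['Neo.ClientError.Schema.ConstraintValidationFailed', 409], // e.g. duplicate unique key
]);

function httpStatusForNeo4jError(error) {
  const code = typeof error?.code === 'string' ? error.code : '';
  if (EXACT_CODE_STATUS.has(code)) return EXACT_CODE_STATUS.get(code);
  if (code.startsWith('Neo.ClientError.')) return 400;    // the request itself was wrong
  if (code.startsWith('Neo.TransientError.')) return 503; // retrying later may succeed
  return 500;                                             // database or unknown failure
}

// Assumed list of sensitive parameter names; extend it for your own schema.
const SENSITIVE_KEYS = new Set(['email', 'password', 'ssn', 'phone']);

// Shallow scrub: replaces top-level sensitive values, leaves everything else untouched.
function scrubParams(params) {
  return Object.fromEntries(
    Object.entries(params).map(([key, value]) => [
      key,
      SENSITIVE_KEYS.has(key.toLowerCase()) ? '[REDACTED]' : value,
    ])
  );
}

// Builds the HTTP status plus a log entry that is safe to write out.
// The raw error message is left out on purpose, since it can echo property values.
function describeFailure(query, params, error) {
  return {
    status: httpStatusForNeo4jError(error),
    log: { query, params: scrubParams(params), code: error?.code ?? 'unknown' },
  };
}
```

Nested parameter objects would need a recursive scrub; this version only inspects top-level keys.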
Framework-Specific Rules
Neo4j
- Use Neo4j 5+ with composite databases (the successor to Fabric) for sharding large graphs.
- Organize codebase: `/migrations`, `/queries`, `/seed`, `/fixtures`.
- Always enable `apoc.trigger.enabled=true`; implement lightweight data-quality triggers.
- Open a short-lived session per request and close it when done; reuse a single driver instance globally.
Amazon Neptune
- Prefer `neptune_ml` for built-in GNN pipelines.
- Use IAM auth; disable HTTP endpoints in production.
- Leverage Neptune Streams for CDC into analytics lake.
TigerGraph
- Write business logic as GSQL queries; keep parameter lists explicit.
- Use RESTPP + GSQL parameterization to avoid string-built queries.
- Refresh accumulators only where necessary; they block concurrency.
Graph Neural Networks (GNN)
- Export training subgraphs via offline job, not ad-hoc queries.
- Store embeddings back as `User.embedding` vector<float>; index with HNSW plugin or external vector DB.
- Version models (`modelId`, `version`) and embeddings (`embVer`).
Testing
- Use TestContainers-Neo4j or Neptune Local for integration tests.
- Create synthetic mini-graphs that mirror production schema but < 500 nodes.
- Snapshot query results; fail tests on regression in node/edge counts or execution time > 2× baseline.
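The snapshot rule above could be checked with a small comparison function along these lines. The snapshot shape (`nodeCount`, `edgeCount`, `elapsedMs`) and the name `findRegressions` are hypothetical; how the numbers are collected from a test run is left open.

```javascript
// Hypothetical snapshot shape: { nodeCount, edgeCount, elapsedMs }.
// Returns a list of human-readable problems; an empty list means no regression.
function findRegressions(baseline, current, { maxSlowdown = 2 } = {}) {
  const problems = [];
  if (current.nodeCount !== baseline.nodeCount) {
    problems.push(`node count changed: ${baseline.nodeCount} -> ${current.nodeCount}`);
  }
  if (current.edgeCount !== baseline.edgeCount) {
    problems.push(`edge count changed: ${baseline.edgeCount} -> ${current.edgeCount}`);
  }
  // Fail only when the run is slower than maxSlowdown times the baseline.
  if (current.elapsedMs > baseline.elapsedMs * maxSlowdown) {
    problems.push(
      `execution time ${current.elapsedMs}ms exceeds ${maxSlowdown}x baseline (${baseline.elapsedMs}ms)`
    );
  }
  return problems;
}
```

In a test, the returned array can be asserted to be empty, so the failure message lists every regression at once.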
Performance & Monitoring
- Track cache hits, page faults, heap usage.
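- Set a transaction timeout (`db.transaction.timeout` in Neo4j 5, `dbms.transaction.timeout` in 4.x) of 30s unless the workload is analytical.
- Periodically run `SHOW INDEXES` (`CALL db.indexes()` before Neo4j 5) to detect indexes that are not `ONLINE`.
- Use `SHOW TRANSACTIONS` and `TERMINATE TRANSACTIONS` (`CALL dbms.listQueries()` and `CALL dbms.killQuery()` in 4.x) to find and kill live queries; automate an alert at 5× median latency.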
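The 5× median latency alert can be expressed as a pure function over collected timings. This is a minimal sketch: `median`, `slowQueries`, and the `{ id, elapsedMs }` record shape are assumptions, and the timings would come from whatever monitoring source you already have.

```javascript
// Median of a non-empty list of numbers (does not mutate the input).
function median(values) {
  if (values.length === 0) {
    throw new Error('Cannot compute the median of an empty list');
  }
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// history: recent query latencies in milliseconds.
// running: hypothetical records of in-flight queries, each { id, elapsedMs }.
// Returns the in-flight queries that exceed factor x the historical median.
function slowQueries(history, running, factor = 5) {
  const threshold = median(history) * factor;
  return running.filter((query) => query.elapsedMs > threshold);
}
```

The returned records could feed an alert or a kill decision; which of the two is appropriate depends on the workload.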
Security
- Enforce role-based access control; never grant `admin` to application user.
- Require TLS between driver and DB.
- Rotate Neo4j auth tokens every 90 days; use IAM for Neptune.
- Sanitize dynamic labels/relationship types; whitelist allowed values.
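Labels and relationship types cannot be bound as ordinary Cypher parameters, so they end up interpolated into the query text. A whitelist check before interpolation might look like the sketch below. The allowed sets reuse names from this document's examples, and `assertAllowed` and `neighborsQuery` are hypothetical helpers.

```javascript
// Assumed whitelists; in a real project these would mirror the graph schema.
const ALLOWED_LABELS = new Set(['User', 'Product', 'Content']);
const ALLOWED_REL_TYPES = new Set(['FOLLOWS', 'LIKED', 'PURCHASED']);

// Throws unless the value is one of the explicitly allowed strings.
function assertAllowed(value, allowed, kind) {
  if (typeof value !== 'string' || !allowed.has(value)) {
    throw new Error(`${kind} not allowed: ${JSON.stringify(value)}`);
  }
  return value;
}

// Builds query text with a checked label and relationship type.
// The node id is still passed as the $id parameter, never interpolated.
function neighborsQuery(label, relType) {
  const safeLabel = assertAllowed(label, ALLOWED_LABELS, 'Label');
  const safeRelType = assertAllowed(relType, ALLOWED_REL_TYPES, 'Relationship type');
  return `MATCH (n:${safeLabel} {id: $id})-[:${safeRelType}]->(m) RETURN m LIMIT 25`;
}
```

An exact-match whitelist is stricter than escaping: anything not listed is rejected outright, including otherwise valid identifiers.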
DevOps
- Blue/green deploys with read replica warm-up.
- Nightly full backup + hourly incremental.
- Store cypher migrations as immutable, timestamped files; forward-only.
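One way to enforce timestamped, forward-only migrations is to compute the pending set from file names alone. The sketch below assumes a naming scheme of a 14-digit timestamp followed by a snake_case description (for example `20240101000000_init.cypher`); that scheme and the `pendingMigrations` helper are illustrative assumptions rather than a convention of any particular migration tool.

```javascript
// Assumed file naming: <14-digit timestamp>_<snake_case description>.cypher
const MIGRATION_NAME = /^\d{14}_[a-z0-9_]+\.cypher$/;

// files: every migration file name in the migrations directory.
// applied: the file names already recorded as applied in the database.
// Returns the names still to run, oldest first.
function pendingMigrations(files, applied) {
  for (const name of files) {
    if (!MIGRATION_NAME.test(name)) {
      throw new Error(`Unexpected migration file name: ${name}`);
    }
  }
  const appliedSet = new Set(applied);
  // Fixed-width timestamps make lexicographic order equal to chronological order.
  const newestApplied = [...applied].sort().pop() ?? '';
  const pending = [...files].sort().filter((name) => !appliedSet.has(name));
  // Forward-only: refuse to run a migration older than the newest one already applied.
  const stale = pending.filter((name) => name < newestApplied);
  if (stale.length > 0) {
    throw new Error(`Migrations older than the last applied one: ${stale.join(', ')}`);
  }
  return pending;
}
```

Rejecting stale files, instead of silently applying them out of order, keeps every environment on the same linear history.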
Directory Conventions
- `queries/` ─ prepared Cypher, SPARQL, Gremlin files.
- `migrations/` ─ ordered schema changes.
- `tests/fixtures/` ─ sample graphs.
- `docs/diagrams/` ─ graph model visuals.
Common Pitfalls & Guards
- Missing upper bound on variable-length paths → OOM: ALWAYS cap with `..N`.
- Overuse of `OPTIONAL MATCH` yields Cartesian explosion: filter early.
- Forgetting to close driver sessions → socket exhaustion: wrap with `try/finally`.
- Relying on property existence for type discrimination: use labels or relationship types instead.