MCP (Model Context Protocol) server for interacting with dbt Docs artifacts (manifest.json, catalog.json) and exposing dbt graph, lineage and metadata search APIs.
https://github.com/mattijsdp/dbt-docs-mcp

Stop hunting through dbt docs and lineage graphs. This MCP server gives Claude, Cursor, and other AI assistants direct access to your dbt project metadata, so you can ask questions about your data models in natural language and get instant, accurate answers.
You know the drill: you're debugging a data issue at 2 AM, trying to figure out which upstream model is causing problems. You're clicking through the dbt docs interface, tracing lineage manually, grepping through SQL files, and cross-referencing the catalog. What should take 30 seconds stretches into 20 minutes of hunting.
Or you're onboarding a new team member who keeps asking "Where does this column come from?" and "What models depend on this table?" - questions that require you to stop your work and become a human search engine for your own data warehouse.
This MCP server bridges that gap by exposing your dbt artifacts through the Model Context Protocol. Here is what that looks like in practice:
Debugging Data Issues: "Which models feed data into the customer_metrics table?" Instead of clicking through lineage graphs, get an instant list of every upstream dependency (see the sketch after this list for the graph query behind it).
Impact Analysis: "What breaks if I change the schema of raw_orders?" Trace all downstream models that depend on specific columns before making breaking changes.
Code Discovery: "Find all models that calculate customer lifetime value." Search across compiled SQL to locate business logic scattered throughout your project.
Column Lineage Investigation: "Where does the total_revenue column in monthly_reports ultimately come from?" Trace back through all transformations to the original source tables.
New Team Member Onboarding: Instead of explaining your data architecture repeatedly, team members can ask the AI assistant directly about model relationships, business logic, and data flow.
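Under the hood, these answers come straight from manifest.json, which already ships a parent_map/child_map describing model-level lineage. Below is a minimal sketch of that idea using networkx (one of the server's dependencies); it is not the server's actual tool code, and the manifest path and node id are placeholders:

import json

import networkx as nx

# Load the manifest produced by dbt docs generate.
with open("target/manifest.json") as f:
    manifest = json.load(f)

# parent_map maps each node id to the ids it depends on;
# build a directed graph with edges pointing parent -> child.
graph = nx.DiGraph()
for child, parents in manifest["parent_map"].items():
    for parent in parents:
        graph.add_edge(parent, child)

# Everything upstream of customer_metrics (placeholder node id).
node_id = "model.my_project.customer_metrics"
print(sorted(nx.ancestors(graph, node_id)))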
The server reads the artifacts dbt already produces (manifest.json and catalog.json, written to your target directory by dbt docs generate), so no additional setup is required in your dbt project:
git clone https://github.com/mattijsdp/dbt-docs-mcp.git
cd dbt-docs-mcp
uv sync
Point it at your dbt artifacts and add the MCP configuration to your AI client:
{
  "mcpServers": {
    "DBT Docs MCP": {
      "command": "uv",
      "args": ["run", "--with", "networkx,mcp[cli],rapidfuzz,dbt-core,python-decouple,sqlglot,tqdm", "mcp", "run", "/path/to/dbt-docs-mcp/src/mcp_server.py"],
      "env": {
        "MANIFEST_PATH": "/path/to/your/target/manifest.json",
        "SCHEMA_MAPPING_PATH": "/path/to/schema_mapping.json",
        "MANIFEST_CL_PATH": "/path/to/manifest_column_lineage.json"
      }
    }
  }
}
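Before launching your client, it can save a debugging round-trip to confirm those three paths point at valid JSON files (the last two are produced by the preprocessing step below). A throwaway check, not part of the repo; substitute your own paths:

import json

# Placeholder paths; use the same values as in the MCP config above.
paths = {
    "MANIFEST_PATH": "/path/to/your/target/manifest.json",
    "SCHEMA_MAPPING_PATH": "/path/to/schema_mapping.json",
    "MANIFEST_CL_PATH": "/path/to/manifest_column_lineage.json",
}

for name, path in paths.items():
    try:
        with open(path) as f:
            json.load(f)  # fails if the file is missing or not valid JSON
        print(f"{name}: OK")
    except (OSError, json.JSONDecodeError) as err:
        print(f"{name}: {err}")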
For column-level lineage tracing, run the included preprocessing script once:
python scripts/create_manifest_cl.py \
--manifest-path /path/to/manifest.json \
--catalog-path /path/to/catalog.json \
--schema-mapping-path ./schema_mapping.json \
--manifest-cl-path ./manifest_column_lineage.json
This parses your entire dbt project to build column dependency graphs. Depending on project size, this can take a while (potentially hours for large projects), but you only run it when your schema changes significantly.
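The heavy lifting is SQL parsing of each model's compiled query; sqlglot (one of the listed dependencies) can trace an output column back to the physical tables it reads from. A toy illustration of that kind of column tracing follows, with made-up SQL; the script's actual internals may differ:

from sqlglot import exp
from sqlglot.lineage import lineage

# A single compiled model query (made-up example).
sql = """
SELECT o.amount - o.discount AS total_revenue
FROM raw_orders AS o
"""

# Build the lineage tree for one output column.
root = lineage("total_revenue", sql)

# Leaf nodes attached to a physical table are the original sources.
for node in root.walk():
    if isinstance(node.expression, exp.Table):
        print(f"total_revenue <- {node.name} (table {node.expression.name})")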
Instead of context-switching between your IDE, dbt docs, and terminal to understand your data warehouse, you maintain flow state by asking natural language questions. Your AI assistant becomes your dbt project expert, giving you instant access to the institutional knowledge trapped in your manifest files.
For teams managing complex dbt projects, this eliminates the bottleneck of senior engineers constantly fielding "where does this data come from?" questions. Knowledge becomes self-service and searchable.
The server works with any MCP-compatible AI client, so you can integrate it into your existing development workflow without changing tools.