Knowledge Graph
The knowledge graph (state/graph.json) is the structured representation of compiled vault knowledge. Compile also turns that graph into orientation pages such as wiki/graph/report.md, wiki/graph/share-card.md, wiki/graph/share-card.svg, wiki/graph/share-kit/, wiki/graph/index.md, and per-community summaries under wiki/graph/communities/.
Graph Structure
{
  "generatedAt": "2025-01-15T10:30:00Z",
  "nodes": [...],
  "edges": [...],
  "hyperedges": [...],
  "communities": [...],
  "sources": [...],
  "pages": [...]
}
Nodes
Each node represents a source, module, symbol, rationale, concept, entity, task, or decision, and carries the following fields:
- type — source, module, symbol, rationale, concept, entity, memory_task, or decision
- pageId — Linked wiki page when one exists
- projectIds — Project scope for the node
- language — Code language for module and symbol nodes
- symbolKind — Function, class, interface, enum, and similar code symbol types
- communityId — Derived graph cluster membership
- degree / bridgeScore / isGodNode — Derived connectivity signals
- freshness — fresh or stale
- confidence — Extraction confidence score
- sourceClass — first_party, third_party, resource, or generated for repo-aware material
- rationale nodes — Parser-backed comments/docstrings linked back to modules or symbols through rationale_for
- task nodes — Durable task ledger entries linked to context packs, outputs, touched paths, decisions, and follow-ups
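As a rough illustration of how the node fields above can be used, the following Python sketch filters a small hand-written node list. The sample values and the "id" key are assumptions made for the example; only the other field names come from the list above.

```python
from collections import Counter

# Hypothetical sample nodes. The "id" key is an assumption; the other
# field names are the ones documented in the node field list.
nodes = [
    {"id": "n1", "type": "module", "language": "typescript", "degree": 7,
     "isGodNode": True, "freshness": "fresh", "sourceClass": "first_party"},
    {"id": "n2", "type": "symbol", "symbolKind": "function", "degree": 2,
     "isGodNode": False, "freshness": "stale", "sourceClass": "first_party"},
    {"id": "n3", "type": "concept", "degree": 1,
     "isGodNode": False, "freshness": "fresh", "sourceClass": "third_party"},
]


def count_by_type(nodes):
    """Count how many nodes exist for each node type."""
    return Counter(node["type"] for node in nodes)


def stale_first_party(nodes):
    """Return ids of first-party nodes whose freshness is 'stale'."""
    return [
        node["id"]
        for node in nodes
        if node.get("freshness") == "stale"
        and node.get("sourceClass") == "first_party"
    ]


def god_nodes(nodes):
    """Return ids of nodes flagged with isGodNode."""
    return [node["id"] for node in nodes if node.get("isGodNode")]
```

To run the same helpers against a real vault, the list could be loaded from the "nodes" array in state/graph.json with the standard json module.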
Edges
Edges connect nodes with relationship metadata:
- relation — Claim, import/export, define, call, inheritance, or implementation relation
- status — extracted, inferred, conflicted, or stale
- evidenceClass — extracted, inferred, or ambiguous
- confidence — Numeric confidence score
- provenance — Source ids backing the edge
- similarityReasons — Present on semantically_similar_to edges to explain which shared features triggered the inferred link
- similarityBasis — feature_overlap or embeddings, so reports and the viewer can distinguish deterministic overlap from embedding-backed similarity
- task relations — uses_context, records_decision, touched, produced_output, and follows_up connect tasks to their evidence and handoff trail
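The edge metadata above lends itself to simple filtering. The sketch below keeps only well-supported edges and groups similarity edges by their basis. The "source" and "target" endpoint keys, the 0.8 threshold, and all sample values are assumptions made for illustration; the documented schema may name endpoints differently.

```python
from collections import defaultdict

# Hypothetical sample edges. "source" and "target" are assumed endpoint
# keys; the remaining field names come from the edge field list.
edges = [
    {"source": "task-1", "target": "n1", "relation": "uses_context",
     "status": "extracted", "evidenceClass": "extracted",
     "confidence": 0.95, "provenance": ["src-a"]},
    {"source": "n2", "target": "n3", "relation": "semantically_similar_to",
     "status": "inferred", "evidenceClass": "inferred",
     "confidence": 0.62, "provenance": ["src-a", "src-b"],
     "similarityBasis": "feature_overlap",
     "similarityReasons": ["shared imports"]},
    {"source": "n3", "target": "n4", "relation": "semantically_similar_to",
     "status": "inferred", "evidenceClass": "ambiguous",
     "confidence": 0.41, "provenance": ["src-c"],
     "similarityBasis": "embeddings", "similarityReasons": []},
]


def trusted_edges(edges, min_confidence=0.8):
    """Keep edges with extracted evidence and confidence at or above the threshold."""
    return [
        edge for edge in edges
        if edge["evidenceClass"] == "extracted"
        and edge["confidence"] >= min_confidence
    ]


def similarity_edges_by_basis(edges):
    """Group semantically_similar_to edges by their similarityBasis value."""
    grouped = defaultdict(list)
    for edge in edges:
        if edge["relation"] == "semantically_similar_to":
            basis = edge.get("similarityBasis", "unknown")
            grouped[basis].append((edge["source"], edge["target"]))
    return dict(grouped)
```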
Hyperedges
The graph also carries top-level hyperedges for multi-node group patterns that cannot be represented cleanly as one pairwise edge:
- relation — currently participate_in, implement, or form
- nodeIds — the member nodes in the pattern
- why — deterministic explanation of why the pattern was created
- sourcePageIds — canonical pages that ground the pattern
Hyperedges feed wiki/graph/report.md, wiki/graph/report.json, swarmvault graph explain, MCP get_hyperedges, and GraphML/Cypher exports.
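Because a hyperedge lists its members in nodeIds, finding the group patterns a node belongs to is a plain membership test. A minimal sketch, using made-up hyperedge records with the four documented fields:

```python
# Hypothetical hyperedge records; the ids and "why" text are invented
# for the example, the field names are the documented ones.
hyperedges = [
    {"relation": "participate_in", "nodeIds": ["n1", "n2", "n3"],
     "why": "hypothetical explanation text", "sourcePageIds": ["page-1"]},
    {"relation": "form", "nodeIds": ["n3", "n4"],
     "why": "hypothetical explanation text",
     "sourcePageIds": ["page-2", "page-3"]},
]


def hyperedges_for_node(node_id, hyperedges):
    """Return every hyperedge whose nodeIds include the given node."""
    return [h for h in hyperedges if node_id in h["nodeIds"]]


def grounding_pages(node_id, hyperedges):
    """Collect the sourcePageIds of all hyperedges that include the node."""
    pages = set()
    for hyperedge in hyperedges_for_node(node_id, hyperedges):
        pages.update(hyperedge["sourcePageIds"])
    return sorted(pages)
```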
Community ids are derived locally from graph structure through a Louvain clustering pass over non-source nodes, while disconnected nodes stay in singleton communities. MCP get_community resolves a community by id or label and returns its members, pages, and top evidence edges.
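Once community ids are assigned, membership can be read straight off each node's communityId. The sketch below only groups nodes by an already-computed communityId and spots singleton communities; it does not reimplement the Louvain pass. The node "id" key and the sample ids are assumptions.

```python
from collections import defaultdict

# Hypothetical nodes with communityId already assigned.
nodes = [
    {"id": "n1", "type": "module", "communityId": "c1"},
    {"id": "n2", "type": "symbol", "communityId": "c1"},
    {"id": "n3", "type": "concept", "communityId": "c2"},
]


def members_by_community(nodes):
    """Map each communityId to the ids of its member nodes."""
    groups = defaultdict(list)
    for node in nodes:
        community_id = node.get("communityId")
        if community_id is not None:
            groups[community_id].append(node["id"])
    return dict(groups)


def singleton_communities(nodes):
    """Return community ids that contain exactly one node."""
    groups = members_by_community(nodes)
    return sorted(cid for cid, members in groups.items() if len(members) == 1)
```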
The graph report JSON also carries higher-level health signals such as community cohesion, isolated-node warnings, ambiguous-edge ratios, and suggested follow-up questions derived from weak or ambiguous graph regions.
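Two of these signals are easy to approximate by hand. The sketch below computes an ambiguous-edge ratio and a list of isolated nodes from raw node and edge lists. The exact definitions used by the report are not stated here, so treat these formulas, the "id", "source", and "target" keys, and the sample data as assumptions.

```python
def ambiguous_edge_ratio(edges):
    """Share of edges whose evidenceClass is 'ambiguous' (0.0 for no edges)."""
    if not edges:
        return 0.0
    ambiguous = sum(1 for e in edges if e.get("evidenceClass") == "ambiguous")
    return ambiguous / len(edges)


def isolated_node_ids(nodes, edges):
    """Ids of nodes that do not appear as an endpoint of any edge."""
    connected = set()
    for edge in edges:
        connected.add(edge["source"])
        connected.add(edge["target"])
    return [node["id"] for node in nodes if node["id"] not in connected]


# Hypothetical sample data.
sample_nodes = [{"id": "n1"}, {"id": "n2"}, {"id": "n3"}, {"id": "n4"}]
sample_edges = [
    {"source": "n1", "target": "n2", "evidenceClass": "extracted"},
    {"source": "n2", "target": "n3", "evidenceClass": "inferred"},
    {"source": "n1", "target": "n3", "evidenceClass": "ambiguous"},
    {"source": "n2", "target": "n1", "evidenceClass": "extracted"},
]
```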
Pages
The graph also carries a page registry:
- pages[].path and pages[].title for preview/navigation
- pages[].kind and pages[].status for filtering
- pages[].projectIds for project-aware search
- pages[].sourceClass for first-party vs third-party/resource/generated filtering
- pages[].backlinks and pages[].relatedPageIds for local workspace navigation
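A small sketch of how the page registry fields could drive filtering and local navigation. The page "id" key, the paths, the kind and status values, and the assumption that backlinks and relatedPageIds hold page ids are all hypothetical.

```python
# Hypothetical page registry entries.
pages = [
    {"id": "page-1", "path": "wiki/concepts/caching.md", "title": "Caching",
     "kind": "concept", "status": "active", "projectIds": ["proj-a"],
     "sourceClass": "first_party",
     "backlinks": ["page-2"], "relatedPageIds": ["page-3"]},
    {"id": "page-2", "path": "wiki/modules/cache-store.md",
     "title": "Cache Store", "kind": "module", "status": "active",
     "projectIds": ["proj-a"], "sourceClass": "first_party",
     "backlinks": [], "relatedPageIds": ["page-1"]},
    {"id": "page-3", "path": "wiki/entities/redis.md", "title": "Redis",
     "kind": "entity", "status": "stale", "projectIds": ["proj-b"],
     "sourceClass": "third_party",
     "backlinks": ["page-1"], "relatedPageIds": []},
]


def find_pages(pages, kind=None, status=None, project_id=None, source_class=None):
    """Return paths of pages matching every filter that is not None."""
    matches = []
    for page in pages:
        if kind is not None and page.get("kind") != kind:
            continue
        if status is not None and page.get("status") != status:
            continue
        if project_id is not None and project_id not in page.get("projectIds", []):
            continue
        if source_class is not None and page.get("sourceClass") != source_class:
            continue
        matches.append(page["path"])
    return matches


def linked_page_ids(page_id, pages):
    """Union of a page's backlinks and relatedPageIds, sorted."""
    for page in pages:
        if page["id"] == page_id:
            linked = set(page.get("backlinks", [])) | set(page.get("relatedPageIds", []))
            return sorted(linked)
    return []
```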
Local Graph Navigation
SwarmVault also exposes deterministic local graph tools without calling a provider:
- swarmvault graph query
- swarmvault graph path
- swarmvault graph explain
- swarmvault graph god-nodes
- swarmvault graph share
- swarmvault graph blast
- swarmvault graph status
- swarmvault graph update
The same graph-native read surface is exposed over MCP through query_graph, graph_report, graph_stats, get_node, get_community, get_neighbors, get_hyperedges, shortest_path, god_nodes, and blast_radius.
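To make the idea of a deterministic path query concrete, here is a breadth-first search over an edge list. It is an illustrative sketch, not SwarmVault's implementation: it treats edges as undirected, ignores confidence and relation, and assumes "source" and "target" endpoint keys.

```python
from collections import deque


def shortest_path(edges, start, goal):
    """Return the fewest-hop path from start to goal, or None if unreachable."""
    # Build an undirected adjacency map from the edge list.
    adjacency = {}
    for edge in edges:
        adjacency.setdefault(edge["source"], set()).add(edge["target"])
        adjacency.setdefault(edge["target"], set()).add(edge["source"])

    if start == goal:
        return [start]

    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        current = queue.popleft()
        # Sorted iteration keeps the result deterministic.
        for neighbor in sorted(adjacency.get(current, ())):
            if neighbor in previous:
                continue
            previous[neighbor] = current
            if neighbor == goal:
                # Walk the predecessor chain back to the start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical edges: a short route a-b-c and a longer route a-d-e-c.
demo_edges = [
    {"source": "a", "target": "b"},
    {"source": "b", "target": "c"},
    {"source": "a", "target": "d"},
    {"source": "d", "target": "e"},
    {"source": "e", "target": "c"},
]
```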
When an embedding-capable provider is available, swarmvault graph query and the graph-report similarity pass also use cached embeddings from state/embeddings.json. tasks.embeddingProvider is the explicit way to choose that backend, but SwarmVault can also fall back to a queryProvider with embeddings support. Without that, the same surfaces still work with lexical graph matching only.
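The fallback order described above can be sketched as a small decision function. The config shape (queryProvider sitting next to embeddingProvider under tasks), the provider names, and the "embeddings" capability flag are assumptions made for this example and may not match the real configuration schema.

```python
def choose_similarity_backend(config, providers):
    """Pick ("embeddings", provider_name) or ("lexical", None).

    config and providers are hypothetical dictionaries:
    providers maps a provider name to a dict with an "embeddings" flag.
    """
    tasks = config.get("tasks", {})

    # 1. An explicitly configured embedding provider wins.
    explicit = tasks.get("embeddingProvider")
    if explicit and providers.get(explicit, {}).get("embeddings"):
        return ("embeddings", explicit)

    # 2. Otherwise fall back to a query provider that supports embeddings.
    fallback = tasks.get("queryProvider")
    if fallback and providers.get(fallback, {}).get("embeddings"):
        return ("embeddings", fallback)

    # 3. With no embedding-capable provider, use lexical matching only.
    return ("lexical", None)


# Hypothetical provider registry.
providers = {
    "embedder": {"embeddings": True},
    "chat-with-embeddings": {"embeddings": True},
    "chat-only": {"embeddings": False},
}
```

For instance, a config that names only a chat-only query provider resolves to lexical matching, which mirrors the last sentence of the paragraph above.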
Provenance
Every edge in the graph traces back to a specific source and claim. This enables you to verify any piece of knowledge by following the provenance chain back to the raw input.
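Following a provenance chain amounts to resolving an edge's provenance ids against the top-level sources array. A minimal sketch, assuming each source record has an "id" key and using invented paths and endpoint keys:

```python
# Hypothetical source records and edge; "id", "path", "source", and
# "target" are assumed keys, "provenance" is the documented edge field.
sources = [
    {"id": "src-a", "path": "raw/notes/design.md"},
    {"id": "src-b", "path": "raw/repo/cache.ts"},
]

edge = {"source": "n2", "target": "n3", "relation": "semantically_similar_to",
        "provenance": ["src-a", "src-b", "src-missing"]}


def sources_for_edge(edge, sources):
    """Return the source records referenced by the edge's provenance ids."""
    by_id = {source["id"]: source for source in sources}
    return [by_id[sid] for sid in edge.get("provenance", []) if sid in by_id]


def unresolved_provenance(edge, sources):
    """Return provenance ids that have no matching source record."""
    known = {source["id"] for source in sources}
    return [sid for sid in edge.get("provenance", []) if sid not in known]
```

An id that does not resolve, like "src-missing" here, is a useful signal that the edge should be re-verified against the raw input.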