# Architecture
SwarmVault is built around a simple pipeline: ingest → shape → analyze → compile → query.
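The five stages can be pictured as a simple function composition. The sketch below is illustrative only — the stage names mirror the pipeline, but the functions are placeholders, not SwarmVault's actual API:

```python
from typing import Callable

def make_stage(name: str) -> Callable[[dict], dict]:
    """Build a placeholder stage that records its name on the payload,
    so the order of the pipeline is visible in the result."""
    def stage(doc: dict) -> dict:
        return {**doc, "stages": doc.get("stages", []) + [name]}
    return stage

# One placeholder per pipeline stage, applied in order.
PIPELINE = [make_stage(n) for n in ("ingest", "shape", "analyze", "compile", "query")]

def run(doc: dict) -> dict:
    for stage in PIPELINE:
        doc = stage(doc)
    return doc

result = run({"source": "notes.md"})
print(result["stages"])  # ['ingest', 'shape', 'analyze', 'compile', 'query']
```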
## Data Flow
- Raw sources are ingested and stored immutably with content hashes
- **Schema guidance** comes from `swarmvault.schema.md`, which defines vault-specific rules
- Analysis extracts concepts, entities, claims, and questions from each source
- Compilation merges analyses into a unified knowledge graph
- Wiki generation produces Markdown pages from the graph
- Search indexing enables full-text queries over the wiki
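The first step — immutable, content-hashed storage — can be sketched as a content-addressed store: a file is saved under the hash of its bytes, so re-ingesting identical content is a no-op and nothing is ever overwritten. This is a minimal sketch under that assumption; the `ingest` function and store layout are hypothetical, not SwarmVault's real interface:

```python
import hashlib
import pathlib
import tempfile

def ingest(raw: bytes, store: pathlib.Path) -> str:
    """Store raw bytes under their SHA-256 digest.

    The digest is the file's address, so identical content maps to the
    same path and existing files are never modified — the store is
    effectively immutable and append-only.
    """
    digest = hashlib.sha256(raw).hexdigest()
    path = store / digest
    if not path.exists():  # content-addressed: never overwrite
        path.write_bytes(raw)
    return digest

# Usage: ingesting the same bytes twice yields the same address.
store = pathlib.Path(tempfile.mkdtemp())
h1 = ingest(b"raw source text", store)
h2 = ingest(b"raw source text", store)
assert h1 == h2
```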
## Dual Outputs
SwarmVault produces two canonical artifacts:
- **Wiki** (`wiki/`) — Human-readable Markdown pages organized by page kind (index, sources, concepts, entities, outputs)
- **Graph** (`state/graph.json`) — Machine-readable JSON with nodes, edges, and full provenance metadata
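To make the graph artifact concrete, here is a hypothetical shape for `state/graph.json` built as a Python dict. The field names (`id`, `kind`, `provenance`, `from`, `to`) are illustrative assumptions, not SwarmVault's actual schema:

```python
import json

# Illustrative graph structure: nodes carry a kind and provenance
# pointing back to a source location; edges are typed links.
graph = {
    "nodes": [
        {
            "id": "concept:content-addressing",
            "kind": "concept",
            "provenance": {"source": "sources/notes.md", "line": 12},
        },
        {"id": "entity:sha-256", "kind": "entity",
         "provenance": {"source": "sources/notes.md", "line": 14}},
    ],
    "edges": [
        {"from": "concept:content-addressing",
         "to": "entity:sha-256", "kind": "mentions"},
    ],
}

# Serialize as the machine-readable artifact would be written.
serialized = json.dumps(graph, indent=2, sort_keys=True)
```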
## Key Design Principles
- **Immutable inputs** — raw sources are never modified
- **Deterministic compilation** — same inputs produce same outputs
- **Schema-guided behavior** — each vault can impose its own structure without code changes
- **Provenance tracking** — every claim traces back to its source
- **Anti-drift** — linting detects when knowledge becomes stale
- **Provider agnostic** — swap LLMs without changing the pipeline
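Deterministic compilation is usually achieved by canonicalizing the output: deduplicating and sorting the merged nodes, then serializing with a fixed key order so identical inputs yield byte-identical output regardless of input ordering. A minimal sketch of that idea — `compile_graph` and its input shape are assumptions for illustration, not SwarmVault's code:

```python
import hashlib
import json

def compile_graph(analyses: list[dict]) -> bytes:
    """Merge per-source analyses into one graph, canonically serialized.

    Nodes are deduplicated by id and sorted, and json.dumps uses
    sort_keys and fixed separators, so the result is byte-stable:
    the same set of analyses always compiles to the same bytes.
    """
    merged = {n["id"]: n for a in analyses for n in a["nodes"]}
    nodes = sorted(merged.values(), key=lambda n: n["id"])
    return json.dumps({"nodes": nodes}, sort_keys=True,
                      separators=(",", ":")).encode()

# Same analyses in a different order compile to identical bytes,
# and therefore to the same content hash.
a = [{"nodes": [{"id": "b"}, {"id": "a"}]}]
b = [{"nodes": [{"id": "a"}, {"id": "b"}]}]
assert hashlib.sha256(compile_graph(a)).digest() == \
       hashlib.sha256(compile_graph(b)).digest()
```

Sorting plus canonical serialization is the standard way to make a merge step reproducible; any nondeterminism (set iteration order, timestamps) would break the "same inputs, same outputs" guarantee.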