Architecture

SwarmVault is built around a simple pipeline: ingest → shape → analyze → compile → query.

Data Flow

  1. Raw sources are ingested and stored immutably with content hashes
  2. Schema guidance comes from swarmvault.schema.md, which defines vault-specific rules
  3. Analysis extracts concepts, entities, claims, and questions from each source
  4. Compilation merges analyses into a unified knowledge graph
  5. Wiki generation produces Markdown pages from the graph
  6. Search indexing enables full-text queries over the wiki
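The ingest step above can be sketched as a content-addressed store: each raw source is hashed and written once, never overwritten. This is an illustrative sketch only — the store path and function names are assumptions, not SwarmVault's actual layout:

```python
import hashlib
from pathlib import Path

def ingest(raw_bytes: bytes, store_dir: Path) -> str:
    """Store raw source bytes immutably, keyed by SHA-256 content hash.

    Hypothetical sketch: the real store layout is not specified here.
    """
    digest = hashlib.sha256(raw_bytes).hexdigest()
    path = store_dir / digest
    if not path.exists():  # immutable: identical content is stored once
        path.write_bytes(raw_bytes)
    return digest

store = Path("state/raw")  # illustrative location
store.mkdir(parents=True, exist_ok=True)
h = ingest(b"example source text", store)
print(h[:12])
```

Because the key is derived from the content itself, re-ingesting the same bytes is a no-op, which is what makes the input store immutable.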

Dual Outputs

SwarmVault produces two canonical artifacts:

  • **Wiki** (wiki/) — Human-readable Markdown pages organized by page kind (index, sources, concepts, entities, outputs)
  • **Graph** (state/graph.json) — Machine-readable JSON with nodes, edges, and full provenance metadata
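To make the graph artifact concrete, here is a minimal illustration of a nodes/edges/provenance layout. The field names are assumptions for the sake of the example, not the actual state/graph.json schema:

```python
import json

# Hypothetical graph fragment: one concept node linked to the source
# it was extracted from, with provenance on the edge.
graph = {
    "nodes": [
        {"id": "concept:content-addressing", "kind": "concept",
         "label": "Content addressing"},
        {"id": "source:abc123", "kind": "source", "hash": "abc123"},
    ],
    "edges": [
        {"from": "concept:content-addressing", "to": "source:abc123",
         "type": "derived_from",
         # provenance: which source (and where in it) this claim came from
         "provenance": {"source_hash": "abc123", "span": [0, 42]}},
    ],
}

print(json.dumps(graph, indent=2))
```

The key property is that every edge carries enough provenance metadata to trace a claim back to a specific source, which the wiki pages can then surface as citations.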

Key Design Principles

  • Immutable inputs — raw sources are never modified
  • Deterministic compilation — same inputs produce same outputs
  • Schema-guided behavior — each vault can impose its own structure without code changes
  • Provenance tracking — every claim traces back to its source
  • Anti-drift — linting detects when knowledge becomes stale
  • Provider agnostic — swap LLMs without changing the pipeline
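The deterministic-compilation principle can be sketched as canonical serialization: merge by stable keys, sort, and emit JSON with fixed formatting so equal inputs always yield byte-identical output. This is one common way to get the property, not necessarily SwarmVault's implementation:

```python
import hashlib
import json

def compile_graph(analyses: list) -> bytes:
    """Merge per-source analyses into one graph, serialized canonically."""
    nodes = {}
    for analysis in analyses:
        for node in analysis["nodes"]:
            nodes[node["id"]] = node  # merge keyed by stable node id
    graph = {"nodes": [nodes[k] for k in sorted(nodes)]}
    # sort_keys + fixed separators => same inputs, same bytes
    return json.dumps(graph, sort_keys=True, separators=(",", ":")).encode()

# Input order must not matter:
a = compile_graph([{"nodes": [{"id": "b"}, {"id": "a"}]}])
b = compile_graph([{"nodes": [{"id": "a"}, {"id": "b"}]}])
print(hashlib.sha256(a).hexdigest() == hashlib.sha256(b).hexdigest())
```

Byte-identical output makes drift detection cheap: if recompiling unchanged inputs produces a different hash, something upstream is nondeterministic.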