OSS Validation
SwarmVault has a second release-validation lane that runs the published npm package against a small pinned corpus of public repositories.
This is deliberately different from the normal smoke flow:
- smoke proves the core product flows quickly
- the OSS corpus proves those flows on real repo shapes people actually work with
The default corpus stays intentionally small so provider cost and run time remain bounded.
SwarmVault also keeps a separate tiny controlled fixture matrix in the OSS repo. That matrix covers the core code-language baseline plus the local non-code file kinds the CLI consumes, and it runs through the same installed-package smoke path before broader public-repo validation.
That tiny matrix now also includes:
- local ingest for the full Word family (.docx, .docm, .dotx, .dotm), Excel family (.xlsx, .xlsm, .xlsb, .xls, .xltx, .xltm), PowerPoint family (.pptx, .pptm, .potx, .potm), OpenDocument (.odt, .odp, .ods), Jupyter notebooks (.ipynb), plus Rich Text (.rtf), BibTeX (.bib), Org-mode (.org), and AsciiDoc (.adoc), each with extracted text and metadata sidecars
- local .rst ingest with normalized headings and searchable extracted text
- browser-style markdown and HTML inbox bundles with copied local assets
- an optional packaged browser check that opens both graph serve and the exported HTML graph in headless Chromium and verifies selection, path highlighting, and deselection
Default Gated Repos
- sindresorhus/ky: compact TypeScript library with source, tests, and docs
- remarkjs/react-markdown: docs-heavy JS/TS repo with architecture and examples
- pallets/itsdangerous: small Python library with code plus documentation
- necolas/normalize.css: small web/docs repo with stylesheet-first content
Optional Canary
- apple/sample-food-truck
That canary is not part of the default gated lane. It is there to exercise a mixed-language Apple-style project layout without slowing the normal release gate.
Run It
From the OSS repo:
pnpm live:oss:corpus
Target one or two repos while iterating:
pnpm live:oss:corpus -- --repo ky --repo react-markdown
Include the canary explicitly:
pnpm live:oss:corpus -- --include-canary
Run the same flow against a provider-backed lane:
OPENAI_API_KEY=... pnpm live:oss:corpus -- --lane openai
What It Verifies
For each repo, the runner uses the installed CLI path and executes:
swarmvault init
swarmvault ingest <repo> --repo-root <repo>
swarmvault compile
swarmvault benchmark
swarmvault graph query "<repo prompt>"
swarmvault query "<repo prompt>"
swarmvault graph export --html <output>
It then checks:
- source, page, and module-page counts
- first-party source classification
- benchmark output
- graph query results
- saved query output
- wiki/graph/report.json
- standalone graph export output
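To replay the per-repo sequence by hand while debugging a failure, the commands listed above can be generated for a given checkout. The sketch below is a minimal dry-run helper, not the runner itself: `print_corpus_steps` is a hypothetical name, and the repo path, prompt, and output path in the call are placeholders. It only prints the commands, so it is safe to run without the CLI installed.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: prints the per-repo command sequence
# described above without executing any of it.
set -euo pipefail

print_corpus_steps() {
  local repo="$1"    # path to the cloned repo (placeholder)
  local prompt="$2"  # repo-specific prompt (placeholder)
  local output="$3"  # destination for the HTML graph export (placeholder)

  printf '%s\n' \
    "swarmvault init" \
    "swarmvault ingest ${repo} --repo-root ${repo}" \
    "swarmvault compile" \
    "swarmvault benchmark" \
    "swarmvault graph query \"${prompt}\"" \
    "swarmvault query \"${prompt}\"" \
    "swarmvault graph export --html ${output}"
}

# Example call with placeholder values.
print_corpus_steps "./ky" "How are retries handled?" "./graph.html"
```

Printing instead of executing keeps the helper free of assumptions about the working directory the real runner uses for each repo.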
Why It Uses Small Repos
The goal is repeatable release validation, not heroic one-off stress tests.
Keeping the default corpus small means:
- lower provider costs
- faster local reruns while fixing issues
- clearer regression signals when behavior changes
- fewer false failures from giant vendored or generated trees
The tiny matrix complements that by giving release validation a stable per-language and per-file-type baseline:
- code baseline: JavaScript, JSX, TypeScript, TSX, Bash, Python, Go, Rust, Java, Kotlin, Scala, Dart, Lua, Zig, C#, C, C++, PHP, Ruby, and PowerShell
- additional parser-backed language coverage is exercised separately for Elixir, OCaml, Objective-C, ReScript, Solidity, HTML, CSS, and Vue single-file components
- local files: markdown, text, reStructuredText, HTML, PDF, the Word family (.docx/.docm/.dotx/.dotm), RTF, EPUB, CSV/TSV, the Excel family (.xlsx/.xlsm/.xlsb/.xls/.xltx/.xltm), the PowerPoint family (.pptx/.pptm/.potx/.potm), OpenDocument (.odt/.odp/.ods), Jupyter notebooks, BibTeX, Org-mode, AsciiDoc, structured config/data (JSON/YAML/TOML/XML/INI/ENV/PROPERTIES), images (including modern .heic/.avif/.jxl), and code
The browser-backed packaged lane is intentionally optional because it needs a local Chromium install, but when enabled it validates the same installed-package artifacts people actually run:
pnpm exec playwright install chromium
pnpm live:smoke:heuristic:browser
When a real bug is found in one of these repos, reduce it into a stable regression test or small derived fixture before treating the fix as done.