Groq Provider

Groq is exposed as a named preset over SwarmVault's OpenAI-compatible adapter. Groq specializes in fast inference for open-weight models.

Configuration

{
  "providers": {
    "fast": {
      "type": "groq",
      "model": "llama-3.3-70b-versatile"
    }
  }
}

Options

| Field | Default | Description |
| --- | --- | --- |
| model | -- | Model ID (e.g., llama-3.3-70b-versatile, mixtral-8x7b-32768) |
| apiKeyEnv | "GROQ_API_KEY" | Environment variable holding the API key |
| baseUrl | https://api.groq.com/openai/v1 | API base URL |
| apiStyle | "chat" | API style (chat for Chat Completions) |
| capabilities | -- | Override auto-detected capabilities |
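
The defaults rarely need changing, but any field can be overridden. For example, to read the key from a different environment variable (the MY_GROQ_KEY name here is illustrative):

```json
{
  "providers": {
    "fast": {
      "type": "groq",
      "model": "llama-3.3-70b-versatile",
      "apiKeyEnv": "MY_GROQ_KEY",
      "baseUrl": "https://api.groq.com/openai/v1"
    }
  }
}
```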

Environment Variable

export GROQ_API_KEY=gsk_...
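
Because Groq exposes an OpenAI-compatible Chat Completions endpoint, any OpenAI-style client can talk to it by pointing the base URL at Groq and sending the key as a bearer token. A minimal sketch of what the adapter sends under the hood (request construction only; the helper name is ours, not SwarmVault's):

```python
import os
import json

def build_chat_request(model: str, prompt: str):
    """Build an OpenAI-style Chat Completions request for Groq's endpoint.

    Returns (url, headers, body) suitable for any HTTP client.
    """
    api_key = os.environ.get("GROQ_API_KEY", "")
    url = "https://api.groq.com/openai/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return url, headers, body

# To actually send it, e.g. with urllib:
#   req = urllib.request.Request(url, data=body, headers=headers)
#   resp = json.load(urllib.request.urlopen(req))
#   print(resp["choices"][0]["message"]["content"])
```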

Notes

  • Speed: Groq is optimized for low-latency inference. Useful as a fast provider for tasks like lint or compile where speed matters more than maximum quality.
  • Rate limits: Groq has per-model rate limits on tokens per minute. If you hit limits during large compiles, consider using Groq for query/lint tasks and a different provider for compile.
  • Models: Groq hosts Llama, Mixtral, and Gemma variants. Check the Groq console for the current model list.
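
If you do route heavy workloads through Groq, the per-model token limits above are easiest to absorb with exponential backoff on HTTP 429 responses. A generic sketch (this is not SwarmVault's API; RateLimitError stands in for whatever your HTTP client raises on a 429):

```python
import time

class RateLimitError(Exception):
    """Stand-in for an HTTP 429 (rate limited) response."""

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry `call`, sleeping exponentially longer after each rate-limit error.

    Delays are base_delay * 2**attempt; the final failure is re-raised.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
```

In practice you would wrap each provider request in with_backoff, keeping base_delay large enough that retries stay under the tokens-per-minute window.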