# Groq Provider

Groq is exposed as a named preset over SwarmVault's OpenAI-compatible adapter. Groq specializes in fast inference for open-weight models.
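Because the preset sits on the OpenAI-compatible adapter, requests ultimately target Groq's standard Chat Completions endpoint. A minimal sketch of the request shape an OpenAI-compatible client would produce (illustrative only, not SwarmVault's actual internals; no network call is made, and `build_chat_request` is a hypothetical helper):

```python
import json

# Default base URL from the provider preset.
BASE_URL = "https://api.groq.com/openai/v1"

def build_chat_request(model: str, prompt: str) -> tuple[str, dict]:
    """Return the URL and JSON body an OpenAI-compatible client would
    POST for a single-turn chat completion. Auth is a bearer token read
    from the GROQ_API_KEY environment variable (not shown here)."""
    url = f"{BASE_URL}/chat/completions"
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, body

url, body = build_chat_request("llama-3.3-70b-versatile", "Say hello")
print(url)
print(json.dumps(body))
```

Any OpenAI-compatible client pointed at the same base URL will speak this wire format, which is why Groq can be a thin preset rather than a bespoke adapter.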
## Configuration

```json
{
  "providers": {
    "fast": {
      "type": "groq",
      "model": "llama-3.3-70b-versatile"
    }
  }
}
```

## Options
| Field | Default | Description |
|---|---|---|
| `model` | -- | Model ID (e.g., `llama-3.3-70b-versatile`, `mixtral-8x7b-32768`) |
| `apiKeyEnv` | `"GROQ_API_KEY"` | Environment variable holding the API key |
| `baseUrl` | `https://api.groq.com/openai/v1` | API base URL |
| `apiStyle` | `"chat"` | API style (`chat` for Chat Completions) |
| `capabilities` | -- | Override auto-detected capabilities |
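Putting the optional fields together, a provider entry that spells out the defaults explicitly might look like this (a sketch; the values shown are the documented defaults, useful only as a starting point when you need to change one, e.g. to route through a proxy `baseUrl`):

```json
{
  "providers": {
    "fast": {
      "type": "groq",
      "model": "llama-3.3-70b-versatile",
      "apiKeyEnv": "GROQ_API_KEY",
      "baseUrl": "https://api.groq.com/openai/v1",
      "apiStyle": "chat"
    }
  }
}
```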
## Environment Variable

```shell
export GROQ_API_KEY=gsk_...
```

## Notes
- Speed: Groq is optimized for low-latency inference. Useful as a fast provider for tasks like lint or compile where speed matters more than maximum quality.
- Rate limits: Groq has per-model rate limits on tokens per minute. If you hit limits during large compiles, consider using Groq for query/lint tasks and a different provider for compile.
- Models: Groq hosts Llama, Mixtral, and Gemma variants. Check the Groq console for the current model list.
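The rate-limit advice above can be expressed directly in configuration by defining two providers side by side. A sketch, assuming a second OpenAI-compatible provider type named `openai` is also available (the second entry's type, name, and model are assumptions, not documented on this page):

```json
{
  "providers": {
    "fast": {
      "type": "groq",
      "model": "llama-3.3-70b-versatile"
    },
    "thorough": {
      "type": "openai",
      "model": "gpt-4o"
    }
  }
}
```

Point latency-sensitive tasks such as query and lint at `fast`, and reserve `thorough` for compile, so large compile runs do not consume Groq's per-model tokens-per-minute budget.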