Groq Provider

Groq is exposed as a named preset over SwarmVault's OpenAI-compatible adapter. Groq specializes in fast inference for open-weight models.

Configuration

{
  "providers": {
    "fast": {
      "type": "groq",
      "model": "llama-3.3-70b-versatile"
    }
  }
}

Options

| Field | Default | Description |
| --- | --- | --- |
| model | -- | Model ID (e.g., llama-3.3-70b-versatile, mixtral-8x7b-32768) |
| apiKeyEnv | "GROQ_API_KEY" | Environment variable holding the API key |
| baseUrl | https://api.groq.com/openai/v1 | API base URL |
| apiStyle | "chat" | API style (chat for Chat Completions) |
| capabilities | -- | Override auto-detected capabilities |
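
The defaults rarely need changing, but any field can be overridden. For example, to read the key from a different environment variable (the MY_GROQ_KEY name here is illustrative):

```json
{
  "providers": {
    "fast": {
      "type": "groq",
      "model": "llama-3.3-70b-versatile",
      "apiKeyEnv": "MY_GROQ_KEY",
      "baseUrl": "https://api.groq.com/openai/v1"
    }
  }
}
```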

Environment Variable

export GROQ_API_KEY=gsk_...
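
Because Groq exposes an OpenAI-compatible Chat Completions endpoint, any OpenAI-style client can talk to it by pointing the base URL at Groq and sending the key as a bearer token. A minimal sketch of what the adapter sends under the hood (request construction only; the helper name is ours, not SwarmVault's):

```python
import os
import json

def build_chat_request(model: str, prompt: str):
    """Build an OpenAI-style Chat Completions request for Groq's endpoint.

    Returns (url, headers, body) suitable for any HTTP client.
    """
    api_key = os.environ.get("GROQ_API_KEY", "")
    url = "https://api.groq.com/openai/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return url, headers, body

# To actually send it, e.g. with urllib:
#   req = urllib.request.Request(url, data=body, headers=headers)
#   resp = json.load(urllib.request.urlopen(req))
#   print(resp["choices"][0]["message"]["content"])
```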

Notes

  • Speed: Groq is optimized for low-latency inference. Useful as a fast provider for tasks like lint or compile where speed matters more than maximum quality.
  • Rate limits: Groq has per-model rate limits on tokens per minute. If you hit limits during large compiles, consider using Groq for query/lint tasks and a different provider for compile.
  • Models: Groq hosts Llama, Mixtral, and Gemma variants. Check the Groq console for the current model list.
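
If you do route heavy workloads through Groq, the per-model token limits above are easiest to absorb with exponential backoff on HTTP 429 responses. A generic sketch (this is not SwarmVault's API; RateLimitError stands in for whatever your HTTP client raises on a 429):

```python
import time

class RateLimitError(Exception):
    """Stand-in for an HTTP 429 (rate limited) response."""

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry `call`, sleeping exponentially longer after each rate-limit error.

    Delays are base_delay * 2**attempt; the final failure is re-raised.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
```

In practice you would wrap each provider request in with_backoff, keeping base_delay large enough that retries stay under the tokens-per-minute window.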