Agentgateway Model and Provider Cookbook
Route to any LLM through a single gateway. Agentgateway supports any provider with an OpenAI-compatible API.
Native Providers
First-class support with full API translation in agentgateway.
OpenAI
Nativegpt-4o
gpt-4o-mini
gpt-4-turbo
+32 more
api.openai.com
Auth:
$OPENAI_API_KEY
Anthropic
Nativeclaude-opus-4-6
claude-sonnet-4-6
claude-opus-4-5
+11 more
api.anthropic.com
Auth:
$ANTHROPIC_API_KEY
Amazon Bedrock
Nativeanthropic.claude-sonnet-4.6
anthropic.claude-opus-4.6
anthropic.claude-sonnet-4.5
+44 more
bedrock-runtime.{region}.amazonaws.com
Auth:
$AWS_ACCESS_KEY_ID
Google Gemini
Nativegemini-2.5-pro
gemini-2.5-flash
gemini-2.5-flash-lite
+23 more
generativelanguage.googleapis.com
Auth:
$GOOGLE_KEY
Google Vertex AI
Nativegemini-2.5-pro
gemini-2.5-flash
gemini-2.5-flash-lite
+29 more
{region}-aiplatform.googleapis.com
Auth:
$VERTEX_AI_API_KEY
Azure OpenAI
Nativegpt-4o
gpt-4o-mini
gpt-4-turbo
+27 more
{resource}.openai.azure.com
Auth:
$AZURE_API_KEY
OpenAI-Compatible Providers
These providers expose an OpenAI-compatible API. Agentgateway routes to them using the openai provider type with custom host, port, and path overrides.
Mistral AI
OpenAI-compatmistral-large-latest
mistral-large-2512
mistral-medium-latest
+26 more
api.mistral.ai
Auth:
$MISTRAL_API_KEY
DeepSeek
OpenAI-compatdeepseek-chat
deepseek-reasoner
deepseek-v3
+4 more
api.deepseek.com
Auth:
$DEEPSEEK_API_KEY
xAI (Grok)
OpenAI-compatgrok-4
grok-4-fast-reasoning
grok-4-fast-non-reasoning
+11 more
api.x.ai
Auth:
$XAI_API_KEY
Groq
OpenAI-compatllama-3.3-70b-versatile
llama-3.1-8b-instant
llama-4-maverick-17b-128e-instruct
+12 more
api.groq.com
Auth:
$GROQ_API_KEY
Cohere
OpenAI-compatcommand-r-plus
command-r
command-a-03-2025
+11 more
api.cohere.com
Auth:
$COHERE_API_KEY
Together AI
OpenAI-compatmeta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
meta-llama/Llama-3.3-70B-Instruct-Turbo
meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo
+20 more
api.together.xyz
Auth:
$TOGETHER_API_KEY
Fireworks AI
OpenAI-compatllama-v3p3-70b-instruct
llama-v3p1-405b-instruct
llama-v3p1-70b-instruct
+26 more
api.fireworks.ai
Auth:
$FIREWORKS_API_KEY
Perplexity AI
OpenAI-compatsonar-pro
sonar
sonar-deep-research
+6 more
api.perplexity.ai
Auth:
$PERPLEXITY_API_KEY
OpenRouter
OpenAI-compatopenai/gpt-4o
openai/gpt-5
openai/gpt-5-mini
+43 more
openrouter.ai
Auth:
$OPENROUTER_API_KEY
Cerebras
OpenAI-compatllama-3.3-70b
llama3.1-70b
llama3.1-8b
+5 more
api.cerebras.ai
Auth:
$CEREBRAS_API_KEY
SambaNova
OpenAI-compatMeta-Llama-3.1-405B-Instruct
Meta-Llama-3.1-70B-Instruct
Meta-Llama-3.1-8B-Instruct
+11 more
api.sambanova.ai
Auth:
$SAMBANOVA_API_KEY
DeepInfra
OpenAI-compatmeta-llama/Llama-4-Scout-17B-16E-Instruct
meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
meta-llama/Llama-3.3-70B-Instruct-Turbo
+20 more
api.deepinfra.com
Auth:
$DEEPINFRA_API_KEY
HuggingFace
OpenAI-compatmeta-llama/Llama-4-Scout-17B-16E-Instruct
meta-llama/Llama-4-Maverick-17B-128E-Instruct
meta-llama/Llama-3.1-70B-Instruct
+16 more
api-inference.huggingface.co
Auth:
$HF_API_KEY
Nvidia NIM
OpenAI-compatmeta/llama-4-maverick-17b-128e-instruct
meta/llama-4-scout-17b-16e-instruct
meta/llama-3.1-405b-instruct
+16 more
integrate.api.nvidia.com
Auth:
$NVIDIA_API_KEY
Replicate
OpenAI-compatmeta/llama-4-scout-17b-16e-instruct
meta/llama-4-maverick-17b-128e-instruct
meta/llama-3.1-405b-instruct
+9 more
api.replicate.com
Auth:
$REPLICATE_API_KEY
AI21
OpenAI-compatjamba-1.5-large
jamba-1.5-mini
jamba-instruct
+5 more
api.ai21.com
Auth:
$AI21_API_KEY
Cloudflare Workers AI
OpenAI-compat@cf/meta/llama-3.1-8b-instruct
@cf/meta/llama-3.1-70b-instruct
@cf/meta/llama-3.2-3b-instruct
+6 more
api.cloudflare.com
Auth:
$CF_API_TOKEN
Lambda AI
OpenAI-compathermes-3-llama-3.1-405b-fp8
hermes-3-llama-3.1-70b-fp8
llama-3.1-405b-instruct
+4 more
api.lambdalabs.com
Auth:
$LAMBDA_API_KEY
Nebius AI Studio
OpenAI-compatmeta-llama/Llama-3.1-70B-Instruct
meta-llama/Llama-3.1-405B-Instruct
meta-llama/Llama-3.3-70B-Instruct
+7 more
api.studio.nebius.ai
Auth:
$NEBIUS_API_KEY
Novita AI
OpenAI-compatmeta-llama/llama-3.1-70b-instruct
meta-llama/llama-3.1-405b-instruct
meta-llama/llama-3.3-70b-instruct
+5 more
api.novita.ai
Auth:
$NOVITA_API_KEY
Hyperbolic
OpenAI-compatmeta-llama/Llama-3.1-70B-Instruct
meta-llama/Llama-3.1-405B-Instruct
meta-llama/Llama-3.3-70B-Instruct
+5 more
api.hyperbolic.xyz
Auth:
$HYPERBOLIC_API_KEY
Enterprise & Regional Providers
Enterprise cloud platforms and regional AI providers with OpenAI-compatible APIs.
Databricks
OpenAI-compatdatabricks-meta-llama-3-1-70b-instruct
databricks-meta-llama-3-3-70b-instruct
databricks-meta-llama-3-1-405b-instruct
+21 more
{workspace}.databricks.com
Auth:
$DATABRICKS_TOKEN
GitHub Models
OpenAI-compatgpt-4o
gpt-4o-mini
gpt-5
+25 more
models.inference.ai.azure.com
Auth:
$GITHUB_TOKEN
Scaleway
OpenAI-compatllama-3.1-70b-instruct
llama-3.3-70b-instruct
mistral-nemo-instruct
+5 more
api.scaleway.ai
Auth:
$SCALEWAY_API_KEY
Dashscope (Qwen / Alibaba)
OpenAI-compatqwen-turbo
qwen-plus
qwen-max
+20 more
dashscope.aliyuncs.com
Auth:
$DASHSCOPE_API_KEY
Moonshot AI
OpenAI-compatmoonshot-v1-8k
moonshot-v1-32k
moonshot-v1-128k
+4 more
api.moonshot.cn
Auth:
$MOONSHOT_API_KEY
Zhipu AI (Z.AI)
OpenAI-compatglm-5
glm-4.7
glm-4
+9 more
open.bigmodel.cn
Auth:
$ZHIPU_API_KEY
Volcano Engine (ByteDance)
OpenAI-compatdoubao-pro-32k
doubao-pro-128k
doubao-pro-256k
+5 more
maas-api.ml-platform-cn.volces.com
Auth:
$VOLC_API_KEY
IBM watsonx
OpenAI-compatibm/granite-3-8b-instruct
ibm/granite-3-2b-instruct
ibm/granite-3.1-8b-instruct
+16 more
{region}.ml.cloud.ibm.com
Auth:
$WATSONX_API_KEY
Snowflake Cortex
OpenAI-compatclaude-3-5-sonnet
claude-4-sonnet
claude-sonnet-4-5
+19 more
{account}.snowflakecomputing.com
Auth: No API key needed
OVHcloud AI
OpenAI-compatDeepSeek-R1-Distill-Llama-70B
Llama-3.3-70B-Instruct
Llama-3.1-70B-Instruct
+5 more
llama-3-3-70b-instruct.endpoints.kepler.ai.cloud.ovh.net
Auth:
$OVH_API_KEY
Oracle Cloud OCI
OpenAI-compatmeta.llama-3.1-405b-instruct
meta.llama-3.1-70b-instruct
meta.llama-3.3-70b-instruct
+3 more
inference.generativeai.{region}.oci.oraclecloud.com
Auth:
$OCI_API_KEY
Anyscale
OpenAI-compatmeta-llama/Llama-3-70b-chat-hf
meta-llama/Llama-3-8b-chat-hf
mistralai/Mixtral-8x22B-Instruct-v0.1
+4 more
api.endpoints.anyscale.com
Auth:
$ANYSCALE_API_KEY
Local & Self-Hosted
Run models locally or in-cluster. No TLS or external API keys required.
Ollama
Localllama3.2
llama3.1
llama3.1:70b
+30 more
localhost / in-cluster
Auth: No API key needed
vLLM
Localmeta-llama/Llama-4-Scout-17B-16E-Instruct
meta-llama/Llama-3.1-8B-Instruct
meta-llama/Llama-3.1-70B-Instruct
+10 more
localhost / in-cluster
Auth: No API key needed
llama.cpp
LocalAny GGUF model
Llama 3.x
Llama 4.x
+6 more
localhost / in-cluster
Auth: No API key needed
Triton Inference Server
LocalAny TensorRT-LLM model
Any vLLM backend model
Any Python backend model
+1 more
localhost / in-cluster
Auth: No API key needed
No providers found
Try a different search term
See which providers support each API endpoint type
Browse by Endpoint
Click any endpoint to see which providers support it and get ready-to-use configurations
Inference
Media
Specialized
Platform
Chat Completions API
43 providers support /chat/completions
Send messages and receive AI-generated responses. The most common LLM endpoint.
Supported Providers — click a provider to generate its config
OpenAI
Nativeapi.openai.com
Anthropic
Nativeapi.anthropic.com
Amazon Bedrock
Nativebedrock-runtime.{region}.amazonaws.com
Google Gemini
Nativegenerativelanguage.googleapis.com
Google Vertex AI
Native{region}-aiplatform.googleapis.com
Azure OpenAI
Native{resource}.openai.azure.com
Mistral AI
OpenAI-compatapi.mistral.ai
DeepSeek
OpenAI-compatapi.deepseek.com
xAI (Grok)
OpenAI-compatapi.x.ai
Groq
OpenAI-compatapi.groq.com
Cohere
OpenAI-compatapi.cohere.com
Together AI
OpenAI-compatapi.together.xyz
Fireworks AI
OpenAI-compatapi.fireworks.ai
Perplexity AI
OpenAI-compatapi.perplexity.ai
OpenRouter
OpenAI-compatopenrouter.ai
Cerebras
OpenAI-compatapi.cerebras.ai
SambaNova
OpenAI-compatapi.sambanova.ai
DeepInfra
OpenAI-compatapi.deepinfra.com
HuggingFace
OpenAI-compatapi-inference.huggingface.co
Nvidia NIM
OpenAI-compatintegrate.api.nvidia.com
Replicate
OpenAI-compatapi.replicate.com
AI21
OpenAI-compatapi.ai21.com
Cloudflare Workers AI
OpenAI-compatapi.cloudflare.com
Lambda AI
OpenAI-compatapi.lambdalabs.com
Nebius AI Studio
OpenAI-compatapi.studio.nebius.ai
Novita AI
OpenAI-compatapi.novita.ai
Hyperbolic
OpenAI-compatapi.hyperbolic.xyz
Databricks
OpenAI-compat{workspace}.databricks.com
GitHub Models
OpenAI-compatmodels.inference.ai.azure.com
Scaleway
OpenAI-compatapi.scaleway.ai
Dashscope (Qwen / Alibaba)
OpenAI-compatdashscope.aliyuncs.com
Moonshot AI
OpenAI-compatapi.moonshot.cn
Zhipu AI (Z.AI)
OpenAI-compatopen.bigmodel.cn
Volcano Engine (ByteDance)
OpenAI-compatmaas-api.ml-platform-cn.volces.com
IBM watsonx
OpenAI-compat{region}.ml.cloud.ibm.com
Snowflake Cortex
OpenAI-compat{account}.snowflakecomputing.com
OVHcloud AI
OpenAI-compatllama-3-3-70b-instruct.endpoints.kepler.ai.cloud.ovh.net
Oracle Cloud OCI
OpenAI-compatinference.generativeai.{region}.oci.oraclecloud.com
Anyscale
OpenAI-compatapi.endpoints.anyscale.com
Ollama
Locallocalhost / in-cluster
vLLM
Locallocalhost / in-cluster
llama.cpp
Locallocalhost / in-cluster
Triton Inference Server
Locallocalhost / in-cluster
Save as config.yaml and run with agentgateway -f config.yaml
Run these kubectl apply commands in order