Agentgateway Model and Provider Cookbook

Route to any LLM through a single gateway. Agentgateway supports any provider with an OpenAI-compatible API.

746+
Models
43+
LLM Gateway Providers
20
API Endpoints
Search by Endpoints
1 Secret
2 Backend
3 Route

Native Providers

First-class support with full API translation in agentgateway.

OpenAI

Native
35 models
gpt-4o gpt-4o-mini gpt-4-turbo +32 more
api.openai.com

Auth: $OPENAI_API_KEY

View configuration

Anthropic

Native
14 models
claude-opus-4-6 claude-sonnet-4-6 claude-opus-4-5 +11 more
api.anthropic.com

Auth: $ANTHROPIC_API_KEY

View configuration

Amazon Bedrock

Native
47 models
anthropic.claude-sonnet-4.6 anthropic.claude-opus-4.6 anthropic.claude-sonnet-4.5 +44 more
bedrock-runtime.{region}.amazonaws.com

Auth: $AWS_ACCESS_KEY_ID

View configuration

Google Gemini

Native
26 models
gemini-2.5-pro gemini-2.5-flash gemini-2.5-flash-lite +23 more
generativelanguage.googleapis.com

Auth: $GOOGLE_KEY

View configuration

Google Vertex AI

Native
32 models
gemini-2.5-pro gemini-2.5-flash gemini-2.5-flash-lite +29 more
{region}-aiplatform.googleapis.com

Auth: $VERTEX_AI_API_KEY

View configuration

Azure OpenAI

Native
30 models
gpt-4o gpt-4o-mini gpt-4-turbo +27 more
{resource}.openai.azure.com

Auth: $AZURE_API_KEY

View configuration

OpenAI-Compatible Providers

These providers expose an OpenAI-compatible API. Agentgateway routes to them using the openai provider type with custom host, port, and path overrides.

Mistral AI

OpenAI-compat
29 models
mistral-large-latest mistral-large-2512 mistral-medium-latest +26 more
api.mistral.ai

Auth: $MISTRAL_API_KEY

View configuration

DeepSeek

OpenAI-compat
7 models
deepseek-chat deepseek-reasoner deepseek-v3 +4 more
api.deepseek.com

Auth: $DEEPSEEK_API_KEY

View configuration

xAI (Grok)

OpenAI-compat
14 models
grok-4 grok-4-fast-reasoning grok-4-fast-non-reasoning +11 more
api.x.ai

Auth: $XAI_API_KEY

View configuration

Groq

OpenAI-compat
15 models
llama-3.3-70b-versatile llama-3.1-8b-instant llama-4-maverick-17b-128e-instruct +12 more
api.groq.com

Auth: $GROQ_API_KEY

View configuration

Cohere

OpenAI-compat
14 models
command-r-plus command-r command-a-03-2025 +11 more
api.cohere.com

Auth: $COHERE_API_KEY

View configuration

Together AI

OpenAI-compat
23 models
meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 meta-llama/Llama-3.3-70B-Instruct-Turbo meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo +20 more
api.together.xyz

Auth: $TOGETHER_API_KEY

View configuration

Fireworks AI

OpenAI-compat
29 models
llama-v3p3-70b-instruct llama-v3p1-405b-instruct llama-v3p1-70b-instruct +26 more
api.fireworks.ai

Auth: $FIREWORKS_API_KEY

View configuration

Perplexity AI

OpenAI-compat
9 models
sonar-pro sonar sonar-deep-research +6 more
api.perplexity.ai

Auth: $PERPLEXITY_API_KEY

View configuration

OpenRouter

OpenAI-compat
46 models
openai/gpt-4o openai/gpt-5 openai/gpt-5-mini +43 more
openrouter.ai

Auth: $OPENROUTER_API_KEY

View configuration

Cerebras

OpenAI-compat
8 models
llama-3.3-70b llama3.1-70b llama3.1-8b +5 more
api.cerebras.ai

Auth: $CEREBRAS_API_KEY

View configuration

SambaNova

OpenAI-compat
14 models
Meta-Llama-3.1-405B-Instruct Meta-Llama-3.1-70B-Instruct Meta-Llama-3.1-8B-Instruct +11 more
api.sambanova.ai

Auth: $SAMBANOVA_API_KEY

View configuration

DeepInfra

OpenAI-compat
23 models
meta-llama/Llama-4-Scout-17B-16E-Instruct meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 meta-llama/Llama-3.3-70B-Instruct-Turbo +20 more
api.deepinfra.com

Auth: $DEEPINFRA_API_KEY

View configuration

HuggingFace

OpenAI-compat
19 models
meta-llama/Llama-4-Scout-17B-16E-Instruct meta-llama/Llama-4-Maverick-17B-128E-Instruct meta-llama/Llama-3.1-70B-Instruct +16 more
api-inference.huggingface.co

Auth: $HF_API_KEY

View configuration

Nvidia NIM

OpenAI-compat
19 models
meta/llama-4-maverick-17b-128e-instruct meta/llama-4-scout-17b-16e-instruct meta/llama-3.1-405b-instruct +16 more
integrate.api.nvidia.com

Auth: $NVIDIA_API_KEY

View configuration

Replicate

OpenAI-compat
12 models
meta/llama-4-scout-17b-16e-instruct meta/llama-4-maverick-17b-128e-instruct meta/llama-3.1-405b-instruct +9 more
api.replicate.com

Auth: $REPLICATE_API_KEY

View configuration

AI21

OpenAI-compat
8 models
jamba-1.5-large jamba-1.5-mini jamba-instruct +5 more
api.ai21.com

Auth: $AI21_API_KEY

View configuration

Cloudflare Workers AI

OpenAI-compat
9 models
@cf/meta/llama-3.1-8b-instruct @cf/meta/llama-3.1-70b-instruct @cf/meta/llama-3.2-3b-instruct +6 more
api.cloudflare.com

Auth: $CF_API_TOKEN

View configuration

Lambda AI

OpenAI-compat
7 models
hermes-3-llama-3.1-405b-fp8 hermes-3-llama-3.1-70b-fp8 llama-3.1-405b-instruct +4 more
api.lambdalabs.com

Auth: $LAMBDA_API_KEY

View configuration

Nebius AI Studio

OpenAI-compat
10 models
meta-llama/Llama-3.1-70B-Instruct meta-llama/Llama-3.1-405B-Instruct meta-llama/Llama-3.3-70B-Instruct +7 more
api.studio.nebius.ai

Auth: $NEBIUS_API_KEY

View configuration

Novita AI

OpenAI-compat
8 models
meta-llama/llama-3.1-70b-instruct meta-llama/llama-3.1-405b-instruct meta-llama/llama-3.3-70b-instruct +5 more
api.novita.ai

Auth: $NOVITA_API_KEY

View configuration

Hyperbolic

OpenAI-compat
8 models
meta-llama/Llama-3.1-70B-Instruct meta-llama/Llama-3.1-405B-Instruct meta-llama/Llama-3.3-70B-Instruct +5 more
api.hyperbolic.xyz

Auth: $HYPERBOLIC_API_KEY

View configuration

Enterprise & Regional Providers

Enterprise cloud platforms and regional AI providers with OpenAI-compatible APIs.

Databricks

OpenAI-compat
24 models
databricks-meta-llama-3-1-70b-instruct databricks-meta-llama-3-3-70b-instruct databricks-meta-llama-3-1-405b-instruct +21 more
{workspace}.databricks.com

Auth: $DATABRICKS_TOKEN

View configuration

GitHub Models

OpenAI-compat
28 models
gpt-4o gpt-4o-mini gpt-5 +25 more
models.inference.ai.azure.com

Auth: $GITHUB_TOKEN

View configuration

Scaleway

OpenAI-compat
8 models
llama-3.1-70b-instruct llama-3.3-70b-instruct mistral-nemo-instruct +5 more
api.scaleway.ai

Auth: $SCALEWAY_API_KEY

View configuration

Dashscope (Qwen / Alibaba)

OpenAI-compat
23 models
qwen-turbo qwen-plus qwen-max +20 more
dashscope.aliyuncs.com

Auth: $DASHSCOPE_API_KEY

View configuration

Moonshot AI

OpenAI-compat
7 models
moonshot-v1-8k moonshot-v1-32k moonshot-v1-128k +4 more
api.moonshot.cn

Auth: $MOONSHOT_API_KEY

View configuration

Zhipu AI (Z.AI)

OpenAI-compat
12 models
glm-5 glm-4.7 glm-4 +9 more
open.bigmodel.cn

Auth: $ZHIPU_API_KEY

View configuration

Volcano Engine (ByteDance)

OpenAI-compat
8 models
doubao-pro-32k doubao-pro-128k doubao-pro-256k +5 more
maas-api.ml-platform-cn.volces.com

Auth: $VOLC_API_KEY

View configuration

IBM watsonx

OpenAI-compat
19 models
ibm/granite-3-8b-instruct ibm/granite-3-2b-instruct ibm/granite-3.1-8b-instruct +16 more
{region}.ml.cloud.ibm.com

Auth: $WATSONX_API_KEY

View configuration

Snowflake Cortex

OpenAI-compat
22 models
claude-3-5-sonnet claude-4-sonnet claude-sonnet-4-5 +19 more
{account}.snowflakecomputing.com

Auth: No API key needed

View configuration

OVHcloud AI

OpenAI-compat
8 models
DeepSeek-R1-Distill-Llama-70B Llama-3.3-70B-Instruct Llama-3.1-70B-Instruct +5 more
llama-3-3-70b-instruct.endpoints.kepler.ai.cloud.ovh.net

Auth: $OVH_API_KEY

View configuration

Oracle Cloud OCI

OpenAI-compat
6 models
meta.llama-3.1-405b-instruct meta.llama-3.1-70b-instruct meta.llama-3.3-70b-instruct +3 more
inference.generativeai.{region}.oci.oraclecloud.com

Auth: $OCI_API_KEY

View configuration

Anyscale

OpenAI-compat
7 models
meta-llama/Llama-3-70b-chat-hf meta-llama/Llama-3-8b-chat-hf mistralai/Mixtral-8x22B-Instruct-v0.1 +4 more
api.endpoints.anyscale.com

Auth: $ANYSCALE_API_KEY

View configuration

Local & Self-Hosted

Run models locally or in-cluster. No TLS or external API keys required.

Ollama

Local
33 models
llama3.2 llama3.1 llama3.1:70b +30 more
localhost / in-cluster

Auth: No API key needed

View configuration

vLLM

Local
13 models
meta-llama/Llama-4-Scout-17B-16E-Instruct meta-llama/Llama-3.1-8B-Instruct meta-llama/Llama-3.1-70B-Instruct +10 more
localhost / in-cluster

Auth: No API key needed

View configuration

llama.cpp

Local
9 models
Any GGUF model Llama 3.x Llama 4.x +6 more
localhost / in-cluster

Auth: No API key needed

View configuration

Triton Inference Server

Local
4 models
Any TensorRT-LLM model Any vLLM backend model Any Python backend model +1 more
localhost / in-cluster

Auth: No API key needed

View configuration
Browse by Endpoint

See which providers support each API endpoint type

Browse by Endpoint

Click any endpoint to see which providers support it and get ready-to-use configurations

Inference

Media

Specialized

Platform

Chat Completions API

43 providers support /chat/completions

Send messages and receive AI-generated responses. The most common LLM endpoint.

Supported Providers — click a provider to generate its config

OpenAI

Native
api.openai.com
chat completions responses embeddings images images_edits audio_speech audio_transcriptions audio_translations moderations fine_tuning files batches realtime models

Anthropic

Native
api.anthropic.com
chat messages batches models

Amazon Bedrock

Native
bedrock-runtime.{region}.amazonaws.com
chat embeddings images fine_tuning batches models

Google Gemini

Native
generativelanguage.googleapis.com
chat embeddings images audio_speech video fine_tuning files models

Google Vertex AI

Native
{region}-aiplatform.googleapis.com
chat embeddings images video fine_tuning batches models

Azure OpenAI

Native
{resource}.openai.azure.com
chat completions embeddings images audio_speech audio_transcriptions audio_translations fine_tuning files batches models

Mistral AI

OpenAI-compat
api.mistral.ai
chat completions embeddings fim moderations fine_tuning files models

DeepSeek

OpenAI-compat
api.deepseek.com
chat completions models

xAI (Grok)

OpenAI-compat
api.x.ai
chat completions embeddings images models

Groq

OpenAI-compat
api.groq.com
chat embeddings audio_transcriptions audio_translations models

Cohere

OpenAI-compat
api.cohere.com
chat embeddings rerank classify fine_tuning models

Together AI

OpenAI-compat
api.together.xyz
chat completions embeddings images rerank fine_tuning files models

Fireworks AI

OpenAI-compat
api.fireworks.ai
chat completions embeddings images audio_transcriptions fine_tuning models

Perplexity AI

OpenAI-compat
api.perplexity.ai
chat

OpenRouter

OpenAI-compat
openrouter.ai
chat models

Cerebras

OpenAI-compat
api.cerebras.ai
chat completions models

SambaNova

OpenAI-compat
api.sambanova.ai
chat completions embeddings models

DeepInfra

OpenAI-compat
api.deepinfra.com
chat completions embeddings images audio_transcriptions audio_speech models

HuggingFace

OpenAI-compat
api-inference.huggingface.co
chat completions embeddings images audio_speech audio_transcriptions models

Nvidia NIM

OpenAI-compat
integrate.api.nvidia.com
chat completions embeddings rerank models

Replicate

OpenAI-compat
api.replicate.com
chat images audio_speech audio_transcriptions fine_tuning models

AI21

OpenAI-compat
api.ai21.com
chat embeddings models

Cloudflare Workers AI

OpenAI-compat
api.cloudflare.com
chat embeddings images audio_transcriptions models

Lambda AI

OpenAI-compat
api.lambdalabs.com
chat completions models

Nebius AI Studio

OpenAI-compat
api.studio.nebius.ai
chat completions embeddings images models

Novita AI

OpenAI-compat
api.novita.ai
chat completions embeddings images audio_speech audio_transcriptions video models

Hyperbolic

OpenAI-compat
api.hyperbolic.xyz
chat completions embeddings images audio_transcriptions models

Databricks

OpenAI-compat
{workspace}.databricks.com
chat completions embeddings models

GitHub Models

OpenAI-compat
models.inference.ai.azure.com
chat embeddings models

Scaleway

OpenAI-compat
api.scaleway.ai
chat embeddings models

Dashscope (Qwen / Alibaba)

OpenAI-compat
dashscope.aliyuncs.com
chat completions embeddings images audio_speech audio_transcriptions rerank fine_tuning files models

Moonshot AI

OpenAI-compat
api.moonshot.cn
chat files models

Zhipu AI (Z.AI)

OpenAI-compat
open.bigmodel.cn
chat embeddings images video fine_tuning files batches models

Volcano Engine (ByteDance)

OpenAI-compat
maas-api.ml-platform-cn.volces.com
chat completions embeddings images audio_speech audio_transcriptions fine_tuning files batches models

IBM watsonx

OpenAI-compat
{region}.ml.cloud.ibm.com
chat embeddings rerank fine_tuning models

Snowflake Cortex

OpenAI-compat
{account}.snowflakecomputing.com
chat embeddings models

OVHcloud AI

OpenAI-compat
llama-3-3-70b-instruct.endpoints.kepler.ai.cloud.ovh.net
chat completions embeddings images audio_transcriptions models

Oracle Cloud OCI

OpenAI-compat
inference.generativeai.{region}.oci.oraclecloud.com
chat embeddings models

Anyscale

OpenAI-compat
api.endpoints.anyscale.com
chat completions embeddings models

Ollama

Local
localhost / in-cluster
chat completions embeddings images images_edits responses messages models

vLLM

Local
localhost / in-cluster
chat completions embeddings audio_transcriptions audio_translations responses messages rerank realtime models

llama.cpp

Local
localhost / in-cluster
chat completions embeddings fim rerank responses messages models

Triton Inference Server

Local
localhost / in-cluster
chat models
Agentgateway Config /chat/completions

Save as config.yaml and run with agentgateway -f config.yaml

Test it