Chat completions
Send chat completion requests through agentgateway using the OpenAI Chat Completions API.
The OpenAI Chat Completions API (/v1/chat/completions) is the primary interface for text generation and chat applications in agentgateway.
About
The OpenAI Chat Completions API is the most widely used LLM endpoint. Agentgateway proxies these requests to your configured providers while providing token usage tracking, observability metrics, and policy enforcement.
By default, requests to agentgateway use the Chat Completions API. These requests are translated to the upstream provider’s native API format when necessary.
Route type configuration
In the simplified llm configuration, agentgateway automatically maps /v1/chat/completions requests to the completions route type, so no explicit route configuration is required.
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
llm:
  models:
  - name: "*"
    provider: openAI
    params:
      apiKey: "$OPENAI_API_KEY"
To configure the route type explicitly, use the binds/listeners/routes format and set the completions route type in the policies.ai.routes map.
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
binds:
- port: 4000
  listeners:
  - routes:
    - backends:
      - ai:
          name: openai
          provider:
            openAI: {}
      policies:
        ai:
          routes:
            "/v1/chat/completions": "completions"
        backendAuth:
          key: "$OPENAI_API_KEY"
Using the API
Using the Chat Completions API works exactly the same as consuming OpenAI directly, with only a change to the base URL. This allows you to continue using existing code and SDKs.
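As a rough illustration of the base URL swap, the following Python sketch builds the same Chat Completions request using only the standard library. It assumes the gateway listens on localhost:4000; the helper names build_chat_request and send are hypothetical and not part of agentgateway. No Authorization header is set here, on the assumption that the gateway attaches the provider key from its own configuration.

```python
import json
import urllib.request

# Assumption: agentgateway is reachable on localhost:4000.
BASE_URL = "http://localhost:4000/v1"


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a Chat Completions request for the given base URL."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Only the base URL differs from a request sent to OpenAI directly.
request = build_chat_request(BASE_URL, "gpt-4o-mini", "Tell me a story")

# With a gateway running, the reply could be read like this:
# reply = send(request)
# print(reply["choices"][0]["message"]["content"])
```

Existing SDK code should need a similar one-line change to its base URL setting. The equivalent request with curl: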
curl 'http://localhost:4000/v1/chat/completions' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Tell me a story"
      }
    ]
  }'
Token usage tracking
After sending Chat Completions requests, verify that agentgateway recorded token usage metrics.
- Open the agentgateway metrics endpoint.
- Look for the agentgateway_gen_ai_client_token_usage metric. The metric includes labels for the token type (input or output) and the model used.
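
The check in the steps above can be sketched in Python as a filter over the text returned by the metrics endpoint. The helper name token_usage_lines is hypothetical, and the sample text is made up for illustration: the label names and values in it are placeholders, and real output may use different labels or metric suffixes.

```python
METRIC_NAME = "agentgateway_gen_ai_client_token_usage"

# Hypothetical sample of metrics text; label names and values are
# placeholders, not taken from real agentgateway output.
SAMPLE = """\
# HELP agentgateway_gen_ai_client_token_usage Illustrative help text
# TYPE agentgateway_gen_ai_client_token_usage histogram
agentgateway_gen_ai_client_token_usage{token_type="input",model="gpt-4o-mini"} 12
agentgateway_gen_ai_client_token_usage{token_type="output",model="gpt-4o-mini"} 87
some_other_metric{label="value"} 1
"""


def token_usage_lines(metrics_text: str) -> list[str]:
    """Return the sample lines for the token usage metric, skipping comments."""
    return [
        line
        for line in metrics_text.splitlines()
        if line.startswith(METRIC_NAME)
    ]


for line in token_usage_lines(SAMPLE):
    print(line)
```

To run this against a live gateway, pass in the body fetched from the agentgateway metrics endpoint instead of SAMPLE.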
For more information about LLM metrics and observability, see Observe traffic.