Observe traffic
Review LLM-specific metrics, logs, and traces.
Before you begin
Complete an LLM guide, such as the control spend guide. In that guide, you send a request to the LLM and receive a response, which you can use to verify the metrics, logs, and traces in this guide.
View LLM metrics
You can access the agentgateway metrics endpoint to view LLM-specific metrics, such as the number of tokens used in a request or response.
- Open the agentgateway metrics endpoint. An example command follows this list.
- Look for the agentgateway_gen_ai_client_token_usage metric. This metric is a histogram and includes important information about the request and the response from the LLM, such as:
  - gen_ai_token_type: Whether this metric is about a request (input) or a response (output).
  - gen_ai_operation_name: The name of the operation that was performed.
  - gen_ai_system: The LLM provider that was used for the request or response.
  - gen_ai_request_model: The model that was used for the request.
  - gen_ai_response_model: The model that was used for the response.
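To check the metric from the command line, you can scrape the metrics endpoint and filter for the token usage metric. The following sketch assumes that agentgateway serves metrics at http://localhost:15020/metrics; check your agentgateway configuration or startup logs for the actual address.

# Assumption: metrics are served on localhost:15020. Adjust the address for your setup.
curl -s http://localhost:15020/metrics | grep agentgateway_gen_ai_client_token_usage

Because the metric is a Prometheus histogram, the output includes _bucket, _sum, and _count series; only _sum and _count are shown here. The label values are illustrative and depend on the request that you sent.

agentgateway_gen_ai_client_token_usage_sum{gen_ai_operation_name="chat",gen_ai_system="openai",gen_ai_request_model="gpt-3.5-turbo",gen_ai_response_model="gpt-3.5-turbo-0125",gen_ai_token_type="input"} 11
agentgateway_gen_ai_client_token_usage_count{gen_ai_operation_name="chat",gen_ai_system="openai",gen_ai_request_model="gpt-3.5-turbo",gen_ai_response_model="gpt-3.5-turbo-0125",gen_ai_token_type="input"} 1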
For more information, see the Semantic conventions for generative AI metrics in the OpenTelemetry docs.
View traces
- Use docker compose to spin up a Jaeger instance with the following components:
  - An OpenTelemetry collector that receives traces from the agentgateway. The collector is exposed on http://localhost:4317.
  - A Jaeger agent that receives the collected traces. The agent is exposed on http://localhost:14268.
  - A Jaeger UI that is exposed on http://localhost:16686.

docker compose -f - up -d <<EOF
services:
  jaeger:
    container_name: jaeger
    restart: unless-stopped
    image: jaegertracing/all-in-one:latest
    ports:
      - "127.0.0.1:16686:16686"
      - "127.0.0.1:14268:14268"
      - "127.0.0.1:4317:4317"
    environment:
      - COLLECTOR_OTLP_ENABLED=true
EOF
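Optionally, confirm that the Jaeger container is running before you continue. This check assumes the container name jaeger from the compose file above.

docker ps --filter "name=jaeger" --format "{{.Names}}: {{.Status}}"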
- Configure your agentgateway proxy to emit traces and send them to the built-in OpenTelemetry collector.

cat <<EOF > config.yaml
config:
  tracing:
    otlpEndpoint: http://localhost:4317
    randomSampling: true
binds:
- port: 3000
  listeners:
  - routes:
    - backends:
      - ai:
          name: openai
          provider:
            openAI:
              # Optional; overrides the model in requests
              model: gpt-3.5-turbo
      policies:
        backendAuth:
          key: "$OPENAI_API_KEY"
EOF
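Because the heredoc delimiter is unquoted, the shell expands $OPENAI_API_KEY when it writes config.yaml. Make sure the variable is exported before you run the command above, for example:

export OPENAI_API_KEY=<your-openai-api-key>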
- Run your agentgateway proxy.
agentgateway -f config.yaml
- Send a request to the OpenAI provider.
curl 'http://0.0.0.0:3000/' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "user",
        "content": "Tell me a short story"
      }
    ]
  }'
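A successful request returns an OpenAI-style chat completion. The following response is shortened and illustrative only; your ID, content, and token counts will differ. The usage section reports the token counts that also show up in the metrics and access logs.

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "gpt-3.5-turbo-0125",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Once upon a time..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 11,
    "completion_tokens": 331,
    "total_tokens": 342
  }
}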
- Open the Jaeger UI at http://localhost:16686 and verify that you can see traces for your LLM request.
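You can also query the Jaeger HTTP API from the command line to confirm that spans arrived. The sketch below assumes the Jaeger UI address from the compose file; the service name that agentgateway reports under depends on your tracing configuration, so list the services first and substitute the one you find.

# List the services that reported traces to Jaeger.
curl -s http://localhost:16686/api/services

# Fetch recent traces for one of those services (replace <service-name>).
curl -s "http://localhost:16686/api/traces?service=<service-name>&limit=5"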
View logs
Agentgateway automatically logs information to stdout. When you run agentgateway on your local machine, your CLI output shows a log entry for each request that agentgateway receives.
Example for a successful request to the OpenAI LLM:
2025-09-03T20:30:08.686967Z info request gateway=bind/3000 listener=listener0 route_rule=route0/default route=route0
  endpoint=api.openai.com:443 src.addr=127.0.0.1:54140 http.method=POST http.host=0.0.0.0 http.path=/ http.version=HTTP/1.1
  http.status=200 llm.provider=openai llm.request.model=gpt-3.5-turbo llm.request.tokens=11
  llm.response.model=gpt-3.5-turbo-0125 llm.response.tokens=331 duration=4305ms
Example for a rate limited request:
2025-09-03T19:40:18.687849Z info request gateway=bind/3000 listener=listener0 route_rule=route0/default route=route0
  endpoint=api.openai.com:443 src.addr=127.0.0.1:51794 http.method=POST http.host=0.0.0.0 http.path=/ http.version=HTTP/1.1
  http.status=429 error=rate limit exceeded duration=206ms
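Because the access logs are plain text on stdout, you can filter them with standard shell tools. For example, the following sketch keeps only the log lines for rate-limited requests:

agentgateway -f config.yaml | grep "http.status=429"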