View metrics and logs

Review LLM-specific metrics and logs.

To calculate costs from token usage metrics, see the cost tracking guide.
To send logs to external platforms such as Langfuse and LangSmith (for prompt logging, request/response logging, or an audit trail), see the LLM Observability integrations.

Before you begin

Complete an LLM guide, such as one of the LLM provider-specific guides. These guides send a request to the LLM and receive a response, which you can use to verify the metrics and logs in this guide.

View LLM metrics

You can access the agentgateway metrics endpoint to view LLM-specific metrics, such as the number of tokens that you used during a request or response.

  1. Port-forward the agentgateway proxy on port 15020.
    kubectl port-forward deployment/agentgateway-proxy -n agentgateway-system 15020  
  2. Open the agentgateway metrics endpoint.
  3. Look for the agentgateway_gen_ai_client_token_usage metric. This metric is a histogram whose labels describe the request to and the response from the LLM, such as:
    • gen_ai_token_type: Whether this metric is about a request (input) or response (output).
    • gen_ai_operation_name: The name of the operation that was performed.
    • gen_ai_system: The LLM provider that was used for the request/response.
    • gen_ai_request_model: The model that was used for the request.
    • gen_ai_response_model: The model that was used for the response.

For more information, see the Semantic conventions for generative AI metrics in the OpenTelemetry docs.
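
The steps above can also be done from the command line. The following is a minimal sketch, not a definitive procedure: the filter_token_usage helper is a hypothetical name, and the /metrics path on the port-forwarded address is an assumption that this guide does not state.

```shell
# Keep only the token usage histogram series from Prometheus text read on stdin.
# Comment lines (# HELP, # TYPE) and unrelated metrics are dropped.
filter_token_usage() {
  grep '^agentgateway_gen_ai_client_token_usage'
}

# Usage, assuming the port-forward from step 1 is still running and the
# metrics are served at /metrics (an assumption, adjust if your path differs):
#   curl -s http://localhost:15020/metrics | filter_token_usage
```

Each remaining line is one series of the histogram (bucket, sum, or count) with the labels listed above.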

Track per-user metrics

When you set up API key authentication with per-user rate limiting, you can filter token usage metrics by user ID to track spending and usage patterns for each virtual key.

For a complete virtual key setup guide, see Virtual key management.

Example PromQL query for per-user token usage:

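# Total tokens consumed by each user (input and output combined).
# A single selector is used because adding the input and output series with "+"
# matches nothing: the two sides differ in their gen_ai_token_type label.
sum by (user_id) (
  agentgateway_gen_ai_client_token_usage_sum{gen_ai_token_type=~"input|output"}
)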
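
To run a per-user query outside of a dashboard, you can send it to the Prometheus HTTP API. The sketch below assumes that a Prometheus server already scrapes the agentgateway metrics endpoint. The server URL, the one-hour window, and the names PER_USER_QUERY and query_per_user_tokens are illustrative assumptions, not part of agentgateway.

```shell
# PromQL: tokens consumed per user over the last hour, input and output combined.
# The user_id label comes from this guide; the 1h window is an arbitrary choice.
PER_USER_QUERY='sum by (user_id) (increase(agentgateway_gen_ai_client_token_usage_sum[1h]))'

# Run the query against the Prometheus HTTP API (/api/v1/query).
# $1: Prometheus base URL. The default is a hypothetical local address.
query_per_user_tokens() {
  curl -s -G "${1:-http://localhost:9090}/api/v1/query" \
    --data-urlencode "query=${PER_USER_QUERY}"
}

# Usage:
#   query_per_user_tokens http://localhost:9090
```

Using increase() over a time window reports recent consumption instead of the cumulative total since the proxy started, which is usually more useful for tracking spending.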

View logs

Agentgateway automatically logs information to stdout. When you run agentgateway on your local machine, you can view a log entry for each request that is sent to agentgateway in your CLI output.

To view the logs of the agentgateway proxy in your cluster:

kubectl logs deployment/agentgateway-proxy -n agentgateway-system

Example log entry for a successful request to the OpenAI LLM (a single log line, wrapped here for readability):

2025-12-12T21:56:02.809082Z	info	request gateway=agentgateway-system/agentgateway-proxy listener=http
route=agentgateway-system/openai endpoint=api.openai.com:443 src.addr=127.0.0.1:60862 http.method=POST
http.host=localhost http.path=/openai http.version=HTTP/1.1 http.status=200 protocol=llm
gen_ai.operation.name=chat gen_ai.provider.name=openai gen_ai.request.model=gpt-3.5-turbo
gen_ai.response.model=gpt-3.5-turbo-0125 gen_ai.usage.input_tokens=68 gen_ai.usage.output_tokens=298 duration=2488ms
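
Because each log entry is a set of key=value fields, you can total token usage directly from the log stream. The following is a rough sketch that assumes the field names shown in the example entry above. The sum_token_usage helper is a hypothetical name, not an agentgateway command.

```shell
# Sum gen_ai.usage.input_tokens and gen_ai.usage.output_tokens across all
# log lines read on stdin, and print one summary line.
sum_token_usage() {
  awk '
    {
      # Fields are separated by spaces or tabs; inspect each key=value pair.
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^gen_ai\.usage\.input_tokens=/)  { split($i, kv, "="); in_tok  += kv[2] }
        if ($i ~ /^gen_ai\.usage\.output_tokens=/) { split($i, kv, "="); out_tok += kv[2] }
      }
    }
    END { printf "input=%d output=%d\n", in_tok, out_tok }
  '
}

# Usage:
#   kubectl logs deployment/agentgateway-proxy -n agentgateway-system | sum_token_usage
```

Lines without token fields, such as startup messages, do not change the totals.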