Transform requests

Use LLM request transformations to dynamically compute and set fields in LLM requests by using Common Expression Language (CEL) expressions. CEL is a simple expression language used throughout agentgateway to enable flexible configuration. CEL expressions can access request context, JWT claims, and other variables to make dynamic decisions. Transformations let you enforce policies, such as capping token usage or conditionally modifying request parameters, without changing client code.

To learn more about CEL, see the following resources:

Before you begin

  1. Set up an agentgateway proxy.
  2. Set up access to the OpenAI LLM provider.

Configure LLM request transformations

  1. Create an AgentgatewayPolicy resource to apply an LLM request transformation. The following example caps max_tokens to 10, regardless of what the client requests.

    kubectl apply -f- <<EOF
    apiVersion: agentgateway.dev/v1alpha1
    kind: AgentgatewayPolicy
    metadata:
      name: cap-max-tokens
      namespace: agentgateway-system
      labels:
        app: agentgateway
    spec:
      targetRefs:
      - group: gateway.networking.k8s.io
        kind: HTTPRoute
        name: openai
      backend:
        ai:
          transformations:
          - field: max_tokens
            expression: "min(llmRequest.max_tokens, 10)"
    EOF
    Setting: Description
    backend.ai.transformations: A list of LLM request field transformations.
    field: The name of the LLM request field to set. Maximum 256 characters.
    expression: A CEL expression that computes the value for the field. Use the llmRequest variable to access the original LLM request body. Maximum 16,384 characters.

    You can specify up to 64 transformations per policy. Transformations take priority over overrides for the same field. If an expression fails to evaluate, the field is silently removed from the request.

    Thinking budget fields, such as reasoning_effort and thinking_budget_tokens, can also be set or capped by using transformations. This way, operators can enforce reasoning limits centrally without requiring client changes. For example, set field to reasoning_effort and expression to the string "medium" to force all requests to medium reasoning effort, regardless of what the client sends.
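    The reasoning_effort example can be sketched as a policy that mirrors the cap-max-tokens resource in this step. The policy name pin-reasoning-effort and the file name are hypothetical. The nested quoting of the expression is an assumption: a bare medium would likely be read by CEL as an identifier rather than a string, so check the quoting against your agentgateway version.

```shell
# Write the policy to a file so that it can be reviewed before it is applied.
# The policy name and the file name are hypothetical.
cat > pin-reasoning-effort.yaml <<'EOF'
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayPolicy
metadata:
  name: pin-reasoning-effort
  namespace: agentgateway-system
  labels:
    app: agentgateway
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: openai
  backend:
    ai:
      transformations:
      - field: reasoning_effort
        # Assumption: the CEL string literal keeps its own double quotes
        # inside the single-quoted YAML string.
        expression: '"medium"'
EOF
```

    After reviewing the file, you can apply it with kubectl apply -f pin-reasoning-effort.yaml. Because the policy carries the app: agentgateway label, the cleanup command in this guide also removes it.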

  2. Send a request with max_tokens set to a value greater than 10. The transformation caps the value at 10 before the request reaches the LLM provider. Verify that the completion_tokens value in the response is 10 or fewer and that finish_reason is set to length, which indicates that the response was cut off at the token limit.

    Using the gateway address in $INGRESS_GW_ADDRESS:

    curl "$INGRESS_GW_ADDRESS/openai" \
    -H "content-type: application/json" \
    -d '{
      "model": "gpt-3.5-turbo",
      "max_tokens": 5000,
      "messages": [
        {
          "role": "user",
          "content": "Tell me a short story"
        }
      ]
    }' | jq

    Using localhost:8080, for example through a local port-forward:

    curl "localhost:8080/openai" \
    -H "content-type: application/json" \
    -d '{
      "model": "gpt-3.5-turbo",
      "max_tokens": 5000,
      "messages": [
        {
          "role": "user",
          "content": "Tell me a short story"
        }
      ]
    }' | jq

    Example output:

    {
      "model": "gpt-3.5-turbo-0125",
      "usage": {
        "prompt_tokens": 12,
        "completion_tokens": 10,
        "total_tokens": 22,
        "completion_tokens_details": {
          "reasoning_tokens": 0,
          "audio_tokens": 0,
          "accepted_prediction_tokens": 0,
          "rejected_prediction_tokens": 0
        },
        "prompt_tokens_details": {
          "cached_tokens": 0,
          "audio_tokens": 0
        }
      },
      "choices": [
        {
          "message": {
            "content": "Once upon a time, in a small village nestled",
            "role": "assistant",
            "refusal": null,
            "annotations": []
          },
          "index": 0,
          "logprobs": null,
          "finish_reason": "length"
        }
      ],
      ...
    }
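    The verification in this step can also be scripted. The following sketch runs the check against a trimmed copy of the example output. The file name response.json is hypothetical. In practice, you would save the curl output to that file instead of using the sample.

```shell
# Sample response, trimmed from the example output. In practice, replace
# this with the saved output of the curl command (hypothetical file name).
cat > response.json <<'EOF'
{
  "usage": { "completion_tokens": 10 },
  "choices": [ { "finish_reason": "length" } ]
}
EOF

# jq -e exits with a non-zero status when the last output is false or null,
# so the expression can drive a shell conditional.
if jq -e '.usage.completion_tokens <= 10 and .choices[0].finish_reason == "length"' \
  response.json > /dev/null; then
  result="capped"
else
  result="not capped"
fi
echo "$result"
```

    A short completion can finish naturally before it reaches the limit. In that case finish_reason is not length and this check reports "not capped" even though the cap is in place, so a prompt that produces a long answer gives the clearest signal.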
    

Cleanup

You can remove the resources that you created in this guide.

    kubectl delete AgentgatewayPolicy -n agentgateway-system -l app=agentgateway