Azure OpenAI

Configure Azure as an LLM provider in agentgateway.

Azure supports two endpoint types:

  • Azure OpenAI (openAI): Connect to Azure OpenAI Service deployments at {resourceName}.openai.azure.com.
  • Azure AI Foundry (foundry): Connect to Azure AI Foundry project endpoints at {resourceName}-resource.services.ai.azure.com.
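
The two hostname patterns can be summarized in a short shell sketch. The azure_host function is a hypothetical helper written for this illustration; it only mirrors the naming patterns listed above and is not how agentgateway itself is implemented.

```shell
# Hypothetical helper: print the endpoint hostname for a resource name
# and endpoint type, following the two patterns listed above.
azure_host() {
  resource_name="$1"
  resource_type="$2"
  case "$resource_type" in
    openAI)  printf '%s.openai.azure.com\n' "$resource_name" ;;
    foundry) printf '%s-resource.services.ai.azure.com\n' "$resource_name" ;;
    *)       printf 'unknown resource type: %s\n' "$resource_type" >&2
             return 1 ;;
  esac
}

azure_host my-resource openAI    # prints my-resource.openai.azure.com
azure_host my-resource foundry   # prints my-resource-resource.services.ai.azure.com
```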

Before you begin

Install and set up an agentgateway proxy.

Set up access to Azure

  1. Retrieve the resource name and, if applicable, the project name from the Azure AI Foundry portal or the Azure portal. For example:

    • For an Azure OpenAI endpoint like https://{my-resource}.openai.azure.com, the resource name is my-resource.
    • For an Azure AI Foundry endpoint like https://{my-resource}-resource.services.ai.azure.com and path /api/projects/{my-project}, the resource name is my-resource and the project name is my-project. If the resource name and the project name are the same, you can leave the projectName field empty.
  2. Store the API key to access your model deployment in an environment variable. If you are using implicit Entra ID authentication (such as managed identity or workload identity), you can skip this step.

    export AZURE_API_KEY=<insert your model deployment key>
  3. Create a Kubernetes secret to store your API key. If you are using implicit Entra ID authentication, skip this step.

    kubectl apply -f- <<EOF
    apiVersion: v1
    kind: Secret
    metadata:
      name: azure-secret
      namespace: agentgateway-system
    type: Opaque
    stringData:
      Authorization: $AZURE_API_KEY
    EOF
  4. Create an AgentgatewayBackend resource to configure the Azure LLM provider.

    kubectl apply -f- <<EOF
    apiVersion: agentgateway.dev/v1alpha1
    kind: AgentgatewayBackend
    metadata:
      name: azure
      namespace: agentgateway-system
    spec:
      ai:
        provider:
          azure:
            resourceName: my-resource
            resourceType: openAI
            model: gpt-4.1-mini
    EOF

    Review the following table to understand this configuration. For more information, see the API reference.

    Setting              Description
    ai.provider.azure    Define the Azure provider.
    azure.resourceName   The Azure resource name used to construct the endpoint hostname.
    azure.resourceType   The endpoint type: openAI for Azure OpenAI Service, or foundry for Azure AI Foundry.
    azure.model          The model to use for requests, such as gpt-4.1-mini.
    azure.projectName    The Foundry project name. Required when resourceType is foundry.
    azure.apiVersion     Optional API version override. Defaults to v1. For legacy deployments, use a dated version like 2025-01-01-preview.
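
    For comparison, a backend that targets an Azure AI Foundry project endpoint might look like the following sketch. It reuses the example names from step 1 (my-resource and my-project) and only sets fields described in the table above. The backend name azure-foundry is a hypothetical name chosen for illustration, and the model value is an assumption carried over from the Azure OpenAI example.

    ```shell
    kubectl apply -f- <<EOF
    apiVersion: agentgateway.dev/v1alpha1
    kind: AgentgatewayBackend
    metadata:
      name: azure-foundry   # hypothetical name for illustration
      namespace: agentgateway-system
    spec:
      ai:
        provider:
          azure:
            resourceName: my-resource
            resourceType: foundry
            projectName: my-project
            model: gpt-4.1-mini   # assumed; use the model deployed in your project
    EOF
    ```

    To send traffic to this backend, the backendRefs entry of the HTTPRoute would need to reference azure-foundry instead of azure.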
  5. Create an HTTPRoute resource that routes incoming traffic to the AgentgatewayBackend. The following example attaches a route to the agentgateway-proxy Gateway and, because the rule defines no matches, sends all requests to the azure backend. Agentgateway automatically rewrites the request path to the appropriate chat completion endpoint for the LLM provider that you configured in the AgentgatewayBackend resource.

    kubectl apply -f- <<EOF
    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: azure
      namespace: agentgateway-system
    spec:
      parentRefs:
        - name: agentgateway-proxy
          namespace: agentgateway-system
      rules:
      - backendRefs:
        - name: azure
          namespace: agentgateway-system
          group: agentgateway.dev
          kind: AgentgatewayBackend
    EOF
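  6. Send a request to the LLM provider API along the route that you created in the previous step. Verify that the request succeeds and that you get back a response from the chat completion API. The request leaves the model field empty because the AgentgatewayBackend resource already sets the model to use.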

    Cloud Provider LoadBalancer:

    curl "$INGRESS_GW_ADDRESS/v1/chat/completions" -H content-type:application/json  -d '{
       "model": "",
       "messages": [
         {
           "role": "system",
           "content": "You are a helpful assistant."
         },
         {
           "role": "user",
           "content": "Write a short haiku about cloud computing."
         }
       ]
     }' | jq

    Localhost:

    curl "localhost:8080/v1/chat/completions" -H content-type:application/json  -d '{
       "model": "",
       "messages": [
         {
           "role": "system",
           "content": "You are a helpful assistant."
         },
         {
           "role": "user",
           "content": "Write a short haiku about cloud computing."
         }
       ]
     }' | jq

    Example output:

    {
      "id": "chatcmpl-9A8B7C6D5E4F3G2H1",
      "object": "chat.completion",
      "created": 1727967462,
      "model": "gpt-4.1-mini",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Floating servers bright,\nData streams through endless sky,\nClouds hold all we need."
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 28,
        "completion_tokens": 19,
        "total_tokens": 47
      }
    }
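
To pull specific fields out of a response like the one above, jq can filter the JSON. The following sketch runs against a trimmed copy of the example output stored in a shell variable, so it does not need a running gateway. The variable names are chosen for illustration only.

```shell
# Trimmed copy of the example output above (assumed sample data).
response='{"choices":[{"index":0,"message":{"role":"assistant","content":"Floating servers bright,\nData streams through endless sky,\nClouds hold all we need."},"finish_reason":"stop"}],"usage":{"prompt_tokens":28,"completion_tokens":19,"total_tokens":47}}'

# Extract the assistant's reply as raw text (-r decodes the \n escapes).
haiku=$(printf '%s' "$response" | jq -r '.choices[0].message.content')

# Extract the total token count reported for the request.
total_tokens=$(printf '%s' "$response" | jq -r '.usage.total_tokens')

printf '%s\n' "$haiku"
printf 'total tokens: %s\n' "$total_tokens"   # prints "total tokens: 47"
```

In practice, the same jq filters could replace the bare jq at the end of the curl commands in step 6.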

Next steps

Multiple endpoints

Set up other API endpoints such as embeddings or models.

Prompt guards

Set up prompt guards for your LLM traffic.

LLM observability

View metrics and logs for LLM traffic.
