Azure

Configuration and setup for Azure AI services provider

Configure Azure as an LLM provider in agentgateway.

Azure supports two endpoint types:

- Azure OpenAI (`openAI`): Connect to Azure OpenAI Service deployments at {resourceName}.openai.azure.com.
- Azure AI Foundry (`foundry`): Connect to Azure AI Foundry project endpoints at {resourceName}-resource.services.ai.azure.com.

Before you begin

Install and set up an agentgateway proxy.

Set up access to Azure

Retrieve the resource name and, if applicable, the project name from the Azure AI Foundry portal or the Azure portal. For example:
- For an Azure OpenAI endpoint like https://{my-resource}.openai.azure.com, the resource name is my-resource.
- For an Azure AI Foundry endpoint like https://{my-resource}-resource.services.ai.azure.com and path /api/projects/{my-project}, the resource name is my-resource and the project name is my-project. If the resource name and the project name are the same, you can leave the projectName field empty.
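
If you only have the endpoint URL at hand, the two naming patterns above can be unpicked with plain shell parameter expansion. The following is a minimal sketch; the function names `azure_resource_name` and `azure_project_name` are hypothetical helpers for illustration and are not part of agentgateway or the Azure CLI.

```shell
# Hypothetical helper: derive the resource name from an Azure endpoint URL.
azure_resource_name() {
  host="${1#https://}"   # drop the scheme
  host="${host%%/*}"     # drop any path after the hostname
  case "$host" in
    *.openai.azure.com)
      printf '%s\n' "${host%.openai.azure.com}" ;;
    *-resource.services.ai.azure.com)
      printf '%s\n' "${host%-resource.services.ai.azure.com}" ;;
    *)
      printf 'unrecognized endpoint: %s\n' "$1" >&2
      return 1 ;;
  esac
}

# Hypothetical helper: derive the project name from a Foundry URL
# that contains the /api/projects/<name> path.
azure_project_name() {
  case "$1" in
    */api/projects/*)
      project="${1##*/api/projects/}"   # keep what follows the marker
      printf '%s\n' "${project%%/*}" ;;  # drop any trailing path segments
    *)
      return 1 ;;
  esac
}

azure_resource_name "https://my-resource.openai.azure.com"
azure_project_name "https://my-resource-resource.services.ai.azure.com/api/projects/my-project"
```

The printed values are what you would enter in the `resourceName` and `projectName` fields later on this page.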

Store the API key to access your model deployment in an environment variable. If you are using implicit Entra ID authentication (such as managed identity or workload identity), you can skip this step.
```
export AZURE_API_KEY=<insert your model deployment key>
```
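
The secret in the next step is created from a shell heredoc, so `$AZURE_API_KEY` is expanded when the command runs. If the variable is unset, the secret ends up with an empty value. A small guard such as the following sketch can catch that early; the function name `require_azure_api_key` is a hypothetical helper, not an agentgateway command.

```shell
# Hypothetical guard: fail early if the API key variable is unset or empty.
require_azure_api_key() {
  if [ -z "${AZURE_API_KEY:-}" ]; then
    echo "AZURE_API_KEY is not set; export it before creating the secret" >&2
    return 1
  fi
}

if require_azure_api_key; then
  echo "AZURE_API_KEY is set"
fi
```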

Create a Kubernetes secret to store your API key. If you are using implicit Entra ID authentication, skip this step.

kubectl apply -f- <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: azure-secret
  namespace: agentgateway-system
type: Opaque
stringData:
  Authorization: $AZURE_API_KEY
EOF

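Create an AgentgatewayBackend resource to configure the Azure LLM provider. Use the example that matches your endpoint type and authentication method.

Azure OpenAI with API key (`policies.auth` references the secret that you created earlier):

kubectl apply -f- <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: azure
  namespace: agentgateway-system
spec:
  ai:
    provider:
      azure:
        resourceName: my-resource
        resourceType: openAI
        model: gpt-4.1-mini
  policies:
    auth:
      secretRef:
        name: azure-secret
EOF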

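Azure AI Foundry with API key (`policies.auth` references the secret that you created earlier):

kubectl apply -f- <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: azure
  namespace: agentgateway-system
spec:
  ai:
    provider:
      azure:
        resourceName: my-resource
        resourceType: foundry
        projectName: my-project
        model: gpt-4.1-mini
  policies:
    auth:
      secretRef:
        name: azure-secret
EOF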

When you use implicit Entra ID authentication, the gateway automatically obtains a token using DefaultAzureCredential. No secret or policies.auth is required. This works with managed identity, workload identity, or Azure CLI credentials.

kubectl apply -f- <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: azure
  namespace: agentgateway-system
spec:
  ai:
    provider:
      azure:
        resourceName: my-resource
        resourceType: openAI
        model: gpt-4.1-mini
EOF

Review the following table to understand this configuration. For more information, see the API reference.

| Setting | Description |
|---|---|
| `ai.provider.azure` | Define the Azure provider. |
| `azure.resourceName` | The Azure resource name used to construct the endpoint hostname. |
| `azure.resourceType` | The endpoint type: `openAI` for Azure OpenAI Service, or `foundry` for Azure AI Foundry. |
| `azure.model` | The model to use for requests, such as `gpt-4.1-mini`. |
| `azure.projectName` | The Foundry project name. Required when `resourceType` is `foundry`. |
| `azure.apiVersion` | Optional API version override. Defaults to `v1`. For legacy deployments, use a dated version like `2025-01-01-preview`. |
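
For a legacy deployment, the `apiVersion` override might be set as in the following sketch. It reuses the example backend from this page and adds only the `apiVersion` field. The placement of `apiVersion` next to the other `azure` settings is inferred from the `azure.apiVersion` setting name, so check the API reference before relying on it, and keep any authentication settings that your backend already uses.

```shell
kubectl apply -f- <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: azure
  namespace: agentgateway-system
spec:
  ai:
    provider:
      azure:
        resourceName: my-resource
        resourceType: openAI
        model: gpt-4.1-mini
        # Assumed placement: dated API version for a legacy deployment.
        apiVersion: 2025-01-01-preview
EOF
```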

Create an HTTPRoute resource that routes incoming traffic to the AgentgatewayBackend. Based on the LLM provider that you configured in the AgentgatewayBackend resource, agentgateway automatically rewrites the request path to the provider's chat completion endpoint. Choose between a default route that matches all paths and a custom route on the /azure path prefix.

Default route (all paths):

kubectl apply -f- <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: azure
  namespace: agentgateway-system
spec:
  parentRefs:
    - name: agentgateway-proxy
      namespace: agentgateway-system
  rules:
  - backendRefs:
    - name: azure
      namespace: agentgateway-system
      group: agentgateway.dev
      kind: AgentgatewayBackend
EOF

Custom route (/azure path prefix):

kubectl apply -f- <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: azure
  namespace: agentgateway-system
spec:
  parentRefs:
    - name: agentgateway-proxy
      namespace: agentgateway-system
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /azure
    backendRefs:
    - name: azure
      namespace: agentgateway-system
      group: agentgateway.dev
      kind: AgentgatewayBackend
EOF

Send a request to the LLM provider API along the route that you previously created. Verify that the request succeeds and that you get back a response from the chat completion API.

Default route (/v1/chat/completions)

Cloud Provider LoadBalancer:

curl "$INGRESS_GW_ADDRESS/v1/chat/completions" -H content-type:application/json  -d '{
   "model": "",
   "messages": [
     {
       "role": "system",
       "content": "You are a helpful assistant."
     },
     {
       "role": "user",
       "content": "Write a short haiku about cloud computing."
     }
   ]
 }' | jq

Localhost:

curl "localhost:8080/v1/chat/completions" -H content-type:application/json  -d '{
   "model": "",
   "messages": [
     {
       "role": "system",
       "content": "You are a helpful assistant."
     },
     {
       "role": "user",
       "content": "Write a short haiku about cloud computing."
     }
   ]
 }' | jq

Custom route (/azure)

Cloud Provider LoadBalancer:

curl "$INGRESS_GW_ADDRESS/azure" -H content-type:application/json  -d '{
   "model": "",
   "messages": [
     {
       "role": "system",
       "content": "You are a helpful assistant."
     },
     {
       "role": "user",
       "content": "Write a short haiku about cloud computing."
     }
   ]
 }' | jq

Localhost:

curl "localhost:8080/azure" -H content-type:application/json  -d '{
   "model": "",
   "messages": [
     {
       "role": "system",
       "content": "You are a helpful assistant."
     },
     {
       "role": "user",
       "content": "Write a short haiku about cloud computing."
     }
   ]
 }' | jq

Example output:

{
  "id": "chatcmpl-9A8B7C6D5E4F3G2H1",
  "object": "chat.completion",
  "created": 1727967462,
  "model": "gpt-4.1-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Floating servers bright,\nData streams through endless sky,\nClouds hold all we need."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 19,
    "total_tokens": 47
  }
}
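
To read only the generated text instead of the full JSON body, a `jq` filter over the response structure shown above is usually enough. The following sketch assumes that `jq` is installed; the function name `extract_reply` is a hypothetical helper.

```shell
# Hypothetical helper: print only the assistant's reply from a
# chat completion response that is read from stdin.
extract_reply() {
  jq -r '.choices[0].message.content'
}

# Example with a trimmed-down response body.
printf '%s' '{"choices":[{"index":0,"message":{"role":"assistant","content":"Clouds hold all we need."}}]}' | extract_reply
```

In the request examples on this page, you could pipe the `curl` output to `extract_reply` in place of the plain `| jq`.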

Next steps

Multiple endpoints

Set up other API endpoints such as embeddings or models.

Prompt guards

Set up prompt guards for your LLM traffic.

LLM observability

View metrics and logs for LLM traffic.
