Azure
Configuration and setup for Azure AI services provider
Configure Azure as an LLM provider in agentgateway.
Azure supports two endpoint types:
- Azure OpenAI (openAI): Connect to Azure OpenAI Service deployments at {resourceName}.openai.azure.com.
- Azure AI Foundry (foundry): Connect to Azure AI Foundry project endpoints at {resourceName}-resource.services.ai.azure.com.
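To make the two hostname patterns concrete, the following sketch builds the endpoint hostname from a resource name and an endpoint type. The function name azure_endpoint_host is hypothetical and exists only for illustration. It mirrors the patterns described above and is not agentgateway's internal implementation.

```shell
# Illustrative only: mirrors the documented hostname patterns.
# azure_endpoint_host is a hypothetical helper, not part of agentgateway.
azure_endpoint_host() {
  local resource_name="$1"
  local resource_type="$2"
  case "$resource_type" in
    openAI)
      # Azure OpenAI Service deployments
      printf '%s.openai.azure.com\n' "$resource_name"
      ;;
    foundry)
      # Azure AI Foundry project endpoints
      printf '%s-resource.services.ai.azure.com\n' "$resource_name"
      ;;
    *)
      printf 'unknown resourceType: %s\n' "$resource_type" >&2
      return 1
      ;;
  esac
}

azure_endpoint_host my-resource openAI
azure_endpoint_host my-resource foundry
```

Any value other than openAI or foundry is rejected, matching the two endpoint types listed above.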
Before you begin
Install and set up an agentgateway proxy.
Set up access to Azure
Retrieve the resource name and, if applicable, the project name from the Azure AI Foundry portal or the Azure portal. For example:
- For an Azure OpenAI endpoint like https://{my-resource}.openai.azure.com, the resource name is my-resource.
- For an Azure AI Foundry endpoint like https://{my-resource}-resource.services.ai.azure.com and path /api/projects/{my-project}, the resource name is my-resource and the project name is my-project. If the resource name and the project name are the same, you can leave the projectName field empty.
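If you only have the endpoint URL at hand, the resource name and project name can be read off with shell parameter expansion. This is a minimal sketch under the naming patterns described above. The helper names azure_resource_name and azure_project_name are hypothetical, and the sample values are the placeholders used in this guide.

```shell
# Hypothetical helper: derive the resource name from an endpoint URL.
azure_resource_name() {
  local host="${1#https://}"   # strip the scheme
  host="${host%%/*}"           # strip any path
  case "$host" in
    *.openai.azure.com)
      printf '%s\n' "${host%.openai.azure.com}"
      ;;
    *-resource.services.ai.azure.com)
      printf '%s\n' "${host%-resource.services.ai.azure.com}"
      ;;
    *)
      return 1                 # not a recognized Azure endpoint pattern
      ;;
  esac
}

# Hypothetical helper: derive the project name from a Foundry path
# such as /api/projects/my-project.
azure_project_name() {
  local path="$1"
  case "$path" in
    /api/projects/*)
      path="${path#/api/projects/}"
      printf '%s\n' "${path%%/*}"
      ;;
    *)
      return 1
      ;;
  esac
}

azure_resource_name "https://my-resource.openai.azure.com"
azure_resource_name "https://my-resource-resource.services.ai.azure.com"
azure_project_name "/api/projects/my-project"
```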
Store the API key to access your model deployment in an environment variable. If you are using implicit Entra ID authentication (such as managed identity or workload identity), you can skip this step.
export AZURE_API_KEY=<insert your model deployment key>
Create a Kubernetes secret to store your API key. If you are using implicit Entra ID authentication, skip this step.
kubectl apply -f- <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: azure-secret
  namespace: agentgateway-system
type: Opaque
stringData:
  Authorization: $AZURE_API_KEY
EOF
Create an AgentgatewayBackend resource to configure the Azure LLM provider.
kubectl apply -f- <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: azure
  namespace: agentgateway-system
spec:
  ai:
    provider:
      azure:
        resourceName: my-resource
        resourceType: openAI
        model: gpt-4.1-mini
EOF
Review the following table to understand this configuration. For more information, see the API reference.
| Setting | Description |
| --- | --- |
| ai.provider.azure | Define the Azure provider. |
| azure.resourceName | The Azure resource name used to construct the endpoint hostname. |
| azure.resourceType | The endpoint type: openAI for Azure OpenAI Service, or foundry for Azure AI Foundry. |
| azure.model | The model to use for requests, such as gpt-4.1-mini. |
| azure.projectName | The Foundry project name. Required when resourceType is foundry. |
| azure.apiVersion | Optional API version override. Defaults to v1. For legacy deployments, use a dated version like 2025-01-01-preview. |
Create an HTTPRoute resource that routes incoming traffic to the AgentgatewayBackend. The following example sets up a route. Note that agentgateway automatically rewrites the endpoint to the appropriate chat completion endpoint of the LLM provider for you, based on the LLM provider that you set up in the AgentgatewayBackend resource.
kubectl apply -f- <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: azure
  namespace: agentgateway-system
spec:
  parentRefs:
    - name: agentgateway-proxy
      namespace: agentgateway-system
  rules:
    - backendRefs:
      - name: azure
        namespace: agentgateway-system
        group: agentgateway.dev
        kind: AgentgatewayBackend
EOF
Send a request to the LLM provider API along the route that you previously created. Verify that the request succeeds and that you get back a response from the chat completion API.
Cloud Provider LoadBalancer:
curl "$INGRESS_GW_ADDRESS/v1/chat/completions" -H content-type:application/json -d '{
  "model": "",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Write a short haiku about cloud computing."
    }
  ]
}' | jq
Localhost:
curl "localhost:8080/v1/chat/completions" -H content-type:application/json -d '{
  "model": "",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Write a short haiku about cloud computing."
    }
  ]
}' | jq
Example output:
{
  "id": "chatcmpl-9A8B7C6D5E4F3G2H1",
  "object": "chat.completion",
  "created": 1727967462,
  "model": "gpt-4.1-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Floating servers bright,\nData streams through endless sky,\nClouds hold all we need."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 19,
    "total_tokens": 47
  }
}
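When verifying the response in a script, it can help to pull out only the fields you care about instead of reading the full JSON. The following sketch assumes jq is installed, as in the curl examples, and that the response has the shape shown in the example output. The function names chat_reply and chat_total_tokens are hypothetical.

```shell
# Hypothetical helpers that read a chat completion response on stdin.
# Both assume the response shape shown in the example output.

# Print only the assistant's reply text from the first choice.
chat_reply() {
  jq -r '.choices[0].message.content'
}

# Print the total number of tokens reported for the request.
chat_total_tokens() {
  jq -r '.usage.total_tokens'
}

# Usage (requires a running gateway, so it is left as a comment):
#   curl -s "localhost:8080/v1/chat/completions" \
#     -H content-type:application/json \
#     -d '{"model": "", "messages": [{"role": "user", "content": "Hello"}]}' \
#     | chat_reply
```

The -r flag makes jq print raw strings, so the reply appears as plain text with real line breaks rather than as a quoted JSON string.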