OpenAI moderation
The OpenAI Moderation API detects potentially harmful content across categories including hate, harassment, self-harm, sexual content, and violence.
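To illustrate how a moderation verdict translates into a blocking decision, the following sketch parses a sample Moderation API response. The field names `flagged`, `categories`, and `category_scores` match the real API; the sample values and the `is_blocked` helper are illustrative, not part of agentgateway.

```python
import json

# Abridged example of a Moderation API response (illustrative values)
sample = json.loads('''{
  "results": [{
    "flagged": true,
    "categories": {"self-harm": true, "violence": false},
    "category_scores": {"self-harm": 0.97, "violence": 0.01}
  }]
}''')

def is_blocked(response: dict) -> bool:
    # Block the request if any result in the response is flagged
    return any(r["flagged"] for r in response["results"])

print(is_blocked(sample))  # True
```

A prompt guard applies the same kind of check before forwarding a request to the upstream model.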
Before you begin
Block harmful content
Configure the prompt guard to use OpenAI Moderation:
```yaml
kubectl apply -f - <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayPolicy
metadata:
  name: openai-prompt-guard
  namespace: agentgateway-system
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: openai
  backend:
    ai:
      promptGuard:
        request:
        - openAIModeration:
            policies:
            auth:
              secretRef:
                name: openai-secret
            model: omni-moderation-latest
          response:
            message: "Content blocked by moderation policy"
EOF
```

Test with content that triggers moderation.
```sh
curl -i "$INGRESS_GW_ADDRESS/openai" \
  -H "content-type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "I want to harm myself"
      }
    ]
  }'
```

Expected response:
```
HTTP/1.1 403 Forbidden

Content blocked by moderation policy
```
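Client applications can branch on this status code to distinguish a moderation block from a normal completion. A minimal sketch (the `handle_gateway_response` helper is hypothetical, not part of agentgateway):

```python
def handle_gateway_response(status: int, body: str) -> str:
    # A 403 from the gateway indicates the prompt guard blocked the request
    if status == 403:
        return f"blocked: {body}"
    # Any other status falls through to normal response handling
    return "ok"

print(handle_gateway_response(403, "Content blocked by moderation policy"))
# blocked: Content blocked by moderation policy
```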
Cleanup
You can remove the resources that you created in this guide.

```sh
kubectl delete AgentgatewayPolicy openai-prompt-guard -n agentgateway-system
```