For the complete documentation index, see llms.txt. Markdown versions of all docs pages are available by appending .md to any docs URL.
OpenAI moderation
Detects potentially harmful content across categories including hate, harassment, self-harm, sexual content, and violence with the OpenAI moderation API.
The OpenAI Moderation API detects potentially harmful content across categories including hate, harassment, self-harm, sexual content, and violence.
Before you begin
Install theagentgateway binary.Block harmful content
Create a configuration file and add the OpenAI moderation model that you want to use.
cat <<EOF > config.yaml # yaml-language-server: $schema=https://agentgateway.dev/schema/config llm: models: - name: "*" provider: openAI params: model: gpt-4o-mini apiKey: "$OPENAI_API_KEY" guardrails: request: - openAIModeration: model: omni-moderation-latest policies: backendAuth: key: "$OPENAI_API_KEY" rejection: body: "Content blocked by moderation policy" EOFStart the agentgateway.
agentgateway -f config.yamlSend a request to the LLM that triggers the built-in guardrail. Verify that the request is blocked with a 403 response message.
curl -i http://localhost:4000/v1/chat/completions \ -H "content-type: application/json" \ -d '{ "model": "gpt-4o-mini", "messages": [ { "role": "user", "content": "I want to harm myself" } ] }'Example output:
HTTP/1.1 403 Forbidden content-length: 36 Content blocked by moderation policy%