
OpenAI moderation

The OpenAI Moderation API detects potentially harmful content across categories including hate, harassment, self-harm, sexual content, and violence.

Before you begin

Install the agentgateway binary.

Block harmful content

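  1. Create a configuration file that adds the OpenAI moderation model that you want to use as a request guardrail. The heredoc is unquoted, so the shell expands $OPENAI_API_KEY when it writes the file; export the variable before you run the command.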

    cat <<EOF > config.yaml
    # yaml-language-server: \$schema=https://agentgateway.dev/schema/config
    llm:
      models:
      - name: "*"
        provider: openAI
        params:
          model: gpt-4o-mini
          apiKey: "$OPENAI_API_KEY"
        guardrails:
          request:
          - openAIModeration:
              model: omni-moderation-latest
              policies:
                backendAuth:
                  key: "$OPENAI_API_KEY"
            rejection:
              body: "Content blocked by moderation policy"
    EOF
  2. Start the agentgateway.

    agentgateway -f config.yaml
  3. Send a request to the LLM that triggers the moderation guardrail. Verify that the request is blocked with a 403 Forbidden response and the rejection body that you configured.

    curl -i http://localhost:4000/v1/chat/completions \
      -H "content-type: application/json" \
      -d '{
        "model": "gpt-4o-mini",
        "messages": [
          {
            "role": "user",
            "content": "I want to harm myself"
          }
        ]
      }'

    Example output:

    HTTP/1.1 403 Forbidden
    content-length: 36
    
    Content blocked by moderation policy
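
The manual check in step 3 can be scripted. The following is a minimal sketch, not part of agentgateway: the function names classify_status and check_prompt and the GATEWAY_URL variable are hypothetical, and it assumes that a 403 response on this route comes only from the moderation guardrail configured above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking the moderation guardrail.
# GATEWAY_URL is an illustrative variable; the default matches the curl example above.
GATEWAY_URL="${GATEWAY_URL:-http://localhost:4000/v1/chat/completions}"

# Map an HTTP status code to a verdict.
# Assumption: 403 is returned only when the guardrail rejects the request.
classify_status() {
  case "$1" in
    403) echo "blocked" ;;
    2??) echo "allowed" ;;
    *)   echo "error" ;;   # includes 000, which curl reports when it cannot connect
  esac
}

# Send one prompt through the gateway and print the verdict.
# Sketch only: the prompt is not JSON-escaped, so avoid quotes and backslashes.
check_prompt() {
  local prompt="$1" payload status
  payload=$(printf '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"%s"}]}' "$prompt")
  status=$(curl -s -o /dev/null -w '%{http_code}' "$GATEWAY_URL" \
    -H "content-type: application/json" \
    -d "$payload")
  classify_status "$status"
}
```

With the gateway running, calling check_prompt "I want to harm myself" should print blocked if the guardrail behaves as in the example output above.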
    