Safety Filter

A post-processing mechanism that screens AI model outputs for harmful, inappropriate, or policy-violating content before the results are presented to the user.

In Plain Language

A last line of defence that checks AI outputs before they reach users. Even if the AI generates something harmful, the safety filter catches and blocks it. Like a security checkpoint at the exit.
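In code, such a filter can be as simple as a function that sits between the model and the user. The sketch below is a minimal, hypothetical example using keyword patterns; a production system would typically use a trained content classifier or a moderation API rather than a hand-written list.

```python
import re

# Hypothetical blocked-content patterns for illustration only; a real
# deployment would rely on a trained classifier or moderation service.
BLOCKED_PATTERNS = [
    re.compile(r"\bhow to make a weapon\b", re.IGNORECASE),
    re.compile(r"\d{3}-\d{2}-\d{4}"),  # e.g. leaked SSN-like strings
]

REFUSAL_MESSAGE = "This response was withheld by the safety filter."

def safety_filter(model_output: str) -> tuple[str, bool]:
    """Screen a model's output before it reaches the user.

    Returns the (possibly replaced) text and a flag indicating whether
    the output was blocked, so filter activity can be logged and monitored.
    """
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(model_output):
            return REFUSAL_MESSAGE, True
    return model_output, False

# Benign output passes through unchanged; flagged output is replaced.
text, blocked = safety_filter("Here is a recipe for banana bread.")
```

The returned flag matters as much as the replacement text: logging every blocked output gives the monitoring data needed to judge whether the filter is still effective.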

Why This Matters

Safety filters are a key governance control: even a well-aligned model can occasionally produce harmful output, so screening at the output stage limits what reaches users. Your governance framework should require safety filtering for all customer-facing AI systems and define monitoring procedures, such as reviewing block rates and sampled outputs, to ensure filters remain effective.