Safety Filter
A post-processing mechanism that screens AI model outputs for harmful, inappropriate, or policy-violating content before the results are presented to the user.
In Plain Language
A last line of defence that checks AI outputs before they reach users. Even if the AI generates something harmful, the safety filter catches and blocks it. Like a security checkpoint at the exit.
Why This Matters
Safety filters are a governance control that catches harmful AI outputs even when upstream safeguards fail. Your governance framework should require safety filtering for all customer-facing AI systems and define monitoring procedures, such as reviewing what the filter blocks, to confirm that filters remain effective over time.
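As a simple illustration of the idea, and not any particular vendor's implementation, the sketch below screens a model output against a hypothetical blocklist before it reaches the user, and returns a blocked flag that a monitoring procedure could log to track filter hits. Real filters typically combine trained classifiers with policy rules rather than regex patterns.

```python
import re

# Hypothetical blocklist for illustration only; a production filter
# would use trained classifiers and policy-specific rules.
BLOCKED_PATTERNS = [
    re.compile(r"\bhow to make a weapon\b", re.IGNORECASE),
    re.compile(r"\bcredit card number\b", re.IGNORECASE),
]

REFUSAL_MESSAGE = "This response was blocked by the safety filter."

def safety_filter(model_output: str) -> tuple[str, bool]:
    """Screen a model output before presenting it to the user.

    Returns the text to display and a flag indicating whether the
    output was blocked (the flag feeds monitoring dashboards).
    """
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(model_output):
            return REFUSAL_MESSAGE, True
    return model_output, False

# Usage: wrap every model response before it is shown.
text, blocked = safety_filter("Here is a recipe for pancakes.")
```

The key design point is that the filter sits after generation, at the "exit checkpoint": it never modifies the model itself, so it can be updated, monitored, and audited independently of the underlying AI system.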
