Toxicity Detection
AI systems designed to identify and flag harmful, abusive, hateful, or offensive language in text, used for content moderation and as a safety guardrail.
In Plain Language
AI that identifies hateful, abusive or harmful language in text. Used by social media platforms and chatbots to catch and filter out toxic content before it reaches users.
Why This Matters
Toxicity detection is a governance control for any organisation that deploys customer-facing AI. Your governance framework should require toxicity filtering and monitoring for all AI systems that generate or process user-facing content.
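A toxicity filter like the one this control describes is typically a classifier score compared against a threshold, with content blocked or flagged when the score is too high. The sketch below illustrates only that threshold pattern; the names (`score_toxicity`, `moderate`, `BLOCKLIST`) are hypothetical, and the keyword-based scorer is a stand-in for a real ML classifier or hosted moderation API.

```python
# Minimal sketch of a threshold-based toxicity guardrail.
# Assumption: in production, score_toxicity() would call a trained
# classifier or moderation API; the keyword stub here only
# illustrates the control flow, not a real detection method.

BLOCKLIST = {"idiot", "hate"}  # placeholder terms for the stub scorer


def score_toxicity(text: str) -> float:
    """Return a toxicity score in [0, 1]. Stub: 1.0 if any
    blocklisted word appears, else 0.0."""
    words = set(text.lower().split())
    return 1.0 if words & BLOCKLIST else 0.0


def moderate(text: str, threshold: float = 0.5) -> str:
    """Pass text through unchanged if its toxicity score is below
    the threshold; otherwise replace it with a removal notice."""
    if score_toxicity(text) >= threshold:
        return "[message removed by moderation]"
    return text
```

In a governance framework, the threshold, the blocked categories, and the logging of flagged content would all be documented policy decisions rather than hard-coded values.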
