Trojan Attack
A type of backdoor attack where a model is trained to behave normally on standard inputs but produces malicious outputs when a specific trigger pattern is detected.
In Plain Language
A hidden "sleeper" attack embedded in an AI. The model behaves perfectly during testing, but when a specific trigger appears in real use, it switches to malicious behaviour.
.png)
