Content Moderation AI Comparison Hub

3 papers - avg viability 7.0

Reference Surfaces

xList-Hate: A Checklist-Based Framework for Interpretable and Generalizable Hate Speech Detection (7.0)
xList-Hate offers an interpretable framework for robust cross-domain hate speech detection, enabling fine-grained content moderation.
Improving Implicit Hate Speech Detection via a Community-Driven Multi-Agent Framework (6.0)
Proposes a multi-agent system for detecting implicit hate speech, aiming to improve classification accuracy and fairness.