Content Moderation

Trending

3 papers

5.7 viability

+100% (30d)

Papers

Research Paper · Jan 29, 2026

KID: Knowledge-Injected Dual-Head Learning for Knowledge-Grounded Harmful Meme Detection

Internet memes have become pervasive carriers of digital culture on social platforms. However, their heavy reliance on metaphors and sociocultural context also makes them subtle vehicles for harmful c...

7.0 viability

Research Paper · Feb 27, 2026

FlexGuard: Continuous Risk Scoring for Strictness-Adaptive LLM Content Moderation

Ensuring the safety of LLM-generated content is essential for real-world deployment. Most existing guardrail models formulate moderation as a fixed binary classification task, implicitly assuming a fi...

5.0 viability

Research Paper · Mar 2, 2026

GMP: A Benchmark for Content Moderation under Co-occurring Violations and Dynamic Rules

Online content moderation is essential for maintaining a healthy digital environment, and reliance on AI for this task continues to grow. Consider a user comment using national stereotypes to insult a...