Content Moderation Comparison Hub
4 papers - avg viability 6.0
Top Papers
- KID: Knowledge-Injected Dual-Head Learning for Knowledge-Grounded Harmful Meme Detection (7.0)
KID provides state-of-the-art knowledge-grounded harmful meme detection with a dual-head learning architecture.
- Detection of Illicit Content on Online Marketplaces using Large Language Models (7.0)
A multilingual illicit content detection tool for online marketplaces using LLMs to improve safety and moderation.
- FlexGuard: Continuous Risk Scoring for Strictness-Adaptive LLM Content Moderation (5.0)
FlexGuard is an adaptive LLM moderation tool that assigns continuous risk scores to content, letting a single model serve multiple strictness levels and improving moderation accuracy and robustness.
- GMP: A Benchmark for Content Moderation under Co-occurring Violations and Dynamic Rules (5.0)
A benchmark for AI content moderation that addresses co-occurring policy violations and dynamic rule changes.
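The FlexGuard entry above describes assigning a continuous risk score that different strictness levels can threshold. A minimal sketch of that idea follows; the threshold values, level names, and function are illustrative assumptions, not details from the paper:

```python
# Hypothetical sketch of strictness-adaptive moderation via continuous
# risk scores. Thresholds and level names are illustrative assumptions.

STRICTNESS_THRESHOLDS = {
    "lenient": 0.9,   # block only the highest-risk content
    "standard": 0.6,
    "strict": 0.3,    # block anything moderately risky
}

def moderate(risk_score: float, strictness: str = "standard") -> str:
    """Map a continuous risk score in [0, 1] to a moderation decision.

    One scored output serves every strictness level: the platform picks
    a threshold instead of retraining a separate classifier per policy.
    """
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be in [0, 1]")
    threshold = STRICTNESS_THRESHOLDS[strictness]
    return "block" if risk_score >= threshold else "allow"

print(moderate(0.7, "lenient"))  # → allow
print(moderate(0.7, "strict"))   # → block
```

The same 0.7 score is allowed under a lenient policy but blocked under a strict one, which is the adaptivity the summary describes.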