ScienceToStartup
State of AI Benchmarks
4 papers · avg viability 5.3
Top papers
CorpusQA: A 10 Million Token Benchmark for Corpus-Level Analysis and Reasoning (6.0)
Pencil Puzzle Bench: A Benchmark for Multi-Step Verifiable Reasoning (5.0)
LifeBench: A Benchmark for Long-Horizon Multi-Source Memory (5.0)
SourceBench: Can AI Answers Reference Quality Web Sources? (5.0)