AI ALIGNMENT FORUM
AI Benchmarking
This page is a stub.
Posts tagged AI Benchmarking
21
Introducing BenchBench: An Industry Standard Benchmark for AI Strength
Jozdien
7mo
0
16
Improving Model-Written Evals for AI Safety Benchmarking
Sunishchal Dev, Marius Hobbhahn
1y
0
5
Auto-Enhance: Developing a meta-benchmark to measure LLM agents’ ability to improve other agents
Sam F. Brown, BasilLabib, Codruta (Coco) Lugoj, Sai Sasank Y
1y
0
5
MMLU’s Moral Scenarios Benchmark Doesn’t Measure What You Think it Measures
corey morris
2y
2