AI Benchmarking
This page is a stub.
Posts tagged AI Benchmarking
21
Introducing BenchBench: An Industry Standard Benchmark for AI Strength
Jozdien
5mo
0
16
Improving Model-Written Evals for AI Safety Benchmarking
Sunishchal Dev, Marius Hobbhahn
11mo
0
5
Auto-Enhance: Developing a meta-benchmark to measure LLM agents’ ability to improve other agents
Sam F. Brown, BasilLabib, Codruta (Coco) Lugoj, Sai Sasank Y
1y
0
5
MMLU’s Moral Scenarios Benchmark Doesn’t Measure What You Think it Measures
corey morris
2y
2