AI ALIGNMENT FORUM

AI Benchmarking

This page is a stub.
Posts tagged AI Benchmarking
Introducing BenchBench: An Industry Standard Benchmark for AI Strength
Jozdien · 5mo · 21 karma · 0 comments

Improving Model-Written Evals for AI Safety Benchmarking
Sunishchal Dev, Marius Hobbhahn · 11mo · 16 karma · 0 comments

Auto-Enhance: Developing a meta-benchmark to measure LLM agents’ ability to improve other agents
Sam F. Brown, BasilLabib, Codruta (Coco) Lugoj, Sai Sasank Y · 1y · 5 karma · 0 comments

MMLU’s Moral Scenarios Benchmark Doesn’t Measure What You Think it Measures
corey morris · 2y · 5 karma · 2 comments