METR does not intend to claim to have audited anything, or to be providing meaningful oversight or accountability, but there has been some confusion about whether METR is an auditor or plans to become one. To clarify: 1. METR’s top priority is to develop the science...
Update 3/14/2024: This post is out of date. For current information on the task bounty, see our Task Development Guide. Summary METR (formerly ARC Evals) is looking for (1) ideas, (2) detailed specifications, and (3) well-tested implementations for tasks to measure performance of autonomous LLM agents. Quick description of key...
Update: We are no longer accepting gnarly bug submissions. However, we are still accepting submissions for our Task Bounty! TL;DR: We are looking for hard debugging tasks for evals, paying the greater of $60/hr or $200 per example. METR (formerly ARC Evals) is interested in producing hard debugging tasks for models to attempt...
Note: This is not a personal post. I am sharing on behalf of the ARC Evals team. Potential risks of publication and our response This document expands on an appendix to ARC Evals’ paper, “Evaluating Language-Model Agents on Realistic Autonomous Tasks.” We published this report in order to i) increase...
Blogpost version | Paper. We have just released our first public report. It introduces a methodology for assessing the capacity of LLM agents to acquire resources, create copies of themselves, and adapt to novel challenges they encounter in the wild. Background ARC Evals develops methods for evaluating the safety of large language...
[Written for more of a general-public audience than alignment-forum audience. We're working on a more thorough technical report.] We believe that capable enough AI systems could pose very large risks to the world. We don’t think today’s systems are capable enough to pose these sorts of risks, but we think...
Post status: pretty rough and unpolished; I thought it might be worthwhile getting this out anyway. I feel like I've encountered various people with misunderstandings of LLMs that seem related to the 'simulator' framing. I'm probably being horrendously uncharitable to the people in question; I'm not confident that...