Nikola Jurkovic — AI Alignment Forum

Romeo Dean and I ran a slightly modified version of this format for members of AISST and we found it a very useful and enjoyable activity!

We first gathered to do 2 hours of reading and discussing, and then we spent 4 hours switching between silent writing and discussing in small groups.

The main changes we made are:

We removed the part where people estimate probabilities of ASI and doom happening by the end of each other’s scenarios.
We added a formal benchmark forecasting part for 7 benchmarks using private Metaculus questions (forecasting their values as of January 31, 2025):
1. GPQA
2. SWE-bench
3. GAIA
4. InterCode (Bash)
5. WebArena
6. Number of METR tasks completed
7. Elo on LMSys arena relative to GPT-4-1106

We think the first change made it better, but in hindsight we would have reduced the number of benchmarks to around 3 (GPQA, SWE-bench and LMSys Elo), or given participants much more time.

METR: Measuring AI Ability to Complete Long Tasks

Nikola Jurkovic 9mo 116

This has been one of the most important results for my personal timelines to date. It was a big part of the reason I recently updated from a ~3-year median to a ~4-year median for AI that can automate >95% of remote jobs from 2022, and why my overall distribution has become narrower (less probability on really long timelines).

Scenario Forecasting Workshop: Materials and Learnings

Nikola Jurkovic 2y 20
