Ben Pace

I'm an admin of this site; I work full-time on trying to help people on LessWrong refine the art of human rationality.



AI Alignment Writing Day 2019
AI Alignment Writing Day 2018




Christiano, Cotra, and Yudkowsky on AI progress

Wow thanks for pulling that up. I've gotta say, having records of people's predictions is pretty sweet. Similarly, solid find on the Bostrom quote.

Do you think that might be the 20% number that Eliezer is remembering? Eliezer, I'm interested in whether you have a recollection of this or not. [Added: It seems from a comment upthread that EY was talking about superforecasters in Feb 2016, which is after Fan Hui.]

Christiano, Cotra, and Yudkowsky on AI progress

Adding my recollection of that period: some people made the relevant updates when DeepMind's system beat the European Champion Fan Hui (in October 2015). My hazy recollection is that beating Fan Hui started some people going "Oh huh, I think this is going to happen" and then when AlphaGo beat Lee Sedol (in March 2016) everyone said "Now it is happening".

Discussion with Eliezer Yudkowsky on AGI interventions

Thank you for this follow-up comment Adam, I appreciate it.

Discussion with Eliezer Yudkowsky on AGI interventions

Glad to hear. And yeah, that’s the crux of the issue for me.

Discussion with Eliezer Yudkowsky on AGI interventions


One of Eliezer's claims here is:

"It is very, very clear that at present rates of progress, adding that level of alignment capability as grown over the next N years, to the AGI capability that arrives after N years, results in everybody dying very quickly."

This is a claim I basically agree with.

I don't think the situation is entirely hopeless, but I don't think any of the current plans (or the current alignment field) are on track to save us.

Discussion with Eliezer Yudkowsky on AGI interventions

Thank you for the links Adam. To clarify, the kind of argument I'm really looking for is something like the following three (hypothetical) examples.

  • Mesa-optimization is the primary threat model of unaligned AGI systems. Over the next few decades there will be a lot of companies building ML systems that create mesa-optimizers. I think it is within 5 years of current progress that we will understand how ML systems create mesa-optimizers and how to stop it. Therefore I think the current field is adequate for the problem (80%).
  • When I look at the research we're outputting, it seems to me that we are producing research at a speed and flexibility faster than any comparably sized academic department globally, or the ML industry, and so I am much more hopeful that we're able to solve our difficult problem before the industry builds an unaligned AGI. I give it a 25% probability, which I suspect is much higher than Eliezer's.
  • I basically agree the alignment problem is hard and unlikely to be solved, but I don't think we have any alternative to the current sorts of work being done, which is a combo of (a) agent foundations work, (b) designing theoretical training algorithms (like Paul is), or (c) directly aligning narrowly superintelligent models. I am pretty open to Eliezer's claim that we will fail but I see no alternative plan to pour resources into.

Whatever you actually think about the field and how it will save the world, say it! 

It seems to me that almost all of the arguments you’ve made work whether the field is a failure or not. The debate here has to pass through whether the field is on-track or not, and we must not sidestep that conversation.

I want to leave this paragraph as social acknowledgment that you mentioned upthread that you're tired and taking a break, and I want to give you a bunch of social space to not return to this thread for however long you need to take! Slow comments are often the best.

Discussion with Eliezer Yudkowsky on AGI interventions

Adam, can you make a positive case here for how the work being done on prosaic alignment leads to success? You didn't make one, and without it I don't understand where you're coming from. I'm not asking you to tell me a story that you have 100% probability on, just what is the success story you're acting under, such that EY's stances seem to you to be mostly distracting people from the real work.

Discussion with Eliezer Yudkowsky on AGI interventions


(...I'll be at the office, thinking about how to make enough progress fast enough.)

We're Redwood Research, we do applied alignment research, AMA

Would you prefer questions here or on the EA Forum?
