Ben Pace

I'm an admin of this site; I work full-time on trying to help people on LessWrong refine the art of human rationality.

Longer bio:


AI Alignment Writing Day 2019
AI Alignment Writing Day 2018


Risks from Learned Optimization: Introduction

For me, this is the paper where I learned to connect ideas about delegation to machine learning. The paper sets up simple ideas of mesa-optimizers, and shows a number of constraints and variables that will determine how the mesa-optimizers will be developed. In some environments you want to do a lot of thinking in advance and then delegate execution of a very simple algorithm to do your work (e.g. this simple algorithm Critch developed that my group house uses to decide on the rent for each room). In other environments you want to do a little thinking and then delegate to a very complex algorithm that figures out what to do (e.g. evolution is very stupid, yet it makes very complex brains that figure out what to do in the many situations humans encountered in the EEA).

Seeing this more clearly in ML, I was shocked by how inadequate ML is at doing this with any real direction. It just doesn't seem like something that we have much control over. Of course I may be wrong, and there are some simple proposals (though none have worked so far). Nonetheless, the paper is a substantial step forward in discussing delegation in modern ML systems, and it discusses lots of related ideas very clearly.

Definitely should be included in the review. I expect to vote on this with something like +5 to +8.

I don't do research in this area; I expect others like Daniel Filan and Adam Shimi will have more detailed opinions of the sequence's strengths and weaknesses. (Nonetheless I stand by my assessment and will vote accordingly.)

Review of 'Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More'

Yeah I agree. I think it's useful to have a public record of it, and I'm glad that public conversation happened, but I don't think it's an important part of the ongoing conversation in the rationality community, and the conversation wasn't especially insightful.

I hope some day we'll have better debates with more resources devoted by either side than a FB comment thread, and perhaps one day that will be good for the review.

Utility ≠ Reward

For another datapoint, I'll mention that I read neither this post nor Gradient Hacking at the time. I read the sequence, and I found that to be pretty enlightening and quite readable.

2020 AI Alignment Literature Review and Charity Comparison

hurrah! victory for larks, with yet another comprehensive review! how long can he keep it up? another decade? i hope so!

(Also I had 3 laugh-out-loud moments. I will let the studious reader find all your hidden jokes.)

Gradient hacking

This is one of the scarier posts I've read on LW. I feel kinda freaked out by this post. It's an important technical idea.

Six AI Risk/Strategy Ideas

The first three examples here have been pretty helpful to me in considering how DSAs and takeoffs will go and why they may be dangerous.

AGI will drastically increase economies of scale

Seems like an important consideration, and explained concisely.
