Ben Pace

I'm an admin of this site; I work full-time on trying to help people on LessWrong refine the art of human rationality.

AI Alignment Writing Day 2019
AI Alignment Writing Day 2018


Rogue AGI Embodies Valuable Intellectual Property

Assuming that the discounted value of a monopoly in this IP is reasonably close to Alice’s cost of training, e.g. 1x-3x, competition between Alpha and Beta only shrinks the available profits by half, and Beta expects to acquire between 10%-50% of the market,

Basic econ question here: I think 2 competitors can often cut profits by much more than half, because they can always undercut each other on price until they hit the cost of production. Going from 1 seller to 2 in particular can shift a market from monopoly pricing to something much more competitive, so the IP might be worth a lot less than this suggests.

Still, obviously likely to be worth it to the second company, so I totally expect the competition to happen.
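A toy model of the point above (standard textbook monopoly/Cournot/Bertrand setups; the demand curve and cost numbers are purely illustrative, not from the post):

```python
# Compare profits under linear inverse demand p = a - Q with constant
# marginal cost c, for three market structures. Numbers are hypothetical.

a, c = 100.0, 20.0  # demand intercept and marginal cost (illustrative)

# Monopoly: choose p to maximize (p - c) * (a - p); optimum is p = (a + c) / 2,
# giving profit (a - c)^2 / 4.
monopoly_profit = (a - c) ** 2 / 4  # 1600.0

# Cournot duopoly (compete on quantities): each firm produces (a - c) / 3,
# earning ((a - c) / 3)^2 each.
cournot_each = ((a - c) / 3) ** 2   # ~711 — each firm gets well under half
cournot_total = 2 * cournot_each    # total industry profit also below monopoly

# Bertrand duopoly (compete on price): undercutting drives price down to
# marginal cost, so profit goes to zero — far more than a 50% haircut.
bertrand_total = 0.0

print(monopoly_profit, cournot_each, bertrand_total)
```

So even in the "gentle" quantity-competition case each entrant earns under half the monopoly profit, and in the price-undercutting case the profits vanish entirely — which is the sense in which a second seller can destroy much more than half the value.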

Finite Factored Sets

Curated. This is a fascinating framework that (to the best of my understanding) makes substantive improvements on the Pearlian paradigm. It's also really exciting that you found a new simple sequence. 

Re: the writeup, it's explained very clearly, and the interspersed Q&A is a very nice touch. I like that the talk factorizes.

I really appreciate the research exploration you do around ideas of agency, and I'm very happy to celebrate writeups like this when you produce them.

Knowledge is not just map/territory resemblance

The original lesswrong 1.0 had the following header at the top of each page, pointing at a certain concept of map/territory resemblance:

I don't remember the image you show. I looked it up on the Wayback Machine and don't see this header there. I see a map atop this post in 2009, and not long after it becomes the grey texture that stayed until LW 2.0. Where did you get your image from?

Abstraction Talk

Yeah, we can have a try and see whether it ends up being worth publishing.

Abstraction Talk

Nice. I'll get a transcript made and share it with you for edits.

Testing The Natural Abstraction Hypothesis: Project Intro

Curated. This is a pretty compelling research line, and it seems to me to have the potential to help us a great deal in understanding how to interface with, understand, and align machine intelligence systems. It's also the compilation of a bunch of good writing and work from you that I'd like to celebrate, and it's something of a mission statement for the ongoing work.

I generally love all the images, and I like the way the post ties a bunch of prior ideas together.

Formal Inner Alignment, Prospectus

Curated. A solid attempt to formalize the core problem, with a strong comment section from lots of people.

Agency in Conway’s Game of Life

I recall once seeing someone say with 99.9% probability that the sun would still rise 100 million years from now, citing information about the life-cycle of stars like our sun. Someone else pointed out that this was clearly wrong: by default the sun would be taken apart for fuel on that timescale, by us or some AI, and this was a lesson in how inaccurate people's predictions about the future can be.

But also, "the thing that means there won't be a sun sometime soon" is one of the things I'm pointing to when talking about "general intelligence". This post reminded me of that.

AMA: Paul Christiano, alignment researcher


(If both parties are interested in that debate I’m more than happy to organize it in whatever medium and do any work like record+transcripts or book an in-person event space.)

AMA: Paul Christiano, alignment researcher

The stuff about ‘alien’ knowledge sounds really fascinating, and I’d be excited about write-ups. All my concrete intuitions here come from reading Distill.Pub papers.
