Ruben Bloom

Team Lead for LessWrong



«Boundaries», Part 1: a key missing concept from utility theory

Curated. It's not every day that someone attempts to add concepts to the axioms of game theory/bargaining theory/utility theory, and I'm pretty excited for where this is headed, especially if the implications are real for EA and x-risk.

Conditioning Generative Models with Restrictions

When I process new posts, I add tags so they can more easily be found later. I wasn't sure what to tag this one with beyond "AI". I think it'd increase the likelihood your post gets found and read later if you look through the available AI tags (you can search or use the Concepts page), or create new tags if you think none of the existing ones fit.

On how various plans miss the hard bits of the alignment challenge

Curated. I could imagine a world where different people pursue different agendas in a “live and let live” way, with no one wanting to be too critical of anyone else. I think that’s a world where many people could waste a lot of time with nothing prompting them to reconsider. I think posts like this one give us a chance to avoid scenarios like that. And posts like this can spur discussion of the higher-level approaches/intuitions that spawn more object-level research agendas. The top comments here by Paul Christiano, John Wentworth, and others are a great instance of this.

I also kind of like how this just further develops my gears-level understanding of why Nate predicts doom. There’s color here beyond AGI Ruin: List of Lethalities, which I assume captured most of Nate’s pessimism, but in fact I wonder if Nate disagrees with Eliezer and thinks things would be a bunch more hopeful if only people worked on the right stuff (in contrast with the view that the problem is too hard for our civilization).

Lastly I’ll note that I think it’s good that Nate wrote this post even before being confident he could pass other people’s ITT. I’m glad he felt it was okay to be critical (with caveats) even before his criticisms were maximally defensible (e.g. because he thinks he could pass an ITT).

Where I agree and disagree with Eliezer

Curated. Eliezer's List of Lethalities post has received an immense amount of attention, rightly so given the content, and I am extremely glad to see this response go live since Eliezer's views do not reflect a consensus, and it would be sad to have only one set of views be getting all the attention when I do think many of the questions are non-obvious. 

I am very pleased to see public back-and-forth on questions of not just "how and whether we are doomed", but the specific gears behind them (where things will work vs cannot work). These questions bear on the enormous resources poured into AI safety work right now. Ensuring those resources get allocated in a way that actually improves the odds of our success is key.

I hope that others continue to share and debate their models of the world, Alignment, strategy, etc. in a way that is both on record and easily findable by others. Hopefully, we can look back in 10, 20, 50, etc. years and reflect on how well we reasoned in these cloudy times.

AGI Ruin: A List of Lethalities

I'm curious about why you decided it wasn't worth your time.

Going from the post itself, the case for publishing it goes something like "the whole field of AI Alignment is failing to produce useful work because people aren't engaging with what's actually hard about the problem and are ignoring all the ways their proposals are doomed; perhaps yelling at them via this post might change some of that."

Accepting the premises (which I'm inclined to), trying to get the entire field to correct course seems actually pretty valuable, maybe even worth a month of your time, now that I think about it.

Call For Distillers

Curated. I think this is a message that's well worth getting out there, and a write-up of a message I find myself telling people often. As more people are interested in joining the Alignment field, I think we should establish that this is a way people can start contributing. A suggestion here is that people can further flesh out LessWrong wiki-tag pages on AI (see the concepts page), and I'd be interested in building further framework on LessWrong to enable distillation work.

It Looks Like You're Trying To Take Over The World

Curated. I like fiction. I like that this story is fiction. I hope that all stories even at all vaguely like this one remain fiction.

Alignment research exercises

Curated. Exercises are crucial for the mastery of topics and the transfer of knowledge; it's great to see someone coming up with them for the nebulous field of Alignment.

More Is Different for AI

Here it is!

You might want to edit the description and header image.

More Is Different for AI

We can also make a Sequence. I assume "More Is Different for AI" should be the title of the overall Sequence too?