Raymond Arnold

I've been a LessWrong organizer since 2011, with roughly equal focus on the cultural, practical and intellectual aspects of the community. My first project was creating the Secular Solstice and helping groups across the world run their own version of it. More recently I've been interested in improving my own epistemic standards and helping others to do so as well.


Some AI research areas and their relevance to existential safety

Curated, for several reasons.

I think it's really hard to figure out how to help with beneficial AI. Career and research paths vary in how likely they are to help, or harm, or fit together. I think many prominent thinkers in AI have developed nuanced takes on how to think about the evolving landscape, but often haven't written up those thoughts.

I like this post both for laying out a lot of object-level thoughts about that, and also for demonstrating a possible framework for organizing those object-level thoughts, and for doing it very comprehensively.  

I haven't finished processing all of the object-level points and am not sure which ones I endorse at this point. But I'm looking forward to debate on the various points here. I'd welcome other thinkers in the AI Existential Safety space writing up similarly comprehensive posts about how they think about all of this.

The Solomonoff Prior is Malign

Curated. This post does a good job of summarizing a lot of complex material, in a (moderately) accessible fashion.

Draft report on AI timelines

I'm assuming part of the point is that the LW crosspost still buries things in a hard-to-navigate Google Doc, which prevents it from easily getting cited or going viral, and Ajeya is asking/hoping for trust that they can get the benefit of some additional review from a wider variety of sources.

Forecasting Thread: AI Timelines


I think this was quite an interesting experiment in LW post format. Getting to see everyone's probability distributions in visual graph format felt very different from looking at a bunch of numbers in a list, or seeing them averaged together. I especially liked some of the weirder shapes of some people's curves.

This is a bit of an offbeat curation, but I think it's good to periodically highlight experimental formats like this.

What's a Decomposable Alignment Topic?

Am I correct that the real generating rule here is something like "I have a group of people who'd like to work on some alignment open problems, and want a problem that is a) easy to give my group, and b) easy to subdivide once given to my group?"

Will OpenAI's work unintentionally increase existential risks related to AI?

Fwiw I recently listened to the excellent song 'The Good Doctor', which has me quite delighted to get random Mega Man references.

Matt Botvinick on the spontaneous emergence of learning algorithms

(Flagging that I curated the post, but was mostly relying on Ben and Habryka's judgment, in part since I didn't see much disagreement. Since this discussion I've become more agnostic about how important this post is)

One thing this comment makes me want is more nuanced reacts, so that people have the affordance to communicate how they feel about a post in a way that's easier to aggregate.

Though I also notice that with this particular post it's a bit unclear what react would be appropriate, since it sounds like it's not "disagree" so much as "this post seems confused" or something.

Matt Botvinick on the spontaneous emergence of learning algorithms

The thing I meant by "catastrophic" is "leading to the death of the organism."

This doesn't seem like what it should mean here. I'd think catastrophic in the context of "how humans (programmed by evolution) might fail by evolution's standards" should mean "start pursuing strategies that don't result in many children or long-term population success" (where premature death of the organism might be one way to cause that, but not the only way).

Matt Botvinick on the spontaneous emergence of learning algorithms

Curated. [Edit: no longer particularly endorsed in light of Rohin's comment, although I have not yet really vetted Rohin's comment either and am currently agnostic on how important this post is]

When I first started following LessWrong, I thought the sequences made a good theoretical case for the difficulties of AI Alignment. In the past few years we've seen more concrete, empirical examples of how AI progress can take shape and how that might be alarming. We've also seen more concrete simple examples of AI failure in the form of specification gaming and whatnot. 

I haven't been following all of this in depth and don't know how novel the claims here are [fake edit: gwern notes in the comments that similar phenomena have been observed elsewhere]. But, this seemed noteworthy for getting into the empirical observation of some of the more complex concerns about inner alignment. 

I'm interested in seeing more discussion of these results, what they mean and how people think about them.

Our take on CHAI’s research agenda in under 1500 words

I'm wondering if the Rainforest thing is somehow tied to some other disagreements (between you/me or you/MIRI-cluster).

Something like: "the fact that it requires some interpretive labor to model the Rainforest as an agent in the first place" is related to why it seems hard to be helpful to humans, i.e. humans aren't actually agents. You get an easier starting ground since we have the ability to write down goals and notice inconsistencies in them, but that's not actually that reliable. We are not in fact agents, and we need to somehow build AIs that reliably seem good to us anyway.

(Curious if this feels relevant either to Rohin, or other "MIRI cluster" folk)
