How To Get Into Independent Research On Alignment/Agency

[-]Steven Byrnes4y*170

the sort of person who this post is already aimed at (i.e. people who are excited to forge their own path in a technical field where everyone is fundamentally confused) is probably not the sort of person who is aiming for minor contributions anyway.

For me, there were two separate decisions. (1) Around March 2019, having just finished my previous intense long-term internet hobby, I figured my next intense long-term internet hobby was gonna be AI alignment; (2) later on, around June 2020, I started trying to get funding for full-time independent work. (I couldn't work at an org because I didn't want to move to a different city.)

I want to emphasize that at the earlier decision-point, I was absolutely "aiming for minor contributions". I didn't have great qualifications, or familiarity with the field, or a lot of time. But I figured that I could eventually get to a point where I could write helpful comments on other people's blog posts. And that would be my contribution!

Well, I also figured I should be capable of pedagogy and outreach. And that was basically the first thing I did—I wrote a little talk summarizing the field for newbies, and gave it to one audience, and tried and failed to give it to a second audience.

(I find it a lot easier to "study topic X, in order to do Y with that knowledge", compared to "study topic X" full stop. Just starting out on my new hobby, I had no Y yet, so "giving a pedagogical talk" was an obvious-to-me choice of Y.)

Then I had some original ideas! And blogged about them. But they turned out to be bad.

Then I had different original ideas! And blogged about them in my free time for like a year before I applied for LTFF.

…and they rejected me. On the plus side, their rejection came with advice about exactly what I was missing if I wanted to reapply. On the minus side, the advice was pretty hard to follow, given my time constraints. So I started gradually chipping away at the path towards getting those things done. But meanwhile, my rejected LTFF application got forwarded around, and I got a grant offer from a different source a few months later (yay).

With that background, a few comments on the post:

I wrote a fair bit on LessWrong, and researched some agency problems, even before quitting my job. I do expect it helps to “ease into it” this way, and if you’re coming in fresh you should probably give yourself extra time to start writing up ideas, following the field, and getting feedback.

I also went down the "ease into it" path. It's especially (though not exclusively) suitable for people like me who are OK with long-term intense internet hobbies. (AI alignment was my 4th long-term intense internet hobby in my lifetime. Probably last. They are frankly pretty exhausting, especially with a full-time job and kids.)

Probably the most common mistake people make when first attempting to enter the alignment/agency research field is to not have any model at all of the main bottlenecks to alignment, or how their work will address those bottlenecks.

Just to clarify:

This quote makes sense to me if you read "when first attempting to enter the field" as meaning "when first attempting to enter the field as a grant-funded full-time independent researcher".

On the other hand, when you're first attempting to learn about and maybe dabble in the field, well obviously you won't have a good model of the field yet.

One more thing:

the sort of person who this post is already aimed at (i.e. people who are excited to forge their own path in a technical field where everyone is fundamentally confused) is probably not the sort of person who is aiming for minor contributions anyway.

If you're a kinda imposter-syndrome-y person who just constitutionally wouldn't dream of looking themselves in the mirror and saying "I am aiming for a major contribution!", well me too, and don't let John scare you off. :-P

I can attest that it’s an awesome job.

I agree!

[-]AlexMennen4y110

This post claims that having the necessary technical skills probably means grad-level education, and also that you should have a broad technical background. While I suppose these claims are probably both true, it's worth pointing out that there's a tension between them, in that PhD programs typically aim to develop narrow skillsets, rather than broad ones. Often the first year of a PhD program will focus on acquiring a moderately broad technical background, and then rapidly get progressively more specialized, until you're writing a thesis, at which point whatever knowledge you're still acquiring is highly unlikely to be useful for any project that isn't very similar to your thesis.

My advice for people considering a PhD as preparation for work in AI alignment is that only the first couple years should really be thought of as preparation, and for the rest of the program, you should be actually doing the work that the beginning of the PhD was preparation for. While I wouldn't discourage people from starting a PhD as preparation for work in AI alignment, I would caution that finishing the program may or may not be a good course of action for you, and you should evaluate this while in the program. Don't end up like me, a seventh-year PhD student working on a thesis project highly unlikely to be applicable to AI alignment despite harboring vague ambitions of working in the field.

[-]johnswentworth4y40

Strong agree. A lot of the technical material which I think is relevant is typically not taught until the grad level, but that does not mean that actually finishing a PhD program is useful. Indeed, I sometimes joke that dropping out of a PhD program is one of the most widely-recognized credentials by people currently in the field - you get the general technical background skills, and also send a very strong signal of personal agency.

[-]Rob Bensinger4y110

I love this post. Thanks, John.

[-]Koen.Holtman4y70

As nobody else has mentioned it yet in this comment section: AI Safety Support is a resource hub specifically set up to help people get into the alignment research field.

I am a 50-year-old independent alignment researcher. I guess I need to mention for the record that I never read the sequences, and do not plan to. The piece of Yudkowsky writing that I'd recommend everybody interested in alignment should read is Corrigibility. But in general: read broadly, and also beyond this forum.

I agree with John's observation that some parts of alignment research are especially well-suited to independent researchers, because they are about coming up with new frames/approaches/models/paradigms/etc.

But I would like to add a word of warning. Here are two somewhat equally valid ways to interpret LessWrong/Alignment Forum:

It is a very big tent that welcomes every new idea
It is a social media hang-out for AI alignment researchers who prefer to engage with particular alignment sub-problems and particular styles of doing alignment research only.

So while I agree with John's call for more independent researchers developing good new ideas, I need to warn you that your good new ideas may not automatically trigger a lot of interest or feedback on this forum. Don't tie your sense of self-worth too strongly to this forum.

On avoiding bullshit: discussions on this forum are often a lot better than on some other social media sites, but Sturgeon's law still applies.

[-]Raemon4y50

Curated. This post matched my own models of how folk tend to get into independent alignment research, and I've seen some people whose models I trust more endorse the post as well. Scaling good independent alignment research seems very important.

I do like that the post also specifies who shouldn't be going to independent research.

[-]Raemon3y30 Review for 2021 Review

I'd ideally like to see a review from someone who actually got started on Independent Alignment Research via this document, and/or grantmakers or senior researchers who have seen up-and-coming researchers who were influenced by this document.

But, from everything I understand about the field, this seems about right to me, and seems like a valuable resource for people figuring out how to help with Alignment. I like that it both explains the problems the field faces and lays out some of the realpolitik of getting grants.

Actually, rereading this, it strikes me as a pretty good "intro to the John Wentworth worldview", weaving a bunch of disparate posts together into a clear frame.

[-][anonymous]4y00

Hi John, thanks a lot.

Your posts are coming at the perfect time. I just gave my notice at my current job, and I have about 3 years of runway ahead of me in which I can do whatever I want. I should definitely at least evaluate AI Safety research. My background is a bachelor's in AI (that's a thing in the Netherlands). The little bits of research I did try got good feedback.

Even though I'm in a great position to try this, it still feels like a huge gamble. I'm aware that a lot of AI Safety research is already of questionable quality. So my question is: how can I determine as quickly as possible whether I'm cut out for this?

Not just asking to reduce financial risk, but also because I feel like my learning trajectory would be quite different if I already knew that it was going to work out in the long run. I'd be able to study the fundamentals a lot more before trying research.

[-]Koen.Holtman4y30

I'm aware that a lot of AI Safety research is already of questionable quality. So my question is: how can I determine as quickly as possible whether I'm cut out for this?

My key comment here is that, to be an independent researcher, you will have to rely day-by-day on your own judgement on what has quality and what is valuable. So do you think you have such judgement and could develop it further?

To find out, I suggest you skim a bunch of alignment research agendas, or research overviews like this one, and then read some abstracts/first pages of papers mentioned in there, while trying to apply your personal, somewhat intuitive judgement to decide:

which agenda item/approach looks most promising to you as an actual method for improving alignment
which agenda item/approach you feel you could contribute most to, based on your own skills.

If your personal intuitive judgement tells you nothing about the above questions, if it all looks the same to you, then you are probably not cut out to be an independent alignment researcher.
