Many technical alignment researchers are bad-to-mediocre at writing up their ideas and results in a form intelligible to other people. And even for those who are reasonably good at it, writing up a good intuitive explanation still takes a lot of work, and that work lengthens the turnaround time on publishing new results. For instance, a couple of months ago I wrote a post which formalized the idea of abstractions as redundant information, and argued that it’s equivalent to abstractions as information relevant at a distance. That post came out about two months after I had the rough math worked out, because it took a lot of work to explain it decently - and I don’t even think the end result was all that good an explanation! And I still don’t have a post which explains well why that result is interesting.
I think there’s a lot of potential space in the field for people who are good at figuring out what other researchers’ math is saying intuitively, and why it’s interesting, and then communicating that clearly - i.e. the skill of distillation. This post will briefly sketch out what two kinds of distillation roles might look like, what skills are needed, and how one might get started in such a role.
Two Distiller Roles
The two types of distiller role I’ll sketch are:
- “Independent” distiller: someone who works independently, understanding work published by other researchers and producing distillations of that work.
- “Adjunct” distiller: someone who works directly with one researcher or a small team, producing regular write-ups of what the person/team is thinking about and why.
These two roles add value in slightly different ways.
An independent distiller’s main value-adds are:
- Explaining the motivation and intended applications
- Coming up with new examples
- Boiling down the “key intuitive story” behind an argument
- Showing how the intuitive story fits into the context of the intended applications
I expect that coming up with novel examples and boiling down the core intuitive story behind a bunch of math are the rate-limiting skills here.
Rob Miles is a good example of an existing independent distiller in the field. He makes YouTube videos intuitively explaining various technical results and arguments. Rob’s work is aimed somewhat more at a popular audience than what I have in mind, but it’s nonetheless been useful for people in the field.
I expect an adjunct distiller’s main value-adds are:
- Writing up explanations, examples, and intuitions, similar to the independent distiller
- Saving time for the technical researcher/team, allowing more specialization
- Providing more external visibility/legibility into the research process and motivation
- Accelerating the research process directly by coming up with good examples and intuitive explanations
I expect finding a researcher/team to work with is the rate-limiting step to this sort of work.
Mark Xu is a good example of an existing adjunct distiller. He’s worked with both Evan Hubinger and Paul Christiano, and has written up decent distillations of some of their thoughts. I believe Mark did this with the aim of later doing technical research himself, rather than mostly being a distiller. That is a pretty good strategy and I expect it to be a common pathway, though naturally I expect people who aim to specialize in distillation long-term will end up better at distillation.
What Kind Of Skills Are Needed?
I expect the key rate-limiting skills are:
- Ability to independently generate intuitive examples when reading mathematical arguments, or having a mathematical discussion
- Ability to extract the core intuitive story from a mathematical argument
- Writing/drawing skills to clearly convey technical intuitions to a wider audience
- Ability to do most of the work of crossing the communication gap yourself - both so that researchers do not need to spend a lot of effort communicating to you, and so that readers do not need to spend a lot of effort understanding you
- For the adjunct role, ability to write decent things quickly and frequently without too much perfectionism
- For the independent role, ability to do all this relatively independently
How To Get Started
Getting started in an independent distiller role should be pretty straightforward: choose some research, and produce some distillations. It’s inherently a very legible job, so you should pretty quickly have some good example pieces which you could showcase in a grant application (e.g. to the Long-Term Future Fund or FTX Future Fund). That said, bear in mind that you may need some practice before you actually start to produce very good distillations.
An adjunct role is more difficult, because you need someone to work with. Obvious advice: just asking people is an underutilized strategy, and works surprisingly well. Be sure to emphasize your intended value-add to the researcher(s). If you want to prove yourself a bit before reaching out, independently distilling some of a researcher’s existing public work is another obvious step. You might also try interviewing a researcher on some part of their work, and then distilling that, in order to get a better feel for what it would be like to work together before actually committing.
I think I weakly disagree with the implication that “distillation” should be thought of as a different category of activity from “original research”. It is in a superficial sense, but a lot of the underlying activities and skills and motivations overlap. For example, original researchers also have the experience of reading something, feeling confused about it, and then eventually feeling less confused about it. They just might not choose to spend the time writing up how they came to be less confused. Conversely, someone trying to understand something for the purpose of pedagogy may notice a mistake in the original, or that the original is outright wrong, which is original research.
I guess if I were writing something-like-this-post, I would frame it as:
(Maybe other things too.)
For my part I've spent much of the last five months on a #3 project, and I think that was the right call for my particular situation—I suspect that I learned more through writing those things than anyone else will by reading them. I also spent the better part of a month on a #2 project, and also found it a good use of time. And the very first thing I did when I decided to learn about the field was spend a few months creating unoriginal pedagogy. It was a great way to learn. :)
I agree, i.e. I also (fairly weakly) disagree with the value of thinking of 'distilling' as a separate thing. Part of me wants to conjecture that it comes from thinking of alignment work predominantly as mathematics or a hard science in which the standard 'unit' is an original theorem or original result which might be poorly written up but can't really be argued against much. But if we think of the area (I'm thinking predominantly about more conceptual/theoretical alignment) as a 'softer', messier, ongoing discourse full of different arguments from different viewpoints and under different assumptions, with counter-arguments, rejoinders, clarifications, retractions etc. that takes place across blogs, papers, talks, theorems, experiments etc. that all somehow slowly works to produce progress, then it starts to be less clear what this special activity called 'distilling' really is.
Another relevant point, but one which I won't bother trying to expand on much here, is that a research community assimilating - and then eventually building on - complex ideas can take a really long time.
[At risk of extending into a rant, I also just think the term is a bit off-putting. Sure, I can get the sense of what it means from the word and the way it is used - it's not completely opaque or anything - but I'd not heard it used regularly in this way until I started looking at the alignment forum. What's really so special about alignment that we need to use this word? Do we think we have figured out some new secret activity that is useful for intellectual progress that other fields haven't figured out? Can we not get by using words like "writing" and "teaching" and "explaining"?]
Looking for "distillers" / happy to pay for this work.
Also, "distillation" seems like the wrong name. What's often needed seems to be more like dilution & blending - I can often describe the core of an idea in a few sentences, but the inferential steps required from the reader are then too large, or rely on knowledge unknown to many readers.
Curated. I think this is a message that's well worth getting out there, and a write-up of a message I find myself telling people often. As more people are interested in joining the Alignment field, I think we should establish that this is a way people can start contributing. A suggestion here is that people can further flesh out LessWrong wiki-tag pages on AI (see the concepts page), and I'd be interested in building further framework on LessWrong to enable distillation work.
I'd be excited to see more of this happening.
It reminds me of the recent job posting from Abram, Vanessa and Diffractor, which seems to be for an adjunct distiller role for Infra-Bayesianism, though they use different terms.