(Minor update to change Steve's labelling following this comment, and also because I realized that I never added the footnotes...)
This post is part of the work done at Conjecture.
In Old Masters and Young Geniuses, economist-turned-art-data-analyst David Galenson investigates a striking regularity in the careers of painters: art history and markets favor either their early pieces or the complete opposite, their last ones. From this pattern and additional data, Galenson extracts and defends a separation of creatives into two categories, two extremes of a spectrum: conceptual innovators (remembered as young geniuses) and experimental innovators (remembered as old masters).
Conceptual innovators, like painter Pablo Picasso, start with a crystal-clear idea of their goal, and spend tremendous amounts of time in preliminary research, preparatory drawing and all-around planning. They then mostly stick to these extensive plans (or have them executed by others) when concretely creating the output. As such, their most impressive innovations generally come from their groundbreaking ideas, when they know and have done so little that they can simplify and break all the rules they haven't yet learned: they're young geniuses. Galenson provides additional examples: artist Andy Warhol, novelist Herman Melville, movie director Orson Welles, scientist Albert Einstein...
On the other hand, experimental innovators, like painter Paul Cézanne, only figure out their aim by relentless trial-and-error, making something up and then iterating on it. Their intuitions start vague and their goals cloudy, leading to their perpetual uncertainty and doubt about having accomplished what they wanted. Yet because experimental innovators keep on refining their attempts, and because they build on all that happened before, their best output (measured by metrics like auction prices, bestseller lists, mentions in textbooks) emerges towards the end of their lives: they're old masters. Some additional examples given by Galenson are: abstract painter Jackson Pollock, novelist Virginia Woolf, movie director Alfred Hitchcock, and scientist Charles Darwin.
Galenson's distinction jolted my mind and turned it racing with questions: in which box did I fall? And others in alignment? What were the consequences for accelerating science in general and alignment in particular? What are the limits of the distinction?
Unfortunately, the painter-based distinction proved unwieldy to analyze conceptual research, my main application. Yet there was definitely something there...
This post thus follows my process in carving a frame out of the old master/young genius template which can be applied to heady topics like alignment and epistemology, and illuminates both my own mistakes with regard to my work, and the different styles of alignment researchers. My frame applies to research outputs rather than to researchers, and separates them into mosaics (works which fit within a clear, simple, trimmed-down structure, like colored tiles in a mosaic), and palimpsests (works which iterate on previous ideas by shifting and altering them, rewriting on top like a palimpsest).
Epistemic status: Like most of my work, this post fits into a palimpsest rather than a mosaic (see definitions below). As such, it is but one step of an iterative process. I currently agree with it; I will find it limited or inadequate soon, and will probably build something new on top (or with its ashes).
Going back to Galenson's conceptual and experimental innovators for a moment — what was the problem? For my purposes, I had to confront three issues.
This led me to the following corrections to Galenson's distinction:
Taking the conceptual vs experimental innovators frame through these two transformations gives us "mosaics vs palimpsests".
A mosaic (from the ancient art form) is a set of research works that all fit within a simple, explicit, preordained structure, like tiles in a mosaic. Whether complete or in progress, mosaics exude coherence and reshape the object of investigation into a form that makes perfect sense. Thus new research within a mosaic builds on top of the previous works, rather than over.
This underlying structure might be quite local (just assumptions and framing about a subproblem) or amount to a whole paradigm.
A palimpsest (from the medieval manuscripts which monks wrote over) is a set of research works which follows a process of iteration without clear initial vision. The sequence of successive ideas reveals the refinement, correction, and wholesale changes of the initial attempt, in order to capture the elusive intuitions that prompted the work.
Then individuals might lean (with different levels of strength) towards one shape of research over another, which leads to an analogue of the distinction between conceptual innovators/young geniuses vs experimental innovators/old masters. Yet focusing on the shape of research rather than the people lets us accommodate those thinkers who regularly alternate, or consider the possibility of dependencies between the two (maybe mosaics require initial palimpsests).
Now, to refine the broad strokes of these two shapes, let's look at examples from conceptual alignment research.
The Sequences (and really everything written by Eliezer) provide the most obvious instance of mosaic in conceptual alignment. All the posts build on each other, often in subtle and implicit ways, and the worldview shared there stays coherent through and through. It even fits perfectly with both more modern works of Eliezer (like Security Mindset and Ordinary Paranoia) and older ones (like the AI Foom Debate).
On the opposite corner, most of Paul's work basks in a definite palimpsest shape. His wealth of blog posts keeps coming back to the same topics (indirect normativity, amplification, universality...), each time explored from a slightly different angle, or correcting an inadequacy in the previous formulation. Even Paul's research methodology draws the shape of a palimpsest!
(As a side note, this feels like a core reason for the notorious difficulty of Paul's writing, another being the theoretical computer science framing, which is not that common here. Most of his posts and papers don’t give a self-contained, up-to-date and coherent perspective on Paul's approach. Instead, he has many fascinating iterations that get into seemingly nitpicky details (which matter a lot for the iteration) and require having internalized most of the previous steps of iteration. Yet once you have, you find that he writes in a clear and direct fashion.)
Back to example-listing. On the mosaic-side, here is a non-exhaustive list:
And here is the palimpsest-shaped one:
In the middle ground, I place, for example, Vanessa's work, notably InfraBayes. On the one hand, most results (for example InfraBayes Physicalism) fit cleanly within a structure related to InfraBayes; on the other hand, my model of Vanessa and Diffractor's process includes many unplanned results, with potential alterations of the focus and aims of the research.
Or to say it differently, InfraBayes feels like iterating on the consequences and uses of some mathematical formalization with a vague aim, thus in between mosaics and palimpsests.
Whenever I get some interesting idea like mosaics and palimpsests, I automatically search how to apply it to myself. Although Galenson's original distinction confused me when I tried to apply it to my own creative process (as already mentioned), mosaics and palimpsests let me name one of my main failure modes: I keep wanting to make mosaics, when my intuitions and my personality clearly predispose me to palimpsests.
I can now see and name the urge, the pull towards writing something clean, nice, powerful, final. I can see my admiration (often mixed with jealousy) for The Ground of Optimization, for John's posts and sequences, for Scott's beautiful and crisp formalisms, for mosaics in general. And how it colored my own constant frustration with my own work, which never was as structured, which always rebelled against the inadequate structure, which never wanted to stop shifting just for once.
And the irony is that the best things I produced, the ideas that people around me leverage the most — epistemic strategies, productive mistakes, epistemological vigilance for alignment, unbounded atomic optimization — all clearly fit into patterns of palimpsests. Some of them are even different steps of the same palimpsest!
Palimpsests just fit my way of thinking: I cannot stop investigating and doubting the assumptions and the simplifications that I make; and my work doesn't cleanly build on itself, but rewrites itself and branches off in sprawling fashion.
Thus aiming for mosaics created the conditions for my own unproductivity and misery. Instead of iterating faster and sharing the steps of my palimpsests, I ended up stressing myself out to complete massive projects of superb structure, before realizing that none of the pieces fitted together. And praising structure above all else, I was unable to acknowledge the productivity of these bad fits: the interesting part was the revealed inadequacy.
Somehow I managed to be the advocate of productive mistakes and avoid them in my own work. Do as I say, not as I do, right?
One of Galenson's avowed goals is to highlight a cultural trend (in painting, art, entrepreneurship, and science) that favors the dashing masterpiece which springs fully formed (conceptual innovation/mosaic) over the less legible and longer matured result of an iterative process (experimental innovation/palimpsest).
And he has a point. In myself and in others around me, I now see a clear bias towards mosaics. This makes perfect sense: mosaics look good from the start, as they explicitly hand you a compressed structure with which to see the world, whereas palimpsests merely gesture at vague intuitions, half-baked arguments, and weird framings.
Or without so many words, mosaics just look cooler.
Out of all the negative effects that come from this bias, the worst are probably:
Palimpsest-oriented researchers are not the only ones to suffer from this bias, though: even someone who focuses on mosaics deals with the effect of the idolization of mosaics. Namely, mosaics often don't get productive pushback and criticism.
Indeed, reactions to mosaics tend to fit into one of three categories:
In short, mosaics polarize, which is great for attention and status-building, but makes it harder to see them as productive mistakes and thus to separate the brilliant insights from the misguided assumptions.
But maybe the bias gets something right — maybe we really only want mosaics. Just like asking which one is better, this feels like missing the point: it's quite obvious that both contribute to a research field. Mosaics leverage the power of normal science to explore particularly productive assumptions, whereas palimpsests question these assumptions and clear the messy jungle of details and confusion around complex ideas and notions. There might be incredibly rare cases where only one makes sense, but no field of science (certainly not alignment) currently has this property.
With all that said, I don't really know how to correct this bias. I feel like LW and the AF already have a partial community-wide acknowledgment of it, given the repeated encouragement to write more process-giving and iterative posts. In my experience, though, this hasn't translated that well into more karma and comments for the people writing such posts.
This is one of the open problems around mosaics and palimpsests.
This list reflects my own interpretation, and any pushback — especially from the mentioned researchers — would be interesting.
Note the pattern that while mosaic-shaped research points to specific works, palimpsest-shaped research is harder to isolate from all the work of the author.
This might seem weird given that Evan cowrote the mosaic-looking Risks from Learned Optimization. Yet based on my interactions with him, Evan iterates a lot on his ideas, encourages others to do so, and changed his framing of multiple points of Risks — while still keeping his aim at deception.
I think this one is correct but more subtle. I go into some detail in this comment.
This one still needs to be published, now that I finally realized I shouldn't aim for mosaics.
One clear exception: Paul is palimpsest-oriented yet is widely read and deservedly gets a lot of karma and comments.
John makes a similar distinction in this comment — palimpsests are rarely ready for "Come at me bro" type feedback.
Once again, there are notable exceptions: much of the work on alignment rests on productive disagreement with the initial mosaic of Eliezer, that is, disagreements that acknowledge the core problem while refusing or adapting key assumptions of the original frame. See for example this recent post by Paul for an explicit exploration of such productive disagreement.
Yet not all mosaics have to be written that way: Eliezer almost always prefers a literary style for example.
Funny that you classify me as a mosaic-ist; from my perspective, ever since I first got interested in neuroscience + AGI, I’ve kept writing and rewriting certain ideas about that topic, over and over, each time with fewer mistakes and better framings (I’d like to think). Feels palimpsest-y. For example, this is the second post I ever wrote on neuroscience, and is obviously a very very confused rough draft of stuff that I was (re-re-re-)writing about two months ago, and which I expect to rewrite yet again in the future.
Thanks for the comment!
To be honest, I had more trouble classifying you, and now that you commented, I think you're right that I got the wrong label. My reasoning was that your agenda and directions look far more explicit and precise than Paul's or Evan's, which is definitely a more mosaic-y trait. On the other hand, there is the iteration that you describe, and I can clearly see a difference in terms of updating between you and, let's say, John or Eliezer.
My current model is that you're more palimpsest-y, but compared with most of us, you're surprisingly good at making your current iteration fit into a proper structure that you can make explicit and legible.
(Will update the post in consequence. ;) )