Running Lightcone Infrastructure, which runs LessWrong and Lighthaven.space. You can reach me at habryka@lesswrong.com.
(I have signed no contracts or agreements whose existence I cannot mention, which I am mentioning here as a canary)
Hmm, you have a very different read of Richard's message than I do. I agree Miles' statement did not reason through safety policies, but IMO his blogging since then has included a lot of harsh words for OpenAI, in a way that at least to me made the connection clear (and I think also to many others, but IDK, it's still doing some tea-leaf reading).
My impression is that few (one or two?) of the safety people who have quit a leading lab did so to protest poor safety policies, and of those few none saw staying as a viable option.
While this isn't amazing evidence, my sense is there have been around 6 people who quit who in-parallel to them announcing their leave called out OpenAI's reckless attitude towards risk (at various levels of explicitness, but quite strongly in all cases by standard professional norms).
It's hard to say that people quit "to protest safety policies", but they definitely used their leaving to protest safety policies. My sense is almost everyone who left in the last year (Daniel, William, Richard, Steven Adler, Miles) did so with a pretty big public message.
I think the core argument is "if you want to slow down, or somehow impose restrictions on AI research and deployment, you need some way of defining thresholds. Also, most policymaker's cruxes appear to be that AI will not be a big deal, but if they thought it was going to be a big deal they would totally want to regulate it much more. Therefore, having policy proposals that can use future eval results as a triggering mechanism is politically more feasible, and also, epistemically helpful since it allows people who do think it will be a big deal to establish a track record".
I find these arguments reasonably compelling, FWIW.
Sure, here are some things:
I can probably think of some more.
(For what it's worth, it appears to me that people started using the term "scheming" in much more confusing and inconsistent ways after this post was written and tried to give that term a technical meaning. I currently think this was quite bad. I do like a lot of the content of the paper/essay/post. I have like one conversation every two weeks that ends up derailed or confused because the two participants are using "scheming" in different specific ways, assuming the other person has the same meaning in mind)
How are the triangle numbers not quadratic?
Sure looks quadratic to me.
This essay seems to have lost the plot of where the problems with AI come from. I was historically happy that Conjecture focused on the parts of AI development that are really obviously bad, like having a decent chance of literally killing everyone or permanently disempowering humanity, but instead this seems like it's a random rant against AI-generated art, and name-calling of obviously valuable tools like AI coding assistants .
I am not sure what happened. I hope you find the plot again.
(Edit note: I fixed up some formatting that looked a bit broken or a bit confusing. Mostly replacing some manual empty lines with "*" characters with some of our proper horizontal rule elements, and removing italics from the executive summary, since our font is kind of unreadable if you have whole paragraphs of italicized text. Feel free to revert)
(I meant the more expansive definition. Plausible that me and Zac talked past each other because of that)