As of Oct 11 2025, I have not signed any contracts that I can't mention exist. I'll try to update this statement at least once a year, so long as it's true. I added this statement thanks to the one in the gears to ascension's bio.
Something you didn't mention, but which I think would be a good idea, would be to create a scientific proof of concept of this happening. E.g. take a smallish LLM that is not doom-y, finetune it on doom-y documents like Eliezer Yudkowsky writing or short stories about misaligned AI, and then see whether it became more misaligned.
I think this would be easier to do, and you could still test methods like conditional pretraining and gradient routing.
I mostly believe this. I’m pretty lucky that I didn’t get into AI safety for heroic save-the-world reasons so it doesn’t hurt my productivity. I currently work on research aimed at reducing s-risk at CLR.
Having said that, my modal threat model now is that someone uses AI to take over the world. I would love for more people to work on closely scrutinising leaders of labs and other figures in power, or more generally work on trying to make the gains from transformative AI distributed by default