kave

Hello! I work at Lightcone and like LessWrong :-)

Comments

Curated. I feel like over the last few years my visceral timelines have shortened significantly. This is partly from contact with LLMs, particularly their increased coding utility, and a lot downstream of Ajeya's and Daniel's models and outreach (I remember spending an afternoon on an arts-and-crafts 'build your own timeline distribution' that Daniel had nerdsniped me with). I think a lot of people are in a similar position and have been similarly influenced. It's nice to get more details on those models and the differences between them, as well as to hear Ege pushing back with "yeah but what if there are some pretty important pieces that are missing and won't get scaled away?", which I hear from my environment much less often.

There are a couple of pieces of extra polish that I appreciate. First, having some specific operationalisations with numbers and distributions up-front is pretty nice for grounding the discussion. Second, I'm glad that there was a summary extracted out front, as sometimes the dialogue format can be a little tricky to wade through.

On the object level, I thought the focus on schlep in the Ajeya-Daniel section and slowness of economy turnover in the Ajaniel-Ege section was pretty interesting. I think there's a bit of a cycle with trying to do complicated things like forecast timelines, where people come up with simple compelling models that move the discourse a lot and sharpen people's thinking. People have vague complaints that the model seems like it's missing something, but it's hard to point out exactly what. Eventually someone (often the person with the simple model) is able to name one of the pieces that is missing, and the discourse broadens a bit. I feel like schlep is a handle that captures an important axis that all three of our participants differ on.

I agree with Daniel that a pretty cool follow-up activity would be an expanded version of the exercise at the end with multiple different average worlds.

As a general matter, Anthropic has consistently found that working with frontier AI models is an essential ingredient in developing new methods to mitigate the risk of AI.

What are some examples of work that is most largeness-loaded and most risk-preventing? My understanding is that interpretability work doesn't need large models (though I don't know about things like influence functions). I imagine constitutional AI does. Is that the central example, or are there other pieces that are further in this direction?

Curated. I am excited about many more distillations and expositions of relevant math on the Alignment Forum. There are a lot of things I like about this post as a distillation:

  • Exercises throughout. They felt like they were simple enough that they helped me internalise definitions without disrupting the flow of reading.
  • Pictures! This post made me start thinking of finite factorisations as hyperrectangles, and histories as dimensions that a property does not extend fully along.
  • Clear links from Finite Factored Sets to Pearl. I think these are roughly the same links made in the original, but they felt clearer and more orienting here.
  • Highlighting which of Scott's results are the "main" results (even more than the "Fundamental Theorem" name already did).
  • Magdalena Wache's engagement in the comments.

I do think the pictures became less helpful to me towards the end, and I thus have worse intuitions about the causal inference part. I'm also not sure about the emphasis of this post on causal rather than temporal inference. But I still love the post overall.

If you assume the human brain was trained roughly optimally, then requiring more data, at a given parameter count, to be optimal pushes timelines out. If instead you had a specific loss number in mind, then a more efficient scaling law would pull timelines in.
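To make the two framings concrete, here is a rough sketch. It assumes the common rule of thumb that training compute is about 6 × parameters × tokens, and it uses a toy single-power-law loss curve for the second framing. Every constant (the anchor parameter count, the tokens-per-parameter ratios, `c0`, `alpha`, the target loss) is a hypothetical placeholder chosen for illustration, not a figure from any paper or from the comment above.

```python
# Illustrative sketch only: all constants below are hypothetical placeholders.

def training_compute(params: float, tokens_per_param: float) -> float:
    """Approximate training FLOPs via the rule of thumb C ~ 6 * N * D."""
    tokens = tokens_per_param * params
    return 6.0 * params * tokens


def compute_for_loss(target_loss: float, c0: float, alpha: float) -> float:
    """Invert a toy power law L(C) = (c0 / C) ** alpha to get the compute for a target loss."""
    return c0 * target_loss ** (-1.0 / alpha)


ANCHOR_PARAMS = 1e14  # hypothetical "brain-sized" parameter count

# Framing 1: the parameter count is pinned by the anchor, so a scaling law
# that wants more tokens per parameter at the optimum needs more compute.
compute_low_ratio = training_compute(ANCHOR_PARAMS, tokens_per_param=2.0)
compute_high_ratio = training_compute(ANCHOR_PARAMS, tokens_per_param=20.0)
anchor_ratio = compute_high_ratio / compute_low_ratio

# Framing 2: the target loss is pinned, so a more efficient law
# (modelled here as a smaller c0) needs less compute to reach it.
TARGET_LOSS = 0.5
compute_old_law = compute_for_loss(TARGET_LOSS, c0=1e20, alpha=0.05)
compute_new_law = compute_for_loss(TARGET_LOSS, c0=5e19, alpha=0.05)

print(f"Fixed parameters: compute grows by a factor of {anchor_ratio:.1f}")
print(f"Fixed loss: compute shrinks to {compute_new_law / compute_old_law:.2f} of the old requirement")
```

In the first framing the compute requirement rises in proportion to the tokens-per-parameter ratio, which pushes timelines out. In the second, the same kind of update lowers the compute needed to hit the target, which pulls them in.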