Hey, I'm Owen. I think rationality is pretty rad.


Search versus design

I mostly focused on the interpretability section as that's what I'm most familiar with, and I think your criticisms are very valid. I also wrote up some thoughts recently on where post-hoc interpretability fails, and Daniel Filan has some good responses in the comments below.

Also, re: disappointment on tree regularization, something that does seem more promising is Daniel Filan and others at CHAI working on investigating modularity in neural nets. You can probably ask him more, but last time we chatted, he also had some thoughts (unpublished) on how to enforce modularization as a regularizer, which seems to be what you wished the tree reg paper would have done.

Overall, this is great stuff, and I'll need to spend more time thinking about the design vs search distinction (which makes sense to me at first glance).

Will OpenAI's work unintentionally increase existential risks related to AI?

Some OpenAI people are on LW. It'd be interesting to hear their thoughts as well.

Two general things which have made me less optimistic about OpenAI are that:

  1. They recently spun out a capped-profit company, which suggests the end goal is monetizing some of their recent advancements. The page linked in the previous sentence also has some stuff about safety and about how none of their day-to-day work is changing, but it doesn't seem that encouraging.

  2. They've recently partnered up with Microsoft, presumably for product integration. This seems like it positions them as less of a neutral entity, especially as Alphabet owns DeepMind.

Forecasting AI Progress: A Research Agenda

I hadn't heard of the Delphi method before, so this paper brought it to my attention.

It's nice to see concrete forecasting questions laid out in a principled way. Now the perhaps harder step is trying to get traction on them ;^).

Note: The tables on pages 9 and 10 are a little blurry to read. They are also images rather than text, so it's not easy to copy-paste them into another format for better viewing. I think it'd be good to update the images to either be clearer or translate them into text tables.

Tessellating Hills: a toy model for demons in imperfect search

Hi, thanks for sharing and experimentally trying out the theory in the previous post! Super cool.

Do you have the code for this up anywhere?

I'm also a little confused by the training procedure. Are you just instantiating a random vector and then doing GD with respect to the loss function you defined? Do the charts show the loss averaged over many random vectors (and function variants)?
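
To make the question concrete, here is a minimal sketch of the procedure I have in mind. Everything in it is my own assumption: `toy_loss` is a stand-in quadratic (not the loss from the post), and `run_gd`, `averaged_curve`, and the hyperparameters are hypothetical names and values chosen only for illustration.

```python
import random


def toy_loss(x):
    # Stand-in loss (NOT the one from the post): a simple quadratic bowl.
    return sum(xi * xi for xi in x)


def toy_grad(x):
    # Analytic gradient of the stand-in quadratic loss.
    return [2.0 * xi for xi in x]


def run_gd(dim, steps, lr, rng):
    # Start from a random vector, then take plain gradient descent steps,
    # recording the loss before the first step and after every step.
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    losses = [toy_loss(x)]
    for _ in range(steps):
        g = toy_grad(x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
        losses.append(toy_loss(x))
    return losses


def averaged_curve(n_runs, dim=16, steps=50, lr=0.1, seed=0):
    # Average the per-step loss over several independent random starts.
    rng = random.Random(seed)
    curves = [run_gd(dim, steps, lr, rng) for _ in range(n_runs)]
    return [sum(c[t] for c in curves) / n_runs for t in range(steps + 1)]
```

If the charts are the equivalent of `averaged_curve(n_runs)` for some number of runs, that would answer the second question; if they show a single `run_gd` trajectory, that would be good to know too.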