Wiki Contributions


This post seems like it was quite influential. This is basically a trivial review to allow the post to be voted on.


Alignment research: 30

Could you share some breakdown for what these people work on? Does this include things like the 'anti-bias' prompt engineering?


Is your argument about personnel overlap that one could do some sort of mixed effect regression, with location as the primary independent variable and controls for individual productivity? If so I'm so somewhat skeptical about the tractability: the sample size is not that big, the data seems messy, and I'm not sure it would capture necessarily the fundamental thing we care about. I'd be interested in the results if you wanted to give it a go though!

More importantly, I'm not sure this analysis would be that useful. Geography-based-priors only really seem useful for factors we can't directly observe; for an organization like CHAI our direct observations will almost entirely screen off this prior. The prior is only really important for factors where direct measurement is difficult, and hence we can't update away from the prior, but for those we can't do the regression. (Though I guess we could do the regression on known firms/researchers and extrapolate to new unknown orgs/individuals).

The way this plays out here is we've already spent the vast majority of the article examining the research productivity of the organizations; geography based priors only matter insomuchas you think they can proxy for something else that is not captured in this.

As befits this being a somewhat secondary factor, it's worth noting that I think (though I haven't explicitly checked) in the past I have supported bay area organisations more than non-bay-area ones.   

  • I prioritized posts by named organizations.
    • Diffractor does not list any institutional affiliations on his user page.
    • No institution I noticed listed the post/sequence on their 'research' page.
    • No institution I contacted mentioned the post/sequence.
  • No post in the sequence was that high in the list of 2021 Alignment Forum posts, sorted by karma.
  • Several other filtering methods also did not identify the post

However upon reflection it does seem to be MIRI-affiliated so perhaps should have been affiliated; if I have time I may review and edit it in later.


Hey Daniel, thanks very much for the comment. In my database I have you down as class of 2020, hence out of scope for that analysis, which was class of 2018 only. I didn't include 2019 or 2020 classes because I figured it takes time to find your footing, do research, write it up etc., so absence of evidence would not be very strong evidence of absence. So please don't consider this as any reflection on you. Ironically I actually did review one of your papers in the above - this one - which I did indeed think was pretty relevant! (Cntrl-F 'Hendrycks' to find the paragraph in the article). Sorry if this was not clear from the text.

Load More