Alignment research: 30
Could you share some breakdown for what these people work on? Does this include things like the 'anti-bias' prompt engineering?
It includes the people working on the kinds of projects I listed under the first misconception. It does not include people working on things like the mitigation you linked to. OpenAI distinguishes internally between research staff (who do ML and policy research) and applied staff (who work on commercial activities), and my numbers count only the former.
Thanks, that's very kind of you!
Is your argument about personnel overlap that one could do some sort of mixed effect regression, with location as the primary independent variable and controls for individual productivity? If so I'm so somewhat skeptical about the tractability: the sample size is not that big, the data seems messy, and I'm not sure it would capture necessarily the fundamental thing we care about. I'd be interested in the results if you wanted to give it a go though!
More importantly, I'm not sure this analysis would be that useful. Geography-based-priors only really seem us... (read more)
Thanks, fixed in both copies.
However upon reflection it does seem to be MIRI-affiliated so perhaps should have been affiliated; if I have time I may review and edit it in later.
Notice that in MIRI's summary of 2020 they wrote "From our perspective, our most interesting public work this year is Scott Garrabrant’s Cartesian frames model and Vanessa Kosoy’s work on infra-Bayesianism."
Hey Daniel, thanks very much for the comment. In my database I have you down as class of 2020, hence out of scope for that analysis, which was class of 2018 only. I didn't include 2019 or 2020 classes because I figured it takes time to find your footing, do research, write it up etc., so absence of evidence would not be very strong evidence of absence. So please don't consider this as any reflection on you. Ironically I actually did review one of your papers in the above - this one - which I did indeed think was pretty relevant! (Cntrl-F 'Hendrycks' to find the paragraph in the article). Sorry if this was not clear from the text.
See next year's post here.