Nathan Helm-Burger

AI alignment researcher, formerly ML engineer. Background in Neuroscience.


I think this is an excellent description of GPT-like models. It fits my observations and clarifies my thinking. It also leads me to examine, in a new light, questions that have been on my mind recently:

What is the limit of simulation ability that our current architectures (with some iterative improvements) can achieve when scaled up (via additional computation, improved datasets, etc.)?

Is a Simulator model really what we want? Can we trust its outputs to help us with things like accelerating alignment research? What might the failure modes look like?

This seems like a super handy intro for newcomers.

I recommend adding Jade Leung to your list of governance people.

As for the list of AI safety people, I'd add that there are some people who've written interesting and much-discussed content that would be worth having some familiarity with:

John Wentworth

Steven Byrnes

Vanessa Kosoy

And personally I'm quite excited about the school of thought developing under the 'Shard theory' banner.

For shard theory info: 

I'm excited to participate in this, and I feel that the mental exercise of exploring this scenario would be useful for my education in AI safety. Since I'm currently funded by a grant from the Long-Term Future Fund for reorienting my career toward AI safety, and I feel this would be a reasonable use of my time, you don't need to pay me. I'd be happy to be a full-time volunteer for the next couple of weeks.

Edit: I participated and was paid, but only briefly. It turns out I was too distracted thinking and talking about how the process could be improved, and about the larger implications for the world, to actually be useful as an object-level worker. I feel the experience was indeed useful for me, but not as useful to Beth as I had hoped. So... thanks and sorry!

Thanks, Rohin. I also feel that interviewing after 3 more months of independent work is probably the correct call.

I'm potentially interested in the Research Engineer position on the Alignment Team, but I'm currently 3 months into a 6-month grant from the LTFF to reorient my career from general machine learning to AI safety specifically. My current plan is to keep doing solo work until the last month of my grant period, then begin applying for AI safety work at places like Anthropic, Redwood Research, OpenAI, and DeepMind.

Do you think there's a significant advantage to applying soon rather than 3 months from now?