This past semester, HAIST and MAIA (the Harvard and MIT AI safety student groups) ran an adapted version of Richard Ngo's AGI Safety Fundamentals alignment curriculum. This adaptation – which consists of eight two-hour meetings, with all readings done during the meeting – is now available on the AGISF website.
In this post, we discuss the adapted curriculum and its intended use, and we recommend that other in-person reading groups following AGISF use this adaptation.[1]
The adapted curriculum was made by refining a slightly rougher first adaptation, with significant help from Richard Ngo and feedback from participants. The key differences between the adapted curriculum and the mainline AGISF alignment curriculum are:
The way that HAIST and MAIA used this curriculum, and the way we recommend other groups use it, is:
We note that this format introduces some new challenges, especially when there are slower readers.
To help with some of these challenges, Sam prepared a guide for HAIST and MAIA facilitators that included recommended discussion times, points of discussion, and advice about which readings to cut if necessary. That facilitator guide was for an outdated version of the curriculum, but we hope to have an updated facilitator guide in the next few weeks. We don’t want to make these public, but feel free to reach out to smarks@math.harvard.edu if you’re running a reading group and are interested in seeing the old or forthcoming facilitator guides.
Sam and Xander generally felt that the in-session readings format worked better than the take-home readings format, which HAIST used for an AGISF reading group in spring 2022. In particular:
Empirically, participants also thought that HAIST reading groups went well. Participants gave the program an overall rating of 8.6 on a 0-10 scale, and when asked “To what extent are you considering a career in AI safety?” the average response was 7.4, up from 4.7 before the program (though part of this movement was likely due to selection effects[2]).
Of course, the curriculum isn’t the only factor affecting how well a reading group goes. We note that smooth operations, facilitator quality, a high admissions bar, and printing the readings (as an alternative to reading on devices) also seemed important for things going well at HAIST.[3]
We don’t know whether the in-session readings format would work well for reading groups that meet virtually. If anyone experiments with this, we’d be very interested in hearing how it goes; you can contact Sam at smarks@math.harvard.edu.
[2] The start-of-program and end-of-program surveys had response rates around 45% and 65%, respectively, with an attrition rate of around 40%.
[3] See the HAIST update post for some relevant advice.