
Modeling Transformative AI Risk (MTAIR)

Jul 28, 2021 by David Manheim

This is a sequence outlining work by the Modeling Transformative AI Risk (MTAIR) project. The project is an attempt to map out the relationships between key hypotheses and cruxes involved in debates about catastrophic risks from advanced AI, and to convert that hypothesis/crux map into a software-based model that can incorporate probability estimates or other quantitative factors in ways that might be useful for exploration, planning, and/or decision support.

This series of posts presents a preliminary version of the model, along with a discussion of some of our plans going forward. The primary purpose of the sequence is to inform the community about our progress and, we hope, to contribute meaningfully to the ongoing discussion.

A critical secondary goal is to get feedback from the community on this model, as a form of expert engagement and elicitation. Although the project is still very much a work in progress, we believe we are now at the stage where we can productively solicit feedback, critiques, and suggestions from the community. We also welcome ideas for further collaboration with interested members of the community, particularly those working on related projects.

Posts in this sequence:

1. Modelling Transformative AI Risks (MTAIR) Project: Introduction (David Manheim, Aryeh Englander)
2. Analogies and General Priors on Intelligence (Issa Rice, Samuel Dylan Martin)
3. Paths To High-Level Machine Intelligence (Daniel_Eth)
4. Takeoff Speeds and Discontinuities (Samuel Dylan Martin, Daniel_Eth)
5. Modeling Risks From Learned Optimization (Ben Cottier)
6. Modeling the impact of safety agendas (Ben Cottier)
7. Modeling Failure Modes of High-Level Machine Intelligence (Ben Cottier, Daniel_Eth, Samuel Dylan Martin)
8. Elicitation for Modeling Transformative AI Risks (David Manheim)