By Peter S. Park, Simon Goldstein, Aidan O’Gara, Michael Chen, and Dan Hendrycks [This post summarizes our new report on AI deception, available here] Abstract: This paper argues that a range of current AI systems have learned how to deceive humans. We define deception as the systematic inducement of false...
This is a draft written by Simon Goldstein, associate professor at the Dianoia Institute of Philosophy at ACU, and Pamela Robinson, postdoctoral research fellow at the Australian National University, as part of a series of papers for the Center for AI Safety Philosophy Fellowship's midpoint. Abstract: We propose developing AIs...
This post was written by Simon Goldstein, associate professor at the Dianoia Institute of Philosophy at ACU, and Cameron Domenico Kirk-Giannini, assistant professor at Rutgers University, for submission to the Open Philanthropy AI Worldviews Contest. Both authors are currently Philosophy Fellows at the Center for AI Safety. Abstract: Recent advances...
This is a draft written by Cameron Domenico Kirk-Giannini, assistant professor at Rutgers University, and Simon Goldstein, associate professor at the Dianoia Institute of Philosophy at ACU, as part of a series of papers for the Center for AI Safety Philosophy Fellowship's midpoint. Dan helped post to the Alignment Forum....
This is a draft written by Simon Goldstein, associate professor at the Dianoia Institute of Philosophy at ACU, as part of a series of papers for the Center for AI Safety Philosophy Fellowship. Dan helped post to the Alignment Forum. This draft is meant to solicit feedback. PDF of this...