Introduction
We (Adam Shimi, Joe Collman, and I) are trying to emulate peer-review feedback for Alignment Forum posts. This is the second review in the series. The first review's introduction sums up our motivation and approach rather well, so we will not duplicate it here. Instead, let's dive into today's reviewed...
Introduction
This review is part of a project with Joe Collman and Jérémy Perret to try to get as close as possible to peer review when giving feedback on the Alignment Forum. Our reasons behind this endeavor are detailed in our original post asking for suggestions of works to review;...
This week, in the alignment group, we answered two questions, 5-minute-timer style:
1. Map out all of alignment (25 minutes)
2. Create an image/table representing alignment (10 minutes)
You are free to stop here and try to answer the questions yourself. Here is a link for a...
I want to make an actionable map of AI alignment. After years of reading papers, blog posts, online exchanges, books, and the occasional hidden document about AI alignment and AI risk, and after many extremely interesting conversations about it, I find that most arguments I encounter now feel familiar at best and rehashed at worst. This...