Top posts
Adam Morris
This point has been floating around implicitly in various papers (e.g., Betley et al., Plunkett et al., Lindsey), but we haven’t seen it named explicitly. We think it’s important, so we’re describing it here. There’s been growing interest in testing whether LLMs can introspect on their internal states or processes....
This post is a summary of our paper from earlier this year: Plunkett, Morris, Reddy, & Morales (2025). Adam received an ACX grant to continue this work and is interested in finding more potential collaborators. If you're excited by this work, reach out! It would be useful for safety purposes if...