AI ALIGNMENT FORUM
Casey Barkan — AI Alignment Forum

Do LLMs know what they're capable of? Why this matters for AI safety, and initial findings

This post is a companion piece to a forthcoming paper. This work was done as part of MATS 7.0 & 7.1.

Abstract: We explore how LLMs' awareness of their own capabilities affects their ability to acquire resources, sandbag an evaluation, and escape AI control. We quantify LLMs' self-awareness of capability...

Jul 13, 2025 • 53