AI ALIGNMENT FORUM

vgel — AI Alignment Forum

Theia Vogel

also known as thebes, theia vogel. other presences:

https://vgel.me (theia@vgel.me)
https://github.com/vgel
https://x.com/voooooogel (thebes)
https://bsky.app/profile/vgel.me
https://vgel.itch.io (games)
discord: @vgel
signal: @vgel.01

Small Models Can Introspect, Too

Recent work by Anthropic showed that Claude models, primarily Opus 4 and Opus 4.1, can introspect: they detect when external concepts have been injected into their activations. But not all of us have Opus at home! By looking at the logits, we show that a 32B open-source model that at...

Dec 21, 2025 • 126