AI ALIGNMENT FORUM
New Tool: the Residual Stream Viewer
The positional embedding matrix and previous-token heads: how do they actually work?
SmartyHeaderCode: anomalous tokens for GPT3.5 and GPT-4