AI ALIGNMENT FORUM

Considerations in diffuse control

Feb 09, 2026 by Alek Westover

Methodological considerations in making malign initializations for control research (9 karma)

Alek Westover, Vivek Hebbar, Julian Stastny

7mo

0

Three visions for diffuse control (2 karma)

5mo

0

Four Downsides of Training Policies Online (17 karma)

Alek Westover, egan

6mo

0

Theoretical predictions on the sample efficiency of training policies and activation monitors (4 karma)

Alek Westover, Vivek Hebbar

6mo

0

How will we do SFT on models with opaque reasoning? (18 karma)

Alek Westover, Vivek Hebbar, egan

5mo

0

Model organisms researchers should check whether high LRs defeat their model organisms (17 karma)

Dylan Xu, SebastianP, Alek Westover, Vivek Hebbar, Julian Stastny

3mo

0

How do LLMs generalize when we do training that is intuitively compatible with two off-distribution behaviors? (28 karma)

Dylan Xu, Alek Westover, Vivek Hebbar, SebastianP, frisby, Julian Stastny

3mo

0

How to reduce capability degradation from off-model SFT (7 karma)

Dylan Xu, SebastianP, Alek Westover

1mo

0

Advice for making robust-to-training model organisms (17 karma)

SebastianP, Alek Westover, Vivek Hebbar, Julian Stastny, Dylan Xu

2mo

1

Why does off-model SFT degrade capabilities? (23 karma)

SebastianP, Dylan Xu, Alek Westover, Julian Stastny, Vivek Hebbar

2mo

0