Reasoning-Finetuning Repurposes Latent Representations in Base Models
Authors: Jake Ward*, Chuqiao Lin*, Constantin Venhoff, Neel Nanda (*Equal contribution). This work was completed during Neel Nanda's MATS 8.0 Training Phase.

TL;DR
* We computed a steering vector for backtracking using base model activations.
* It causes the associated fine-tuned reasoning model to backtrack.
* But, it doesn't cause...
Jul 23, 2025