How model editing could help with the alignment problem
Preface: This article explores the potential of model editing techniques for aligning future AI systems. Initially, I was skeptical about their efficacy, especially considering the objectives of current model editing methods. I argue that merely editing "facts" isn't an adequate alignment strategy and end with suggestions for research avenues focused...
Sep 30, 2023