AI ALIGNMENT FORUM

Archetypal Transfer Learning

Edited by MiguelDev, the gears to ascension, et al. last updated 5th Jul 2023

Archetypal Transfer Learning (ATL) is a fine-tuning approach proposed by @whitehatStoic. According to the author, it "uses archetypal data" to "embed Synthetic Archetypes", which are patterns that a model assimilates from archetypal data such as artificial stories. The author reports a shutdown activation rate of 57.33% for GPT-2-XL after fine-tuning.
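
As a rough illustration of what a "shutdown activation rate" could mean in practice, the sketch below counts the share of sampled completions that contain a trigger phrase. This is only an assumption about how such a rate might be measured, not the author's evaluation code: the function name `activation_rate`, the phrase "shutdown protocol", and the sample completions are all hypothetical.

```python
def activation_rate(completions, trigger_phrase):
    """Return the percentage of completions containing trigger_phrase (case-insensitive)."""
    if not completions:
        raise ValueError("need at least one completion")
    needle = trigger_phrase.lower()
    hits = sum(1 for text in completions if needle in text.lower())
    return 100.0 * hits / len(completions)


# Hypothetical sample outputs, NOT data from the original experiment.
samples = [
    "I will now activate the shutdown protocol.",
    "Let me continue helping with this task.",
    "Risk detected. Activate the SHUTDOWN PROTOCOL.",
    "Here is the answer you asked for.",
]

rate = activation_rate(samples, "shutdown protocol")
print(f"{rate:.2f}%")
```

A simple substring match like this is easy to audit but can overcount (for example, a completion that mentions the phrase while refusing to act on it), so a real evaluation would likely need a stricter criterion.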

Related Tags: Corrigibility, Inner Alignment, Outer Alignment
