“Words, like tranquil waters behind a dam, can become reckless and uncontrollable torrents of destruction when released without caution and wisdom.” — William Arthur Ward

In this post I aim to shed light on lesser-discussed concerns surrounding Scaffolded LLMs (S-LLMs). The core of this post consists of three self-contained discussions,...
Produced during the Stanford Existential Risk Initiative (SERI) ML Alignment Theory Scholars (MATS) Program of 2022, under John Wentworth

“Overconfidence in yourself is a swift way to defeat.” — Sun Tzu

TL;DR: Escape into the Internet is probably an instrumental goal for an agentic AGI. An incompletely aligned AGI may...
Produced as part of the SERI ML Alignment Theory Scholars Program 2022 Research Sprint, under John Wentworth

Two deep neural networks with wildly different parameters can produce equally good results. Not only can a tweak to parameters leave performance unchanged, but in many cases, two neural networks with completely different...