Hjalmar_Wijk — AI Alignment Forum

Hjalmar_Wijk's Shortform

May 31, 2024 · 3

Autonomous replication and adaptation: an attempt at a concrete danger threshold

Note: This is a rough attempt to write down a more concrete threshold at which models might pose significant risks from autonomous replication and adaptation (ARA). It is fairly in the weeds and does not attempt to motivate or contextualize the idea of ARA very much, nor is it developed...

Aug 17, 2023 · 45

Tabooing 'Agent' for Prosaic Alignment

This post is an attempt to sketch a presentation of the alignment problem while tabooing words like agency, goals or optimization as core parts of the ontology.[1] This is not a critique of frameworks which treat these topics as fundamental; in fact, I end up concluding that this is likely...

Aug 23, 2019 · 57

Hjalmar Wijk