Simon Fischer — AI Alignment Forum

[Aspiration-based designs] 2. Formal framework, basic algorithm

Summary. In this post, we present the formal framework we adopt throughout the sequence, and the simplest form of the aspiration-based algorithms we study. We do this for a simple form of aspiration-type goals: making the expectation of some variable equal to some given target value. The algorithm...

Apr 28, 2024 · 18

[Aspiration-based designs] 1. Informal introduction

Sequence Summary. This sequence documents research by SatisfIA, an ongoing project on non-maximizing, aspiration-based designs for AI agents that fulfill goals specified by constraints ("aspirations") rather than maximizing an objective function. We aim to contribute to AI safety by exploring design approaches and their software implementations that we believe...

Apr 28, 2024 · 44

How to safely use an optimizer

Summary: The post describes a method that allows us to use an untrustworthy optimizer to find satisficing outputs. Acknowledgements: Thanks to Benjamin Kolb (@benjaminko), Jobst Heitzig (@Jobst Heitzig) and Thomas Kehrenberg (@Thomas Kehrenberg) for many helpful comments. Introduction: Imagine you have black-box access to a powerful but untrustworthy optimizing system,...

Mar 28, 2024 · 47