Aligned AI Proposals

Edited by Dakara, last updated 4th Jan 2025

Aligned AI Proposals are plans and research agendas aimed at ensuring that artificial intelligence systems behave in accordance with human intentions (intent alignment) or human values (value alignment).

The main goal of these proposals is to ensure that AI systems will, all things considered, benefit humanity.

Posts tagged Aligned AI Proposals (karma · title · author · age · comments):
4 · The Best Way to Align an LLM: Is Inner Alignment Now a Solved Problem? · Roger Dearnaley · 2mo · 0
18 · A "Bitter Lesson" Approach to Aligning AGI and ASI · Roger Dearnaley · 1y · 0
11 · Why Aligning an LLM is Hard, and How to Make it Easier · Roger Dearnaley · 6mo · 0
19 · How to Control an LLM's Behavior (why my P(DOOM) went down) · Roger Dearnaley · 2y · 0
8 · Goodbye, Shoggoth: The Stage, its Animatronics, & the Puppeteer – a New Metaphor · Roger Dearnaley · 2y · 0
16 · Requirements for a Basin of Attraction to Alignment · Roger Dearnaley · 1y · 0
13 · Interpreting the Learning of Deceit · Roger Dearnaley · 2y · 2
56 · A list of core AI safety problems and how I hope to solve them · davidad (David A. Dalrymple) · 2y · 12
48 · AI Alignment Metastrategy · Vanessa Kosoy · 2y · 1
18 · Safety First: safety before full alignment. The deontic sufficiency hypothesis. · Chris Lakin · 2y · 1
15 · We have promising alignment plans with low taxes · Seth Herd · 2y · 0
18 · The (partial) fallacy of dumb superintelligence · Seth Herd · 2y · 0
34 · An Open Agency Architecture for Safe Transformative AI · davidad (David A. Dalrymple) · 3y · 18
43 · Scalable Oversight and Weak-to-Strong Generalization: Compatible approaches to the same problem · Ansh Radhakrishnan, Buck Shlegeris, Ryan Greenblatt, Fabien Roger · 2y · 2
15 · How to safely use an optimizer · Simon Fischer · 1y · 0
(Showing 15 of 19 tagged posts.)