
Aligned AI Proposals

Edited by Dakara, last updated 4th Jan 2025

Aligned AI Proposals are approaches aimed at ensuring that artificial intelligence systems behave in accordance with human intentions (intent alignment) or with human values (value alignment).

The main goal of these proposals is to ensure that AI systems will, all things considered, benefit humanity.

Posts tagged Aligned AI Proposals (showing 15 of 19; score · title · author, post age, comment count)

7 · The Best Way to Align an LLM: Is Inner Alignment Now a Solved Problem? (RogerDearnaley, 3mo, 0 comments)
18 · A "Bitter Lesson" Approach to Aligning AGI and ASI (RogerDearnaley, 1y, 0 comments)
11 · Why Aligning an LLM is Hard, and How to Make it Easier (RogerDearnaley, 7mo, 0 comments)
19 · How to Control an LLM's Behavior (why my P(DOOM) went down) (RogerDearnaley, 2y, 0 comments)
8 · Goodbye, Shoggoth: The Stage, its Animatronics, & the Puppeteer – a New Metaphor (RogerDearnaley, 2y, 0 comments)
16 · Requirements for a Basin of Attraction to Alignment (RogerDearnaley, 2y, 0 comments)
13 · Interpreting the Learning of Deceit (RogerDearnaley, 2y, 2 comments)
56 · A list of core AI safety problems and how I hope to solve them (davidad, 2y, 12 comments)
48 · AI Alignment Metastrategy (Vanessa Kosoy, 2y, 1 comment)
18 · Safety First: safety before full alignment. The deontic sufficiency hypothesis. (Chris Lakin, 2y, 1 comment)
15 · We have promising alignment plans with low taxes (Seth Herd, 2y, 0 comments)
18 · The (partial) fallacy of dumb superintelligence (Seth Herd, 2y, 0 comments)
34 · An Open Agency Architecture for Safe Transformative AI (davidad, 3y, 18 comments)
43 · Scalable Oversight and Weak-to-Strong Generalization: Compatible approaches to the same problem (Ansh Radhakrishnan, Buck, ryan_greenblatt, Fabien Roger, 2y, 2 comments)
15 · How to safely use an optimizer (Simon Fischer, 1y, 0 comments)