Power Seeking (AI) — AI Alignment Forum

Edited by Raemon. Last updated 24th Oct 2022.

Power seeking is a property that agents might have: they attempt to gain a more general ability to control their environment. It is particularly relevant to AIs and is related to Instrumental Convergence.

Posts tagged Power Seeking (AI)

1

16 Instrumental convergence in single-agent systems

Edouard Harris, simonsdsuo

4y

4

1

16 Categorical-measure-theoretic approach to optimal policies tending to seek power

3y

0

1

7 POWERplay: An open-source toolchain to study AI power-seeking

3y

0

2

68 Parametrically retargetable decision-makers tend to seek power

3y

4

2

49 Steering Llama-2 with contrastive activation additions

Nina Panickssery, Wuschel Schulz, NickGabs, Meg, evhub, TurnTrout

2y

23

2

29 No instrumental convergence without AI psychology

3mo

5

1

29 Eli's review of "Is power-seeking AI an existential risk?"

4y

0

1

29 A framework for thinking about AI power-seeking

2y

11

1

28 Power-seeking can be probable and predictive for trained agents

3y

20

1

17 Generalizing the Power-Seeking Theorems

6y

5

2

20 Intrinsic Power-Seeking: AI Might Seek Power for Power’s Sake

1y

1

1

15 [AN #170]: Analyzing the argument for risk from power-seeking AI

4y

0

1

10 Power-seeking for successive choices

5y

9

0

23 My Overview of the AI Alignment Landscape: Threat Models

4y

3

0

10 Natural Abstraction: Convergent Preferences Over Information Structures

3y

0
