Power Seeking (AI)

Power Seeking is a property an agent might have: the tendency to acquire a more general ability to influence or control its environment. It is particularly relevant to AI systems, and is closely related to Instrumental Convergence.
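
A common formalization from the power-seeking theorems literature (e.g. several of the posts listed below) measures the "power" of a state as, roughly, the average optimal value an agent could attain from it across many possible goals. A simplified sketch of that idea, with notation assumed here rather than defined on this page:

$$\mathrm{POWER}_{\mathcal{D}}(s) \;\approx\; \mathbb{E}_{R \sim \mathcal{D}}\!\left[ V^{*}_{R}(s) \right],$$

where $\mathcal{D}$ is a distribution over reward functions and $V^{*}_{R}(s)$ is the optimal value of state $s$ under reward $R$. On this reading, an action is power-seeking when it steers the agent toward states from which a wide range of reward functions can be optimized well.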

Posts tagged Power Seeking (AI)
- Instrumental convergence in single-agent systems, by Edouard Harris and Simon Suo (1y; 16 karma, 4 comments)
- Categorical-measure-theoretic approach to optimal policies tending to seek power, by jacek (9mo; 16 karma, 0 comments)
- POWERplay: An open-source toolchain to study AI power-seeking, by Edouard Harris (1y; 7 karma, 0 comments)
- Parametrically retargetable decision-makers tend to seek power, by Alex Turner (7mo; 65 karma, 3 comments)
- Eli's review of "Is power-seeking AI an existential risk?", by elifland (1y; 29 karma, 0 comments)
- Power-seeking can be probable and predictive for trained agents, by Victoria Krakovna and janos (7mo; 27 karma, 20 comments)
- Generalizing the Power-Seeking Theorems, by Alex Turner (3y; 17 karma, 3 comments)
- [AN #170]: Analyzing the argument for risk from power-seeking AI, by Rohin Shah (2y; 15 karma, 0 comments)
- Power-seeking for successive choices, by Adam Shimi (2y; 10 karma, 9 comments)
- The Waluigi Effect (mega-post), by Cleo Nardo (7mo; 61 karma, 24 comments)
- My Overview of the AI Alignment Landscape: Threat Models, by Neel Nanda (2y; 23 karma, 2 comments)
- Incentives from a causal perspective, by Tom Everitt, James Fox, Ryan Carey, Matt MacDermott, Sebastian Benthall, and Jonathan Richens (3mo; 13 karma, 0 comments)