AI ALIGNMENT FORUM

1400
Maxime Riché
Ω2002

Sequences

Posts


Wikitag Contributions

Comments

1Maxime Riché's Shortform
1y
0
Evaluating the Existence Neutrality Hypothesis - Introductory Series
We need a Science of Evals
Maxime Riché 2y 30

FYI, the "Evaluating Alignment Evaluations" project of the current AI Safety Camp is studying and characterizing alignment (propensity) evaluations. We hope to contribute to the science of evals, and we will contact you next month. (Somewhat deprecated project proposal)

Sycophancy
2 years ago