AI ALIGNMENT FORUM

Alignment Research Center (ARC)

Edited by Jessica W, et al. Last updated 30th Dec 2024.

The Alignment Research Center (ARC) is a non-profit research organization whose mission is to align future machine learning systems with human interests. Its current work focuses on developing an alignment strategy that could be adopted in industry today while scaling gracefully to future ML systems. Its current researchers are Paul Christiano, Mark Xu, and Jacob Hilton, and Kyle Scott handles operations.

Posts tagged Alignment Research Center (ARC)
70 · ARC Evals new report: Evaluating Language-Model Agents on Realistic Autonomous Tasks
Beth Barnes
2y
4
84 · More information about the dangerous capability evaluations we did with GPT-4 and Claude.
Beth Barnes
2y
13
95 · ARC's first technical report: Eliciting Latent Knowledge
paulfchristiano, Mark Xu, Ajeya Cotra
4y
72
67 · Prizes for matrix completion problems
paulfchristiano
2y
1
58 · ARC is hiring theoretical researchers
paulfchristiano, Jacob_Hilton, Mark Xu
2y
5
57 · A bird's eye view of ARC's research
Jacob_Hilton
10mo
11
42 · ARC paper: Formalizing the presumption of independence
Erik Jenner
3y
0
38 · Steelmanning heuristic arguments
Dmitry Vaintrob
5mo
0
29 · Estimating Tail Risk in Neural Networks
Mark Xu
1y
8
26 · Low Probability Estimation in Language Models
Gabriel Wu
10mo
0
15 · AXRP Episode 23 - Mechanistic Anomaly Detection with Mark Xu
DanielFilan
2y
0
62 · ELK prize results
paulfchristiano, Mark Xu
3y
7
50 · Experimentally evaluating whether honesty generalizes
paulfchristiano
4y
21
45 · Evaluations project @ ARC is hiring a researcher and a webdev/engineer
Beth Barnes
3y
6
34 · ARC is hiring!
paulfchristiano, Mark Xu
4y
0