stuhlmueller — AI Alignment Forum

A Library and Tutorial for Factored Cognition with Language Models

We want to advance process-based supervision for language models. To make it easier for others to contribute to that goal, we're sharing code for writing compositional language model programs, and a tutorial that explains how to get started:
* The Interactive Composition Explorer (ICE) is a library for writing and...

Sep 28, 2022 · 47

Ought will host a factored cognition “Lab Meeting”

by jungofthewon and stuhlmueller

Ought will host a factored cognition “Lab Meeting” on Friday September 16 from 9:30AM - 10:30AM PT. We'll share the progress we've made using language models to decompose reasoning tasks into subtasks that are easier to perform and evaluate. This is part of our work on supervising process, not outcomes....

Sep 9, 2022 · 35

Prize for Alignment Research Tasks

Can AI systems substantially help with alignment research before transformative AI? People disagree. Ought is collecting a dataset of alignment research tasks so that we can:
1. Make progress on the disagreement
2. Guide AI research towards helping with alignment
We’re offering a prize of $200-$2000 for each contribution to...

Apr 29, 2022 · 64

Elicit: Language Models as Research Assistants

Ought is an applied machine learning lab. We’re building Elicit, the AI research assistant. Our mission is to automate and scale open-ended reasoning. To get there, we train language models by supervising reasoning processes, not outcomes. This is better for reasoning capabilities in the short run and better for alignment...

Apr 9, 2022 · 73

Supervise Process, not Outcomes

We can think about machine learning systems on a spectrum from process-based to outcome-based:
* Process-based systems are built on human-understandable task decompositions, with direct supervision of reasoning steps.
* Outcome-based systems are built on end-to-end optimization, with supervision of final results.
This post explains why Ought is devoted to...

Apr 5, 2022 · 146

Competition: Amplify Rohin’s Prediction on AGI researchers & Safety Concerns

EDIT: The competition is now closed, thanks to everyone who participated! Rohin’s posterior distribution is here, and winners are in this comment. In this competition, we (Ought) want to amplify Rohin Shah’s forecast for the question: When will a majority of AGI researchers agree with safety concerns? Rohin has provided...

Jul 21, 2020 · 83

Machine Learning Projects on IDA

by Owain_Evans, William_S, and stuhlmueller

TLDR: We wrote a 20-page document that explains IDA and outlines potential Machine Learning projects about IDA. This post gives an overview of the document.
What is IDA? Iterated Distillation and Amplification (IDA) is a method for training ML systems to solve challenging tasks. It was introduced by Paul Christiano....

Jun 24, 2019 · 49

Andreas Stuhlmüller
