The Alignment Project is a global fund of over £15 million, dedicated to accelerating progress in AI control and alignment research. It is backed by an international coalition of governments, industry, venture capital and philanthropic funders. This post is part of a sequence on research areas that we are excited to fund through The Alignment Project.
Apply now to join researchers worldwide in advancing AI safety.
From a scientific standpoint, understanding and measuring the key factors that influence the learning process, and turning those factors into levers of control, is a major boon to aligning AI systems with complex human values. Alignment to our intended goals can break down as the result of learning failure: for example, a model could be doing well on the loss it...
This sequence sets out the research areas we are excited to fund – we hope this list of research ideas presents a novel contribution to the alignment field. We have deliberately focused on areas that we think the AI safety community currently underrates.
For those with experience scaling and running ambitious projects, apply to our Strategy & Operations role here.
In-scope projects will aim to address either of the following challenges:
TLDR: This post distills Dynamical and Bayesian Phase Transitions in a Toy Model of Superposition by Chen et al. (2023), where they study developmental stages of the Toy Model of Superposition, understanding growth and form from the perspective of Singular Learning Theory (SLT).
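As background for readers unfamiliar with the setup, the Toy Model of Superposition studied in the paper can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the standard TMS architecture (n sparse features compressed through a hidden dimension m < n with tied weights, reconstructed through a ReLU), with all sizes and sparsity levels chosen arbitrarily for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Minimal sketch of the Toy Model of Superposition:
# n sparse input features, hidden dimension m < n, tied weights W.
n, m = 6, 2
W = rng.normal(scale=0.1, size=(m, n))
b = np.zeros(n)

def forward(x):
    # Reconstruction: ReLU(W^T W x + b)
    return np.maximum(0.0, W.T @ (W @ x) + b)

def loss(batch):
    # Mean squared reconstruction error over a batch of sparse inputs
    recon = np.array([forward(x) for x in batch])
    return float(np.mean((batch - recon) ** 2))

def sample_batch(size, p=0.1):
    # Each feature is active with probability p, uniform in [0, 1] when active
    mask = rng.random((size, n)) < p
    return mask * rng.random((size, n))

batch = sample_batch(256)
print(loss(batch))
```

Training this reconstruction loss by gradient descent is what produces the geometric "phases" (e.g. feature polytopes) whose transitions the paper analyses through the lens of SLT.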
Where do the bewildering and intricate structures of Nature come from? What purpose do they serve? In his famous 1917 book "On Growth and Form" the Scottish biologist and mathematician D'Arcy Wentworth Thompson wrote the following about the geometric forms of Phaeodaria, shown above:
...Great efforts have been made to attach a "biological meaning" to these elaborate structures and "to justify the hope that in time their utilitarian character will be more completely recognised" -- "On Growth and Form"
Thanks for the very comprehensive review on generalisation!
As a historical note / broader context, the worry about model-class over-expressivity has been present since the early days of machine learning. There was a mistrust of large black-box models like random forests and SVMs with their unusually low test or even cross-validation loss, citing the ability of these models to fit noise. Breiman's frank commentary back in 2001, "Statistical Modeling: The Two Cultures", touches on this among other worries about ML models. The success of ML has turned this worry into the general...