We develop a simple model that predicts progress in the performance of field-effect transistor-based GPUs under the assumption that transistors can no longer miniaturize after scaling down to roughly the size of a single silicon atom. We construct a composite model from a performance model (a model of how GPU...
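The summary above is cut off, but its core assumption — that transistor miniaturization stops at roughly the size of a single silicon atom — lends itself to a back-of-the-envelope check. The sketch below estimates how many feature-size halvings remain before that limit; the starting feature size, atom diameter, and halving time are illustrative assumptions, not numbers from the post.

```python
import math

# Back-of-the-envelope sketch of the limit the post assumes: if feature
# size halves every `halving_time` years, how long until it reaches
# roughly the diameter of a single silicon atom?
# All three numbers below are illustrative assumptions, not the post's.
feature_nm = 5.0     # assumed current feature size, in nm
atom_nm = 0.2        # rough diameter of a silicon atom, in nm
halving_time = 2.0   # assumed Moore's-law-style halving time, in years

halvings_left = math.log2(feature_nm / atom_nm)   # ~4.6 halvings
years_left = halvings_left * halving_time          # ~9.3 years under these assumptions
```

Under these (purely illustrative) inputs the atomic limit arrives within about a decade, which is why a performance model past that point has to rely on something other than continued shrinking.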
In short: Training runs of large Machine Learning systems are likely to last less than 14-15 months. This is because longer runs will be outcompeted by runs that start later and therefore use better hardware and better algorithms. [Edited 2022/09/22 to fix an error in the hardware improvements + rising...
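The ~14-15 month figure falls out of a simple optimization: a run locks in the hardware and algorithms available at its start, so a longer run trades more training time against worse effective compute per dollar. A minimal sketch of that trade-off, with an assumed growth rate that is illustrative rather than the post's estimate:

```python
import math

# Sketch (not the post's exact model): effective compute per dollar grows
# exponentially over calendar time, P(t) = exp(g * t).  A run of length T
# that must finish by deadline D locks in the hardware available at its
# start, t = D - T, so its total effective compute is
#     C(T) = T * exp(g * (D - T)),
# which is maximized at T* = 1/g (set dC/dT = 0).

g = 0.83  # assumed growth rate per year (illustrative); 1/g ~ 1.2 years
D = 5.0   # deadline in years; it drops out of the optimum

def effective_compute(T):
    return T * math.exp(g * (D - T))

# Numerical check that the optimum sits at T* = 1/g (~14-15 months)
best_T = max((t / 1000 for t in range(1, 5000)), key=effective_compute)
```

With this assumed `g`, `best_T` lands near 1.2 years: any run much longer than that is beaten by one that starts later on better hardware and algorithms.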
Executive Summary Using a dataset of 470 models of graphics processing units (GPUs) released between 2006 and 2021, we find that the number of floating-point operations per second per $ (hereafter FLOP/s per $) doubles every ~2.5 years. For top GPUs, we find a slower rate of improvement (FLOP/s per $ doubles...
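A doubling time is easy to convert into an annual growth multiplier, which makes the headline figure concrete. The arithmetic below just restates the summary's ~2.5-year doubling time; nothing else is taken from the post.

```python
# Converting the headline doubling time into annual and decadal growth.
# A quantity that doubles every d years grows by 2**(1/d) per year.
doubling_time = 2.5  # years, from the summary

annual_factor = 2 ** (1 / doubling_time)      # ~1.32x per year
ten_year_factor = annual_factor ** 10         # 2**(10/2.5) = 16x per decade
```

So a ~2.5-year doubling time means GPU price-performance improves by roughly a third each year, or 16x over a decade.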
Summary * We are a new research organization investigating trends in Machine Learning and forecasting the development of Transformative Artificial Intelligence * This work is done in close collaboration with other organizations, such as Rethink Priorities, Open Philanthropy, and MIT CSAIL * We will be hiring for 2-4 full-time...
Summary Using our dataset of milestone Machine Learning models and our recent analysis of compute trends in ML, we project forward 70 years' worth of trends in the amount of compute used to train Machine Learning models. Our simulations account for (a) uncertainty in estimates of the growth rates in...
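Projections that "account for uncertainty in growth rates" are typically done by Monte Carlo sampling: draw a growth rate from an uncertainty range, extrapolate, and repeat. The sketch below shows the general shape of such a simulation; the starting compute, the sampled doubling-time range, and the horizon are all illustrative placeholders, not the post's actual estimates.

```python
import random

# Minimal Monte Carlo sketch of this kind of projection (illustrative
# numbers only): sample a doubling time from an assumed uncertainty
# range, extrapolate training compute forward, and summarize the spread.
random.seed(0)

def project(start_flop, years, n_samples=10_000):
    finals = []
    for _ in range(n_samples):
        # assumed uncertainty: doubling time between 6 and 12 months
        doubling_time = random.uniform(0.5, 1.0)
        finals.append(start_flop * 2 ** (years / doubling_time))
    finals.sort()
    # median plus an 80% interval (10th to 90th percentile)
    k = n_samples // 10
    return finals[n_samples // 2], finals[k], finals[-k]

median, low, high = project(start_flop=1e24, years=10)
```

Each sampled trajectory is a deterministic extrapolation; the spread of the interval reflects only the assumed uncertainty in the growth rate, which is the first of the uncertainty sources the summary lists.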