Announcing Epoch’s dashboard of key trends and figures in Machine Learning

Jsevillamol

This is a linkpost for https://epochai.org/trends

Developments in Machine Learning have been happening extraordinarily fast, and as their impacts become increasingly visible, it becomes ever more important to develop a quantitative understanding of these changes. However, relevant data has thus far been scattered across multiple papers, has required expertise to gather accurately, or has been otherwise hard to obtain.

Given this, Epoch is thrilled to announce the launch of our new dashboard, which covers key numbers and figures from our research to help understand the present and future of Machine Learning. This includes:

Training compute requirements
Model size, measured by the number of trainable parameters
The availability and use of data for training
Trends in hardware efficiency
Algorithmic improvements for achieving better performance with fewer resources
The growth of investment in training runs over time

Our dashboard gathers all of this information in a single, accessible place. The numbers and figures are accompanied by further information such as confidence intervals, labels representing our degree of uncertainty in the results, and links to relevant research papers. These details are especially useful to illustrate which areas may require further investigation, and how much you should trust our findings.

Beyond accessibility, bringing these figures together allows us to compare and contrast trends and drivers of progress. For example, we can verify that growth in training compute is driven by improvements to hardware performance and rising investments:

We can also see that performance improvements have historically been driven by algorithmic progress and training compute growth by comparable amounts:

Overall, we hope that our dashboard will serve as a valuable resource for researchers, policymakers, and anyone interested in the future of Machine Learning.

We plan on keeping our dashboard regularly updated, so stay tuned! If you spot an error or would like to provide feedback, please feel free to reach out to us at info@epochai.org.

Visit now the dashboard

Thanks, that looks really useful! Do you have GPU price performance numbers for lower precision training? Models like Chinchilla were trained in bf16, so that seems a more relevant number.

Thanks Neel!

The difference between tf16 and FP32 comes to a x15 factor IIRC. Though also ML developers seem to prioritise other characteristics than cost effectiveness when choosing GPUs like raw performance and interconnect, so you can't just multiply the top price performance we showcase by this factor and expect that to match the cost performance of the largest ML runs today.

More soon-ish.

Looks awesome! Minor correction on the cost of the GPT-4 training run: the website says $40 million, but sama confirmed publicly that it was over $100M (and several news outlets have reported the latter number as well).

Thanks!

Our current best guess is that this includes costs other than the amortized compute of the final training run.

If no extra information surfaces we will add a note clarifying this and/or adjust our estimate.

Gotcha, that makes sense!