

Invited Talk in Workshop: Machine Learning with New Compute Paradigms

Low-Precision & Analog Computational Techniques for Sustainable & Accurate AI Inference & Training

Kailash Gopalakrishnan


Abstract:

The recent rise of Generative AI has led to a dramatic increase in the sizes and computational needs of AI models. This compute explosion has raised serious cost and sustainability concerns in both the training and deployment phases of these large models. Low-precision techniques, which lower the precision of the weights, activations, and gradients, have been successfully employed to reduce training precision from 32 bits down to 8 bits (FP8) and inference precision down to 4 bits (INT4). These advances have enabled more than a 10-fold improvement in compute efficiency over the past decade; however, further gains from precision scaling alone are expected to be limited. Recent developments in analog computational techniques offer the promise of an additional 10-100X improvement in key metrics, including energy efficiency and computational density. In this presentation, we will provide an overview of these significant recent breakthroughs, which are likely to play a pivotal role in advancing Generative AI and making it more sustainable and accessible to a wider audience.
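As a minimal illustration of the low-precision idea the abstract refers to, the sketch below shows symmetric per-tensor INT4 quantization of a weight tensor. This is an assumed, simplified scheme for illustration only; the talk does not specify a particular quantization method, and production systems typically use per-channel scales, calibration, or quantization-aware training.

```python
import numpy as np

def quantize_int4(w: np.ndarray):
    """Symmetric per-tensor INT4 quantization: map floats to integers in [-8, 7].

    Illustrative sketch only; real schemes often use per-channel scales.
    """
    # Largest magnitude maps to the maximum positive INT4 code (7).
    scale = max(float(np.max(np.abs(w))), 1e-12) / 7.0
    # Round to the nearest code and clip to the 4-bit range; store in int8.
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the INT4 codes."""
    return q.astype(np.float32) * scale

# Example: measure the quantization error on random weights.
w = np.random.randn(1024).astype(np.float32)
q, s = quantize_int4(w)
w_hat = dequantize(q, s)
print("mean abs error:", np.mean(np.abs(w - w_hat)))
```

Storing 4-bit codes instead of 32-bit floats cuts weight memory by 8X, and integer arithmetic at low bit widths is where much of the compute-efficiency gain cited in the abstract comes from.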
