

Poster in Workshop: ML with New Compute Paradigms

N Multipliers for N Bits: Learning Bit Multipliers for Non-Uniform Quantization

Raghav Singhal · Anmol Biswas · Sivakumar Elangovan · Shreyas Sabnis · Udayan Ganguly

Sun 15 Dec, 12:00 p.m. PST to 1:40 p.m. PST

Abstract:

Effective resource management is critical for deploying Deep Neural Networks (DNNs) in resource-constrained environments, making low-bit quantization essential for optimizing memory and speed. In this paper, we introduce N-Multipliers-for-N-Bits, a novel method for non-uniform quantization designed for efficient hardware implementation. Our method uses N parameters per layer, one for each of the N quantization bits, whose linear combinations span the set of allowed weights (and activations). Furthermore, we learn these parameters jointly with the weights, ensuring a highly flexible quantizer model with minimal hardware overhead. We validate our method on CIFAR10 and ImageNet, achieving competitive results with 3- and 4-bit quantized models, and demonstrate strong performance on 4-bit quantized Spiking Neural Networks (SNNs) evaluated on the CIFAR10-DVS and N-Caltech 101 datasets. Further, we address the issue of stuck-at faults in hardware and demonstrate robustness to up to 30% faulty bits.
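The abstract describes the quantizer only at a high level, so the following is a minimal, hypothetical PyTorch sketch of the core idea: a per-layer module holding N learnable multipliers whose 2^N binary combinations form the codebook of allowed values. The class name NBitMultiplierQuantizer, the powers-of-two initialization, the nearest-level assignment, and the use of a straight-through estimator are all assumptions for illustration, not details taken from the paper.

```python
import torch

class NBitMultiplierQuantizer(torch.nn.Module):
    """Hypothetical sketch: N learnable multipliers per layer; their
    2^N binary combinations define the allowed quantization levels."""

    def __init__(self, n_bits: int = 4):
        super().__init__()
        # One learnable multiplier per bit. Powers-of-two initialization
        # is an assumption: it recovers uniform quantization at init.
        self.alphas = torch.nn.Parameter(
            2.0 ** torch.arange(n_bits, dtype=torch.float32))
        # Enumerate all 2^N bit patterns once: row i holds the bits of code i.
        codes = torch.arange(2 ** n_bits)
        bits = ((codes.unsqueeze(1) >> torch.arange(n_bits)) & 1).float()
        self.register_buffer("bit_patterns", bits)  # shape: (2^N, N)

    def forward(self, w: torch.Tensor) -> torch.Tensor:
        # Codebook: every linear combination of the N multipliers.
        levels = self.bit_patterns @ self.alphas            # shape: (2^N,)
        # Hard nearest-level assignment (non-differentiable argmin).
        idx = torch.argmin((w.reshape(-1, 1) - levels).abs(), dim=1)
        w_q = levels[idx].reshape(w.shape)  # differentiable w.r.t. alphas
        # Straight-through estimator (assumed): the forward pass returns
        # w_q, the backward pass treats rounding as identity for w, and
        # the alphas receive exact gradients through `levels`.
        return w_q + (w - w.detach())


if __name__ == "__main__":
    quant = NBitMultiplierQuantizer(n_bits=4)
    w = torch.randn(64, 64, requires_grad=True)
    w_q = quant(w)
    w_q.sum().backward()  # gradients reach both w and quant.alphas
```

Instantiating one module per layer mirrors the abstract's claim that the N parameters are distinct for every layer. The (numel, 2^N) distance matrix in the forward pass is acceptable for small N such as 3 or 4 bits, but a real implementation would likely chunk large weight tensors.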
