Reducing the parameter footprint and inference latency of machine learning models is driven by diverse applications such as mobile vision and on-device intelligence [Choudary 20], and it grows more important as models continue to scale. In this work, we propose an alternative to the current train-then-compress paradigm: training sparse, high-capacity models from scratch, simultaneously achieving low training cost and high sparsity.
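To make the contrast with train-then-compress concrete, the sketch below trains a toy network whose weights are masked to a fixed random sparsity pattern from initialization onward, so no post-hoc pruning step is needed. The `SparseLinear` class, the 90% sparsity level, and the static-mask choice are illustrative assumptions for this sketch, not the specific method proposed here.

```python
# Hypothetical sketch: sparse-from-scratch training via a fixed random
# binary mask on each weight tensor (assumption: static sparsity pattern).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseLinear(nn.Linear):
    """Linear layer whose weights are masked to a target sparsity from init."""
    def __init__(self, in_features, out_features, sparsity=0.9):
        super().__init__(in_features, out_features)
        # Fixed random mask chosen at initialization; registered as a buffer
        # so it moves with the model but is not a trainable parameter.
        mask = (torch.rand_like(self.weight) > sparsity).float()
        self.register_buffer("mask", mask)

    def forward(self, x):
        # Only unmasked weights contribute; gradients to masked entries are
        # zeroed implicitly by the elementwise multiplication.
        return F.linear(x, self.weight * self.mask, self.bias)

model = nn.Sequential(
    SparseLinear(784, 256, sparsity=0.9),
    nn.ReLU(),
    SparseLinear(256, 10, sparsity=0.9),
)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
# ...standard training loop here; the model is sparse for its entire
# lifetime, so there is no separate compression phase after training.
```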
Researchers: Geoffrey Négiar (UC Berkeley), Michael Mahoney (UC Berkeley)...