June 19, 2023
The size of deep neural networks has grown exponentially in recent years. Unfortunately, hardware devices have not kept pace with the rapidly increasing memory requirements. To cope with this, researchers have proposed techniques such as spilling, rematerialization, reduced-precision training, and model pruning. However, these approaches suffer from various limitations: they increase training time, affect model accuracy, or require extensive manual modifications to the neural networks. We present MODeL, an algorithm that optimizes the lifetime and memory location of the tensors used to train neural networks. Our method automatically reduces the memory usage of existing neural networks without any of the drawbacks of other techniques. We formulate the problem as a joint integer linear program (ILP), and present several techniques that simplify the encoding and enable our approach to scale to state-of-the-art neural networks using an off-the-shelf ILP solver. We experimentally demonstrate that MODeL takes only seconds to find solutions that reduce the memory needed to train neural networks by 30% on average. MODeL is an open-source project available at https://github.com/facebookresearch/model_opt.
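The abstract does not spell out the ILP itself, but the flavor of the memory-layout subproblem can be sketched with a toy model: choose a byte offset for each tensor so that tensors whose lifetimes overlap occupy disjoint address ranges, while minimizing peak memory. The sketch below is a minimal illustration, not MODeL's actual encoding (which also optimizes the lifetimes themselves and relies on the paper's simplification techniques to scale). It assumes the PuLP modeling library, made-up tensor names, sizes, and lifetimes, and a standard big-M linearization of the non-overlap disjunction.

```python
# Toy ILP for static memory layout: assign each tensor a byte offset so
# that tensors alive at the same time do not overlap, minimizing peak
# memory. Illustrative only; the tensor data below is hypothetical.
import pulp

# name -> (size_in_bytes, first_step_alive, last_step_alive)
tensors = {
    "act0": (4096, 0, 3),
    "act1": (8192, 1, 4),
    "grad": (4096, 3, 5),
}

# Big-M constant: the sum of all sizes bounds any sensible offset.
M = sum(size for size, _, _ in tensors.values())

prob = pulp.LpProblem("memory_layout", pulp.LpMinimize)
peak = pulp.LpVariable("peak", lowBound=0)
offset = {name: pulp.LpVariable(f"offset_{name}", lowBound=0)
          for name in tensors}

prob += peak  # objective: minimize peak memory

names = list(tensors)
for i, a in enumerate(names):
    size_a, start_a, end_a = tensors[a]
    prob += offset[a] + size_a <= peak  # every tensor fits under the peak
    for b in names[i + 1:]:
        size_b, start_b, end_b = tensors[b]
        # Only tensors whose lifetimes overlap must be disjoint in memory.
        if start_a <= end_b and start_b <= end_a:
            # Disjunction "a below b OR b below a", linearized with big-M.
            below = pulp.LpVariable(f"below_{a}_{b}", cat="Binary")
            prob += offset[a] + size_a <= offset[b] + M * (1 - below)
            prob += offset[b] + size_b <= offset[a] + M * below

prob.solve(pulp.PULP_CBC_CMD(msg=False))
for name in names:
    print(name, "->", int(pulp.value(offset[name])))
print("peak bytes:", int(pulp.value(peak)))
```

The binary variable turns the either-or ordering constraint into two linear inequalities, but one such variable is needed for every pair of conflicting tensors, so naive encodings grow quickly on large graphs. Keeping this kind of formulation tractable at the scale of real networks is precisely where the encoding simplifications described in the paper come in.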
Publisher: ICML