A Systematic Methodology for Analysis of Deep Learning Hardware and Software Platforms

Part of Proceedings of Machine Learning and Systems 2 (MLSys 2020)

Bibtex Metadata Paper

Authors

Yu Wang, Gu-Yeon Wei, David Brooks

Abstract

Training deep learning models is compute-intensive and there is an industry-wide trend towards hardware and software specialization to improve performance. To systematically compare deep learning systems, we introduce a methodology comprised of a set of analysis techniques and parameterized end-to-end models for fully connected, convolutional, and recurrent neural networks. This methodology can be applied to analyze various hardware and software systems, and is intended to complement traditional methods. We demonstrate its utility by comparing two generations of specialized platforms (Google's Cloud TPU v2/v3), three heterogeneous platforms (Google TPU, Nvidia GPU, and Intel CPU), and specialized software stacks (TensorFlow and CUDA).