Xiaohu Tang, Shihao Han, Li Lyna Zhang, Ting Cao, Yunxin Liu
The boom of edge AI applications has spawned a great many neural network (NN) algorithms and inference platforms. Unfortunately, the fast pace of development in their fields have magnified the gaps between them. A well-designed NN algorithm with reduced number of computation operations and memory accesses can easily result in increased inference latency in real-world deployment, due to a mismatch between the algorithm and the features of target platforms.
Therefore, it is critical to understand the behaviour characteristics of NN design space on target platforms. However, none of existing NN benchmarking or characterization studies can serve this purpose. They only evaluate some sparse configurations in the design space for the purpose of platform optimization rather than the scaling in every design dimension for NN algorithm efficiency. This paper presents the first empirical study on the NN design space to learn NN behaviour characteristics on different inference platforms. The revealed characteristics can be used as guidelines to design efficient NN algorithms. We profile ten-thousand configurations from a cutting-edge NN design space on seven industrial edge AI platforms. Seven key findings as well as their causes and implications for efficient NN design are highlighted.