What is it about?
The paper analyzes the hardware characteristics and layer-wise performance of representative DNNs on CPUs with SIMD extensions. Based on this analysis, it proposes an instruction-level approach to optimizing convolution operations and examines its limitations.
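For readers unfamiliar with what instruction-level SIMD optimization of a convolution looks like, here is a minimal sketch, assuming an x86 CPU with AVX2/FMA support. The function names, array sizes, and vectorization strategy are illustrative assumptions for a simple 1D convolution; they are not the paper's actual method or code.

```c
// Minimal sketch: scalar vs. SIMD-vectorized 1D convolution (illustrative only).
// Compile with: gcc -O2 -mavx2 -mfma conv_sketch.c
#include <immintrin.h>
#include <stdio.h>

#define IN_LEN  64
#define K_LEN   8
#define OUT_LEN (IN_LEN - K_LEN + 1)

// Scalar reference: out[i] = sum_k in[i + k] * w[k]
static void conv1d_scalar(const float *in, const float *w, float *out) {
    for (int i = 0; i < OUT_LEN; i++) {
        float acc = 0.0f;
        for (int k = 0; k < K_LEN; k++)
            acc += in[i + k] * w[k];
        out[i] = acc;
    }
}

// SIMD version: compute 8 adjacent outputs per iteration with fused multiply-add.
static void conv1d_avx2(const float *in, const float *w, float *out) {
    int i = 0;
    for (; i + 8 <= OUT_LEN; i += 8) {
        __m256 acc = _mm256_setzero_ps();
        for (int k = 0; k < K_LEN; k++) {
            __m256 x  = _mm256_loadu_ps(in + i + k);   // 8 shifted input values
            __m256 wk = _mm256_set1_ps(w[k]);          // broadcast one weight
            acc = _mm256_fmadd_ps(x, wk, acc);         // acc += x * wk
        }
        _mm256_storeu_ps(out + i, acc);
    }
    for (; i < OUT_LEN; i++) {                         // scalar tail for leftovers
        float acc = 0.0f;
        for (int k = 0; k < K_LEN; k++)
            acc += in[i + k] * w[k];
        out[i] = acc;
    }
}

int main(void) {
    float in[IN_LEN], w[K_LEN], out_s[OUT_LEN], out_v[OUT_LEN];
    for (int i = 0; i < IN_LEN; i++) in[i] = (float)(i % 5);
    for (int k = 0; k < K_LEN; k++)  w[k] = 0.1f * (float)k;
    conv1d_scalar(in, w, out_s);
    conv1d_avx2(in, w, out_v);
    printf("out[0]: scalar=%f simd=%f\n", out_s[0], out_v[0]);
    return 0;
}
```

The vectorized loop produces eight outputs per iteration, which illustrates the kind of data-level parallelism that SIMD extensions expose and that the paper's layer-wise analysis is concerned with.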
Why is it important?
The study points to an interesting research direction for the future design of DNN accelerators; for example, a dedicated branch predictor for DNNs. The paper also provides guidelines for optimizing DNNs on CPUs with SIMD extensions, as well as for potential hardware solutions based on FPGAs and heterogeneous accelerators.
Read the Original
This page is a summary of: A Comprehensive Analysis of Low-Impact Computations in Deep Learning Workloads, June 2021, ACM (Association for Computing Machinery),
DOI: 10.1145/3453688.3461747.