What is it about?

The paper analyzes the hardware characteristics and layer-wise performance of representative DNNs on CPUs with SIMD extensions. Based on this analysis, it proposes an instruction-level approach to optimizing convolution operations and examines its limitations.
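
To make the idea concrete, below is a minimal illustrative sketch in C, assuming an x86 CPU with AVX2 and FMA support, of how a convolution inner loop can be vectorized at the instruction level. It is not the paper's actual method; the function name conv1d_avx2 and all sizes are hypothetical, chosen only to show the common pattern of broadcasting one kernel weight and accumulating eight outputs per fused multiply-add.

    #include <immintrin.h>
    #include <stdio.h>

    /* Hypothetical sketch: 1-D convolution vectorized with AVX2/FMA.
       out[i] = sum_k in[i + k] * kernel[k], for i in [0, out_len). */
    void conv1d_avx2(const float *in, const float *kernel,
                     float *out, int out_len, int klen) {
        /* Main loop: compute 8 adjacent outputs per iteration. */
        for (int i = 0; i + 8 <= out_len; i += 8) {
            __m256 acc = _mm256_setzero_ps();
            for (int k = 0; k < klen; k++) {
                __m256 x = _mm256_loadu_ps(in + i + k);  /* 8 inputs       */
                __m256 w = _mm256_set1_ps(kernel[k]);    /* broadcast tap  */
                acc = _mm256_fmadd_ps(x, w, acc);        /* acc += x * w   */
            }
            _mm256_storeu_ps(out + i, acc);
        }
        /* Scalar tail for the remaining (out_len mod 8) outputs. */
        for (int i = out_len & ~7; i < out_len; i++) {
            float s = 0.0f;
            for (int k = 0; k < klen; k++)
                s += in[i + k] * kernel[k];
            out[i] = s;
        }
    }

    int main(void) {
        float in[16], kernel[3] = {1.0f, 2.0f, 1.0f}, out[14];
        for (int i = 0; i < 16; i++)
            in[i] = (float)i;
        conv1d_avx2(in, kernel, out, 14, 3);
        printf("%f %f\n", out[0], out[13]);
        return 0;
    }

(Compile with, e.g., gcc -O2 -mavx2 -mfma.)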

Why is it important?

The study points to an interesting research direction for the future design of DNN accelerators; for example, a branch predictor dedicated to DNN workloads. The paper also provides guidelines for optimizing DNNs on CPUs with SIMD extensions, as well as potential hardware solutions based on FPGAs and heterogeneous accelerators.

Perspectives

The study gave me a thorough understanding of how DNNs behave on CPU architectures with SIMD extensions. I hope the article provides useful references for the optimization of DNNs.

LI HENGYI
Ritsumeikan University

Read the Original

This page is a summary of: A Comprehensive Analysis of Low-Impact Computations in Deep Learning Workloads, June 2021, ACM (Association for Computing Machinery).
DOI: 10.1145/3453688.3461747.
