What is it about?

This work distributes the workload of training convolutional neural networks (CNNs) across a variety of available heterogeneous parallel computing devices, such as CPUs and GPUs.


Why is it important?

CNNs have proven to be powerful classification tools in tasks ranging from check reading to medical diagnosis, approaching human perception and in some cases surpassing it. However, the problems to solve are becoming larger and more complex, which translates into larger CNNs and training times so long that not even the adoption of stand-alone GPUs can keep up with them. This problem is partially addressed by using more processing units and the distributed training methods offered by several neural-network frameworks, such as Caffe, Torch, or TensorFlow. However, these techniques do not take full advantage of the parallelization opportunities within CNNs, nor of the cooperative use of heterogeneous devices with differing characteristics, such as processing capability, clock speed, and memory size. This paper presents a new method for the parallel training of CNNs in which only the convolutional layer is distributed. Results show that the technique reduces training time without affecting classification performance, on both CPUs and GPUs.
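To make the idea concrete, the sketch below illustrates one way a convolutional layer's forward pass could be split across heterogeneous workers in proportion to their throughput. This is not the authors' implementation: the relative device speeds, the naive NumPy convolution, and the sequential execution of the shards are all illustrative assumptions standing in for real CPU/GPU workers.

```python
# Minimal sketch (assumed setup, not the paper's code): partition a batch among
# heterogeneous workers proportionally to their speed, run the convolutional
# layer on each shard, then reassemble the feature maps.
import numpy as np

def conv2d_forward(x, kernels):
    """Naive 'valid' 2-D convolution: x is (N, H, W), kernels is (K, kH, kW)."""
    n, h, w = x.shape
    k, kh, kw = kernels.shape
    out = np.zeros((n, k, h - kh + 1, w - kw + 1))
    for i in range(out.shape[2]):
        for j in range(out.shape[3]):
            patch = x[:, i:i + kh, j:j + kw]                 # (N, kH, kW)
            out[:, :, i, j] = patch.reshape(n, -1) @ kernels.reshape(k, -1).T
    return out

def split_batch(batch_size, relative_speeds):
    """Assign each worker a share of the batch proportional to its speed."""
    shares = np.array(relative_speeds, dtype=float)
    counts = np.floor(batch_size * shares / shares.sum()).astype(int)
    counts[-1] += batch_size - counts.sum()                  # remainder to last worker
    return counts

# Illustrative devices: one CPU and two GPUs with assumed relative throughputs.
speeds = [1.0, 4.0, 6.0]
images = np.random.rand(64, 28, 28)
kernels = np.random.rand(8, 5, 5)

offsets = np.concatenate(([0], np.cumsum(split_batch(len(images), speeds))))
shards = [images[offsets[i]:offsets[i + 1]] for i in range(len(speeds))]

# Each shard would run on its own device; here they run sequentially for clarity.
outputs = [conv2d_forward(shard, kernels) for shard in shards]
feature_maps = np.concatenate(outputs, axis=0)               # reassemble the full batch
print(feature_maps.shape)                                    # (64, 8, 24, 24)
```

In a real heterogeneous setting, the per-device shares would be derived from measured device characteristics rather than the fixed numbers assumed here.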

Perspectives

While inference is increasingly performed on mobile devices (not restricted to smartphones), the training of CNNs, which keep growing larger, still has to be performed in the cloud. This work accommodates that trend: with larger datasets, convolutions will account for even more than the current 60-90% of processing time, so the speedups achieved by the proposed technique will tend to increase accordingly.

Gabriel Falcao
IT / University of Coimbra

Read the Original

This page is a summary of: Distributed Learning of CNNs on Heterogeneous CPU/GPU Architectures, Applied Artificial Intelligence, September 2018, Taylor & Francis,
DOI: 10.1080/08839514.2018.1508814.
