What is it about?
At its core, the paper proposes a model that predicts the scalability of multi-threaded workloads with minimal profiling overhead. The state of the art requires measuring effective parallelism at several degrees of parallelism and interpolating scalability between the measured configurations. SCALO avoids this: it extrapolates scalability from cache misses and other hardware events, together with the time spent in the (OpenMP) runtime, which inherently indicate parallel efficiency and therefore scalability.
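To make the idea concrete, the sketch below shows how single-run profile data (time spent in the runtime, cache misses) could be turned into an efficiency and speedup estimate. It is a minimal illustration under stated assumptions, not SCALO's actual model: the struct fields, the fixed miss penalty, and the naive "threads times average efficiency" projection are all assumptions made for the example.

/*
 * Minimal sketch (not SCALO's actual model): estimate parallel efficiency
 * from per-thread profile data gathered at a single thread count, then
 * project speedup at that count. Field names, the miss penalty, and the
 * projection formula are illustrative assumptions.
 */
#include <stdio.h>

struct thread_profile {
    double useful_cycles;   /* cycles spent in application code        */
    double runtime_cycles;  /* cycles spent inside the OpenMP runtime  */
    double llc_misses;      /* last-level cache misses                 */
};

/* Hypothetical penalty per LLC miss, in cycles (assumed constant). */
#define MISS_PENALTY_CYCLES 200.0

/* Efficiency of one thread: fraction of its cycles doing useful work. */
static double thread_efficiency(const struct thread_profile *p)
{
    double stalled = p->llc_misses * MISS_PENALTY_CYCLES;
    double total   = p->useful_cycles + p->runtime_cycles + stalled;
    return total > 0.0 ? p->useful_cycles / total : 0.0;
}

/* Naive projection: speedup at n threads ~ sum of per-thread efficiencies. */
static double projected_speedup(const struct thread_profile *profs, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += thread_efficiency(&profs[i]);
    return sum;
}

int main(void)
{
    /* Example profile for 4 threads (made-up numbers). */
    struct thread_profile profs[4] = {
        { 9.0e9, 0.5e9, 2.0e6 },
        { 8.5e9, 1.0e9, 3.0e6 },
        { 8.8e9, 0.7e9, 2.5e6 },
        { 8.0e9, 1.5e9, 4.0e6 },
    };
    printf("projected speedup at 4 threads: %.2f\n",
           projected_speedup(profs, 4));
    return 0;
}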
Why is it important?
This work opens important opportunities:
A. For efficient co-execution of multi-threaded workloads, increasing utilisation of multi-core processors.
B. For runtime performance estimation.
Read the Original
This page is a summary of: SCALO, ACM Transactions on Architecture and Code Optimization, December 2017, ACM (Association for Computing Machinery). DOI: 10.1145/3158643.