What is it about?

This research paper presents a model that predicts how long Spark applications will take to run, based on factors such as idle time and backlog time. The model was tested on real-world workloads, including Query-52 and K-Means, to measure how accurately it estimates execution times. The dynamic model in this paper builds on a static model described in another of my papers, 'A Deterministic Model to Predict Execution Time of Spark Applications': https://link.springer.com/chapter/10.1007/978-3-031-25049-1_11
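For readers unfamiliar with dynamic allocation, the sketch below shows how it is typically switched on for a Spark job. The property names are standard Spark configuration keys. Linking the "backlog time" and "idle time" in this summary to `schedulerBacklogTimeout` and `executorIdleTimeout` is an assumption made here for illustration. The values, the helper `to_spark_submit_args`, and the script name `app.py` are hypothetical and are not taken from the paper.

```python
# Illustrative dynamic-allocation settings. The keys are standard Spark
# properties; the values are placeholders, not the paper's configuration.
DYNAMIC_ALLOCATION_CONF = {
    "spark.dynamicAllocation.enabled": "true",
    # Lets executors be released without an external shuffle service (Spark 3.0+).
    "spark.dynamicAllocation.shuffleTracking.enabled": "true",
    "spark.dynamicAllocation.minExecutors": "1",
    "spark.dynamicAllocation.maxExecutors": "8",
    # How long tasks must stay backlogged before more executors are requested.
    "spark.dynamicAllocation.schedulerBacklogTimeout": "1s",
    # How long an executor may sit idle before it is released.
    "spark.dynamicAllocation.executorIdleTimeout": "60s",
}


def to_spark_submit_args(conf):
    """Hypothetical helper: turn a config dict into spark-submit --conf flags."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Print the command line that would launch the (hypothetical) app.py.
    command = ["spark-submit", *to_spark_submit_args(DYNAMIC_ALLOCATION_CONF), "app.py"]
    print(" ".join(command))
```

With settings like these, the number of executors grows while tasks are backlogged and shrinks when executors sit idle, which is the run-time behavior the prediction model has to account for.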

Why is it important?

As far as we know, dynamic allocation has not previously been considered in modeling the execution time of Spark applications, so this article contributes a unique perspective. Understanding how long Spark applications will take to run is crucial for optimizing performance and resource allocation in big data processing. By accurately predicting execution times, organizations can better plan their workflows, allocate resources efficiently, and improve overall system performance.

Key Takeaways:
• The model considers factors like idle time and backlog time to predict execution times accurately.
• Error rates for the Query-52 and K-Means workloads were around 4.96% and 4.74%, respectively, showing the model's effectiveness.
• The model outperformed traditional machine learning models such as linear regression, neural networks, decision trees, and random forest in predicting execution times.
• Dynamic allocation of Spark executors based on workload can significantly impact the performance of Spark applications.
• Accurate prediction of execution times can lead to better resource management and improved overall system efficiency.

Perspectives

Writing this article was a great pleasure, as it has a co-author with whom I have had long-standing collaborations. This article presents a graph-based, deterministic model that accounts for dynamic allocation of executors for a Spark application, that is, where the executors are allocated on demand for the application at run time. Leveraging this model, businesses can achieve better performance, cost savings, and more efficient resource management, ultimately improving overall system performance and reliability.

Hina Tariq
Toronto Metropolitan University

AI notice

Some of the content on this page has been created using generative AI.

Read the Original

This page is a summary of: Execution Time Prediction Model that Considers Dynamic Allocation of Spark Executors, January 2023, Springer Science + Business Media,
DOI: 10.1007/978-3-031-43185-2_23.