What is it about?

The current decade has seen a tremendous rise in data volume, velocity, variety, and other such aspects, collectively known as Big Data, which gave birth to a new kind of science commonly called "Data Science". With this "Data Apocalypse" in progress, it is evident that conventional methods of handling data no longer suffice. We need distributed and parallel architectures such as cloud services (IaaS, PaaS, SaaS, STaaS, etc.). But is that enough to satisfy our needs? Here we propose a tutorial that takes Data Science in a very different direction: bringing greenness to Big Data and Machine Learning (ML). The tutorial is divided into two parts, both assuming a cloud backbone for analytics and prediction tasks. The first part covers techniques and tools that use Approximate Computing to bring energy efficiency to Big Data and ML at the algorithmic and code level. The second part covers green techniques and power models at the infrastructure level of the cloud.

Why is it important?

Green computing is gaining prominence for its energy-efficient techniques and for accelerating compute-intensive processes, and we want to show its utility in the data science domain. We intend to present a tutorial on energy-efficient and green data processing and analytics. The tutorial combines two parts: 1) the use of Approximate Computing in Big Data and ML, and 2) green techniques and power models for the cloud.

Perspectives

The tutorial aims to give attendees a holistic overview of green computing techniques that can be used in big data scenarios and ML applications. Specifically, it discusses approximate computing and cloud power models. Attendees can learn the basics of these techniques and how they apply to processing, learning, and prediction on very large amounts of data. They can also learn how to implement such big data tasks in an energy-efficient way on cloud platforms. Since the primary theme of CODS-COMAD is data science at large, the management of very large amounts of data, and their application in real-life scenarios, we feel that our tutorial fits that theme well.

Hrishav Bakul Barua
TCS Research

Read the Original

This page is a summary of: Green Computing for Big Data and Machine Learning, January 2022, ACM (Association for Computing Machinery),
DOI: 10.1145/3493700.3493772.
