What is it about?

Big data is collected and processed using many different sources and tools, which raises privacy issues. Privacy-preserving data publishing techniques such as k-anonymity, l-diversity, and t-closeness are used to de-identify the data; however, a risk of re-identification always remains because the data is collected from multiple sources. With a large volume of data, less generalisation or suppression is required to achieve the same level of privacy, an effect known as the ‘large crowd effect’, although anonymizing such a large dataset is itself challenging. MapReduce handles large volumes of data by splitting them into smaller chunks distributed across multiple nodes; consequently, each node sees only a small part of the data and the full benefit of the large volume is not realised. Scalability of privacy-preserving techniques is therefore a challenging area of research. The authors explore this area and propose an algorithm named scalable k-anonymization (SKA), which uses MapReduce for privacy-preserving big data publishing. The authors also compare SKA with existing approaches and report a remarkable improvement in data utility and a significant improvement in running time.
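To make the ideas above concrete, here is a minimal Python sketch of a k-anonymity check and of why splitting data into chunks can weaken the large crowd effect. It is a toy illustration only and is not the SKA algorithm from the paper. The records, the field names (`age`, `zip`, `disease`), and the helper functions (`generalize_age`, `smallest_group`, `is_k_anonymous`, `anonymize`) are all assumptions invented for this example.

```python
from collections import Counter

def generalize_age(age, width):
    """Replace an exact age with a range of the given width, e.g. 34 -> '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def smallest_group(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same quasi-identifier values."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(counts.values(), default=0)

def is_k_anonymous(records, quasi_identifiers, k):
    """A table is k-anonymous if every such group has at least k records."""
    return smallest_group(records, quasi_identifiers) >= k

def anonymize(records, width):
    """Generalise only the age column (hypothetical, simplistic strategy)."""
    return [{**r, "age": generalize_age(r["age"], width)} for r in records]

# Hypothetical toy records; "age" and "zip" are treated as quasi-identifiers.
raw = [
    {"age": 31, "zip": "380001", "disease": "flu"},
    {"age": 34, "zip": "380001", "disease": "asthma"},
    {"age": 37, "zip": "380001", "disease": "flu"},
    {"age": 42, "zip": "380001", "disease": "diabetes"},
    {"age": 45, "zip": "380001", "disease": "flu"},
    {"age": 48, "zip": "380001", "disease": "asthma"},
]
QI = ("age", "zip")

# On the whole table, 10-year age ranges are enough for 3-anonymity.
whole_ok = is_k_anonymous(anonymize(raw, 10), QI, 3)

# On one chunk (every second record), the same ranges are no longer enough,
# so the chunk would need coarser generalisation and lose more utility.
chunk = raw[0::2]
chunk_ok = is_k_anonymous(anonymize(chunk, 10), QI, 3)

print(whole_ok, chunk_ok)
```

In this sketch the whole table satisfies 3-anonymity with 10-year age ranges, while a chunk holding half of the records does not, which is the kind of utility loss a scalable approach has to avoid.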

Why is it important?

It helps preserve privacy when publishing big data.

Perspectives

SKA is one of the best-performing privacy-preserving big data publishing techniques.

Dr. Brijesh B. Mehta
College of Technology and Engineering

Read the Original

This page is a summary of: Privacy preserving big data publishing: a scalable k-anonymization approach using MapReduce, IET Software, October 2017, the Institution of Engineering and Technology (the IET),
DOI: 10.1049/iet-sen.2016.0264.
