What is it about?
Big data is collected and processed using many different sources and tools, which raises privacy issues. Privacy-preserving data publishing techniques such as k-anonymity, l-diversity, and t-closeness are used to de-identify the data; however, the risk of re-identification always remains because the data is collected from multiple sources. With a large volume of data, less generalisation or suppression is required to achieve the same level of privacy, an effect known as the ‘large crowd effect’, although handling such large data for anonymization is challenging. MapReduce handles large volumes of data by distributing it in smaller chunks across multiple nodes; consequently, the full advantage of the large volume is not realised. The scalability of privacy-preserving techniques is therefore a challenging area of research. The authors explore this area and propose an algorithm named scalable k-anonymization (SKA), which uses MapReduce for privacy-preserving big data publishing. They also compare the approach with existing approaches and report a remarkable improvement in data utility and a significant improvement in running time.
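To make the chunking problem concrete, here is a minimal Python sketch. It is not the authors' SKA algorithm; the toy records, the column names, and the helper functions `smallest_group` and `is_k_anonymous` are assumptions made for illustration only. It checks k-anonymity by counting how many records share each combination of quasi-identifier values, and shows that a dataset that is 4-anonymous as a whole can fall to 2-anonymity once it is split into chunks, as would happen when data is distributed across nodes.

```python
from collections import Counter


def smallest_group(records, quasi_identifiers):
    """Return the size of the smallest group of records that share
    the same values for all quasi-identifier columns."""
    counts = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return min(counts.values())


def is_k_anonymous(records, quasi_identifiers, k):
    """A table is k-anonymous if every quasi-identifier combination
    occurs in at least k records."""
    return smallest_group(records, quasi_identifiers) >= k


# Hypothetical, already generalised toy data: two groups of four records.
group_a = {"age": "30-39", "zip": "560**"}
group_b = {"age": "40-49", "zip": "110**"}
records = [group_a, group_a, group_b, group_b] * 2
quasi_identifiers = ["age", "zip"]

# The whole dataset satisfies 4-anonymity without further generalisation.
whole_ok = is_k_anonymous(records, quasi_identifiers, k=4)

# Splitting into two chunks (a stand-in for distributing data across
# nodes) leaves only two records per group in each chunk.
chunks = [records[i::2] for i in range(2)]
chunk_sizes = [smallest_group(chunk, quasi_identifiers) for chunk in chunks]

print(whole_ok)      # True
print(chunk_sizes)   # [2, 2]
```

In this toy case, each chunk on its own would need more generalisation or suppression to reach k = 4, which illustrates why the benefit of the large crowd effect is reduced when anonymization is performed per chunk.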
Why is it important?
The work helps preserve privacy when big data is published.
Read the Original
This page is a summary of: Privacy preserving big data publishing: a scalable k-anonymization approach using MapReduce, IET Software, October 2017, the Institution of Engineering and Technology (the IET),
DOI: 10.1049/iet-sen.2016.0264.