Performance analysis of clustering algorithm under two kinds of big data architecture

Beibei Li; Bo Liu; Weiwei Lin; Ying Zhang

doi:10.3233/jhs-170556

What is it about?

This paper compares the performance of a clustering algorithm on two big data processing architectures, Hadoop MapReduce and Spark. It first gives implementations of the k-means clustering algorithm on both architectures, then analyzes mathematically how the theoretical performance of k-means differs between them. The theoretical analysis shows that Spark is superior to Hadoop in average execution time and I/O time. Finally, a text data set of user behavior from a social networking site is used for experiments. The results show that, for k-means, Spark takes significantly less execution time and I/O time than MapReduce. The theoretical analysis and implementation techniques presented in the paper are a useful reference for applying big data technology.
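The paper's own implementations are not reproduced on this page. As a rough illustration only, the following single-machine Python sketch shows how one k-means iteration splits into a map phase (assign each point to its nearest centroid) and a reduce phase (average the points in each cluster). The function names `map_step`, `reduce_step`, and `kmeans` are hypothetical and chosen for this example. The comments mark where a chain of Hadoop MapReduce jobs would typically write to and re-read from the distributed file system on every iteration, whereas Spark can keep the input cached in memory, which is the usual explanation for the kind of I/O difference described above.

```python
from collections import defaultdict


def nearest(point, centroids):
    """Return the index of the centroid closest to point (squared Euclidean)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_step(points, centroids):
    """Map phase: emit a (cluster index, point) pair for every input point."""
    return [(nearest(p, centroids), p) for p in points]


def reduce_step(pairs, centroids):
    """Reduce phase: average the points assigned to each cluster."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    new_centroids = list(centroids)  # an empty cluster keeps its old centroid
    for idx, members in groups.items():
        new_centroids[idx] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return new_centroids


def kmeans(points, centroids, max_iter=20, tol=1e-9):
    """Repeat map and reduce until the centroids stop moving."""
    for _ in range(max_iter):
        # Hadoop MapReduce (assumption of this sketch): each pass is a separate
        # job that re-reads the points from disk and writes centroids back out.
        # Spark (assumption of this sketch): the points can stay cached in memory.
        new_centroids = reduce_step(map_step(points, centroids), centroids)
        shift = max(
            sum((a - b) ** 2 for a, b in zip(old, new))
            for old, new in zip(centroids, new_centroids)
        )
        centroids = new_centroids
        if shift <= tol:
            break
    return centroids


if __name__ == "__main__":
    data = [(0, 0), (0, 2), (10, 10), (10, 12)]
    print(kmeans(data, [(0, 0), (10, 10)]))
```

On this toy input the centroids settle at (0.0, 1.0) and (10.0, 11.0) after two passes. The sketch says nothing about actual cluster performance; it only makes the per-iteration structure of the algorithm visible.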

Why is it important?

Big data is a current research hotspot.

Perspectives

The theoretical analysis and the implementation technology of the big data algorithm proposed in this paper are a good reference for the application of big data technology.
Prof. Weiwei Lin
South China University of Technology

The theoretical analysis and the implementation technology of the big data algorithm proposed in this paper are a good reference for the application of big data technology.
Beibei Li

This page is a summary of: Performance analysis of clustering algorithm under two kinds of big data architecture, Journal of High Speed Networks, January 2017, IOS Press,
DOI: 10.3233/jhs-170556.
