A Big Graph Clustering Algorithm Based on MapReduce

Yong Lin Leng; Qing Chen Zhang

doi:10.4028/www.scientific.net/AMR.1049-1050.1467

Abstract:

Graph clustering is an important technique in graph analysis, and measuring the similarity between the nodes of a graph is the premise of graph clustering. SimRank, proposed by Jeh and Widom, is a general model for computing structural similarity. Because SimRank computes the similarity between nodes iteratively, its time and space complexity are very high. With the rapid growth of data, a single machine can no longer meet the requirements of large-scale computation. In this paper, a distributed SimRank algorithm based on MapReduce is proposed and used to measure the similarity between graph nodes. A distributed affinity propagation (AP) clustering algorithm is then designed to cluster the graph nodes. Experiments compare clustering running time and speedup, and the results show that the method measures node similarity efficiently and clusters large graphs effectively.
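
As a rough illustration of the iterative similarity computation described in the abstract, the following Python sketch writes each SimRank iteration as a map step followed by a reduce step. It runs on a single machine and is not the authors' implementation: the function name `simrank_mapreduce_style`, the decay factor `c = 0.8`, the iteration count, and the edge-list input format are all assumptions made for this example, and the graph is assumed to be a simple directed graph.

```python
from collections import defaultdict
from itertools import product


def simrank_mapreduce_style(edges, c=0.8, iterations=5):
    """Naive SimRank; each iteration is a map step plus a reduce step.

    edges: iterable of (source, target) pairs of a directed graph.
    Returns a dict mapping (a, b) to the similarity score; pairs that
    are absent from the dict have score 0. All names and defaults here
    are illustrative assumptions.
    """
    edges = set(edges)  # assume a simple graph: drop duplicate edges
    nodes = sorted({n for edge in edges for n in edge})
    out_nbrs = defaultdict(list)
    in_deg = defaultdict(int)
    for src, dst in edges:
        out_nbrs[src].append(dst)
        in_deg[dst] += 1

    # Initial scores: each node is fully similar to itself only.
    scores = {(a, a): 1.0 for a in nodes}

    for _ in range(iterations):
        # Map: the score of (i, j) is sent to every pair (a, b) for which
        # i is an in-neighbor of a and j is an in-neighbor of b.
        # Summing per key stands in for the shuffle phase.
        emitted = defaultdict(float)
        for (i, j), s in scores.items():
            for a, b in product(out_nbrs[i], out_nbrs[j]):
                if a != b:
                    emitted[(a, b)] += s

        # Reduce: normalise each sum by the in-degrees and apply the decay.
        new_scores = {(a, a): 1.0 for a in nodes}
        for (a, b), total in emitted.items():
            new_scores[(a, b)] = c * total / (in_deg[a] * in_deg[b])
        scores = new_scores

    return scores


if __name__ == "__main__":
    demo = simrank_mapreduce_style([("u", "a"), ("v", "a"), ("u", "b")])
    print(demo.get(("a", "b"), 0.0))
```

In a real MapReduce job the pair scores would be partitioned across workers and the per-key sums would be produced by the shuffle phase; the nested loops over pairs also show why the time and space cost grows quickly with graph size.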

Info:

Periodical:

Advanced Materials Research (Volumes 1049-1050)

Pages:

1467-1470

DOI:

https://doi.org/10.4028/www.scientific.net/AMR.1049-1050.1467

Online since:

October 2014

Authors:

Yong Lin Leng*, Qing Chen Zhang

Keywords:

AP Clustering, MapReduce, RDF, SimRank

* - Corresponding Author

References

[1] H. C. Wang, J. Ma, Study of Efficient Clustering Algorithm on Large Graphs, Journal of Chinese Computer Systems, vol. 34, no. 6, pp.1417-1423, (2013).

[2] F. Du, Y. G. Chen, X. Y. Du, Survey of RDF Query Processing Techniques, Journal of Software, vol. 24, no. 6, pp.1222-1241, (2013).

DOI: 10.3724/sp.j.1001.2013.04387

[3] G. Wu, Research on Key Technologies of RDF Graph Data Management, Tsinghua University Press, (2008).

[4] P. Zhao, J. Han and Y. Sun, P-rank: A comprehensive structural similarity measure over information networks, International Conference on Information and Knowledge Management, (2009).

DOI: 10.1145/1645953.1646025

[5] G. Jeh and J. Widom, SimRank: a measure of structural-context similarity, in Proceedings of the Eighth ACM SIGKDD Conference (KDD '02), (2002).

DOI: 10.1145/775047.775126

[6] Q. L. Han, H. W. Pan, S. B. Cai, et al., Nodes similarity measure method based on structure-attribute balance graph, Computer Engineering and Applications, vol. 49, no. 1, pp.15-18, (2013).

[7] H. Khosravi-Farsani, M. Nematbakhsh, G. Lausen, Structure/attribute computation of similarities between nodes of a RDF graph with application to linked data clustering, Intelligent Data Analysis, vol. 17, no. 2, pp.179-194, (2013).

DOI: 10.3233/ida-130573

[8] X. F. Meng, X. Ci, Big Data Management: Concepts, Techniques and Challenges, Journal of Computer Research and Development, vol. 50, no. 1, pp.146-169, (2013).

[9] B. Frey, D. Dueck, Clustering by passing messages between data points, Science, vol. 315, no. 5814, pp.972-976, (2007).

DOI: 10.1126/science.1136800
