An Efficient K-Means Method Based on Centroid Handling for the Similarity Estimation
Girdhar Gopal Ladha1, Ravi Kumar Singh Pippal2

1Girdhar Gopal Ladha*, Ph.D. Scholar, Department of Computer Science, RKDF University, Bhopal (MP), India.
2Ravi Kumar Singh Pippal, Professor, Department of Computer Science, RKDF University, Bhopal (MP), India.
Manuscript received on November 22, 2019. | Revised Manuscript received on December 15, 2019. | Manuscript published on December 30, 2019. | PP: 4337-4341 | Volume-9 Issue-2, December, 2019. | Retrieval Number: B3947129219/2019©BEIESP | DOI: 10.35940/ijeat.B3947.129219
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: The main aim of this paper is to handle centroid calculation in k-means efficiently, so that distance estimation becomes more accurate and clustering results improve. The PIMA database is used for this purpose. Data preprocessing is performed first to remove unwanted records, specifically those with missing values. Centroid initialization is then performed based on centroid tuning and randomization. For distance estimation, the Euclidean, Pearson coefficient, Chebyshev and Canberra measures are used. The evaluation in this paper is based on computational time analysis, with the time measured on different random sets. The approach is found to perform well in all cases, across the variations in distance measure and population considered.
Keywords: K-means, Centroid Handling, Distance measures, Similarity estimation.
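
To make the four distance measures named in the abstract concrete, the following is a minimal sketch of a k-means loop with a pluggable distance function, written in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names, the random-sample initialization, the mean-based centroid update, the conversion of the Pearson coefficient to a distance as 1 - r, and the toy data are all assumptions, since the abstract does not specify them.

```python
import math
import random
import time


def euclidean(a, b):
    """Straight-line (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def chebyshev(a, b):
    """Largest absolute coordinate difference (L-infinity distance)."""
    return max(abs(x - y) for x, y in zip(a, b))


def canberra(a, b):
    """Sum of |x - y| / (|x| + |y|); terms with a zero denominator count as 0."""
    total = 0.0
    for x, y in zip(a, b):
        denom = abs(x) + abs(y)
        if denom > 0:
            total += abs(x - y) / denom
    return total


def pearson_distance(a, b):
    """1 - Pearson correlation (assumed way of turning the coefficient into a distance)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 1.0  # correlation is undefined for a constant vector; treat as uncorrelated
    return 1.0 - cov / math.sqrt(var_a * var_b)


def kmeans(points, k, distance, max_iter=100, seed=0):
    """Lloyd-style k-means with a user-supplied distance function.

    Initial centroids are k randomly sampled points (an assumption; the
    paper's centroid tuning step is not described in the abstract).
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: distance(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    # Toy data (hypothetical, not the PIMA database).
    data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
    measures = [euclidean, chebyshev, canberra, pearson_distance]
    for measure in measures:
        start = time.perf_counter()
        labels, _ = kmeans(data, k=2, distance=measure)
        elapsed = time.perf_counter() - start
        print(f"{measure.__name__}: labels={labels}, time={elapsed:.6f}s")
```

Passing the distance as a function argument keeps the clustering loop identical across measures, so any difference in elapsed time comes from the measure itself, which is the kind of comparison the abstract describes. Note that the mean-based centroid update is only guaranteed to reduce the objective for the Euclidean measure; with the other measures it is a common heuristic.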