Improving the accuracy of k means algorithm using genetic algorithm

Fernando, W.T.R.; Wijeweera, K.R.; Dasanayaka, D.M.N.K.

dc.contributor.author	Fernando, W.T.R.
dc.contributor.author	Wijeweera, K.R.
dc.contributor.author	Dasanayaka, D.M.N.K.
dc.date.accessioned	2023-01-26T10:29:37Z
dc.date.available	2023-01-26T10:29:37Z
dc.date.issued	2017-01-26
dc.identifier.issn	1391-8796
dc.identifier.uri	http://ir.lib.ruh.ac.lk/xmlui/handle/iruor/10386
dc.description.abstract	Clustering data, by recognizing a subset of representative examples, is used for processing sensory signals and detecting patterns in data. K-means is the simplest clustering algorithm used in data clustering for such purposes. In the K-means approach, the number of expected clusters and their initial centroids must be provided as inputs. However, how accurately the initial centroids converge towards the actual centroids depends on how closely the initial centroids provided as inputs approximate them. Inappropriate initial centroids can cause the algorithm to get stuck in a local optimum rather than converging to the global optimum. This work proposes a method to overcome this problem by approximating the initial centroids using a genetic algorithm. In the proposed approach, a randomly generated set of centroids is arranged as a chain (chromosome). A collection of such chromosomes forms the initial population for the genetic algorithm. This population is evolved using a proposed fitness function. The final centroids were observed while varying the percentages of selection, recombination, and mutation. It was observed that the proposed approach yielded optimal results with recombination between 45% and 65%, mutation between 10% and 14%, and between 20 and 30 iterations. The available approaches for improving the accuracy of the K-means algorithm using a genetic algorithm have limitations, such as the number of clusters or the dimensions of the data sets that can be used with them. The proposed algorithm can be applied to data sets with any number of clusters and can be extended to any dimension.	en_US
dc.language.iso	en	en_US
dc.publisher	Faculty of Science, University of Ruhuna, Matara, Sri Lanka	en_US
dc.subject	Data Mining	en_US
dc.subject	Clustering Algorithms	en_US
dc.subject	Genetic Algorithms	en_US
dc.subject	K-means Algorithm	en_US
dc.subject	Computational Cost	en_US
dc.title	Improving the accuracy of k means algorithm using genetic algorithm	en_US
dc.type	Article	en_US
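
The approach described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the fitness function, the selection scheme, or the crossover and mutation operators, so everything below is an assumption made for illustration. Fitness is taken to be the sum of squared distances from each point to its nearest centroid (lower is better), selection keeps the fittest chromosomes, recombination is a single-point crossover over the chain of centroids, and mutation adds Gaussian noise to a centroid. The defaults (recombination 0.55, mutation 0.12, 25 generations) are chosen from inside the ranges the abstract reports; the function names and the population size are hypothetical.

```python
import random


def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(
        min(sum((p - c) ** 2 for p, c in zip(pt, cen)) for cen in centroids)
        for pt in points
    )


def ga_initial_centroids(points, k, pop_size=30, generations=25,
                         recombination=0.55, mutation=0.12, seed=0):
    """Evolve a chain of k centroids (a chromosome) to seed K-means.

    Assumed, not taken from the paper: SSE as fitness, truncation
    selection, single-point crossover, Gaussian mutation.
    """
    rng = random.Random(seed)
    dims = len(points[0])
    lo = [min(p[d] for p in points) for d in range(dims)]
    hi = [max(p[d] for p in points) for d in range(dims)]
    spans = [hi[d] - lo[d] for d in range(dims)]

    # Initial population: each chromosome is k random centroids drawn
    # from the bounding box of the data.
    population = [
        [tuple(rng.uniform(lo[d], hi[d]) for d in range(dims)) for _ in range(k)]
        for _ in range(pop_size)
    ]

    n_children = int(recombination * pop_size)
    for _ in range(generations):
        # Selection: keep the fittest chromosomes unchanged.
        population.sort(key=lambda ch: sse(points, ch))
        survivors = population[:pop_size - n_children]
        children = []
        for _ in range(n_children):
            # Recombination: splice two surviving chains at a random cut.
            a, b = rng.sample(survivors, 2)
            cut = rng.randint(1, k - 1) if k > 1 else 0
            child = a[:cut] + b[cut:]
            # Mutation: perturb each centroid with probability `mutation`.
            child = [
                tuple(x + rng.gauss(0, 0.1 * spans[d]) for d, x in enumerate(cen))
                if rng.random() < mutation else cen
                for cen in child
            ]
            children.append(child)
        population = survivors + children

    return min(population, key=lambda ch: sse(points, ch))


def kmeans(points, centroids, iterations=20):
    """Standard Lloyd iterations starting from the given centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for pt in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((p - c) ** 2 for p, c in zip(pt, centroids[i])),
            )
            clusters[nearest].append(pt)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else cen
            for cl, cen in zip(clusters, centroids)
        ]
    return centroids
```

A typical use would be `seeds = ga_initial_centroids(points, k)` followed by `kmeans(points, seeds)`, where `points` is a list of equal-length numeric tuples. Because centroids are plain tuples, the sketch works for any number of clusters and any dimension, in line with the generality the abstract claims.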