Improving the accuracy of k means algorithm using genetic algorithm

Fernando, W.T.R.; Wijeweera, K.R.; Dasanayaka, D.M.N.K.

IRUOR Home
→
Scholarly Publications
→
Conference and Symposia Proceedings
→
Ruhuna International Science and Technology Conference
→
RISTCON 2017
→
View Item

Improving the accuracy of k means algorithm using genetic algorithm

Fernando, W.T.R.; Wijeweera, K.R.; Dasanayaka, D.M.N.K.

URI: http://ir.lib.ruh.ac.lk/xmlui/handle/iruor/10386

Date: 2017-01-26

Abstract:

Clustering data, by recognizing a subset of representative examples, is used for processing sensory signals and detecting patterns in data. K means is the simplest clustering algorithm used in data clustering for such purposes. In K means approach, the number of expected clusters and their initial centroids should be provided as inputs. However the accuracy of convergence of the initial centroids towards the actual centroids de pends on the level of approximation of the initial centroids provided as inputs. Inappropriate initial centroids can cause the algorithm to get stuck in a local optimum rather than converging to the global optimum. This work proposes a method to overcome t his problem by approximating the initial centroids using genetic algorithm. In the proposed approach a randomly generated set of centroids are arranged as a chain (chromosome). A collection of such chromosomes form the initial population for the genetic al gorithm. This population is evolved using a proposed fitness function. The final centroids were observed by changing the percentages of selection, recombination, and mutation amounts. It was observed that the proposed approach yielded optimal results under recombination between 45% 65%, mutation between 10% 14%, and number of iterations in between20 30.The available approaches for improving the accuracy of k means algorithm using genetic algorithm have limitations such as number of clusters or the dimen sions in data sets that can be used with them. The proposed algorithm can be applied to data sets with any number of clusters and can be extended to any dimension.

Show full item record