Springer

Mining Comprehensible Clustering Rules with an Evolutionary Algorithm

Ioannis Sarafis1, Phil Trinder1, and Ali Zalzala2,3

1School of Mathematical and Computer Sciences
Heriot-Watt University
Riccarton Campus
Edinburgh, EH14 4AS Scotland, United Kingdom
{I.Sarafis,P.W.Trinder}@hw.ac.uk

2School of Engineering & Physical Sciences
Heriot-Watt University
Riccarton Campus
Edinburgh, EH14 4AS Scotland, United Kingdom

3School of Engineering
American University of Sharjah
P.O. 26666
Sharjah, UAE
A.Zalzala@hw.ac.uk

Abstract. In this paper, we present a novel evolutionary algorithm, called NOCEA, which is suitable for Data Mining (DM) clustering applications. NOCEA evolves individuals that consist of a variable number of non-overlapping clustering rules, where each rule comprises d intervals, one for each feature. The encoding scheme is non-binary, as the values for the interval boundaries are drawn from discrete domains that reflect the automatic quantization of the feature space. NOCEA uses a simple fitness function, which is radically different from any distance-based criterion function suggested so far. A density-based merging operator combines adjacent rules to form the genuine clusters in the data. NOCEA has been evaluated on challenging datasets, and we present results showing that it meets many of the requirements for DM clustering, such as the ability to discover clusters of different shapes, sizes, and densities. Moreover, NOCEA is independent of the order of the input data and insensitive to the presence of outliers and to the initialization phase. Finally, the discovered knowledge is presented as a set of non-overlapping clustering rules, contributing to the interpretability of the results.
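
The rule encoding described in the abstract can be illustrated with a minimal Python sketch. It is not NOCEA's implementation: the names `Rule`, `overlaps`, `is_valid_individual`, and `quantize` are hypothetical, intervals are assumed to be half-open ranges of grid-bin indices, and the fixed bin widths stand in for the automatic quantization, whose details the abstract does not give. The fitness function and the merging operator are deliberately left out for the same reason.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Assumption: an interval is a half-open range [lo, hi) of grid-bin indices.
Interval = Tuple[int, int]


@dataclass(frozen=True)
class Rule:
    """One clustering rule: d intervals, one per feature (a hyper-rectangle)."""
    intervals: Tuple[Interval, ...]

    def covers(self, bins: Sequence[int]) -> bool:
        # A point is covered when every coordinate falls inside its interval.
        return all(lo <= b < hi for (lo, hi), b in zip(self.intervals, bins))


def overlaps(a: Rule, b: Rule) -> bool:
    # Two axis-aligned boxes intersect only if their intervals
    # intersect in every dimension.
    return all(lo1 < hi2 and lo2 < hi1
               for (lo1, hi1), (lo2, hi2) in zip(a.intervals, b.intervals))


def is_valid_individual(rules: List[Rule]) -> bool:
    # An individual is a variable-length list of pairwise non-overlapping rules.
    return not any(overlaps(r, s)
                   for i, r in enumerate(rules)
                   for s in rules[i + 1:])


def quantize(point: Sequence[float], mins: Sequence[float],
             widths: Sequence[float]) -> Tuple[int, ...]:
    # Assumption: uniform bins of a fixed width per feature.
    return tuple(int((x - m) // w) for x, m, w in zip(point, mins, widths))


# Usage: two adjacent rules in a 2-feature space, plus one that overlaps both.
left = Rule(((0, 5), (0, 5)))
right = Rule(((5, 10), (0, 5)))
straddling = Rule(((4, 6), (2, 3)))
print(is_valid_individual([left, right]))
print(is_valid_individual([left, right, straddling]))
```

Because each rule is an axis-aligned box over named features, a rule can be read directly as a conjunction of range conditions, which is the interpretability argument made in the abstract.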

LNCS 2724, p. 2301 ff.



© Springer-Verlag Berlin Heidelberg 2003