Open Access Te Herenga Waka-Victoria University of Wellington
New representations in genetic programming for feature construction in k-means clustering

conference contribution
posted on 2020-10-06, 22:08 authored by Andrew Lensen, Bing Xue, Mengjie Zhang
© Springer International Publishing AG 2017. k-means is one of the fundamental and most well-known algorithms in data mining. It has been widely used in clustering tasks, but suffers from a number of limitations on large or complex datasets. Genetic Programming (GP) has been used to improve performance of data mining algorithms by performing feature construction—the process of combining multiple attributes (features) of a dataset together to produce more powerful constructed features. In this paper, we propose novel representations for using GP to perform feature construction to improve the clustering performance of the k-means algorithm. Our experiments show significant performance improvement compared to k-means across a variety of difficult datasets. Several GP programs are also analysed to provide insight into how feature construction is able to improve clustering performance.
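
As a rough illustration of why feature construction can help k-means, the sketch below clusters two concentric rings, first on the raw (x, y) attributes and then on a single constructed feature. It is a toy example under stated assumptions, not the representations proposed in the paper: the constructed feature sqrt(x*x + y*y) is written by hand as a stand-in for the kind of expression a GP tree might evolve, and the helper names (kmeans, ring, constructed_feature, agreement) and the ring dataset are hypothetical.

```python
import math

def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means; initial centroids are taken at evenly
    spaced indices so the run is deterministic (an assumption of this sketch)."""
    step = (len(points) - 1) // (k - 1) if k > 1 else 0
    centroids = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels

def ring(radius, n):
    """n points evenly spaced on a circle of the given radius."""
    return [(radius * math.cos(2 * math.pi * i / n),
             radius * math.sin(2 * math.pi * i / n)) for i in range(n)]

def constructed_feature(p):
    """Hand-written stand-in for an evolved GP tree: sqrt(x*x + y*y)."""
    x, y = p
    return (math.sqrt(x * x + y * y),)

def agreement(labels, truth):
    """Fraction of points matching the true grouping, up to swapping the two labels."""
    same = sum(l == t for l, t in zip(labels, truth))
    return max(same, len(truth) - same) / len(truth)

raw = ring(1.0, 20) + ring(5.0, 20)   # inner ring, then outer ring
truth = [0] * 20 + [1] * 20

raw_score = agreement(kmeans(raw, 2), truth)
constructed_score = agreement(
    kmeans([constructed_feature(p) for p in raw], 2), truth)
print(raw_score, constructed_score)
```

On the raw attributes, k-means with two clusters splits the plane along a straight boundary, so it cannot recover the rings exactly; on the constructed radius feature the two rings become two well-separated groups on a line. In the GP setting described in the abstract, the expression would be searched for automatically rather than chosen by hand.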

Preferred citation

Lensen, A., Xue, B., & Zhang, M. (2017, January). New representations in genetic programming for feature construction in k-means clustering. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10593 LNCS, pp. 543-555). Springer International Publishing. https://doi.org/10.1007/978-3-319-68759-9_44

Title of proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume

10593 LNCS

Publication or Presentation Year

2017-01-01

Pagination

543-555

Publisher

Springer International Publishing

Publication status

Published

ISSN

0302-9743

eISSN

1611-3349
