Open Access Te Herenga Waka-Victoria University of Wellington

Automatically evolving difficult benchmark feature selection datasets with genetic programming

conference contribution
posted on 2020-06-16, 22:19 authored by Andrew Lensen, Bing Xue, Mengjie Zhang
© 2018 Copyright held by the owner/author(s).

There has been a wealth of feature selection algorithms proposed in recent years, each of which claims superior performance in turn. A wide range of datasets have been used to compare these algorithms, each with different characteristics and quantities of redundant and noisy features. Hence, it is very difficult to comprehensively and fairly compare these feature selection methods in order to find which are most robust and effective. In this work, we examine using Genetic Programming to automatically synthesise redundant features for augmenting existing datasets in order to more scientifically test feature selection performance. We develop a method for producing complex multi-variate redundancies, and present a novel and intuitive approach to ensuring a range of redundancy relationships are automatically created. The application of these augmented datasets to well-established feature selection algorithms shows a number of interesting and useful results and suggests promising directions for future research in this area.

Preferred citation

Lensen, A., Xue, B. & Zhang, M. (2018, July). Automatically evolving difficult benchmark feature selection datasets with genetic programming. In GECCO 2018 - Proceedings of the 2018 Genetic and Evolutionary Computation Conference GECCO '18: Genetic and Evolutionary Computation Conference (pp. 458-465). ACM. https://doi.org/10.1145/3205455.3205552

Conference name

GECCO '18: Genetic and Evolutionary Computation Conference

Title of proceedings

GECCO 2018 - Proceedings of the 2018 Genetic and Evolutionary Computation Conference

Publication or Presentation Year

2018-07-02

Pagination

458-465

Publisher

ACM

Publication status

Published
