A surrogate-assisted GA enabling high-throughput ML by optimal feature and discretization selection
Garcia, Johan
Karlstad University, Faculty of Health, Science and Technology (starting 2013), Department of Mathematics and Computer Science (from 2013). ORCID iD: 0000-0003-3461-7079
2020 (English). In: GECCO 2020 Companion - Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion, Association for Computing Machinery, Inc, 2020, p. 1632-1640.
Conference paper, Published paper (Refereed)
Abstract [en]

Novel lookup-based classification approaches allow machine learning (ML) to be performed at extremely high classification rates for suitable low-dimensional classification problems. Such approaches depend critically on the optimal selection of features and of discretized feature representations. In this work we propose and study a hybrid genetic algorithm (hGAm) approach to solve this optimization problem. For the considered problem the fitness evaluation function is expensive, as it entails training an ML classifier with the proposed set of features and representations and then evaluating the resulting classifier. We therefore devise a surrogate problem by casting the feature selection and representation problem as a combinatorial optimization problem in the form of a multiple-choice quadratic knapsack problem (MCQKP). Because the surrogate problem can be evaluated orders of magnitude faster, a comprehensive hGAm performance evaluation becomes feasible. The results show that a suitable trade-off exists at around 5000 fitness evaluations, and they also characterize the parameter behaviors as input to future extensions.
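
As an illustration of the surrogate idea, the following is a minimal sketch of how a candidate could be scored in a generic multiple-choice quadratic knapsack problem. It is not the paper's formulation: the abstract does not define the profits, weights, or capacity, so every name and number here (`mcqkp_fitness`, `weights`, `profits`, `pair_profits`, `capacity`, and the toy data) is a hypothetical assumption. Each group stands for one feature, and each option in a group stands for one discretization choice, with option 0 meaning the feature is dropped.

```python
from itertools import combinations, product


def mcqkp_fitness(choice, weights, profits, pair_profits, capacity):
    """Score one candidate of a generic MCQKP (illustrative, not the paper's).

    choice[g] is the option picked in group g; exactly one option per group.
    Returns the total profit, or -inf if the capacity is exceeded.
    """
    total_weight = sum(weights[g][c] for g, c in enumerate(choice))
    if total_weight > capacity:
        return float("-inf")
    # Linear part: profit of each chosen option on its own.
    linear = sum(profits[g][c] for g, c in enumerate(choice))
    # Quadratic part: interaction profit for each pair of chosen options.
    quadratic = sum(
        pair_profits.get(((g, choice[g]), (h, choice[h])), 0.0)
        for g, h in combinations(range(len(choice)), 2)
    )
    return linear + quadratic


# Hypothetical toy instance: 3 features, options 0 = drop, 1 = coarse, 2 = fine.
weights = [[0, 2, 4], [0, 2, 4], [0, 3, 5]]
profits = [[0.0, 3.0, 4.0], [0.0, 2.0, 2.5], [0.0, 1.0, 3.0]]
pair_profits = {((0, 1), (1, 1)): 1.5, ((0, 2), (2, 2)): -1.0}
capacity = 8

# Exhaustive search is only feasible for tiny instances like this one;
# a GA would sample and evolve `choice` tuples instead.
best = max(
    product(range(3), repeat=3),
    key=lambda c: mcqkp_fitness(c, weights, profits, pair_profits, capacity),
)
```

Scoring a candidate this way needs only a few table lookups and sums, which is the kind of cost gap relative to training and evaluating a classifier that makes a surrogate attractive for large parameter studies.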

Place, publisher, year, edition, pages
Association for Computing Machinery, Inc, 2020. p. 1632-1640
Keywords [en]
Discretization, Feature selection, GA, Surrogate problem, Combinatorial optimization, Economic and social effects, Feature extraction, Genetic algorithms, Machine learning, Classification approach, Classification rates, Combinatorial optimization problems, Feature representation, Hybrid genetic algorithms, Optimization problems, Orders of magnitude, Quadratic knapsack problems, Classification (of information)
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:kau:diva-82975
DOI: 10.1145/3377929.3398092
Scopus ID: 2-s2.0-85089739633
ISBN: 9781450371278 (print)
OAI: oai:DiVA.org:kau-82975
DiVA, id: diva2:1529705
Conference
2020 Genetic and Evolutionary Computation Conference, GECCO 2020, 8 July 2020 through 12 July 2020
Available from: 2021-02-19 Created: 2021-02-19 Last updated: 2021-04-27 Bibliographically approved


Authority records

Garcia, Johan
