Author Jagdhuber, Rudolf and Lang, Michel and Stenzl, Arnulf and Neuhaus, Jochen and Rahnenführer, Jörg
Title Cost-Constrained feature selection in binary classification: adaptations for greedy forward selection and genetic algorithms
Journal BMC Bioinformatics
Volume 21
Number 1
Pages 26
Abstract With modern methods in biotechnology, the search for biomarkers has advanced to a challenging statistical task exploring high-dimensional data sets. Feature selection is a widely researched preprocessing step to handle huge numbers of biomarker candidates and has special importance for the analysis of biomedical data. Such data sets often include many input features not related to the diagnostic or therapeutic target variable. A less researched, but also relevant, aspect for medical applications is the cost of different biomarker candidates. These costs are often financial, but can also refer to other aspects, for example the decision between a painful biopsy marker and a simple urine test. In this paper, we propose extensions to two feature selection methods to control the total amount of such costs: greedy forward selection and genetic algorithms. In comprehensive simulation studies of binary classification tasks, we compare the predictive performance, the run-time and the detection rate of relevant features for the newly proposed methods and five baseline alternatives to handle budget constraints.
Year 2020
Doi 10.1186/s12859-020-3361-9
