A Comprehensive Review of Feature Selection and Feature Selection Stability in Machine Learning

Date
2023
Authors
Mustafa Büyükkeçeci
Mehmet Okur
Open Access Color
GOLD
Green Open Access
No
Publicly Funded
No
Abstract
Feature selection is a dimension reduction technique used to select features that are relevant to machine learning tasks. Reducing the dataset size by eliminating redundant and irrelevant features plays a pivotal role in increasing the performance of machine learning algorithms, speeding up the learning process, and building simple models. The apparent need for feature selection has aroused considerable interest among researchers and has led to its use in a wide range of application domains, including text mining, pattern recognition, cybersecurity, bioinformatics, and big data. As a result, a substantial amount of literature has been published on feature selection over the years, and a wide variety of feature selection methods have been proposed. The quality of a feature selection algorithm is judged not only by the quality of the models built from the features it selects, or by the clustering tendencies of those features, but also by its stability. This study therefore focuses on feature selection and feature selection stability. The sections that follow discuss general concepts and methods of feature selection, feature selection stability, stability measures, and the causes of and solutions for instability.
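As a rough illustration of what a stability measure computes, the sketch below scores how consistently a selector picks the same features across resamples of a dataset, using the mean pairwise Jaccard similarity of the selected subsets. This is one simple similarity-based measure chosen for illustration, not necessarily one the paper recommends; the function names and the feature subsets in `runs` are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two feature subsets (1.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_stability(subsets):
    """Mean Jaccard similarity over all pairs of selected subsets."""
    pairs = list(combinations(subsets, 2))
    if not pairs:
        raise ValueError("need at least two subsets")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical subsets chosen by one selector on three resamples of a dataset.
runs = [{"f1", "f2", "f3"}, {"f1", "f2", "f4"}, {"f1", "f2", "f3"}]
stability = pairwise_stability(runs)
print(round(stability, 3))
```

A score of 1.0 means every resample produced the same subset; scores near 0 mean the selected features change almost entirely from one resample to the next.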
Description
Keywords
Engineering; Feature selection; Dimensionality reduction; Types of feature selection; Feature selection stability; Stability measures
Fields of Science
0206 medical engineering; 0202 electrical engineering, electronic engineering, information engineering; 02 engineering and technology

OpenCitations Citation Count
36
Source
Gazi University Journal of Science
Volume
36
Start Page
1506
End Page
1520
PlumX Metrics
Citations
CrossRef : 4
Scopus : 53
Captures
Mendeley Readers : 236