dc.contributor.author | Ramos Pérez, Ismael
dc.contributor.author | Barbero Aparicio, José Antonio
dc.contributor.author | Canepa Oneto, Antonio Jesús
dc.contributor.author | Arnaiz González, Álvar
dc.contributor.author | Maudes Raedo, Jesús M.
dc.date.accessioned | 2026-01-26T08:14:11Z
dc.date.available | 2026-01-26T08:14:11Z
dc.date.issued | 2024-04
dc.identifier.issn | 2078-2489
dc.identifier.uri | https://hdl.handle.net/10259/11282
dc.description.abstract | The most common preprocessing techniques used to deal with datasets having high dimensionality and a low number of instances, known as wide data, are feature reduction (FR), feature selection (FS), and resampling. This study explores the use of FR and resampling techniques, expanding the limited comparisons between FR and filter FS methods in the existing literature, especially in the context of wide data. We compare the optimal outcomes from a previous comprehensive study of FS against new experiments conducted using FR methods. Two specific challenges associated with the use of FR are outlined in detail: finding FR methods that are compatible with wide data and the need for a reduction estimator of nonlinear approaches to process out-of-sample data. The experimental study compares 17 techniques, including supervised, unsupervised, linear, and nonlinear approaches, using 7 resampling strategies and 5 classifiers. The results demonstrate which configurations are optimal, according to their performance and computation time. Moreover, the best configuration, namely k Nearest Neighbor (KNN) + the Maximal Margin Criterion (MMC) feature reducer with no resampling, is shown to outperform state-of-the-art algorithms. | en
dc.description.sponsorship | This work was supported by the Junta de Castilla y León under project BU055P20 (JCyL/FEDER, UE) and by the Ministry of Science and Innovation under project PID2020-119894GB-I00, co-financed through European Union FEDER funds. Ismael Ramos-Pérez is funded through a pre-doctoral grant by the Universidad de Burgos. | en
dc.format.mimetype | application/pdf
dc.language.iso | eng | es
dc.publisher | MDPI | es
dc.relation.ispartof | Information. 2024, V. 15, n. 4, 223 | es
dc.rights | Attribution 4.0 International | *
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | *
dc.subject | Feature selection | en
dc.subject | Feature reduction | en
dc.subject | Wide data | en
dc.subject | High dimensional data | en
dc.subject | Imbalanced data | en
dc.subject | Machine learning | en
dc.subject.other | Informática | es
dc.subject.other | Computer science | en
dc.subject.other | Inteligencia artificial | es
dc.subject.other | Artificial intelligence | en
dc.title | An Extensive Performance Comparison between Feature Reduction and Feature Selection Preprocessing Algorithms on Imbalanced Wide Data | en
dc.type | info:eu-repo/semantics/article | es
dc.rights.accessRights | info:eu-repo/semantics/openAccess | es
dc.relation.publisherversion | https://doi.org/10.3390/info15040223 | es
dc.identifier.doi | 10.3390/info15040223
dc.identifier.essn | 2078-2489
dc.journal.title | Information | en
dc.volume.number | 15 | es
dc.issue.number | 4 | es
dc.page.initial | 223 | es
dc.type.hasVersion | info:eu-repo/semantics/publishedVersion | es