Universidad de Burgos RIUBU
    Please use this identifier to cite or link to this item: http://hdl.handle.net/10259/4814

    Title
    On feature selection protocols for very low-sample-size data
    Authors
    Kuncheva, Ludmila I.
    Rodríguez Diez, Juan José
    Published in
    Pattern Recognition. 2018, V. 81, p. 660-673
    Publisher
    Elsevier
    Publication date
    2018-09
    ISSN
    0031-3203
    DOI
    10.1016/j.patcog.2018.03.012
    Abstract
    High-dimensional data with very few instances are typical in many application domains. Selecting a highly discriminative subset of the original features is often the main interest of the end user. The widely used feature selection protocol for this type of data consists of two steps. First, features are selected from the data (possibly through cross-validation), and second, a cross-validation protocol is applied to test a classifier using the selected features. The selected feature set and the testing accuracy are then returned to the user. For lack of a better option, the same low-sample-size dataset is used in both steps. Questioning the validity of this protocol, we carried out an experiment using 24 high-dimensional datasets, three feature selection methods and five classifier models. We found that the accuracy returned by the above protocol is heavily biased, and therefore propose an alternative protocol which avoids the contamination by including both steps in a single cross-validation loop. Statistical tests verify that the classification accuracy returned by the proper protocol is significantly closer to the true accuracy (estimated from an independent testing set) than that returned by the currently favoured protocol.
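    The contrast between the two protocols described in the abstract can be illustrated with a minimal sketch. It is not the authors' experimental code. The pure-noise data, the mean-difference feature filter, the nearest-centroid classifier, and every function name and parameter value below are assumptions chosen only to keep the example small and dependency-free. Because the labels are independent of the features, an unbiased accuracy estimate should sit near chance, so any large gap between the two estimates reflects the selection step having seen the held-out rows.

```python
import random


def make_noise_data(n_per_class, n_features, rng):
    """Pure-noise 'wide' dataset: the labels carry no information about the features."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
         for _ in range(2 * n_per_class)]
    y = [0] * n_per_class + [1] * n_per_class
    return X, y


def class_means(X, y, rows, features):
    """Per-class mean of each listed feature, using only the given rows."""
    means = {}
    for label in (0, 1):
        members = [i for i in rows if y[i] == label]
        means[label] = [sum(X[i][f] for i in members) / len(members)
                        for f in features]
    return means


def select_features(X, y, rows, k):
    """Assumed filter: keep the k features with the largest class-mean gap."""
    all_features = list(range(len(X[0])))
    m = class_means(X, y, rows, all_features)
    scores = [abs(a - b) for a, b in zip(m[0], m[1])]
    return sorted(all_features, key=lambda f: scores[f], reverse=True)[:k]


def stratified_folds(y, n_folds, rng):
    """Deal the rows of each class round-robin into n_folds folds."""
    folds = [[] for _ in range(n_folds)]
    for label in (0, 1):
        members = [i for i, lab in enumerate(y) if lab == label]
        rng.shuffle(members)
        for pos, i in enumerate(members):
            folds[pos % n_folds].append(i)
    return folds


def predict(X, centroids, features, i):
    """Assumed classifier: nearest class centroid on the selected features."""
    dist = {label: sum((X[i][f] - c) ** 2
                       for f, c in zip(features, centroids[label]))
            for label in (0, 1)}
    return min(dist, key=dist.get)


def cv_accuracy(X, y, k, n_folds, rng, select_inside_loop):
    """Cross-validated accuracy; the flag switches between the two protocols."""
    all_rows = list(range(len(y)))
    folds = stratified_folds(y, n_folds, rng)
    if not select_inside_loop:
        # Two-step protocol: features are chosen once, using ALL rows,
        # including the rows that will later serve as test data.
        fixed = select_features(X, y, all_rows, k)
    correct = 0
    for test_rows in folds:
        held_out = set(test_rows)
        train_rows = [i for i in all_rows if i not in held_out]
        # Single-loop protocol: selection sees only this fold's training rows.
        features = (select_features(X, y, train_rows, k)
                    if select_inside_loop else fixed)
        centroids = class_means(X, y, train_rows, features)
        correct += sum(predict(X, centroids, features, i) == y[i]
                       for i in test_rows)
    return correct / len(y)


def compare_protocols(n_datasets=5, seed=0):
    """Mean CV accuracy of each protocol over several noise datasets."""
    rng = random.Random(seed)
    two_step, single_loop = [], []
    for _ in range(n_datasets):
        X, y = make_noise_data(n_per_class=10, n_features=500, rng=rng)
        two_step.append(cv_accuracy(X, y, 10, 5, rng, select_inside_loop=False))
        single_loop.append(cv_accuracy(X, y, 10, 5, rng, select_inside_loop=True))
    return sum(two_step) / n_datasets, sum(single_loop) / n_datasets


if __name__ == "__main__":
    two_step_acc, single_loop_acc = compare_protocols()
    print("two-step protocol accuracy:   ", two_step_acc)
    print("single-loop protocol accuracy:", single_loop_acc)
```

    In this sketch the only difference between the two runs is where `select_features` is called. Moving it inside the fold loop is what keeps the test rows out of the selection step. The sizes used here (20 instances, 500 features, 10 selected) are illustrative and are not taken from the paper.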
    Keywords
    Feature selection
    Wide datasets
    Experimental protocol
    Training/testing
    Cross-validation
    Subject
    Computer science
    URI
    http://hdl.handle.net/10259/4814
    Publisher's version
    https://doi.org/10.1016/j.patcog.2018.03.012
    Appears in collections
    • Artículos ADMIRABLE
    License
    Document(s) subject to a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license
    Files in this item
    Name:
    Kuncheva-PR-2018.pdf
    Size:
    3.448Mb
    Format:
    Adobe PDF