Universidad de Burgos RIUBU

    Please use this identifier to cite or link to this item: http://hdl.handle.net/10259/5766

    Title
    Experimental evaluation of ensemble classifiers for imbalance in Big Data
    Authors
    Juez Gil, Mario
    Arnaiz González, Álvar
    Rodríguez Diez, Juan José
    García Osorio, César
    Published in
    Applied Soft Computing. 2021, vol. 108, 107447
    Publisher
    Elsevier
    Publication date
    2021-09
    ISSN
    1568-4946
    DOI
    10.1016/j.asoc.2021.107447
    Abstract
    Datasets are growing in size and complexity at an unprecedented pace, giving rise to what is known as Big Data. A common problem in classification, especially in Big Data, is that the classes may not be equally represented. Imbalanced classification was introduced some decades ago to correct the tendency of classifiers to favor the majority class and ignore the minority one. Although the number of imbalanced classification methods has increased, they continue to focus on normal-sized datasets rather than on the new reality of Big Data. This paper reports in-depth experimentation with ensemble classifiers for imbalanced Big Data classification, using two popular ensemble families (Bagging and Boosting) and different resampling methods. All experiments were run on Spark clusters, and ensemble performance and execution times were compared using statistical tests, including recent ones based on the Bayesian approach. A notable conclusion of the study is that, for unbalanced datasets in the context of Big Data, simpler methods gave better results than complex ones. The additional complexity of some sophisticated methods, which appears necessary to process and reduce imbalance in normal-sized datasets, was not effective for imbalanced Big Data.
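As a rough illustration of what combining a resampling method with a Bagging-style ensemble can look like, the sketch below balances each ensemble member's training subset by random undersampling. This is not the authors' implementation: the abstract does not name the specific resampling methods evaluated, and the experiments ran on Spark, whereas this is plain single-machine Python. Random undersampling is used here only as a familiar example of a simple resampling method, and the function names `random_undersample` and `bagging_subsets` are hypothetical.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Keep, for every class, as many random examples as the smallest class has."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    # Size of the minority class: every class is cut down to this size.
    target = min(len(members) for members in by_class.values())
    balanced = []
    for label, members in by_class.items():
        for row in rng.sample(members, target):
            balanced.append((row, label))
    rng.shuffle(balanced)
    return balanced


def bagging_subsets(rows, labels, n_estimators=5, seed=0):
    """Draw one independently undersampled, balanced subset per ensemble member."""
    return [
        random_undersample(rows, labels, seed=seed + i)
        for i in range(n_estimators)
    ]


# Toy data (illustrative only): 10 minority examples (label 1), 90 majority (label 0).
rows = list(range(100))
labels = [1 if i < 10 else 0 for i in rows]

subsets = bagging_subsets(rows, labels, n_estimators=3)
counts = [Counter(label for _, label in subset) for subset in subsets]
```

Each subset would then be used to train one base classifier, with the ensemble prediction obtained by voting. Because every member sees a different random part of the majority class, the ensemble as a whole discards less information than a single undersampled model.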
    Keywords
    Unbalance
    Imbalance
    Ensemble
    Resampling
    Big Data
    Spark
    Subject
    Computer science
    URI
    http://hdl.handle.net/10259/5766
    Publisher's version
    https://doi.org/10.1016/j.asoc.2021.107447
    Appears in collections
    • Artículos ADMIRABLE (Advanced Data Mining Research and Bioinformatics Learning research group)
    Document(s) subject to a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license
    Files in this item
    Name:
    juez-asc_2021.pdf
    Size:
    753.5 KB
    Format:
    Adobe PDF
