<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="static/style.xsl"?><OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"><responseDate>2026-04-17T23:14:08Z</responseDate><request verb="GetRecord" identifier="oai:riubu.ubu.es:10259/6206" metadataPrefix="mets">https://riubu.ubu.es/oai/request</request><GetRecord><record><header><identifier>oai:riubu.ubu.es:10259/6206</identifier><datestamp>2022-11-21T12:51:46Z</datestamp><setSpec>com_10259_5377</setSpec><setSpec>com_10259_5086</setSpec><setSpec>com_10259_2604</setSpec><setSpec>col_10259_5378</setSpec></header><metadata><mets xmlns="http://www.loc.gov/METS/" xmlns:doc="http://www.lyncode.com/xoai" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink" xsi:schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/mets.xsd" PROFILE="DSpace METS SIP Profile 1.0" TYPE="DSpace ITEM" ID="&#xa;&#x9;&#x9;&#x9;&#x9;DSpace_ITEM_10259-6206" OBJID="&#xa;&#x9;&#x9;&#x9;&#x9;hdl:10259/6206">
<metsHdr CREATEDATE="2026-04-17T22:35:14Z">
<agent TYPE="ORGANIZATION" ROLE="CUSTODIAN">
<name>Repositorio Institucional de la Universidad de Burgos</name>
</agent>
</metsHdr>
<dmdSec ID="DMD_10259_6206">
<mdWrap MDTYPE="MODS">
<xmlData xmlns:mods="http://www.loc.gov/mods/v3" xsi:schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-1.xsd">
<mods:mods xsi:schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-1.xsd">
<mods:name>
<mods:role>
<mods:roleTerm type="text">author</mods:roleTerm>
</mods:role>
<mods:namePart>Juez Gil, Mario</mods:namePart>
</mods:name>
<mods:name>
<mods:role>
<mods:roleTerm type="text">author</mods:roleTerm>
</mods:role>
<mods:namePart>Arnaiz González, Álvar</mods:namePart>
</mods:name>
<mods:name>
<mods:role>
<mods:roleTerm type="text">author</mods:roleTerm>
</mods:role>
<mods:namePart>Rodríguez Diez, Juan José</mods:namePart>
</mods:name>
<mods:name>
<mods:role>
<mods:roleTerm type="text">author</mods:roleTerm>
</mods:role>
<mods:namePart>López Nozal, Carlos</mods:namePart>
</mods:name>
<mods:name>
<mods:role>
<mods:roleTerm type="text">author</mods:roleTerm>
</mods:role>
<mods:namePart>García Osorio, César</mods:namePart>
</mods:name>
<mods:extension>
<mods:dateAccessioned encoding="iso8601">2021-11-23T08:25:06Z</mods:dateAccessioned>
</mods:extension>
<mods:extension>
<mods:dateAvailable encoding="iso8601">2021-11-23T08:25:06Z</mods:dateAvailable>
</mods:extension>
<mods:originInfo>
<mods:dateIssued encoding="iso8601">2021-11</mods:dateIssued>
</mods:originInfo>
<mods:identifier type="issn">0925-2312</mods:identifier>
<mods:identifier type="uri">http://hdl.handle.net/10259/6206</mods:identifier>
<mods:identifier type="doi">10.1016/j.neucom.2021.08.086</mods:identifier>
<mods:abstract>One of the main goals of Big Data research is to find new data mining methods that can process large amounts of data in acceptable time. In Big Data classification, as in traditional classification, class imbalance is a common problem that must be addressed, and for Big Data the solution must also run in an acceptable execution time. In this paper we present Approx-SMOTE, a parallel implementation of the SMOTE algorithm for the Apache Spark framework. The key difference from the original SMOTE, besides parallelism, is that it uses an approximate version of k-Nearest Neighbor, which makes it highly scalable. Although an implementation of SMOTE for Big Data already exists (SMOTE-BD), it uses an exact Nearest Neighbor search, which limits its scalability. Approx-SMOTE, on the other hand, achieves run times up to 30 times faster without sacrificing the improved classification performance offered by the original SMOTE.</mods:abstract>
<mods:language>
<mods:languageTerm authority="rfc3066">eng</mods:languageTerm>
</mods:language>
<mods:accessCondition type="useAndReproduction">Attribution-NonCommercial-NoDerivatives 4.0 International</mods:accessCondition>
<mods:subject>
<mods:topic>SMOTE</mods:topic>
</mods:subject>
<mods:subject>
<mods:topic>Imbalance</mods:topic>
</mods:subject>
<mods:subject>
<mods:topic>Spark</mods:topic>
</mods:subject>
<mods:subject>
<mods:topic>Big data</mods:topic>
</mods:subject>
<mods:subject>
<mods:topic>Data mining</mods:topic>
</mods:subject>
<mods:titleInfo>
<mods:title>Approx-SMOTE: Fast SMOTE for Big Data on Apache Spark</mods:title>
</mods:titleInfo>
<mods:genre>info:eu-repo/semantics/article</mods:genre>
</mods:mods>
</xmlData>
</mdWrap>
</dmdSec>
<amdSec ID="TMD_10259_6206">
<rightsMD ID="RIG_10259_6206">
<mdWrap OTHERMDTYPE="DSpaceDepositLicense" MDTYPE="OTHER" MIMETYPE="text/plain">
<binData>RWwgYXV0b3IgY29tbyDDum5pY28gdGl0dWxhciBkZSBsb3MgZGVyZWNob3MgZGUgcHJvcGllZGFkIGludGVsZWN0dWFsIGRlIGxhIG9icmEsIG8gZGlzcG9uaWVuZG8gZGUgbG9zIGRlYmlkb3MgcGVybWlzb3MgZGUgbG9zIG90cm9zIHRpdHVsYXJlcywgc2kgbG9zIGh1YmllcmEsIHkgZW4gdmlydHVkIGRlIGxvcyBkZXJlY2hvcyBxdWUgbGUgY29uZmllcmUgbGEgbGVnaXNsYWNpw7NuIHZpZ2VudGUgc29icmUgcHJvcGllZGFkIGludGVsZWN0dWFsIHkgZGVyZWNob3MgZGUgYXV0b3IsIA0KQVVUT1JJWkEgYSBsYSBVbml2ZXJzaWRhZCBkZSBCdXJnb3MgYSBkaWZ1bmRpciwgZGUgbWFuZXJhIGdyYXR1aXRhLCBlbCBjb250ZW5pZG8gZGUgbG9zIGFyY2hpdm9zIGRpZ2l0YWxlcyBxdWUgY29ycmVzcG9uZGVuIGFsIGRvY3VtZW50byBkZXNjcml0byBhbnRlcmlvcm1lbnRlLCBjb24gY2Fyw6FjdGVyIG5vIGV4Y2x1c2l2byB5IGRlIG1hbmVyYSBww7pibGljYSBlbiBhY2Nlc28gYWJpZXJ0byBhIHRyYXbDqXMgZGUgSW50ZXJuZXQsIHBhcmEgbG8gcXVlIGxhIEJpYmxpb3RlY2EgcHJvY2VkZXLDoSBhIGFyY2hpdmFybG9zIGVuIGVsIFJlcG9zaXRvcmlvIEluc3RpdHVjaW9uYWwuIEFzaW1pc21vIGF1dG9yaXphIGEgbGEgVW5pdmVyc2lkYWQgZGUgQnVyZ29zIGEgcmVhbGl6YXIgbGFzIHRyYW5zZm9ybWFjaW9uZXMgbmVjZXNhcmlhcyBkZSBmb3JtYXRvLCBubyBkZSBjb250ZW5pZG8sIHBhcmEgZ2FyYW50aXphciBsYSBwcmVzZXJ2YWNpw7NuIHkgZWwgYWNjZXNvIGVuIGVsIGZ1dHVyby4NCg0KRWwgYXV0b3IgZGlzcG9uZSwgZW4gdG9kbyBjYXNvLCBkZWwgZGVyZWNobyBhIHJldm9jYXIgZXN0YSBhdXRvcml6YWNpw7NuLg0KDQpMYSBjZXNpw7NuIGRlIGRlcmVjaG9zIGRlIGVzdGEgb2JyYSBzZSBlbmN1ZW50cmEgc3VqZXRhIGEgbGEgbGVnaXNsYWNpw7NuIHZpZ2VudGUgc29icmUgcHJvcGllZGFkIGludGVsZWN0dWFsIHkgZGVyZWNob3MgZGUgYXV0b3Iu</binData>
</mdWrap>
</rightsMD>
</amdSec>
<amdSec ID="FO_10259_6206_1">
<techMD ID="TECH_O_10259_6206_1">
<mdWrap MDTYPE="PREMIS">
<xmlData xmlns:premis="http://www.loc.gov/standards/premis" xsi:schemaLocation="http://www.loc.gov/standards/premis http://www.loc.gov/standards/premis/PREMIS-v1-0.xsd">
<premis:premis>
<premis:object>
<premis:objectIdentifier>
<premis:objectIdentifierType>URL</premis:objectIdentifierType>
<premis:objectIdentifierValue>https://riubu.ubu.es/bitstream/10259/6206/1/Juez-neurocomputing_2021.pdf</premis:objectIdentifierValue>
</premis:objectIdentifier>
<premis:objectCategory>File</premis:objectCategory>
<premis:objectCharacteristics>
<premis:fixity>
<premis:messageDigestAlgorithm>MD5</premis:messageDigestAlgorithm>
<premis:messageDigest>3d2361e59c80bd769742340a4b593ead</premis:messageDigest>
</premis:fixity>
<premis:size>1068661</premis:size>
<premis:format>
<premis:formatDesignation>
<premis:formatName>application/pdf</premis:formatName>
</premis:formatDesignation>
</premis:format>
</premis:objectCharacteristics>
<premis:originalName>Juez-neurocomputing_2021.pdf</premis:originalName>
</premis:object>
</premis:premis>
</xmlData>
</mdWrap>
</techMD>
</amdSec>
<fileSec>
<fileGrp USE="ORIGINAL">
<file ID="BITSTREAM_ORIGINAL_10259_6206_1" MIMETYPE="application/pdf" SEQ="1" SIZE="1068661" CHECKSUM="3d2361e59c80bd769742340a4b593ead" CHECKSUMTYPE="MD5" ADMID="FO_10259_6206_1" GROUPID="GROUP_BITSTREAM_10259_6206_1">
<FLocat xlink:type="simple" LOCTYPE="URL" xlink:href="https://riubu.ubu.es/bitstream/10259/6206/1/Juez-neurocomputing_2021.pdf"/>
</file>
</fileGrp>
</fileSec>
<structMap TYPE="LOGICAL" LABEL="DSpace Object">
<div TYPE="DSpace Object Contents" ADMID="DMD_10259_6206">
<div TYPE="DSpace BITSTREAM">
<fptr FILEID="BITSTREAM_ORIGINAL_10259_6206_1"/>
</div>
</div>
</structMap>
</mets></metadata></record></GetRecord></OAI-PMH>