<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="static/style.xsl"?><OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"><responseDate>2026-06-05T19:23:20Z</responseDate><request verb="GetRecord" identifier="oai:riubu.ubu.es:10259/4221" metadataPrefix="mets">https://riubu.ubu.es/oai/request</request><GetRecord><record><header><identifier>oai:riubu.ubu.es:10259/4221</identifier><datestamp>2021-11-10T09:38:16Z</datestamp><setSpec>com_10259_5377</setSpec><setSpec>com_10259_5086</setSpec><setSpec>com_10259_2604</setSpec><setSpec>com_10259_4219</setSpec><setSpec>col_10259_5378</setSpec><setSpec>col_10259_4220</setSpec></header><metadata><mets xmlns="http://www.loc.gov/METS/" xmlns:doc="http://www.lyncode.com/xoai" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink" xsi:schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/mets.xsd" PROFILE="DSpace METS SIP Profile 1.0" TYPE="DSpace ITEM" ID="DSpace_ITEM_10259-4221" OBJID="hdl:10259/4221">
<metsHdr CREATEDATE="2026-06-05T21:23:20Z">
<agent TYPE="ORGANIZATION" ROLE="CUSTODIAN">
<name>Repositorio Institucional de la Universidad de Burgos</name>
</agent>
</metsHdr>
<dmdSec ID="DMD_10259_4221">
<mdWrap MDTYPE="MODS">
<xmlData xmlns:mods="http://www.loc.gov/mods/v3" xsi:schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-1.xsd">
<mods:mods xsi:schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-1.xsd">
<mods:name>
<mods:role>
<mods:roleTerm type="text">author</mods:roleTerm>
</mods:role>
<mods:namePart>Arnaiz González, Álvar</mods:namePart>
</mods:name>
<mods:name>
<mods:role>
<mods:roleTerm type="text">author</mods:roleTerm>
</mods:role>
<mods:namePart>Diez Pastor, José Francisco</mods:namePart>
</mods:name>
<mods:name>
<mods:role>
<mods:roleTerm type="text">author</mods:roleTerm>
</mods:role>
<mods:namePart>Rodríguez Diez, Juan José</mods:namePart>
</mods:name>
<mods:name>
<mods:role>
<mods:roleTerm type="text">author</mods:roleTerm>
</mods:role>
<mods:namePart>García Osorio, César</mods:namePart>
</mods:name>
<mods:extension>
<mods:dateAccessioned encoding="iso8601">2016-09-01T09:42:59Z</mods:dateAccessioned>
</mods:extension>
<mods:extension>
<mods:dateAvailable encoding="iso8601">2016-09-01T09:42:59Z</mods:dateAvailable>
</mods:extension>
<mods:originInfo>
<mods:dateIssued encoding="iso8601">2016-09</mods:dateIssued>
</mods:originInfo>
<mods:identifier type="issn">0950-7051</mods:identifier>
<mods:identifier type="uri">http://hdl.handle.net/10259/4221</mods:identifier>
<mods:identifier type="doi">10.1016/j.knosys.2016.05.056</mods:identifier>
<mods:abstract>Over recent decades, database sizes have grown considerably. Larger sizes present new challenges, because machine learning algorithms are not prepared to process such large volumes of information. Instance selection methods can alleviate this problem when the size of the data set is medium to large. However, even these methods face similar problems with very large-to-massive data sets.

In this paper, two new algorithms with linear complexity for instance selection purposes are presented. Both algorithms use locality-sensitive hashing to find similarities between instances. While the complexity of conventional methods (usually quadratic, O(n²), or log-linear, O(n log n)) means that they are unable to process large-sized data sets, the new proposal shows competitive results in terms of accuracy. Even more remarkably, it shortens execution time, as the proposal manages to reduce complexity and make it linear with respect to the data set size. The new proposal has been compared with some of the best-known instance selection methods for testing and has also been evaluated on large data sets (up to a million instances).</mods:abstract>
<mods:language>
<mods:languageTerm authority="iso639-2b">eng</mods:languageTerm>
</mods:language>
<mods:accessCondition type="useAndReproduction">Attribution 4.0 International</mods:accessCondition>
<mods:subject>
<mods:topic>Nearest neighbor</mods:topic>
</mods:subject>
<mods:subject>
<mods:topic>Data reduction</mods:topic>
</mods:subject>
<mods:subject>
<mods:topic>Instance selection</mods:topic>
</mods:subject>
<mods:subject>
<mods:topic>Hashing</mods:topic>
</mods:subject>
<mods:subject>
<mods:topic>Big data</mods:topic>
</mods:subject>
<mods:titleInfo>
<mods:title>Instance selection of linear complexity for big data</mods:title>
</mods:titleInfo>
<mods:genre>info:eu-repo/semantics/article</mods:genre>
</mods:mods>
</xmlData>
</mdWrap>
</dmdSec>
<amdSec ID="TMD_10259_4221">
<rightsMD ID="RIG_10259_4221">
<mdWrap OTHERMDTYPE="DSpaceDepositLicense" MDTYPE="OTHER" MIMETYPE="text/plain">
<binData>RWwgYXV0b3IgY29tbyDDum5pY28gdGl0dWxhciBkZSBsb3MgZGVyZWNob3MgZGUgcHJvcGllZGFkIGludGVsZWN0dWFsIGRlIGxhIG9icmEsIG8gZGlzcG9uaWVuZG8gZGUgbG9zIGRlYmlkb3MgcGVybWlzb3MgZGUgbG9zIG90cm9zIHRpdHVsYXJlcywgc2kgbG9zIGh1YmllcmEsIHkgZW4gdmlydHVkIGRlIGxvcyBkZXJlY2hvcyBxdWUgbGUgY29uZmllcmUgbGEgbGVnaXNsYWNpw7NuIHZpZ2VudGUgc29icmUgcHJvcGllZGFkIGludGVsZWN0dWFsIHkgZGVyZWNob3MgZGUgYXV0b3IsIApBVVRPUklaQSBhIGxhIFVuaXZlcnNpZGFkIGRlIEJ1cmdvcyBhIGRpZnVuZGlyLCBkZSBtYW5lcmEgZ3JhdHVpdGEsIGVsIGNvbnRlbmlkbyBkZSBsb3MgYXJjaGl2b3MgZGlnaXRhbGVzIHF1ZSBjb3JyZXNwb25kZW4gYWwgZG9jdW1lbnRvIGRlc2NyaXRvIGFudGVyaW9ybWVudGUsIGNvbiBjYXLDoWN0ZXIgbm8gZXhjbHVzaXZvIHkgZGUgbWFuZXJhIHDDumJsaWNhIGVuIGFjY2VzbyBhYmllcnRvIGEgdHJhdsOpcyBkZSBJbnRlcm5ldCwgcGFyYSBsbyBxdWUgbGEgQmlibGlvdGVjYSBwcm9jZWRlcsOhIGEgYXJjaGl2YXJsb3MgZW4gZWwgUmVwb3NpdG9yaW8gSW5zdGl0dWNpb25hbC4gQXNpbWlzbW8gYXV0b3JpemEgYSBsYSBVbml2ZXJzaWRhZCBkZSBCdXJnb3MgYSByZWFsaXphciBsYXMgdHJhbnNmb3JtYWNpb25lcyBuZWNlc2FyaWFzIGRlIGZvcm1hdG8sIG5vIGRlIGNvbnRlbmlkbywgcGFyYSBnYXJhbnRpemFyIGxhIHByZXNlcnZhY2nDs24geSBlbCBhY2Nlc28gZW4gZWwgZnV0dXJvLgoKRWwgYXV0b3IgZGlzcG9uZSwgZW4gdG9kbyBjYXNvLCBkZWwgZGVyZWNobyBhIHJldm9jYXIgZXN0YSBhdXRvcml6YWNpw7NuLgoKTGEgY2VzacOzbiBkZSBkZXJlY2hvcyBkZSBlc3RhIG9icmEgc2UgZW5jdWVudHJhIHN1amV0YSBhIGxhIGxlZ2lzbGFjacOzbiB2aWdlbnRlIHNvYnJlIHByb3BpZWRhZCBpbnRlbGVjdHVhbCB5IGRlcmVjaG9zIGRlIGF1dG9yLiBTdSBkaWZ1c2nDs24gZW4gZWwgUmVwb3NpdG9yaW8gc2Vyw6EgYmFqbyBsYSBtb2RhbGlkYWQgZGUgbGljZW5jaWEgQ3JlYXRpdmUgQ29tbW9ucyBvIGVxdWl2YWxlbnRlOiByZWNvbm9jaW1pZW50byDigJMgdXNvIG5vIGNvbWVyY2lhbCDigJMgc2luIG9icmEgZGVyaXZhZGEsIHBvciBsYSBxdWUgc2UgcGVybWl0ZSBoYWNlciBjb3BpYSwgZGlzdHJpYnVpciB5IGNvbXVuaWNhciBww7pibGljYW1lbnRlIGxhIG9icmEgc2llbXByZSBxdWUgc2UgY2l0ZSBhbCBhdXRvciwgZWwgdXNvIHF1ZSBzZSBoYWdhIGRlIGVsbGEgc2VhIG5vIGNvbWVyY2lhbCB5IG5vIHNlIGNyZWVuIG9icmFzIGRlcml2YWRhcyBhIHBhcnRpciBkZSBsYSBvcmlnaW5hbC4K</binData>
</mdWrap>
</rightsMD>
</amdSec>
<amdSec ID="FO_10259_4221_1">
<techMD ID="TECH_O_10259_4221_1">
<mdWrap MDTYPE="PREMIS">
<xmlData xmlns:premis="http://www.loc.gov/standards/premis" xsi:schemaLocation="http://www.loc.gov/standards/premis http://www.loc.gov/standards/premis/PREMIS-v1-0.xsd">
<premis:premis>
<premis:object>
<premis:objectIdentifier>
<premis:objectIdentifierType>URL</premis:objectIdentifierType>
<premis:objectIdentifierValue>https://riubu.ubu.es/bitstream/10259/4221/1/Arnaiz-KBS_2016.pdf</premis:objectIdentifierValue>
</premis:objectIdentifier>
<premis:objectCategory>File</premis:objectCategory>
<premis:objectCharacteristics>
<premis:fixity>
<premis:messageDigestAlgorithm>MD5</premis:messageDigestAlgorithm>
<premis:messageDigest>d4e42af8a5936dad7b8f6e543e96d24f</premis:messageDigest>
</premis:fixity>
<premis:size>1184745</premis:size>
<premis:format>
<premis:formatDesignation>
<premis:formatName>application/pdf</premis:formatName>
</premis:formatDesignation>
</premis:format>
</premis:objectCharacteristics>
<premis:originalName>Arnaiz-KBS_2016.pdf</premis:originalName>
</premis:object>
</premis:premis>
</xmlData>
</mdWrap>
</techMD>
</amdSec>
<amdSec ID="FT_10259_4221_6">
<techMD ID="TECH_T_10259_4221_6">
<mdWrap MDTYPE="PREMIS">
<xmlData xmlns:premis="http://www.loc.gov/standards/premis" xsi:schemaLocation="http://www.loc.gov/standards/premis http://www.loc.gov/standards/premis/PREMIS-v1-0.xsd">
<premis:premis>
<premis:object>
<premis:objectIdentifier>
<premis:objectIdentifierType>URL</premis:objectIdentifierType>
<premis:objectIdentifierValue>https://riubu.ubu.es/bitstream/10259/4221/6/Arnaiz-KBS_2016.pdf.txt</premis:objectIdentifierValue>
</premis:objectIdentifier>
<premis:objectCategory>File</premis:objectCategory>
<premis:objectCharacteristics>
<premis:fixity>
<premis:messageDigestAlgorithm>MD5</premis:messageDigestAlgorithm>
<premis:messageDigest>6389f85d4eac645f34bbf67d56f6eb1a</premis:messageDigest>
</premis:fixity>
<premis:size>67394</premis:size>
<premis:format>
<premis:formatDesignation>
<premis:formatName>text/plain</premis:formatName>
</premis:formatDesignation>
</premis:format>
</premis:objectCharacteristics>
<premis:originalName>Arnaiz-KBS_2016.pdf.txt</premis:originalName>
</premis:object>
</premis:premis>
</xmlData>
</mdWrap>
</techMD>
</amdSec>
<fileSec>
<fileGrp USE="ORIGINAL">
<file ID="BITSTREAM_ORIGINAL_10259_4221_1" MIMETYPE="application/pdf" SEQ="1" SIZE="1184745" CHECKSUM="d4e42af8a5936dad7b8f6e543e96d24f" CHECKSUMTYPE="MD5" ADMID="FO_10259_4221_1" GROUPID="GROUP_BITSTREAM_10259_4221_1">
<FLocat xlink:type="simple" LOCTYPE="URL" xlink:href="https://riubu.ubu.es/bitstream/10259/4221/1/Arnaiz-KBS_2016.pdf"/>
</file>
</fileGrp>
<fileGrp USE="TEXT">
<file ID="BITSTREAM_TEXT_10259_4221_6" MIMETYPE="text/plain" SEQ="6" SIZE="67394" CHECKSUM="6389f85d4eac645f34bbf67d56f6eb1a" CHECKSUMTYPE="MD5" ADMID="FT_10259_4221_6" GROUPID="GROUP_BITSTREAM_10259_4221_6">
<FLocat xlink:type="simple" LOCTYPE="URL" xlink:href="https://riubu.ubu.es/bitstream/10259/4221/6/Arnaiz-KBS_2016.pdf.txt"/>
</file>
</fileGrp>
</fileSec>
<structMap TYPE="LOGICAL" LABEL="DSpace Object">
<div TYPE="DSpace Object Contents" ADMID="DMD_10259_4221">
<div TYPE="DSpace BITSTREAM">
<fptr FILEID="BITSTREAM_ORIGINAL_10259_4221_1"/>
</div>
</div>
</structMap>
</mets></metadata></record></GetRecord></OAI-PMH>