<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="static/style.xsl"?><OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"><responseDate>2026-06-09T02:17:00Z</responseDate><request verb="GetRecord" identifier="oai:riubu.ubu.es:10259/7002" metadataPrefix="qdc">https://riubu.ubu.es/oai/request</request><GetRecord><record><header><identifier>oai:riubu.ubu.es:10259/7002</identifier><datestamp>2024-05-20T08:02:46Z</datestamp><setSpec>com_10259.4_104</setSpec><setSpec>com_10259_2604</setSpec><setSpec>col_10259_6848</setSpec></header><metadata><qdc:qualifieddc xmlns:qdc="http://dspace.org/qualifieddc/" xmlns:doc="http://www.lyncode.com/xoai" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dc="http://purl.org/dc/elements/1.1/" xsi:schemaLocation="http://purl.org/dc/elements/1.1/ http://dublincore.org/schemas/xmls/qdc/2006/01/06/dc.xsd http://purl.org/dc/terms/ http://dublincore.org/schemas/xmls/qdc/2006/01/06/dcterms.xsd http://dspace.org/qualifieddc/ http://www.ukoln.ac.uk/metadata/dcmi/xmlschema/qualifieddc.xsd">
<dc:title>Reinforcement learning for Traffic Signal Control: Comparison with commercial systems</dc:title>
<dc:creator>Cabrejas Egea, Álvaro</dc:creator>
<dc:creator>Zhang, Raymond</dc:creator>
<dc:creator>Walton, Neil</dc:creator>
<dc:subject>Tráfico</dc:subject>
<dc:subject>Infraestructuras</dc:subject>
<dc:subject>Traffic</dc:subject>
<dc:subject>Infrastructures</dc:subject>
<dcterms:abstract>In recent years, Intelligent Transportation Systems have been leveraging increased
sensory coverage and available computing power to deliver data-intensive solutions
that achieve higher levels of performance than traditional systems. Within Traffic Signal
Control (TSC), this has allowed the emergence of Machine Learning (ML) based systems.
Among this group, Reinforcement Learning (RL) approaches have performed particularly
well. Given the lack of industry standards in ML for TSC, literature exploring RL often lacks
comparison against commercially available systems and straightforward formulations of
how the agents operate. Here we attempt to bridge that gap. We propose three different
architectures for RL-based agents, provide pseudo-code for them, and compare them against
currently used commercial systems: MOVA, SurTrac and Cyclic controllers. The
agents use variations of Deep Q-Learning (Double Q-Learning, Duelling Architectures and
Prioritised Experience Replay) and Actor-Critic agents, using states and rewards based on
queue length measurements. Their performance is compared across different map
scenarios with variable demand, assessing them in terms of the global delay generated by all
vehicles. We find that the RL-based systems can significantly and consistently achieve lower
delays when compared with traditional and existing commercial systems.</dcterms:abstract>
<dcterms:dateAccepted>2022-09-22T06:41:43Z</dcterms:dateAccepted>
<dcterms:available>2022-09-22T06:41:43Z</dcterms:available>
<dcterms:created>2022-09-22T06:41:43Z</dcterms:created>
<dcterms:issued>2021-07</dcterms:issued>
<dc:type>info:eu-repo/semantics/conferenceObject</dc:type>
<dc:identifier>978-84-18465-12-3</dc:identifier>
<dc:identifier>http://hdl.handle.net/10259/7002</dc:identifier>
<dc:identifier>10.36443/10259/7002</dc:identifier>
<dc:language>eng</dc:language>
<dc:relation>R-Evolucionando el transporte</dc:relation>
<dc:relation>http://hdl.handle.net/10259/6490</dc:relation>
<dc:relation>https://doi.org/10.36443/9788418465123</dc:relation>
<dc:relation>info:eu-repo/grantAgreement/EPSRC//EP%2FL015374</dc:relation>
<dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
<dc:publisher>Universidad de Burgos. Servicio de Publicaciones e Imagen Institucional</dc:publisher>
</qdc:qualifieddc></metadata></record></GetRecord></OAI-PMH>