Publication

An Actor-Critic-based adapted Deep Reinforcement Learning model for multi-step traffic state prediction

Type
Article in International Scientific Journal
Year
2025-12
Authors
Selim Reza
(Author)
Other
Marta Campos Ferreira
(Author)
FEUP
J.J.M. Machado
(Author)
FEUP
João Manuel R. S. Tavares
(Author)
FEUP
Journal
Vol. 184 No. 113783
Pages: 1-14
ISSN: 1568-4946
Publisher: Elsevier
Indexing
Publication in ISI Web of Knowledge - 0 Citations
Publication in ISI Web of Science
Publication in Scopus - 0 Citations
Scientific classification
CORDIS: Technological sciences
FOS: Engineering and technology
Other information
Authenticus ID: P-019-YYF
Abstract (EN): Traffic state prediction is critical to decision-making in various traffic management applications. Despite significant advancements in Deep Learning (DL) models, such as Long Short-Term Memory (LSTM), Graph Neural Networks (GNN), and attention-based transformer models, multi-step predictions remain challenging. The state-of-the-art models face a common limitation: the predictions' accuracy decreases as the prediction horizon increases, a phenomenon known as error accumulation. In addition, with the arrival of non-recurrent events and external noise, the models fail to maintain good prediction accuracy. Deep Reinforcement Learning (DRL) has been widely applied to diverse tasks, including optimising intersection traffic signal control. However, its potential to address multi-step traffic prediction challenges remains underexplored. This study introduces an Actor-Critic-based adapted DRL method to explore the solution to the challenges associated with multi-step prediction. The Actor network makes predictions by capturing the temporal correlations of the data sequence, and the Critic network optimises the Actor by evaluating the prediction quality using Q-values. This novel combination of Supervised Learning and Reinforcement Learning (RL) paradigms, along with non-autoregressive modelling, helps the model to mitigate the error accumulation problem and increase its robustness to the arrival of non-recurrent events. It also introduces a Denoising Autoencoder to deal with external noise effectively. The proposed model was trained and evaluated on three benchmark traffic flow and speed datasets. Baseline multi-step prediction models were implemented for comparison based on performance metrics such as Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE). 
The results reveal that the proposed method outperforms the baselines, achieving average improvements of 0.26% to 21.29% in MAE and RMSE for prediction lengths of up to 24 time steps on the three datasets, at the expense of relatively higher computational costs. In addition, this adapted DRL approach outperforms traditional DRL models, such as Deep Deterministic Policy Gradient (DDPG), in both accuracy and computational efficiency.
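The abstract compares models by MAE and RMSE across prediction horizons and describes accuracy degrading as the horizon grows. As a minimal sketch of how such per-step metrics can be computed, the following uses only the standard definitions of MAE and RMSE. The function names (`mae`, `rmse`, `per_horizon_errors`) and the toy numbers are illustrative assumptions, not taken from the paper or its datasets.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error over paired values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error over paired values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def per_horizon_errors(targets, predictions):
    """Return (step, MAE, RMSE) for each step of a multi-step forecast.

    targets, predictions: lists of samples, each sample a list of H values,
    where index h holds the value for prediction step h + 1.
    (Hypothetical helper for illustration only.)
    """
    horizon = len(targets[0])
    rows = []
    for h in range(horizon):
        y_true = [sample[h] for sample in targets]
        y_pred = [sample[h] for sample in predictions]
        rows.append((h + 1, mae(y_true, y_pred), rmse(y_true, y_pred)))
    return rows


# Toy data (made up): two samples, three-step horizon, with the error
# deliberately growing at later steps to mimic error accumulation.
targets = [[10.0, 12.0, 14.0], [20.0, 22.0, 24.0]]
predictions = [[11.0, 14.0, 17.0], [19.0, 20.0, 21.0]]

for step, step_mae, step_rmse in per_horizon_errors(targets, predictions):
    print(f"step {step}: MAE={step_mae:.2f} RMSE={step_rmse:.2f}")
```

Reporting the metrics per step, rather than as a single average over the whole horizon, makes any growth in error at longer horizons directly visible.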
Language: English
Type (Professor's evaluation): Scientific
No. of pages: 14
Documents
File name Description Size
paper 1st Page 1103.88 KB
1-s2.0-S1568494625010968 Paper 2283.50 KB
Related Publications

Of the same authors

Traffic State Prediction Using One-Dimensional Convolution Neural Networks and Long Short-Term Memory (2022)
Article in International Scientific Journal
Selim Reza; Marta Campos Ferreira; José J. M. Machado; João Manuel R. S. Tavares
Road Traffic Events Monitoring Using a Multi-Head Attention Mechanism-Based Transformer and Temporal Convolutional Networks (2025)
Article in International Scientific Journal
Selim Reza; Marta Campos Ferreira; J.J.M. Machado; João Manuel R. S. Tavares
A multi-head attention-based transformer model for traffic flow forecasting with a comparative analysis to recurrent neural networks (2022)
Article in International Scientific Journal
Selim Reza; Marta Campos Ferreira; José Joaquim M. Machado; João Manuel R. S. Tavares
A customized residual neural network and bi-directional gated recurrent unit-based automatic speech recognition model (2022)
Article in International Scientific Journal
Selim Reza; Marta Campos Ferreira; J.J.M. Machado; João Manuel R. S. Tavares


Of the same journal

Novelty detection for multi-label stream classification under extreme verification latency (2023)
Article in International Scientific Journal
Costa, JD; Júnior; Faria, ER; João Gama; Gama, J; Cerri, R
Improving a simulated soccer team's performance through a Memory-Based Collaborative Filtering approach (2014)
Article in International Scientific Journal
Pedro Henriques Abreu; Daniel Castro Silva; Fernando Almeida; João Mendes-Moreira
Heuristics for online three-dimensional packing problems and algorithm selection framework for semi-online with full look-ahead (2024)
Article in International Scientific Journal
Ali, S; Ramos, AG; Maria Antónia Carravilla; José Fernando Oliveira
Glass container production scheduling through hybrid multi-population based evolutionary algorithm (2013)
Article in International Scientific Journal
toledo, cfm; arantes, md; de oliveira, rrr; almada-lobo, b

