Abstract (EN):
Traffic state prediction is critical to decision-making in various traffic management applications. Despite significant advances in Deep Learning (DL) models such as Long Short-Term Memory (LSTM) networks, Graph Neural Networks (GNNs), and attention-based Transformer models, multi-step prediction remains challenging. State-of-the-art models share a common limitation: prediction accuracy decreases as the prediction horizon increases, a phenomenon known as error accumulation. In addition, when non-recurrent events or external noise arrive, these models fail to maintain good prediction accuracy. Deep Reinforcement Learning (DRL) has been widely applied to diverse tasks, including optimising intersection traffic signal control, but its potential to address multi-step traffic prediction challenges remains underexplored. This study introduces an adapted Actor-Critic DRL method to address the challenges of multi-step prediction. The Actor network makes predictions by capturing the temporal correlations of the data sequence, and the Critic network optimises the Actor by evaluating prediction quality with Q-values. This novel combination of the Supervised Learning and Reinforcement Learning (RL) paradigms, together with non-autoregressive modelling, helps the model mitigate error accumulation and increases its robustness to non-recurrent events. A Denoising Autoencoder is also introduced to handle external noise effectively. The proposed model was trained and evaluated on three benchmark traffic flow and speed datasets, and baseline multi-step prediction models were implemented for comparison using performance metrics such as Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE).
The results show that the proposed method outperforms the baselines, achieving average improvements of 0.26% to 21.29% in MAE and RMSE for prediction lengths of up to 24 time steps on the three datasets, at the expense of relatively higher computational cost. Moreover, the adapted DRL approach outperforms traditional DRL models such as Deep Deterministic Policy Gradient (DDPG) in both accuracy and computational efficiency.
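The abstract's core mechanism (a non-autoregressive Actor emitting all horizon steps jointly, plus a Critic that scores predictions with a Q-value) can be illustrated with a deliberately minimal sketch. Everything here is an assumption for illustration, not the paper's implementation: linear Actor and Critic instead of neural networks, a synthetic sine series standing in for the benchmark traffic datasets, window length `W_IN`, horizon `H`, learning rates, and the negative-MAE reward are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (the paper predicts up to 24 steps; a smaller horizon
# is used here): input window W_IN, prediction horizon H.
W_IN, H = 12, 6

# Toy standardised traffic-like series standing in for a benchmark dataset.
t = np.arange(600, dtype=float)
x = np.sin(2 * np.pi * t / 48) + 0.1 * rng.normal(size=t.size)

def make_samples(series, w, h):
    """Slice (window, horizon) pairs for non-autoregressive training."""
    n = series.size - w - h + 1
    X = np.stack([series[i:i + w] for i in range(n)])
    Y = np.stack([series[i + w:i + w + h] for i in range(n)])
    return X, Y

X, Y = make_samples(x, W_IN, H)

# Actor: a linear map emitting all H steps jointly (non-autoregressive),
# so later steps never consume earlier predictions - the mechanism the
# abstract credits with mitigating error accumulation.
A = rng.normal(0, 0.01, (W_IN, H))
# Critic: scalar Q-value for a (state, prediction) pair.
C = rng.normal(0, 0.01, (W_IN + H,))

lr_a, lr_c, beta = 0.05, 0.05, 0.1   # assumed hyperparameters
mae_init = np.abs(X @ A - Y).mean()
for _ in range(300):
    pred = X @ A                                 # (N, H) joint prediction
    reward = -np.abs(pred - Y).mean(axis=1)      # negative MAE as reward
    feats = np.concatenate([X, pred], axis=1)
    q = feats @ C
    C -= lr_c * feats.T @ (q - reward) / len(X)  # critic regresses to reward
    # Actor update: supervised MSE gradient plus a critic-guided ascent term,
    # mirroring the combined SL + RL training the abstract describes.
    A -= lr_a * (X.T @ (pred - Y) / len(X))
    A += lr_a * beta * np.outer(X.mean(axis=0), C[W_IN:])
mae_final = np.abs(X @ A - Y).mean()
print(mae_final < mae_init)
```

Because the Actor outputs the whole horizon in one shot, the gradient of every horizon step flows directly from ground truth rather than through earlier predicted steps, which is the sense in which non-autoregressive modelling sidesteps error accumulation.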
Language:
English
Type (Professor's evaluation):
Scientific
No. of pages:
14