A Data-Driven Two-Stage Prediction Model for Train Primary-Delay Recovery Time
Abstract
Accurate prediction of train delay recovery is critical for railway incident management and providing passengers with accurate journey time. In this paper, a two-stage prediction model is proposed to predict the recovery time of train primary-delay based on the real records from High-Speed Railway (HSR). In Stage 1, two models are built to study the influence of feature space and model framework on the prediction accuracy of buffer time in each section or station. It is found that explicitly inputting the attribute features of stations and sections to the model, instead of implicit simulation, will improve the prediction accuracy effectively. For validation purpose, the proposed model has been compared with several alternative models, namely, Logistic Regression (LR), Artificial Neutral Network (ANN), Support Vector Machine (SVM) and Gradient Boosting Tree (GBT). The results show that its remarkable performance is better than other schemes. Specifically, when the error is extended to 3min, the proposed model can achieve up to the accuracy of 94.63%. It proves that our method has high value in practical engineering application. Considering the delay propagation of trains is a complex process, our future study will focus on building delay propagation knowledge base and dispatcher experience knowledge base.