Abstract: Public transportation systems are pivotal for sustainable urban mobility, yet frequent delays in buses, metros, and trams undermine service reliability, passenger satisfaction, and operational efficiency. This study proposes a hybrid CNN-LSTM model that classifies trips as "Delayed" or "On Time". The model is trained on a dataset of 2,000 records covering operational features (transport mode, route details, scheduled and actual times), temporal attributes (peak hours, weekdays, seasons, holidays), meteorological variables (temperature, humidity, wind speed, precipitation), and exogenous factors (traffic congestion index, event attendance). Preprocessing imputes missing values and applies Recursive Feature Elimination (RFE) with cross-validation to select features, which mitigates multicollinearity and improves model interpretability. The supervised learning pipeline, implemented in scikit-learn and TensorFlow, uses a CNN to extract spatial hierarchies from the multivariate inputs and an LSTM to model temporal dependencies in delay sequences, with a Random Forest serving as an ensemble baseline. Evaluated with stratified k-fold validation, precision-recall curves, and confusion matrix analysis, the approach achieves accuracy above 92% and an F1-score above 0.91, outperforming the benchmarks. The system is deployed as a Flask-based web application with secure authentication, interactive Plotly dashboards, and real-time inference APIs, supporting proactive decision-making by transit authorities and scalable passenger information services.
Keywords: CNN-LSTM hybrid model, public transport delays, Recursive Feature Elimination, spatiotemporal prediction, Flask deployment, stratified validation
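As an illustration of the evaluation metrics named in the abstract (accuracy, F1-score, confusion matrix), the following minimal sketch computes them for the two class labels "Delayed" and "On Time". It is not the authors' code: the function name `binary_metrics`, the choice of "Delayed" as the positive class, and the eight example labels are assumptions made only for this example.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive="Delayed"):
    """Return confusion-matrix counts plus accuracy, precision, recall, F1.

    Hypothetical helper for illustration; treats `positive` as the
    positive class and every other label as negative.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            counts["tp" if pred == positive else "fn"] += 1
        else:
            counts["fp" if pred == positive else "tn"] += 1

    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))
    total = tp + fp + fn + tn

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels, not data from the study.
y_true = ["Delayed", "Delayed", "Delayed", "On Time",
          "On Time", "On Time", "On Time", "Delayed"]
y_pred = ["Delayed", "Delayed", "On Time", "On Time",
          "On Time", "Delayed", "On Time", "Delayed"]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In a stratified k-fold setting, a function like this would be applied to the held-out fold of each split and the per-fold scores averaged.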
DOI: 10.17148/IARJSET.2026.13154
[1] Shreelakshmi D M, K R Sumana, "AI-Based Transit Delay Predictor," International Advanced Research Journal in Science, Engineering and Technology (IARJSET), DOI: 10.17148/IARJSET.2026.13154