A GTFS data acquisition and processing framework and its application to train delay prediction

Publication Name

International Journal of Transportation Science and Technology


With advances in artificial intelligence and deep learning, a growing number of data sources play increasingly critical roles in planning and operating transportation services. The General Transit Feed Specification (GTFS), which provides standardized open data in both static and real-time formats, is widely used in public transport planning and operation management. However, compared with other extensively studied data sources such as smart card data and GPS trajectory data, GTFS data has not yet been properly investigated. Using GTFS data is challenging for both transport planners and researchers because the raw data is difficult and complex to understand, process, and leverage. In this paper, a GTFS data acquisition and processing framework is proposed as an efficient and effective benchmark tool for converting and fusing GTFS data into a ready-to-use format. To validate and test the proposed framework, a multivariate multistep Long Short-Term Memory (LSTM) model is developed to predict train delays with minor anomalies in Sydney as a case study. The new framework offers great potential for broader applications and deeper research.

Open Access Status

This publication may be available as open access



Link to publisher version (DOI)