Large volumes of data are now collected from sensor devices across cyber-physical networks. Accurate and reliable primary delay predictions are essential for rail operations management and planning, yet very few existing 'big data' methods meet the specific needs of railways. We propose a comprehensive, general, data-driven Primary Delay Prediction System (PDPS) framework that combines the General Transit Feed Specification (GTFS), Critical Point Search (CPS), and deep learning models to exploit data fusion. Based on this framework, we have also developed an open-source data collection and processing tool that lowers the barrier to using different open data sources. Finally, we demonstrate an advanced deep learning model, a novel ConvLSTM Encoder-Decoder model with CPS, for better primary delay predictions.