In public transport systems, automated fare collection (AFC) systems record travellers’ spatial and temporal information and generate a large volume of data every day, which has attracted growing interest from both academics and practitioners. Advances in data availability and data mining techniques provide a great opportunity to investigate a wide range of research questions efficiently and effectively. A comprehensive literature review of applications of public transport smart card data before 2011 is available elsewhere. Among the relevant studies of recent years, one proposed a data fusion method, based on the naive Bayes classifier, to infer the behavioral attributes of passengers’ trips. The method was applied to a single railway station in Osaka, using boarding/alighting information recorded by smart cards, and was validated against trip survey data. Another applied an unsupervised machine learning method, the continuous hidden Markov model, to impute the missing activities in each trip chain by integrating clustering and transition models. A third compared origin-destination (OD) matrices derived from survey data with those derived from smart card data and showed that the trip demands from the two sources were highly correlated, which implies that smart card data might provide a more efficient and less expensive way to construct OD matrices. Traditional surveys have long served as the major method of gathering trip information, but they are costly in manpower, time and money. Moreover, the gap between actual trips and survey results cannot be ignored. This study aims to investigate the travel purposes of public transit passengers and to develop a data analysis framework for estimating trip purposes, which can serve as an alternative or a complement to the traditional survey method.