Threshold-based prediction of schedule overrun in software projects
Risk identification is the first critical task of risk management, since it informs the planning of measures to deal with risks. While software projects have a high risk of schedule overruns, current practices in risk management mostly rely on high-level guidance and the subjective judgements of experts. In this paper, we propose a novel approach to support risk identification using historical data associated with a software project. Specifically, our approach identifies patterns of abnormal behaviours that caused project delays and uses this knowledge to develop an interpretable risk prediction model that predicts whether current software tasks (in the form of issues) will cause a schedule overrun. The abnormal behaviour identification is based on a set of configurable threshold-based risk factors. Our approach aims to provide not only predictive models, but also an interpretable outcome that can be read as patterns of combinations of risk factors. The evaluation results from two case studies (Moodle and Duraspace) demonstrate the effectiveness of our predictive models, achieving 78% precision, 56% recall, 65% F-measure, and 84% Area Under the ROC Curve.
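
As a rough illustration of what a configurable threshold-based risk factor could look like, the following Python sketch flags an issue whenever an attribute exceeds its configured limit, and also checks that the reported F-measure follows from the reported precision and recall. The attribute names (`reopen_count`, `days_unassigned`, `comment_count`) and the threshold values are hypothetical and are not taken from the paper; only the precision and recall figures come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    # Hypothetical issue attributes, chosen for illustration only.
    reopen_count: int
    days_unassigned: int
    comment_count: int


# Hypothetical, configurable thresholds: a factor fires when the
# attribute value is strictly greater than its limit.
THRESHOLDS = {"reopen_count": 2, "days_unassigned": 14, "comment_count": 20}


def risk_flags(issue, thresholds=THRESHOLDS):
    """Return a dict mapping each risk factor name to True if it fires."""
    return {name: getattr(issue, name) > limit
            for name, limit in thresholds.items()}


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    issue = Issue(reopen_count=3, days_unassigned=5, comment_count=25)
    print(risk_flags(issue))
    # Precision and recall as reported in the abstract.
    print(round(f_measure(0.78, 0.56), 2))
```

The resulting dictionary of fired factors is the kind of binary pattern that an interpretable model could combine into rules; how the paper actually derives and combines its factors is not specified in this abstract.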