Degree Name

Doctor of Philosophy


School of Information Systems and Technology


Chronic diabetes complications are mostly irreversible, so the ability to predict them is a vital concern for health care systems. Only a few risk advisory tools exist in the domain of diabetes secondary prevention. Moreover, most existing systems use a number of risk factors to predict a single complication. This appears to be due to data limitations and methodological problems.

This research focuses on developing a risk advisor model that predicts the chance of diabetic complications from a patient's observed risk factors.

To overcome the first problem, data limitations, we carried out a form of meta-analysis: we gathered knowledge from other studies that have investigated the relationship between different risk factors and complications. Using secondary data allowed us to draw on follow-up information covering more than 450,000 patient-years.

The relationship between the risk factors and diabetic complications is very complicated. To handle it, we defined a method that divides the problem into smaller and simpler parts. The whole n-to-k relationship (n risk factors, k complications) is broken down into k different n-to-1 relationships (written n-1 here). Each of these n-1 dependencies is then broken into n one-to-one (1-1) models.

In the second step we created models of the one-to-one correspondence between factors and complications (1-1 models). Two competing prediction techniques, regression analysis (seven patterns) and artificial neural networks (ANN), were applied to develop the 1-1 models. The best-fitting regression models outperformed both the ANN model and the other six regression patterns in predictive ability. In step three, all 1-1 models related to an individual complication were integrated using the naïve Bayes approach. The integrated models show how n factors determine the probability of one complication. As the next step, a Bayesian belief network was developed to show the influence of all the factors and complications on each other.
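One way the integration of 1-1 models could work is sketched below. This is a minimal illustration, not the thesis's implementation: it assumes each 1-1 model returns a probability P(C | f_i) of the complication given one factor, that a baseline prevalence P(C) is available, and that the factors are conditionally independent given the complication status (the naïve Bayes assumption). The function name and all numbers are hypothetical.

```python
def combine_naive_bayes(prior, single_factor_probs):
    """Combine single-factor probabilities P(C | f_i) into P(C | f_1..f_n).

    Assumption (naive Bayes): the factors are conditionally independent
    given the complication status. Under that assumption,
        odds(C | f_1..f_n) = odds(C) * prod_i [ odds(C | f_i) / odds(C) ].
    `prior` is the baseline probability P(C); both names are illustrative.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    odds = prior_odds
    for p in single_factor_probs:
        if not 0.0 < p < 1.0:
            raise ValueError("each probability must lie strictly between 0 and 1")
        # Each factor multiplies the odds by its ratio to the baseline odds.
        odds *= (p / (1.0 - p)) / prior_odds
    # Convert the combined odds back to a probability.
    return odds / (1.0 + odds)


# Hypothetical inputs: baseline risk 0.5, two 1-1 models each returning 0.8.
combined = combine_naive_bayes(0.5, [0.8, 0.8])
print(round(combined, 4))
```

A factor whose 1-1 probability equals the baseline leaves the combined risk unchanged, which is a quick sanity check on any such combination rule.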

We assessed the 1-1 models using R², adjusted R² and the F-test. Across all fifteen 1-1 models, R² ranged from 0.374 to 0.926 and the F-ratio from 14 to 125, which indicates statistically acceptable results. A random set of real patient data from the AusDiab study was used to assess the validity of the final model. Sensitivity and specificity ranged between 70 and 100 percent, and positive predictive value between 50 and 100 percent, indicating a very high level of success in predicting all five diabetes complications.
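For readers unfamiliar with these measures, the sketch below shows how they are conventionally computed. It uses the standard textbook definitions for a least-squares regression with an intercept and for a 2x2 confusion matrix; the function names and the sample numbers are illustrative assumptions and are not taken from the thesis data.

```python
def regression_fit_stats(y, y_hat, n_predictors=1):
    """Return (R^2, adjusted R^2, F-ratio) for observed y and fitted y_hat.

    Standard definitions for an ordinary least-squares fit with an
    intercept; undefined if y is constant or the fit is perfect.
    """
    n = len(y)
    mean_y = sum(y) / n
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)             # total variation
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained part
    r2 = 1.0 - ss_res / ss_tot
    df_res = n - n_predictors - 1                             # residual degrees of freedom
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / df_res
    f_ratio = (r2 / n_predictors) / ((1.0 - r2) / df_res)
    return r2, adj_r2, f_ratio


def diagnostic_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, positive predictive value)."""
    sensitivity = tp / (tp + fn)  # share of true cases that were predicted
    specificity = tn / (tn + fp)  # share of non-cases correctly ruled out
    ppv = tp / (tp + fp)          # share of positive predictions that were right
    return sensitivity, specificity, ppv


# Illustrative numbers only.
r2, adj_r2, f_ratio = regression_fit_stats([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8])
sens, spec, ppv = diagnostic_metrics(tp=8, fp=2, tn=18, fn=2)
print(round(r2, 3), round(adj_r2, 3), round(f_ratio, 1))
print(sens, spec, ppv)
```

Positive predictive value depends on how common the complication is in the test sample, which is one reason it can fall well below sensitivity and specificity for rarer complications.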

Considering that this model uses ordinary patient data and incurs no extra expense, this level of validity strongly supports its use in the secondary prevention of diabetes.

Some of the advantages of this system include the following:
• General improvement to the quality of life of the patients
• Better management in the use of human resources
• Reduction in the cost of hospitalization
• Better understanding of diabetes complication management

This thesis developed a predictive model applying a novel combination of different techniques. It has clear benefits for people with diabetes and the health workers who are involved in diabetes diagnosis and treatment.



Unless otherwise indicated, the views expressed in this thesis are those of the author and do not necessarily represent the views of the University of Wollongong.