Illawarra Health and Medical Research Institute

Data mining: potential applications in research on nutrition and health

Marijka Batterham, University of Wollongong
Elizabeth Neale, University of Wollongong
Allison Martin, University of Wollongong
Linda C. Tapsell, University of Wollongong

RIS ID

112176

Publication Details

Batterham, M., Neale, E., Martin, A. & Tapsell, L. (2017). Data mining: potential applications in research on nutrition and health. Nutrition and Dietetics, 74 (1), 3-10.

Abstract

Aim: Data mining enables further insights from nutrition-related research, but caution is required. The aim of this analysis was to demonstrate and compare the utility of data mining methods in classifying a categorical outcome derived from a nutrition-related intervention.

Methods: Baseline data (23 variables, 8 categorical) on participants (n = 295) in an intervention trial were used to classify participants according to whether they met the criterion of achieving 10 000 steps per day. Results from classification and regression trees (CARTs), random forests, adaptive boosting, logistic regression, support vector machines and neural networks were compared using area under the curve (AUC) and error assessments.

Results: CART produced the best model in terms of AUC (0.703), overall error (18%) and within-class error (28%). Logistic regression also performed reasonably well compared to the other models (AUC 0.675, overall error 23%, within-class error 36%). Each method ranked the importance of the variables differently. CART found that body fat, quality of life measured by the SF-12 Physical Component Summary (PCS) and the cholesterol:HDL ratio were the most important predictors of meeting the 10 000 steps criterion, while logistic regression showed the SF-12 PCS, glucose levels and level of education to be the most significant predictors (P ≤ 0.01).

Conclusions: The differing outcomes suggest that caution is required when relying on a single data mining method, particularly in a dataset with nonlinear relationships and outliers and when exploring relationships that were not the primary outcomes of the research.
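
The evaluation measures named in the Methods and Results (AUC, overall error and within-class error) can be illustrated with a minimal sketch. This is not the authors' code: the function names `auc` and `error_rates`, the 0.5 decision threshold and the toy data are all assumptions made for illustration, and "within-class error" is taken here to mean the misclassification rate computed separately for each outcome class, which may differ from the definition used in the paper.

```python
from collections import defaultdict


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    Returns the fraction of (positive, negative) pairs in which the positive
    case has the higher score; tied pairs count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case in each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def error_rates(labels, scores, threshold=0.5):
    """Return (overall error, {class: error rate within that class}).

    A case is predicted as class 1 when its score is >= threshold
    (the threshold is an assumption, not taken from the study).
    """
    total = defaultdict(int)
    wrong = defaultdict(int)
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        total[y] += 1
        if predicted != y:
            wrong[y] += 1
    overall = sum(wrong.values()) / sum(total.values())
    within = {c: wrong[c] / total[c] for c in total}
    return overall, within


# Hypothetical toy data (1 = met the steps criterion); not from the study.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1]

model_auc = auc(labels, scores)
overall_error, within_class_error = error_rates(labels, scores)
print(f"AUC={model_auc:.3f}, overall error={overall_error:.0%}")
print({c: f"{e:.0%}" for c, e in within_class_error.items()})
```

Reporting the per-class rates alongside the overall error matters when the classes are unbalanced, because a model can achieve a low overall error while misclassifying most of the smaller class.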

Please refer to publisher version or contact your library.

Link to publisher version (DOI)

http://dx.doi.org/10.1111/1747-0080.12337
