Identification of Potential Risk Factors of Diabetes for the Qatari Population
© 2020 IEEE. Large-scale cohorts have been established in different regions of the world to identify the complex interactions of genetic, environmental, and lifestyle-related factors that may contribute to chronic diseases, including diabetes. Qatar Biobank (QBB) is the largest cohort-study repository specific to the Qatari population. A few studies based on the QBB cohort have highlighted multiple risk factors for diabetes in the Qatari population. However, no comprehensive research has used machine-learning techniques to identify the key factors that may contribute to diabetes in this population. We developed several machine-learning models using QBB data to distinguish diabetic patients from non-diabetic participants, who formed the control group for this study. From a roster of several hundred measurements, we identified 25 potential risk factors that might be influential in distinguishing diabetic patients from non-diabetic participants. Among the identified risk factors, HbA1c, glucose, and LDL-cholesterol ranked as the most influential. Using these risk factors, we also trained several machine-learning models to separate diabetic subjects from healthy subjects. Overall, the classifiers achieved an F1-score of 0.85 in distinguishing diabetic from non-diabetic subjects. Further investigation will pave the way for the inclusion of the identified risk factors in the standard diabetes screening process for the Qatari population.
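For readers unfamiliar with the reported metric, the F1-score is the harmonic mean of precision and recall for the positive (here, diabetic) class. The following is a minimal sketch of that definition only; the function name `f1_from_counts` and the confusion-matrix counts are hypothetical illustrations and are not taken from the QBB data or from the study's results.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """Return the F1-score for the positive class from confusion-matrix counts.

    tp: diabetic subjects correctly predicted as diabetic
    fp: non-diabetic subjects wrongly predicted as diabetic
    fn: diabetic subjects wrongly predicted as non-diabetic
    """
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    # Recall: share of true positives that the classifier recovered.
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to illustrate the arithmetic.
example_f1 = f1_from_counts(tp=8, fp=2, fn=2)
print(f"F1 = {example_f1:.2f}")
```

Because the F1-score ignores true negatives, it stays informative when the control group is much larger than the diabetic group, which is one common reason to report it instead of plain accuracy.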