Artificial Intelligence With Temporal Features Outperforms Machine Learning in Predicting Diabetes

Iqra Naveed, Muhammad Farhat Kaleem, Karim Keshavjee, Aziz Guergachi

Abstract

Diabetes mellitus type 2 is increasingly being called a modern preventable pandemic: even with excellent treatments available, the rate of diabetes complications is rising rapidly. Predicting diabetes and identifying it in its early stages could make it easier to prevent, allowing enough time to implement therapies before it gets out of control. Leveraging longitudinal electronic medical record (EMR) data with deep learning has great potential for diabetes prediction. This paper examines the predictive competency of deep learning models, which can incorporate the time dimension of risk, in contrast to state-of-the-art machine learning models. The proposed research investigates a variety of deep learning models and features for predicting diabetes.

Introduction

Diabetes mellitus type 2 (T2D) is a chronic disease that is rapidly growing in prevalence and is increasingly being called a preventable pandemic [1]. T2D is associated with long-term chronic damage and dysfunction of organs, particularly the heart, kidneys, eyes and blood vessels [2]. As reported by the International Diabetes Federation, 537 million individuals have diabetes globally, and this number is expected to increase to 783 million by the year 2045 [2]. T2D causes 1.6 million deaths every year and is the seventh leading cause of death.

Materials and method

The proposed framework is divided into data collection, data preparation, pre-processing, train-test split, prediction models, quantifying features, and performance evaluation.
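As a rough illustration of the data preparation and train-test split stages named above, the following minimal sketch groups visit records into fixed-length per-patient sequences and splits at the patient level. It is not the authors' implementation: the record layout, the feature choice (FBS, A1c, BMI), the padding scheme, the function names `build_sequences` and `split_by_patient`, and every value are assumptions made for illustration only.

```python
import random
from collections import defaultdict

# Hypothetical visit records: (patient_id, visit_index, FBS, A1c, BMI, label).
# All values are invented; the label is assumed to be patient-level.
VISITS = [
    ("p1", 0, 5.4, 5.6, 27.0, 0),
    ("p1", 1, 5.9, 5.9, 28.1, 0),
    ("p2", 0, 6.3, 6.2, 31.5, 1),
    ("p2", 1, 6.8, 6.4, 32.0, 1),
    ("p2", 2, 7.1, 6.7, 32.4, 1),
    ("p3", 0, 5.1, 5.3, 23.9, 0),
    ("p4", 0, 6.0, 6.0, 29.8, 1),
    ("p4", 1, 6.5, 6.3, 30.2, 1),
]


def build_sequences(visits, max_visits=3, pad_value=0.0):
    """Group visits per patient, order them by visit index, pad to a fixed length."""
    grouped = defaultdict(list)
    labels = {}
    for pid, idx, fbs, a1c, bmi, label in visits:
        grouped[pid].append((idx, [fbs, a1c, bmi]))
        labels[pid] = label
    sequences = {}
    for pid, rows in grouped.items():
        rows.sort(key=lambda row: row[0])
        feats = [f for _, f in rows][-max_visits:]  # keep the most recent visits
        padding = [[pad_value] * 3 for _ in range(max_visits - len(feats))]
        sequences[pid] = padding + feats  # pre-pad so the latest visit comes last
    return sequences, labels


def split_by_patient(patient_ids, test_fraction=0.25, seed=0):
    """Split at the patient level so no patient appears in both sets."""
    ids = sorted(patient_ids)
    random.Random(seed).shuffle(ids)
    n_test = max(1, int(len(ids) * test_fraction))
    return ids[n_test:], ids[:n_test]


sequences, labels = build_sequences(VISITS)
train_ids, test_ids = split_by_patient(sequences)
print(len(train_ids), "training patients,", len(test_ids), "test patient(s)")
```

Splitting by patient rather than by visit is a design choice in this sketch: it keeps all visits of one individual on the same side of the split, so a model is not evaluated on a patient it has already seen.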

Results

The results compare the predictive efficacy of deep learning models with that of baseline machine learning models for diabetes prediction.

Discussion

Early prediction of diabetes onset is important for all health care systems, as diabetes is now considered a modern preventable pandemic. Leveraging longitudinal EMR data with deep learning can identify individuals at high risk of developing diabetes, enabling early intervention that could delay or even prevent its onset. State-of-the-art machine learning algorithms, which are reported on extensively in the literature for predictive analysis, cannot capture long-term sequences and temporal relations.
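One way to see why visit order matters is to compare two patients whose readings are identical as a set but move in opposite directions. The sketch below assumes a tabular model is given only order-free aggregates (mean, minimum, maximum); that setup is an assumption for illustration, not a description of the baselines in this study, and the A1c values and the helper names `aggregate_features` and `visit_deltas` are invented.

```python
from statistics import mean

# Hypothetical A1c readings (%) over four visits; values are invented.
rising = [5.4, 5.7, 6.0, 6.3]
falling = [6.3, 6.0, 5.7, 5.4]


def aggregate_features(series):
    """Order-free summary of the kind a tabular model might be given."""
    return (round(mean(series), 2), min(series), max(series))


def visit_deltas(series):
    """Change between consecutive visits; this depends on visit order."""
    return [round(b - a, 2) for a, b in zip(series, series[1:])]


# The aggregates cannot tell the two patients apart...
print(aggregate_features(rising) == aggregate_features(falling))
# ...while the ordered sequence shows opposite trends.
print(visit_deltas(rising))
print(visit_deltas(falling))
```

A sequence model receives the readings in visit order and can, in principle, learn from the trend; a model fed only the aggregates sees the same input for both patients.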

Conclusions

This study compares the predictive strength of deep learning models with that of machine learning models. The intent is to identify the most accurate deep learning model that uses temporal features, and the most significant features for diabetes prediction. Model performance was assessed for critical features, risk factors, training data density, and the visit history of a patient. The results show that deep learning models offer superior diabetes prediction, with accuracy above 91%. The feature analysis shows significant predictive potential for key features such as FBS, A1c and BMI. Risk factor analysis indicates that obese, middle-aged and hypertensive individuals are more susceptible to diabetes. This is in keeping with established medical knowledge, but these factors are not used quantitatively in current clinical practice to predict the future onset of diabetes.

Citation: Naveed I, Kaleem MF, Keshavjee K, Guergachi A (2023) Artificial intelligence with temporal features outperforms machine learning in predicting diabetes. PLOS Digit Health 2(10): e0000354. https://doi.org/10.1371/journal.pdig.0000354

Editor: Danilo Pani, University of Cagliari: Universita degli Studi Di Cagliari, ITALY

Received: April 4, 2023; Accepted: August 19, 2023; Published: October 25, 2023

Copyright: © 2023 Naveed et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: Data cannot be shared publicly because they were obtained under a data sharing agreement. Data are available from the Canadian Primary Care Sentinel Surveillance Network (https://www.cpcssn.ca) for researchers who meet the criteria for access to confidential data.

Funding: This research was partially supported by an NSERC Discovery Grant 2019-24 held by author AG. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000354#sec021
