
Health Promot Perspect. 2025;15(1): 82-92.
doi: 10.34172/hpp.025.43105

Original Article

Artificial intelligence survival models for identifying relevant risk factors for incident diabetes in Azar cohort population

Neda Gilani 1, Mohammadhossein Somi 2, Farzaneh Hamidi 3, Pasqualina Santaguida 4, Elnaz Faramarzi 2, Reza Arabi Belaghi 5*

1 Department of Statistics and Epidemiology, Faculty of Health, Tabriz University of Medical Sciences, Tabriz, Iran
2 Liver and Gastrointestinal Diseases Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
3 Student Research Committee, Tabriz University of Medical Sciences, Tabriz, Iran
4 Department of Health Research Methods, Evidence, and Impact (HEI) Associate Member, Rehabilitation Science McMaster University, Hamilton, Canada
5 Unit of Applied Statistics and Mathematics, Department of Energy and Technology, Faculty of Natural Resources and Agricultural Sciences, Swedish University of Agricultural Sciences, Uppsala, Sweden
*Corresponding Author: Reza Arabi, Email: rezaarabi11@gmail.com

Abstract

Background: This study aimed to identify risk factors associated with time to onset of type 2 diabetes using artificial intelligence (AI) survival models (SM) in a population cohort from East Azerbaijan, Iran.

Methods: Data from the Azar Cohort, collected between 2014 and 2020, were analyzed using random forest (RF) variable selection together with Cox regression to identify the most relevant risk factors associated with diabetes. We then developed prediction models using RF survival analysis. LASSO and RF variable selection were used to select the most important variables. The concordance index (C-index) was used to evaluate the predictive performance of the models.

Results: Our LASSO-Cox regression identified six factors significantly associated with diabetes: age, mean corpuscular hemoglobin concentration (MCHC), waist circumference (WC), body mass index (BMI), use of sleep medication, and hypertension (stage 1 and stage 2). The model included all variables with a C-index of 76.3%. In contrast, the RF analysis identified 21 important variables predicting a higher probability of having diabetes. Of those, WC, MCHC, triglyceride, and age were the most important predictors of diabetes. The RF model converged after 500 trees with an out-of-bag (OOB) error of 0.28 and a C-index of 79.5%.

Conclusion: RF machine learning algorithms and LASSO-Cox regression analyses consistently identified WC, hypertension, and MCHC as the main risk factors for developing diabetes. The RF approach demonstrated slightly better accuracy in predicting the likelihood of diabetes at different time points.
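
To make the evaluation metric used in this abstract concrete, the following is a minimal sketch of one common definition of the C-index (Harrell's) for right-censored data. It is not the authors' code: the function name, the toy data, and the simplified handling of tied follow-up times are assumptions made purely for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times       -- observed follow-up time for each subject
    events      -- 1 if the event (e.g. diabetes onset) was observed, 0 if censored
    risk_scores -- model output; a higher score means an earlier predicted event
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped in this sketch
        # Let `a` be the subject with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair cannot be ordered
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk had the earlier event
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: five subjects, two of them censored.
toy_times = [2, 4, 5, 6, 7]
toy_events = [1, 0, 1, 1, 0]
toy_risk = [0.9, 0.3, 0.4, 0.6, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risk)  # 6 of 7 comparable pairs agree
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so the reported C-indices of 76.3% and 79.5% indicate how often each model ranks a comparable pair of participants correctly.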


Submitted: 09 Apr 2024
Revision: 30 Nov 2024
Accepted: 01 Dec 2024
ePublished: 06 May 2025