Comparison of Machine Learning Methods for the Prediction of Type 2 Diabetes in Primary Care Setting Using EHR Data
Date
2023-10
Authors
Olwendo, Amos Otieno
Ochieng, George
Rucha, Kenneth
Publisher
JAGST
Abstract
Diabetes remains a major global public health challenge, hence the need for better methods of
managing it. Machine learning could provide reliable solutions for early
detection and management of diabetes. This study conducted experiments to compare a
number of selected machine learning approaches to determine their suitability for early
detection of diabetes in the primary care setting. A retrospective study was conducted using
an EHR dataset of confirmed cases of diabetes collected during routine care at Nairobi Hospital.
Institutional ethical approvals were obtained, and data were retrieved from the database
through stratified sampling based on gender. Diagnoses were confirmed using the ICD-10 codes.
Records with approximately 5% missing values were excluded from this analysis. Data were processed
by correction of errors and replacement of missing values using measures of central tendency.
The data were transformed through normalization using the decimal-scaling method. Data
analysis was conducted using selected supervised and unsupervised learning algorithms. Model
performances were validated using metrics for the evaluation of classification and clustering
results, respectively. Random Forest had the highest accuracy (0.95) and the lowest error rate (0.05), while
Gradient Boosting and a Multilayer Perceptron (MLP) with 3 hidden layers each obtained an accuracy
of 0.94 and an error rate of 0.06. The process of selecting machine learning algorithms
needs to explore both supervised and unsupervised learning techniques. In addition, an
appropriate architectural desig
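
The preprocessing steps named in the abstract (replacing missing values with a measure of central tendency, then decimal-scaling normalization) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the choice of the median as the central-tendency measure, and the sample values are all assumptions made for illustration. Decimal scaling divides each value by 10^j, where j is the smallest integer for which the largest absolute value falls below 1.

```python
from statistics import median


def impute_with_median(values):
    """Replace None entries with the median of the observed values.

    Assumption: the abstract only says "measures of central tendency";
    the median is used here purely as an example.
    """
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def decimal_scale(values):
    """Decimal-scaling normalization: v' = v / 10**j.

    j is the smallest non-negative integer such that max(|v'|) < 1.
    Returns the scaled values together with j.
    """
    max_abs = max(abs(v) for v in values)
    j = 0
    # Increase the exponent until the largest magnitude drops below 1.
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values], j


# Hypothetical readings, not taken from the study's dataset.
readings = [120, None, 95, 310]
filled = impute_with_median(readings)      # None is replaced by the median, 120
scaled, exponent = decimal_scale(filled)   # exponent is 3, so values are divided by 1000
```

Under this scheme every feature ends up in the open interval (-1, 1), which keeps features with large raw magnitudes from dominating distance-based clustering or gradient-based training.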
Description
article
Keywords
Comparison, machine learning, classification, clustering, type 2 diabetes
Citation
Olwendo, A. O., Ochieng, G., & Rucha, K. (2024). Comparison of machine learning methods for the prediction of type 2 diabetes in primary care setting using EHR data. Journal of Agriculture, Science and Technology, 23(1), 24-36.