International Advanced Research Journal in Science, Engineering and Technology
A Monthly Peer-Reviewed Multidisciplinary Journal
ISSN Online 2393-8021 | ISSN Print 2394-1588 | Since 2014
Volume 13, Issue 1, January 2026

Hybrid Machine Learning Approaches for Early Diabetes Prediction Using Patient Health Data

Mohammed Nawaz Khan, K R Sumana


Abstract: Diabetes mellitus, a pervasive chronic metabolic disorder, frequently evades early detection until irreversible complications (cardiovascular disease, nephropathy, neuropathy, and retinopathy) manifest. Conventional diagnostics that rely on laboratory assays and clinical expertise remain constrained by accessibility and cost. This investigation introduces a machine learning-driven diabetes risk prediction system built on the Pima Indians Diabetes Dataset, using systematic data preprocessing, feature selection, and Logistic Regression modeling to deliver interpretable early-stage risk assessment from standard clinical parameters. Deployed through a Flask microservice architecture, the platform provides real-time probabilistic predictions with confidence intervals via a web interface, supporting patient self-screening and healthcare provider decision-making. Empirical validation confirms robust predictive performance suitable for population-scale early warning, while explicit positioning as an educational adjunct, rather than a diagnostic substitute, ensures clinical responsibility. The system advances accessible prediabetes surveillance, enabling timely lifestyle and pharmacotherapeutic interventions to mitigate long-term morbidity. CheckYourDiabetic introduces a hybrid machine learning framework for early Type 2 diabetes prediction that integrates Logistic Regression, K-Nearest Neighbors, Random Forest, and XGBoost through a stacking ensemble on the Pima Indians Diabetes Dataset (n=768, 8 clinical features). Following preprocessing (KNN imputation, SMOTE oversampling, and RFE feature selection), the system outperforms the individual classifiers (AUC-ROC: 0.94, Sensitivity: 92%) through complementary modeling of linear, local, and nonlinear biomarker interactions. Deployed as a Flask-based web application, it delivers real-time risk stratification with SHAP-based interpretability, enabling accessible pre-symptomatic screening and timely intervention to mitigate diabetes complications in resource-constrained settings.
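
The preprocessing and stacking steps named in the abstract can be illustrated with a minimal scikit-learn sketch. This is not the authors' implementation, and several choices are assumptions made only for illustration: the data are a synthetic stand-in generated with `make_classification` (not the Pima Indians Diabetes Dataset), `GradientBoostingClassifier` is swapped in for XGBoost so that the example depends on scikit-learn alone, SMOTE oversampling is left out because it requires the separate imbalanced-learn package, and all hyperparameters (neighbor counts, number of selected features, the 0.5 decision threshold) are arbitrary.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.feature_selection import RFE
from sklearn.impute import KNNImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in with the same shape as the dataset in the abstract
# (768 rows, 8 features). These are NOT real patient records.
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, n_redundant=1,
    weights=[0.65, 0.35], random_state=0,
)

# Blank out roughly 5% of the values to mimic missing measurements.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base learners: linear, local, and two tree-based nonlinear models.
base_learners = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("knn", KNeighborsClassifier(n_neighbors=7)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    # Assumption: gradient boosting from scikit-learn stands in for XGBoost.
    ("gb", GradientBoostingClassifier(random_state=0)),
]

model = Pipeline([
    ("impute", KNNImputer(n_neighbors=5)),   # KNN imputation of missing values
    ("scale", StandardScaler()),             # helps the LR and KNN learners
    ("select", RFE(LogisticRegression(max_iter=1000),
                   n_features_to_select=6)), # recursive feature elimination
    ("stack", StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),
        stack_method="predict_proba",        # meta-learner sees probabilities
        cv=5,                                # out-of-fold predictions for stacking
    )),
])

model.fit(X_train, y_train)

# Predicted probability of the positive class, used here as a risk score.
risk = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, risk)
sensitivity = recall_score(y_test, (risk >= 0.5).astype(int))
print(f"AUC-ROC: {auc:.3f}  Sensitivity: {sensitivity:.3f}")
```

Because every step sits inside one `Pipeline`, imputation, scaling, and feature selection are fitted on the training split only, which avoids leaking test information into the model. The numbers printed by this sketch come from synthetic data and say nothing about the results reported in the abstract.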

How to Cite:

[1] Mohammed Nawaz Khan, K R Sumana, “Hybrid Machine Learning Approaches for Early Diabetes Prediction Using Patient Health Data,” International Advanced Research Journal in Science, Engineering and Technology (IARJSET), DOI: 10.17148/IARJSET.2026.13150

This work is licensed under a Creative Commons Attribution 4.0 International License.