Dataset-Specific Bootstrap-Stability Weighting for Calibrated and Clinically Useful Ensemble Prediction in Medical Diagnosis
DOI:
https://doi.org/10.6000/1929-6029.2025.14.75
Keywords:
Bootstrap, ensemble learning, balanced accuracy, calibration, decision curve analysis, clinical utility
Abstract
Background: Ensemble machine-learning models often perform well within a single medical dataset yet lose discrimination, calibration, and decision usefulness under dataset shift.
Objective: To develop and evaluate the Bootstrap-Guided Optimization System (BOOTMED), a framework that learns dataset-specific weights from resampling stability and uses them to fuse probabilistic predictions, targeting discrimination, calibration, and decision-analytic utility simultaneously.
Methods: Four heterogeneous UCI medical datasets were analyzed: chronic kidney disease (CKD), diabetes, heart disease, and breast cancer. Base learners were k-nearest neighbors, random forest (RF), Gaussian naïve Bayes, and complement naïve Bayes. BOOTMED estimated stability-derived weights over 500 bootstrap resamples and used them to aggregate the base learners' predicted probabilities. BOOTMED was compared with equal-weight voting and stacking on discrimination (balanced accuracy, ROC-AUC), calibration (Brier score, expected calibration error [ECE]), and clinical utility (decision curve analysis).
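The weighting step described in the Methods can be illustrated with a minimal Python sketch. The abstract does not state the weighting formula, so everything below is an assumption made for illustration: the rule "mean bootstrap balanced accuracy minus one standard deviation, then normalized", the function names, and the toy data are all hypothetical. The sketch also resamples only a fixed set of held-out predictions rather than refitting the base learners on each resample.

```python
import random
import statistics

def balanced_accuracy(y_true, y_prob, threshold=0.5):
    """Mean of sensitivity and specificity for binary labels."""
    tp = fn = tn = fp = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if y == 1:
            tp += pred
            fn += 1 - pred
        else:
            fp += pred
            tn += 1 - pred
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return 0.5 * (sens + spec)

def stability_weights(y_true, model_probs, n_boot=500, penalty=1.0, seed=0):
    """Hypothetical rule (not from the paper): score each model by
    mean - penalty * std of its bootstrap balanced accuracy, then normalize."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = {name: [] for name in model_probs}
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        y_b = [y_true[i] for i in idx]
        for name, probs in model_probs.items():
            scores[name].append(balanced_accuracy(y_b, [probs[i] for i in idx]))
    eps = 1e-9  # keeps the normalizer positive if every raw score clips to 0
    raw = {
        name: max(statistics.mean(s) - penalty * statistics.pstdev(s), 0.0) + eps
        for name, s in scores.items()
    }
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}

def fuse(model_probs, weights):
    """Weighted average of the models' predicted probabilities."""
    n = len(next(iter(model_probs.values())))
    return [sum(weights[m] * model_probs[m][i] for m in model_probs) for i in range(n)]

# Toy held-out labels and predicted probabilities (made up for illustration).
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
model_probs = {
    "model_a": [0.9, 0.8, 0.7, 0.6, 0.4, 0.85, 0.2, 0.1, 0.3, 0.45, 0.55, 0.15],
    "model_b": [0.6, 0.4, 0.7, 0.3, 0.55, 0.45, 0.5, 0.35, 0.6, 0.4, 0.65, 0.2],
}
weights = stability_weights(y_true, model_probs)
fused = fuse(model_probs, weights)
print(weights)
```

Because the weights are non-negative and sum to one, the fused output stays a valid probability. Equal-weight voting corresponds to replacing `weights` with a constant 1/M for M models.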
Results: BOOTMED outperformed equal-weight voting and the best single model across all datasets, improving balanced accuracy by approximately 0.7-2.3 percentage points (adjusted p<0.05). Calibration error decreased (lower Brier/ECE), and decision curve analysis showed consistent positive net benefit across clinically relevant thresholds (0.10-0.50). Transferring weights between datasets reduced performance, supporting dataset-specific optimization.
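The calibration and decision-analytic quantities named above have standard definitions, sketched here in Python. The paper's exact settings, such as the number of ECE bins, are not given in the abstract, so the 10 equal-width bins, the function names, and the toy data are assumptions. Net benefit uses the usual decision curve formula, TP/n - (FP/n) * pt/(1 - pt), at threshold probability pt.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Weighted mean gap between observed event rate and mean predicted
    probability over equal-width bins (bin count is an assumption)."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[b].append((y, p))
    n = len(y_true)
    ece = 0.0
    for items in bins:
        if items:
            observed = sum(y for y, _ in items) / len(items)
            predicted = sum(p for _, p in items) / len(items)
            ece += len(items) / n * abs(observed - predicted)
    return ece

def net_benefit(y_true, y_prob, threshold):
    """Decision-curve net benefit of treating patients with p >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

# Toy labels and probabilities (made up for illustration).
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.1, 0.8, 0.3]

brier = brier_score(y_true, y_prob)
ece = expected_calibration_error(y_true, y_prob)
# Sweep the threshold range reported in the Results (0.10 to 0.50).
curve = {t: net_benefit(y_true, y_prob, t) for t in (0.1, 0.2, 0.3, 0.4, 0.5)}
print(brier, ece, curve)
```

A model is useful at a given threshold when its net benefit exceeds both "treat none" (net benefit 0) and "treat all" (net benefit computed with every patient classified positive).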
Conclusion: Bootstrap-guided, dataset-specific weighting can improve discrimination, calibration, and clinical net benefit across heterogeneous medical datasets, offering a simple and reproducible ensembling strategy for diagnostic prediction.
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.