Heart Disease Prediction using an Ensemble Learning Method: A Study at King Abdullah Hospital in Bisha, Saudi Arabia
DOI:
https://doi.org/10.6000/1929-6029.2025.14.52
Keywords:
Machine learning, Ensemble learning, Classification, Disease prediction, Heart disease
Abstract
Early detection of disease is essential to improving healthcare outcomes and saving lives. With technological advances in medicine, machine learning has become a valuable tool for predicting patient health outcomes. Despite the abundance of available patient data, accurately predicting cardiac disease remains challenging. In response, we developed an ensemble learning approach (ELA) that combines three machine learning (ML) techniques. This new combination of classification algorithms predicts cardiac disease more accurately than any of the individual classifiers. We tested our model on a regional dataset collected from King Abdullah Hospital in Bisha, Saudi Arabia. The best results were obtained with an ELA comprising logistic regression (LR), extra trees (ET), and a support vector machine (SVM) with a radial basis function (RBF) kernel: 70 true positives (TP), 72 true negatives (TN), 6 false positives (FP), and 8 false negatives (FN), with an accuracy of 0.9113, sensitivity of 0.8839, specificity of 0.95, positive predictive value (PPV) of 0.9389, negative predictive value (NPV) of 0.8878, area under the curve (AUC) of 0.9569, F1 of 0.9133, Kappa of 0.8220, and Matthews correlation coefficient (MCC) of 0.8277. With our ELA, medical professionals can detect cardiac disease and provide timely interventions to prevent potentially life-threatening health issues.
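For readers less familiar with the metrics listed in the abstract, the following minimal Python sketch shows how the count-based ones are conventionally derived from a binary confusion matrix. It uses the standard textbook definitions and the TP, TN, FP, and FN counts reported in the abstract; the function name `confusion_metrics` is hypothetical, and nothing here is taken from the authors' own code. AUC is omitted because it depends on the classifier's scores rather than on the four counts.

```python
import math


def confusion_metrics(tp, tn, fp, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # recall, true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    ppv = tp / (tp + fp)          # positive predictive value (precision)
    npv = tn / (tn + fn)          # negative predictive value
    f1 = 2 * tp / (2 * tp + fp + fn)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)

    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
        "kappa": kappa,
        "mcc": mcc,
    }


# Counts as reported in the abstract.
metrics = confusion_metrics(tp=70, tn=72, fp=6, fn=8)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Values recomputed from these four counts are close to, but not identical with, the figures in the abstract (accuracy, for instance, comes out at about 0.9103 against the reported 0.9113). One possible explanation is that the reported metrics were averaged over cross-validation folds or repeated runs, but the abstract does not say, so this is only an assumption.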
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.