Evaluation of Data Mining in Heart Failure Disease Classification
DOI:
https://doi.org/10.31154/cogito.v10i2.726.460-473
Keywords:
Heart failure, Data mining, SMOTE, XGBoost, Prediction accuracy
Abstract
This study evaluates the effectiveness of data mining algorithms for heart failure disease classification. Four algorithms were applied to a heart failure dataset: Random Forest, Decision Tree C4.5, Gradient Boosted Machine (GBM), and XGBoost. The dataset was collected from multiple sources and preprocessed with the Synthetic Minority Over-sampling Technique (SMOTE) to address class imbalance. The results indicate that combining SMOTE with parameter optimization through grid search significantly improves the performance of these algorithms. XGBoost and GBM outperformed the other algorithms in accuracy, precision, and recall in both the balanced and the imbalanced data scenarios. In the balanced data scenario, XGBoost achieved an accuracy of 98.75% (error rate 1.25%), while GBM achieved an accuracy of 98.60% (error rate 1.40%). The study confirms that appropriate data preprocessing and parameter optimization are crucial for improving the accuracy of medical data analysis. These findings suggest that XGBoost and GBM are highly effective for heart disease prediction and can support early diagnosis and timely medical intervention. Future research should explore alternative preprocessing techniques and additional algorithms to further improve prediction outcomes.
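The abstract does not describe how SMOTE was implemented, so the following is only a minimal sketch of the general idea behind the technique: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority neighbours. The function names `smote_oversample` and `error_rate`, the toy data, and the choice of `k` are assumptions made for illustration; they are not taken from the study, which most likely used a library implementation.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    Simplified illustration only, not the study's actual implementation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def error_rate(accuracy_percent):
    """Error rate is the complement of accuracy, both expressed in percent."""
    return round(100.0 - accuracy_percent, 2)


# Toy, made-up data (hypothetical): 8 majority-class and 3 minority-class points.
majority = [(float(x), float(x % 3)) for x in range(8)]
minority = [(10.0, 10.0), (11.0, 12.0), (12.0, 11.0)]

# Generate enough synthetic minority points to match the majority class size.
new_points = smote_oversample(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
```

Because every synthetic point is a convex combination of two real minority samples, it always lies between them in feature space. The `error_rate` helper reproduces the relationship between the reported figures: an accuracy of 98.75% corresponds to an error rate of 1.25%, and 98.60% to 1.40%.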
License
Copyright (c) 2024 CogITo Smart Journal

This work is licensed under a Creative Commons Attribution 4.0 International License.