Evaluation of Data Mining in Heart Failure Disease Classification

Authors

  • Nurfadlan Afiatuddin, Universitas Sains dan Teknologi Indonesia
  • Rahmaddeni Rahmaddeni, Universitas Sains dan Teknologi Indonesia
  • Fitri Pratiwi, Universitas Dumai
  • Rapindra Septia, Universitas Sains dan Teknologi Indonesia
  • Heri Hendrawan, Universitas Sains dan Teknologi Indonesia

DOI:

https://doi.org/10.31154/cogito.v10i2.726.460-473

Keywords:

Heart failure, Data mining, SMOTE, XGBoost, Prediction accuracy

Abstract

This study evaluates the effectiveness of data mining algorithms for heart failure classification. Several algorithms, including Random Forest, Decision Tree C4.5, Gradient Boosted Machine (GBM), and XGBoost, were applied to a heart failure dataset. The dataset was collected from multiple sources and preprocessed with the Synthetic Minority Over-sampling Technique (SMOTE) to address class imbalance. The results indicate that applying SMOTE and optimizing parameters through grid search significantly improves the performance of these algorithms. XGBoost and GBM achieved the highest accuracy, precision, and recall on both balanced and imbalanced data. On balanced data, XGBoost reached an accuracy of 98.75% (error rate 1.25%), while GBM reached an accuracy of 98.60% (error rate 1.40%). The study confirms that appropriate data preprocessing and parameter optimization are crucial for improving the accuracy of medical data analysis. These findings suggest that XGBoost and GBM are highly effective for heart disease prediction and can support early diagnosis and timely medical intervention. Future research should explore alternative preprocessing techniques and additional algorithms to further improve prediction outcomes.
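To make the preprocessing step in the abstract concrete, the following is a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. It is a simplified, standard-library-only illustration and not the authors' pipeline or any library's implementation. The function name `smote_oversample`, its parameters, and the toy data are hypothetical.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Simplified sketch of the SMOTE idea (hypothetical helper, not a library
    API): pick a minority sample, pick one of its k nearest minority
    neighbours, and place a new point at a random position between them.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 6 majority rows and 3 minority rows, two features each.
majority = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.4], [0.4, 0.1]]
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.3]]

# Generate enough synthetic minority rows to match the majority class size.
new_points = smote_oversample(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In practice, oversampling of this kind is usually applied only to the training split, so that synthetic rows do not leak into the data used to measure accuracy, precision, and recall.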

Published

2024-12-31

How to Cite

Afiatuddin, N., Rahmaddeni, R., Pratiwi, F., Septia, R., & Hendrawan, H. (2024). Evaluation of Data Mining in Heart Failure Disease Classification. CogITo Smart Journal, 10(2), 460–473. https://doi.org/10.31154/cogito.v10i2.726.460-473