American Journal of Applied Mathematics and Statistics. 2023, 11(2), 35-49
DOI: 10.12691/AJAMS-11-2-1
Original Research

Performance Evaluation and Comparison of Heart Disease Prediction Using Machine Learning Methods with Elastic Net Feature Selection

Sanjib Ghosh1, 2, and Muhammad Alamgir Islam1

1Assistant Professor, Department of Statistics, University of Chittagong, Chittagong 4331, Bangladesh

2Ph.D. Fellow at Zhejiang Gongshang University, China

Pub. Date: April 05, 2023

Cite this paper

Sanjib Ghosh and Muhammad Alamgir Islam. Performance Evaluation and Comparison of Heart Disease Prediction Using Machine Learning Methods with Elastic Net Feature Selection. American Journal of Applied Mathematics and Statistics. 2023; 11(2):35-49. doi: 10.12691/AJAMS-11-2-1

Abstract

Heart disease is a fatal condition whose incidence is rising rapidly worldwide, in both developed and underdeveloped countries. Timely and accurate diagnosis is essential for avoiding patient harm and saving lives. This study compares classifier performance in three stages: with the complete set of attributes, after class balancing, and after feature selection. SMOTE (Synthetic Minority Oversampling Technique) is used for class balancing, and the Elastic Net feature selection algorithm is used to select suitable features from the available dataset. To assess performance, the authors use Logistic Regression (LR), K-nearest neighbor (KNN), Support vector machine (SVM), Random Forest (RF), Adaboost (AB), Artificial neural network (ANN), and Multilayer perceptron (MLP). After class balancing, the performance of ANN and LR increased, while that of SVM and MLP was unchanged. The classification accuracies of the top two algorithms on the full feature set, RF and Adaboost, were 99% and 94%, respectively. After feature selection, the accuracy of RF decreased from 99% to 92% and that of Adaboost from 94% to 83%. However, the performance of other classifiers, such as KNN, SVM, and MLP, increased after class balancing and feature selection. After class balancing and feature selection, the SVM classifier performed best.
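The class-balancing stage mentioned in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is placed at a random position on the line segment between an existing minority sample and one of its nearest minority neighbours. This is not the authors' code. The function names `smote` and `balance`, the parameters `k` and `seed`, the toy data, and the restriction to two classes are all assumptions made for illustration, and only the Python standard library is used.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric feature lists (minority class only).
    Returns a list of new synthetic feature lists.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # The new point lies on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample minority_label until it matches the count of the other samples.

    Hypothetical helper: assumes exactly two classes.
    """
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_needed = (len(y) - len(minority)) - len(minority)
    new_points = smote(minority, max(n_needed, 0), k=k, seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)


# Toy data (made up): 6 majority (0) and 3 minority (1) samples, two features.
X = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5], [1.8, 1.5], [2.1, 2.0],
     [6.0, 7.0], [6.5, 6.8], [7.0, 7.5]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]
X_bal, y_bal = balance(X, y, minority_label=1, k=2)
print(y_bal.count(0), y_bal.count(1))  # prints "6 6"
```

In practice, oversampling of this kind is usually applied to the training split only, so that synthetic points do not leak into the data used to measure accuracy.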

Keywords

heart disease, SMOTE, elastic net, LR, KNN, SVM, RF, Adaboost, ANN, MLP

Copyright

This work is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
