Sains Malaysiana 50(8)(2021): 2479-2497
http://doi.org/10.17576/jsm-2021-5008-28
Prediction of COVID-19 Patient using Supervised Machine Learning Algorithm
(Ramalan Pesakit COVID-19 menggunakan Algoritma Pembelajaran Diselia Mesin)
BUVANA, M.1* & MUTHUMAYIL, K.2
1Department of Computer Science and Engineering, PSNA College of Engineering and Technology, Tamil Nadu, India
2Department of Information Technology, PSNA College of Engineering and Technology, Tamil Nadu, India
Received: 7 April 2021/Accepted: 27 June 2021
ABSTRACT
COVID-19 is a highly symptomatic disease. Early and precise prediction based on physiological measurements, such as breathing, can help minimize the risk of COVID-19, alongside preventive measures: keeping a reasonable distance from others, wearing a mask, maintaining cleanliness, taking medication, eating a balanced diet, and staying at home when unwell. To evaluate the collected COVID-19 prediction datasets, five machine learning classifiers were used: Naïve Bayes, Support Vector Machine (SVM), Logistic Regression, K-Nearest Neighbour (KNN), and Decision Tree. COVID-19 datasets from the repository were combined and re-examined to remove incomplete entries, and a total of 2500 cases were used in this study. The features fever, body pain, runny nose, difficulty in breathing, sore throat, and nasal congestion are considered the most important differences between patients who have COVID-19 and those who do not. We demonstrate the predictive performance of the five machine learning classifiers. A publicly available dataset was used to train and assess the model. With an overall accuracy of 99.88 percent, the ensemble model performed commendably. Compared with existing methods and studies, the proposed model performed better. As a result, the model presented is trustworthy and can be used to screen COVID-19 patients in a timely and efficient manner.
Keywords: Classifier; COVID-19; machine learning; prediction; supervised learning
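To make one of the five classifiers named in the abstract concrete, the following minimal sketch implements a Bernoulli Naïve Bayes classifier over the six symptom features. It is not the authors' implementation: the function names (`fit_bernoulli_nb`, `predict`, `accuracy`), the smoothing parameter `alpha`, the binary encoding of symptoms, and the toy records are all assumptions made for illustration. The study's ensemble model and its 2500-case dataset are not reproduced here.

```python
import math

# Symptom names taken from the abstract; their order is an arbitrary choice.
FEATURES = ["fever", "body_pain", "runny_nose",
            "difficulty_breathing", "sore_throat", "nasal_congestion"]


def fit_bernoulli_nb(X, y, alpha=1.0):
    """Estimate class priors and P(symptom=1 | class) with Laplace smoothing."""
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        prior = len(rows) / len(X)
        # Smoothing keeps every probability strictly between 0 and 1.
        probs = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(len(FEATURES))
        ]
        model[label] = (prior, probs)
    return model


def predict(model, x):
    """Return the class with the highest log-posterior for one symptom vector."""
    best_label, best_score = None, -math.inf
    for label, (prior, probs) in model.items():
        score = math.log(prior)
        for value, p in zip(x, probs):
            score += math.log(p if value else 1.0 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, X, y):
    """Fraction of records whose predicted label matches the true label."""
    hits = sum(predict(model, x) == t for x, t in zip(X, y))
    return hits / len(y)


# Made-up toy records (1 = symptom present); NOT data from the study.
X_train = [
    [1, 1, 0, 1, 1, 0],
    [1, 0, 0, 1, 1, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 1],
    [0, 0, 1, 0, 1, 1],
    [0, 1, 1, 0, 0, 1],
]
y_train = [1, 1, 1, 0, 0, 0]  # 1 = COVID-19 positive, 0 = negative

model = fit_bernoulli_nb(X_train, y_train)
print(predict(model, [1, 1, 0, 1, 1, 0]))
print(accuracy(model, X_train, y_train))
```

Accuracy here is measured on the same toy records used for fitting, so it says nothing about generalization; a real evaluation would hold out a test split, as the abstract's "train and assess" wording implies.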
ABSTRAK
Salah satu penyakit yang paling simptomatik ialah COVID-19. Ramalan pernafasan berdasarkan pengukuran fisiologi awal dan tepat akan meminimumkan risiko COVID-19 dengan jarak yang munasabah daripada sesiapa sahaja; memakai topeng, kebersihan, ubat-ubatan, diet seimbang dan jika tidak sihat, tinggal di rumah. Untuk menilai kumpulan data ramalan COVID-19 yang dikumpulkan, lima pengkelasan pembelajaran mesin digunakan: Naïve Bayes, Mesin Vektor Sokongan (SVM), Regresi Logistik, Jiran K-Terdekat (KNN) dan Pohon Keputusan. Set data COVID-19 daripada Repositori digabungkan dan disemak semula untuk menghapus entri yang tidak lengkap dan sejumlah 2500 kes digunakan dalam kajian ini. Ciri demam, sakit badan, hidung berair, kesukaran bernafas, sakit tekak dan hidung tersumbat, dianggap sebagai perbezaan yang paling penting antara pesakit yang menghidap COVID-19 dan mereka yang tidak. Kami menunjukkan fungsi ramalan lima pengelasan pembelajaran mesin. Satu set data yang tersedia untuk umum digunakan untuk melatih dan menilai model. Dengan ketepatan keseluruhan 99.88 peratus, model ensembel dilakukan dengan terpuji. Jika dibandingkan dengan kaedah dan kajian yang ada, model yang dicadangkan dilakukan dengan lebih baik. Hasilnya, model yang dipersembahkan boleh dipercayai dan dapat digunakan untuk menyaring pesakit COVID-19 tepat pada waktunya.
Kata kunci: COVID-19; pembelajaran mesin; pembelajaran yang diselia; pengelas; ramalan
*Corresponding author; email: buvana@psnacet.edu.in