Representative literature
Applying data mining techniques to classify patients with suspected hepatitis C virus infection
Abstract:
Background: Hepatitis C virus (HCV) has a high prevalence worldwide, and progression of the disease can cause irreversible, severe liver damage or even death. Therefore, developing prediction models using machine learning techniques is beneficial. This study was conducted to classify patients with suspected HCV infection using different classification models.
Methods: The study was conducted using a dataset derived from the University of California, Irvine (UCI) Machine Learning Repository. Since the HCV dataset was imbalanced, the synthetic minority oversampling technique (SMOTE) was applied to balance the dataset. After cleaning, the dataset was divided into training and test data for developing six classification models: support vector machine (SVM), Gaussian Naïve Bayes (NB), decision tree (DT), random forest (RF), logistic regression (LR), and K-nearest neighbors (KNN). The Python programming language was used to develop the classifiers. Receiver operating characteristic curve analysis and other metrics were used to evaluate the performance of the proposed models.
Results: After evaluation of the models using different metrics, the RF classifier had the best performance among the six methods, with an accuracy of 97.29%. The areas under the curve (AUC) for the LR, KNN, DT, SVM, Gaussian NB, and RF models were 0.921, 0.963, 0.953, 0.972, 0.896, and 0.998, respectively, with RF showing the best predictive performance.
Conclusion: Various machine learning techniques for classifying healthy and unhealthy patients were used in this study. Additionally, the developed models might identify the stage of HCV based on trained data.
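The abstract names SMOTE as the balancing step but does not say which implementation was used. As a rough illustration only, the core idea (each synthetic minority sample is an interpolation between a minority sample and one of its nearest minority neighbours) can be sketched with the Python standard library. The function name `smote_oversample`, its parameters, and the toy data below are hypothetical and are not taken from the study.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    This is an illustrative sketch, not the study's implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up data: 8 majority points and 3 minority points in 2-D.
majority = [[float(x), float(x % 3)] for x in range(8)]
minority = [[10.0, 10.0], [11.0, 10.5], [10.5, 11.5]]

# Generate just enough synthetic points to match the majority class size.
new_points = smote_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In practice, oversampling of this kind is usually applied to the training split only, so that synthetic points do not leak into the test data; the abstract does not state the order used in the study.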
Keywords:
Data mining methods; Classification; Hepatitis; Hepatitis C virus
Chinese Library Classification (CLC) number:
Authors:
Safdari Reza; Deghatipour Amir; Gholamzadeh Marsa; Maghooli Keivan
Author affiliations:
Health Information Management Department, School of Allied Medical Sciences, Tehran University of Medical Sciences, Tehran, Iran; Health Information Management and Medical Informatics Department, School of Allied Medical Sciences, Tehran University of Medical Sciences, Tehran, Iran; Department of Biomedical Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
Source:
Citation format:
[1] Safdari Reza; Deghatipour Amir; Gholamzadeh Marsa; Maghooli Keivan. Applying data mining techniques to classify patients with suspected hepatitis C virus infection [J]. 智慧医学(英文), 2022(04): 193-198