Variable Selection for Naive Bayes Semisupervised Learning

Authors:	Byoung-Jeong Choi, Kwang-Rae Kim, Kyu-Dong Cho, Changyi Park

Affiliation:	1. SAS Korea, Seoul, Korea; 2. School of Mathematical Sciences, University of Nottingham, Nottingham, UK; 3. Department of Statistics, Korea University, Seoul, Korea; 4. Department of Statistics, University of Seoul, Seoul, Korea

Abstract:	This article deals with semisupervised learning based on the naive Bayes assumption. A univariate Gaussian mixture density is used for continuous input variables, whereas a histogram-type density is adopted for discrete input variables. The EM algorithm is used to compute the maximum likelihood estimators of the model parameters when the number of mixing components is fixed for each continuous input variable. Model selection based on an information criterion is then carried out to choose a parsimonious model among the fitted models. A common density method is proposed for selecting significant input variables. Simulated and real datasets are used to illustrate the performance of the proposed method.
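
As a rough illustration of two ingredients named in the abstract (EM fitting of a univariate Gaussian mixture with a fixed number of components, followed by selection of that number with an information criterion), a minimal Python sketch might look as follows. It is not the authors' implementation: it handles a single continuous variable without class labels, it assumes BIC as the criterion (BIC appears in the keywords), and the function names `fit_gmm_1d`, `bic`, and `select_components`, the initialisation, and the small variance floor are all hypothetical choices made for this example.

```python
import numpy as np


def _log_joint(x, weights, means, variances):
    # log(pi_j) + log N(x_i | mu_j, sigma_j^2), returned with shape (n, k).
    return (np.log(weights)
            - 0.5 * np.log(2.0 * np.pi * variances)
            - 0.5 * (x[:, None] - means) ** 2 / variances)


def fit_gmm_1d(x, k, n_iter=200, tol=1e-8, seed=0):
    """EM for a univariate Gaussian mixture with k components (illustrative)."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumed initialisation: equal weights, means drawn from the data,
    # and the pooled sample variance for every component.
    weights = np.full(k, 1.0 / k)
    means = rng.choice(x, size=k, replace=False)
    variances = np.full(k, x.var() + 1e-6)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability.
        log_p = _log_joint(x, weights, means, variances)
        log_norm = np.logaddexp.reduce(log_p, axis=1)
        resp = np.exp(log_p - log_norm[:, None])
        # M-step: weighted updates of weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = resp.T @ x / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
        ll = log_norm.sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    # Log-likelihood evaluated at the final parameter values.
    final = _log_joint(x, weights, means, variances)
    loglik = np.logaddexp.reduce(final, axis=1).sum()
    return weights, means, variances, loglik


def bic(loglik, k, n):
    # Free parameters: k means, k variances, and k - 1 mixing weights.
    return -2.0 * loglik + (3 * k - 1) * np.log(n)


def select_components(x, max_k=3):
    """Return the k in 1..max_k with the smallest BIC, plus all BIC scores."""
    x = np.asarray(x, dtype=float)
    scores = {}
    for k in range(1, max_k + 1):
        loglik = fit_gmm_1d(x, k)[3]
        scores[k] = bic(loglik, k, x.size)
    return min(scores, key=scores.get), scores
```

Calling `select_components(x)` on a one-dimensional array returns the chosen number of components together with the BIC score of each candidate. The semisupervised setting in the article additionally combines labeled and unlabeled observations and applies the common density method for variable selection, neither of which this sketch attempts.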

Keywords:	BIC; Common density; Density estimation; EM algorithm; Model selection; Naive Bayes; Semisupervised learning; Variable selection