1.

Bayesian methods for dealing with missing data problems

Zhihua Ma Guanghui Chen 《Journal of the Korean Statistical Society》2018,47(3):297-313

Missing data, a common but challenging issue in most studies, may lead to biased and inefficient inferences if handled inappropriately. As a natural and powerful way of dealing with missing data, the Bayesian approach has received much attention in the literature. This paper reviews the recent developments and applications of Bayesian methods for dealing with ignorable and non-ignorable missing data. We first introduce missing data mechanisms and the Bayesian framework for dealing with missing data, and then introduce missing data models under ignorable and non-ignorable missing data circumstances based on the literature. After that, important issues of Bayesian inference, including prior construction, posterior computation, model comparison and sensitivity analysis, are discussed. Finally, several issues that deserve further research are summarized.

2.

A nonparametric inverse probability weighted estimation for functional data with missing response data at random

Longbing Wang Ruiyuan Cao Jiang Du Zhongzhan Zhang 《Journal of the Korean Statistical Society》2019,48(4):537-546

This paper considers the nonparametric inverse probability weighted estimation for functional data with missing response data at random. Under mild conditions, the asymptotic properties of the proposed estimation method are established. Based on the resampling method, the estimation of the asymptotic variance of the proposed estimator is obtained. Finally, the finite sample properties of the proposed estimation method are investigated via Monte Carlo simulation studies. A real data analysis is given to illustrate the use of the proposed method.
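The idea of inverse probability weighting can be illustrated with a minimal sketch. This is not the paper's estimator: it uses a scalar covariate instead of functional data, assumes the response probabilities are known rather than estimated nonparametrically, and targets only the mean of the response. The function names `ipw_mean`, `complete_case_mean` and `simulate` are hypothetical.

```python
import random


def ipw_mean(y, observed, propensity):
    """Normalized (Hajek-type) IPW estimate of E[Y].

    Each observed response is weighted by 1 / P(observed | covariate),
    so units that are unlikely to respond count for more.
    """
    num = 0.0
    den = 0.0
    for yi, di, pi in zip(y, observed, propensity):
        if di:
            num += yi / pi
            den += 1.0 / pi
    return num / den


def complete_case_mean(y, observed):
    """Naive mean over observed responses only (biased under MAR here)."""
    seen = [yi for yi, di in zip(y, observed) if di]
    return sum(seen) / len(seen)


def simulate(n=20000, seed=0):
    """Toy data: Y = 2X + noise with X ~ U(0, 1), so E[Y] = 1.

    The response probability depends only on X (missing at random):
    large X values are observed more often.
    """
    rng = random.Random(seed)
    y, observed, propensity = [], [], []
    for _ in range(n):
        x = rng.random()
        p = 0.2 + 0.7 * x  # assumed known response probability
        y.append(2.0 * x + rng.gauss(0.0, 0.1))
        observed.append(rng.random() < p)
        propensity.append(p)
    return y, observed, propensity


if __name__ == "__main__":
    y, observed, propensity = simulate()
    print("complete-case mean:", complete_case_mean(y, observed))
    print("IPW mean:          ", ipw_mean(y, observed, propensity))
```

In this toy setup the complete-case mean overestimates E[Y] because large responses are observed more often, while the weighted estimate stays close to the true value of 1.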

3.

Stable feature screening for ultrahigh dimensional data

Peng Lai Fengli Song Yufei Gao 《Journal of the Korean Statistical Society》2019,48(2):221-232

This paper is concerned with stable feature screening for ultrahigh dimensional data. To deal with the ultrahigh dimensional data problem and screen the important features, a set-averaging measurement is proposed. The model averaging technique and the conditional quantile method are used to construct the weighted set-averaging feature screening procedure to identify the relationships between the possible predictors and the response variable. The proposed screening method is model free, stable and possesses the sure screening property under some regularity conditions. Some Monte Carlo simulations and a real data application are conducted to evaluate the performance of the proposed procedure.

4.

Review: Reversed low-rank ANOVA model for transforming high dimensional genetic data into low dimension

Yoonsuh Jung Jianhua Hu 《Journal of the Korean Statistical Society》2019,48(2):169-178

A general modeling procedure for analyzing genetic data is reviewed. We review an ANOVA-type model that can handle both continuous and discrete genetic variables in one modeling framework. Unlike regression-type models, which typically set the phenotype variable as a response, this ANOVA model treats the phenotype variable as an explanatory variable. By reversely treating the phenotype variable, the usual high dimensional problem is turned into a low dimensional one. Instead, the ANOVA model always includes an interaction term between the genetic locations and the phenotype variable to find potential associations between them. The interaction term is designed to be low rank with the multiplication of bilinear terms so that the required number of parameters is kept at a manageable level. We compare the performance of the reviewed ANOVA model to other popular methods via microarray and SNP data sets.

5.

Pseudo MLE for semiparametric transformation model with doubly truncated data

Pao-sheng Shen Yi Liu 《Journal of the Korean Statistical Society》2019,48(3):384-395

In this article, we consider the efficient estimation of the semiparametric transformation model with doubly truncated data. We propose a two-step approach for obtaining the pseudo maximum likelihood estimators (PMLE) of regression parameters. In the first step, the truncation time distribution is estimated by the nonparametric maximum likelihood estimator (Shen, 2010a) when the distribution function $K$ of the truncation time is unspecified or by the conditional maximum likelihood estimator (Bilker and Wang, 1996) when $K$ is parameterized. In the second step, using the pseudo complete-data likelihood function with the estimated distribution of truncation time, we propose expectation–maximization algorithms for obtaining the PMLE. We establish the consistency of the PMLE. The simulation study indicates that the PMLE performs well in finite samples. The proposed method is illustrated using an AIDS data set.

6.

Principal weighted logistic regression for sufficient dimension reduction in binary classification

Boyoung Kim Seung Jun Shin 《Journal of the Korean Statistical Society》2019,48(2):194-206

Sufficient dimension reduction (SDR) is a popular supervised machine learning technique that reduces the predictor dimension and facilitates subsequent data analysis in practice. In this article, we propose principal weighted logistic regression (PWLR), an efficient SDR method in binary classification where inverse-regression-based SDR methods often suffer. We first develop linear PWLR for linear SDR and study its asymptotic properties. We then extend it to nonlinear SDR and propose the kernel PWLR. Evaluations with both simulated and real data show the promising performance of the PWLR for SDR in binary classification.

7.

Hausman-type tests for individual and time effects in the panel regression model with incomplete data

Jing Chen Rongxian Yue Jianhong Wu 《Journal of the Korean Statistical Society》2018,47(3):347-363

By comparing estimators of the variance of idiosyncratic error at different robust levels, two Hausman-type test statistics are respectively constructed for the existence of individual and time effects in the panel regression model with incomplete data. The resultant test statistics have several desirable properties. Firstly, they are robust to the presence of one effect when the other is tested. Secondly, they are immune to non-normality of the disturbances, since no distributional conditions are needed in the construction of the statistics. Thirdly, they perform more robustly than the main competitors in the literature when the covariates are correlated with the effects. Additionally, they are very simple and carry no heavy computational burden. Joint tests for both effects are also discussed. Monte Carlo evidence shows that the proposed tests have desirable finite sample properties, and a real data analysis gives further support.

8.

Recent developments in high dimensional covariance estimation and its related issues, a review

Younghee Hong Choongrak Kim 《Journal of the Korean Statistical Society》2018,47(3):239-247

In this paper we review some recent developments in high dimensional data analysis, especially in the estimation of covariance and precision matrices, asymptotic results on the eigenstructure in principal components analysis, and some relevant issues such as tests for the equality of two covariance matrices, determination of the number of principal components, and detection of hubs in a complex network.

9.

Bayesian curve fitting and clustering with Dirichlet process mixture models for microarray data

Ju-Hyun Park Minjung Kyung 《Journal of the Korean Statistical Society》2019,48(2):207-220

In the field of molecular biology, it is often of interest to analyze microarray data for clustering genes based on similar profiles of gene expression to identify genes that are differentially expressed under multiple biological conditions. One of the notable characteristics of a gene expression profile is that it shows a cyclic curve over a course of time. To group sequences of similar molecular functions, we propose a Bayesian Dirichlet process mixture of linear regression models with a Fourier series for the regression coefficients, for each of which a spike and slab prior is assumed. A full Gibbs-sampling algorithm is developed for an efficient Markov chain Monte Carlo (MCMC) posterior computation. Due to the so-called “label-switching” problem and different numbers of clusters during the MCMC computation, a post-process approach of Fritsch and Ickstadt (2009) is additionally applied to MCMC samples for an optimal single clustering estimate by maximizing the posterior expected adjusted Rand index with the posterior probabilities of two observations being clustered together. The proposed method is illustrated with two simulated data sets and one real data set on the physiological response of fibroblasts to serum from Iyer et al. (1999).

10.

A class of observation-driven random coefficient INAR(1) processes based on negative binomial thinning

Meiju Yu Dehui Wang Kai Yang 《Journal of the Korean Statistical Society》2019,48(2):248-264

Integer-valued time series models and their applications have attracted a lot of attention in recent years. In this paper, we introduce a class of observation-driven random coefficient integer-valued autoregressive processes based on negative binomial thinning, where the autoregressive parameter depends on the observed value at the previous time point. Basic probabilistic and statistical properties of the process are established. The unknown parameters are estimated by the conditional least squares and empirical likelihood methods. Specifically, we consider three aspects of the empirical likelihood method: the maximum empirical likelihood estimate, the confidence region and the EL test. The performance of the two estimation methods is compared through simulation studies. Finally, an application to a real data example is provided.
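As a rough illustration of negative binomial thinning, the sketch below simulates a plain INAR(1) process with a fixed coefficient. It is a simplification, not the paper's model: the observation-driven dependence of the coefficient on the previous value is omitted, and the geometric innovation distribution is an assumption made here for convenience. The thinning operator is taken in its usual form, α ∗ X = W₁ + … + W_X with the Wᵢ independent geometric variables on {0, 1, 2, …} with mean α. The function names are hypothetical.

```python
import math
import random


def geometric(mean, rng):
    """Draw from the geometric law on {0, 1, 2, ...} with the given mean.

    P(W = k) = mean**k / (1 + mean)**(k + 1), sampled by inversion.
    """
    if mean <= 0.0:
        return 0
    q = mean / (1.0 + mean)  # P(W >= 1)
    u = 1.0 - rng.random()   # uniform on (0, 1], avoids log(0)
    return int(math.log(u) / math.log(q))


def nb_thinning(alpha, x, rng):
    """Negative binomial thinning: sum of x geometric counts with mean alpha."""
    return sum(geometric(alpha, rng) for _ in range(x))


def simulate_inar1(alpha, innovation_mean, n, seed=0):
    """Simulate X_t = alpha * X_{t-1} + e_t with fixed alpha (thinning '*').

    The innovations e_t are geometric with the given mean; this choice
    is an assumption of the sketch.
    """
    rng = random.Random(seed)
    x = 0
    path = []
    for _ in range(n):
        x = nb_thinning(alpha, x, rng) + geometric(innovation_mean, rng)
        path.append(x)
    return path


if __name__ == "__main__":
    path = simulate_inar1(alpha=0.5, innovation_mean=1.0, n=50000)
    tail = path[1000:]  # drop a burn-in period
    print("sample mean:", sum(tail) / len(tail))
```

For α < 1 the stationary mean of this fixed-coefficient process is the innovation mean divided by 1 − α, so the sample mean above should be close to 2.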

11.

Asymptotic normality and mean consistency for the weighted estimator in nonparametric regression models

Yi Wu Xuejun Wang Soo Hak Sung 《Journal of the Korean Statistical Society》2019,48(3):463-479

In this paper, we mainly study the asymptotic properties of the weighted estimator for the nonparametric regression model based on linearly negative quadrant dependent (LNQD, for short) errors. We obtain the rate of uniformly asymptotic normality of the weighted estimator, which is nearly $O(n^{-1/4})$ when the moment condition is appropriate. The results generalize the corresponding ones of Yang (2003) from NA samples to LNQD samples and improve or extend the corresponding one of Li et al. (2012) for LNQD samples. Moreover, we obtain some results on mean consistency, uniformly mean consistency, and the rate of mean consistency for the weighted estimator. Finally, we carry out some simulations to verify the validity of our results.

12.

Complete convergence for arrays of rowwise END random variables and its statistical applications under sub-linear expectations

Mengmei Xi Yi Wu Xuejun Wang 《Journal of the Korean Statistical Society》2019,48(3):412-425

13.

Performance of standard imputation methods for missing quality of life data as covariate in survival analysis based on simulations from the International Breast Cancer Study Group Trials VI and VII*

Marion Procter Chris Robertson 《Communications in Statistics - Simulation and Computation》2013,42(10):3063-3077

Imputation methods for missing data on a time-dependent variable within time-dependent Cox models are investigated in a simulation study. Quality of life (QoL) assessments were removed from the complete simulated datasets, which have a positive relationship between QoL and disease-free survival (DFS) and delayed chemotherapy and DFS, by missing at random and missing not at random (MNAR) mechanisms. Standard imputation methods were applied before analysis. Method performance was influenced by missing data mechanism, with one exception for simple imputation. The greatest bias occurred under MNAR and large effect sizes. It is important to carefully investigate the missing data mechanism.

14.

Efficient and flexible model-based clustering of jumps in diffusion processes

Bokgyeong Kang Taeyoung Park 《Journal of the Korean Statistical Society》2019,48(3):439-453

Jump–diffusion processes, which are diffusion processes with discontinuous movements called jumps, are widely used to model time-series data that commonly exhibit discontinuity in their sample paths. The existing jump–diffusion models have recently been extended to multivariate time-series data. The models are, however, still limited by a single parametric jump-size distribution that is common across different subjects. Such strong parametric assumptions for the shape and structure of a jump-size distribution may be too restrictive and unrealistic for multiple subjects with different characteristics. This paper thus proposes an efficient Bayesian nonparametric method to flexibly model a jump-size distribution while borrowing information across subjects in a clustering procedure using a nested Dirichlet process. For efficient posterior computation, a partially collapsed Gibbs sampler is devised to fit the proposed model. The proposed methodology is illustrated through a simulation study and an application to daily stock price data for companies in the S&P 100 index from June 2007 to June 2017.