1.

Bayesian variable selection in a finite mixture of linear mixed-effects models

Kuo-Jung Lee Ray-Bing Chen 《Journal of Statistical Computation and Simulation》2019,89(13):2434-2453

Mixtures of linear mixed-effects models have received considerable attention in longitudinal studies, including medical research, social science and economics. The inferential question of interest is often the identification of critical factors that affect the responses. We consider a Bayesian approach to select the important fixed and random effects in the finite mixture of linear mixed-effects models. To accomplish our goal, latent variables are introduced to facilitate the identification of influential fixed and random components and to classify the membership of observations in the longitudinal data. A spike-and-slab prior for the regression coefficients is adopted to sidestep the potential complications of highly collinear covariates and to handle large p and small n issues in the variable selection problems. Here we employ Markov chain Monte Carlo (MCMC) sampling techniques for posterior inferences and explore the performance of the proposed method in simulation studies, followed by an actual psychiatric data analysis concerning depressive disorder.

2.

Bayesian variable selection for multioutcome models through shared shrinkage

Debamita Kundu Riten Mitra Jeremy T. Gaskins 《Scandinavian Journal of Statistics》2021,48(1):295-320

Variable selection over a potentially large set of covariates in a linear model is quite popular. In the Bayesian context, common prior choices can lead to a posterior expectation of the regression coefficients that is a sparse (or nearly sparse) vector with a few nonzero components, namely those covariates that are most important. This article extends the "global-local" shrinkage idea to a scenario where one wishes to model multiple response variables simultaneously. Here, we have developed a variable selection method for a K-outcome model (multivariate regression) that identifies the most important covariates across all outcomes. The prior for all regression coefficients is a mean-zero normal with a coefficient-specific variance term that consists of a predictor-specific factor (shared local shrinkage parameter) and a model-specific factor (global shrinkage term) that differs in each model. The performance of our modeling approach is evaluated through simulation studies and a data example.
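
The prior structure described in this abstract can be written out as a short sketch. The symbols below (β, λ, τ) and the multiplicative combination of the two factors are assumptions made for illustration; the abstract only says that the variance "consists of" a predictor-specific factor and a model-specific factor, and it does not give the hyperpriors.

```latex
% Hypothetical notation: \beta_{jk} is the coefficient of predictor j in outcome model k.
% \lambda_j : shared local shrinkage parameter (predictor-specific)
% \tau_k    : global shrinkage term (model-specific)
\beta_{jk} \mid \lambda_j, \tau_k \sim \mathcal{N}\!\left(0,\; \lambda_j^{2}\,\tau_k^{2}\right),
\qquad j = 1,\dots,p, \quad k = 1,\dots,K .
```

Under this reading, a small λ_j shrinks the coefficients of predictor j toward zero in all K outcome models at once, which is how a shared local parameter can select covariates across outcomes.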

3.

Efficient estimation of regression parameters from multistage studies with validation of outcome and covariates

《Journal of statistical planning and inference》1997,65(2):349-374

Often the variables in a regression model are difficult or expensive to obtain, so auxiliary variables are collected in a preliminary step of a study and the model variables are measured at later stages on only a subsample of the study participants, called the validation sample. We consider a study in which at the first stage some variables, throughout called auxiliaries, are collected; at the second stage the true outcome is measured on a subsample of the first-stage sample; and at the third stage the true covariates are collected on a subset of the second-stage sample. In order to increase efficiency, the probabilities of selection into the second and third-stage samples are allowed to depend on the data observed at the previous stages. In this paper we describe a class of inverse-probability-of-selection-weighted semiparametric estimators for the parameters of the model for the conditional mean of the outcomes given the covariates. We assume that a subject's probability of being sampled at subsequent stages is bounded away from zero and depends only on the subject's data collected at the previous sampling stages. We show that the asymptotic variance of the optimal estimator in our class is equal to the semiparametric variance bound for the model. Since the optimal estimator depends on unknown population parameters, it is not available for data analysis. We therefore propose an adaptive estimation procedure for locally efficient inferences. A simulation study is carried out to study the finite sample properties of the proposed estimators.
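
The weighting idea behind this class of estimators can be illustrated with a minimal sketch. It is not the optimal or adaptive estimator from the paper: it only fits a simple linear mean model by least squares, weighting each validated subject by the inverse of its selection probability. The function name `ipw_linear_fit` and its arguments are hypothetical, and the selection probabilities are assumed known.

```python
def ipw_linear_fit(x, y, selected, prob):
    """Fit y = a + b*x by least squares with inverse-probability-of-selection weights.

    selected[i] is 1 if subject i is in the validation sample, else 0.
    prob[i] is the (assumed known) probability that subject i was selected.
    Unselected subjects get weight 0, so their y value may be any placeholder.
    """
    # Weight = selection indicator divided by selection probability.
    w = [s / p for s, p in zip(selected, prob)]
    sw = sum(w)
    # Weighted means of covariate and outcome.
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Weighted normal equations for a single covariate.
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return intercept, slope
```

Subjects with a low selection probability count for more when they are sampled, which compensates for the subjects like them that were not validated.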

4.

Sparsity identification in ultra-high dimensional quantile regression models with longitudinal data

Xianli Gao 《Communications in Statistics - Theory and Methods》2020,49(19):4712-4736

Abstract

In this paper, we propose a variable selection method for quantile regression models with ultra-high dimensional longitudinal data, called the weighted adaptive robust lasso (WAR-Lasso), which is doubly robust. We derive the consistency and the model selection oracle property of WAR-Lasso. Simulation studies show the double robustness of WAR-Lasso under both heavy-tailed error distributions and heavy contamination of the covariates. WAR-Lasso outperforms other methods such as SCAD. A real data analysis is carried out. It shows that WAR-Lasso tends to select fewer variables and that the estimated coefficients are in line with economic significance.

5.

Joint GEEs for multivariate correlated data with incomplete binary outcomes

G. Inan R. Yucel 《Journal of applied statistics》2017,44(11):1920-1937

This study considers a fully-parametric but uncongenial multiple imputation (MI) inference to jointly analyze incomplete binary response variables observed in correlated data settings. The multiple imputation model is specified as a fully-parametric model based on a multivariate extension of mixed-effects models. Dichotomized imputed datasets are then analyzed using joint GEE models, where covariates are associated with the marginal mean of responses with response-specific regression coefficients, and a Kronecker product is used to accommodate the cluster-specific correlation structure for a given response variable and the correlation structure between multiple response variables. The validity of the proposed MI-based JGEE (MI-JGEE) approach is assessed through a Monte Carlo simulation study under different scenarios. The simulation results, which are evaluated in terms of bias, mean-squared error, and coverage rate, show that MI-JGEE has promising inferential properties even when the underlying multiple imputation is misspecified. Finally, Adolescent Alcohol Prevention Trial data are used for illustration.

6.

Multivariate elliptical models with general parameterization

Artur J. Lemonte Alexandre G. Patriota 《Statistical Methodology》2011,8(4):389-400

In this paper we introduce a general elliptical multivariate regression model in which the mean vector and the scale matrix have parameters (and/or covariates) in common. This approach unifies several important elliptical models, such as nonlinear regressions, mixed-effects models with nonlinear fixed effects, errors-in-variables models, and so forth. We discuss maximum likelihood estimation of the model parameters and obtain the information matrix, both observed and expected. Additionally, we derive the generalized leverage as well as the normal curvatures of local influence under some perturbation schemes. An empirical application is presented for illustrative purposes.

7.

Penalized inverse probability weighted estimators for weighted rank regression with missing covariates

Hu Yang Jing Lv 《Communications in Statistics - Theory and Methods》2013,42(5):1388-1402

Abstract

In this article, we study variable selection and estimation for linear regression models with missing covariates. The proposed estimation method is almost as efficient as the popular least-squares-based estimation method for normal random errors and is empirically shown to be much more efficient and robust with respect to heavy-tailed errors or outliers in the responses and covariates. To achieve sparsity, a variable selection procedure based on SCAD is proposed to conduct estimation and variable selection simultaneously. The procedure is shown to possess the oracle property. To deal with the missing covariates, we consider inverse probability weighted estimators for the linear model when the selection probability is known or unknown. It is shown that the estimator using the estimated selection probability has a smaller asymptotic variance than the one using the true selection probability, and is thus more efficient. Therefore, the important Horvitz-Thompson property is verified for the penalized rank estimator with missing covariates in the linear model. Some numerical examples are provided to demonstrate the performance of the estimators.

8.

VARIANCE ESTIMATION IN NONPARAMETRIC MULTIPLE REGRESSION

《Communications in Statistics - Theory and Methods》2013,42(8):1373-1383

ABSTRACT

We consider the variance estimation in a general nonparametric regression model with multiple covariates. We extend difference methods to the multivariate setting by introducing an algorithm that orders the design points in higher dimensions. We also consider an adaptive difference estimator which requires much less strict assumptions on the covariate design and can significantly reduce mean squared error for small sample sizes.
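
For orientation, the one-dimensional starting point that difference methods build on can be sketched as follows. This is the classical first-order difference estimator for a single covariate, not the multivariate ordering algorithm or the adaptive estimator proposed in the paper; the function name `difference_variance` is hypothetical.

```python
def difference_variance(x, y):
    """First-order difference estimate of the error variance in y = m(x) + error.

    Observations are ordered by the design point x; for a smooth mean function m,
    successive differences of y are dominated by the noise, and each squared
    difference has expectation close to twice the error variance.
    """
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    # Order the responses by their design points.
    y_ordered = [yi for _, yi in sorted(zip(x, y), key=lambda pair: pair[0])]
    # Sum of squared successive differences.
    ss = sum((b - a) ** 2 for a, b in zip(y_ordered, y_ordered[1:]))
    return ss / (2 * (n - 1))
```

With several covariates there is no natural ordering of the design points, which is the gap the ordering algorithm in the abstract is meant to fill.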

9.

Variable selection for spatial Poisson point processes via a regularization method

《Statistical Methodology》2014

It is often of interest to use regression analysis to study the relationship between occurrence of events in space and spatially-indexed covariates. One model for such regression analysis is the Poisson point process. Here, we develop a method to perform the selection of covariates and the estimation of model parameters simultaneously for this model via a regularization method. We assess the finite-sample properties of our method with a simulation study. In addition, we propose a variant of our method that allows the selection of covariates at multiple pixel resolutions. For illustration, we consider the locations of a tree species, Beilschmiedia pendula, in a study plot at Barro Colorado Island in central Panama. We find that Beilschmiedia pendula occurs in greater abundance at locations with higher elevation and steeper slope. Also, we identify three species to which Beilschmiedia pendula tends to be attracted, two species by which it appears to be repelled, and a species with no apparent relationship.

10.

Nonlinear mixed-effects scalar-on-function models and variable selection

Cheng Yafeng Shi Jian Qing Eyre Janet 《Statistics and Computing》2020,30(1):129-140

This paper is motivated by our collaborative research, and the aim is to model clinical assessments of upper limb function after stroke using 3D-position and 4D-orientation movement data. We present a new nonlinear mixed-effects scalar-on-function regression model with a Gaussian process prior, focusing on variable selection from a large number of candidates including both scalar and functional variables. A novel variable selection algorithm has been developed, namely functional least angle regression. As it is essential for this algorithm, we studied the representation of functional variables with different methods and the correlation between a scalar and a group of mixed scalar and functional variables. We also propose a new stopping rule for practical use. This algorithm is efficient and accurate for both variable selection and parameter estimation even when the number of functional variables is very large and the variables are correlated. The prediction provided by the algorithm is therefore accurate. Our comprehensive simulation study showed that the method is superior to other existing variable selection methods. When the algorithm was applied to the analysis of the movement data, the use of the nonlinear random-effect model and the functional variables significantly improved the prediction accuracy for the clinical assessment.

11.

TREE-BASED REGRESSION FOR A CIRCULAR RESPONSE

《Communications in Statistics - Theory and Methods》2013,42(9):1549-1560

ABSTRACT

The application of conventional statistical methods to directional data generally produces erroneous results. Various regression models for a circular response have been presented in the literature; however, these are unsatisfactory either in the limited relationships that can be modeled or in the limitations on the number or type of covariates admissible. One difficulty with circular regression is devising a meaningful regression function. This problem is exacerbated when trying to incorporate both linear and circular variables as covariates. Due to these complexities, circular regression is ripe for exploration via tree-based methods, in which a formal regression function is not needed, but where insight into the general structure and relationship between predictors and the response may be obtained. A basic framework for regression trees, predicting a circular response from a combination of circular and linear predictors, is presented.

12.

Response-based multiple imputation method for minimizing the impact of covariate detection limit in logistic regression

Shahadut Hossain Zahirul Hoque Jacek Wesolowski 《Communications in Statistics - Theory and Methods》2021,50(2):371-386

Abstract

The presence of a detection limit (DL) in covariates causes inflated bias and inaccurate mean squared error in the estimators of the regression parameters. This paper suggests a response-driven multiple imputation method to correct the deleterious impact introduced by the covariate DL on the estimators of the parameters of the simple logistic regression model. The performance of the method has been thoroughly investigated and found to outperform the existing competing methods. The proposed method is computationally simple and easily implementable using three existing R libraries. The method is robust to violation of the distributional assumption for the covariate of interest.

13.

Bayesian composite quantile regression for linear mixed-effects models

Yuzhu Tian Heng Lian Maozai Tian 《Communications in Statistics - Theory and Methods》2017,46(15):7717-7731

Longitudinal data are commonly modeled with normal mixed-effects models. Most modeling methods are based on traditional mean regression, which results in non-robust estimation in the presence of extreme values or outliers. Median regression is also not the best choice for estimation, especially for non-normal errors. Compared to conventional modeling methods, composite quantile regression can provide robust estimation results even for non-normal errors. In this paper, based on a so-called pseudo composite asymmetric Laplace distribution (PCALD), we develop a Bayesian treatment of composite quantile regression for mixed-effects models. Furthermore, with the location-scale mixture representation of the PCALD, we establish a Bayesian hierarchical model and achieve posterior inference for all unknown parameters and latent variables using a Markov chain Monte Carlo (MCMC) method. Finally, this newly developed procedure is illustrated by some Monte Carlo simulations and a case analysis of an HIV/AIDS clinical data set.
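
As background for this abstract, the objective that composite quantile regression minimizes can be sketched in a few lines. This is the usual frequentist composite check loss for a single covariate without random effects, not the Bayesian PCALD sampler developed in the paper; the function names and arguments are hypothetical.

```python
def check_loss(u, tau):
    """Quantile check loss: tau*u for u >= 0 and (tau - 1)*u for u < 0."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def composite_quantile_loss(x, y, slope, intercepts, taus):
    """Sum of check losses over several quantile levels.

    All quantile levels share one slope; each level tau_k has its own intercept.
    """
    total = 0.0
    for b_k, tau in zip(intercepts, taus):
        for xi, yi in zip(x, y):
            total += check_loss(yi - b_k - slope * xi, tau)
    return total
```

Pooling several quantile levels behind a common slope is what lets the method borrow strength across the error distribution instead of relying on the mean or on a single quantile such as the median.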

14.

Improved inference for a linear mixed-effects model when the subpopulation effects are clustered

Guofen Yan J. Sedransk 《Journal of statistical planning and inference》2011,141(11):3489-3497

We provide Bayesian methodology to relax the assumption that all subpopulation effects in a linear mixed-effects model have, after adjustment for covariates, a common mean. We expand the model specification by assuming that the m subpopulation effects are allowed to cluster into d groups where the value of d, 1 ≤ d ≤ m, and the composition of the d groups are unknown, a priori. Specifically, for each partition of the m effects into d groups we only assume that the subpopulation effects in each group are exchangeable and are independent across the groups. We show that failure to take account of this clustering, as with the customary method, will lead to serious errors in inference about the variances and subpopulation effects, but the proposed, expanded, model leads to appropriate inferences. The efficacy of the proposed method is evaluated by contrasting it with both the customary method and use of a Dirichlet process prior. We use data from small area estimation to illustrate our method.

15.

Empirical likelihood-based inference in nonlinear regression models with missing responses at random

Nian-Sheng Tang Pu-Ying Zhao 《Statistics》2013,47(6):1141-1159

This paper investigates the estimation of regression parameters and the response mean in nonlinear regression models when response variables are missing with probabilities depending on covariates. We propose four empirical likelihood (EL)-based estimators for the regression parameters and the response mean. The resulting estimators are shown to be consistent and asymptotically normal under some general assumptions. To construct the confidence regions for the regression parameters as well as the response mean, we develop four EL ratio statistics, which are proven to have the χ² distribution asymptotically. Simulation studies and an artificial data set are used to illustrate the proposed methodologies. Empirical results show that the EL method behaves better than the normal approximation method and that the coverage probabilities and average lengths depend on the selection probability function.

16.

A novel approach to estimate the Cox model with temporal covariates and application to medical cost data

Yanqiao Zheng Xiaobing Zhao 《Communications in Statistics - Theory and Methods》2020,49(18):4520-4535

Abstract

We propose a novel approach to estimate the Cox model with temporal covariates. Our new approach treats the temporal covariates as arising from a longitudinal process which is modeled jointly with the event time. Unlike the existing literature, the longitudinal process in our model is specified as a bounded variational process and determined by a family of Initial Value Problems associated with an Ordinary Differential Equation. Our specification has the advantage that only the observation of the temporal covariates at the event time and the event time itself are needed to fit the model; additional longitudinal observations can be used but are not necessary. This fact makes our approach very useful for many medical outcome datasets, such as SPARCS and NIS, where it is important to find the hazard rate of being discharged given the accumulated cost, but only the total cost at the discharge time is available due to the protection of private information. Our estimation procedure is based on maximizing the full information likelihood function. The resulting estimators are shown to be consistent and asymptotically normally distributed. Simulations and a real example illustrate the utility of the proposed model. Finally, a couple of extensions are discussed.

17.

Linear Transformations of Linear Mixed-Effects Models

Christopher H. Morrell Jay D. Pearson Larry J. Brant 《The American statistician》2013,67(4):338-343

A number of articles have discussed the way lower order polynomial and interaction terms should be handled in linear regression models. Only if all lower order terms are included in the model will the regression model be invariant with respect to coding transformations of the variables. If lower order terms are omitted, the regression model will not be well formulated. In this paper, we extend this work to examine the implications of the ordering of variables in the linear mixed-effects model. We demonstrate how linear transformations of the variables affect the model and tests of significance of fixed effects in the model. We show how the transformations modify the random effects in the model, as well as their covariance matrix and the value of the restricted log-likelihood. We suggest a variable selection strategy for the linear mixed-effects model.

18.

Genetic Algorithm in the Wavelet Domain for Large p Small n Regression

Eylem Deniz Howe Orietta Nicolis 《Communications in Statistics - Simulation and Computation》2015,44(5):1144-1157

Many areas of statistical modeling are plagued by the "curse of dimensionality," in which there are more variables than observations. This is especially true when developing functional regression models where the independent dataset is some type of spectral decomposition, such as data from near-infrared spectroscopy. While we could develop a very complex model by simply taking enough samples (such that n > p), this could prove impossible or prohibitively expensive. In addition, a regression model developed like this could turn out to be highly inefficient, as spectral data usually exhibit high multicollinearity. In this article, we propose a two-part algorithm for selecting an effective and efficient functional regression model. Our algorithm begins by evaluating a subset of discrete wavelet transformations, allowing for variation in both wavelet and filter number. Next, we perform an intermediate processing step to remove variables with low correlation to the response data. Finally, we use the genetic algorithm to perform a stochastic search through the subset regression model space, driven by an information-theoretic objective function. We allow our algorithm to develop the regression model for each response variable independently, so as to optimally model each variable. We demonstrate our method on the familiar biscuit dough dataset, which has been used in a similar context by several researchers. Our results demonstrate both the flexibility and the power of our algorithm. For each response variable, a different subset model is selected, and different wavelet transformations are used. The models developed by our algorithm show an improvement, as measured by lower mean error, over results in the published literature.

19.

A Note on Multiple Regression for Single Index Model

《Communications in Statistics - Theory and Methods》2013,42(10):2409-2422

Abstract

A simple method based on sliced inverse regression (SIR) is proposed to explore an effective dimension reduction (EDR) vector for the single index model. We avoid the principal component analysis step of the original SIR by using two sample mean vectors in two slices of the response variable and their difference vector. The theory becomes simpler, the method is equivalent to multiple linear regression with a dichotomized response, and the estimator can be expressed in closed form, although the objective function might be an unknown nonlinear function. It can be applied when the number of covariates is large, and it requires no matrix operation or iterative calculation.
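
The two-slice idea in this abstract can be sketched roughly as follows. The details are assumptions: the response is split at its median into a lower and an upper slice, and the direction is taken as the raw difference of the covariate mean vectors in the two slices, with no standardization of the covariates. The paper's estimator may include scaling that this sketch omits, and the function name `two_slice_direction` is hypothetical.

```python
def two_slice_direction(X, y):
    """Difference of covariate mean vectors between the upper and lower response slices.

    X is a list of rows (one list of covariate values per observation);
    y is the list of responses. The slices are the lower and upper halves of y.
    """
    # Indices of the observations sorted by response value.
    order = sorted(range(len(y)), key=lambda i: y[i])
    half = len(y) // 2
    low, high = order[:half], order[half:]
    p = len(X[0])

    def slice_mean(rows):
        # Componentwise mean of the covariates over the given observations.
        return [sum(X[i][j] for i in rows) / len(rows) for j in range(p)]

    m_low, m_high = slice_mean(low), slice_mean(high)
    return [h - l for h, l in zip(m_high, m_low)]
```

Only sorting and averaging are involved, which is consistent with the abstract's remark that no matrix operation or iterative calculation is required.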

20.

Finite sample properties of an HPT estimator when each individual regression coefficient is estimated in a misspecified linear regression model

Haifeng Xu 《Communications in Statistics - Theory and Methods》2013,42(2):506-519

ABSTRACT

In this paper, assuming that there exist omitted variables in the specified model, we analytically derive the exact formula for the mean squared error (MSE) of a heterogeneous pre-test (HPT) estimator whose components are the ordinary least squares (OLS) and feasible ridge regression (FRR) estimators. Since we cannot examine the MSE performance analytically, we execute numerical evaluations to investigate small sample properties of the HPT estimator, and compare the MSE performance of the HPT estimator with those of the FRR estimator and the usual OLS estimator. Our numerical results show that (1) the HPT estimator is more efficient when the model misspecification is severe; (2) the HPT estimator with the optimal critical value obtained under the correctly specified model can be safely used even when there exist omitted variables in the specified model.