Similar articles
 20 similar articles found (search time: 15 ms)
1.
Significance tests on coefficients of lower-order terms in polynomial regression models are affected by linear transformations. For this reason, a polynomial regression model that excludes hierarchically inferior predictors (i.e., lower-order terms) is considered to be not well formulated. Existing variable-selection algorithms do not take into account the hierarchy of predictors and often select as “best” a model that is not hierarchically well formulated. This article proposes a theory of the hierarchical ordering of the predictors of an arbitrary polynomial regression model in m variables, where m is any arbitrary positive integer. Ways of modifying existing algorithms to restrict their search to well-formulated models are suggested. An algorithm that generates all possible well-formulated models is presented.
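
The notion of a hierarchically well-formulated model can be illustrated with a small brute-force sketch. This is not the algorithm from the article, which the abstract does not describe. It assumes that each predictor is represented by its tuple of exponents, that a term is "hierarchically inferior" to another when its exponents are componentwise no larger, and that the intercept is always present. All function names are hypothetical.

```python
from itertools import combinations, product


def terms_up_to_degree(m, d):
    """Exponent tuples of all monomials in m variables with total degree 1..d."""
    return [e for e in product(range(d + 1), repeat=m) if 1 <= sum(e) <= d]


def inferior_terms(term):
    """Monomials whose exponents are componentwise <= term (term and intercept excluded)."""
    ranges = [range(k + 1) for k in term]
    return {e for e in product(*ranges) if e != term and sum(e) > 0}


def is_well_formulated(model):
    """A model is well formulated if it contains every inferior term of each of its terms."""
    model = set(model)
    return all(inferior_terms(t) <= model for t in model)


def well_formulated_models(m, d):
    """Enumerate all well-formulated models by checking every subset (exponential cost)."""
    terms = terms_up_to_degree(m, d)
    models = []
    for r in range(len(terms) + 1):
        for subset in combinations(terms, r):
            if is_well_formulated(subset):
                models.append(frozenset(subset))
    return models


# One variable, degree 2: {}, {x}, {x, x^2} are well formulated; {x^2} alone is not.
print(len(well_formulated_models(1, 2)))  # 3
print(is_well_formulated([(2,)]))         # False
```

Checking every subset is only practical for a handful of terms; a dedicated algorithm would presumably build models term by term along the hierarchy rather than filter all subsets.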

2.
This article presents a novel Bayesian analysis for linear mixed-effects models. The analysis is based on the method of partial collapsing that allows some components to be partially collapsed out of a model. The resulting partially collapsed Gibbs (PCG) sampler constructed to fit linear mixed-effects models is expected to exhibit much better convergence properties than the corresponding Gibbs sampler. In order to construct the PCG sampler without complicating component updates, we consider the reparameterization of model components by expressing a between-group variance in terms of a within-group variance in a linear mixed-effects model. The proposed method of partial collapsing with reparameterization is applied to Merton’s jump diffusion model as well as general linear mixed-effects models with proper prior distributions and illustrated using simulated data and longitudinal data on sleep deprivation.

3.
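Because the commonly used linear mixed-effects model has limitations when modeling longitudinal data with nonlinear relationships, the linear mixed-effects model is extended: different nonlinear mixed-effects models are built according to the nonlinear relationships among the variables, and mixture distribution models are built according to the distributional characteristics of the response variable. Based on a set of real insurance loss data, a polynomial mixed-effects model, a truncated polynomial mixed-effects model, and a B-spline mixed-effects model are fitted. The results show that nonlinear mixed-effects models significantly improve the modeling of insurance loss data and provide an important reference for non-life insurance ratemaking.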

4.
Mixtures of linear mixed-effects models have received considerable attention in longitudinal studies, including medical research, social science and economics. The inferential question of interest is often the identification of critical factors that affect the responses. We consider a Bayesian approach to select the important fixed and random effects in the finite mixture of linear mixed-effects models. To accomplish our goal, latent variables are introduced to facilitate the identification of influential fixed and random components and to classify the membership of observations in the longitudinal data. A spike-and-slab prior for the regression coefficients is adopted to sidestep the potential complications of highly collinear covariates and to handle large p and small n issues in the variable selection problems. Here we employ Markov chain Monte Carlo (MCMC) sampling techniques for posterior inferences and explore the performance of the proposed method in simulation studies, followed by an actual psychiatric data analysis concerning depressive disorder.

5.
In this article, we present a compressive sensing based framework for generalized linear model regression that employs a two-component noise model and convex optimization techniques to simultaneously detect outliers and determine optimally sparse representations of noisy data from arbitrary sets of basis functions. We then extend our model to include model order reduction capabilities that can uncover inherent sparsity in regression coefficients and achieve simple, superior fits. Second, we use the mixed ℓ2/ℓ1 norm to develop another model that can efficiently uncover block-sparsity in regression coefficients. By performing model order reduction over all independent variables and basis functions, our algorithms successfully deemphasize the effect of independent variables that become uncorrelated with dependent variables. This desirable property has various applications in real-time anomaly detection, such as faulty sensor detection and sensor jamming in wireless sensor networks. After developing our framework and inheriting a stable recovery theorem from compressive sensing theory, we present two simulation studies on sparse or block-sparse problems that demonstrate the superior performance of our algorithms with respect to (1) classic outlier-invariant regression techniques like least absolute value and iteratively reweighted least-squares and (2) classic sparse-regularized regression techniques like LASSO.

6.
Functional data can be clustered by plugging estimated regression coefficients from individual curves into the k-means algorithm. Clustering results can differ depending on how the curves are fit to the data. Estimating curves using different sets of basis functions corresponds to different linear transformations of the data. k-means clustering is not invariant to linear transformations of the data. The optimal linear transformation for clustering will stretch the distribution so that the primary direction of variability aligns with actual differences in the clusters. It is shown that clustering the raw data will often give results similar to clustering regression coefficients obtained using an orthogonal design matrix. Clustering functional data using an L² metric on function space can be achieved by clustering a suitable linear transformation of the regression coefficients. An example where depressed individuals are treated with an antidepressant is used for illustration.

7.
The subject of this paper is Bayesian inference about the fixed and random effects of a mixed-effects linear statistical model with two variance components. It is assumed that a priori the fixed effects have a noninformative distribution and that the reciprocals of the variance components are distributed independently (of each other and of the fixed effects) as gamma random variables. It is shown that techniques similar to those employed in a ridge analysis of a response surface can be used to construct a one-dimensional curve that contains all of the stationary points of the posterior density of the random effects. The “ridge analysis” (of the posterior density) can be useful (from a computational standpoint) in finding the number and the locations of the stationary points and can be very informative about various features of the posterior density. Depending on what is revealed by the ridge analysis, a multivariate normal or multivariate-t distribution that is centered at a posterior mode may provide a satisfactory approximation to the posterior distribution of the random effects (which is of the poly-t form).

8.
Binary data are commonly used as responses to assess the effects of independent variables in longitudinal factorial studies. Such effects can be assessed in terms of the rate difference (RD), the odds ratio (OR), or the rate ratio (RR). Traditionally, logistic regression has been the recommended method, with statistical comparisons made in terms of the OR. Statistical inference in terms of the RD and RR can then be derived using the delta method. However, this approach is hard to realize when repeated measures occur. To obtain statistical inference in longitudinal factorial studies, the current article shows that the mixed-effects model for repeated measures, the logistic regression for repeated measures, the log-transformed regression for repeated measures, and the rank-based methods are all valid methods that lead to inference in terms of the RD, OR, and RR, respectively. Asymptotic linear relationships between the estimators of the regression coefficients of these models are derived when the weight (working covariance) matrix is an identity matrix. Conditions for the Wald-type tests to be asymptotically equivalent in these models are provided, and powers are compared using simulation studies. A phase III clinical trial is used to illustrate the investigated methods with corresponding SAS® code supplied.

9.
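For regression analysis in data systems where both the independent and dependent variables are fuzzy, and to avoid the problems caused by the larger estimation error that may arise when the independent variables degenerate into crisp numerical variables, a general structure for a regression model with a fuzzy adjustment term is proposed, and the analytical expression of the model is studied using least squares based on a complete distance between fuzzy numbers. Using the concept of level cuts, the fuzzy multiple regression model is transformed into two conventional regression models, and parameter estimates are obtained by least squares according to the distance between fuzzy numbers. A numerical example on employee job performance evaluation illustrates the effectiveness of the method, and, combined with the Bootstrap method, the dynamic variation of the random uncertainty of the regression parameters is studied.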

10.
In this article, the multivariate linear regression model is studied under the assumptions that the error term of this model is described by the elliptically contoured distribution and the observations on the response variables are of a monotone missing pattern. It is primarily concerned with estimation of the model parameters, as well as with the development of the likelihood ratio test in order to examine the existence of linear constraints on the regression coefficients. An illustrative example is presented for the explanation of the results.

11.
Linear regression with compositional explanatory variables   Total citations: 1, self-citations: 0, citations by others: 1
Compositional explanatory variables should not be directly used in a linear regression model because any inference statistic can become misleading. While various approaches for this problem were proposed, here an approach based on the isometric logratio (ilr) transformation is used. It turns out that the resulting model is easy to handle, and that parameter estimation can be done as in ordinary linear regression. Moreover, it is possible to use the ilr variables for inference statistics in order to obtain an appropriate interpretation of the model.
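
As a rough illustration of regressing on ilr variables, the sketch below computes one common choice of ilr coordinates (often called pivot coordinates) and then fits ordinary least squares on them. The abstract does not say which ilr basis the article uses, so the basis here is an assumption, and the function names `ilr_pivot` and `fit_ols` are hypothetical.

```python
import numpy as np


def ilr_pivot(X):
    """Map compositions (rows of strictly positive parts) to D-1 ilr pivot coordinates.

    Coordinate i contrasts part i with the geometric mean of the parts after it:
    z_i = sqrt(k / (k + 1)) * (log x_i - mean(log x_{i+1..D})), with k parts remaining.
    """
    logX = np.log(np.asarray(X, dtype=float))
    n, D = logX.shape
    Z = np.empty((n, D - 1))
    for i in range(D - 1):
        k = D - i - 1  # number of parts after part i
        rest_mean_log = logX[:, i + 1:].mean(axis=1)
        Z[:, i] = np.sqrt(k / (k + 1)) * (logX[:, i] - rest_mean_log)
    return Z


def fit_ols(Z, y):
    """Ordinary least squares of y on the ilr coordinates, with an intercept first."""
    A = np.column_stack([np.ones(len(Z)), Z])
    coef, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return coef


# Illustrative data only: random 3-part compositions and a synthetic response.
rng = np.random.default_rng(0)
X = rng.dirichlet([2.0, 3.0, 4.0], size=50)
Z = ilr_pivot(X)
y = 1.0 + 2.0 * Z[:, 0] - 1.0 * Z[:, 1] + rng.normal(scale=0.1, size=50)
print(fit_ols(Z, y))  # intercept followed by the two ilr coefficients
```

Because the coordinates depend only on ratios of parts, rescaling a composition (for example from proportions to percentages) leaves them unchanged.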

12.
13.
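Dropping the intercept and omitting explanatory variables are two common mistakes in estimating linear regression models. The intercept is wrongly dropped when it is found to be insignificant during testing and is therefore removed, which distorts both parameter estimation and hypothesis testing. Explanatory variables are wrongly omitted when people mistakenly believe that regression analysis can be carried out as long as the variables are correlated and causally linked, so that other important explanatory variables are not considered; a model built this way cannot be used for economic structural analysis or policy evaluation and can at most be used for prediction.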
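
The effect of dropping the intercept can be seen in a small noise-free sketch. The data below are made up for illustration (y = 5 + 2x exactly), and the helper `fit_line` is hypothetical; the point is only that forcing the line through the origin changes the slope estimate.

```python
import numpy as np


def fit_line(x, y, intercept=True):
    """Least-squares fit of y on x, with or without an intercept column."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones_like(x), x]) if intercept else x[:, None]
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 5.0 + 2.0 * x  # true intercept 5, true slope 2, no noise

print(fit_line(x, y, intercept=True))   # recovers intercept 5 and slope 2
print(fit_line(x, y, intercept=False))  # slope = sum(x*y) / sum(x*x) = 185/55, about 3.36
```

With the intercept removed, the slope absorbs the missing constant and is pulled away from its true value of 2, which is the distortion of parameter estimates described above.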

14.
In this article, a general approach to latent variable models based on an underlying generalized linear model (GLM) with factor analysis observation process is introduced. We call these models Generalized Linear Factor Models (GLFM). The observations are produced from a general model framework that involves observed and latent variables that are assumed to be distributed in the exponential family. More specifically, we concentrate on situations where the observed variables are both discretely measured (e.g., binomial, Poisson) and continuously distributed (e.g., gamma). The common latent factors are assumed to be independent with a standard multivariate normal distribution. Practical details of training such models with a new local expectation-maximization (EM) algorithm, which can be considered as a generalized EM-type algorithm, are also discussed. In conjunction with an approximated version of the Fisher score algorithm (FSA), we show how to calculate maximum likelihood estimates of the model parameters, and to yield inferences about the unobservable path of the common factors. The methodology is illustrated by an extensive Monte Carlo simulation study and the results show promising performance.

15.
Double hierarchical generalized linear models (with discussion)   Total citations: 2, self-citations: 0, citations by others: 2
Summary. We propose a class of double hierarchical generalized linear models in which random effects can be specified for both the mean and dispersion. Heteroscedasticity between clusters can be modelled by introducing random effects in the dispersion model, as is heterogeneity between clusters in the mean model. This class will, among other things, enable models with heavy-tailed distributions to be explored, providing robust estimation against outliers. The h-likelihood provides a unified framework for this new class of models and gives a single algorithm for fitting all members of the class. This algorithm does not require quadrature or prior probabilities.

16.
This paper introduces an alternating conditional expectation (ACE) algorithm: a non-parametric approach for estimating the transformations that lead to the maximal multiple correlation of a response and a set of independent variables in regression and correlation analysis. These transformations can give the data analyst insight into the relationships between these variables so that these can be best described and non-linear relationships uncovered. Using the Bayesian information criterion (BIC), we show how to find the best closed-form approximations for the optimal ACE transformations. By means of ACE and BIC, the model fit can be considerably improved compared with the conventional linear model as demonstrated in the two simulated and two real datasets in this paper.

17.
We propose a new semiparametric Weibull cure rate model for fitting nonlinear effects of explanatory variables on the mean, scale and cure rate parameters. The regression model is based on the generalized additive models for location, scale and shape, for which any or all distribution parameters can be modeled as parametric linear and/or nonparametric smooth functions of explanatory variables. We present methods to select additive terms, model estimation and validation, where all computational codes are presented in a simple way such that any R user can fit the new model. Biases of the parameter estimates caused by models specified erroneously are investigated through Monte Carlo simulations. We illustrate the usefulness of the new model by means of two applications to real data. We provide computational codes to fit the new regression model in the R software.

18.
Existing research on mixtures of regression models is limited to directly observed predictors. The estimation of mixtures of regression for measurement error data imposes challenges for statisticians. For linear regression models with measurement error data, the naive ordinary least squares method, which directly substitutes the observed surrogates for the unobserved error-prone variables, yields an inconsistent estimate for the regression coefficients. The same inconsistency also happens to the naive mixtures of regression estimate, which is based on the traditional maximum likelihood estimator and simply ignores the measurement error. To solve this inconsistency, we propose to use the deconvolution method to estimate the mixture likelihood of the observed surrogates. Then our proposed estimate is found by maximizing the estimated mixture likelihood. In addition, a generalized EM algorithm is also developed to find the estimate. The simulation results demonstrate that the proposed estimation procedures work well and perform much better than the naive estimates.
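
The inconsistency of naive least squares under measurement error can be seen in a short simulation. This sketch covers only the single-regression case, not the mixture model or the deconvolution estimator of the article. It assumes classical additive error (surrogate w = x + u, with u independent of x), standard normal x, and made-up parameter values; `naive_vs_true_slope` is a hypothetical helper.

```python
import numpy as np


def naive_vs_true_slope(n=100_000, beta=2.0, sigma_u=1.0, seed=0):
    """Simulate y = beta * x + e and compare OLS slopes using x versus the surrogate w."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                      # unobserved true predictor, variance 1
    w = x + rng.normal(scale=sigma_u, size=n)   # observed surrogate with additive error
    y = beta * x + rng.normal(scale=0.5, size=n)

    def slope(pred):
        # Simple-regression slope: sample covariance over sample variance.
        return np.cov(pred, y)[0, 1] / np.var(pred, ddof=1)

    return slope(x), slope(w)


true_slope, naive_slope = naive_vs_true_slope()
# Under these assumptions the naive slope converges to beta / (1 + sigma_u**2),
# i.e. roughly half of beta here, no matter how large n becomes.
print(true_slope, naive_slope)
```

Increasing n tightens both estimates but does not move the naive slope toward beta, which is what inconsistency means in this setting.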

19.
In real‐data analysis, deciding the best subset of variables in regression models is an important problem. Akaike's information criterion (AIC) is often used in order to select variables in many fields. When the sample size is not so large, the AIC has a non‐negligible bias that will detrimentally affect variable selection. The present paper considers a bias correction of AIC for selecting variables in the generalized linear model (GLM). The GLM can express a number of statistical models by changing the distribution and the link function, such as the normal linear regression model, the logistic regression model, and the probit model, which are currently commonly used in a number of applied fields. In the present study, we obtain a simple expression for a bias‐corrected AIC (corrected AIC, or CAIC) in GLMs. Furthermore, we provide an ‘R’ code based on our formula. A numerical study reveals that the CAIC has better performance than the AIC for variable selection.

20.
In this paper, we consider partially linear additive models with an unknown link function, which include single‐index models and additive models as special cases. We use the polynomial spline method for estimating the unknown link function as well as the component functions in the additive part. We establish that convergence rates for all nonparametric functions are the same as in one‐dimensional nonparametric regression. For a faster rate of the parametric part, we need to define an appropriate ‘projection’ that is more complicated than that defined previously for partially linear additive models. Compared to previous approaches, a distinct advantage of our estimation approach in implementation is that estimation reduces directly to estimation in the single‐index model and can thus deal with much larger dimensional problems than previous approaches for additive models with unknown link functions. Simulations and a real dataset are used to illustrate the proposed model.

