Similar articles
 Found 20 similar articles (search time: 46 ms)
1.
Binary data are commonly used as responses to assess the effects of independent variables in longitudinal factorial studies. Such effects can be assessed in terms of the rate difference (RD), the odds ratio (OR), or the rate ratio (RR). Traditionally, logistic regression has been the recommended method, with statistical comparisons made in terms of the OR. Statistical inference in terms of the RD and RR can then be derived using the delta method. However, this approach is hard to carry out when repeated measures occur. To obtain statistical inference in longitudinal factorial studies, the current article shows that the mixed-effects model for repeated measures, the logistic regression for repeated measures, the log-transformed regression for repeated measures, and the rank-based methods are all valid methods that lead to inference in terms of the RD, OR, and RR, respectively. Asymptotic linear relationships between the estimators of the regression coefficients of these models are derived when the weight (working covariance) matrix is an identity matrix. Conditions for the Wald-type tests to be asymptotically equivalent in these models are provided, and their powers are compared using simulation studies. A phase III clinical trial is used to illustrate the investigated methods, with corresponding SAS® code supplied.

2.
The autoregressive model for cointegrated variables is analyzed with respect to the role of the constant and linear terms. Various models for I(1) variables defined by restrictions on the deterministic terms are discussed, and it is shown that statistical inference can be performed by reduced rank regression. The asymptotic distributions of the test statistics and estimators are found. A similar analysis is given for models for I(2) variables with a constant term.

3.
The autoregressive model for cointegrated variables is analyzed with respect to the role of the constant and linear terms. Various models for I(1) variables defined by restrictions on the deterministic terms are discussed, and it is shown that statistical inference can be performed by reduced rank regression. The asymptotic distributions of the test statistics and estimators are found. A similar analysis is given for models for I(2) variables with a constant term.

4.
Tsou (2003a) proposed a parametric procedure for making robust inference for mean regression parameters in the context of generalized linear models. This robust procedure is extended to model variance heterogeneity. The normal working model is adjusted to become asymptotically robust for inference about regression parameters of the variance function for practically all continuous response variables. The connection between the novel robust variance regression model and the estimating equations approach is also provided.

5.
Summary. When a number of distinct models contend for use in prediction, the choice of a single model can offer rather unstable predictions. In regression, stochastic search variable selection with Bayesian model averaging offers a cure for this robustness issue but at the expense of requiring very many predictors. Here we look at Bayes model averaging incorporating variable selection for prediction. This offers similar mean-square errors of prediction but with a vastly reduced predictor space. This can greatly aid the interpretation of the model. It also reduces the cost if measured variables have costs. The development here uses decision theory in the context of the multivariate general linear model. In passing, this reduced predictor space Bayes model averaging is contrasted with single-model approximations. A fast algorithm for updating regressions in the Markov chain Monte Carlo searches for posterior inference is developed, allowing many more variables than observations to be contemplated. We discuss the merits of absolute rather than proportionate shrinkage in regression, especially when there are more variables than observations. The methodology is illustrated on a set of spectroscopic data used for measuring the amounts of different sugars in an aqueous solution.

6.
To assess the quality of the fit in a multiple linear regression, the coefficient of determination, or R², is a very simple tool, yet the one most used by practitioners. Indeed, it is reported in most statistical analyses, and although it is not recommended as a final model selection tool, it provides an indication of the suitability of the chosen explanatory variables in predicting the response. In the classical setting, it is well known that the least-squares fit and coefficient of determination can be arbitrary and/or misleading in the presence of a single outlier. In many applied settings, the assumption of normality of the errors and the absence of outliers are difficult to establish. In these cases, robust procedures for estimation and inference in linear regression are available and provide a suitable alternative.

7.
The linear regression model is commonly used by practitioners to model the relationship between the variable of interest and a set of explanatory variables. The assumption that all error variances are the same (homoskedasticity) is oftentimes violated. Consistent regression standard errors can be computed using the heteroskedasticity-consistent covariance matrix estimator proposed by White (1980). Such standard errors, however, typically display nonnegligible systematic errors in finite samples, especially under leveraged data. Cribari-Neto et al. (2000) improved upon the White estimator by defining a sequence of bias-adjusted estimators with increasing accuracy. In this paper, we improve upon their main result by defining an alternative sequence of adjusted estimators whose biases vanish at a much faster rate. Hypothesis testing inference is also addressed. An empirical illustration is presented.

8.
Biplots are a widely used statistical tool for visualizing the loadings and scores that result from a dimension reduction technique applied to multivariate data. If the underlying data carry only relative information (i.e. compositional data expressed in proportions, mg/kg, etc.), they have to be pre-processed with a logratio transformation before the dimension reduction is carried out. In the context of principal component analysis, the resulting biplot is called a compositional biplot. We introduce an alternative, the ilr biplot, which is based on a special choice of orthonormal coordinates resulting from an isometric logratio (ilr) transformation. This makes it possible to incorporate external non-compositional variables as well, and to study their relations to the compositional variables. The methodology is demonstrated on real data sets.

9.
Bayesian networks are not well-formulated for continuous variables. The majority of recent works dealing with Bayesian inference are restricted only to special types of continuous variables such as the conditional linear Gaussian model for Gaussian variables. In this context, an exact Bayesian inference algorithm for clusters of continuous variables which may be approximated by independent component analysis models is proposed. The complexity in memory space is linear and the overfitting problem is attenuated, while the inference time is still exponential. Experiments for multibiometric score fusion with quality estimates are conducted, and it is observed that the performances are satisfactory compared to some known fusion techniques.

10.
Multicollinearity and model misspecification are frequently encountered problems in practice that produce undesirable effects on the classical ordinary least squares (OLS) regression estimator. The ridge regression estimator is an important tool to reduce the effects of multicollinearity, but it is still sensitive to misspecification of the error distribution. Although rank-based statistical inference has desirable robustness properties compared to the OLS procedures, it can be unstable in the presence of multicollinearity. This paper introduces a rank regression estimator for regression parameters and develops tests for general linear hypotheses in a multiple linear regression model. The proposed estimator and the tests have desirable robustness features against multicollinearity and misspecification of the error distribution. Asymptotic behaviours of the proposed estimator and the test statistics are investigated. Real and simulated data sets are used to demonstrate the feasibility and the performance of the estimator and the tests.

11.
This paper is concerned with the selection of explanatory variables in generalized linear models (GLMs). The class of GLMs is quite large and contains, e.g., ordinary linear regression, binary logistic regression, the probit model and Poisson regression with linear or log-linear parameter structure. We show that, through an approximation of the log likelihood and a certain data transformation, the variable selection problem in a GLM can be converted into variable selection in an ordinary (unweighted) linear regression model. As a consequence, no specific computer software for variable selection in GLMs is needed. Instead, a suitable variable selection program for linear regression can be used. We also present a simulation study which shows that the log likelihood approximation is very good in many practical situations. Finally, we briefly mention possible extensions to regression models outside the class of GLMs.

12.
Calibration techniques in survey sampling, such as generalized regression estimation (GREG), were formalized in the 1990s to produce efficient estimators of linear combinations of study variables, such as totals or means. They implicitly rely on the assumption of a linear regression model between the variable of interest and some auxiliary variables in order to yield estimates with lower variance if the model is true and remaining approximately design-unbiased even if the model does not hold. We propose a new class of model-assisted estimators obtained by releasing a few calibration constraints and replacing them with a penalty term. This penalization is added to the distance criterion to be minimized. By introducing the concept of penalized calibration, combining usual calibration and this ‘relaxed’ calibration, we are able to adjust the weight given to the available auxiliary information. We obtain a more flexible estimation procedure giving better estimates, particularly when the auxiliary information is overly abundant or not fully appropriate to be completely used. Such an approach can also be seen as a design-based alternative to the estimation procedures based on the more general class of mixed models, presenting new prospects in some areas of application such as inference on small domains.

13.
Approximate conditional inference is developed for the slope parameter of the linear functional model with two variables. It is shown that the model can be transformed so that the slope parameter becomes an angle and nuisance parameters are radial distances. If the nuisance parameters are known, an exact confidence interval based on a location-type conditional distribution is available for the angle. More generally, confidence distributions are used to average the conditional distribution over the nuisance parameters, yielding an approximate conditional confidence interval that reflects the precision indicated by the data. An example is analyzed.

14.
Summary.  Structured additive regression models are perhaps the most commonly used class of models in statistical applications. This class includes, among others, (generalized) linear models, (generalized) additive models, smoothing spline models, state space models, semiparametric regression, spatial and spatiotemporal models, log-Gaussian Cox processes and geostatistical and geoadditive models. We consider approximate Bayesian inference in a popular subset of structured additive regression models, latent Gaussian models, where the latent field is Gaussian, controlled by a few hyperparameters and with non-Gaussian response variables. The posterior marginals are not available in closed form owing to the non-Gaussian response variables. For such models, Markov chain Monte Carlo methods can be implemented, but they are not without problems, in terms of both convergence and computational time. In some practical applications, the extent of these problems is such that Markov chain Monte Carlo sampling is simply not an appropriate tool for routine analysis. We show that, by using an integrated nested Laplace approximation and its simplified version, we can directly compute very accurate approximations to the posterior marginals. The main benefit of these approximations is computational: where Markov chain Monte Carlo algorithms need hours or days to run, our approximations provide more precise estimates in seconds or minutes. Another advantage of our approach is its generality, which makes it possible to perform Bayesian analysis in an automatic, streamlined way, and to compute model comparison criteria and various predictive measures so that models can be compared and the model under study can be challenged.

15.
Summary.  We introduce a flexible marginal modelling approach for statistical inference for clustered and longitudinal data under minimal assumptions. This estimated estimating equations approach is semiparametric and the proposed models are fitted by quasi-likelihood regression, where the unknown marginal means are a function of the fixed effects linear predictor with unknown smooth link, and variance–covariance is an unknown smooth function of the marginal means. We propose to estimate the nonparametric link and variance–covariance functions via smoothing methods, whereas the regression parameters are obtained via the estimated estimating equations. These are score equations that contain nonparametric function estimates. The proposed estimated estimating equations approach is motivated by its flexibility and easy implementation. Moreover, if data follow a generalized linear mixed model, with either a specified or an unspecified distribution of random effects and link function, the model proposed emerges as the corresponding marginal (population-average) version and can be used to obtain inference for the fixed effects in the underlying generalized linear mixed model, without the need to specify any other components of this generalized linear mixed model. Among marginal models, the estimated estimating equations approach provides a flexible alternative to modelling with generalized estimating equations. Applications of estimated estimating equations include diagnostics and link selection. The asymptotic distribution of the proposed estimators for the model parameters is derived, enabling statistical inference. Practical illustrations include Poisson modelling of repeated epileptic seizure counts and simulations for clustered binomial responses.

16.
While most regression models focus on explaining distributional aspects of a single response variable alone, interest in modern statistical applications has recently shifted towards simultaneously studying multiple response variables as well as their dependence structure. Copula-based regression models are a particularly useful tool for pursuing such an analysis, since they enable the separation of the marginal response distributions and the dependence structure summarised in a specific copula model. However, so far copula-based regression models have mostly relied on two-step approaches, where the marginal distributions are determined first and the copula structure is studied in a second step after plugging in the estimated marginal distributions. Moreover, the parameters of the copula are mostly treated as constants not related to covariates, and most regression specifications for the marginals are restricted to purely linear predictors. We therefore propose simultaneous Bayesian inference for both the marginal distributions and the copula using computationally efficient Markov chain Monte Carlo simulation techniques. In addition, we replace the commonly used linear predictor by a generic structured additive predictor comprising, for example, nonlinear effects of continuous covariates, spatial effects or random effects, and furthermore allow the copula parameters to be covariate-dependent. To facilitate Bayesian inference, we construct proposal densities for a Metropolis–Hastings algorithm relying on quadratic approximations to the full conditionals of regression coefficients, avoiding manual tuning. The performance of the resulting Bayesian estimates is evaluated in simulations comparing our approach with penalised likelihood inference, studying the choice of a specific copula model based on the deviance information criterion, and comparing a simultaneous approach with a two-step procedure. Furthermore, the flexibility of Bayesian conditional copula regression models is illustrated in two applications on childhood undernutrition and macroecology.

17.
Summary.  Generalized linear latent variable models (GLLVMs), as defined by Bartholomew and Knott, enable modelling of relationships between manifest and latent variables. They extend structural equation modelling techniques, which are powerful tools in the social sciences. However, because of the complexity of the log-likelihood function of a GLLVM, an approximation such as numerical integration must be used for inference. This can drastically limit the number of variables in the model and can lead to biased estimators. We propose a new estimator for the parameters of a GLLVM, based on a Laplace approximation to the likelihood function, which can be computed even for models with a large number of variables. The new estimator can be viewed as an M-estimator, leading to readily available asymptotic properties and correct inference. A simulation study shows its excellent finite sample properties, in particular when compared with a well-established approach such as LISREL. A real data example on the measurement of wealth for the computation of multidimensional inequality is analysed to highlight the importance of the methodology.

18.
The use of heteroscedasticity-consistent covariance matrix (HCCM) estimators is very common in practice to draw correct inference for the coefficients of a linear regression model with heteroscedastic errors. However, in addition to the problem of heteroscedasticity, linear regression models may also be plagued with a considerable degree of collinearity among the regressors when two or more regressors are considered. This situation has many adverse effects on the least squares measures and, as an alternative, the ordinary ridge regression method is commonly used. But in the available literature, the problems of multicollinearity and heteroscedasticity have not been discussed as a combined issue, especially for inference about the regression coefficients. The present article addresses inference about the regression coefficients taking both multicollinearity and heteroscedasticity into account, and suggests the use of HCCM estimators for ridge regression. This article proposes t- and F-tests, based on these HCCM estimators, that perform adequately well in the numerical evaluation of the Monte Carlo simulations.

19.
A special source of difficulty in statistical analysis is the possibility that some subjects may not have a complete observation of the response variable. Such incomplete observation of the response variable is called censoring. Censoring can occur for a variety of reasons, including limitations of measurement equipment, design of the experiment, and non-occurrence of the event of interest until the end of the study. In the presence of censoring, the dependence of the response variable on the explanatory variables can be explored through regression analysis. In this paper, we examine the censoring problem in the context of a class of asymmetric distributions; that is, we propose a linear regression model with censored responses based on skew scale mixtures of normal distributions. We develop a Monte Carlo EM (MCEM) algorithm to perform maximum likelihood inference of the parameters in the proposed linear censored regression models with skew scale mixtures of normal distributions. The MCEM algorithm is discussed with an emphasis on the skew-normal, skew Student-t-normal, skew-slash and skew-contaminated normal distributions. To examine the performance of the proposed method, we present some simulation studies and analyze a real dataset.

20.
Modern statistical applications involving large data sets have focused attention on statistical methodologies which are both efficient computationally and able to deal with the screening of large numbers of different candidate models. Here we consider computationally efficient variational Bayes approaches to inference in high-dimensional heteroscedastic linear regression, where both the mean and variance are described in terms of linear functions of the predictors and where the number of predictors can be larger than the sample size. We derive a closed form variational lower bound on the log marginal likelihood useful for model selection, and propose a novel fast greedy search algorithm on the model space which makes use of one-step optimization updates to the variational lower bound in the current model for screening large numbers of candidate predictor variables for inclusion/exclusion in a computationally thrifty way. We show that the model search strategy we suggest is related to widely used orthogonal matching pursuit algorithms for model search but yields a framework for potentially extending these algorithms to more complex models. The methodology is applied in simulations and in two real examples involving prediction for food constituents using NIR technology and prediction of disease progression in diabetes.
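
Abstract 1 in this list compares inference in terms of the rate difference (RD), the rate ratio (RR), and the odds ratio (OR). As a minimal illustration of how the three measures relate for a single two-group comparison, the Python sketch below computes them from raw counts. The function name `effect_measures` and the example counts are hypothetical and are not taken from the cited article, which works with repeated-measures models and SAS code.

```python
def effect_measures(events_treat, n_treat, events_ctrl, n_ctrl):
    """Return (RD, RR, OR) for two groups with a binary outcome.

    Hypothetical helper for illustration only; it assumes both event
    rates lie strictly between 0 and 1 so that all ratios are defined.
    """
    p1 = events_treat / n_treat   # event rate in the treated group
    p0 = events_ctrl / n_ctrl     # event rate in the control group
    rd = p1 - p0                                      # rate difference
    rr = p1 / p0                                      # rate ratio
    odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))    # odds ratio
    return rd, rr, odds_ratio


# Made-up counts: 30/100 events under treatment, 20/100 under control.
rd, rr, odds_ratio = effect_measures(30, 100, 20, 100)
print(f"RD = {rd:.2f}, RR = {rr:.2f}, OR = {odds_ratio:.2f}")
```

With these made-up counts the rates are 0.30 and 0.20, so the RD is 0.10, the RR is 1.50, and the OR is about 1.71. The same data therefore give three different effect sizes depending on the scale chosen.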
