Similar literature
Found 20 similar documents; search took 78 ms
1.
Tsou (2003a) proposed a parametric procedure for making robust inference for mean regression parameters in the context of generalized linear models. This robust procedure is extended to model variance heterogeneity. The normal working model is adjusted to become asymptotically robust for inference about regression parameters of the variance function for practically all continuous response variables. The connection between the novel robust variance regression model and the estimating equations approach is also provided.

2.
In many areas of application mixed linear models serve as a popular tool for analyzing highly complex data sets. For inference about fixed effects and variance components, likelihood-based methods such as (restricted) maximum likelihood estimators, (RE)ML, are commonly pursued. However, it is well-known that these fully efficient estimators are extremely sensitive to small deviations from hypothesized normality of random components as well as to other violations of distributional assumptions. In this article, we propose a new class of robust-efficient estimators for inference in mixed linear models. The new three-step estimation procedure provides truncated generalized least squares and variance components' estimators with hard-rejection weights adaptively computed from the data. More specifically, our data re-weighting mechanism first detects and removes within-subject outliers, then identifies and discards between-subject outliers, and finally it employs maximum likelihood procedures on the “clean” data. Theoretical efficiency and robustness properties of this approach are established.

3.
The threshold autoregressive (TAR) model is an important class of nonlinear time series models that possess many desirable features such as asymmetric limit cycles and amplitude-dependent frequencies. Statistical inference for the TAR model, however, encounters a major difficulty in the estimation of thresholds. This article develops an efficient procedure to estimate the thresholds. The procedure first transforms multiple-threshold detection to a regression variable selection problem, and then employs a group orthogonal greedy algorithm to obtain the threshold estimates. Desirable theoretical results are derived to lend support to the proposed methodology. Simulation experiments are conducted to illustrate the empirical performance of the method. Applications to U.S. GNP data are investigated.
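
For readers unfamiliar with the model class, a two-regime TAR(1) series can be simulated in a few lines. This is only a sketch of the model itself, not of the threshold estimation procedure described in the abstract; the function name `simulate_tar`, its parameters, and the Gaussian noise are assumptions made for illustration.

```python
import random

def simulate_tar(n, threshold, phi_low, phi_high, sigma=1.0, y0=0.0, seed=0):
    """Simulate n values of a two-regime TAR(1) series (illustrative only).

    The autoregressive coefficient switches according to whether the
    previous value lies at or below the threshold.
    """
    rng = random.Random(seed)
    y = [y0]
    for _ in range(n - 1):
        # Regime is chosen by the lagged value of the series itself.
        phi = phi_low if y[-1] <= threshold else phi_high
        y.append(phi * y[-1] + rng.gauss(0.0, sigma))
    return y
```

Setting `sigma=0.0` removes the noise, which makes the regime switching easy to trace by hand.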

4.
Dummy (0, 1) variables are frequently used in statistical modeling to represent the effect of certain extraneous factors. This paper presents a special-purpose linear programming algorithm for obtaining least-absolute-value estimators in a linear model with dummy variables. The algorithm employs a compact basis inverse procedure and incorporates the advanced basis exchange techniques available in specialized algorithms for the general linear least-absolute-value problem. Computational results with a computer code version of the algorithm are given.

5.
The use of surrogate variables has been proposed as a means to capture, for a given observed set of data, sources driving the dependency structure among high-dimensional sets of features and remove the effects of those sources and their potential negative impact on simultaneous inference. In this article we illustrate the potential effects of latent variables on testing dependence and the resulting impact on multiple inference. We briefly review the method of surrogate variable analysis proposed by Leek and Storey (PNAS 2008; 105:18718-18723) and assess that method via simulations intended to mimic the complexity of feature dependence observed in real-world microarray data. The method is also assessed via application to a recent Merck microarray data set. Both simulation and case study results indicate that surrogate variable analysis can offer a viable strategy for tackling the multiple testing dependence problem when the features follow a potentially complex correlation structure, yielding improvements in the variability of false positive rates and increases in power.

6.
For manifest variables with additive noise and for a given number of latent variables with an assumed distribution, we propose to nonparametrically estimate the association between latent and manifest variables. Our estimation is a two-step procedure: first, it employs standard factor analysis to estimate the latent variables as theoretical quantiles of the assumed distribution; second, it employs the additive models’ backfitting procedure to estimate the monotone nonlinear associations between latent and manifest variables. The estimated fit may suggest a different latent distribution or point to nonlinear associations. We show on simulated data how, based on mean squared errors, the nonparametric estimation improves on factor analysis. We then employ the new estimator on real data to illustrate its use for exploratory data analysis.

7.
The Reed-Frost epidemic model is a simple stochastic process with parameter q that describes the spread of an infectious disease among a closed population. Given data on the final outcome of an epidemic, it is possible to perform Bayesian inference for q using a simple Gibbs sampler algorithm. In this paper it is illustrated that by choosing latent variables appropriately, certain monotonicity properties hold which facilitate the use of a perfect simulation algorithm. The methods are applied to real data.
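
The chain-binomial structure of the Reed-Frost model is easy to sketch in code. The version below assumes the usual parametrization, in which q is the probability that a susceptible avoids infection by one given infective during a generation; the abstract does not define q, so this reading, along with the function name `simulate_reed_frost`, is an assumption. It simulates the epidemic only and does not implement the Gibbs sampler or the perfect simulation algorithm.

```python
import random

def simulate_reed_frost(n_susceptible, n_infective, q, rng=None):
    """Simulate one Reed-Frost epidemic; return infectives per generation.

    Assumes q is the probability that a susceptible escapes infection
    from a single infective in one generation (illustrative convention).
    """
    rng = rng or random.Random()
    s, i = n_susceptible, n_infective
    history = [i]
    while i > 0 and s > 0:
        # Escaping all i infectives has probability q**i, so each
        # susceptible is infected with probability 1 - q**i.
        p_infect = 1.0 - q ** i
        new_i = sum(1 for _ in range(s) if rng.random() < p_infect)
        s -= new_i
        i = new_i
        history.append(i)
    return history
```

The sum of the returned list is the total number ever infected, including the initial infectives, which is the kind of final-outcome data the abstract refers to.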

8.
In this paper, we develop a new class of double generalized linear models, introducing a random-effect component in the link function describing the linear predictor related to the precision parameter. This is a useful procedure to take into account extra variability and also to make the model more robust. The Bayesian paradigm is adopted to make inference in this class of models. Samples of the joint posterior distribution are drawn using standard Markov chain Monte Carlo procedures. Finally, we illustrate this algorithm by considering simulated and real data sets.

9.
A procedure for stepwise regression analysis for the non-experimental case is suggested. Regarding the problem as a multiple inference one, the procedure picks out the relevant regressors and, based on a slightly new approach, estimates the structure of dependencies among the variables involved.

10.
This paper considers the analysis of linear models where the response variable is a linear function of observable component variables. For example, scores on two or more psychometric measures (the component variables) might be weighted and summed to construct a single response variable in a psychological study. A linear model is then fit to the response variable. The question addressed in this paper is how to optimally transform the component variables so that the response is approximately normally distributed. The transformed component variables, themselves, need not be jointly normal. Two cases are considered; in both cases, the Box-Cox power family of transformations is employed. In Case I, the coefficients of the linear transformation are known constants. In Case II, the linear function is the first principal component based on the matrix of correlations among the transformed component variables. For each case, an algorithm is described for finding the transformation powers that minimize a generalized Anderson-Darling statistic. The proposed transformation procedure is compared to likelihood-based methods by means of simulation. The proposed method rarely performed worse than likelihood-based methods and for many data sets performed substantially better. As an illustration, the algorithm is applied to a problem from rural sociology and social psychology, namely, scaling family residences along an urban-rural dimension.
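
The Case I set-up can be made concrete with a small sketch: each component is transformed with its own Box-Cox power and the results are combined with known weights. The function names `box_cox` and `weighted_response` are hypothetical, and the search for powers that minimize the generalized Anderson-Darling statistic is not attempted here.

```python
import math

def box_cox(y, lam):
    """Box-Cox power transform of a positive value y with power lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires a positive value")
    if lam == 0:
        # The limit of (y**lam - 1) / lam as lam tends to 0 is log(y).
        return math.log(y)
    return (y ** lam - 1.0) / lam

def weighted_response(components, powers, weights):
    """Case I style response: a known-weight sum of transformed components."""
    return sum(w * box_cox(x, p)
               for x, p, w in zip(components, powers, weights))
```

For instance, `weighted_response([4.0, 1.0], [0.5, 2.0], [1.0, 3.0])` transforms 4.0 with power 0.5 and 1.0 with power 2.0 before summing with weights 1.0 and 3.0.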

11.
We consider a linear regression model where there are group structures in covariates. The group LASSO has been proposed for group variable selection. Many nonconvex penalties, such as the smoothly clipped absolute deviation and the minimax concave penalty, have been extended to group variable selection problems. The group coordinate descent (GCD) algorithm is widely used for fitting these models. However, GCD algorithms are hard to apply to nonconvex group penalties due to computational complexity unless the design matrix is orthogonal. In this paper, we propose an efficient optimization algorithm for nonconvex group penalties by combining the concave convex procedure and the group LASSO algorithm. We also extend the proposed algorithm to generalized linear models. We evaluate the numerical efficiency of the proposed algorithm compared to existing GCD algorithms through simulated data and real data sets.
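
The role of orthogonality can be seen in the standard group LASSO update: when a group's columns are orthonormal, the update for that group has a closed form known as group soft-thresholding. The sketch below shows that operator only; it is not the algorithm proposed in the abstract, and the name `group_soft_threshold` and its arguments are assumptions for illustration.

```python
import math

def group_soft_threshold(z, lam):
    """Shrink the coefficient vector z of one group toward zero.

    Returns (1 - lam / ||z||)_+ * z, the closed-form group LASSO
    update for a group with orthonormal columns (illustrative).
    """
    norm = math.sqrt(sum(v * v for v in z))
    if norm <= lam:
        # The whole group is set to zero together.
        return [0.0] * len(z)
    scale = 1.0 - lam / norm
    return [scale * v for v in z]
```

A group is either removed entirely or shrunk as a block, which is what makes the penalty select variables at the group level.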

12.
The performance of commonly used asymptotic inference procedures for the random-effects model used in meta-analysis relies on the number of studies. When the number of studies is moderate or small, the exact inference procedure is more reliable than its asymptotic counterparts. However, the related numerical computation may be demanding and an obstacle to routine use of the exact method. In this paper, we propose a novel numerical algorithm for constructing the exact 95% confidence interval of the location parameter in the random-effects model. The algorithm is much faster than the naive method and may greatly facilitate the use of the more appropriate exact inference procedure in meta-analysis. Numerical studies and real data examples are used to illustrate the advantage of the proposed method.

13.
The literature on spurious regressions has found that the t-statistic for testing the null of no relationship between two independent variables diverges asymptotically under a wide variety of nonstationary data-generating processes for the dependent and explanatory variables. This paper introduces a simple method which guarantees convergence of this t-statistic to a pivotal limit distribution, thus allowing asymptotic inference. This method can be used to distinguish a genuine relationship from a spurious one among integrated processes. We apply the proposed procedure to several pairs of apparently independent integrated variables, and find that our procedure does not find (spurious) significant relationships.

14.
The inverse Weibull (IW) distribution is a widely used probability distribution for nonnegative data modelling, specifically for describing degradation phenomena of mechanical components. In this paper, by compounding IW and power series distributions we introduce a new lifetime distribution. The compounding procedure follows the same set-up carried out by Adamidis and Loukas [A lifetime distribution with decreasing failure rate. Stat Probab Lett. 1998;39:35–42]. We provide mathematical properties of this new distribution such as moments, estimation by maximum likelihood with censored data, inference for a large sample and the EM algorithm to determine the maximum likelihood estimates of the parameters. Furthermore, we characterize the proposed distributions using a simple relationship between two truncated moments and the maximum entropy principle under suitable constraints. Finally, to show the flexibility of this type of distribution, we demonstrate applications to two real data sets.

15.
In recent years, dynamical modelling has been provided with a range of breakthrough methods to perform exact Bayesian inference. However, it is often computationally infeasible to apply exact statistical methodologies in the context of large data sets and complex models. This paper considers a nonlinear stochastic differential equation model observed with correlated measurement errors and an application to protein folding modelling. An approximate Bayesian computation (ABC)-MCMC algorithm is suggested to allow inference for model parameters within reasonable time constraints. The ABC algorithm uses simulations of ‘subsamples’ from the assumed data-generating model as well as a so-called ‘early-rejection’ strategy to speed up computations in the ABC-MCMC sampler. Using a considerate amount of subsamples does not seem to degrade the quality of the inferential results for the considered applications. A simulation study is conducted to compare our strategy with exact Bayesian inference, the latter being two orders of magnitude slower than ABC-MCMC for the considered set-up. Finally, the ABC algorithm is applied to a large protein data set. The suggested methodology is fairly general and not limited to the exemplified model and data.

16.
In this paper, we consider effective Bayesian inference for the censored Student-t linear regression model, which is a robust alternative to the usual censored Normal linear regression model. Based on the mixture representation of the Student-t distribution, we propose a non-iterative Bayesian sampling procedure to obtain independently and identically distributed samples approximately from the observed posterior distributions, which is different from the iterative Markov chain Monte Carlo algorithm. We conduct model selection and influential analysis using the posterior samples to choose the best fitted model and to detect latent outliers. We illustrate the performance of the procedure through simulation studies. Finally, we apply the procedure to two real data sets, the insulation life data with right censoring and the wage rates data with left censoring, and obtain some interesting results.

17.
Multivariate data arise frequently in biomedical and health studies where multiple response variables are collected across subjects. Unlike a univariate procedure fitting each response separately, a multivariate regression model provides a unique opportunity to study the joint evolution of various response variables. In this paper, we propose two estimation procedures that improve estimation efficiency for the regression parameter by accommodating correlations among the response variables. The proposed procedures do not require knowledge of the true correlation structure, nor do they estimate the parameters associated with the correlation. Theoretical and simulation results confirm that the proposed estimators are more efficient than the one obtained from the univariate approach. We further propose simple and powerful inference procedures for a goodness-of-fit test that possess the chi-squared asymptotic properties. Extensive simulation studies suggest that the proposed tests are more powerful than the Wald test based on the univariate procedure. The proposed methods are also illustrated through the mother’s stress and children’s morbidity study.

18.
This paper develops Bayesian inference of extreme value models with a flexible time-dependent latent structure. The generalized extreme value distribution is utilized to incorporate state variables that follow an autoregressive moving average (ARMA) process with Gumbel-distributed innovations. The time-dependent extreme value distribution is combined with heavy-tailed error terms. An efficient Markov chain Monte Carlo algorithm is proposed using a state-space representation with a finite mixture of normal distributions to approximate the Gumbel distribution. The methodology is illustrated by simulated data and two different sets of real data. Monthly minima of daily returns of a stock price index and monthly maxima of hourly electricity demand are fit to the proposed model and used for model comparison. Estimation results show the usefulness of the proposed model and methodology, and provide evidence that the latent autoregressive process and heavy-tailed errors play an important role in describing the monthly series of minimum stock returns and maximum electricity demand.

19.
We introduce a technique for extending the classical method of linear discriminant analysis (LDA) to data sets where the predictor variables are curves or functions. This procedure, which we call functional linear discriminant analysis (FLDA), is particularly useful when only fragments of the curves are observed. All the techniques associated with LDA can be extended for use with FLDA. In particular FLDA can be used to produce classifications on new (test) curves, give an estimate of the discriminant function between classes and provide a one- or two-dimensional pictorial representation of a set of curves. We also extend this procedure to provide generalizations of quadratic and regularized discriminant analysis.

20.
Cluster analysis is one of the most widely used methods in statistical analysis, in which homogeneous subgroups are identified in a heterogeneous population. Because mixed continuous and discrete data arise in many applications, some ordinary clustering methods, such as hierarchical methods, k-means and model-based methods, have been extended for the analysis of mixed data. However, in the available model-based clustering methods, as the number of continuous variables increases, the number of parameters increases, and identifying as well as fitting an appropriate model may be difficult. In this paper, to reduce the number of parameters in model-based clustering of mixed continuous (normal) and nominal data, a set of parsimonious models is introduced. Models in this set are extended, using the general location model approach, for modeling the distribution of mixed variables and applying a factor analyzer structure for covariance matrices. The ECM algorithm is used for estimating the parameters of these models. To show the performance of the proposed models for clustering, results from some simulation studies and the analysis of two real data sets are presented.
