Similar Documents
20 similar documents found.
1.
The modelling process in Bayesian statistics constitutes the fundamental stage of the analysis, since the inferences may vary considerably depending on the chosen probability laws. This is particularly true when conflicts arise between two or more sources of information. For instance, inference in the presence of an outlier (which conflicts with the information provided by the other observations) can depend heavily on the assumed sampling distribution. When heavy-tailed (e.g. t) distributions are used, outliers may be rejected, whereas this kind of robust inference is not available with light-tailed (e.g. normal) distributions. An extensive literature has established sufficient conditions on location-parameter models for resolving conflict in various ways. In this work, we consider a location–scale parameter structure, which is more complex than the single-parameter case because conflicts can arise among three sources of information: the likelihood, the prior distribution for the location parameter, and the prior for the scale parameter. We establish sufficient conditions on the distributions in a location–scale model for resolving conflicts in different ways as a single observation tends to infinity. In addition, for each case, we explicitly give the limiting posterior distribution as the conflict becomes more extreme.

2.
Representative points (RPs) are a set of points that optimally represents a distribution in terms of mean square error. When the prior data are location-biased, direct methods such as the k-means algorithm may be inefficient for obtaining the RPs. In this article, a new indirect algorithm is proposed to search for the RPs based on location-biased datasets. The algorithm does not require a parametric model for the true distribution. An empirical study shows that it can obtain better RPs than the k-means algorithm.
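A minimal sketch of the direct k-means baseline the abstract refers to, using a simulated standard normal as a stand-in target distribution; the proposed indirect algorithm for location-biased data is not reproduced here, and the sample size and cluster count are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Stand-in for data drawn from the target distribution (standard normal here)
sample = rng.normal(size=(10_000, 1))

# Direct method: the k-means cluster centres approximate the k representative points
km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(sample)
rps = np.sort(km.cluster_centers_.ravel())

# Mean squared representation error that the RPs are chosen to minimise
mse = km.inertia_ / len(sample)
print(rps, mse)
```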

3.
We investigate the exact coverage and expected length properties of the model averaged tail area (MATA) confidence interval proposed by Turek and Fletcher (CSDA, 2012) in the context of two nested normal linear regression models. The simpler model is obtained by applying a single linear constraint to the regression parameter vector of the full model. For a given length of the response vector and a given nominal coverage of the MATA confidence interval, we consider all possible models of this type and all possible true parameter values, together with a wide class of design matrices and parameters of interest. Our results show that, while not ideal, MATA confidence intervals perform surprisingly well in our regression scenario, provided that the simpler model receives the minimum weight within the class of weights that we consider.

4.
The generalized Gaussian distribution with location parameter μ, scale parameter σ, and shape parameter p contains the Laplace, normal, and uniform distributions as particular cases for p = 1, 2, +∞, respectively. Deriving the true maximum-likelihood estimators of μ and σ for these special cases is a popular exercise in many university courses. Here, we show how the true maximum-likelihood estimators of μ and σ can be derived for p = 3, 4, 5. The derivations involve solving quadratic, cubic, and quartic equations.
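For reference, under one common parametrization of the generalized Gaussian density (the paper's exact parametrization may differ), the likelihood equations the abstract alludes to take the following form:

```latex
f(x;\mu,\sigma,p) = \frac{p}{2\sigma\,\Gamma(1/p)}
  \exp\!\left\{-\left(\frac{|x-\mu|}{\sigma}\right)^{p}\right\},
\qquad
\hat{\sigma}^{\,p} = \frac{p}{n}\sum_{i=1}^{n}|x_i-\hat{\mu}|^{p},
\qquad
\sum_{i=1}^{n}\operatorname{sign}(x_i-\hat{\mu})\,|x_i-\hat{\mu}|^{p-1} = 0 .
```

For p = 3, the last equation is piecewise quadratic in μ (quadratic once the ordering of the observations around μ is fixed), consistent with the quadratic, cubic, and quartic equations mentioned above for p = 3, 4, 5.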

5.
The small-sample behavior of the bootstrap is investigated as a method for estimating p values and power in the stationary first-order autoregressive model. Monte Carlo methods are used to examine the bootstrap and Student-t approximations to the true distribution of the test statistic frequently used for testing hypotheses about the underlying slope parameter. In contrast to Student's t, the results suggest that the bootstrap can accurately estimate p values and power in this model for sample sizes as small as 5–10.
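A hedged Python sketch of one such bootstrap scheme (residual resampling under the null, testing H0: ρ = 0 for the slope); the paper's exact design, null value, and resampling scheme may differ:

```python
import numpy as np

rng = np.random.default_rng(1)

def slope_tstat(y, rho0=0.0):
    """OLS slope and t statistic for the AR(1) regression y_t = rho * y_{t-1} + e_t."""
    y0, y1 = y[:-1], y[1:]
    rho = (y0 @ y1) / (y0 @ y0)
    resid = y1 - rho * y0
    s2 = (resid @ resid) / (len(y0) - 1)   # error variance estimate
    se = np.sqrt(s2 / (y0 @ y0))
    return rho, resid, (rho - rho0) / se

# A short simulated series (the small-sample regime studied in the abstract)
n, rho_true = 10, 0.5
y = np.zeros(n)
for t in range(1, n):
    y[t] = rho_true * y[t - 1] + rng.normal()

rho_hat, resid, t_obs = slope_tstat(y)

# Residual bootstrap imposing the null rho = 0
B = 2000
t_boot = np.empty(B)
centred = resid - resid.mean()
for b in range(B):
    e = rng.choice(centred, size=n, replace=True)
    yb = e.copy()                          # under rho = 0, y_t = e_t
    t_boot[b] = slope_tstat(yb)[2]

p_boot = np.mean(np.abs(t_boot) >= abs(t_obs))   # two-sided bootstrap p value
print(rho_hat, t_obs, p_boot)
```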

6.
The Jeffreys-rule prior and the marginal independence Jeffreys prior were recently proposed by Fonseca et al. [Objective Bayesian analysis for the Student-t regression model, Biometrika 95 (2008), pp. 325–333] as objective priors for the Student-t regression model. The authors showed that these priors yield proper posterior distributions and perform favourably in parameter estimation. Motivated by a practical financial risk management application, we compare the performance of the two Jeffreys priors with other priors proposed in the literature in the problem of estimating high quantiles for the Student-t model with unknown degrees of freedom. Through an asymptotic analysis and a simulation study, we show that both Jeffreys priors perform better when a specific quantile of the Bayesian predictive distribution is used to approximate the true quantile.

7.
Regularized variable selection is a powerful tool for identifying the true regression model from a large number of candidates by applying penalties to the objective function. The penalty functions typically involve a tuning parameter that controls the complexity of the selected model. The ability of regularized variable selection methods to identify the true model depends critically on the correct choice of this tuning parameter. In this study, we develop a consistent tuning parameter selection method for the regularized Cox proportional hazards model with a diverging number of parameters. The tuning parameter is selected by minimizing the generalized information criterion. We prove that, for any penalty possessing the oracle property, the proposed method identifies the true model with probability approaching one as the sample size increases. Its finite-sample performance is evaluated by simulations, and its practical use is demonstrated on The Cancer Genome Atlas breast cancer data.

8.
Measurement error models constitute a wide class of models that includes linear and nonlinear regression models. They are very useful for modelling many real-life phenomena, particularly in the medical and biological areas. A great advantage of these models is that, in some sense, they can be represented as mixed effects models, allowing us to implement well-known techniques such as the EM algorithm for parameter estimation. In this paper, we consider a class of multivariate measurement error models where the observed response and/or covariate are not fully observed, i.e., the observations are subject to certain threshold values below or above which the measurements are not quantifiable. Consequently, these observations are considered censored. We assume a Student-t distribution for the unobserved true values of the mismeasured covariate and for the error term of the model, providing a robust alternative for parameter estimation. Our approach relies on likelihood-based inference using an EM-type algorithm. The proposed method is illustrated through simulation studies and the analysis of an AIDS clinical trial dataset.

9.
We propose a test for state dependence in binary panel data with individual covariates. To this end, we rely on a quadratic exponential model in which the association between the response variables is accounted for differently than in more standard formulations. The level of association is measured by a single parameter that may be estimated by a conditional maximum likelihood (CML) approach. Under the dynamic logit model, the conditional estimator of this parameter converges to zero when the hypothesis of absence of state dependence is true. It is therefore possible to implement a t-test for this hypothesis which is very simple to perform and attains the nominal significance level under several structures of the individual covariates. Through an extensive simulation study, we find that our test has good finite-sample properties and is more robust to the presence of (autocorrelated) covariates in the model specification than other existing testing procedures for state dependence. The proposed approach is illustrated by two empirical applications: the first uses data from the Panel Study of Income Dynamics and concerns employment and fertility; the second uses the Health and Retirement Study and concerns self-reported health status.

10.
This article considers fixed effects (FE) estimation for linear panel data models under possible model misspecification when both the number of individuals, n, and the number of time periods, T, are large. We first clarify the probability limit of the FE estimator and argue that it can be regarded as a pseudo-true parameter. We then establish the asymptotic distributional properties of the FE estimator around the pseudo-true parameter when n and T jointly go to infinity. Notably, we show that the FE estimator suffers from an incidental parameters bias whose leading order is O(T^{-1}), and that even after this bias is completely removed, the rate of convergence of the FE estimator depends on the degree of model misspecification and is either (nT)^{-1/2} or n^{-1/2}. Second, we establish asymptotically valid inference on the (pseudo-true) parameter. Specifically, we derive the asymptotic properties of the clustered covariance matrix (CCM) estimator and the cross-section bootstrap, and show that they are robust to model misspecification. This provides a rigorous theoretical ground for using the CCM estimator and the cross-section bootstrap when model misspecification and the incidental parameters bias (in the coefficient estimate) are present. We conduct Monte Carlo simulations to evaluate the finite-sample performance of the estimators and inference methods, together with a simple application to unemployment dynamics in the U.S.

11.
In this paper we consider minimisation of U-statistics with the weighted Lasso penalty and investigate the asymptotic properties of the resulting estimators in model selection and estimation. We prove that the use of appropriate weights in the penalty leads to a procedure that behaves like an oracle that knows the true model in advance, i.e. it is model selection consistent and estimates the nonzero parameters at the standard rate. For the unweighted Lasso penalty, we obtain sufficient and necessary conditions for model selection consistency of the estimators. The results rely strongly on the convexity of the loss function, which is the main assumption of the paper. Our theorems can be applied to the ranking problem as well as to generalised regression models. Thus, using U-statistics we can study more complex models (better describing real problems) than the usually investigated linear or generalised linear models.

12.
Results are developed concerning the asymptotic behaviour of the Bayes classification rule as the number of unclassified observations grows without bound. It is shown that unclassified observations serve only to estimate the individual population parameters in an unlabeled sense and do not provide information about the labels that are attached to the populations. Prior construction is approached through investigation of prior odds over regions of the joint parameter space (across all populations) deemed likely to contain the true joint parameter vector. It is shown that consideration of these prior odds can lead to more robust a posteriori classification of individual observations.

13.
Estimating the proportion of true null hypotheses, π0, has attracted much attention in the recent statistical literature. Besides its apparent relevance for a set of specific scientific hypotheses, an accurate estimate of this parameter is key for many multiple testing procedures. Most existing methods for estimating π0 are motivated by the assumption of independence among test statistics, which is often not true in reality. Simulations indicate that most existing estimators can perform poorly in the presence of dependence among test statistics, mainly due to the increased variation of these estimators. In this paper, we propose several data-driven methods for estimating π0 that incorporate the distribution pattern of the observed p-values as a practical approach to addressing potential dependence among test statistics. Specifically, we use a linear fit to give a data-driven estimate of the proportion of true-null p-values in (λ, 1] over the whole range [0, 1], instead of using the expected proportion at 1 − λ. We find that the proposed estimators may substantially decrease the variance of the estimated true null proportion and thus improve the overall performance.
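A minimal sketch of one data-driven variant in this spirit: under uniformity of the true-null p-values, the tail count #{p_i > λ} is approximately n·π0·(1 − λ) for large λ, so a through-origin linear fit over a grid of λ values yields an estimate of π0. The authors' exact fitting procedure may differ; the grid and function names here are illustrative.

```python
import numpy as np

def pi0_linear(pvals, lam_grid=None):
    """Estimate pi0 by a through-origin linear fit of the tail counts
    W(lam) = #{p_i > lam} against (1 - lam) over a grid of lambda values."""
    if lam_grid is None:
        lam_grid = np.arange(0.50, 0.96, 0.05)
    pvals = np.asarray(pvals)
    n = len(pvals)
    w = np.array([(pvals > lam).sum() for lam in lam_grid], dtype=float)
    x = 1.0 - lam_grid
    slope = (x @ w) / (x @ x)        # least-squares slope through the origin
    return min(1.0, slope / n)

def pi0_storey(pvals, lam=0.5):
    """Storey-type single-lambda estimator, for comparison."""
    pvals = np.asarray(pvals)
    return min(1.0, (pvals > lam).mean() / (1.0 - lam))
```

Pooling information across the whole grid, rather than relying on a single λ, is what the abstract suggests may reduce the variance of the estimate.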

14.
Both philosophically and in practice, statistics is dominated by frequentist and Bayesian thinking. Under those paradigms, our courses and textbooks talk about the accuracy with which true model parameters are estimated, or the posterior probability that they lie in a given set. In nonparametric problems, they talk about convergence to the true function (density, regression, etc.) or the probability that the true function lies in a given set. But the usual paradigms' focus on learning the true model and parameters can distract the analyst from another important task: discovering whether there are many sets of models and parameters that describe the data reasonably well. When we discover many good models, we can see in what ways they agree. Points of agreement give us more confidence in our inferences; points of disagreement give us less. Further, the usual paradigms' focus seduces us into judging and adopting procedures according to how well they learn the true values. An alternative is to judge models and parameter values, not procedures, and to judge them by how well they describe the data, not by how close they come to the truth. The latter is especially appealing in problems without a true model.

15.
The T-optimality criterion is used in optimal design to derive designs for model selection. To set up the method, one of the models must be assumed true; we term this local T-optimality. In this work, we propose a generalisation of T-optimality (termed robust T-optimality) that relaxes the requirement that one of the candidate models be set as true. We then show an application to a nonlinear mixed effects model with two candidate non-nested models, combining robust T-optimality with robust D-optimality. Optimal design under local T-optimality was found to provide adequate power when the a priori assumed true model was in fact the true model, but poor power otherwise. The robust T-optimality method provided adequate power irrespective of which model was true. Robust T-optimality thus appears to have useful properties for nonlinear models, where both the parameter values and the model structure must be known a priori and the most likely model for any new experiment is not known with certainty.
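For context, the standard local T-optimality criterion (in the spirit of Atkinson and Fedorov; the notation below is a sketch, not the paper's) assumes model η1 with parameter θ1 is true and chooses the design ξ to maximize the minimal lack of fit of the rival model η2:

```latex
T(\xi) \;=\; \min_{\theta_2} \int \bigl[\eta_1(x,\theta_1) - \eta_2(x,\theta_2)\bigr]^2 \,\xi(dx).
```

The robust version described above relaxes the requirement that η1 and θ1 be fixed in advance.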

16.
We advocate the use of an Indirect Inference method to estimate the parameters of a COGARCH(1,1) process from equally spaced observations. This requires that the true model can be simulated and that a reasonable estimation method is available for an approximate auxiliary model. We follow previous approaches and use linear projections leading to an auxiliary autoregressive model for the squared COGARCH returns. The asymptotic theory of the Indirect Inference estimator relies on a uniform strong law of large numbers and on asymptotic normality of the parameter estimates of the auxiliary model; these require continuity and differentiability of the COGARCH process with respect to its parameters, which we prove via Kolmogorov's continuity criterion. This leads to consistent and asymptotically normal Indirect Inference estimates under moment conditions on the driving Lévy process. A simulation study shows that the method yields a substantial finite-sample bias reduction compared with previous estimators.

17.
When employing model selection methods with oracle properties, such as the smoothly clipped absolute deviation (SCAD) and the Adaptive Lasso, it is typical to estimate the smoothing parameter by m-fold cross-validation, for example with m = 10. In problems where the true regression function is sparse and the signals are large, such cross-validation typically works well. However, in regression modeling of genomic studies involving single nucleotide polymorphisms (SNPs), the true regression functions, while thought to be sparse, do not have large signals. We demonstrate empirically that in such problems, the number of variables selected by SCAD and the Adaptive Lasso with 10-fold cross-validation is a random variable with considerable and surprising variation. Similar remarks apply to non-oracle methods such as the Lasso. Our study strongly questions the suitability of performing only a single run of m-fold cross-validation with any oracle method, not just SCAD and the Adaptive Lasso.
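A small Python sketch illustrating the phenomenon with the (non-oracle) Lasso, which the abstract says behaves similarly: refitting on the same weak-signal data with different 10-fold partitions changes how many variables are selected. The sample size, signal strength, and use of scikit-learn's LassoCV are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.model_selection import KFold

rng = np.random.default_rng(2)
n, p = 200, 100
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:10] = 0.2                      # sparse truth with weak (SNP-like) signals
y = X @ beta + rng.normal(size=n)

counts = []
for seed in range(20):               # same data, different 10-fold partitions
    cv = KFold(n_splits=10, shuffle=True, random_state=seed)
    fit = LassoCV(cv=cv).fit(X, y)
    counts.append(int((fit.coef_ != 0).sum()))

print(counts)                        # the selected-model size varies run to run
```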

18.
When the editors of Basic and Applied Social Psychology effectively banned the use of null hypothesis significance testing (NHST) from articles published in their journal, it set off a firestorm of discussion both supporting the decision and defending the utility of NHST in scientific research. At the heart of NHST is the p-value, the probability of obtaining an effect equal to or more extreme than the one observed in the sample data, given the null hypothesis and other model assumptions. Although this is conceptually different from the probability of the null hypothesis being true given the sample, p-values can nonetheless provide evidential information toward making an inference about a parameter. Applying a 10,000-case simulation described in this article, the authors found that p-values' inferential signals to either reject or not reject a null hypothesis about the mean (α = 0.05) were consistent with the parameter's true location in the sampled-from population for almost 70% of the cases. Success increases if a hybrid decision criterion, minimum effect size plus p-value (MESP), is used: rejecting the null also requires the difference of the observed statistic from the exact null to be meaningfully large, or practically significant, in the researcher's judgment and experience. The simulation compares the performance of several methods, from p-value and/or effect-size based to confidence-interval based, under various conditions of true location of the mean, test power, and comparative sizes of the meaningful distance and the population variability. For any inference procedure that outputs a binary indicator, such as flagging whether a p-value is significant, the output of one single experiment is not sufficient evidence for a definitive conclusion. Yet, if a tool like MESP generates a relatively reliable signal and is used knowledgeably as part of a research process, it can provide useful information.
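A minimal sketch of the MESP decision rule for a test about a mean, as described above; the function name, the normal/t approximation, and the choice of the meaningful distance delta are illustrative assumptions.

```python
from scipy import stats

def mesp_reject(xbar, mu0, se, delta, alpha=0.05, df=None):
    """Reject H0: mu = mu0 only if (i) the two-sided p-value is below alpha
    and (ii) the observed effect exceeds the researcher-chosen meaningful
    distance delta (practical significance)."""
    z = (xbar - mu0) / se
    p = 2 * (stats.t.sf(abs(z), df) if df is not None else stats.norm.sf(abs(z)))
    return (p < alpha) and (abs(xbar - mu0) > delta)

# Example: statistically significant (z = 3, p < 0.01) but below the
# meaningful distance, so MESP does not reject
print(mesp_reject(xbar=0.30, mu0=0.0, se=0.10, delta=0.5))   # False
```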

19.
The cross-validation (CV) criterion is known to be a second-order unbiased estimator of the risk function measuring the discrepancy between the candidate model and the true model, as are the generalized information criterion (GIC) and the extended information criterion (EIC). In the present article, we show that a 2kth-order unbiased estimator can be obtained using a linear combination of the leave-j-out CV criteria for j = 1, …, k. The proposed scheme is unique in that a bias smaller than that of a jackknife method can be obtained without any analytic calculation; that is, it is not necessary to obtain the explicit form of several terms in an asymptotic expansion of the bias. Furthermore, the proposed criterion can be regarded as a finite correction of a bias-corrected CV criterion, using scalar coefficients from a bias-corrected EIC obtained by bootstrap iteration.

20.
This article considers nonparametric regression problems and develops a model-averaging procedure for smoothing spline regression. Unlike most smoothing parameter selection studies, which determine a single optimum smoothing parameter, our focus here is on prediction accuracy for the true conditional mean of Y given a predictor X. Our method consists of two steps. The first step is to construct a class of smoothing spline regression models based on nonparametric bootstrap samples, each with an appropriate smoothing parameter. The second step is to average the bootstrap smoothing spline estimates of different smoothness to form a final improved estimate. To minimize the prediction error, we estimate the model weights using a delete-one-out cross-validation procedure. A simulation study, performed with a program written in R, compares the well-known cross-validation (CV) and generalized cross-validation (GCV) criteria with the proposed method. The new method is straightforward to implement and gives reliable performance in the simulations.
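A simplified Python sketch of the two-step idea using scipy's UnivariateSpline: fit bootstrap splines at different smoothing levels, then choose simplex-constrained weights by least squares. For brevity this sketch resamples residuals on a fixed design and estimates weights from in-sample fit rather than the paper's delete-one-out cross-validation; all of these are illustrative simplifications, not the authors' R implementation.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n = 100
x = np.sort(rng.uniform(0, 1, n))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, n)

# Step 1: bootstrap spline fits, each with its own smoothing level s
pilot = UnivariateSpline(x, y, s=n * 0.1)
resid = y - pilot(x)
B = 20
preds = np.empty((n, B))
for b in range(B):
    yb = pilot(x) + rng.choice(resid, size=n, replace=True)  # residual bootstrap
    s = n * (0.02 + 0.4 * b / (B - 1))                       # smoothing grid
    preds[:, b] = UnivariateSpline(x, yb, s=s)(x)

# Step 2: nonnegative weights summing to one, chosen by least squares
def loss(w):
    return np.sum((y - preds @ w) ** 2)

cons = ({'type': 'eq', 'fun': lambda w: w.sum() - 1.0},)
res = minimize(loss, np.full(B, 1.0 / B), bounds=[(0, 1)] * B,
               constraints=cons, method='SLSQP')
avg_fit = preds @ res.x              # final model-averaged estimate at x
```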
