Similar articles
20 similar articles found (search time: 15 ms)
1.
A Box-Cox transformed linear model usually has the form $y^{(\lambda)} = \mu + \beta_1 x_1 + \cdots + \beta_p x_p + \sigma\varepsilon$, where $y^{(\lambda)}$ is the power transform of $y$. Although the model is widely used in practice, the Fisher information matrix for the unknown parameters and, in particular, its inverse have not been studied seriously in the literature. We obtain those two important matrices to put the Box-Cox transformed linear model on firmer ground. The question of how to make inference on $\beta = (\beta_1, \ldots, \beta_p)^T$ when $\lambda$ is estimated from the data is then discussed for large but finite sample size by studying some parameter-based asymptotics. Both unconditional and conditional inference are studied from the frequentist point of view.
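
The abstract does not spell out the power transform itself. As a minimal sketch, assuming the standard Box and Cox definition (the function name `box_cox` is illustrative and not taken from the paper), the transform of a positive response can be written as:

```python
import math


def box_cox(y, lam):
    """Box-Cox power transform of a positive value y.

    Assumes the standard definition:
    (y**lam - 1) / lam for lam != 0, and log(y) for lam == 0.
    """
    if y <= 0:
        raise ValueError("the Box-Cox transform requires y > 0")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


if __name__ == "__main__":
    # The log transform is the limiting case as lam approaches 0.
    for lam in (1.0, 0.5, 0.0):
        print(lam, box_cox(4.0, lam))
```

In the model above, the transformed response would then be regressed on the covariates, with lam treated as an additional unknown parameter.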

2.
In this paper, we seek to establish asymptotic results for selective inference procedures, removing the assumption of Gaussianity. The class of selection procedures we consider is determined by affine inequalities, which we refer to as affine selection procedures. Examples of affine selection procedures include selective inference along the solution path of the least absolute shrinkage and selection operator (LASSO), as well as selective inference after fitting the LASSO at a fixed value of the regularization parameter. We also consider some tests in penalized generalized linear models. Our result proves asymptotic convergence in the high‐dimensional setting where n < p, and n can be of the order of a logarithmic factor of the dimension p for some procedures.

3.
ABSTRACT

Such is the grip of formal methods of statistical inference—that is, frequentist methods for generalizing from sample to population in enumerative studies—in the drawing of scientific inferences that the two are routinely deemed equivalent in the social, management, and biomedical sciences. This, despite the fact that legitimate employment of said methods is difficult to implement on practical grounds alone. But supposing the adoption of these procedures were simple does not get us far; crucially, methods of formal statistical inference are ill-suited to the analysis of much scientific data. Even findings from the claimed gold standard for examination by the latter, randomized controlled trials, can be problematic.

Scientific inference is a far broader concept than statistical inference. Its authority derives from the accumulation, over an extensive period of time, of both theoretical and empirical knowledge that has won the (provisional) acceptance of the scholarly community. A major focus of scientific inference can be viewed as the pursuit of significant sameness, meaning replicable and empirically generalizable results among phenomena. Regrettably, the obsession of users of statistical inference with reporting significant differences in data sets actively thwarts cumulative knowledge development.

The manifold problems surrounding the implementation and usefulness of formal methods of statistical inference in advancing science do not speak well of much teaching in methods/statistics classes. Serious reflection on statistics' role in producing viable knowledge is needed. Commendably, the American Statistical Association is committed to addressing this challenge, as further witnessed in this special online, open access issue of The American Statistician.

4.
We describe inferactive data analysis, so named to denote an interactive approach to data analysis with an emphasis on inference after data analysis. Our approach is a compromise between Tukey's exploratory and confirmatory data analysis, allowing also for Bayesian data analysis. We see this as a useful step in providing concrete tools (with statistical guarantees) for current data scientists. The basis of inference we use is (a conditional approach to) selective inference, in particular its randomized form. The relevant reference distributions are constructed from what we call a DAG-DAG, a Data Analysis Generative DAG, and a selective change of variables formula is crucial to any practical implementation of inferactive data analysis via sampling these distributions. We discuss a canonical example of an incomplete cross-validation test statistic to discriminate between black box models, and a real HIV dataset example to illustrate inference after making multiple queries on data.

5.
Abstract

We propose a simple procedure based on an existing “debiased” l1-regularized method for inference of the average partial effects (APEs) in approximately sparse probit and fractional probit models with panel data, where the number of time periods is fixed and small relative to the number of cross-sectional observations. Our method is computationally simple and does not suffer from the incidental parameters problems that come from attempting to estimate as a parameter the unobserved heterogeneity for each cross-sectional unit. Furthermore, it is robust to arbitrary serial dependence in underlying idiosyncratic errors. Our theoretical results illustrate that inference concerning APEs is more challenging than inference about fixed and low-dimensional parameters, as the former concerns deriving the asymptotic normality for sample averages of linear functions of a potentially large set of components in our estimator when a series approximation for the conditional mean of the unobserved heterogeneity is considered. Insights on the applicability and implications of other existing Lasso-based inference procedures for our problem are provided. We apply the debiasing method to estimate the effects of spending on test pass rates. Our results show that spending has a positive and statistically significant average partial effect; moreover, the effect is comparable to that found using standard parametric methods.

6.
The problem of inference in Bayesian Normal mixture models is known to be difficult. In particular, direct Bayesian inference (via quadrature) suffers from a combinatorial explosion in having to consider every possible partition of n observations into k mixture components, resulting in a computation time which is $O(k^n)$. This paper explores the use of discretised parameters and shows that for equal-variance mixture models, direct computation time can be reduced to $O(D^k n^k)$, where relevant continuous parameters are each divided into D regions. As a consequence, direct inference is now possible on genuine data sets for small k, where the quality of approximation is determined by the level of discretisation. For large problems, where the computational complexity is still too great in $O(D^k n^k)$ time, discretisation can provide a convergence diagnostic for a Markov chain Monte Carlo analysis.
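
To give a feel for the two growth rates quoted above, the following sketch evaluates them for illustrative values of k, n and D. The function names and the chosen values are assumptions made for this illustration; constants hidden by the O-notation are ignored, so the numbers indicate orders of magnitude only, not running times.

```python
def exhaustive_order(k, n):
    """Growth order k**n: every allocation of n observations to k components."""
    return k ** n


def discretised_order(D, k, n):
    """Growth order D**k * n**k quoted for the discretised approach."""
    return (D ** k) * (n ** k)


if __name__ == "__main__":
    k, n, D = 2, 100, 50  # illustrative values, not taken from the paper
    print("exhaustive :", exhaustive_order(k, n))
    print("discretised:", discretised_order(D, k, n))
```

For fixed small k the discretised order is polynomial in n, while the exhaustive order is exponential in n, which is why the abstract restricts direct inference to small k.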

7.
In a high-dimensional multiple testing framework, we present new confidence bounds on the false positives contained in subsets S of selected null hypotheses. These bounds are post hoc in the sense that the coverage probability holds simultaneously over all S, possibly chosen depending on the data. This article focuses on the common case of structured null hypotheses, for example, along a tree, a hierarchy, or geometrically (spatially or temporally). Following recent advances in post hoc inference, we build confidence bounds for some prespecified forest-structured subsets and deduce a bound for any subset S by interpolation. The proposed bounds are shown to improve substantially on previous ones when the signal is locally structured. Our findings are supported both by theoretical results and numerical experiments. Moreover, our bounds can be obtained by an algorithm (with complexity bilinear in the sizes of the reference hierarchy and of the selected subset) that is implemented in the open-source R package sansSouci, available from https://github.com/pneuvial/sanssouci , making our approach operational.

8.
This study takes up inference in linear models with generalized error and generalized t distributions. For the generalized error distribution, two computational algorithms are proposed. The first is based on indirect Bayesian inference using an approximating finite scale mixture of normal distributions. The second is based on Gibbs sampling. The Gibbs sampler involves only drawing random numbers from standard distributions. This is important because previously the impression has been that an exact analysis of the generalized error regression model using Gibbs sampling is not possible. Next, we describe computational Bayesian inference for linear models with generalized t disturbances based on Gibbs sampling, and exploiting the fact that the model is a mixture of generalized error distributions with inverse generalized gamma distributions for the scale parameter. The linear model with this specification has also been thought not to be amenable to exact Bayesian analysis. All computational methods are applied to actual data involving the exchange rates of the British pound, the French franc, and the German mark relative to the U.S. dollar.

9.
In this paper, we propose a smoothed Q‐learning algorithm for estimating optimal dynamic treatment regimes. In contrast to the Q‐learning algorithm in which nonregular inference is involved, we show that, under assumptions adopted in this paper, the proposed smoothed Q‐learning estimator is asymptotically normally distributed even when the Q‐learning estimator is not and its asymptotic variance can be consistently estimated. As a result, inference based on the smoothed Q‐learning estimator is standard. We derive the optimal smoothing parameter and propose a data‐driven method for estimating it. The finite sample properties of the smoothed Q‐learning estimator are studied and compared with several existing estimators including the Q‐learning estimator via an extensive simulation study. We illustrate the new method by analyzing data from the Clinical Antipsychotic Trials of Intervention Effectiveness–Alzheimer's Disease (CATIE‐AD) study.

10.
Epstein [Truncated life tests in the exponential case, Ann. Math. Statist. 25 (1954), pp. 555–564] introduced a hybrid censoring scheme (called Type-I hybrid censoring) and Chen and Bhattacharyya [Exact confidence bounds for an exponential parameter under hybrid censoring, Comm. Statist. Theory Methods 17 (1988), pp. 1857–1870] derived the exact distribution of the maximum-likelihood estimator (MLE) of the mean of a scaled exponential distribution based on a Type-I hybrid censored sample. Childs et al. [Exact likelihood inference based on Type-I and Type-II hybrid censored samples from the exponential distribution, Ann. Inst. Statist. Math. 55 (2003), pp. 319–330] provided an alternate simpler expression for this distribution, and also developed analogous results for another hybrid censoring scheme (called Type-II hybrid censoring). The purpose of this paper is to derive the exact bivariate distribution of the MLE of the parameter vector of a two-parameter exponential model based on hybrid censored samples. The marginal distributions are derived and exact confidence bounds for the parameters are obtained. The results are also used to derive the exact distribution of the MLE of the pth quantile, as well as the corresponding confidence bounds. These exact confidence intervals are then compared with parametric bootstrap confidence intervals in terms of coverage probabilities. Finally, we present some numerical examples to illustrate the methods of inference developed here.
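
As a rough illustration of the Type-I hybrid censoring scheme mentioned above, the sketch below assumes the usual description of the scheme: the life test stops at the earlier of the r-th failure and a fixed time T. It covers only the one-parameter (mean) exponential case, not the two-parameter model studied in the paper, and the function names are hypothetical.

```python
def type1_hybrid_censor(lifetimes, r, T):
    """Return (observed_failures, stop_time) under Type-I hybrid censoring.

    Assumption: the test stops at min(r-th ordered failure time, T).
    """
    ordered = sorted(lifetimes)
    if ordered[r - 1] <= T:
        # The r-th failure occurs first: exactly r failures are observed.
        return ordered[:r], ordered[r - 1]
    # The time limit T is reached first: keep the failures seen by time T.
    return [x for x in ordered if x <= T], T


def exponential_mean_mle(lifetimes, r, T):
    """MLE of the exponential mean: total time on test / number of failures."""
    observed, stop = type1_hybrid_censor(lifetimes, r, T)
    d = len(observed)
    if d == 0:
        return None  # the MLE does not exist when no failure is observed
    n = len(lifetimes)
    total_time_on_test = sum(observed) + (n - d) * stop
    return total_time_on_test / d


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 10.0]  # made-up lifetimes for illustration
    print(exponential_mean_mle(data, r=3, T=5.0))
    print(exponential_mean_mle(data, r=3, T=2.5))
```

The estimator is undefined when no failure occurs before the test stops, so results for such schemes are usually stated conditionally on at least one observed failure.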

11.
Summary. A standard improper prior for the parameters of a MANOVA model is shown to yield an inference that is incoherent in the sense of Heath and Sudderth. The proof of incoherence is based on the fact that the formal Bayes estimate, say $\delta_0$, of the covariance matrix based on the improper prior and a certain bounded loss function is uniformly inadmissible in that there is another estimator $\delta_1$ and an $\varepsilon > 0$ such that the risk functions satisfy $R(\delta_1, \Sigma) \le R(\delta_0, \Sigma) - \varepsilon$ for all values of the covariance matrix $\Sigma$. The estimator $\delta_1$ is formal Bayes for an alternative improper prior which leads to a coherent inference. Research supported by National Science Foundation grants DMS-89-22607 (for Eaton) and DMS-9123358 (for Sudderth).

12.
Just as frequentist hypothesis tests have been developed to check model assumptions, prior predictive p-values and other Bayesian p-values check prior distributions as well as other model assumptions. These model checks not only suffer from the usual threshold dependence of p-values, but also from the suppression of model uncertainty in subsequent inference. One solution is to transform Bayesian and frequentist p-values for model assessment into a fiducial distribution across the models. Averaging the Bayesian or frequentist posterior distributions with respect to the fiducial distribution can reproduce results from Bayesian model averaging or classical fiducial inference.

13.

Recently, exact confidence bounds and exact likelihood inference have been developed based on hybrid censored samples by Chen and Bhattacharyya [Chen, S. and Bhattacharyya, G.K. (1988). Exact confidence bounds for an exponential parameter under hybrid censoring. Communications in Statistics - Theory and Methods, 17, 1857–1870.], Childs et al. [Childs, A., Chandrasekar, B., Balakrishnan, N. and Kundu, D. (2003). Exact likelihood inference based on Type-I and Type-II hybrid censored samples from the exponential distribution. Annals of the Institute of Statistical Mathematics, 55, 319–330.], and Chandrasekar et al. [Chandrasekar, B., Childs, A. and Balakrishnan, N. (2004). Exact likelihood inference for the exponential distribution under generalized Type-I and Type-II hybrid censoring. Naval Research Logistics, 51, 994–1004.] for the case of the exponential distribution. In this article, we propose a unified hybrid censoring scheme (HCS) which includes many cases considered earlier as special cases. We then derive the exact distribution of the maximum likelihood estimator as well as exact confidence intervals for the mean of the exponential distribution under this general unified HCS. Finally, we present some examples to illustrate all the methods of inference developed here.

14.
Introductory statistical inference texts and courses treat the point estimation, hypothesis testing, and interval estimation problems separately, with primary emphasis on large-sample approximations. Here, I present an alternative approach to teaching this course, built around p-values, emphasizing provably valid inference for all sample sizes. Details about computation and marginalization are also provided, with several illustrative examples, along with a course outline. Supplementary materials for this article are available online.

15.
The last decade saw enormous progress in the development of causal inference tools to account for noncompliance in randomized clinical trials. With survival outcomes, structural accelerated failure time (SAFT) models enable causal estimation of effects of observed treatments without making direct assumptions on the compliance selection mechanism. The traditional proportional hazards model has, however, rarely been used for causal inference. The estimator proposed by Loeys and Goetghebeur (2003, Biometrics vol. 59 pp. 100–105) is limited to the setting of all-or-nothing exposure. In this paper, we propose an estimation procedure for more general causal proportional hazards models linking the distribution of potential treatment-free survival times to the distribution of observed survival times via observed (time-constant) exposures. Specifically, we first build models for observed exposure-specific survival times. Next, using the proposed causal proportional hazards model, the exposure-specific survival distributions are backtransformed to their treatment-free counterparts to obtain, after proper mixing, the unconditional treatment-free survival distribution. Estimation of the parameter(s) in the causal model is then based on minimizing a test statistic for equality in backtransformed survival distributions between randomized arms.

16.
Item response theory (IRT) comprises a set of statistical models which are useful in many fields, especially when there is an interest in studying latent variables (or latent traits). Usually such latent traits are assumed to be random variables and a convenient distribution is assigned to them. A very common choice for such a distribution has been the standard normal. Recently, Azevedo et al. [Bayesian inference for a skew-normal IRT model under the centred parameterization, Comput. Stat. Data Anal. 55 (2011), pp. 353–365] proposed a skew-normal distribution under the centred parameterization (SNCP), as studied in [R.B. Arellano-Valle and A. Azzalini, The centred parametrization for the multivariate skew-normal distribution, J. Multivariate Anal. 99(7) (2008), pp. 1362–1382], to model the latent trait distribution. This approach allows one to represent any asymmetric behaviour concerning the latent trait distribution. They also developed a Metropolis–Hastings within Gibbs sampling (MHWGS) algorithm based on the density of the SNCP. They showed that the algorithm recovers all parameters properly. Their results indicated that, in the presence of asymmetry, the proposed model and the estimation algorithm perform better than the usual model and estimation methods. Our main goal in this paper is to propose another type of MHWGS algorithm based on a stochastic representation (hierarchical structure) of the SNCP studied in [N. Henze, A probabilistic representation of the skew-normal distribution, Scand. J. Statist. 13 (1986), pp. 271–275]. Our algorithm has only one Metropolis–Hastings step, in contrast to the algorithm developed by Azevedo et al., which has two such steps. This not only makes the implementation easier but also reduces the number of proposal densities to be used, which can be a problem in the implementation of MHWGS algorithms, as can be seen in [R.J. Patz and B.W. Junker, A straightforward approach to Markov Chain Monte Carlo methods for item response models, J. Educ. Behav. Stat. 24(2) (1999), pp. 146–178; R.J. Patz and B.W. Junker, The applications and extensions of MCMC in IRT: Multiple item types, missing data, and rated responses, J. Educ. Behav. Stat. 24(4) (1999), pp. 342–366; A. Gelman, G.O. Roberts, and W.R. Gilks, Efficient Metropolis jumping rules, Bayesian Stat. 5 (1996), pp. 599–607]. Moreover, we consider a modified beta prior (which generalizes the one considered in Azevedo et al. (2011)) and a Jeffreys prior for the asymmetry parameter. Furthermore, we study the sensitivity of such priors as well as the use of different kernel densities for this parameter. Finally, we assess the impact of the number of examinees, number of items and the asymmetry level on the parameter recovery. Results of the simulation study indicated that our approach performed as well as that in Azevedo et al. (2011) in terms of parameter recovery, mainly using the Jeffreys prior. They also indicated that the asymmetry level has the highest impact on parameter recovery, even though it is relatively small. A real data analysis is considered jointly with the development of model fitting assessment tools. The results are compared with the ones obtained by Azevedo et al. The results indicate that the hierarchical approach makes MCMC algorithms easier to implement, facilitates convergence diagnosis, and can be very useful for fitting more complex skew IRT models.

17.
This article presents non-parametric predictive inference for future order statistics. Given the data consisting of n real-valued observations, m future observations are considered and predictive probabilities are presented for the rth-ordered future observation. In addition, joint and conditional probabilities for events involving multiple future order statistics are presented. The article further presents the use of such predictive probabilities for order statistics in statistical inference, in particular considering pairwise and multiple comparisons based on two or more independent groups of data.
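
The predictive probabilities described above can be sketched by a counting argument. The sketch assumes that all C(n+m, n) orderings of the n observed and m future values are equally likely (an exchangeability assumption with no ties, which the abstract does not state explicitly). Under that assumption, the probability that the r-th ordered future observation falls in the j-th interval between consecutive ordered data points follows from counting orderings. The function name and interval indexing are illustrative choices.

```python
from fractions import Fraction
from math import comb


def future_order_stat_probs(n, m, r):
    """P(r-th ordered future value lies in interval j), for j = 1..n+1.

    Interval j has exactly j-1 observed data points below it.
    Assumption: all comb(n + m, n) orderings of data and future
    observations are equally likely.
    """
    if not 1 <= r <= m:
        raise ValueError("r must satisfy 1 <= r <= m")
    total = comb(n + m, n)
    probs = []
    for j in range(1, n + 2):
        # Orderings of the j-1 data and r-1 future values below the target,
        # times orderings of the remaining data and future values above it.
        below = comb(j + r - 2, j - 1)
        above = comb(n - j + 1 + m - r, n - j + 1)
        probs.append(Fraction(below * above, total))
    return probs


if __name__ == "__main__":
    # Made-up sizes: 5 observations, 3 future values, 2nd ordered future value.
    for j, p in enumerate(future_order_stat_probs(5, 3, 2), start=1):
        print(j, p)
```

Exact fractions are used so that the probabilities can be checked to sum to one without rounding error.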

18.
This article is concerned with making predictive inference on the basis of a doubly censored sample from a two-parameter Rayleigh life model. We derive the predictive distributions for a single future response, the ith future response, and several future responses. We use the Bayesian approach in conjunction with an improper flat prior for the location parameter and an independent proper conjugate prior for the scale parameter to derive the predictive distributions. We conclude with a numerical example in which the effect of the hyperparameters on the mean and standard deviation of the predictive density is assessed.

19.
A statistical model is said to be an order‐restricted statistical model when its parameter takes its values in a closed convex cone C of the Euclidean space. In recent years, order‐restricted likelihood ratio tests and maximum likelihood estimators have been criticized on the grounds that they may violate a cone order monotonicity (COM) property, and hence reverse the cone order induced by C. The authors argue here that these reversals occur only in the case that C is an obtuse cone, and that in this case COM is an inappropriate requirement for likelihood‐based estimates and tests. They conclude that these procedures thus remain perfectly reasonable procedures for order‐restricted inference.

20.
Maximum-likelihood estimation is interpreted as a procedure for generating approximate pivotal quantities, that is, functions u(X;θ) of the data X and parameter θ that have distributions not involving θ. Further, these pivotals should be efficient in the sense of reproducing approximately the likelihood function of θ based on X, and they should be approximately linear in θ. To this end the effect of replacing θ by a parameter ϕ = ϕ(θ) is examined. The relationship of maximum-likelihood estimation interpreted in this way to conditional inference is discussed. Examples illustrating this use of maximum-likelihood estimation on small samples are given.
