Similar Documents
20 similar documents found
1.
Summary.  The difference, if any, between men's and women's voting patterns is of particular interest to historians of gender and politics. For elections that were held before the introduction of opinion surveying in the 1940s, little data are available with which to estimate such differences. We apply six methods for ecological inference to estimate men's and women's voting rates in New Zealand (NZ), 1893–1919. NZ is an interesting case-study, since it was the first self-governing country where women could vote. Furthermore, NZ officials recorded the voting rates of men and women at elections, making it possible to compare estimates produced by methods for ecological inference with known true values, thus testing the efficacy of different methods for ecological inference on this data set. We find that the most popular methods for ecological inference, namely Goodman's ecological regression and King's parametric method, give poor estimates, as does the much debated neighbourhood method. However, King's non-parametric method, Chambers and Steel's semiparametric method and the Steel, Beh and Chambers homogeneous approach all give good estimates that are close to the known values, with the homogeneous approach performing best overall. The success of these methods in this example suggests that ecological inference may be a viable option when investigating gender and voting. Moreover, researchers using ecological inference in other fields may do well to consider a range of statistical methods. This work is a significant NZ contribution to historical politics and the first quantitative contribution in the area of NZ gender and politics.
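Goodman's ecological regression, one of the methods compared above, reduces to a simple least-squares fit on district-level aggregates. The sketch below uses purely synthetic data; the shares and voting rates are illustrative assumptions, not the NZ figures.

```python
# A minimal sketch of Goodman's ecological regression. Model for district i:
#   T_i = beta_m * (1 - X_i) + beta_w * X_i + e_i,
# where X_i is the share of women among registered voters and T_i the overall
# turnout; the fitted coefficients are read as the men's (beta_m) and
# women's (beta_w) voting rates. All numbers below are synthetic.
import numpy as np

rng = np.random.default_rng(0)
n_districts = 50
X = rng.uniform(0.4, 0.6, n_districts)           # share of women on the roll
true_m, true_w = 0.75, 0.85                      # hypothetical true rates
T = true_m * (1 - X) + true_w * X + rng.normal(0, 0.02, n_districts)

# Least squares with no intercept on the two composition columns.
design = np.column_stack([1 - X, X])
(beta_m, beta_w), *_ = np.linalg.lstsq(design, T, rcond=None)
print(f"estimated men's rate {beta_m:.3f}, women's rate {beta_w:.3f}")
```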

2.
Approximate Bayesian computation (ABC) is an approach to sampling from an approximate posterior distribution in the presence of a computationally intractable likelihood function. A common implementation is based on simulating model, parameter and dataset triples from the prior, and then accepting as samples from the approximate posterior those model and parameter pairs for which the corresponding dataset, or a summary of that dataset, is ‘close’ to the observed data. Closeness is typically determined through a distance measure and a kernel scale parameter. Appropriate choice of that parameter is important in producing a good quality approximation. This paper proposes diagnostic tools for the choice of the kernel scale parameter based on assessing the coverage property, which asserts that credible intervals have the correct coverage levels in appropriately designed simulation settings. We provide theoretical results on coverage for both model and parameter inference, and adapt these into diagnostics for the ABC context. We re-analyse a study on human demographic history to determine whether the adopted posterior approximation was appropriate. Code implementing the proposed methodology is freely available in the R package abctools.
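The rejection form of ABC described above can be sketched in a few lines. The following is a minimal illustration, not the abctools implementation; the model, prior, summary statistic and tolerance values are all assumptions made for the example.

```python
# A minimal ABC rejection sketch: infer the mean of a normal with known
# scale, using the sample mean as summary and a uniform kernel of scale h.
import numpy as np

rng = np.random.default_rng(1)
y_obs = rng.normal(2.0, 1.0, size=100)            # "observed" data
s_obs = y_obs.mean()                              # summary statistic

def abc_rejection(n_draws, h):
    """Accept prior draws whose simulated summary is within h of s_obs."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-10, 10)              # draw from the prior
        y_sim = rng.normal(theta, 1.0, size=100)  # simulate a dataset
        if abs(y_sim.mean() - s_obs) < h:         # uniform kernel, scale h
            accepted.append(theta)
    return np.array(accepted)

# The kernel scale parameter h governs the quality of the approximation:
# too large and the posterior is blurred, too small and few draws survive.
for h in (1.0, 0.1, 0.01):
    post = abc_rejection(20000, h)
    print(f"h={h}: {post.size} accepted, posterior mean {post.mean():.3f}")
```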

3.
C. R. Rao (1978) discusses estimation for the common linear model in the case that the variance matrix σ²Q has known singular form Q. In the more general context of inference, this model exhibits certain special features and illustrates how information concerning unknowns can separate into a categorical component and a statistical component. The categorical component establishes that certain parameters are known in value and thus are not part of the statistical inference.

4.
Probabilistic graphical models offer a powerful framework to account for the dependence structure between variables, which is represented as a graph. However, the dependence between variables may render inference tasks intractable. In this paper, we review techniques exploiting the graph structure for exact inference, borrowed from optimisation and computer science. They are built on the principle of variable elimination, whose complexity is dictated in an intricate way by the order in which variables are eliminated. The so-called treewidth of the graph characterises this algorithmic complexity: low-treewidth graphs can be processed efficiently. The first point that we illustrate is therefore the idea that, for inference in graphical models, the number of variables is not the limiting factor, and it is worth checking the width of several tree decompositions of the graph before resorting to an approximate method. We show how algorithms providing an upper bound on the treewidth can be exploited to derive a ‘good’ elimination order enabling exact inference to be realised. The second point is that when the treewidth is too large, algorithms for approximate inference linked to the principle of variable elimination, such as loopy belief propagation and variational approaches, can lead to accurate results while being much less time consuming than Monte Carlo approaches. We illustrate the techniques reviewed in this article on benchmarks of inference problems in genetic linkage analysis and computer vision, as well as on hidden variable restoration in coupled hidden Markov models.
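The role of the elimination order can be illustrated with the greedy min-fill heuristic, one standard way to obtain an upper bound on the treewidth. The sketch below runs it on a small made-up graph, not one of the paper's benchmarks.

```python
# Min-fill heuristic: repeatedly eliminate the vertex whose neighbours need
# the fewest extra edges to become a clique; the largest neighbourhood met
# along the way (the induced width) upper-bounds the treewidth.
from itertools import combinations

def min_fill_order(adj):
    """Return (elimination order, induced width) for an undirected graph
    given as {vertex: set of neighbours}."""
    adj = {v: set(nb) for v, nb in adj.items()}
    order, width = [], 0
    while adj:
        def fill_cost(v):
            # Number of missing edges among the neighbours of v.
            return sum(1 for a, b in combinations(adj[v], 2) if b not in adj[a])
        v = min(adj, key=fill_cost)
        width = max(width, len(adj[v]))
        for a, b in combinations(adj[v], 2):      # connect the neighbours
            adj[a].add(b); adj[b].add(a)
        for u in adj[v]:                          # remove v from the graph
            adj[u].discard(v)
        del adj[v]
        order.append(v)
    return order, width

# A 4-cycle with a chord is chordal with max clique size 3, so treewidth 2;
# the heuristic recovers that bound.
g = {1: {2, 4}, 2: {1, 3, 4}, 3: {2, 4}, 4: {1, 2, 3}}
print(min_fill_order(g))
```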

5.
Summary.  We deal with contingency table data that are used to examine the relationships between a set of categorical variables or factors. We assume that such relationships can be adequately described by the conditional independence structure that is imposed by an undirected graphical model. If the contingency table is large, a desirable simplified interpretation can be achieved by combining some categories, or levels, of the factors. We introduce conditions under which such an operation does not alter the Markov properties of the graph. Implementation of these conditions leads to Bayesian model uncertainty procedures based on reversible jump Markov chain Monte Carlo methods. The methodology is illustrated on contingency tables ranging from 2×3×4 up to 4×5×5×2×2.

6.
Monte Carlo methods for exact inference have received much attention recently in complete or incomplete contingency table analysis. However, conventional Markov chain Monte Carlo methods, such as the Metropolis–Hastings algorithm, and importance sampling methods sometimes perform poorly by failing to produce valid tables. In this paper, we apply an adaptive Monte Carlo algorithm, the stochastic approximation Monte Carlo algorithm (SAMC; Liang, Liu, & Carroll, 2007), to the exact test of the goodness-of-fit of the model in complete or incomplete contingency tables containing some structural zero cells. The numerical results favor our method in terms of the quality of the estimates.

7.
There are two general frameworks for inference from complex samples: traditional statistical inference based on randomization theory, and model-based statistical inference. Traditional sampling theory rests on randomization theory: the population values are regarded as fixed, randomness enters only through the selection of the sample, and inference about the population depends on the sampling design. This approach yields robust estimators in large samples but breaks down with small samples, missing data and similar situations. Model-based inference instead regards the population as a random sample drawn from a superpopulation model, so that inference about the population depends on the model that is built; under a non-ignorable sampling design, however, the estimators are biased. Building on an analysis of these two approaches, this paper proposes design-assisted model-based inference and points out its important applications in complex sampling.

8.
One of the common problems encountered in applied statistics is that of comparing two proportions from stratified samples. One approach to this problem is via inference on the corresponding odds ratio. In this paper, the various point and interval estimators of, and hypothesis testing procedures for, a common odds ratio from multiple 2×2 tables are reviewed. Based on research to date, the conditional maximum likelihood and Mantel-Haenszel estimators are recommended as the point estimators of choice. Neither confidence intervals nor hypothesis testing methods have been studied as well as the point estimators, but there is a confidence interval method associated with the Mantel-Haenszel estimator that is a good choice.
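The Mantel-Haenszel point estimator recommended above has a simple closed form, OR_MH = Σ_k(a_k d_k / n_k) / Σ_k(b_k c_k / n_k), where (a, b, c, d) are the cells of the k-th 2×2 table and n_k its total. The sketch below applies it to made-up strata.

```python
# A minimal sketch of the Mantel-Haenszel common odds ratio estimator.
# Each stratum is a 2x2 table (a, b, c, d) with total n = a + b + c + d;
# the tables here are made-up numbers for illustration.
def mantel_haenszel_or(tables):
    """Common odds ratio across strata: sum(a*d/n) / sum(b*c/n)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

strata = [(10, 20, 5, 40), (4, 16, 2, 30)]        # (a, b, c, d) per stratum
print(f"MH common odds ratio: {mantel_haenszel_or(strata):.3f}")
```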

9.
It is suggested that inference under the proportional hazards model can be carried out by programs for exact inference under the logistic regression model. Advantages of such inference are that software is available and that multivariate models can be addressed. The method has been evaluated by means of coverage and power calculations in certain situations. In all situations coverage was above the nominal level, but the method was rather conservative. A different type of exact inference is developed under Type II censoring. Inference was then less conservative; however, there are limitations with respect to the censoring mechanism and multivariate generalizations, and software is not available. This method also requires extensive computational power. Performance of large-sample Wald, score and likelihood inference was also considered. Large-sample methods work remarkably well with small data sets, but inference by score statistics seems to be the best choice. There seem to be some problems with likelihood ratio inference that may originate from how this method behaves with infinite estimates of the regression parameter. Inference by Wald statistics can be quite conservative with very small data sets.

10.
Exact inference for odds-ratio tests and confidence intervals relies on sophisticated algorithms, typically found only in specialized software. This tends to discourage the use of exact methods in the analysis of 2×2 tables. We show that by first devolving each corresponding hypergeometric random variable into independent Bernoulli variates, simple and efficient algorithms emerge that are easily programmed in commonly available software.

11.
In this article, statistical inference for the failure time distribution of a product from “field return data”, which record the time between the product being shipped and being returned for repair or replacement, is described. The problem addressed is that the data are not failure times, because they also include the time that it took to ship and install the product and then to return it to the manufacturer for repair or replacement. The method attempts to infer the distribution of time to failure (that is, from installation to failure) from the data when, in addition, there are separate data on the times from shipping to installation and from failure to return. The method is illustrated with data from units installed in a telecommunications network. Our collaborator on writing this paper, Ed Lisay of Alcatel-Lucent, passed away suddenly in October 2008. As a tribute, we can state that Ed had an energetic and vigorous charisma in the application of his skills. He brought a sense of fun to his many interests, such as his achievement of becoming a master electrician. Ed is sadly missed by his family, friends and colleagues.

12.
Many inference problems lead naturally to a marginal or conditional measure of departure that depends on a nuisance parameter. As a device for first-order elimination of the nuisance parameter, we suggest averaging with respect to an exact or approximate confidence distribution function. It is shown that for many standard problems where an exact answer is available by other methods, the averaging method reproduces the exact answer. Moreover, for the gamma-mean problem, where the exact answer is not explicitly available, the averaging method gives results that agree closely with those obtained from higher-order asymptotic methods. Examples are discussed; detailed asymptotic calculations will be examined elsewhere.

13.
Summary.  There are models for which the evaluation of the likelihood is infeasible in practice. For these models the Metropolis–Hastings acceptance probability cannot be easily computed. This is the case, for instance, when only departure times from a G/G/1 queue are observed and inference on the arrival and service distributions is required. Indirect inference is a method to estimate a parameter θ in models whose likelihood function does not have an analytical closed form, but from which random samples can be drawn for fixed values of θ. First an auxiliary model is chosen whose parameter β can be directly estimated. Next, the parameters in the auxiliary model are estimated for the original data, leading to an estimate β̂. The parameter β is also estimated by using several sampled data sets, simulated from the original model for different values of the original parameter θ. Finally, the parameter θ which leads to the best match to β̂ is chosen as the indirect inference estimate. We analyse which properties an auxiliary model should have to give satisfactory indirect inference. We look at the situation where the data are summarized in a vector statistic T, and the auxiliary model is chosen so that inference on β is drawn from T only. Under appropriate assumptions the asymptotic covariance matrix of the indirect estimators is proportional to the asymptotic covariance matrix of T and componentwise inversely proportional to the square of the derivative, with respect to θ, of the expected value of T. We discuss how these results can be used in selecting good estimating functions. We apply our findings to the queuing problem.
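The simulate-and-match loop described above is easy to sketch. In the toy below, the generative model (an exponential, standing in for one with an intractable likelihood) and the auxiliary statistic (the sample mean, playing the role of β̂) are illustrative assumptions, not the paper's queueing application.

```python
# A minimal indirect inference sketch: fit the auxiliary statistic to the
# data, then pick the theta whose simulated auxiliary statistic matches it.
import numpy as np

rng = np.random.default_rng(2)
theta_true = 1.5
y_obs = rng.exponential(1 / theta_true, size=500)

beta_hat = y_obs.mean()                           # auxiliary fit to the data

def beta_from_simulation(theta, n_rep=50):
    """Auxiliary estimate averaged over datasets simulated at theta."""
    sims = rng.exponential(1 / theta, size=(n_rep, y_obs.size))
    return sims.mean()

# Grid search for the theta whose simulated auxiliary estimate best
# matches beta_hat (an optimiser could replace the grid).
grid = np.linspace(0.5, 3.0, 251)
mismatch = [(beta_from_simulation(t) - beta_hat) ** 2 for t in grid]
theta_ii = grid[int(np.argmin(mismatch))]
print(f"indirect inference estimate: {theta_ii:.3f} (true {theta_true})")
```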

14.
We describe inferactive data analysis, so-named to denote an interactive approach to data analysis with an emphasis on inference after data analysis. Our approach is a compromise between Tukey's exploratory and confirmatory data analysis, allowing also for Bayesian data analysis. We see this as a useful step in concretely providing tools (with statistical guarantees) for current data scientists. The basis of inference we use is (a conditional approach to) selective inference, in particular its randomized form. The relevant reference distributions are constructed from what we call a DAG-DAG (a Data Analysis Generative DAG), and a selective change-of-variables formula is crucial to any practical implementation of inferactive data analysis via sampling these distributions. We discuss a canonical example of an incomplete cross-validation test statistic to discriminate between black-box models, and a real HIV dataset example to illustrate inference after making multiple queries on data.

15.
We propose a simple procedure, based on an existing “debiased” ℓ1-regularized method, for inference on the average partial effects (APEs) in approximately sparse probit and fractional probit models with panel data, where the number of time periods is fixed and small relative to the number of cross-sectional observations. Our method is computationally simple and does not suffer from the incidental parameters problem that comes from attempting to estimate, as a parameter, the unobserved heterogeneity for each cross-sectional unit. Furthermore, it is robust to arbitrary serial dependence in the underlying idiosyncratic errors. Our theoretical results illustrate that inference concerning APEs is more challenging than inference about fixed and low-dimensional parameters, as the former concerns deriving the asymptotic normality of sample averages of linear functions of a potentially large set of components in our estimator when a series approximation for the conditional mean of the unobserved heterogeneity is considered. Insights into the applicability and implications of other existing Lasso-based inference procedures for our problem are provided. We apply the debiasing method to estimate the effects of spending on test pass rates. Our results show that spending has a positive and statistically significant average partial effect; moreover, the effect is comparable to that found using standard parametric methods.

16.
In the context of frequentist inference there are strong arguments in favour of data reduction by both (a) conditioning on the most appropriate ancillary statistic and (b) restricting attention to a minimal sufficient statistic. However, significantly for the study of the foundations of frequentist inference, there are some examples in which the order of application of these data reductions has an important bearing on the statistical inference of interest. This paper presents a new simple example of this kind.

17.
This paper discusses recovery of information regarding logistic regression parameters in cases when maximum likelihood estimates of some parameters are infinite. An algorithm for detecting such cases and characterizing the divergence of the parameter estimates is presented. A method for fitting the remaining parameters is also presented. All of these methods rely only on sufficient statistics rather than less aggregated quantities, as required for inference according to the method of Kolassa & Tanner (1994). These results are applied to approximate conditional inference via saddlepoint methods. Specifically, the double saddlepoint method of Skovgaard (1987) is adapted to the case when the solution to the saddlepoint equations exists as a point at infinity.

18.
The methodic use of Shannon's entropy as a basic concept, complementing probability, leads to a new class of statistics which provides, inter alia, a measure of mutual dissimilarity y between several frequency distributions. Application to contingency tables with any number of dimensions yields a dimensionless, standardised contingency coefficient which depends on the direction of inference and will combine multiplicatively with the number of observed events. This class of statistics further includes a continuous modification W of the number of degrees of freedom in a table, and a measure Q of its overall information content. Numerical illustrations and comparisons with former results are worked out. Direct applications include the optimal partition of a quasicontinuum into cells by maximizing Q, the ordering of unordered tables by minimizing local values of y, and a tentative absolute weighting of inductive inference based on the minimal necessary shift, required by an hypothesis, between the actually observed data and a set of assumed future events.

19.
In many parametric problems the use of order restrictions among the parameters can lead to improved precision. Our interest is in the study of several multinomial populations under the stochastic order restriction (SOR) for univariate situations. We use Bayesian methods to show that the SOR can lead to larger gains in precision than the method without the SOR when the SOR is reasonable. Unlike frequentist order-restricted inference, our methodology permits analysis even when there is uncertainty about the SOR. Our method is sampling based, and we use simple and efficient rejection sampling. The Bayes factor in favor of the SOR is computed in a simple manner, and samples from the requisite posterior distributions are easily obtained. We use real data to illustrate the procedure, and we show that there are likely to be larger gains in precision under the SOR.
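One standard way to realise such a rejection-sampling scheme is to draw from the unconstrained Dirichlet posteriors and keep only the draws satisfying the stochastic order; under an encompassing-prior reading, the Bayes factor in favor of the SOR is then the ratio of posterior to prior acceptance rates. The sketch below rests on assumptions of our own (Dirichlet(1,1,1) priors, made-up counts and the direction of the order), not the paper's data or exact formulation.

```python
# Rejection sampling under a stochastic order restriction for two
# multinomial populations, with a Monte Carlo Bayes factor.
import numpy as np

rng = np.random.default_rng(3)
counts1 = np.array([20, 30, 50])                  # multinomial counts, pop 1
counts2 = np.array([40, 35, 25])                  # multinomial counts, pop 2
prior = np.ones(3)                                # Dirichlet(1,1,1) prior

def prob_sor(alpha1, alpha2, n=50_000):
    """Monte Carlo probability that Dirichlet draws put population 1
    stochastically above population 2 (its CDF everywhere below)."""
    f1 = np.cumsum(rng.dirichlet(alpha1, n), axis=1)
    f2 = np.cumsum(rng.dirichlet(alpha2, n), axis=1)
    return np.mean(np.all(f1 <= f2 + 1e-12, axis=1))

# Accepted posterior draws form a sample under the SOR; the Bayes factor in
# favor of the SOR is the posterior acceptance rate over the prior one.
post_rate = prob_sor(prior + counts1, prior + counts2)
prior_rate = prob_sor(prior, prior)
print(f"Bayes factor in favor of the SOR: {post_rate / prior_rate:.2f}")
```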

20.
For the longitudinal-data semiparametric model E(y|x,t) = Xᵀβ + f(t), a penalized quadratic inference function method is used to estimate the regression parameter β and the unknown smooth function f(t) simultaneously. First, the unknown smooth function is approximated by a basis expansion in truncated power functions; then, following the idea of penalized splines, a penalized quadratic inference function in the regression parameters and the basis coefficients is constructed, and minimizing it yields the penalized quadratic inference function estimates of the regression parameters and basis coefficients. Theoretical results show that the estimators are consistent and asymptotically normal, and good results are also obtained in numerical simulations.
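The truncated power basis expansion for f(t) is sketched below; the degree, knot placement and data are illustrative assumptions, and the penalized QIF optimisation itself is not reproduced.

```python
# A minimal sketch of the truncated power basis for the smooth term:
#   f(t) ~ sum_{j=1..p} g_j t^j + sum_{k=1..K} g_{p+k} (t - kappa_k)_+^p,
# with the penalty applied to the knot coefficients in a penalized fit.
import numpy as np

def truncated_power_basis(t, degree, knots):
    """Design matrix [t, ..., t^p, (t-k1)_+^p, ..., (t-kK)_+^p]."""
    poly = np.column_stack([t ** j for j in range(1, degree + 1)])
    trunc = np.column_stack([np.maximum(t - k, 0.0) ** degree for k in knots])
    return np.hstack([poly, trunc])

t = np.linspace(0, 1, 200)
B = truncated_power_basis(t, degree=3, knots=np.linspace(0.1, 0.9, 9))
print(B.shape)  # (200, 3 + 9) columns approximating the smooth f(t)
```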
