Similar literature
 20 similar documents found (search time: 46 ms)
1.
The accuracy of population projections   Cited by: 3 (self-citations: 0, citations by others: 3)
A review of past population projection errors is presented as a means for constructing confidence intervals for future projections. The author first defines a statistic to measure projection errors independent of the size of the population and the length of the projection period. A sample of U.S. and U.N. projections is used to show that the distributions of components of the error statistic are relatively stable. This information is then used to construct confidence intervals for the U.S. population up to the year 2000.

2.
Loosely speaking, a robust projection index is one that prefers projections involving true clusters over projections consisting of a cluster and an outlier. We introduce a mathematical definition of one-dimensional index robustness and describe a numerical experiment to measure it. We design five new indices based on measuring divergence from Student's t-distribution which are intended to be especially robust: the experiment shows that they are more robust than several established indices. The experiment also reveals more generally that the robustness of moment indices depends on the number of approximation terms, providing additional practical guidance for existing projection pursuit implementations. We investigate the theoretical properties of one new Student t-index and Hall's index and show that the new index automatically adapts its robustness to the degree of outlier contamination. We conclude by outlining the possibilities for extending our experiments to both higher dimensions and other new indices.

3.
Statistics, 2012, 46(6): 1357-1385
ABSTRACT

The early stages of many real-life experiments involve a large number of factors, among which only a few are active. Unfortunately, the optimal full-dimensional designs of those early stages may have bad low-dimensional projections, and the experimenters do not know which factors will turn out to be important before conducting the experiment. Therefore, designs with good projections are desirable for factor screening. In this regard, significant questions arise: do the optimal full-dimensional designs have good projections onto low dimensions? How can experimenters measure the goodness of a full-dimensional design by focusing on all of its projections? And are there linkages between the optimality of a full-dimensional design and the optimality of its projections? Through theoretical justifications, this paper tries to provide answers to these questions by investigating the construction of optimal (average) projection designs for screening either nominal or quantitative factors. The main results show that: based on the aberration and orthogonality criteria, the full-dimensional design is optimal if and only if it is an optimal projection design; the full-dimensional design is optimal via the aberration and orthogonality criteria if and only if it is a uniform projection design; there is no guarantee that a uniform full-dimensional design is an optimal projection design via any criterion; the projection design is optimal via the aberration, orthogonality and uniformity criteria if it is optimal via any one of them; and the saturated orthogonal designs have the same average projection performance.

4.
Beginning statistics at the undergraduate level can be taught by using a few geometric principles of linear vector space theory. Even formulas for simple sample means and variances can be derived with these principles by assuming a univariate linear statistical model. The least squares estimator of the sample mean is found by a perpendicular projection. The analogy of the bivariate model to the univariate model is indicated, and an analogous perpendicular projection solution for it is shown. Vector geometric diagrams illustrate the basic concepts.

Once the basic technique is understood, the appropriate application of perpendicular projections can be used to illustrate the problems of multicollinearity and tests of hypotheses in regression models. The translation of the geometric concepts into concrete algebraic equations is shown. The emphasis is on geometric thinking as a means of visualizing and thereby improving an understanding of methods of data analysis.
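The projection view of the sample mean described in this abstract can be sketched in a few lines. This is a minimal illustration and not the article's own material: the helper `project_onto` and the data vector `y` are hypothetical, and only the Python standard library is assumed. Projecting `y` perpendicularly onto the vector of ones gives the least squares estimate of the mean, and the squared length of the residual gives the sample variance.

```python
def project_onto(y, x):
    """Perpendicular projection of vector y onto the line spanned by x.

    Returns the least squares coefficient and the projected vector.
    """
    coef = sum(yi * xi for yi, xi in zip(y, x)) / sum(xi * xi for xi in x)
    return coef, [coef * xi for xi in x]


# Hypothetical data, chosen only for illustration.
y = [2.0, 4.0, 9.0]
ones = [1.0] * len(y)

# Univariate model y = mu * 1 + e: the coefficient of the projection is the sample mean.
mean_hat, fitted = project_onto(y, ones)

# The residual vector is perpendicular to the vector of ones.
residual = [yi - fi for yi, fi in zip(y, fitted)]
dot_with_ones = sum(r * o for r, o in zip(residual, ones))

# Squared residual length divided by the residual degrees of freedom (n - 1).
sample_var = sum(r * r for r in residual) / (len(y) - 1)

print(mean_hat, dot_with_ones, sample_var)
```

The same helper works for a centred regressor in place of `ones`, which is one way to read the analogy between the univariate and bivariate models mentioned above.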

5.
Two separate structure discovery properties of Fisher's LDF are derived in a mixture multivariate normal setting. One of the properties is related to Fisher information and is proved by using Stein's identity. The other property is on lack of unimodality. The properties are used to give three selection rules for the choice of informative projections of high-dimensional data, not necessarily multivariate normal. Their usefulness in the two-group classification problem is studied theoretically and by means of examples. Extensions and various issues about practical implementation are discussed.

6.
In this paper, a notion of generalized inner product spaces is introduced to study optimal estimating functions. The basic technique involves an idea of orthogonal projection first introduced by Small and McLeish (1988, 1989, 1991, 1992, 1994). A characterization of orthogonal projections in generalized inner product spaces is given. It is shown that the orthogonal projection of the score function into a linear subspace of estimating functions is optimal in that subspace, and a characterization of optimal estimating functions is given. As special cases of the main results of this paper, we derive the results of Godambe (1985) on the foundation of estimation in stochastic processes, the result of Godambe and Thompson (1989) on the extension of quasi-likelihood, and the generalized estimating equations for multivariate data due to Liang and Zeger (1986). Also we have derived optimal estimating functions in the Bayesian framework.

7.
The idea of searching for orthogonal projections, from a multidimensional space into a linear subspace, as an aid to detecting non-linear structure has been named exploratory projection pursuit. Most approaches are tied to the idea of searching for interesting projections. Typically, an interesting projection is one where the distribution of the projected data differs from the normal distribution. In this paper we define two projection indices which are aimed specifically at finding projections that best show grouped structure in the plane, if this exists in the multi-dimensional space. These involve a numerical optimization problem which is tackled in two stages, the projection and the pursuit; the first is based on a procedure to generate pseudo-random rotation matrices in the sense of the grand tour by D. Asimov (1985), and the second is a local numerical optimization procedure. One artificial and one real example illustrate the performance of the suggested indices.

8.
Tests of forecast accuracy and bias for county population projections   Cited by: 1 (self-citations: 0, citations by others: 1)
"This article deals with the forecast accuracy and bias of population projections for 2,971 counties in the United States. It uses three different projection techniques and data from 1950, 1960, 1970, and 1980 to make two sets of 10-year projections and one set of 20-year projections. These projections are compared with census counts to determine forecast errors. The size, direction, and distribution of forecast errors are analyzed by size of place, rate of growth, and length of projection horizon. A number of consistent patterns are noted, and an extension of the empirical results to the production of confidence intervals for population projections is considered." A comment by Paul M. Beaumont and Andrew M. Isserman is included (pp. 1,004-9) together with a rejoinder by the author (pp. 1,009-12). This is a revised version of a paper presented at the 1986 Annual Meeting of the Population Association of America (see Population Index, Vol. 52, No. 3, Fall 1986, p. 456).

9.
This paper introduces a new class of time-varying, measure-valued stochastic processes for Bayesian nonparametric inference. The class of priors is constructed by normalising a stochastic process derived from non-Gaussian Ornstein-Uhlenbeck processes and generalises the class of normalised random measures with independent increments from static problems. Some properties of the normalised measure are investigated. A particle filter and MCMC schemes are described for inference. The methods are applied to an example in the modelling of financial data.

10.
The table look-up rule problem can be described by the question: what is a good way for the table to represent the decision regions in the N-dimensional measurement space? This paper describes a quickly implementable table look-up rule based on Ashby’s representation of sets in his constraint analysis. A decision region for category c in the N-dimensional measurement space is considered to be the intersection of the inverse projections of the decision regions determined for category c by Bayes rules in smaller dimensional projection spaces. Error bounds for this composite decision rule are derived: any entry in the confusion matrix for the composite decision rule is bounded above by the minimum of that entry taken over all the confusion matrices of the Bayes decision rules in the smaller dimensional projection spaces.

On simulated Gaussian data, the probability of error with the table look-up rule is comparable to that of the optimum Bayes rule.

11.
Abstract. Testing for parametric structure is an important issue in non‐parametric regression analysis. A standard approach is to measure the distance between a parametric and a non‐parametric fit with a squared deviation measure. These tests inherit the curse of dimensionality from the non‐parametric estimator. This results in a loss of power in finite samples and against local alternatives. This article proposes to circumvent the curse of dimensionality by projecting the residuals under the null hypothesis onto the space of additive functions. To estimate this projection, the smooth backfitting estimator is used. The asymptotic behaviour of the test statistic is derived and the consistency of a wild bootstrap procedure is established. The finite sample properties are investigated in a simulation study.

12.
Existing projection designs (e.g. maximum projection designs) attempt to achieve good space-filling properties in all projections. However, when using a Gaussian process (GP), a model-based design criterion such as the entropy criterion is more appropriate. We employ the entropy criterion averaged over a set of projections, called the expected entropy criterion (EEC), to generate projection designs. We show that maximum EEC designs are invariant to monotonic transformations of the response, i.e. they are optimal for a wide class of stochastic process models. We also demonstrate that transformation of each column of a Latin hypercube design (LHD) based on a monotonic function can substantially improve the EEC. Two types of input transformations are considered: a quantile function of a symmetric Beta distribution chosen to optimize the EEC, and a nonparametric transformation corresponding to the quantile function of a symmetric density chosen to optimize the EEC. Numerical studies show that the proposed transformations of the LHD are efficient and effective for building robust maximum EEC designs. These designs give projections with markedly higher entropies and lower maximum prediction variances (MPVs) at the cost of small increases in average prediction variances (APVs) compared to state-of-the-art space-filling designs over wide ranges of covariance parameter values.

13.
This paper focuses on unsupervised curve classification in the context of the nuclear industry. At the Commissariat à l'Energie Atomique (CEA), Cadarache (France), the thermal-hydraulic computer code CATHARE is used to study the reliability of reactor vessels. The code inputs are physical parameters and the outputs are time evolution curves of a few other physical quantities. As the CATHARE code is quite complex and CPU time-consuming, it has to be approximated by a regression model. This regression process involves a clustering step. In the present paper, the CATHARE output curves are clustered using a k-means scheme, with a projection onto a lower dimensional space. We study the properties of the empirically optimal cluster centres found by the clustering method based on projections, compared with the ‘true’ ones. The choice of the projection basis is discussed, and an algorithm is implemented to select the best projection basis among a library of orthonormal bases. The approach is illustrated on a simulated example and then applied to the industrial problem.
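The idea of clustering curves after projecting them onto a low-dimensional orthonormal basis can be sketched as follows. This is an illustrative toy only, not the paper's algorithm: the two-vector Haar-like `basis`, the six discretised `curves`, the fixed initial centres and the helpers `project` and `kmeans` are all assumptions made for the example, and the basis-selection step described in the abstract is not reproduced.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def project(curve, basis):
    """Coefficients of a discretised curve on an orthonormal basis."""
    return [dot(curve, e) for e in basis]


def kmeans(points, centres, n_iter=20):
    """Plain Lloyd iterations starting from the given centres."""
    labels = []
    for _ in range(n_iter):
        # Assignment step: nearest centre in squared Euclidean distance.
        labels = [
            min(range(len(centres)),
                key=lambda j: sum((p - c) ** 2 for p, c in zip(pt, centres[j])))
            for pt in points
        ]
        # Update step: each centre becomes the mean of its members.
        new_centres = []
        for j in range(len(centres)):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:
                new_centres.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centres.append(centres[j])
        if new_centres == centres:
            break
        centres = new_centres
    return labels, centres


# Hypothetical orthonormal basis on a 4-point grid (constant and Haar-like step).
basis = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, -0.5, -0.5]]

# Hypothetical curves: three rising, three falling.
curves = [
    [0.0, 0.0, 1.0, 1.0], [0.0, 0.1, 0.9, 1.0], [0.1, 0.0, 1.0, 1.1],
    [1.0, 1.0, 0.0, 0.0], [1.0, 0.9, 0.1, 0.0], [1.1, 1.0, 0.0, 0.1],
]

coeffs = [project(c, basis) for c in curves]
labels, centres = kmeans(coeffs, centres=[coeffs[0], coeffs[3]])
print(labels)
```

Clustering the two coefficients rather than the four raw values is the point of the sketch: the second basis vector alone separates rising from falling curves.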

14.
The effects of future population trends, such as demographic aging, declining fertility, and changes in migration, on the labor market in the Federal Republic of Germany are analyzed up to the year 2000. The study is based on projections prepared by the Institute for Research on the Labor Market and Occupations. Topics discussed include demographic trends as a cause of current unemployment, labor market phases and demographic trends since 1950, the projection model used, age-specific projections of the potential labor force, and labor market projections.

15.
We model the effect of a road safety measure on a set of target sites with a control area for each site, and we suppose that the accident data recorded at each site are classified in different mutually exclusive types. We adopt the before–after technique and we assume that at any one target site the total number of accidents recorded is multinomially distributed between the periods and types of accidents. In this article, we propose a minorization–majorization (MM) algorithm for obtaining the constrained maximum likelihood estimates of the parameter vector. We compare it with a gradient projection–expectation maximization (GP-EM) algorithm, based on gradient projections. The performance of the algorithms is examined through a simulation study of road safety data.

16.
Lu Lin. Statistical Methodology, 2006, 3(4): 444-455
If the form of the distribution of data is unknown, the Bayesian method fails in the parametric inference because there is no posterior distribution of the parameter. In this paper, a theoretical framework of Bayesian likelihood is introduced via the Hilbert space method, which is free of the distributions of data and the parameter. The posterior distribution and posterior score function based on given inner products are defined and, consequently, the quasi posterior distribution and quasi posterior score function are derived, respectively, as the projections of the posterior distribution and posterior score function onto the space spanned by given estimating functions. In the space spanned by data, particularly, an explicit representation for the quasi posterior score function is obtained, which can be derived as a projection of the true posterior score function onto this space. The methods of constructing conservative quasi posterior score and quasi posterior log-likelihood are proposed. Some examples are given to illustrate the theoretical results. As an application, the quasi posterior distribution functions are used to select variables for generalized linear models. It is proved that, for linear models, the variable selections via quasi posterior distribution functions are equivalent to the variable selections via the penalized residual sum of squares or regression sum of squares.

17.
Projection techniques for nonlinear principal component analysis   Cited by: 4 (self-citations: 0, citations by others: 4)
Principal Components Analysis (PCA) is traditionally a linear technique for projecting multidimensional data onto lower dimensional subspaces with minimal loss of variance. However, there are several applications where the data lie in a lower dimensional subspace that is not linear; in these cases linear PCA is not the optimal method to recover this subspace and thus account for the largest proportion of variance in the data. Nonlinear PCA addresses the nonlinearity problem by relaxing the linear restrictions on standard PCA. We investigate both linear and nonlinear approaches to PCA both exclusively and in combination. In particular we introduce a combination of projection pursuit and nonlinear regression for nonlinear PCA. We compare the success of PCA techniques in variance recovery by applying linear, nonlinear and hybrid methods to some simulated and real data sets. We show that the best linear projection that captures the structure in the data (in the sense that the original data can be reconstructed from the projection) is not necessarily a (linear) principal component. We also show that the ability of certain nonlinear projections to capture data structure is affected by the choice of constraint in the eigendecomposition of a nonlinear transform of the data. Similar success in recovering data structure was observed for both linear and nonlinear projections.
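As a small illustration of the linear case only, the following sketch computes the first principal direction of two-dimensional data in closed form and checks how much variance the one-dimensional projection retains. The data set `points` and the helper `first_principal_direction` are hypothetical and chosen for this example; the nonlinear and hybrid methods discussed in the abstract are not covered.

```python
import math


def first_principal_direction(points):
    """Leading eigenpair of the 2x2 sample covariance matrix, in closed form."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centered = [(x - mx, y - my) for x, y in points]
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)
    # Largest eigenvalue of [[sxx, sxy], [sxy, syy]].
    lam = (sxx + syy) / 2 + math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    # Orientation of the corresponding eigenvector (major axis).
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))
    return centered, direction, lam, sxx + syy


# Hypothetical data lying close to a straight line.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.1), (3.0, 2.9), (4.0, 4.0)]
centered, direction, lam, total_var = first_principal_direction(points)

# Project each centred point onto the first principal direction.
scores = [x * direction[0] + y * direction[1] for x, y in centered]
score_var = sum(s * s for s in scores) / (len(scores) - 1)

# Share of the total variance kept by the one-dimensional projection.
retained = score_var / total_var
print(direction, retained)
```

For data that lie near a curve rather than a line, the retained share would drop, which is the situation the nonlinear methods in this abstract are meant to address.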

18.
φ-divergence statistics are obtained by either replacing both distributions involved in the argument of the φ-divergence measure by their sample estimates or replacing one distribution and considering the other as given. The sampling properties of estimated divergence-type measures are investigated. Approximate means and variances are derived and asymptotic distributions are obtained. Tests of goodness of fit of observed frequencies to expected ones and tests of equality of divergences based on two or more multinomial samples are constructed.

19.
In high-dimensional data, one often seeks a few interesting low-dimensional projections which reveal important aspects of the data. Projection pursuit for classification finds projections that reveal differences between classes. Even though projection pursuit is used to bypass the curse of dimensionality, most indexes will not work well when there is a small number of observations relative to the number of variables, known as a large p (dimension) small n (sample size) problem. This paper discusses the effect of the relationship between sample size and dimensionality on classification and proposes a new projection pursuit index that overcomes the problem of small sample size for exploratory classification.

20.
The objective of this paper is to study the issue of the projection discrepancy along the line of Liu (2002) and Fang and Qin (2005), based on the discrete discrepancy measure proposed in Qin and Fang (2004), which has wide application to the field of fractional factorials. Here we also study the projection properties for q-level factorials and provide a connection between minimum projection uniformity and other optimality criteria. A lower bound to the projection discrepancy for q-level factorials is presented here.
