Similar articles
 Found 20 similar articles; search took 31 ms
1.
In the estimation of a population mean or total from a random sample, certain methods based on linear models are known to be automatically design consistent, regardless of how well the underlying model describes the population. A sufficient condition is identified for this type of robustness to model failure; the condition, which we call 'internal bias calibration', relates to the combination of a model and the method used to fit it. Included among the internally bias-calibrated models, in addition to the aforementioned linear models, are certain canonical link generalized linear models and nonparametric regressions constructed from them by a particular style of local likelihood fitting. Other models can often be made robust by using a suboptimal fitting method. Thus the class of model-based, but design consistent, analyses is enlarged to include more realistic models for certain types of survey variable such as binary indicators and counts. Particular applications discussed are the estimation of the size of a population subdomain, as arises in tax auditing for example, and the estimation of a bootstrap tail probability.

2.
Multistate capture-recapture models are a natural generalization of the usual one-site recapture models. Similarly, individuals are sampled on discrete occasions, at which they may be captured or not. However, contrary to the one-site case, the individuals can move within a finite set of states between occasions. The growing interest in spatial aspects of population dynamics presently contributes to making multistate models a very promising tool for population biology. We review first the interest and the potential of multistate models, in particular when they are used with individual states as well as geographical sites. Multistate models indeed constitute canonical capture-recapture models for individual categorical covariates changing over time, and can be linked to longitudinal studies with missing data and models such as hidden Markov chains. Multistate models also provide a promising tool for handling heterogeneity of capture, provided states related to capturability can be defined and used. Such an approach could be relevant for population size estimation in closed populations. Multistate models also constitute a natural framework for mixtures of information in individual history data. Presently, most models can be fit using program MARK. As an example, we present a canonical model for multisite accession to reproduction, which fully generalizes a classical one-site model. In the generalization proposed, one can estimate simultaneously age-dependent rates of accession to reproduction, natal and breeding dispersal. Finally, we discuss further generalizations - such as a multistate generalization of growth rate models and models for data where the state in which an individual is detected is known with uncertainty - and prospects for software development.

3.
Multistate recapture models: modelling incomplete individual histories   Total citations: 1, self-citations: 0, citations by others: 1
Multistate capture-recapture models are a natural generalization of the usual one-site recapture models. Similarly, individuals are sampled on discrete occasions, at which they may be captured or not. However, contrary to the one-site case, the individuals can move within a finite set of states between occasions. The growing interest in spatial aspects of population dynamics presently contributes to making multistate models a very promising tool for population biology. We review first the interest and the potential of multistate models, in particular when they are used with individual states as well as geographical sites. Multistate models indeed constitute canonical capture-recapture models for individual categorical covariates changing over time, and can be linked to longitudinal studies with missing data and models such as hidden Markov chains. Multistate models also provide a promising tool for handling heterogeneity of capture, provided states related to capturability can be defined and used. Such an approach could be relevant for population size estimation in closed populations. Multistate models also constitute a natural framework for mixtures of information in individual history data. Presently, most models can be fit using program MARK. As an example, we present a canonical model for multisite accession to reproduction, which fully generalizes a classical one-site model. In the generalization proposed, one can estimate simultaneously age-dependent rates of accession to reproduction, natal and breeding dispersal. Finally, we discuss further generalizations - such as a multistate generalization of growth rate models and models for data where the state in which an individual is detected is known with uncertainty - and prospects for software development.

4.
When considering the relationships between two sets of variates, the number of nonzero population canonical correlations may be called the dimensionality. In the literature, several tests for dimensionality in canonical correlation analysis are known. A comparison of seven sequential test procedures is presented, using results from a simulation study. The tests are compared with regard to the relative frequencies of underestimation, correct estimation, and overestimation of the true dimensionality. Some conclusions from the simulation results are drawn.

5.
Four basic strands in the disequilibrium literature are identified. Some examples are discussed and the canonical econometric disequilibrium model and its estimation are dealt with in detail. Specific criticisms of the canonical model, dealing with price and wage rigidity, with the nature of the min condition and the price-adjustment equation, are considered and a variety of modifications is entertained. Tests of the “equilibrium vs. disequilibrium” hypothesis are discussed, as well as several classes of models that may switch between equilibrium and disequilibrium modes. Finally, consideration is given to multimarket disequilibrium models with particular emphasis on the problems of coherence and estimation.

6.
Canonical correlation assesses the relationship between two groups of variables. Although it has been a useful tool in a wide variety of research areas, it is not well known that weaker canonical correlations require larger sample sizes to be correctly inferred. In this article, we investigate small sample bias in canonical correlation analysis and apply the jackknife bias correction to the estimation of canonical correlations. We use bootstrap samples to obtain a better confidence interval for the jackknife canonical correlation estimator.
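
The jackknife bias correction mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the article: it assumes the standard delete-one jackknife formula, n * theta_hat - (n - 1) * mean(theta_(-i)), and computes sample canonical correlations from the QR factors of the centered data matrices. The function names are hypothetical, and both data matrices are assumed to have full column rank.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Sample canonical correlations between the columns of X and Y.

    Assumes X (n x p) and Y (n x q) have full column rank.
    """
    Xc = X - X.mean(axis=0)          # center each variable
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)         # orthonormal basis of the X column space
    Qy, _ = np.linalg.qr(Yc)         # orthonormal basis of the Y column space
    # Singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)      # guard against rounding slightly above 1

def jackknife_canonical_correlations(X, Y):
    """Delete-one jackknife bias-corrected canonical correlations (sketch)."""
    n = X.shape[0]
    full = canonical_correlations(X, Y)
    # Recompute the estimate with each observation left out in turn.
    loo = np.array([
        canonical_correlations(np.delete(X, i, axis=0), np.delete(Y, i, axis=0))
        for i in range(n)
    ])
    return n * full - (n - 1) * loo.mean(axis=0)
```

The corrected values are not constrained to lie in [0, 1]; how the article handles that, and how it builds the bootstrap confidence interval, is not stated in the abstract.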

7.
In this paper, we study the effects of nonnormality on the distributions of sample canonical correlations when the population canonical correlations are simple. To this end, we derive asymptotic expansion formulas for the distributions of a function of the canonical correlations as well as the individual canonical correlations under nonnormal populations. We particularly discuss the distribution of sample canonical correlations under the class of elliptical populations. These expansions are obtained using a perturbation method. Simulation results are also given.

8.
We investigate bounded-memory estimators of statistical functionals. It is shown that, for nondegenerate functionals and stochastic processes, it is impossible to achieve consistent estimation with bounded memory. In the positive direction, we show that O(log(1/ε)) states suffice to achieve ε-consistent estimation for a natural class of functionals. A canonical optimal construction is conjectured for arbitrary statistical functionals.

9.
In this article we study two methodologies which identify and specify canonical form VARMA models. The two methodologies are: (1) an extension of the scalar component methodology which specifies canonical VARMA models by identifying scalar components through canonical correlations analysis; and (2) the Echelon form methodology, which specifies canonical VARMA models through the estimation of Kronecker indices. We compare the actual forms and the methodologies on three levels. Firstly, we present a theoretical comparison. Secondly, we present a Monte Carlo simulation study that compares the performances of the two methodologies in identifying some pre-specified data generating processes. Lastly, we compare the out-of-sample forecast performance of the two forms when models are fitted to real macroeconomic data.

10.
We consider estimation in a high-dimensional linear model with strongly correlated variables. We propose to cluster the variables first and do subsequent sparse estimation such as the Lasso for cluster-representatives or the group Lasso based on the structure from the clusters. Regarding the first step, we present a novel bottom-up agglomerative clustering algorithm based on canonical correlations, and we show that it finds an optimal solution and is statistically consistent. We also present some theoretical arguments that canonical correlation based clustering leads to a better-posed compatibility constant for the design matrix which ensures identifiability and an oracle inequality for the group Lasso. Furthermore, we discuss circumstances where using cluster-representatives with the Lasso as subsequent estimator leads to improved results for prediction and detection of variables. We complement the theoretical analysis with various empirical results.

11.
Four basic strands in the disequilibrium literature are identified. Some examples are discussed and the canonical econometric disequilibrium model and its estimation are dealt with in detail. Specific criticisms of the canonical model, dealing with price and wage rigidity, with the nature of the min condition and the price-adjustment equation, are considered and a variety of modifications is entertained. Tests of the “equilibrium vs. disequilibrium” hypothesis are discussed, as well as several classes of models that may switch between equilibrium and disequilibrium modes. Finally, consideration is given to multimarket disequilibrium models with particular emphasis on the problems of coherence and estimation.

12.
Canonical correlations are maximized correlation coefficients indicating the relationships between pairs of canonical variates that are linear combinations of the two sets of original variables. The number of non-zero canonical correlations in a population is called its dimensionality. Parallel analysis (PA) is an empirical method for determining the number of principal components or factors that should be retained in factor analysis. An example is given to illustrate the adaptation of the proposed procedures, based on PA and bootstrap-modified PA, to the context of canonical correlation analysis (CCA). The performance of the proposed procedures is evaluated in a simulation study by comparison with traditional sequential test procedures with respect to the under-, correct- and over-determination of dimensionality in CCA.

13.
We consider a stochastic process describing the evolution of a certain population. A sample is taken from this population in some given generation and is to be used for estimation purposes. It is shown that if one wishes to estimate, from the sample, the actual value of a certain population quantity F in the current generation, estimation using a statistic f is preferred to estimation using a different statistic k, whereas, if one wishes to estimate the mean value of F relative to the stochastic process describing the population evolution, estimation using k is preferred to estimation using f.

14.
The coefficient of the main term of the generalization error in Bayesian estimation is called a Bayesian learning coefficient. In this article, we first introduce Vandermonde matrix type singularities and show certain orthogonality conditions for them. Recently, it has been recognized that Vandermonde matrix type singularities are related to Bayesian learning coefficients for several hierarchical learning models. By applying these orthogonality conditions, we show that their log canonical threshold also corresponds to the Bayesian learning coefficient for normal mixture models, and we obtain the explicit computational results in dimension one.

15.
16.
An asymptotic expansion of the cross-validation criterion (CVC) using the Kullback-Leibler distance is derived when the leave-k-out method is used and when parameters are estimated by the weighted score method. By this expansion, the asymptotic bias of the Takeuchi information criterion (TIC) is derived as well as that of the CVC. Under canonical parametrization in the exponential family of distributions when maximum likelihood estimation is used, the magnitudes of the asymptotic biases of the Akaike information criterion (AIC) and CVC are shown to be smaller than that of the TIC. Examples in typical statistical distributions are shown.

17.
The author identifies static optimal designs for polynomial regression models with or without intercept. His optimality criterion is an average between the D‐optimality criterion for the estimation of low‐degree terms and the D8‐optimality criterion for testing the significance of higher degree terms. His work relies on classical results concerning canonical moments and the theory of continued fractions.

18.
Functional data analysis involves the extension of familiar statistical procedures such as principal components analysis, linear modelling, and canonical correlation analysis to data where the raw observation x_i is a function. An essential preliminary to a functional data analysis is often the registration or alignment of salient curve features by suitable monotone transformations h_i of the argument t, so that the actual analyses are carried out on the values x_i{h_i(t)}. This is referred to as dynamic time warping in the engineering literature. In effect, this conceptualizes variation among functions as being composed of two aspects: horizontal and vertical, or domain and range. A nonparametric function estimation technique is described for identifying the smooth monotone transformations h_i, and is illustrated by data analyses. A second-order linear stochastic differential equation is proposed to model these components of variation.

19.
In practical survey sampling, missing data are unavoidable due to nonresponse, rejected observations by editing, disclosure control, or outlier suppression. We propose a calibrated imputation approach so that valid point and variance estimates of the population (or domain) totals can be computed by the secondary users using simple complete‐sample formulae. This is especially helpful for variance estimation, which generally requires additional information and tools that are unavailable to the secondary users. Our approach is natural for continuous variables, where the estimation may be either based on reweighting or imputation, including possibly their outlier‐robust extensions. We also propose a multivariate procedure to accommodate the estimation of the covariance matrix between estimated population totals, which facilitates variance estimation of the ratios or differences among the estimated totals. We illustrate the proposed approach using simulation data in supplementary materials that are available online.

20.
We consider the adjustment, based upon a sample of size n, of collections of vectors drawn from either an infinite or finite population. The vectors may be judged to be either normally distributed or, more generally, second-order exchangeable. We develop the work of Goldstein and Wooff (1998) to show how the familiar univariate finite population corrections (FPCs) naturally generalise to individual quantities in the multivariate population. The types of information we gain by sampling are identified with the orthogonal canonical variable directions derived from a generalised eigenvalue problem. These canonical directions share the same co-ordinate representation for all sample sizes and, for equally defined individuals, all population sizes enabling simple comparisons between both the effects of different sample sizes and of different population sizes. We conclude by considering how the FPC is modified for multivariate cluster sampling with exchangeable clusters. In univariate two-stage cluster sampling, we may decompose the variance of the population mean into the sum of the variance of cluster means and the variance of the cluster members within clusters. The first term has a FPC relating to the sampling fraction of clusters, the second term has a FPC relating to the sampling fraction of cluster size. We illustrate how this generalises in the multivariate case. We decompose the variance into two terms: the first relating to multivariate finite population sampling of clusters and the second to multivariate finite population sampling within clusters. We solve two generalised eigenvalue problems to show how to generalise the univariate to the multivariate: each of the two FPCs attaches to one, and only one, of the two eigenbases.
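
For orientation, the familiar univariate finite population correction that this abstract starts from can be sketched as follows. This is only the textbook design-based formula for simple random sampling without replacement, Var(sample mean) = (1 - n/N) * S^2 / n; it is not the Bayes linear or multivariate construction developed in the article, and the function name and arguments are hypothetical.

```python
import math

def variance_of_sample_mean(s2, n, N=math.inf):
    """Variance of the sample mean under simple random sampling.

    s2 : population variance S^2 (divisor N - 1), assumed known here
    n  : sample size
    N  : population size; math.inf gives the infinite-population case
    """
    if n <= 0 or n > N:
        raise ValueError("need 0 < n <= N")
    fpc = 1.0 - n / N        # finite population correction; equals 1 when N is infinite
    return fpc * s2 / n
```

For example, `variance_of_sample_mean(4.0, 10, 100)` applies a correction factor of 0.9 to the infinite-population value 0.4, and a full census (n = N) gives zero variance.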
