Similar Articles
1.
The author is concerned with log-linear estimators of the size N of a population in a capture-recapture experiment featuring heterogeneity in the individual capture probabilities and a time effect. He also considers models where the first capture influences the probability of subsequent captures. He derives several results from a new inequality associated with a dispersive ordering for discrete random variables. He shows that in a log-linear model with inter-individual heterogeneity, the estimator of N is an increasing function of the heterogeneity parameter. He also shows that the inclusion of a time effect in the capture probabilities decreases the estimator of N in models without heterogeneity. He further argues that a model featuring heterogeneity can accommodate a time effect through a small change in the heterogeneity parameter. He demonstrates these results using an inequality for the estimators of the heterogeneity parameters and illustrates them in a Monte Carlo experiment.
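The abstract gives no formulas, but the flavour of a log-linear capture-recapture estimator can be sketched for the simplest time-effect model M_t; the heterogeneity and behavioural-response terms that are the focus of the paper are omitted, and the counts and names below are illustrative.

```python
# A minimal sketch of a log-linear capture-recapture fit for the pure
# time-effect model M_t; heterogeneity and behavioural-response terms,
# which are the focus of the paper, are omitted. Data are illustrative.
import itertools
import numpy as np
import statsmodels.api as sm

# Counts for the 2^t - 1 observable capture histories (t = 3 occasions).
histories = [h for h in itertools.product([0, 1], repeat=3) if any(h)]
counts = np.array([31, 28, 12, 45, 14, 19, 8], dtype=float)

# Log-linear model M_t: log E[count_h] = beta0 + sum_j h_j * beta_j.
X = sm.add_constant(np.array(histories, dtype=float))
fit = sm.GLM(counts, X, family=sm.families.Poisson()).fit()

# The unobserved history (0,0,0) has fitted mean exp(beta0), so the
# population-size estimate adds that cell to the observed total.
N_hat = counts.sum() + np.exp(fit.params[0])
print(f"Estimated population size: {N_hat:.1f}")
```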

2.
Nonparametric density estimation in the presence of measurement error is considered. The usual kernel deconvolution estimator seeks to account for the contamination in the data by employing a modified kernel. In this paper a new approach based on a weighted kernel density estimator is proposed. Theoretical motivation is provided by the existence of a weight vector that perfectly counteracts the bias in density estimation without generating an excessive increase in variance. In practice a data-driven method of weight selection is required. Our strategy is to minimize the discrepancy between a standard kernel estimate from the contaminated data on the one hand, and the convolution of the weighted deconvolution estimate with the measurement error density on the other hand. We consider a direct implementation of this approach, in which the weights are optimized subject to sum and non-negativity constraints, and a regularized version in which the objective function includes a ridge-type penalty. Numerical tests suggest that weighted kernel estimation can lead to tangible improvements in performance over the usual kernel deconvolution estimator. Furthermore, weighted kernel estimates are free from the problem of negative estimation in the tails that can occur when using modified kernels. The weighted kernel approach generalizes to multivariate deconvolution density estimation in a very straightforward manner.
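A minimal sketch of the direct implementation described above, assuming a Gaussian kernel and Gaussian measurement error with known standard deviation; the grid, bandwidth, and ridge penalty are illustrative choices rather than the paper's.

```python
# Sketch of a weighted kernel density estimate whose error-convolved version
# is matched to a standard KDE of the contaminated data (Gaussian kernel,
# Gaussian measurement error with known sd; all tuning choices illustrative).
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def weighted_deconvolution_weights(w_obs, h, sigma_u, grid, ridge=0.0):
    n = len(w_obs)
    # Standard KDE of the contaminated data, evaluated on the grid.
    f_w = norm.pdf(grid[:, None], loc=w_obs[None, :], scale=h).mean(axis=1)
    # Convolving the weighted estimate with the N(0, sigma_u^2) error density
    # amounts to widening the Gaussian kernel to sqrt(h^2 + sigma_u^2).
    conv_kernels = norm.pdf(grid[:, None], loc=w_obs[None, :],
                            scale=np.sqrt(h**2 + sigma_u**2))

    def objective(w):
        discrepancy = f_w - conv_kernels @ w
        return np.sum(discrepancy**2) + ridge * np.sum((w - 1.0 / n)**2)

    cons = ({'type': 'eq', 'fun': lambda w: w.sum() - 1.0},)
    bounds = [(0.0, None)] * n          # non-negativity constraint
    res = minimize(objective, x0=np.full(n, 1.0 / n),
                   bounds=bounds, constraints=cons, method='SLSQP')
    return res.x

def weighted_kde(x, w_obs, weights, h):
    # The weighted density estimate itself: f_hat(x) = sum_i w_i * K_h(x - W_i).
    return norm.pdf(x[:, None], loc=w_obs[None, :], scale=h) @ weights
```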

3.
This article considers testing the significance of a regressor with a near unit root in a predictive regression model. The procedures discussed are nonparametric, so one can test the significance of a regressor without specifying a functional form; the null hypothesis is that the entire regression function takes the value zero. We show that the standardized test has a normal limiting distribution regardless of whether there is a near unit root in the regressor. This is in contrast to tests based on linear regression for this model, which have a nonstandard limiting distribution that depends on nuisance parameters. Our results have practical implications for testing the significance of a regressor, since there is no need to conduct pretests for a unit root in the regressor and the same procedure can be used whether or not the regressor has a unit root. A Monte Carlo experiment explores the performance of the test for various levels of persistence of the regressors and for various linear and nonlinear alternatives. The test has superior performance against certain nonlinear alternatives. An application to stock returns shows how the test can improve inference about predictability.

4.
It is known that when there is a break in the variance (unconditional heteroskedasticity) of the error term in linear regression models, a routine application of the Lagrange multiplier (LM) test for autocorrelation can cause potentially significant size distortions. We propose a new test for autocorrelation that is robust in the presence of a break in variance. The proposed test is a modified LM test based on a generalized least squares regression. Monte Carlo simulations show that the new test performs well in finite samples: it is comparable to existing heteroskedasticity-robust tests in terms of size and much better in terms of power.
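For orientation, a sketch of a Breusch-Godfrey-type LM test for autocorrelation together with a crude GLS-style rescaling around an assumed variance-break date; this is not the paper's modified test, only a plausible illustration of the idea.

```python
# Sketch: a Breusch-Godfrey-type LM test for serially correlated errors, and
# a crude GLS-style variant that rescales the data by segment residual
# standard deviations around an assumed break date (the paper's exact
# modification is not reproduced here).
import numpy as np
from scipy import stats

def lm_autocorr_test(y, X, lags=1):
    n = len(y)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ beta
    # Auxiliary regression of residuals on X and lagged residuals.
    E_lag = np.column_stack([np.r_[np.zeros(j), e[:-j]] for j in range(1, lags + 1)])
    Z = np.column_stack([X, E_lag])
    gamma = np.linalg.lstsq(Z, e, rcond=None)[0]
    resid_aux = e - Z @ gamma
    r2 = 1.0 - resid_aux @ resid_aux / (e @ e)
    lm = n * r2                      # LM statistic, asymptotically chi2(lags)
    return lm, stats.chi2.sf(lm, df=lags)

def lm_autocorr_test_rescaled(y, X, break_index, lags=1):
    # Rescale observations by the residual sd estimated separately before and
    # after the (assumed known) break, then apply the same LM test.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ beta
    s = np.where(np.arange(len(y)) < break_index,
                 e[:break_index].std(ddof=1), e[break_index:].std(ddof=1))
    return lm_autocorr_test(y / s, X / s[:, None], lags=lags)
```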

5.
In this paper, two tests based on weighted CUSUMs of the least squares residuals are studied to detect a change-point in a nonlinear model in real time. A first test statistic is proposed by extending a method already used in the literature for linear models. At each sequential observation, the null hypothesis that there is no change in the model is tested against the presence of a change. The asymptotic distribution of the test statistic under the null hypothesis is given, and its convergence in probability to infinity is proved when a change occurs. These results allow an asymptotic critical region to be built. Next, in order to decrease the type I error probability, a bootstrapped critical value is proposed and a modified test is studied in a similar way. A generalization of the Hájek–Rényi inequality is established.
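A rough sketch of sequential CUSUM-of-residuals monitoring, using a linear least-squares fit and the standard boundary function from the monitoring literature with an illustrative critical constant; the paper's nonlinear setting and bootstrapped critical value are not reproduced.

```python
# Sketch of sequential CUSUM-of-residuals monitoring: the model is fitted on a
# historical sample of size m, and incoming observations are monitored by
# comparing the cumulative sum of their residuals with a boundary function.
# A linear model and a fixed, illustrative critical constant are used here;
# the paper works with nonlinear models and a bootstrapped critical value.
import numpy as np

def cusum_monitor(y_hist, X_hist, y_new, X_new, crit=2.0, gamma=0.25):
    m = len(y_hist)
    beta = np.linalg.lstsq(X_hist, y_hist, rcond=None)[0]
    sigma = np.std(y_hist - X_hist @ beta, ddof=X_hist.shape[1])
    resid_new = y_new - X_new @ beta
    for k in range(1, len(resid_new) + 1):
        cusum = np.abs(resid_new[:k].sum()) / (sigma * np.sqrt(m))
        # Standard monitoring boundary (illustrative constant 'crit').
        boundary = crit * (1.0 + k / m) * (k / (k + m)) ** gamma
        if cusum > boundary:
            return k          # alarm at the k-th post-sample observation
    return None               # no change detected
```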

6.
Structure learning for Bayesian networks is usually carried out heuristically, in search of an optimal model, to avoid an explosive computational burden. In the learning process, a structural error made at one point may deteriorate all subsequent learning. We propose a remedial approach to this propagation of errors that uses marginal model structures: local errors in the structure are fixed by reference to the marginal structures. In this sense, we call the remedy a marginally corrective procedure. We devise a new score function for the procedure that consists of two components, the likelihood function of a model and a discrepancy measure for the marginal structures. The proposed method compares favourably with a couple of the most popular algorithms, as shown in experiments with benchmark data sets.

7.
This paper considers the problem of estimating the size and mean value of a stigmatized quantitative character of a hidden gang in a finite population. The proposed method may be applied to solve domestic problems in a particular country or across countries: for example, a government may be interested in estimating the average income of victims or perpetrators of domestic violence. The proposed method is based on the technique introduced by Warner (1965) to estimate the proportion of a sensitive attribute in a finite population without threatening the privacy of the respondents. Expressions for the bias and variance of the proposed estimators are given, to a first order of approximation. Circumstances in which the method can be applied are studied and illustrated using a numerical example.
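As background, a sketch of Warner's (1965) randomized-response estimator of a sensitive proportion, the building block referred to above; the extension to a hidden gang's size and a quantitative character is not reproduced.

```python
# Sketch of Warner's (1965) randomized-response estimator of a sensitive
# proportion. Each respondent answers truthfully about whichever of the two
# complementary statements the randomizing device selects, so
# P(yes) = p*pi + (1 - p)*(1 - pi) with p != 0.5.
import numpy as np

def warner_estimate(yes_responses, n, p):
    lam_hat = yes_responses / n
    pi_hat = (lam_hat - (1 - p)) / (2 * p - 1)
    var_hat = pi_hat * (1 - pi_hat) / n + p * (1 - p) / (n * (2 * p - 1) ** 2)
    return pi_hat, np.sqrt(max(var_hat, 0.0))

# Example: 420 "yes" answers from 1000 respondents with device probability 0.7.
pi_hat, se = warner_estimate(420, 1000, 0.7)
print(f"Estimated proportion = {pi_hat:.3f} (s.e. {se:.3f})")
```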

8.
The use of lower probabilities is considered for inferences in basic jury scenarios, to study aspects of the size of juries and their composition when society consists of subpopulations. The use of lower probability seems natural in law, as it leads to robust inference in the sense of providing a defendant with the benefit of the doubt. The method presented in this paper focuses on how representative a jury is of the whole population, using a novel concept of a second 'imaginary' jury together with exchangeability assumptions. It has the advantage that no assumption whatsoever is made about the guilt of the defendant. Although the concept of a jury in law is central in the presentation, the novel approach and the conclusions of this paper hold for representative decision-making processes in many fields, and the paper also provides a new perspective on stratified sampling.

9.
By assuming that the underlying distribution belongs to the domain of attraction of an extreme value distribution, one can extrapolate the data to a far tail region so that a rare event can be predicted. However, when the distribution is in the domain of attraction of a Gumbel distribution, the extrapolation is generally quite limited in comparison with a heavy-tailed distribution. In view of this drawback, Weibull-tailed distributions have been studied recently. Some methods for choosing the sample fraction in estimating the Weibull tail coefficient, and some bias-reduction estimators, have been proposed in the literature. In this paper, we show that the theoretical optimal sample fraction does not exist and that a bias-reduction estimator does not always produce a smaller mean squared error than a biased estimator; both findings differ from the heavy-tailed case. Further, we propose a refined class of Weibull-tailed distributions which are more useful in estimating high quantiles and extreme tail probabilities.

10.
In high-dimensional linear regression, the dimension of the variables is greater than the sample size. In this situation, the traditional variance estimation technique based on ordinary least squares exhibits a high bias even under a sparsity assumption. One of the major reasons is the high spurious correlation between the unobserved realized noise and several predictors. To alleviate this problem, a refitted cross-validation (RCV) method has been proposed in the literature. However, for a complicated model, RCV has a lower probability of the selected model including the true model in finite samples, which may easily result in a large bias of the variance estimation. Thus, a model selection method based on the ranks of the frequencies of occurrence in the six votes from a blocked 3×2 cross-validation is proposed in this study. The proposed method has a considerably larger probability of including the true model in practice than the RCV method. The variance estimate obtained using the model selected by the proposed method also shows a lower bias and a smaller variance. Furthermore, theoretical analysis proves the asymptotic normality of the proposed variance estimation.
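A sketch of the plain RCV variance estimator referred to above, with a lasso selection step; the blocked 3×2 cross-validation ranking proposed in the paper is not reproduced, and the tuning choices are illustrative.

```python
# Sketch of the refitted cross-validation (RCV) variance estimator: select
# variables on one half of the data, refit by least squares on the other half,
# use its residual sum of squares, and average the two directions.
import numpy as np
from sklearn.linear_model import LassoCV

def rcv_variance(X, y, seed=0):
    n = len(y)
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    halves = (idx[: n // 2], idx[n // 2 :])

    def one_direction(select_idx, refit_idx):
        # Variable selection on one half ...
        sel = LassoCV(cv=5).fit(X[select_idx], y[select_idx])
        support = np.flatnonzero(sel.coef_)
        if support.size == 0:
            return np.var(y[refit_idx], ddof=1)
        # ... then refit on the other half and use its RSS.
        Xr, yr = X[refit_idx][:, support], y[refit_idx]
        beta = np.linalg.lstsq(Xr, yr, rcond=None)[0]
        rss = np.sum((yr - Xr @ beta) ** 2)
        return rss / (len(yr) - support.size)

    return 0.5 * (one_direction(*halves) + one_direction(*reversed(halves)))
```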

11.
Group testing procedures, in which groups containing several units are tested without testing each unit, are widely used as cost-effective procedures for estimating the proportion of defective units in a population. A problem arises when these procedures are applied to the detection of genetically modified organisms (GMOs), because the analytical instrument for detecting GMOs has a threshold of detection. If the group size (i.e., the number of units within a group) is large, the GMOs in a group may not be detected, owing to dilution, even if the group contains one unit of GMOs. Thus, a small group size (which we call the conventional group size) is commonly used so that the existence of defective units can surely be detected if at least one unit of GMOs is included in the group. However, we show that the proportion of defective units can be estimated for any group size even if a threshold of detection exists; the estimate is easily obtained using functions implemented in a spreadsheet. We then show that the conventional group size is not always optimal in controlling a consumer's risk, because such a group size requires a larger number of groups for testing.
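A sketch of the standard group-testing estimate of the proportion of defective units, the quantity the abstract says can be computed with spreadsheet functions; the adjustment for the instrument's detection threshold and the consumer's-risk analysis are not included.

```python
# Sketch of the standard group-testing estimator of the proportion of
# defective units: m groups of k units each, of which d test positive.
# (The paper's detection-threshold adjustment is not reproduced here.)
def group_testing_proportion(d, m, k):
    # If each unit is defective independently with probability p, a group is
    # negative with probability (1 - p)^k, so the MLE inverts the observed
    # negative fraction.
    return 1.0 - (1.0 - d / m) ** (1.0 / k)

# Example: 5 positive groups out of 50 groups of size 20.
print(f"p_hat = {group_testing_proportion(5, 50, 20):.4f}")
```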

12.
We propose a stochastic model for analysing risk factors for emesis in multi-cycle chemotherapies, which allows the effect of a potential risk factor to be described by a single parameter. This model is a hybrid between a random intercept model and a transition model, and it is motivated by medical background knowledge on the frequency and course of emesis in cancer patients. We consider maximum likelihood estimation of the parameters of the model and, additionally, efficient estimation of the marginal risk in the first cycle. Finite sample properties are investigated in a simulation study. The proposed model suffers from a slight overparametrization, such that ML estimates show some poor statistical properties, but estimates of the marginal risk behave quite well. An investigation of alternative, simpler regression models reveals that in this setting these models allow a time-constant regression coefficient to be defined only in a somewhat arbitrary manner. Hence we conclude that the proposed model is valuable in spite of the difficulties in parameter estimation.

13.
A sensitivity analysis displays the increase in uncertainty that attends an inference when a key assumption is relaxed. In matched observational studies of treatment effects, a key assumption in some analyses is that subjects matched for observed covariates are comparable, and this assumption is relaxed by positing a relevant covariate that was not observed and not controlled by matching. What properties would such an unobserved covariate need to have to materially alter the inference about treatment effects? For ease of calculation and reporting, it is convenient that the sensitivity analysis be of low dimension, perhaps indexed by a scalar sensitivity parameter, but for interpretation in specific contexts, a higher dimensional analysis may be of greater relevance. An amplification of a sensitivity analysis is defined as a map from each point in a low dimensional sensitivity analysis to a set of points, perhaps a 'curve,' in a higher dimensional sensitivity analysis such that the possible inferences are the same for all points in the set. Possessing an amplification, an investigator may calculate and report the low dimensional analysis, yet have available the interpretations of the higher dimensional analysis.
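For concreteness, a sketch of one well-known amplification of this kind (the map described by Rosenbaum and Silber), in which a single sensitivity parameter Gamma corresponds to the curve of pairs (Lambda, Delta) with Gamma = (Lambda*Delta + 1)/(Lambda + Delta); the abstract itself does not state this formula, so it is quoted here only as an illustration.

```python
# Sketch of an amplification: one sensitivity parameter Gamma is mapped to a
# curve of pairs (Lambda, Delta), where Lambda bounds the unobserved
# covariate's effect on treatment assignment and Delta bounds its effect on
# the outcome, via Gamma = (Lambda*Delta + 1)/(Lambda + Delta). This is the
# well-known Rosenbaum-Silber map, quoted here only as an illustration.
import numpy as np

def amplification_curve(gamma, lambdas):
    # Solve Gamma = (Lambda*Delta + 1)/(Lambda + Delta) for Delta,
    # valid for Lambda > Gamma.
    lambdas = np.asarray(lambdas, dtype=float)
    return (gamma * lambdas - 1.0) / (lambdas - gamma)

lambdas = np.linspace(2.5, 10.0, 6)
for lam, delta in zip(lambdas, amplification_curve(2.0, lambdas)):
    print(f"Gamma=2 is equivalent to (Lambda={lam:.2f}, Delta={delta:.2f})")
```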

14.
In multistage processes, the quality of a process or product at each stage is related to that of the previous stage(s). This property is referred to as the cascade property. Sometimes the quality of a process is characterized by a profile. In this paper, we consider a two-stage process with a normal quality characteristic in the first stage and a simple linear regression profile in the second stage. We then propose two methods for monitoring the quality characteristics in both stages. The performance of the two proposed methods is evaluated through a numerical example in terms of the average run length criterion.

15.
Consider the problem of estimating the positions of a set of targets in a multidimensional Euclidean space from distances reported by a number of observers when the observers do not know their own positions in the space. Each observer reports the distance from the observer to each target plus a random error. This statistical problem is the basic model for the various forms of what is called multidimensional unfolding in the psychometric literature. Multidimensional unfolding methodology, as developed in the field of cognitive psychology, is essentially a statistical estimation problem in which the data are a set of measures that are monotonic functions of the Euclidean distances between a number of observers and targets in a multidimensional space. The new method presented in this article estimates the target locations and the observer positions when the observations are functions of the squared distances between observers and targets, observed with additive random error, in a two-dimensional space. The method provides robust estimates of the target locations in a multidimensional space for the parametric structure of the data-generating model presented in the article. It also yields estimates of the orientation of the coordinate system and of the mean and variances of the observer locations; the mean and variances are not estimated by standard unfolding methods, which yield target maps that are invariant to rotation of the coordinate system. The data are transformed so that the nonlinearity due to the squared observer locations is removed. The sampling properties of the estimates are derived from the asymptotic variances of the additive errors of a maximum likelihood factor analysis of the sample covariance matrix of the transformed data, augmented with bootstrapping. The robustness of the new method is tested using artificial data, and the method is applied to a 2001 survey data set from Turkey as a real-data example.

16.
A control procedure is presented in this article that is based on jointly using two separate control statistics in the detection and interpretation of signals in a multivariate normal process. The procedure detects the following three situations: (i) a mean vector shift without a shift in the covariance matrix; (ii) a shift in process variation (covariance matrix) without a mean vector shift; and (iii) a simultaneous shift in both the mean vector and the covariance matrix as the result of a change in the parameters of some key process variables. It is shown that, following the occurrence of a signal on either of the separate control charts, the values of both corresponding signaling statistics can be decomposed into interpretable elements. Viewing the two decompositions together helps one to specifically identify the individual components and associated variables that are being affected. These components may include individual means or variances of the process variables as well as the correlations between or among variables. An industrial data set is used to illustrate the procedure.
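As a partial illustration of the mean-shift side of such a scheme, a sketch of a Hotelling T-squared signal statistic and a step-down (MYT-style) decomposition into conditional terms; the companion statistic for covariance shifts and its decomposition are not reproduced.

```python
# Sketch of the mean-vector side of a joint monitoring scheme: a Hotelling
# T^2 signal statistic and a step-down decomposition into conditional terms,
# one per variable in a chosen ordering. The covariance-shift statistic and
# its decomposition are not reproduced here.
import numpy as np

def t2(x, mu, S):
    # Hotelling T^2 for observation x against reference mean mu and covariance S.
    d = np.asarray(x) - np.asarray(mu)
    return float(d @ np.linalg.solve(S, d))

def t2_decomposition(x, mu, S, order=None):
    x, mu, S = np.asarray(x), np.asarray(mu), np.asarray(S)
    p = len(x)
    order = list(range(p)) if order is None else list(order)
    terms, previous = [], 0.0
    for j in range(1, p + 1):
        idx = order[:j]
        current = t2(x[idx], mu[idx], S[np.ix_(idx, idx)])
        terms.append(current - previous)   # conditional contribution of order[j-1]
        previous = current
    return terms                           # the terms sum to the full T^2
```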

17.
The 2001 census in the UK asked for a return of people 'usually living at this address'. But this phrase is fuzzy and may have led to undercount. In addition, analysis of the sex ratios in the 2001 census of England and Wales points to a sex bias in the adjustments for net undercount: too few males in relation to females. The Office for National Statistics's abandonment of the method of demographic analysis for the population of working ages has allowed these biases to creep in. The paper presents a demographic account to check the plausibility of census results. The need to revise preliminary estimates of the national population over a period of years following census day, as experienced in North America and now in the UK, calls into question the feasibility of a one-number census. Looking to the future, the environment for taking a reliable census by conventional methods is deteriorating. The UK Government's proposals for a population register open up the possibility of a Nordic-style administrative record census in the longer term.

18.
In parametric regression models the sign of a coefficient often plays an important role in its interpretation. One possible approach to model selection in these situations is to consider a loss function that formulates prediction of the sign of a coefficient as a decision problem. Taking a Bayesian approach, we extend this idea of a sign-based loss for selection to more complex situations. In generalized additive models we consider prediction of the sign of the derivative of an additive term at a set of predictors. Being able to predict the sign of the derivative at some point (that is, whether a term is increasing or decreasing) is one approach to selection of terms in additive modelling when interpretation is the main goal. For models with interactions, prediction of the sign of a higher order derivative can be used similarly. There are many advantages to our sign-based strategy for selection: one can work in a full or encompassing model without the need to specify priors on a model space and without needing to specify priors on parameters in submodels. Also, avoiding a search over a large model space can simplify computation. We consider shrinkage prior specifications on smoothing parameters that allow for good predictive performance in models with large numbers of terms without the need for selection, and a frequentist calibration of the parameter in our sign-based loss function when it is desired to control a false selection rate for interpretation.
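A minimal sketch of a sign-based decision from posterior draws under a simple 0-1-k loss (charge 1 for declaring the wrong sign, k for declaring no sign); the paper's exact loss function and its frequentist calibration of the parameter are not reproduced.

```python
# Sketch of a sign-based selection decision from posterior draws: under a loss
# charging 1 for the wrong sign and k for declaring no sign, the Bayes rule
# picks the action with smallest posterior expected loss. (The paper's loss
# and its frequentist calibration of k are not reproduced here.)
import numpy as np

def sign_decision(derivative_draws, k=0.2):
    # derivative_draws: posterior samples of an additive term's derivative
    # at a chosen predictor value. Returns +1, -1 or 0 (no selection).
    draws = np.asarray(derivative_draws)
    p_pos = np.mean(draws > 0)
    p_neg = 1.0 - p_pos
    expected_loss = {+1: p_neg, -1: p_pos, 0: k}
    return min(expected_loss, key=expected_loss.get)

# Example: a term whose derivative is positive in roughly 97% of draws.
rng = np.random.default_rng(1)
print(sign_decision(rng.normal(0.5, 0.25, size=4000), k=0.1))
```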

19.
We propose a simple but effective tool to detect possible anomalies in the services prescribed by a health care provider (HP) compared with his/her colleagues in the same field and environment. Our method is based on the concentration function, an extension of the Lorenz curve widely used to describe the uneven distribution of wealth in a population. The proposed tool provides a graphical illustration of possibly anomalous behavior of HPs and can be used as a prescreening device for further investigation of potential medical fraud.
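A sketch of the Lorenz-type concentration curve comparison the tool is built on; the data and labels are purely illustrative.

```python
# Sketch of a Lorenz-type concentration curve: cumulative share of prescribed
# services against cumulative share of patients, for one provider versus the
# pooled peers. Data and labels are illustrative only.
import numpy as np
import matplotlib.pyplot as plt

def lorenz_curve(values):
    v = np.sort(np.asarray(values, dtype=float))
    cum = np.cumsum(v)
    x = np.arange(1, len(v) + 1) / len(v)   # cumulative share of patients
    y = cum / cum[-1]                       # cumulative share of services
    return np.r_[0, x], np.r_[0, y]

rng = np.random.default_rng(0)
peers = rng.gamma(shape=2.0, scale=3.0, size=2000)      # peer prescriptions
provider = rng.gamma(shape=0.5, scale=12.0, size=200)   # more concentrated pattern

for label, data in [("peers", peers), ("provider under review", provider)]:
    plt.plot(*lorenz_curve(data), label=label)
plt.plot([0, 1], [0, 1], linestyle="--", color="grey")   # line of equality
plt.xlabel("cumulative share of patients")
plt.ylabel("cumulative share of prescribed services")
plt.legend()
plt.show()
```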

20.
In multivariate regression, a graphical diagnostic method for detecting observations that are influential in estimating the regression coefficients is introduced. It is based on the principal components, and their variances, obtained from the covariance matrix of the probability distribution of the change in the estimator of the matrix of unknown regression coefficients due to a single-case deletion. As a result, each deletion statistic, obtained in the form of a matrix, is transformed into a two-dimensional quantity. A univariate version is also introduced in a slightly different way. No distributional form is assumed. For illustration, we provide a numerical example in which the graphical method introduced here is seen to be effective in identifying influential observations.
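A sketch of the single-case-deletion change in the coefficient matrix on which such diagnostics are built, computed from the hat matrix without refitting; the paper's principal-component reduction to a two-dimensional quantity is not reproduced.

```python
# Sketch of the single-case-deletion change in the coefficient matrix of a
# multivariate regression, computed without refitting via the standard
# identity B - B_(i) = (X'X)^{-1} x_i e_i' / (1 - h_ii). The reduction of
# this matrix to a two-dimensional quantity is not reproduced here.
import numpy as np

def deletion_changes(X, Y):
    # X: n x p design matrix, Y: n x q response matrix.
    # Returns an array of shape (n, p, q) with the change in B for each deletion.
    XtX_inv = np.linalg.inv(X.T @ X)
    B = XtX_inv @ X.T @ Y
    E = Y - X @ B                                  # n x q residual matrix
    h = np.einsum('ij,jk,ik->i', X, XtX_inv, X)    # leverages h_ii
    changes = np.empty((X.shape[0], X.shape[1], Y.shape[1]))
    for i in range(X.shape[0]):
        changes[i] = np.outer(XtX_inv @ X[i], E[i]) / (1.0 - h[i])
    return changes
```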
