Similar documents (20 results)
1.
Projections of AIDS incidence are critical for assessing future healthcare needs. This paper focuses on the method of back-calculation for obtaining forecasts. The first problem faced was the need to account for reporting delays and underreporting of cases and to adjust the incidence data accordingly. The method used to estimate the reporting-delay distribution is based on Poisson regression and involves cross-classifying each reported case by calendar time of diagnosis and reporting delay. The adjusted AIDS incidence data are then used to obtain short-term projections and lower bounds on the size of the AIDS epidemic. The estimation procedure 'back-calculates' from AIDS incidence data, using the incubation period distribution, to obtain estimates of the numbers previously infected; these numbers are then projected forward. The problem can be shown to reduce to estimating the size of a multinomial population. The expectation-maximization (EM) algorithm is used to obtain maximum-likelihood estimates when the density of infection times is parametrized as a step function. The methodology is applied to AIDS incidence data in Portugal for four transmission categories: injecting drug users, sexual transmission (homosexual/bisexual and heterosexual contact) and other, mainly haemophilia- and blood-transfusion-related, to obtain short-term projections and an estimate of the minimum size of the epidemic.
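As a rough illustration of the back-calculation step (a sketch, not the paper's exact model), the snippet below deconvolves infection counts from diagnosis counts with a Richardson-Lucy-style EM update for Poisson counts; the case series, incubation distribution, and right-truncation correction are all hypothetical:

```python
# Hypothetical sketch: the incubation pmf and case counts are made up,
# and the EM update is the generic Richardson-Lucy form for Poisson
# deconvolution, not the paper's exact procedure.

def back_calculate(cases, incubation, n_iter=2000):
    """Estimate infections per period from diagnosis counts.

    cases[t]      : delay-adjusted AIDS diagnoses in period t
    incubation[d] : P(diagnosed d periods after infection)
    """
    T = len(cases)
    theta = [sum(cases) / T] * T                      # flat start
    for _ in range(n_iter):
        # expected diagnoses per period under the current theta
        mu = [sum(theta[s] * incubation[t - s]
                  for s in range(t + 1) if t - s < len(incubation))
              for t in range(T)]
        new = []
        for s in range(T):
            # chance an infection in period s is diagnosed by period T-1
            seen = sum(incubation[:T - s])
            num = sum(cases[t] * incubation[t - s] / mu[t]
                      for t in range(s, T)
                      if t - s < len(incubation) and mu[t] > 0)
            new.append(theta[s] * num / seen if seen > 0 else theta[s])
        theta = new
    return theta
```

At the fixed point the fitted diagnoses reproduce the observed series; infections in recent periods are scaled up by `seen` to account for cases not yet diagnosed.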

2.
In this paper, we conducted a simulation study to evaluate the performance of four algorithms: multinomial logistic regression (MLR), bagging (BAG), random forest (RF), and gradient boosting (GB), for estimating the generalized propensity score (GPS). As with the propensity score (PS), the ultimate goal of using the GPS is to estimate unbiased average treatment effects (ATEs) in observational studies. We used the GPS estimates computed from these four algorithms with the generalized doubly robust (GDR) estimator to estimate ATEs in observational studies. We evaluated these ATE estimates in terms of bias and mean squared error (MSE). Simulation results show that overall, the GB algorithm produced the best ATE estimates based on these evaluation criteria. Thus, we recommend using the GB algorithm for estimating the GPS in practice.
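The GDR estimator itself is involved; as a hedged sketch, the function below shows only how GPS estimates (from any of the four algorithms) enter a weighted ATE computation, using plain inverse-probability weighting in place of the paper's GDR estimator:

```python
# Illustrative only: plain IPW, not the generalized doubly robust (GDR)
# estimator used in the paper. The gps values would come from MLR,
# bagging, random forest, or gradient boosting in practice.

def ipw_means(y, treat, gps, levels):
    """Weighted mean outcome per treatment level.

    y[i]      : observed outcome for unit i
    treat[i]  : treatment level unit i received
    gps[i][k] : estimated P(treat = levels[k] | covariates of unit i)
    """
    means = {}
    for k, lev in enumerate(levels):
        num = sum(y[i] / gps[i][k] for i in range(len(y)) if treat[i] == lev)
        den = sum(1.0 / gps[i][k] for i in range(len(y)) if treat[i] == lev)
        means[lev] = num / den
    return means
```

ATEs are then contrasts of the returned means, e.g. `means[1] - means[0]`.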

3.
Consider the problem of estimating the mean of a p (≥3)-variate multi-normal distribution with identity variance-covariance matrix and with unweighted sum of squared error loss. A class of minimax, noncomparable (i.e. no estimate in the class dominates any other estimate in the class) estimates is proposed; the class contains rules dominating the simple James-Stein estimates. The estimates are essentially smoothed versions of the scaled, truncated James-Stein estimates studied by Efron and Morris. Explicit and analytically tractable expressions for their risks are obtained and are used to give guidelines for selecting estimates within the class.
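For intuition, here is the plain positive-part James-Stein rule that the paper's smoothed, truncated variants refine (shrinkage toward the origin, identity covariance assumed):

```python
def james_stein_plus(x):
    """Positive-part James-Stein estimate of a p-variate normal mean
    (p >= 3, identity covariance), shrinking x toward the origin."""
    p = len(x)
    s2 = sum(v * v for v in x)
    shrink = max(0.0, 1.0 - (p - 2) / s2)   # never reverses sign
    return [shrink * v for v in x]
```

The positive-part truncation is what makes the rule dominate the raw James-Stein estimate; the paper's class consists of smooth versions of exactly this kind of truncation.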

4.
A new methodology is developed for estimating unemployment or employment characteristics in small areas, based on the assumption that the sample totals of unemployed and employed individuals follow a multinomial logit model with random area effects. The method is illustrated with UK labour force data aggregated by sex–age groups. For these data, the accuracy of direct estimates is poor in comparison with estimates that are derived from the multinomial logit model. Furthermore, two different estimators of the mean-squared errors are given: an analytical approximation obtained by Taylor linearization and an estimator based on bootstrapping. A simulation study comparing the two estimators shows the good performance of the bootstrap estimator.

5.
In this paper, the method of Hocking and Oxspring (1971) to estimate multinomial probabilities when full and partial data are available for some cells is extended to estimate the cell probabilities of a contingency table with structural zeros. The estimates are maximum likelihood, and the process is sequential. The gain in precision is due to the use of partial data, and the bias of the estimates is also investigated.

6.

The gambler's ruin problem is one of the most important problems in the emergence of probability. The problem has long been considered “solved” from a probabilistic viewpoint. However, we do not find the solution satisfactory. In this paper, the problem is recast as a statistical one. Bounds on the estimate are derived over wide classes of priors. Interestingly, the probabilistic estimates ω(1/2) are identified as the most conservative solutions, while the plug-in estimates are found to lie outside the bounds. This implies that, although conservative, the probabilistic estimates ω(1/2) are justified by our analysis, while the plug-in estimates are too extreme for estimating the gambler's ruin probability.
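For reference, the classical probabilistic solution that the paper re-examines is elementary; the textbook ruin probability for a unit-stake game is:

```python
def ruin_probability(a, b, p):
    """P(a gambler starting with a is ruined before reaching a + b),
    winning each unit bet independently with probability p."""
    if p == 0.5:
        return b / (a + b)                 # fair-game case
    r = (1 - p) / p                        # odds ratio against the gambler
    return (r ** a - r ** (a + b)) / (1 - r ** (a + b))
```

The paper's point is that when p itself must be estimated from data, plugging an estimate of p into this formula behaves very differently from a properly hedged statistical estimate of the ruin probability.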

7.
A Bayesian method is proposed for estimating the cell probabilities of several multinomial distributions. Parameters of different distributions are taken to be a priori exchangeable. The prior specification is based upon mixtures of a hierarchical distribution, referred to as the multivariate “Dirichlet-Dirichlet” distribution. The analysis is facilitated by a multinomial approximation relating to the multinomial-Dirichlet distribution. The posterior estimates depend upon measures of entropy for the various distributions and shrink the individual observed proportions towards values obtained by pooling the data across the distributions. As well as incorporating prior information, they are particularly useful when some of the cell frequencies are zero. We use them to investigate a numerical classification of males of various vocations, according to cause of death.
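The single-table building block of such estimates is the Dirichlet posterior mean; a minimal sketch follows (the paper's "Dirichlet-Dirichlet" hierarchy additionally pools across several distributions, which is omitted here):

```python
def dirichlet_posterior_mean(counts, alpha=1.0):
    """Posterior mean cell probabilities under a symmetric Dirichlet(alpha)
    prior: observed proportions are shrunk toward uniformity, and cells
    with zero observed frequency still receive positive estimates."""
    k, n = len(counts), sum(counts)
    return [(c + alpha) / (n + k * alpha) for c in counts]
```

In the hierarchical version, the shrinkage target is itself estimated by pooling data across the exchangeable distributions rather than fixed at the uniform distribution.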

8.
Population-level proportions of individuals that fall at different points in the spectrum [of disease severity], from asymptomatic infection to severe disease, are often difficult to observe, but estimating these quantities can provide information about the nature and severity of the disease in a particular population. Logistic and multinomial regression techniques are often applied to infectious disease modeling of large populations and are suited to identifying variables associated with a particular disease or disease state. However, they are less appropriate for estimating infection state prevalence over time because they do not naturally accommodate known disease dynamics like the duration of time an individual is infectious, heterogeneity in the risk of acquiring infection, and patterns of seasonality. We propose a Bayesian compartmental model to estimate latent infection state prevalence over time that easily incorporates known disease dynamics. We demonstrate how and why a stochastic compartmental model is a better approach for determining infection state proportions than multinomial regression is by using a novel method for estimating Bayes factors for models with high-dimensional parameter spaces. We provide an example using visceral leishmaniasis in Brazil and present an empirically adjusted reproductive number for the infection.

9.
We address the problem of estimating the proportions of two statistical populations in a given mixture on the basis of an unlabeled sample of n-dimensional observations on the mixture. Assuming that the expected values of observations on the two populations are known, we show that almost any linear map from R^n to R^1 yields an unbiased consistent estimate of the proportion of one population in a very easy way. We then find that linear map for which the resulting proportion estimate has minimum variance among all estimates so obtained. After deriving a simple expression for the minimum-variance estimate, we discuss practical aspects of obtaining this and related estimates.
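A sketch of the basic estimator, assuming the population means m1 and m2 are known and a is any linear map with a·m1 ≠ a·m2 (the paper's minimum-variance choice of a is not derived here):

```python
def proportion_estimate(sample, m1, m2, a):
    """Unbiased estimate of the weight w of population 1 in the mixture.
    Since E[a.X] = w (a.m1) + (1 - w)(a.m2), solving for w gives the
    expression below, with E[a.X] replaced by the sample average."""
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    xbar = [sum(x[j] for x in sample) / len(sample)
            for j in range(len(sample[0]))]
    return (dot(a, xbar) - dot(a, m2)) / (dot(a, m1) - dot(a, m2))
```

Any a with a·m1 ≠ a·m2 gives an unbiased, consistent estimate; the paper's contribution is identifying the a that minimizes the variance of the result.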

10.
In nonlinear random coefficients models, the means or variances of response variables may not exist. In such cases, commonly used estimation procedures, e.g., (extended) least-squares (LS) and quasi-likelihood methods, are not applicable. This article solves this problem by proposing an estimate based on percentile estimating equations (PEE). This method does not require full distribution assumptions and leads to efficient estimates within the class of unbiased estimating equations. By minimizing the asymptotic variance of the PEE estimates, the optimum percentile estimating equations (OPEE) are derived. Several examples, including Weibull regression, show the flexibility of the PEE estimates. Under certain regularity conditions, the PEE estimates are shown to be strongly consistent and asymptotically normal, and the OPEE estimates have the minimal asymptotic variance. Compared with the parametric maximum likelihood estimates (MLE), the asymptotic efficiency of the OPEE estimates is more than 98%, while the LS-type procedures can have infinite variances. When the observations have outliers or do not follow the distributions considered in the model assumptions, the article shows that OPEE is more robust than the MLE, and the asymptotic efficiency in the model misspecification cases can be above 150%.

11.
The problem of estimating the total number of trials n in a binomial distribution is reconsidered in this article for both cases of known and unknown probability of success p from the Bayesian viewpoint. Bayes and empirical Bayes point estimates for n are proposed under the assumption of a left-truncated prior distribution for n and a beta prior distribution for p. Simulation studies are provided in order to compare the proposed estimate with the most familiar estimates of n.
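A grid-based sketch of the known-p case, with a user-supplied prior pmf on n (the paper additionally treats unknown p with a beta prior, left-truncated priors on n, and empirical Bayes variants):

```python
import math

def bayes_estimate_n(x, p, prior, n_max):
    """Posterior-mean estimate of the binomial trial count n given one
    observation x and known success probability p; prior(n) is any
    (possibly truncated) prior pmf, evaluated on the grid x..n_max."""
    w = {n: prior(n) * math.comb(n, x) * p ** x * (1 - p) ** (n - x)
         for n in range(x, n_max + 1)}
    z = sum(w.values())
    return sum(n * wn for n, wn in w.items()) / z
```

Truncating the prior on the left (prior(n) = 0 below some n0) is handled automatically, since those grid points simply get zero weight.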

12.
A Comprehensive Measurement of the Urbanization Level of Tianjin
Taking population urbanization as the sole indicator of the level of urbanization does not seem to reflect that level fully and concretely, yet a genuine understanding and measurement of urbanization is of real significance for China's urban construction and national economic development. Addressing this problem, the paper analyses the one-sidedness of population urbanization and proposes a new concept of urbanization. On this basis, drawing on Newton's second law, a comprehensive measurement model of the urbanization level based on evaluating urban development is established. The model is used to measure Tianjin's urbanization level, as well as the 2010 urbanization levels of selected Chinese provinces and municipalities. Comparing these results with levels measured using population urbanization as the sole indicator demonstrates the reasonableness of measuring urbanization with the comprehensive measurement model.

13.
The problem of estimating the mean average total cost of each output for multiproduct firms in an industry is addressed. The identity that defines total cost for each firm as the product of output levels multiplied by their respective average total costs is viewed as a random-coefficients model. A random coefficients regression estimator is used to estimate mean average total output costs. Solutions to problems arising with this method in empirical studies are discussed. An application of the approach to data from cash grain farms in Illinois shows that the method gives reliable estimates for per-unit output production costs with considerably fewer data requirements than current methods of cost estimation.
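The cost identity can be sketched with ordinary least squares through the origin (two outputs, hypothetical data); the random-coefficients estimator generalizes this by letting unit costs vary across firms, which this sketch ignores:

```python
# Illustrative only: plain OLS through the origin on the cost identity
# TC_i = q_i1 * c_1 + q_i2 * c_2, solved via the 2x2 normal equations.
# The paper's random-coefficients estimator reweights this regression.

def mean_unit_costs(Q, tc):
    """Estimate mean per-unit costs (c1, c2) from firm output levels
    Q[i] = (q_i1, q_i2) and observed total costs tc[i]."""
    a = sum(q[0] * q[0] for q in Q)          # entries of Q'Q
    b = sum(q[0] * q[1] for q in Q)
    d = sum(q[1] * q[1] for q in Q)
    r0 = sum(q[0] * t for q, t in zip(Q, tc))   # entries of Q'tc
    r1 = sum(q[1] * t for q, t in zip(Q, tc))
    det = a * d - b * b
    return (d * r0 - b * r1) / det, (a * r1 - b * r0) / det
```

With firm-specific random unit costs, the error variance depends on the output levels, which is why the paper's estimator replaces this homoskedastic OLS step.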

14.
A simple least squares method for estimating a change in mean of a sequence of independent random variables is studied. The method first tests for a change in mean based on the regression principle of constrained and unconstrained sums of squares. Conditionally on a decision by this test that a change has occurred, least squares estimates are used to estimate the change point, the initial mean level (prior to the change point) and the change itself. The estimates of the initial level and change are functions of the change point estimate. All estimates are shown to be consistent, and those for the initial level and change are shown to be asymptotically jointly normal. The method performs well for moderately large shifts (one standard deviation or more), but the estimates of the initial level and change are biased in a predictable way for small shifts. The large sample theory is helpful in understanding this problem. The asymptotic distribution of the change point estimator is obtained for local shifts in mean, but the case of non-local shifts appears analytically intractable.
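The least-squares step can be sketched as a grid search over candidate change points, choosing the split that minimizes the two-segment residual sum of squares (the preliminary test for whether a change occurred at all is omitted):

```python
def change_point_ls(y):
    """Least-squares change-point estimate for a single mean shift.

    Returns (tau, mu1, delta): the second segment starts at index tau,
    mu1 is the initial level, delta the estimated shift, minimizing
    the combined residual sum of squares over the two segments."""
    n = len(y)
    best = None
    for tau in range(1, n):                  # segments y[:tau], y[tau:]
        m1 = sum(y[:tau]) / tau
        m2 = sum(y[tau:]) / (n - tau)
        sse = (sum((v - m1) ** 2 for v in y[:tau])
               + sum((v - m2) ** 2 for v in y[tau:]))
        if best is None or sse < best[0]:
            best = (sse, tau, m1, m2 - m1)
    _, tau, m1, delta = best
    return tau, m1, delta
```

Note how the level and shift estimates are computed from the fitted change point, exactly as described: they inherit whatever bias the change-point estimate has, which is the source of the small-shift bias the paper analyzes.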

15.
We consider estimation of the number of cells in a multinomial distribution. This is one version of the species problem: there are many applications, such as estimating the number of unobserved species of animals, or estimating vocabulary size. We describe the results of a simulation comparison of three principal 'frequentist' procedures for estimating the number of cells (or species). The first procedure postulates a functional form for the cell probabilities; the second procedure approximates the distribution of the probabilities by a parametric probability density function; and the third procedure is based on an estimate of the sample coverage, i.e. the sum of the probabilities of the observed cells. Among the procedures studied, we find that the third (non-parametric) method is globally preferable; the second (functional parametric) method cannot be recommended; and that, when based on the inverse Gaussian density, the first method is competitive in some cases with the third method. We also discuss Sichel's recent generalized inverse Gaussian-based procedure which, with some refinement, promises to perform at least as well as the non-parametric method in all cases.
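A minimal version of the coverage idea uses the Good-Turing estimate Ĉ = 1 − f1/n, where f1 is the number of singleton cells; this is far simpler than the procedures compared in the paper and is shown only to make the 'sample coverage' notion concrete:

```python
def coverage_estimate_cells(counts):
    """Good-Turing-style sketch of a coverage-based cell-count estimate.

    counts : observed frequency of each observed cell (all >= 1).
    Coverage C-hat = 1 - f1/n; estimated total cells = observed / C-hat.
    Illustrative only, not the procedure evaluated in the paper."""
    n = sum(counts)
    f1 = sum(1 for c in counts if c == 1)    # cells seen exactly once
    c_hat = 1 - f1 / n
    if c_hat == 0:
        raise ValueError("all cells are singletons; estimate undefined")
    return len(counts) / c_hat
```

The intuition is that singletons signal unseen cells: the more cells seen only once, the lower the estimated coverage and the larger the inferred total.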

16.
New robust estimates for variance components are introduced. Two simple models are considered: the balanced one-way classification model with a random factor and the balanced mixed model with one random factor and one fixed factor. However, the method of estimation proposed can be extended to more complex models. The new method of estimation we propose is based on the relationship between the variance components and the coefficients of the least-mean-squared-error predictor between two observations of the same group. This relationship enables us to transform the problem of estimating the variance components into the problem of estimating the coefficients of a simple linear regression model. The variance-component estimators derived from the least-squares regression estimates are shown to coincide with the maximum-likelihood estimates. Robust estimates of the variance components can be obtained by replacing the least-squares estimates by robust regression estimates. In particular, a Monte Carlo study shows that for outlier-contaminated normal samples, the estimates of variance components derived from GM regression estimates and the derived test outperform other robust procedures.
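For comparison, the classical (non-robust) ANOVA method-of-moments estimates for the balanced one-way random-effects model are straightforward; the paper's proposal instead recovers the components from a regression between same-group observations, where the least-squares step can be swapped for a robust one:

```python
def anova_variance_components(groups):
    """(sigma2_error, sigma2_group) for a balanced one-way random model,
    via the classical ANOVA method of moments:
    sigma2_group = (MSB - MSW) / n, truncated at zero."""
    k, n = len(groups), len(groups[0])
    means = [sum(g) / n for g in groups]
    grand = sum(means) / k
    msw = sum(sum((x - m) ** 2 for x in g)
              for g, m in zip(groups, means)) / (k * (n - 1))
    msb = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    return msw, max(0.0, (msb - msw) / n)
```

Both MSB and MSW are sensitive to outliers, which motivates the paper's route through a regression whose estimator can be robustified.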

17.
Previously, a Bayesian anomaly was reported for estimating reliability when subsystem failure data and system failure data were obtained from the same time period. As a result, a practical method for mitigating the Bayesian anomaly was developed. In the first part of this paper, however, we show that the Bayesian anomaly can be avoided as long as the same failure information is incorporated in the model. In the second part of this paper, we consider the problem of estimating Bayesian reliability when the failure count data on subsystems and systems are obtained from the same time period. We show that the Bayesian anomaly does not exist when using the multinomial distribution with the Dirichlet prior distribution. A numerical example is given to compare the proposed method with the previous methods.

18.
We revisit the problem of estimating the proportion π of true null hypotheses when a large number of parallel hypothesis tests are performed independently. While the proportion is a quantity of interest in its own right in applications, the problem has arisen in assessing or controlling an overall false discovery rate. On the basis of a Bayes interpretation of the problem, the marginal distribution of the p-value is modeled as a mixture of the uniform distribution (null) and a non-uniform distribution (alternative), so that the parameter π of interest is characterized as the mixing proportion of the uniform component in the mixture. In this article, a nonparametric exponential mixture model is proposed to fit the p-values. As an alternative to the convex decreasing mixture model, the exponential mixture model has the advantages of identifiability, flexibility, and regularity. A computation algorithm is developed. The new approach is applied to a leukemia gene expression data set where multiple significance tests over 3,051 genes are performed. The new estimate of π for the leukemia data appears to be about 10% lower than the three other estimates that are known to be conservative. Simulation results also show that the new estimate is usually lower and has smaller bias than the other three estimates.
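A common baseline (not the paper's exponential-mixture estimate) is a Storey-type estimator: null p-values are Uniform(0, 1), so the fraction of p-values exceeding a threshold λ estimates π(1 − λ):

```python
def pi0_estimate(pvals, lam=0.5):
    """Storey-type estimate of the proportion of true nulls.
    P(p > lam) is roughly pi0 * (1 - lam) when alternatives rarely
    produce large p-values; the estimate is capped at 1."""
    m = len(pvals)
    return min(1.0, sum(p > lam for p in pvals) / ((1 - lam) * m))
```

Estimators of this type are conservative (biased upward) because some alternative p-values also exceed λ, which is the behavior the paper's mixture-model estimate aims to improve on.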

19.
20.
A method for estimating the asymptotic standard error of the sample median based on generalized least squares is outlined. The practical problems of implementing this new estimate, along with those associated with two existing estimates, are discussed. Finally, a simulation study is presented to compare the three estimates.
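A bootstrap estimate is a simple alternative to compare against (illustrative function and defaults, not one of the paper's three estimates):

```python
import random
import statistics

def median_se_bootstrap(x, n_boot=2000, seed=0):
    """Bootstrap estimate of the standard error of the sample median:
    resample with replacement, take each resample's median, and report
    the standard deviation of those medians."""
    rng = random.Random(seed)
    meds = [statistics.median(rng.choices(x, k=len(x)))
            for _ in range(n_boot)]
    return statistics.stdev(meds)
```

Unlike the GLS-based estimate, the bootstrap needs no density estimation at the median, at the cost of resampling noise.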

