20 similar articles found (search time: 31 ms)
The problems of estimating the reliability function and Pr{X1+...+Xk ≤ Y} are considered. The random variables X1, ..., Xk and Y are assumed to follow binomial and Poisson distributions. Classical estimators available in the literature are discussed and Bayes estimators are derived. The estimators of these parametric functions are built on estimators of the factorial moments of the two distributions.
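The target quantity Pr{X1+...+Xk ≤ Y} can be made concrete with a small sketch. It assumes, purely for illustration, that the X's are independent Binomial(n, p) variables with a common n and p and that Y is an independent Poisson(lam) variable; the abstract does not state which variable follows which law. The function names are hypothetical, and the plug-in estimate below substitutes sample means rather than the factorial-moment or Bayes estimators the abstract refers to.

```python
from math import comb, exp


def binom_pmf(s, n, p):
    """P(S = s) for S ~ Binomial(n, p)."""
    return comb(n, s) * p**s * (1 - p) ** (n - s)


def poisson_cdf(y, lam):
    """P(Y <= y) for Y ~ Poisson(lam); returns 0 for y < 0."""
    if y < 0:
        return 0.0
    term = exp(-lam)  # P(Y = 0)
    total = term
    for j in range(1, y + 1):
        term *= lam / j  # P(Y = j) from P(Y = j - 1)
        total += term
    return total


def prob_sum_le_y(k, n, p, lam):
    """Pr{X1 + ... + Xk <= Y} under the assumptions stated above.

    The sum of k iid Binomial(n, p) variables is Binomial(k * n, p), so the
    probability is sum_s P(S = s) * P(Y >= s).
    """
    m = k * n
    return sum(
        binom_pmf(s, m, p) * (1.0 - poisson_cdf(s - 1, lam)) for s in range(m + 1)
    )


def plug_in_estimate(x_sample, y_sample, k, n):
    """Plug-in estimate: replace p and lam by simple sample-mean estimates."""
    p_hat = sum(x_sample) / (n * len(x_sample))
    lam_hat = sum(y_sample) / len(y_sample)
    return prob_sum_le_y(k, n, p_hat, lam_hat)
```

For example, `plug_in_estimate([1, 2, 0, 1], [3, 5, 4], k=2, n=3)` evaluates the probability at p_hat = 1/3 and lam_hat = 4.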

We introduce a combined two-stage least-squares (2SLS)–expectation maximization (EM) algorithm for estimating vector-valued autoregressive conditional heteroskedasticity models with standardized errors generated by Gaussian mixtures. The procedure incorporates the identification of the parametric settings as well as the estimation of the model parameters. Our approach does not require a priori knowledge of the Gaussian densities. The parametric settings of the 2SLS_EM algorithm are determined by the genetic hybrid algorithm (GHA). We test the GHA-driven 2SLS_EM algorithm on some simulated cases and on international asset pricing data. The statistical properties of the estimated models and the derived mixture densities indicate good performance of the algorithm. We conduct tests on a massively parallel processor supercomputer to cope with situations involving numerous mixtures. We show that the algorithm is scalable.

SUMMARY Using San Francisco city clinic cohort data, we estimate the HIV seroconversion distribution by both non-parametric and parametric methods, and illustrate the effects of age on this distribution. The non-parametric methods include the Turnbull method, the Bacchetti method, the expectation, maximization and smoothing (EMS) method and the penalized spline method. The seroconversion density curves estimated by these nonparametric methods are of bimodal nature with obvious effects of age. As a result of the bimodal nature of the seroconversion curves, the parametric models considered are mixtures of two distributions taken from the generalized log-logistic distribution with three parameters, the Weibull distribution and the log-normal distribution. In terms of the logarithm of the likelihood values, it appears that the non-parametric methods with smoothing as well as without smoothing (i.e. the Turnbull method) provided much better fits than did the parametric models. Among the non-parametric methods, the EMS and the spline estimates are more appealing, because the unsmoothed Turnbull estimates are very unstable and because the Bacchetti estimates have a longer tail. Among the parametric models, the mixture of a generalized log-logistic distribution with three parameters and a Weibull distribution or a log-normal distribution provided better fits than did other mixtures of parametric models.

If the capture probabilities in a capture‐recapture experiment depend on covariates, parametric models may be fitted and the population size may then be estimated. Here a semiparametric model for the capture probabilities that allows both continuous and categorical covariates is developed. Kernel smoothing and profile estimating equations are used to estimate the nonparametric and parametric components. Analytic forms of the standard errors are derived, which allows an empirical bias bandwidth selection procedure to be used to estimate the bandwidth. The method is evaluated in simulations and is applied to a real data set concerning captures of Prinia flaviventris, which is a common bird species in Southeast Asia.

Model-based clustering of Gaussian copulas for mixed data
Clustering of mixed data is important yet challenging due to a shortage of conventional distributions for such data. In this article, we propose a mixture model of Gaussian copulas for clustering mixed data. Indeed copulas, and Gaussian copulas in particular, are powerful tools for easily modeling the distribution of multivariate variables. This model clusters data sets with continuous, integer, and ordinal variables (all having a cumulative distribution function) by considering the intra-component dependencies in a similar way to the Gaussian mixture. Indeed, each component of the Gaussian copula mixture produces a correlation coefficient for each pair of variables and its univariate margins follow standard distributions (Gaussian, Poisson, and ordered multinomial) depending on the nature of the variable (continuous, integer, or ordinal). As an interesting by-product, this model generalizes many well-known approaches and provides tools for visualization based on its parameters. The Bayesian inference is achieved with a Metropolis-within-Gibbs sampler. The numerical experiments, on simulated and real data, illustrate the benefits of the proposed model: flexible and meaningful parameterization combined with visualization features.


This paper is concerned with properties (bias, standard deviation, mean squared error and efficiency) of twenty-six estimators of the intraclass correlation in the analysis of binary data. Our main interest is to study these properties when data are generated from different distributions. For data generation we considered three over-dispersed binomial distributions, namely, the beta-binomial distribution, the probit normal binomial distribution and a mixture of two binomial distributions. The findings regarding bias, standard deviation and mean squared error of all these estimators are that (a) in general, the distributions of biases of most of the estimators are negatively skewed, and the biases are smallest when data are generated from the beta-binomial distribution and largest when data are generated from the mixture distribution; (b) the standard deviations are smallest when data are generated from the beta-binomial distribution; and (c) the mean squared errors are smallest when data are generated from the beta-binomial distribution and largest when data are generated from the mixture distribution. Of the 26, nine estimators, including the maximum likelihood estimator, an estimator based on the optimal quadratic estimating equations of Crowder (1987), and an analysis of variance type estimator, are found to have the least bias, standard deviation and mean squared error. Also, the distributions of the bias, standard deviation and mean squared error for each of these estimators are, in general, more symmetric than those of the other estimators. Our findings regarding efficiency are that the estimator based on the optimal quadratic estimating equations has consistently high efficiency and the least variability in the efficiency results. In the important range in which the intraclass correlation is small (≤0.5), on average, this estimator shows the best efficiency performance. The analysis of variance type estimator seems to do well for larger values of the intraclass correlation. In general, the estimator based on the optimal quadratic estimating equations seems to show the best efficiency performance for data from the beta-binomial distribution and the probit normal binomial distribution, and the analysis of variance type estimator seems to do well for data from the mixture distribution.

The parametric and nonparametric methods for estimating the error rates in linear discriminant analysis are examined in both normal and nonnormal situations. A Monte Carlo experiment was carried out under the assumption that the two population distributions were each characterized by a mixture of two multivariate normal distributions. The bootstrap bias-corrected apparent error rate compares favourably with other available estimators for nonnormal populations with small Mahalanobis distance. The methods for error estimation are also applied to a practical problem in medical diagnosis.

We define a distribution on the unit sphere \(\mathbb {S}^{d-1}\) called the elliptically symmetric angular Gaussian distribution. This distribution, which to our knowledge has not been studied before, is a subfamily of the angular Gaussian distribution closely analogous to the Kent subfamily of the general Fisher–Bingham distribution. Like the Kent distribution, it has ellipse-like contours, enabling modelling of rotational asymmetry about the mean direction, but it has the additional advantages of being simple and fast to simulate from, and having a density and hence likelihood that is easy and very quick to compute exactly. These advantages are especially beneficial for computationally intensive statistical methods, one example of which is a parametric bootstrap procedure for inference for the directional mean that we describe.

We consider the problem of estimating the error variance in a general linear model when the error distribution is assumed to be spherically symmetric, but not necessarily Gaussian. In particular, we study the case of a scale mixture of Gaussians, including the particularly important case of the multivariate-t distribution. Under Stein's loss, we construct a class of estimators that improve on the usual best unbiased (and best equivariant) estimator. Our class has the interesting double robustness property of being simultaneously generalized Bayes (for the same generalized prior) and minimax over the entire class of scale mixtures of Gaussian distributions.

Experience ratemaking plays a crucial role in general insurance in determining future premiums of individuals in a portfolio by assessing observed claims from the whole portfolio. This paper investigates this problem when claims can be modeled by a certain parametric family of distributions. Dirichlet process mixtures are employed to model the distributions of the parameters, with two advantages: they produce exact Bayesian experience premiums for a class of premium principles generated from generic error functions and, at the same time, provide robust and flexible ways to avoid possible bias caused by traditionally used priors such as noninformative or conjugate priors. Bayesian experience ratemaking under Dirichlet process mixture models is investigated and, owing to the lack of analytical forms for the conditional expectations of the quantities concerned, Gibbs sampling schemes are designed to approximate them.

Bimodal truncated count distributions are frequently observed in aggregate survey data and in user ratings when respondents are mixed in their opinion. They also arise in censored count data, where the highest category might create an additional mode. Modeling bimodal behavior in discrete data is useful for various purposes, from comparing shapes of different samples (or survey questions) to predicting future ratings by new raters. The Poisson distribution is the most common distribution for fitting count data and can be modified to achieve mixtures of truncated Poisson distributions. However, it is suitable only for modeling equidispersed distributions and is limited in its ability to capture bimodality. The Conway–Maxwell–Poisson (CMP) distribution is a two-parameter generalization of the Poisson distribution that allows for over- and underdispersion. In this work, we propose a mixture of CMPs for capturing a wide range of truncated discrete data, which can exhibit unimodal and bimodal behavior. We present methods for estimating the parameters of a mixture of two CMP distributions using an EM approach. Our approach introduces a special two-step optimization within the M step to estimate multiple parameters. We examine computational and theoretical issues. The methods are illustrated for modeling ordered rating data as well as truncated count data, using simulated and real examples.
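As a rough illustration of the model described above, the sketch below evaluates a two-component mixture of truncated CMP distributions. It uses the standard CMP form, in which P(X = x) is proportional to lam^x / (x!)^nu, restricted to a finite support {lo, ..., hi}. The function names and the parameter values in the usage note are hypothetical, and the EM fitting procedure from the abstract is not reproduced.

```python
from math import exp, lgamma, log


def truncated_cmp_pmf(lam, nu, lo, hi):
    """pmf of a CMP(lam, nu) variable truncated to the integers lo..hi.

    Weights lam^x / (x!)^nu are computed on the log scale and normalized over
    the truncated support, so no infinite normalizing series is needed.
    """
    log_w = [x * log(lam) - nu * lgamma(x + 1) for x in range(lo, hi + 1)]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [exp(v - shift) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


def cmp_mixture_pmf(weight, comp1, comp2, lo, hi):
    """Mixture weight * CMP(comp1) + (1 - weight) * CMP(comp2) on lo..hi.

    comp1 and comp2 are (lam, nu) pairs.
    """
    p1 = truncated_cmp_pmf(comp1[0], comp1[1], lo, hi)
    p2 = truncated_cmp_pmf(comp2[0], comp2[1], lo, hi)
    return [weight * a + (1.0 - weight) * b for a, b in zip(p1, p2)]
```

For a 1-to-10 rating scale one might call `cmp_mixture_pmf(0.4, (1.5, 1.2), (40.0, 1.6), 1, 10)`; setting nu = 1 in a component reduces it to a truncated Poisson.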

The property of identifiability is an important consideration in estimating the parameters of a mixture of distributions. Also, classification of a random variable based on a mixture can be meaningfully discussed only if the class of all finite mixtures is identifiable. The problem of identifiability of finite mixtures of Gompertz distributions is studied. A procedure is presented for finding maximum likelihood estimates of the parameters of a mixture of two Gompertz distributions, using classified and unclassified observations. Estimation of a nonlinear discriminant function based on small samples is considered. Through simulation experiments, the performance of the corresponding estimated nonlinear discriminant function is investigated.

Grouped data are commonly encountered in applications. All data from a continuous population are grouped due to rounding of the individual observations. The Bernstein polynomial model is proposed as an approximate model in this paper for estimating a univariate density function based on grouped data. The coefficients of the Bernstein polynomial, as the mixture proportions of beta distributions, can be estimated using an EM algorithm. The optimal degree of the Bernstein polynomial can be determined using a change-point estimation method. The rate of convergence of the proposed density estimate to the true density is proved to be almost parametric by an acceptance–rejection argument used for generating random numbers. The proposed method is compared with some existing methods in a simulation study and is applied to the Chicken Embryo Data.
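The idea of a Bernstein polynomial density as a mixture of beta densities, with mixture proportions updated by EM from grouped counts, can be sketched as follows. This is a minimal illustration assuming data on [0, 1], a fixed degree m, and the textbook EM update for mixture proportions with known components; the function names are hypothetical, and the change-point choice of the degree is not covered.

```python
from math import comb


def beta_density(x, i, m):
    """Density of Beta(i + 1, m - i + 1) at x in [0, 1]."""
    return (m + 1) * comb(m, i) * x**i * (1 - x) ** (m - i)


def beta_cdf(x, i, m):
    """CDF of Beta(i + 1, m - i + 1), via the binomial-tail identity."""
    n = m + 1
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(i + 1, n + 1))


def bernstein_density(x, weights):
    """Degree-m Bernstein density: sum_i weights[i] * Beta(i + 1, m - i + 1)."""
    m = len(weights) - 1
    return sum(w * beta_density(x, i, m) for i, w in enumerate(weights))


def em_step(weights, edges, counts):
    """One EM update of the mixture proportions from grouped data.

    edges: bin boundaries 0 = t_0 < ... < t_B = 1; counts: observations per bin.
    Assumes all weights are positive so every bin has positive mixture mass.
    """
    m = len(weights) - 1
    n_bins = len(counts)
    # Probability that component i puts an observation into bin b.
    bin_prob = [
        [beta_cdf(edges[b + 1], i, m) - beta_cdf(edges[b], i, m) for b in range(n_bins)]
        for i in range(m + 1)
    ]
    new = [0.0] * (m + 1)
    for b, n_b in enumerate(counts):
        if n_b == 0:
            continue
        mix = sum(weights[i] * bin_prob[i][b] for i in range(m + 1))
        for i in range(m + 1):
            # E-step responsibility of component i for bin b, weighted by count.
            new[i] += n_b * weights[i] * bin_prob[i][b] / mix
    total = sum(counts)
    return [v / total for v in new]
```

Iterating `em_step` from uniform starting weights until the proportions stabilize gives the fitted coefficients, and `bernstein_density` then evaluates the resulting density estimate.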

The orthogonalization of undesigned experiments is introduced to increase the statistical precision of the estimated regression coefficients. The goals are to minimize the covariance and the bias of the least squares estimator for estimating the path of steepest ascent (SA) that leads users toward the neighbourhood of the optimum response. An orthogonal design is established to decrease the inverse determinant of X′X and the angle between the true and the estimated SA paths. For orthogonalization of an undesigned matrix, our proposed solution is built on the modified Gram–Schmidt strategy related to the process of Gaussian elimination. The proposed solution offers an orthogonal basis, in full working accuracy, for the space spanned by the columns of the original matrix.
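For readers unfamiliar with the modified Gram–Schmidt procedure mentioned above, here is a minimal sketch of the textbook version. It is not the authors' Gaussian-elimination-based variant; the function name and the list-of-columns representation are assumptions made for illustration.

```python
def modified_gram_schmidt(columns):
    """Orthonormalize linearly independent column vectors (lists of floats).

    Unlike classical Gram-Schmidt, each remaining column is updated as soon as
    a new orthonormal direction is available, which is numerically more stable.
    """
    q = [list(col) for col in columns]  # work on a copy
    for j in range(len(q)):
        norm = sum(v * v for v in q[j]) ** 0.5
        if norm == 0.0:
            raise ValueError("columns are linearly dependent")
        q[j] = [v / norm for v in q[j]]
        for k in range(j + 1, len(q)):
            # Remove the component of column k along the new direction q[j].
            proj = sum(a * b for a, b in zip(q[j], q[k]))
            q[k] = [b - proj * a for a, b in zip(q[j], q[k])]
    return q
```

Applied to the columns of a design matrix, the returned vectors span the same column space and are mutually orthogonal with unit length.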

When estimating loss distributions in insurance, large and small losses are usually split because it is difficult to find a simple parametric model that fits all claim sizes. This approach involves determining the threshold level between large and small losses. In this article, a unified approach to the estimation of loss distributions is presented. We propose an estimator obtained by transforming the data set with a modification of the Champernowne cdf and then estimating the density of the transformed data by use of the classical kernel density estimator. We investigate the asymptotic bias and variance of the proposed estimator. In a simulation study, the proposed method shows a good performance. We also present two applications dealing with claims costs in insurance.

The authors propose two tests, one parametric and the other semiparametric, for testing bias of estimating equations in weighted regression with partially missing covariates when the primary regression model is correctly specified. More generally, the proposed tests may be thought of as a diagnostic tool for the combined package of the primary regression model and the missingness assumptions. The asymptotic null distributions of the two test statistics are derived under the assumption of missingness at random for the partially missing covariates. A small scale simulation study completes the work.

Summary. As biological knowledge accumulates rapidly, gene networks encoding genomewide gene–gene interactions have been constructed. As an improvement over the standard mixture model, which treats all genes as independently and identically distributed a priori, Wei and co-workers have proposed modelling a gene network as a discrete or Gaussian Markov random field (MRF) in a mixture model to analyse genomic data. However, how these methods compare in practical applications is not well understood, and investigating this is the aim here. We also propose two novel constraints in prior specifications for the Gaussian MRF model and a fully Bayesian approach to the discrete MRF model. We assess the accuracy of estimating the false discovery rate by posterior probabilities in the context of MRF models. Applications to a chromatin immuno-precipitation–chip data set and simulated data show that the modified Gaussian MRF models have superior performance compared with other models, and both MRF-based mixture models, with reasonable robustness to misspecified gene networks, outperform the standard mixture model.

Given a prior distribution for a model, the prior information specified on a nested submodel by means of a conditioning procedure crucially depends on the parameterisation used to describe the model. Regression coefficients represent the most common parameterisation of Gaussian DAG models. Nevertheless, in the specification of prior distributions, invariance considerations lead to the use of different parameterisations of the model, depending on the required invariance class. In this paper we consider the problem of prior specification by conditioning on zero regression coefficients, show that such a procedure is also invariant with respect to a class of parameterisations, and characterise that class.

The two parametric distribution functions appearing in extreme-value theory, the generalized extreme-value distribution and the generalized Pareto distribution, have log-concave densities if the extreme-value index γ ∈ [−1, 0]. Replacing the order statistics in tail-index estimators by their corresponding quantiles from the distribution function that is based on the estimated log-concave density \(\hat{f}_n\) leads to novel smooth quantile and tail-index estimators. These new estimators aim at estimating the tail index, especially in small samples. Acting as a smoother of the empirical distribution function, the log-concave distribution function estimator reduces estimation variability to a much greater extent than it introduces bias. As a consequence, Monte Carlo simulations demonstrate that the smoothed versions of the estimators are clearly superior to their non-smoothed counterparts in terms of mean-squared error.

Let \(\mathbf {X} = (X_1,\ldots ,X_p)\) be a stochastic vector having joint density function \(f_{\mathbf {X}}(\mathbf {x})\) with partitions \(\mathbf {X}_1 = (X_1,\ldots ,X_k)\) and \(\mathbf {X}_2 = (X_{k+1},\ldots ,X_p)\). A new method for estimating the conditional density function of \(\mathbf {X}_1\) given \(\mathbf {X}_2\) is presented. It is based on locally Gaussian approximations, but simplified in order to tackle the curse of dimensionality in multivariate applications, where both response and explanatory variables can be vectors. We compare our method to some available competitors, and the error of approximation is shown to be small in a series of examples using real and simulated data, and the estimator is shown to be particularly robust against noise caused by independent variables. We also present examples of practical applications of our conditional density estimator in the analysis of time series. Typical values for k in our examples are 1 and 2, and we include simulation experiments with values of p up to 6. Large sample theory is established under a strong mixing condition.
