Similar articles
1.
Generating samples from a two-stage distribution is an important part of the study of mixture models. These samples are used to examine estimation procedures, and other properties of the mixture model. In this paper we present an exemplary sampling method for generating data from the mixed distribution. This method uses the order statistic spacings of the mixing distribution and random sampling from the distribution conditional on the mixing variable to produce samples from the mixed distribution. We show that this exemplary procedure often produces data with an empirical distribution function closer to the mixed distribution than the Method of Composition. We illustrate the method with an example.
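
For orientation, the Method of Composition that the abstract uses as its baseline can be sketched as follows. This is a minimal illustration, not the paper's exemplary sampling method: the gamma mixing distribution, the exponential conditional distribution, and the function name sample_by_composition are all assumptions chosen only to make the two stages concrete.

```python
import random


def sample_by_composition(n, shape, scale, seed=None):
    """Draw n values from a gamma mixture of exponentials (assumed example).

    Stage 1 draws the mixing variable (a rate) from Gamma(shape, scale).
    Stage 2 draws the observation from Exponential(rate) given that rate.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        rate = rng.gammavariate(shape, scale)   # stage 1: mixing variable
        samples.append(rng.expovariate(rate))   # stage 2: conditional draw
    return samples


if __name__ == "__main__":
    draws = sample_by_composition(5, shape=3.0, scale=1.0, seed=0)
    print(draws)
```

Each draw uses a fresh, independent value of the mixing variable. The abstract's procedure instead works from the order statistic spacings of the mixing distribution, which this sketch does not attempt.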

2.
A method for inducing a desired rank correlation matrix on a multivariate input random variable for use in a simulation study is introduced in this paper. This method is simple to use, is distribution free, preserves the exact form of the marginal distributions on the input variables, and may be used with any type of sampling scheme for which correlation of input variables is a meaningful concept. A Monte Carlo study provides an estimate of the bias and variability associated with the method. Input variables used in a model for study of geologic disposal of radioactive waste provide an example of the usefulness of this procedure. A textbook example shows how the output may be affected by the method presented in this paper.

3.
This paper presents a procedure for developing life-test sampling plans for exponential distributions based upon accelerated life testing (ALT). Type II censoring is assumed at each overstress level. The derived test statistic is shown to be a quotient of two independent random variables, each of which is a rational power of a Chi-square random variable. The distribution of the test statistic is characterized by the H-function, which can be numerically evaluated to obtain desired sampling plans.

4.
For testing normality we investigate the power of several tests, first of all, the well-known test of Jarque & Bera (1980) and furthermore the tests of Kuiper (1960) and Shapiro & Wilk (1965) as well as tests of Kolmogorov–Smirnov and Cramér-von Mises type. The tests on normality are based, first, on independent random variables (model I) and, second, on the residuals in the classical linear regression (model II). We investigate the exact critical values of the Jarque–Bera test and the Kolmogorov–Smirnov and Cramér-von Mises tests, in the latter case for the original and standardized observations where the unknown parameters μ and σ have to be estimated. The power comparison is carried out via Monte Carlo simulation assuming the model of contaminated normal distributions with varying parameters μ and σ and different proportions of contamination. It turns out that for the Jarque–Bera test the approximation of critical values by the chi-square distribution does not work very well. The test is superior in power to its competitors for symmetric distributions with medium up to long tails and for slightly skewed distributions with long tails. The power of the Jarque–Bera test is poor for distributions with short tails, especially if the shape is bimodal – sometimes the test is even biased. In this case a modification of the Cramér-von Mises test or the Shapiro–Wilk test may be recommended.
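
As a reminder of what is being compared, the Jarque–Bera statistic for model I (independent observations) can be computed as below. This is a small sketch using the standard definition JB = n/6 · (S² + (K − 3)²/4), where S and K are the sample skewness and kurtosis from biased central moments; the function name jarque_bera is an assumption, and the regression-residual case (model II) is not covered.

```python
def jarque_bera(xs):
    """Return the Jarque-Bera statistic for a sequence of observations."""
    n = len(xs)
    mean = sum(xs) / n
    # Biased (divide-by-n) central moments of order 2, 3 and 4.
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


if __name__ == "__main__":
    print(jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0]))
```

Under normality the statistic is asymptotically chi-square with 2 degrees of freedom. The abstract reports that this approximation gives poor critical values, which is why simulated critical values would be used instead of the chi-square quantile.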

5.
One of the most basic topics in many introductory statistical methods texts is inference for a population mean, μ. The primary tool for confidence intervals and tests is the Student t sampling distribution. Although the derivation requires independent identically distributed normal random variables with constant variance, σ², most authors reassure the readers about some robustness to the normality and constant variance assumptions. Some point out that if one is concerned about assumptions, one may statistically test these prior to reliance on the Student t. Most software packages provide optional test results for both (a) the Gaussian assumption and (b) homogeneity of variance. Many textbooks advise only informal graphical assessments, such as certain scatterplots for independence, others for constant variance, and normal quantile–quantile plots for the adequacy of the Gaussian model. We concur with this recommendation. As convincing evidence against formal tests of (a), such as the Shapiro–Wilk, we offer a simulation study of the tails of the resulting conditional sampling distributions of the Studentized mean. We analyze the results of systematically screening all samples from normal, uniform, exponential, and Cauchy populations. This pretest does not correct the erroneous significance levels and makes matters worse for the exponential. In practice, we conclude that graphical diagnostics are better than a formal pretest. Furthermore, rank or permutation methods are recommended for exact validity in the symmetric case.

6.
The innovation random variable for a non-negative self-decomposable random variable can have a compound Poisson distribution. In this case, we provide the density function for the compounded variable. When it does not have a compound Poisson representation, there is a straightforward and easily available compound Poisson approximation for which the density function of the compounded variable is also available. These results can be used in the simulation of Ornstein–Uhlenbeck type processes with given marginal distributions. Previously, simulation of such processes used the inverse of the corresponding tail Lévy measure. We show this approach corresponds to the use of an inverse cdf method of a certain distribution. With knowledge of this distribution and hence density function, the sampling procedure is open to direct sampling methods.

7.
This article proposes a stochastic version of the matching pursuit algorithm for Bayesian variable selection in linear regression. In the Bayesian formulation, the prior distribution of each regression coefficient is assumed to be a mixture of a point mass at 0 and a normal distribution with zero mean and a large variance. The proposed stochastic matching pursuit algorithm is designed for sampling from the posterior distribution of the coefficients for the purpose of variable selection. The proposed algorithm can be considered a modification of the componentwise Gibbs sampler. In the componentwise Gibbs sampler, the variables are visited by a random or a systematic scan. In the stochastic matching pursuit algorithm, the variables that better align with the current residual vector are given higher probabilities of being visited. The proposed algorithm combines the efficiency of the matching pursuit algorithm and the Bayesian formulation with well defined prior distributions on coefficients. Several simulated examples of small n and large p are used to illustrate the algorithm. These examples show that the algorithm is efficient for screening and selecting variables.

8.
In this paper we consider properties of the logarithmic and Tukey's lambda-type transformations of random variables that follow beta or unit-gamma distributions. Beta distributions often arise as models for random proportions, and unit-gamma distributions, although not well-known, may serve the same purpose. The latter possess many properties similar to those of beta distributions. Some transformations of random variables that follow a beta distribution are considered by Johnson (1949) and Johnson and Kotz (1970, 1973). These are used to obtain a "new" random variable that potentially approximately follows a normal distribution, so that practical analyses become possible. We study normality-related properties of the above transformations. This is done for the first time for unit-gamma distributions. Under the logarithmic transformation the beta and unit-gamma distributions become, respectively, the logarithmic F and generalized logistic distributions. The distributions of the transformed beta and unit-gamma distributions after application of Tukey's lambda-type transformations cannot be derived easily; however, we obtain the first four moments and expressions for the skewness and kurtosis of the transformed variables. Values of skewness and kurtosis for a variety of different parameter values are calculated, and in consequence, the near (or not near) normality of the transformed variables is evaluated. Comments on the use of the various transformations are provided.

9.
This paper proposes an adaptive estimator that is more precise than the ordinary least squares estimator if the distribution of random errors is skewed or has long tails. The adaptive estimates are computed using a weighted least squares approach with weights based on the lengths of the tails of the distribution of residuals. Smaller weights are assigned to those observations that have residuals in the tails of long-tailed distributions and larger weights are assigned to observations having residuals in the tails of short-tailed distributions. Monte Carlo methods are used to compare the performance of the proposed estimator and the performance of the ordinary least squares estimator. The estimates that were studied in this simulation include the difference between the means of two populations, the mean of a symmetric distribution, and the slope of a regression line. The adaptive estimators are shown to have lower mean squared errors than those for the ordinary least squares estimators for short-tailed, long-tailed, and skewed distributions, provided the sample size is at least 20. The ordinary least squares estimator has slightly lower mean squared error for normally distributed errors. The adaptive estimator is recommended for general use for studies having sample sizes of at least 20 observations unless the random errors are known to be normally distributed.

10.
The Metropolis–Hastings algorithm is one of the most basic and well-studied Markov chain Monte Carlo methods. It generates a Markov chain which has as limit distribution the target distribution by simulating observations from a different proposal distribution. A proposed value is accepted with some particular probability; otherwise the previous value is repeated. As a consequence, the accepted values are repeated a positive number of times and thus any resulting ergodic mean is, in fact, a weighted average. It turns out that this weighted average is an importance sampling-type estimator with random weights. By the standard theory of importance sampling, replacement of these random weights by their (conditional) expectations leads to more efficient estimators. In this paper we study the estimator arising by replacing the random weights with certain estimators of their conditional expectations. We illustrate by simulations that it is often more efficient than the original estimator while in the case of the independence Metropolis–Hastings and for distributions with finite support we formally prove that it is even better than the “optimal” importance sampling estimator.
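
The weighted-average view of the ergodic mean described above can be made concrete with a short sketch. It assumes a random-walk Gaussian proposal and a one-dimensional target given by its log density; the names mh_weighted_chain and weighted_mean are hypothetical. It only records each accepted value with its repetition count and does not implement the paper's improved estimator, which replaces these counts with estimates of their conditional expectations.

```python
import math
import random


def mh_weighted_chain(log_target, x0, n_steps, step=1.0, seed=None):
    """Run random-walk Metropolis-Hastings and return (values, weights).

    values holds the distinct accepted states in order; weights[i] counts
    how many consecutive steps the chain spent at values[i].
    """
    rng = random.Random(seed)
    values, weights = [x0], [1]
    current = x0
    for _ in range(n_steps - 1):
        proposal = current + rng.gauss(0.0, step)
        log_ratio = log_target(proposal) - log_target(current)
        # Symmetric proposal, so the acceptance probability is min(1, ratio).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current = proposal
            values.append(current)
            weights.append(1)
        else:
            weights[-1] += 1  # rejection: the previous value is repeated
    return values, weights


def weighted_mean(values, weights):
    """Ergodic mean of the chain written as a weighted average."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


if __name__ == "__main__":
    vals, wts = mh_weighted_chain(lambda x: -0.5 * x * x, 0.0, 10000, seed=1)
    print(len(vals), sum(wts), weighted_mean(vals, wts))
```

The weights always sum to the number of steps, so weighted_mean returns exactly the ordinary ergodic average of the full chain.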

11.
The class of limit distribution functions of bivariate extreme, intermediate and central dual generalized order statistics from independent and identically distributed random variables with random sample size is fully characterized. Two cases are considered. The first case is when the random sample size is assumed to be independent of all basic random variables. The second case is when the interrelation of the random size and the basic random variables is not restricted.

12.
In many environmental sampling situations, the variable of interest is either not easily observable or is too expensive to observe. Under such circumstances, the need arises to observe another variable, related to the variable of interest, so as to estimate the population parameters of interest. We study the performance of two different sampling procedures, i.e. ranked set sampling and stratified simple random sampling, when both stratification and ranking are accomplished on the basis of such a concomitant variable. The relative precision of the two methods is obtained and expressed as a function of population variance, between-stratum and between-rank variation, and the correlation coefficient between the variable of interest and the concomitant variable. The relative precision is computed for several important families of distributions that occur frequently in environmental and ecological work. Under equal allocation of sampling units, stratified simple random sampling is found to perform better than ranked set sampling, when the costs incurred to obtain sample measurements are ignored. When optimum allocation is considered for both methods, ranked set sampling performs better than stratified simple random sampling, when the concomitant variable is not highly correlated with the variable of interest. Furthermore, when the costs of sampling and the costs of measurement are incorporated into the assessment of the relative precision, the ranked set sampling is seen to be more efficient than stratified simple random sampling, particularly when the cost of stratification is high compared with that of ranking. This is generally the case in practice.

13.
In this article, the effects of mixtures of two normal distributions on the fraction non-conforming are studied in the context of capability analysis. When the output from several processes is mixed, the quality characteristic variables of the resulting mix may result in a normal mixture distribution. This can happen in cases such as monitoring an output from several suppliers, several machines, or several workers. This study considered the independence case and autocorrelated processes for a mixture of two normal distributions, using an autoregressive model of order one, AR(1). It is shown that the true attained process fraction non-conforming (corresponding to specific values for some capability index) can be very different from what is expected when the data are independent normal random variables.

15.
This paper investigates the roles of partial correlation and conditional correlation as measures of the conditional independence of two random variables. It first establishes a sufficient condition for the coincidence of the partial correlation with the conditional correlation. The condition is satisfied not only for multivariate normal but also for elliptical, multivariate hypergeometric, multivariate negative hypergeometric, multinomial and Dirichlet distributions. Such families of distributions are characterized by a semigroup property as a parametric family of distributions. A necessary and sufficient condition for the coincidence of the partial covariance with the conditional covariance is also derived. However, a known family of multivariate distributions which satisfies this condition cannot be found, except for the multivariate normal. The paper also shows that conditional independence has no close ties with zero partial correlation except in the case of the multivariate normal distribution; it has rather close ties to the zero conditional correlation. It shows that the equivalence between zero conditional covariance and conditional independence for normal variables is retained by any monotone transformation of each variable. The results suggest that care must be taken when using such correlations as measures of conditional independence unless the joint distribution is known to be normal. Otherwise a new concept of conditional independence may need to be introduced in place of conditional independence through zero conditional correlation or other statistics.

16.
In epidemiology, an infection lasting n weeks may be monitored by taking weekly serum samples. If tests on samples are independent Bernoulli trials with probability q of correctly testing positive, the apparent duration of infection (from the first positive test to the last positive test inclusive) may be less than n weeks. This distribution of apparent length also arises when plants in a row of n each have a probability q of germinating, for example. This distribution is shown to be related to that of the number of tails obtained when tossing a coin until two heads are obtained, in a maximum of n tosses. The properties of the 'apparent length' distribution are described, and some compounded (mixed) distributions that can be derived from it are also discussed. The distribution was used to estimate the underlying distribution of the duration of infection, in a longitudinal study of infections of children. The methodology was also used to estimate the proportion of infectious episodes that were not detected. It can be similarly used to correct episode durations and rates in longitudinal studies in which episodes of any kind are detected by regular sampling.
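
The 'apparent length' defined above is easy to simulate, which gives a feel for how much shorter than n it tends to be. The sketch below follows the abstract's definition (first positive test to last positive test, inclusive); returning 0 when no test is positive is an assumption made here for an undetected episode, and the function names are hypothetical.

```python
import random


def apparent_duration(n, q, rng):
    """Simulate one infection of n weeks with detection probability q.

    Returns the number of weeks from the first to the last positive test,
    inclusive, or 0 if every test is negative (assumed convention).
    """
    positives = [week for week in range(n) if rng.random() < q]
    if not positives:
        return 0
    return positives[-1] - positives[0] + 1


def simulate_apparent_durations(n, q, repetitions, seed=None):
    """Return a list of simulated apparent durations."""
    rng = random.Random(seed)
    return [apparent_duration(n, q, rng) for _ in range(repetitions)]


if __name__ == "__main__":
    durations = simulate_apparent_durations(n=6, q=0.7, repetitions=10000, seed=0)
    undetected = sum(1 for d in durations if d == 0) / len(durations)
    print(sum(durations) / len(durations), undetected)
```

The share of zeros in the simulated output corresponds to the proportion of episodes that go undetected, which for independent tests is (1 − q)^n.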

17.
Logistic models with a random intercept are prevalent in medical and social research where clustered and longitudinal data are often collected. Traditionally, the random intercept in these models is assumed to follow some parametric distribution such as the normal distribution. However, such an assumption inevitably raises concerns about model misspecification and misleading inference conclusions, especially when there is dependence between the random intercept and model covariates. To protect against such issues, we use a semiparametric approach to develop a computationally simple and consistent estimator where the random intercept is distribution‐free. The estimator is revealed to be optimal and achieve the efficiency bound without the need to postulate or estimate any latent variable distributions. We further characterize other general mixed models where such an optimal estimator exists.

18.
Four related methods are discussed for obtaining robust confidence bounds for extreme upper quantiles of the unknown distribution of a positive random variable. These methods are designed to work when the upper tail of the distribution is neither too heavy nor too light in comparison to the exponential distribution. An extensive simulation study is described, which compares the performance of nominal 90% upper confidence bounds corresponding to the four methods over a wide variety of distributions having light to heavy upper tails, ranging from a half-normal distribution to a heavy-tailed lognormal distribution.

19.
A general framework is proposed for modelling clustered mixed outcomes. A mixture of generalized linear models is used to describe the joint distribution of a set of underlying variables, and an arbitrary function relates the underlying variables to the observed outcomes. The model accommodates multilevel data structures, general covariate effects and distinct link functions and error distributions for each underlying variable. Within the framework proposed, novel models are developed for clustered multiple binary, unordered categorical and joint discrete and continuous outcomes. A Markov chain Monte Carlo sampling algorithm is described for estimating the posterior distributions of the parameters and latent variables. Because of the flexibility of the modelling framework and estimation procedure, extensions to ordered categorical outcomes and more complex data structures are straightforward. The methods are illustrated by using data from a reproductive toxicity study.

20.
In this paper a finite series approximation involving Laguerre polynomials is derived for central and noncentral multivariate gamma distributions. It is shown that if one approximates the density of any k nonnegative continuous random variables by a finite series of Laguerre polynomials up to the (n1, …, nk)th degree, then all the mixed moments up to the order (n1, …, nk) of the approximated distribution are equal to the mixed moments up to the same order of the random variables. Some numerical results are given for the bivariate central and noncentral multivariate gamma distributions to indicate the usefulness of the approximations.
