
1.

Optimal asymmetric classification procedures for interval-screened normal data

Hea-Jung Kim 《Journal of applied statistics》2013,40(2):449-462

Statistical methods for asymmetric normal classification do not adapt well to situations where the population distributions are perturbed by an interval-screening scheme. This paper explores methods for providing an optimal classification of future samples in this situation. The properties of the screened population distributions are considered and two optimal regions for classifying the future samples are obtained. These developments yield yet other rules for the interval-screened asymmetric normal classification. The rules are studied from several aspects such as the probability of misclassification, robustness, and estimation of the rules. The performance of the rules is investigated, and the screened classification idea is illustrated, using two numerical examples.

2.

Sampling strategy for optimal classification into one of two correlated normal populations

Subhadip Bandyopadhyay Shibdas Bandyopadhyay 《Statistics》2013,47(5):1116-1127

A unit ω is to be classified into one of two correlated homoskedastic normal populations by the linear discriminant function known as the W classification statistic [T.W. Anderson, An asymptotic expansion of the distribution of studentized classification statistic, Ann. Statist. 1 (1973), pp. 964–972; T.W. Anderson, An Introduction to Multivariate Statistical Analysis, 2nd edn, Wiley, New York, 1984; G.J. McLachlan, Discriminant Analysis and Statistical Pattern Recognition, John Wiley and Sons, New York, 1992]. The two populations studied here are two different states of the same population, like two different states of a disease where the population is the population of diseased patients. When a sample unit is observed in both the states (populations), the observations made on it (which form a pair) become correlated. A training sample is unbalanced when not all sample units are observed in both the states. Paired and also unbalanced samples are natural in studies related to correlated populations. S. Bandyopadhyay and S. Bandyopadhyay [Choosing better training sample for classifying an individual into one of two correlated normal populations, Calcutta Statist. Assoc. Bull. 54(215–216) (2003), pp. 167–180] studied the effect of unbalanced training sample structure on the performance of the W statistic in the univariate correlated normal set-up for finding an optimal sampling strategy for a better classification rate. In this study, the results are extended to the multivariate case, with discussion of application in a real scenario.

3.

A classification rule for ordered exponential populations

《Journal of statistical planning and inference》2005,135(2):339-356

In this paper, we consider classification procedures for exponential populations when an order on the population parameters is known. We define and study the behavior of a classification rule which takes into account the additional information and outperforms the likelihood-ratio-based rule when two populations are considered. Moreover, we study the behavior of this rule in each of the two populations and compare the misclassification probabilities with the classical ones. Type II censorship, which is usual in practice, is considered and results are obtained. The performance for more than two populations is evaluated by simulation.

4.

Testing homogeneity in a scale mixture of normal distributions

Xiaoqing Niu Pengfei Li Peng Zhang 《Statistical Papers》2016,57(2):499-516


5.

Bayesian modeling of autoregressive partial linear models with scale mixture of normal errors

Guillermo Ferreira Luis M. Castro Ronaldo Dias 《Journal of applied statistics》2013,40(8):1796-1816

Normality and independence of error terms are typical assumptions for partial linear models. However, these assumptions may be unrealistic in many fields, such as economics, finance and biostatistics. In this paper, a Bayesian analysis for a partial linear model with first-order autoregressive errors belonging to the class of the scale mixtures of normal distributions is studied in detail. The proposed model provides a useful generalization of the symmetrical linear regression model with independent errors, since the distribution of the error term covers both correlated and thick-tailed distributions, and has a convenient hierarchical representation allowing easy implementation of a Markov chain Monte Carlo scheme. In order to examine the robustness of the model against outlying and influential observations, a Bayesian case-deletion influence diagnostic based on the Kullback–Leibler (K–L) divergence is presented. The proposed method is applied to monthly and daily returns of two Chilean companies.

6.

Bayesian two-stage optimal design for mixture models

《Journal of Statistical Computation and Simulation》2012,82(3):209-231

In this paper, a Bayesian two-stage D–D optimal design for mixture experimental models under model uncertainty is developed. A Bayesian D-optimality criterion is used in the first stage to minimize the determinant of the posterior variances of the parameters. The second stage design is then generated according to an optimality procedure that collaborates with the improved model from the first stage data. The results show that a Bayesian two-stage D–D optimal design for mixture experiments under model uncertainty is more efficient than both the Bayesian one-stage D-optimal design and the non-Bayesian one-stage D-optimal design in most situations. Furthermore, simulations are used to obtain a reasonable ratio of the sample sizes between the two stages.

7.

Robustness of locally most powerful invariant test for normal mixture model in control and treatment populations

《Journal of statistical planning and inference》1996,52(1):33-41

Two independent samples from control with N(μ₁, σ²) and treatment with pN(μ₁, σ²) + (1 − p)N(μ₂, σ²) are considered. A locally most powerful invariant test for testing H₀: μ₁ = μ₂ against H₁: μ₂ > μ₁, where σ² > 0, 0 < p < 1 are unknown, is obtained. Also, the robustness of the test statistic on the lines of Kariya and Sinha (Robustness of Statistical Tests (1989). Academic Press, New York) is studied.

8.

An online classification EM algorithm based on the mixture model

Allou Samé Christophe Ambroise Gérard Govaert 《Statistics and Computing》2007,17(3):209-218

Mixture model-based clustering is widely used in many applications. In certain real-time applications the rapid increase of data size with time makes classical clustering algorithms too slow. An online clustering algorithm based on mixture models is presented in the context of a real-time flaw-diagnosis application for pressurized containers which uses data from acoustic emission signals. The proposed algorithm is a stochastic gradient algorithm derived from the classification version of the EM algorithm (CEM). It provides a model-based generalization of the well-known online k-means algorithm, able to handle non-spherical clusters. Using synthetic and real data sets, the proposed algorithm is compared with the batch CEM algorithm and the online EM algorithm. The three approaches generate comparable solutions in terms of the resulting partition when clusters are relatively well separated, but online algorithms become faster as the size of the available observations increases.
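
For orientation, the online k-means baseline that the abstract says is generalized can be sketched as follows. This is not the online CEM algorithm of the paper; it is only the classical sequential update in which each incoming point moves its nearest center by a step of 1/count. The function name `online_kmeans` and the toy data are assumptions made for illustration.

```python
def online_kmeans(stream, centers):
    """Sequential (MacQueen-style) k-means: one pass, one point at a time.

    `stream` is an iterable of points; `centers` are the initial centers.
    Both names are illustrative, not taken from the paper.
    """
    centers = [list(c) for c in centers]
    counts = [0] * len(centers)
    for x in stream:
        # Assign the new point to the nearest center (squared Euclidean distance).
        j = min(
            range(len(centers)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(x, centers[i])),
        )
        counts[j] += 1
        step = 1.0 / counts[j]  # decreasing step size: running mean of assigned points
        centers[j] = [c + step * (a - c) for a, c in zip(x, centers[j])]
    return centers, counts


centers, counts = online_kmeans(
    [(0, 0), (10, 10), (2, 0), (10, 12)],
    centers=[(1, 1), (9, 9)],
)
```

Because every center is the running mean of the points assigned to it, the clusters are implicitly spherical, which is the limitation the abstract's model-based generalization is said to address.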

9.

Random search algorithm for optimal mixture experimental design

Guanghui Li 《Communications in Statistics - Theory and Methods》2018,47(6):1413-1422

It is well known that it is difficult to obtain an accurate optimal design for a mixture experimental design with complex constraints. In this article, we construct a random search algorithm which can be used to find the optimal design for a mixture model with complex constraints. First, we generate an initial set by the Monte Carlo method, and then run the random search algorithm to get the optimal set of points. After that, we demonstrate the effectiveness of this method using two examples.
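
The two steps described in this abstract (a Monte Carlo initial set, then random search) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the paper's method: the D-optimality criterion det(XᵀX), the first-order Scheffé model in three components, the illustrative constraint `max(x) <= 0.8`, and the single-point exchange move.

```python
import random


def sample_simplex(q, rng):
    # Uniform draw from the simplex: normalize i.i.d. exponential variates.
    e = [rng.expovariate(1.0) for _ in range(q)]
    s = sum(e)
    return [v / s for v in e]


def feasible(x):
    # Hypothetical extra constraint, for illustration only.
    return max(x) <= 0.8


def sample_feasible(q, rng):
    # Rejection sampling: redraw until the constraint holds.
    while True:
        x = sample_simplex(q, rng)
        if feasible(x):
            return x


def det3(m):
    # Determinant of a 3x3 matrix by cofactor expansion along the first row.
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def information_det(design):
    # det(X'X) for a first-order Scheffe model: the rows of X are the design points.
    m = [[sum(p[i] * p[j] for p in design) for j in range(3)] for i in range(3)]
    return det3(m)


def random_search(n_points=6, n_iter=2000, seed=0):
    rng = random.Random(seed)
    # Step 1: Monte Carlo initial design.
    design = [sample_feasible(3, rng) for _ in range(n_points)]
    start = best = information_det(design)
    # Step 2: propose a random replacement point; keep it only if det(X'X) grows.
    for _ in range(n_iter):
        k = rng.randrange(n_points)
        trial = design[:k] + [sample_feasible(3, rng)] + design[k + 1:]
        d = information_det(trial)
        if d > best:
            design, best = trial, d
    return design, start, best


design, start, best = random_search()
```

Accepting only improving moves keeps the criterion monotone, at the cost of possibly stalling in a local optimum; the abstract does not say how the paper handles this.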

10.

On uniqueness of an optimal search rule

Hans W. Gottinger 《Statistical Papers》1976,17(4):290-294


11.

Discretisation for inference on normal mixture models

Mark J. Brewer 《Statistics and Computing》2003,13(3):209-219

The problem of inference in Bayesian Normal mixture models is known to be difficult. In particular, direct Bayesian inference (via quadrature) suffers from a combinatorial explosion in having to consider every possible partition of n observations into k mixture components, resulting in a computation time which is O(kⁿ). This paper explores the use of discretised parameters and shows that for equal-variance mixture models, direct computation time can be reduced to O(D^k n^k), where relevant continuous parameters are each divided into D regions. As a consequence, direct inference is now possible on genuine data sets for small k, where the quality of approximation is determined by the level of discretisation. For large problems, where the computational complexity is still too great in O(D^k n^k) time, discretisation can provide a convergence diagnostic for a Markov chain Monte Carlo analysis.

12.

An estimator of the common mean of two normal populations

K.Aiyappan Nair 《Journal of statistical planning and inference》1982,6(2):119-122

In this paper we consider the estimation of the common mean of two normal populations when the variances are unknown. If it is known that one specified variance is smaller than the other, then it is possible to modify the Graybill-Deal estimator in order to obtain a more efficient estimator. One such estimator is proposed by Mehta and Gurland (1969). We prove that this estimator is more efficient than the Graybill-Deal estimator under the condition that one variance is known to be less than the other.
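
As background for this abstract, the Graybill-Deal estimator weights each sample mean by its estimated precision nᵢ/sᵢ². A minimal sketch is given below; the function name `graybill_deal` and the toy samples are illustrative, and the Mehta-Gurland modification is deliberately not shown because the abstract does not give its form.

```python
from statistics import mean, variance


def graybill_deal(sample1, sample2):
    """Graybill-Deal estimate of a common mean from two normal samples.

    Each sample mean is weighted by n_i / s_i^2, where s_i^2 is the
    unbiased sample variance (statistics.variance uses n - 1).
    """
    w1 = len(sample1) / variance(sample1)
    w2 = len(sample2) / variance(sample2)
    return (w1 * mean(sample1) + w2 * mean(sample2)) / (w1 + w2)


# Means 2 and 4, variances 1 and 4, n = 3 each: weights 3 and 0.75.
estimate = graybill_deal([1, 2, 3], [2, 4, 6])
```

The less variable sample dominates the estimate, which is why knowing in advance which variance is smaller can be exploited to improve on it.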

13.

An asymptotically optimal test for the mean and the variance of a normal distribution

S.K. Perng D.S. Gill 《Communications in Statistics - Theory and Methods》2013,42(16):1817-1829


14.

On some optimal selection procedures for Weibull populations

Tong An Hsu 《Communications in Statistics - Theory and Methods》2013,42(23):2657-2668

Consider k (k ≥ 2) Weibull populations. We shall derive a method of constructing optimal selection procedures to select a subset of the k populations containing the best population, procedures which control the size of the selected subset and which maximise the minimum probability of making a correct selection. Procedures and results are derived for the case when sample sizes are unequal. Some tables and figures are given at the end of this paper.

15.

Estimation of scale parameters in mixture distributions

Dipak K. Dey 《Revue canadienne de statistique》1990,18(2):171-178

Simultaneous estimation of scale parameters is considered in mixture distributions under squared-error loss. A general class of estimators is obtained which dominates the componentwise best multiple estimators and the moment estimators. As special cases, improved estimators are obtained for the multivariate t-distribution and the p-variate Lomax distribution.

16.

A likelihood ratio test of a homoscedastic normal mixture against a heteroscedastic normal mixture

Yungtai Lo 《Statistics and Computing》2008,18(3):233-240

It is generally assumed that the likelihood ratio statistic for testing the null hypothesis that data arise from a homoscedastic normal mixture distribution versus the alternative hypothesis that data arise from a heteroscedastic normal mixture distribution has an asymptotic χ² reference distribution with degrees of freedom equal to the difference in the number of parameters being estimated under the alternative and null models under some regularity conditions. Simulations show that the χ² reference distribution will give a reasonable approximation for the likelihood ratio test only when the sample size is 2000 or more and the mixture components are well separated when the restrictions suggested by Hathaway (Ann. Stat. 13:795–800, 1985) are imposed on the component variances to ensure that the likelihood is bounded under the alternative distribution. For small and medium sample sizes, parametric bootstrap tests appear to work well for determining whether data arise from a normal mixture with equal variances or a normal mixture with unequal variances.

17.

Evaluating subject-level incremental values of new markers for risk classification rule

T. Cai L. Tian D. Lloyd-Jones L. J. Wei 《Lifetime data analysis》2013,19(4):547-567

Suppose that we need to classify a population of subjects into several well-defined ordered risk categories for disease prevention or management with their “baseline” risk factors/markers. In this article, we present a systematic approach to identify subjects using their conventional risk factors/markers who would benefit from a new set of risk markers for more accurate classification. Specifically for each subgroup of individuals with the same conventional risk estimate, we present inference procedures for the reclassification and the corresponding correct re-categorization rates with the new markers. We then apply these new tools to analyze the data from the Cardiovascular Health Study sponsored by the US National Heart, Lung, and Blood Institute. We used Framingham risk factors plus the information of baseline anti-hypertensive drug usage to identify adult American women who may benefit from the measurement of a new blood biomarker, CRP, for better risk classification in order to intensify prevention of coronary heart disease for the subsequent 10 years.

18.

Penalized optimal scoring for the classification of multi-dimensional functional data

Tomohiro Ando 《Statistical Methodology》2009,6(6):565-576

Many fields of research need to classify individual systems based on one or more data series, which are obtained by sampling an unknown continuous curve with noise. In other words, the underlying process is an unknown function which the observed variables represent only imperfectly. Although functional logistic regression has many attractive features for this classification problem, this method is applicable only when the number of individuals to be classified (or available to estimate the model) is large compared to the number of curves sampled per individual. To overcome this limitation, we use penalized optimal scoring to construct a new method for the classification of multi-dimensional functional data. The proposed method consists of two stages. First, the series of observed discrete values available for each individual are expressed as a set of continuous curves. Next, the penalized optimal scoring model is estimated on the basis of these curves. A similar penalized optimal scoring method was described in my previous work, but this model is not suitable for the analysis of continuous functions. In this paper we adopt a Gaussian kernel approach to extend the previous model. The high accuracy of the new method is demonstrated on Monte Carlo simulations, and the method is used to predict defaulting firms on the Japanese Stock Exchange.

19.

Linear discrimination for three known normal populations

Mark J. Schervish 《Journal of statistical planning and inference》1984,10(2):167-175

A random vector is assumed to have one of three known multivariate normal distributions with equal covariance matrices. It is desired to separate the three distributions by means of a single linear discriminant function. Such a function can lead to a classification rule. The function whose classification rule minimizes the average of the three probabilities of misclassification is found. Also the function is found whose rule minimizes the maximum of the three probabilities of misclassification.

20.

A nonparametric plug-in rule for selecting optimal block lengths for block bootstrap methods

《Statistical Methodology》2007,4(3):292-321

In this paper, we consider the problem of empirical choice of optimal block sizes for block bootstrap estimation of population parameters. We suggest a nonparametric plug-in principle that can be used for estimating ‘mean squared error’-optimal smoothing parameters in general curve estimation problems, and establish its validity for estimating optimal block sizes in various block bootstrap estimation problems. A key feature of the proposed plug-in rule is that it can be applied without explicit analytical expressions for the constants that appear in the leading terms of the optimal block lengths. Furthermore, we also discuss the computational efficacy of the method and explore its finite sample properties through a simulation study.