Similar Documents (20 results found)
1.
This paper deals with the construction of optimum partitions of a data set for a clustering criterion which is based on a convex function of the class centroids, as a generalization of the classical SSQ clustering criterion for n data points. We formulate a dual optimality problem involving two sets of variables and derive a maximum-support-plane (MSP) algorithm for constructing a (sub-)optimum partition as a generalized k-means algorithm. We present various modifications of the basic criterion and describe the corresponding MSP algorithms. It is shown that the method can also be used for solving optimality problems in classical statistics (maximizing Csiszár's φ-divergence) and for the simultaneous classification of the rows and columns of a contingency table.
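As a concrete reference point, the classical SSQ criterion that the paper generalizes is the one minimized by the standard k-means alternation. Below is a minimal NumPy sketch of that special case only; all names are ours, and the MSP generalization to convex centroid-based criteria is not implemented here:

```python
import numpy as np

def kmeans_ssq(X, k, n_iter=100, seed=0):
    """Plain k-means for the classical SSQ criterion; the paper's MSP
    algorithm generalizes this assignment/update alternation."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: class centroids minimize the within-class SSQ.
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(m, 0.3, size=(50, 2)) for m in (0.0, 3.0)])
labels, cents = kmeans_ssq(X, k=2)
print(cents)
```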

2.
R. Göb, Statistical Papers, 1992, 33(1): 273–277
In elementary probability theory, as a result of a limiting process, the probabilities of a Bi(n, p) binomial distribution are approximated by the probabilities of a Po(np) Poisson distribution. Accordingly, in statistical quality control the binomial operating characteristic function \(\mathcal{L}_{n,c}(p)\) is approximated by the Poisson operating characteristic function \(\mathcal{F}_{n,c}(p)\). The inequality \(\mathcal{L}_{n+1,c+1}(p) > \mathcal{L}_{n,c}(p)\) for p ∈ (0, 1) is evident from the interpretation of \(\mathcal{L}_{n+1,c+1}(p)\) and \(\mathcal{L}_{n,c}(p)\) as probabilities of accepting a lot. It is shown that the Poisson approximation \(\mathcal{F}_{n,c}(p)\) preserves this essential feature of the binomial operating characteristic function, i.e. an analogous inequality holds for the Poisson operating characteristic function, too.
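Both operating characteristic functions are ordinary cumulative distribution functions, so the stated inequality can be checked numerically. A small sketch using SciPy (the grid and the sample values n = 50, c = 2 are our illustrative choices):

```python
import numpy as np
from scipy.stats import binom, poisson

def oc_binomial(n, c, p):
    # L_{n,c}(p): probability of accepting a lot, i.e. at most c
    # defectives in a sample of size n.
    return binom.cdf(c, n, p)

def oc_poisson(n, c, p):
    # F_{n,c}(p): the Poisson approximation with mean n*p.
    return poisson.cdf(c, n * p)

p = np.linspace(0.01, 0.99, 99)
n, c = 50, 2
# The binomial inequality L_{n+1,c+1}(p) > L_{n,c}(p) ...
assert np.all(oc_binomial(n + 1, c + 1, p) > oc_binomial(n, c, p))
# ... is preserved by the Poisson operating characteristic function:
assert np.all(oc_poisson(n + 1, c + 1, p) > oc_poisson(n, c, p))
print("both inequalities hold on the grid")
```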

3.
Essential graphs and largest chain graphs are well-established graphical representations of equivalence classes of directed acyclic graphs and chain graphs, respectively, especially useful in the context of model selection. Recently, the notion of a labelled block ordering B of vertices was introduced as a flexible tool for specifying subfamilies of chain graphs. In particular, both the family of directed acyclic graphs and the family of “unconstrained” chain graphs can be specified in this way, for the appropriate choice of B. The family of chain graphs identified by a labelled block ordering B of vertices is partitioned into equivalence classes, each represented by means of a B-essential graph. In this paper, we introduce a topological ordering of meta-arrows and use this concept to devise an efficient procedure for the construction of B-essential graphs. In this way we also provide an efficient procedure for the construction of both largest chain graphs and essential graphs. The key feature of the proposed procedure is that every meta-arrow needs to be processed only once.

4.
Improvement of the Liu estimator in linear regression model
In the presence of stochastic prior information, in addition to the sample, Theil and Goldberger (1961) introduced a mixed estimator for the parameter vector β in the standard multiple linear regression model (y, Xβ, σ²I). Recently, the Liu estimator, an alternative biased estimator for β, was proposed by Liu (1993). In this paper we introduce another new Liu-type biased estimator for β, called the stochastic restricted Liu estimator, and discuss its efficiency. The necessary and sufficient conditions for the mean squared error matrix of the stochastic restricted Liu estimator to exceed the mean squared error matrix of the mixed estimator are derived for the two cases in which the parametric restrictions are correct and incorrect. In particular, we show that this new biased estimator is superior, in the mean squared error matrix sense, to both the mixed estimator and the biased estimator introduced by Liu (1993).
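For orientation, Liu's (1993) estimator has the closed form β̂(d) = (X′X + I)⁻¹(X′y + d·β̂_OLS). The following sketch contrasts its sampling mean squared error with that of OLS on a deliberately collinear design; it illustrates the flavour of the comparison only, not the stochastic restricted variant introduced in the paper (design, d, and sample sizes are our choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 4
# Strongly collinear design to make OLS ill-conditioned.
z = rng.normal(size=(n, 1))
X = z + 0.05 * rng.normal(size=(n, k))
beta = np.ones(k)

def liu(X, y, d):
    """Liu (1993) estimator: (X'X + I)^{-1} (X'y + d * beta_ols)."""
    XtX = X.T @ X
    beta_ols = np.linalg.solve(XtX, X.T @ y)
    beta_d = np.linalg.solve(XtX + np.eye(X.shape[1]), X.T @ y + d * beta_ols)
    return beta_d, beta_ols

mse_ols = mse_liu = 0.0
for _ in range(500):
    y = X @ beta + rng.normal(size=n)
    b_liu, b_ols = liu(X, y, d=0.5)
    mse_ols += np.sum((b_ols - beta) ** 2)
    mse_liu += np.sum((b_liu - beta) ** 2)
# Under collinearity the shrunken Liu estimator typically wins.
print(mse_ols / 500, mse_liu / 500)
```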

5.
Let $x_{i(1)} \le x_{i(2)} \le \cdots \le x_{i(r_i)}$ be the right-censored samples of sizes $n_i$ from the $i$th exponential distribution $\sigma_i^{-1}\exp\{-(x-\mu_i)\sigma_i^{-1}\}$, $i = 1,2$, where $\mu_i$ and $\sigma_i$ are the unknown location and scale parameters, respectively. This paper deals with the posterior distribution of the difference between the two location parameters, namely $\mu_2 - \mu_1$, which may be represented in the form $\mu_2 - \mu_1 \mathop{=}\limits^{\mathcal{D}} x_{2(1)} - x_{1(1)} + F_1 \sin\theta - F_2 \cos\theta$, where $\mathop{=}\limits^{\mathcal{D}}$ stands for equality in distribution, $F_i$ stands for the central F-variable with $[2, 2(r_i-1)]$ degrees of freedom, and $\tan\theta = \frac{n_2 s_{x_1}}{n_1 s_{x_2}}$ with $s_{x_i} = (r_i - 1)^{-1} \left\{ \sum_{j=1}^{r_i - 1} (n_i - j)(x_{i(j+1)} - x_{i(j)}) \right\}$. The paper also derives the distribution of the statistic $V = F_1 \sin\theta - F_2 \cos\theta$, and tables of critical values of the V-statistic are provided for the 5% level of significance and selected degrees of freedom.
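Since $V$ is a simple function of two independent F-variables, its critical values can be approximated by Monte Carlo. A sketch (function name, θ, and sample sizes are ours; the paper's tables are the exact counterparts):

```python
import numpy as np
from scipy.stats import f

def v_critical(theta, r1, r2, alpha=0.05, n_sim=200_000, seed=0):
    """Monte Carlo upper critical value of V = F1*sin(theta) - F2*cos(theta),
    with F_i ~ F(2, 2*(r_i - 1)) independent."""
    rng = np.random.default_rng(seed)
    f1 = f.rvs(2, 2 * (r1 - 1), size=n_sim, random_state=rng)
    f2 = f.rvs(2, 2 * (r2 - 1), size=n_sim, random_state=rng)
    v = f1 * np.sin(theta) - f2 * np.cos(theta)
    return np.quantile(v, 1 - alpha)

print(v_critical(theta=np.pi / 4, r1=10, r2=10))
```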

6.
We consider equalities between the ordinary least squares estimator ($\mathrm{OLSE}$), the best linear unbiased estimator ($\mathrm{BLUE}$) and the best linear unbiased predictor ($\mathrm{BLUP}$) in the general linear model $\{\mathbf{y}, \mathbf{X}\varvec{\beta}, \mathbf{V}\}$ extended with the new unobservable future value $\mathbf{y}_{*}$ of the response, whose expectation is $\mathbf{X}_{*}\varvec{\beta}$. Our aim is to provide some new insight and new proofs for the equalities under consideration. We also collect together various expressions, without rank assumptions, for the $\mathrm{BLUP}$, and provide new results giving upper bounds for the Euclidean norm of the difference between the $\mathrm{BLUP}(\mathbf{y}_{*})$ and $\mathrm{BLUE}(\mathbf{X}_{*}\varvec{\beta})$ and between the $\mathrm{BLUP}(\mathbf{y}_{*})$ and $\mathrm{OLSE}(\mathbf{X}_{*}\varvec{\beta})$. A remark is made on the application to small area estimation.
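Under the simplifying assumption that X and V have full rank, the OLSE and BLUE of Xβ have familiar closed forms, and their Euclidean distance, the kind of quantity bounded in the paper, can be computed directly. A minimal sketch (all dimensions and matrices are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 30, 3
X = rng.normal(size=(n, p))
# A full-rank dispersion matrix V for the model {y, X*beta, V}.
A = rng.normal(size=(n, n))
V = A @ A.T + n * np.eye(n)
y = X @ np.ones(p) + rng.multivariate_normal(np.zeros(n), V)

# OLSE of X*beta: orthogonal projection of y onto col(X).
P = X @ np.linalg.solve(X.T @ X, X.T)
olse = P @ y
# BLUE of X*beta under full-rank V (generalized least squares).
Vi = np.linalg.inv(V)
blue = X @ np.linalg.solve(X.T @ Vi @ X, X.T @ Vi @ y)
print(np.linalg.norm(blue - olse))  # Euclidean distance between the two
```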

7.
The additive model is considered when some observations on x are missing at random but the corresponding observations on y are available. For this model, missing at random is an especially interesting case because the complete-case analysis is expected to be no longer suitable. A simulation experiment is reported and the different methods are compared based on their superiority with respect to the sample mean squared error. Some attention is also given to the sample variance and the estimated bias. In detail, the complete-case analysis, a kind of stochastic mean imputation, a single imputation and the nearest neighbour imputation are discussed.
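The methods under comparison are easy to mimic on simulated data. The sketch below uses plain mean imputation as a stand-in for the paper's stochastic variant and a one-dimensional nearest-neighbour donor rule; the missingness mechanism and all tuning choices are our illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 1, n)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, n)   # additive-style signal

# x is missing at random: the probability depends only on the observed y.
p_miss = 0.3 * (y - y.min()) / (y.max() - y.min())
miss = rng.uniform(size=n) < p_miss

# Complete-case analysis would simply drop the rows with missing x.
donors = np.flatnonzero(~miss)

# Mean imputation (deterministic stand-in for the stochastic variant).
x_mean = np.where(miss, x[donors].mean(), x)

# Nearest-neighbour imputation: borrow x from the donor closest in y.
x_nn = x.copy()
for i in np.flatnonzero(miss):
    x_nn[i] = x[donors[np.argmin(np.abs(y[donors] - y[i]))]]

print("MSE mean imputation:", np.mean((x_mean[miss] - x[miss]) ** 2))
print("MSE NN imputation:  ", np.mean((x_nn[miss] - x[miss]) ** 2))
```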

8.
Summary Let $S^2(\alpha)=\alpha\sum_{i=1}^{n}(X_i-\bar{X})^2$, where the $X_i$ are i.i.d. random variables with a finite variance $\sigma^2$ and $\bar{X}$ is the usual estimate of the mean of the $X_i$. We consider the problem of finding the optimal $\alpha$ with respect to the minimization of the expected value of $|S^2(\alpha)-\sigma^2|^k$ for various $k$, and with respect to Pitman's nearness criterion. For the Gaussian case analytical results are obtained, and for some non-Gaussian cases we present Monte Carlo results regarding Pitman's criterion. This research was supported by the Science Fund of Serbia, grant number 04M03, through the Mathematical Institute, Belgrade.
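For the Gaussian case with k = 2 the optimal multiplier is the well-known α = 1/(n + 1), i.e. dividing the centred sum of squares by n + 1 rather than n − 1. A Monte Carlo sketch that recovers this numerically (grid and simulation sizes are our choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma2, k = 10, 1.0, 2
alphas = np.linspace(0.02, 0.2, 50)
X = rng.normal(0.0, np.sqrt(sigma2), size=(100_000, n))
ss = ((X - X.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)  # sum of squares
# Empirical risk E|alpha * SS - sigma^2|^k over the grid of alpha.
risk = [np.mean(np.abs(a * ss - sigma2) ** k) for a in alphas]
print(alphas[np.argmin(risk)], 1 / (n + 1))  # both close to 1/(n+1) = 0.0909
```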

9.
Crude oil and natural gas depletion may be modelled by a diffusion process based upon a constrained life-cycle. Here we consider the Generalized Bass Model. The choice is motivated by the realistic assumption that there is a self-evident link between oil and gas extraction and the spread of modern technologies in wide areas such as transport, heating, cooling, chemistry and hydrocarbon fuel consumption. Such a model may include deterministic or semi-deterministic regulatory interventions. Statistical analysis is based upon nonlinear methodologies and a more flexible autoregressive structure for the residuals. The technical aim of this paper is to outline the meaningful hierarchy existing among the components of such diffusion models. The statistical effort in residual component analysis may be read as a significant confirmation of a well-founded diffusion process under rare but strong deterministic shocks. Applications of these ideas are proposed with reference to world oil and gas production data and to particular regions such as the mainland U.S.A., the U.K., Norway and Alaska. The main results give new evidence on the location of production peaks and on residual times to depletion.
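As a rough illustration of the life-cycle idea, the sketch below fits the standard Bass cumulative curve, the special case of the Generalized Bass Model without an intervention function, to synthetic cumulative-production data and reads off the implied peak time. The data and parameter values are invented for illustration only:

```python
import numpy as np
from scipy.optimize import curve_fit

def bass_cumulative(t, m, p, q):
    """Standard Bass cumulative curve; the Generalized Bass Model adds
    an intervention function x(t) inside the exponent."""
    e = np.exp(-(p + q) * t)
    return m * (1 - e) / (1 + (q / p) * e)

# Hypothetical cumulative production series (illustration only).
t = np.arange(40, dtype=float)
true = bass_cumulative(t, m=100.0, p=0.01, q=0.25)
z = true + np.random.default_rng(0).normal(0, 1.0, t.size)

(m_hat, p_hat, q_hat), _ = curve_fit(
    bass_cumulative, t, z, p0=(z.max(), 0.01, 0.3),
    bounds=([1.0, 1e-4, 1e-4], [1e4, 1.0, 1.0]))
# Peak of the Bass production rate occurs at t* = ln(q/p)/(p+q).
t_peak = np.log(q_hat / p_hat) / (p_hat + q_hat)
print(m_hat, t_peak)
```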

10.
In this paper we consider the inferential aspect of the nonparametric estimation of a conditional function $g(\mathbf{x}) = \mathrm{E}[\psi(X_t) \mid \mathbf{X}_{t,m} = \mathbf{x}]$, where $\mathbf{X}_{t,m}$ represents the vector containing the $m$ conditioning lagged values of the series and $\psi$ is an arbitrary measurable function. The local polynomial estimator of order $p$ is used for the estimation of the function $g$ and of its partial derivatives up to a total order $p$. We consider $\alpha$-mixing processes, and we propose the use of a particular resampling method, the local polynomial bootstrap, for the approximation of the sampling distribution of the estimator. After analyzing the consistency of the proposed method, we present a simulation study which gives evidence of its finite-sample behaviour.
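A local polynomial estimator of order p fits a weighted polynomial regression in a neighbourhood of each evaluation point; the intercept estimates g and the higher coefficients its derivatives. A minimal local-linear sketch for one conditioning lag (m = 1), without the bootstrap step (kernel, bandwidth, and the toy series are our choices):

```python
import numpy as np

def local_poly(x0, x, y, h, p=1):
    """Local polynomial fit of order p at x0 with a Gaussian kernel;
    beta[0] estimates g(x0), beta[1] its first derivative, etc."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    Z = np.vander(x - x0, p + 1, increasing=True)  # columns 1, (x-x0), ...
    W = np.diag(w)
    return np.linalg.solve(Z.T @ W @ Z, Z.T @ W @ y)

rng = np.random.default_rng(0)
# Nonlinear AR(1)-type series: X_t = g(X_{t-1}) + noise, g = 0.8*tanh.
n = 500
xs = np.zeros(n)
for t in range(1, n):
    xs[t] = 0.8 * np.tanh(xs[t - 1]) + rng.normal(0, 0.3)
x, y = xs[:-1], xs[1:]
print(local_poly(0.5, x, y, h=0.2)[0], 0.8 * np.tanh(0.5))
```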

11.
12.
In this paper we consider an acceptance-rejection (AR) sampler based on deterministic driver sequences. We prove that the discrepancy of an N element sample set generated in this way is bounded by \(\mathcal {O} (N^{-2/3}\log N)\), provided that the target density is twice continuously differentiable with non-vanishing curvature and the AR sampler uses the driver sequence \(\mathcal {K}_M= \{( j \alpha , j \beta ) \bmod 1 \mid j = 1,\ldots ,M\}\), where \(\alpha ,\beta \) are real algebraic numbers such that \(1,\alpha ,\beta \) is a basis of a number field over \(\mathbb {Q}\) of degree 3. For the driver sequence \(\mathcal {F}_k= \{ ({j}/{F_k}, \{{jF_{k-1}}/{F_k}\} ) \mid j=1,\ldots , F_k\}\), where \(F_k\) is the k-th Fibonacci number and \(\{x\}=x-\lfloor x \rfloor \) is the fractional part of a non-negative real number x, we can remove the \(\log \) factor and improve the convergence rate to \(\mathcal {O}(N^{-2/3})\), where again N is the number of samples accepted. We also introduce a criterion for measuring the goodness of driver sequences. The proposed approach is numerically tested by calculating the star-discrepancy of samples generated for some target densities using \(\mathcal {K}_M\) and \(\mathcal {F}_k\) as driver sequences. These results confirm that a convergence rate beyond \(N^{-1/2}\) is achievable in practice using \(\mathcal {K}_M\) and \(\mathcal {F}_k\) as driver sequences in the acceptance-rejection sampler.
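The Fibonacci driver sequence \(\mathcal{F}_k\) is straightforward to generate, and the deterministic acceptance-rejection step then simply thresholds its second coordinate against the scaled target density. A sketch for a bounded density on [0, 1]; the Beta(2,2)-shaped target and its bound are our example choices:

```python
import numpy as np

def fibonacci_driver(k):
    """Driver set F_k = {(j/F_k, frac(j*F_{k-1}/F_k)) : j = 1..F_k}."""
    fib = [1, 1]
    while len(fib) <= k:
        fib.append(fib[-1] + fib[-2])
    Fk, Fk1 = fib[k], fib[k - 1]
    j = np.arange(1, Fk + 1)
    return np.column_stack([j / Fk, (j * Fk1 / Fk) % 1.0])

def ar_sample(density, bound, k):
    """Deterministic acceptance-rejection: keep u1 whenever
    u2 <= density(u1)/bound, with (u1, u2) from the Fibonacci driver."""
    pts = fibonacci_driver(k)
    keep = pts[:, 1] <= density(pts[:, 0]) / bound
    return pts[keep, 0]

# Target 6x(1-x) on [0,1], bounded above by its maximum 1.5.
samples = ar_sample(lambda x: 6 * x * (1 - x), bound=1.5, k=20)
print(len(samples), samples.mean())  # mean should be near 0.5
```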

13.
Summary In this paper likelihood is characterized as an index which measures how well a model fits a sample. Some properties required of an index of fit are introduced and discussed, stressing how they describe aspects inherent in the idea of fit. Finally we prove that, if an index of fit is maximal when the model reaches the distribution of the sample, then such an index is an increasing continuous transform of $\prod_i p_i^{q_i}$, where the $p_i$'s are the theoretical relative frequencies provided by the model and the $q_i$'s are the actual relative frequencies of the sample.

14.
Suppose one has a sample of high-frequency intraday discrete observations of a continuous-time random process, such as foreign exchange rates and stock prices, and wants to test for the presence of jumps in the process. We show that the power of any test of this hypothesis depends on the frequency of observation. In particular, if the process is observed at intervals of length $1/n$ and the instantaneous volatility of the process is given by $\sigma_t$, we show that at best one can detect jumps of height no smaller than $\sigma_t\sqrt{2\log(n)/n}$. We present a new test which achieves this rate for diffusion-type processes, and examine its finite-sample properties using simulations.
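The detection boundary $\sigma_t\sqrt{2\log(n)/n}$ is easy to see in simulation: it sits at the level of the largest diffusive increment, so only jumps above it stand out. A sketch with constant volatility and one injected jump (all parameter values are ours; since the threshold is exactly borderline for pure noise, an occasional false flag is possible):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 23_400                     # e.g. one observation per second over 6.5h
dt = 1.0 / n
sigma = 0.3                    # constant spot volatility for the sketch
dX = sigma * np.sqrt(dt) * rng.normal(size=n)   # diffusive increments
dX[n // 2] += 0.02             # one injected jump

# Detection threshold from the abstract: sigma_t * sqrt(2*log(n)/n).
thresh = sigma * np.sqrt(2 * np.log(n) / n)
print(thresh, np.flatnonzero(np.abs(dX) > thresh))  # flags the jump at n//2
```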

15.
In this paper we address the problem of protecting confidentiality in statistical tables containing sensitive information that cannot be disseminated. This is an issue of primary importance in practice. Cell suppression is a widely used technique for avoiding disclosure of sensitive information, which consists in suppressing all sensitive table entries along with a certain number of other entries, called complementary suppressions. Determining a pattern of complementary suppressions that minimizes the overall loss of information results in a difficult (i.e., NP-hard) optimization problem known as the Cell Suppression Problem. We propose here a different protection methodology consisting of replacing some table entries by appropriate intervals containing the actual value of the unpublished cells. We call this methodology partial cell suppression, as opposed to the classical complete cell suppression. Partial cell suppression has the important advantage of reducing the overall information loss needed to protect the sensitive information. Also, the new method automatically provides auditing ranges for each unpublished cell, thus saving an often time-consuming task for the statistical office while increasing the information explicitly provided with the table. Moreover, we propose an efficient (i.e., polynomial-time) algorithm to find an optimal partial suppression solution. A preliminary computational comparison between the partial and complete suppression methodologies is reported, showing the advantages of the new approach. Finally, we address possible extensions leading to a unified complete/partial cell suppression framework.

16.
Krämer (Sankhyā 42:130–131, 1980) posed the following problem: “Which are the $\mathbf{y}$, given $\mathbf{X}$ and $\mathbf{V}$, such that OLS and Gauss–Markov are equal?”. In other words, the problem aimed at identifying those vectors $\mathbf{y}$ for which the ordinary least squares (OLS) and Gauss–Markov estimates of the parameter vector $\varvec{\beta}$ coincide under the general Gauss–Markov model $\mathbf{y} = \mathbf{X} \varvec{\beta} + \mathbf{u}$. The problem was later called a “twist” to Kruskal's Theorem, which provides conditions necessary and sufficient for the OLS and Gauss–Markov estimates of $\varvec{\beta}$ to be equal. The present paper focuses on a problem similar to the one posed by Krämer in the aforementioned paper. However, instead of the estimation of $\varvec{\beta}$, we consider the estimation of the systematic part $\mathbf{X} \varvec{\beta}$, which is a natural consequence of relaxing the assumption, made by Krämer, that $\mathbf{X}$ and $\mathbf{V}$ are of full (column) rank. Further results, dealing with the Euclidean distance between the best linear unbiased estimator (BLUE) and the ordinary least squares estimator (OLSE) of $\mathbf{X} \varvec{\beta}$, as well as with an equality between BLUE and OLSE, are also provided. The calculations are mostly based on a joint partitioned representation of a pair of orthogonal projectors.
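Kruskal's Theorem says that the OLS and Gauss–Markov estimates coincide for every y exactly when col(VX) ⊆ col(X), equivalently when V commutes with the hat matrix. A numerical sketch checking both a conforming and a generic V (the construction of the conforming V is our illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 12, 3
X = rng.normal(size=(n, p))
P = X @ np.linalg.solve(X.T @ X, X.T)   # orthogonal projector onto col(X)

# A V satisfying Kruskal's condition: V = a*P + b*(I - P) commutes with P.
V = 2.0 * P + 0.5 * (np.eye(n) - P)
print(np.allclose(P @ V, V @ P))        # True -> OLSE = BLUE for every y

# A generic V fails the condition, so OLSE and BLUE differ for most y.
A = rng.normal(size=(n, n))
V2 = A @ A.T + np.eye(n)
print(np.allclose(P @ V2, V2 @ P))      # False
```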

17.
The Double Chain Markov Model (DCMM) is used to model an observable process $Y = \{Y_{t}\}_{t=1}^{T}$ as a Markov chain with transition matrix $P_{x_{t}}$ dependent on the value of an unobservable (hidden) Markov chain $\{X_{t}\}_{t=1}^{T}$. We present and justify an efficient algorithm for sampling from the posterior distribution associated with the DCMM when the observable process Y consists of independent vectors of (possibly) different lengths. Convergence of the Gibbs sampler, used to simulate the posterior density, is improved by adding a random permutation step. Simulation studies are included to illustrate the method. The problem that motivated our model is presented at the end. It is an application to real data consisting of the credit rating dynamics of a portfolio of financial companies, where the (unobserved) hidden process is the state of the broader economy.
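Generatively, the DCMM is just two coupled chains: the hidden chain evolves by its own transition matrix, and the observed chain moves by the transition matrix selected by the current hidden state. A forward-simulation sketch (matrices and dimensions are invented; the paper's Gibbs sampler for the posterior is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hidden chain over 2 economic states, with its own transition matrix A.
A = np.array([[0.95, 0.05],
              [0.10, 0.90]])
# P[x]: transition matrix of the observable chain Y given hidden state x.
P = np.array([[[0.8, 0.2],
               [0.3, 0.7]],
              [[0.5, 0.5],
               [0.6, 0.4]]])

T = 200
x = np.zeros(T, dtype=int)
y = np.zeros(T, dtype=int)
for t in range(1, T):
    x[t] = rng.choice(2, p=A[x[t - 1]])
    y[t] = rng.choice(2, p=P[x[t], y[t - 1]])  # Y is Markov given hidden X
print(np.bincount(x), np.bincount(y))
```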

18.
In this paper, we consider the problem of testing hypotheses about the drift parameter \(\theta \) in the process \(\text {d}Y^{\delta }_{t} = \theta \dot{f}(t)Y^{\delta }_{t}\text {d}t + b(t)\text {d}L^{\delta }_{t}\) driven by a symmetric \(\delta \)-stable Lévy process \(L^{\delta }_{t}\), where \(\dot{f}(t)\) is the derivative of a known increasing function f(t) and b(t) is known as well. We consider the hypotheses \(H_{0}: \theta \le 0\) and \(K_{0}: \theta =0\) against the alternatives \(H_{1}: \theta >0\) and \(K_{1}: \theta \ne 0\), respectively. For these hypotheses, we propose inverse methods, motivated by a sequential approach, based on the first hitting time of the observed process (or its absolute value) to a pre-specified boundary or two boundaries up to some given time. The applicability of these methods is illustrated. For the case \(Y^{\delta }_{0}=0\), we are able to calculate the values of the boundaries and finite observation times more directly. We show the consistency of the proposed tests for \(Y^{\delta }_{0}\ge 0\) with \(\delta \in (1,2]\) and for \(Y^{\delta }_{0}=0\) with \(\delta \in (0,2]\) under quite mild conditions.

19.
In this paper, we propose an adaptive algorithm that iteratively updates both the weights and component parameters of a mixture importance sampling density so as to optimise the performance of importance sampling, as measured by an entropy criterion. The method, called M-PMC, is shown to be applicable to a wide class of importance sampling densities, which includes in particular mixtures of multivariate Student t distributions. The performance of the proposed scheme is studied on both artificial and real examples, highlighting in particular the benefit of a novel Rao-Blackwellisation device which can be easily incorporated in the updating scheme. This work has been supported by the Agence Nationale de la Recherche (ANR) through a 2006–2008 project. Both last authors are grateful to the participants of the BIRS meeting on “Bioinformatics, Genetics and Stochastic Computation: Bridging the Gap”, Banff, for their comments on an earlier version of this paper. The last author also acknowledges a helpful discussion with Geoff McLachlan. The authors wish to thank both referees for their encouraging comments.
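A stripped-down version of the adaptive idea, for a two-component univariate Gaussian mixture proposal with fixed scales, updates the mixture weights and means from Rao-Blackwellised responsibilities weighted by the normalized importance weights. This is only a sketch in the spirit of M-PMC, not the paper's algorithm; the target, initialization, and coded update rules are our assumptions:

```python
import numpy as np
from scipy.stats import norm

def target_logpdf(x):
    # Toy bimodal target: equal mixture of N(-3, 1) and N(3, 1).
    return np.logaddexp(norm.logpdf(x, -3), norm.logpdf(x, 3)) - np.log(2)

rng = np.random.default_rng(0)
mu = np.array([-1.0, 1.0])      # component means of the proposal
w = np.array([0.5, 0.5])        # mixture weights
s = 2.0                         # fixed component scale
for _ in range(10):
    comp = rng.choice(2, size=2000, p=w)
    x = rng.normal(mu[comp], s)
    q = w[0] * norm.pdf(x, mu[0], s) + w[1] * norm.pdf(x, mu[1], s)
    iw = np.exp(target_logpdf(x)) / q          # importance weights
    iw /= iw.sum()
    # Rao-Blackwellised responsibilities of each component for each draw.
    r = np.stack([w[j] * norm.pdf(x, mu[j], s) / q for j in range(2)])
    w = (r * iw).sum(axis=1)
    w /= w.sum()                               # update mixture weights
    mu = (r * iw * x).sum(axis=1) / (r * iw).sum(axis=1)  # update means
print(w, mu)   # means should drift toward the modes -3 and 3
```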

20.
Estimation of a normal mean relative to balanced loss functions
Let $X_1, \ldots, X_n$ be a random sample from a normal distribution with mean θ and variance σ². The problem is to estimate θ under Zellner's (1994) balanced loss function, $L(\theta,\delta)=\frac{\omega}{n}\sum_{i=1}^{n}(X_i-\delta)^2+(1-\omega)(\delta-\theta)^2$, where 0 < ω < 1. It is shown that the sample mean $\bar{X}$ is admissible. More generally, we investigate the admissibility of estimators of the form $a\bar{X}+b$ under this loss. We also consider the weighted balanced loss function, $L_q(\theta,\delta)=q(\theta)L(\theta,\delta)$, where q(θ) is any positive function of θ, and the class of admissible linear estimators is obtained under such loss with $q(\theta)=e^{\theta}$.
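Taking the balanced loss in the form given above, the risk of linear estimators $a\bar{X}$ is simple to approximate by Monte Carlo. A sketch (parameter values are ours; the admissibility statements themselves are analytic results, not something this simulation proves):

```python
import numpy as np

def balanced_loss(x, delta, theta, omega):
    """Balanced loss in the form used above: omega * goodness-of-fit term
    plus (1 - omega) * squared estimation error."""
    fit = np.mean((x - delta[:, None]) ** 2, axis=1)
    return omega * fit + (1 - omega) * (delta - theta) ** 2

rng = np.random.default_rng(0)
n, theta, omega = 20, 1.0, 0.3
x = rng.normal(theta, 1.0, size=(100_000, n))
xbar = x.mean(axis=1)
for a in (1.0, 0.9, 0.5):                # linear estimators a * xbar
    risk = balanced_loss(x, a * xbar, theta, omega).mean()
    print(a, risk)
```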

