
1.

Iterative numerical methods for sampling from high dimensional Gaussian distributions

Erlend Aune Jo Eidsvik Yvo Pokern 《Statistics and Computing》2013,23(4):501-521

Many applications require efficient sampling from Gaussian distributions. The method of choice depends on the dimension of the problem as well as the structure of the covariance matrix (Σ) or precision matrix (Q). The most common black-box routine for computing a sample is based on Cholesky factorization. In high dimensions, computing the Cholesky factor of Σ or Q may be prohibitive because the factor accumulates more non-zero entries than can be stored in memory. We compare different methods for computing the samples iteratively, adapting ideas from numerical linear algebra. These methods assume that matrix-vector products, Qv, are fast to compute. We show that some of the methods are competitive with, and faster than, Cholesky sampling and that a parallel version of one method on a Graphics Processing Unit (GPU) using CUDA can introduce a speed-up of up to 30x. Moreover, one method is used to sample from the posterior distribution of petroleum reservoir parameters in a North Sea field, given seismic reflection data on a large 3D grid.
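The Cholesky baseline that the abstract compares against can be sketched for a small dense precision matrix. This is a minimal illustration under stated assumptions, not the authors' code: the helper names `cholesky`, `solve_lt` and `sample_from_precision` are hypothetical, and the tridiagonal Q is a toy example. If Q = L Lᵀ and z is standard normal, then x solving Lᵀ x = z has covariance Q⁻¹.

```python
import math
import random


def cholesky(Q):
    """Return lower-triangular L with Q = L L^T (Q symmetric positive definite)."""
    n = len(Q)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = Q[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_lt(L, z):
    """Solve L^T x = z by back substitution."""
    n = len(z)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))
        x[i] = s / L[i][i]
    return x


def sample_from_precision(Q, rng):
    """Draw one sample from N(0, Q^{-1}) using the Cholesky factor of Q."""
    L = cholesky(Q)
    z = [rng.gauss(0.0, 1.0) for _ in Q]
    return solve_lt(L, z)


# Toy precision matrix (assumed for illustration only).
Q = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
sample = sample_from_precision(Q, random.Random(0))
```

The dense factor here costs O(n³) time and O(n²) memory, which is the bottleneck the iterative methods are meant to avoid by relying only on products Qv.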

2.

L models and multiple regressions designs

Elsa Moreira João Tiago Mexia Miguel Fonseca Roman Zmyślony 《Statistical Papers》2009,50(4):869-885

Given an orthogonal model

${{\bf \lambda}}=\sum_{i=1}^m{{{\bf X}}_i}{\boldsymbol{\alpha}}_i$

an L model

${{\bf y}}={\bf L}\left(\sum_{i=1}^m{{{\bf X}}_i}{\boldsymbol{\alpha}}_i\right)+{\bf e}$

is obtained, and the only restriction is the linear independence of the column vectors of matrix L. Special cases of the L models correspond to blockwise diagonal matrices ${\bf L}={\bf D}({\bf L}_1,\ldots,{\bf L}_c)$. In multiple regression designs this matrix will be of the form

${\bf L}={\bf D}(\check{{\bf X}}_1,\ldots,\check{{\bf X}}_{c})$

with ${\check{{\bf X}}_j, j=1,\ldots,c}$ the model matrices of the individual regressions, while the original model will have fixed effects. In this way, we overcome the usual restriction of requiring all regressions to have the same model matrix.
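As a rough sketch of the blockwise diagonal form, the snippet below stacks two regression model matrices of different sizes along the diagonal. The helper `block_diag` and the matrices `X1` and `X2` are hypothetical examples, not taken from the paper.

```python
def block_diag(blocks):
    """Place each matrix (a list of rows) on the diagonal, zeros elsewhere."""
    total_cols = sum(len(b[0]) for b in blocks)
    out = []
    col_offset = 0
    for b in blocks:
        width = len(b[0])
        for row in b:
            right_pad = total_cols - col_offset - width
            out.append([0.0] * col_offset + list(row) + [0.0] * right_pad)
        col_offset += width
    return out


# Two individual regressions with different model matrices (assumed data).
X1 = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]  # 3 observations
X2 = [[1.0, -1.0], [1.0, 1.0]]             # 2 observations
L = block_diag([X1, X2])                   # 5 x 4 matrix D(X1, X2)
```

The columns of `L` are linearly independent whenever the columns of each block are, which is the only restriction the abstract places on L.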


3.

Maximum modulus confidence bands

Christopher S. Withers Saralees Nadarajah 《Statistical Papers》2012,53(4):811-819

A family of confidence bands (simultaneous confidence regions) is given for EY = x′β that are piecewise-linear in x. Normality is assumed. These confidence bands are advocated over the usual hyperbolic band when the region of prime interest is distant from ${\overline{\bf x}}$. In particular, this is the case when x = x(t) for time t and future time is of primary interest, that is, for the prediction problem. For the case x′ = (1, t), the family of bands includes that of Graybill and Bowden (J Am Stat Assoc 62:403–408, 1967).

4.

Robust model-based sampling designs

A. H. Welsh Douglas P. Wiens 《Statistics and Computing》2013,23(6):689-701

We investigate methods for the design of sample surveys, and address the traditional resistance of survey samplers to the use of model-based methods by incorporating model robustness at the design stage. The designs are intended to be sufficiently flexible and robust that resulting estimates, based on the designer’s best guess at an appropriate model, remain reasonably accurate in a neighbourhood of this central model. Thus, consider a finite population of N units in which a survey variable Y is related to a q-dimensional auxiliary variable x. We assume that the values of x are known for all N population units, and that we will select a sample of n ≤ N population units and then observe the n corresponding values of Y. The objective is to predict the population total $T=\sum_{i=1}^{N}Y_{i}$. The design problem which we consider is to specify a selection rule, using only the values of the auxiliary variable, to select the n units for the sample so that the predictor has optimal robustness properties. We suppose that T will be predicted by methods based on a linear relationship between Y—possibly transformed—and given functions of x. We maximise the mean squared error of the prediction of T over realistic neighbourhoods of the fitted linear relationship, and of the assumed variance and correlation structures. This maximised mean squared error is then minimised over the class of possible samples, yielding an optimally robust (‘minimax’) design. To carry out the minimisation step we introduce a genetic algorithm and discuss its tuning for maximal efficiency.

5.

Data cloning: Data visualisation, smoothing, confidentiality, and encryption

S.J. Haslett K. Govindaraju 《Journal of statistical planning and inference》2012,142(2):410-422


6.

Minimax estimation in linear regression with ellipsoidal constraints

Maciej Wilczyński 《Journal of statistical planning and inference》2007


7.

On properties of BLUEs under general linear regression models

Yongge Tian 《Journal of statistical planning and inference》2013


8.

On Bahadur representation for sample quantiles under α-mixing sequence

Qinchi Zhang Wenzhi Yang Shuhe Hu 《Statistical Papers》2014,55(2):285-299

In this paper, by relaxing the mixing coefficients to ${\alpha(n)=O(n^{-\beta})}$, β > 3, we investigate the Bahadur representation of sample quantiles under an α-mixing sequence and obtain the rate ${O(n^{-\frac{1}{2}}(\log\log n\cdot\log n)^{\frac{1}{2}})}$. Meanwhile, for any δ > 0, by strengthening the mixing coefficients to ${\alpha(n)=O(n^{-\beta})}$, ${\beta > \max\{3+\frac{5}{1+\delta},1+\frac{2}{\delta}\}}$, we have the rate ${O(n^{-\frac{3}{4}+\frac{\delta}{4(2+\delta)}}(\log\log n\cdot \log n)^{\frac{1}{2}})}$. Specifically, if ${\delta=\frac{\sqrt{41}-5}{4}}$ and ${\beta > \frac{\sqrt{41}+7}{2}}$, then the rate is ${O(n^{-\frac{\sqrt{41}+5}{16}}(\log\log n\cdot \log n)^{\frac{1}{2}})}$.
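The special case quoted at the end of the abstract can be checked numerically. The short script below is only an arithmetic sanity check of the stated formulas (variable names are illustrative); it substitutes δ = (√41 − 5)/4 into the general bound on β and into the exponent of n.

```python
import math

delta = (math.sqrt(41) - 5) / 4

# General condition: beta > max{3 + 5/(1 + delta), 1 + 2/delta}
beta_bound_a = 3 + 5 / (1 + delta)
beta_bound_b = 1 + 2 / delta
beta_bound = max(beta_bound_a, beta_bound_b)

# General exponent of n in the rate: -3/4 + delta / (4 (2 + delta))
exponent = -0.75 + delta / (4 * (2 + delta))

print(beta_bound, (math.sqrt(41) + 7) / 2)    # the two values should agree
print(exponent, -(math.sqrt(41) + 5) / 16)    # the two values should agree
```

At this particular δ the two terms inside the maximum coincide, both reducing to (√41 + 7)/2.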

9.

Distribution theory of quadratic forms for matrix multivariate elliptical distribution

José A. Díaz-García 《Journal of statistical planning and inference》2013

This paper proposes the density and characteristic functions of a general matrix quadratic form $X^{*}AX$, when $A = A^{*}$ is a positive semidefinite matrix, $X$ has a matrix multivariate elliptical distribution and $X^{*}$ denotes the usual conjugate transpose of $X$. These results are obtained for real normed division algebras. As particular cases, we obtain the density and characteristic functions of matrix quadratic forms for matrix multivariate normal, Pearson type VII, t and Cauchy distributions.

10.

Linear models that allow perfect estimation

Ronald Christensen Yong Lin 《Statistical Papers》2013,54(3):695-708

The general Gauss–Markov model, Y = Xβ + e, E(e) = 0, Cov(e) = σ²V, has been intensively studied and widely used. Most studies consider covariance matrices V that are nonsingular but we focus on the most difficult case wherein C(X), the column space of X, is not contained in C(V). This forces V to be singular. Under this condition there exist nontrivial linear functions of Q′Xβ that are known with probability 1 (perfectly) where ${C(Q)=C(V)^\perp}$. To treat ${C(X) \not \subset C(V)}$, much of the existing literature obtains estimates and tests by replacing V with a pseudo-covariance matrix T = V + XUX′ for some nonnegative definite U such that ${C(X) \subset C(T)}$, see Christensen (Plane answers to complex questions: the theory of linear models, 2002, Chap. 10). We find it more intuitive to first eliminate what is known about Xβ and then to adjust X while keeping V unchanged. We show that we can decompose β into the sum of two orthogonal parts, β = β₀ + β₁, where β₀ is known. We also show that the unknown component of Xβ is ${X\beta_1 \equiv \tilde{X} \gamma}$, where ${C(\tilde{X})=C(X)\cap C(V)}$. We replace the original model with ${Y-X\beta_0=\tilde{X}\gamma+e}$, E(e) = 0, ${Cov(e)=\sigma^2V}$ and perform estimation and tests under this new model for which the simplifying assumption ${C(\tilde{X}) \subset C(V)}$ holds. This allows us to focus on the part of the parameters that is not known perfectly. We show that this method provides the usual estimates and tests.
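A tiny simulation can illustrate what "known with probability 1" means. The setup below is an assumed toy example, not from the paper: V = diag(1, 1, 0) with σ = 1, X has columns (1, 0, 1)′ and (0, 1, 1)′, so C(X) is not contained in C(V), and Q = (0, 0, 1)′ spans C(V)^⊥. The third observation then carries no noise, so Q′Y = Q′Xβ = β₁ + β₂ in every draw.

```python
import random


def simulate_y(beta, rng):
    """Draw Y = X beta + e with Cov(e) = diag(1, 1, 0) (toy example)."""
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    sd = [1.0, 1.0, 0.0]  # square roots of the diagonal of V; third is zero
    return [
        sum(x_ij * b for x_ij, b in zip(row, beta)) + s * rng.gauss(0.0, 1.0)
        for row, s in zip(X, sd)
    ]


rng = random.Random(42)
beta = [2.0, -0.5]
draws = [simulate_y(beta, rng) for _ in range(5)]
# The third coordinate of every draw equals beta[0] + beta[1] = 1.5,
# while the first two coordinates vary from draw to draw.
third_coordinates = [y[2] for y in draws]
```

In this toy case the perfectly known function is β₁ + β₂; only the remaining direction of β has to be estimated from the noisy observations.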

11.

Orthogonal designs of parallel flats type

《Journal of statistical planning and inference》1996,53(2):261-283


12.

On the joint distribution of success runs of several lengths in the sequence of MBT and its applications

Kirtee K. Kamalja 《Statistical Papers》2014,55(4):1179-1206

Let $X_1, X_2, \ldots, X_n$ be a sequence of Markov Bernoulli trials (MBT) and $\underline{X}_n = (X_{n,k_1}, X_{n,k_2}, \ldots, X_{n,k_r})$ be a random vector where $X_{n,k_i}$ represents the number of occurrences of success runs of length $k_i$ $(i=1,2,\ldots,r)$. In this paper the joint distribution of $\underline{X}_n$ in the sequence of $n$ MBT is studied using the method of conditional probability generating functions. Five different counting schemes of runs, namely non-overlapping runs, runs of length at least $k$, overlapping runs, runs of exact length $k$ and $\ell$-overlapping runs (i.e. the $\ell$-overlapping counting scheme), $0\le \ell < k$, are considered. The pgf of the joint distribution of $\underline{X}_n$ is obtained in terms of a matrix polynomial and an algorithm is developed to get the exact probability distribution. Numerical results are included to demonstrate the computational flexibility of the developed results. Various applications of the joint distribution of $\underline{X}_n$, such as in the evaluation of the reliability of $(n,f,k)\!:\!G$ and $\langle n,f,k\rangle\!:\!G$ systems, in the evaluation of quantities related to start-up demonstration tests, and in acceptance sampling plans, are also discussed.
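The five counting schemes can be made concrete on a single binary sequence. The sketch below uses the conventions as commonly stated in the runs literature, which are an assumption here since the abstract does not spell them out: a maximal success run of length m contributes ⌊m/k⌋ non-overlapping runs, m − k + 1 overlapping runs and ⌊(m − ℓ)/(k − ℓ)⌋ ℓ-overlapping runs when m ≥ k. The function names are hypothetical, and the code assumes 0 ≤ ℓ < k.

```python
def maximal_run_lengths(trials):
    """Lengths of the maximal runs of 1s in a 0/1 sequence."""
    runs, current = [], 0
    for t in trials:
        if t == 1:
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def count_runs(trials, k, ell=0):
    """Count success runs of length k under five schemes (0 <= ell < k)."""
    runs = maximal_run_lengths(trials)
    return {
        "non_overlapping": sum(m // k for m in runs),
        "at_least_k": sum(1 for m in runs if m >= k),
        "overlapping": sum(m - k + 1 for m in runs if m >= k),
        "exactly_k": sum(1 for m in runs if m == k),
        "ell_overlapping": sum((m - ell) // (k - ell) for m in runs if m >= k),
    }


trials = [1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1]  # maximal runs of length 5, 2, 3
counts = count_runs(trials, k=3, ell=1)
# counts == {"non_overlapping": 2, "at_least_k": 2, "overlapping": 4,
#            "exactly_k": 1, "ell_overlapping": 3}
```

Setting `ell=0` reproduces the non-overlapping count and `ell=k-1` reproduces the overlapping count, which is why the ℓ-overlapping scheme is usually seen as interpolating between the two.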

13.

A new universal resample-stable bootstrap-based stopping criterion for PLS component construction

Jérémy Magnanensi Frédéric Bertrand Myriam Maumy-Bertrand Nicolas Meyer 《Statistics and Computing》2017,27(3):757-774

We develop a new robust stopping criterion for partial least squares regression (PLSR) component construction, characterized by a high level of stability. This new criterion is universal since it is suitable both for PLSR and extensions to generalized linear regression (PLSGLR). The criterion is based on a non-parametric bootstrap technique and must be computed algorithmically. It allows the testing of each successive component at a preset significance level $\alpha$. In order to assess its performance and robustness with respect to various noise levels, we perform dataset simulations in which there is a preset and known number of components. These simulations are carried out for datasets characterized both by $n>p$, with n the number of subjects and p the number of covariates, as well as for $n<p$. We then use t-tests to compare the predictive performance of our approach with other common criteria. The stability property is in particular tested through re-sampling processes on a real allelotyping dataset. An important additional conclusion is that this new criterion gives globally better predictive performances than existing ones in both the PLSR and PLSGLR (logistic and Poisson) frameworks.

14.

An interesting property of the Poisson operating characteristic function

R. Göb 《Statistical Papers》1992,33(1):273-277

In elementary probability theory, as a result of a limiting process the probabilities of a Bi(n, p) binomial distribution are approximated by the probabilities of a Po(np) Poisson distribution. Accordingly, in statistical quality control the binomial operating characteristic function $\mathcal{L}_{n,c} (p)$ is approximated by the Poisson operating characteristic function $\mathcal{F}_{n,c} (p)$. The inequality $\mathcal{L}_{n + 1,c + 1} (p) > \mathcal{L}_{n,c} (p)$ for p ∈ (0;1) is evident from the interpretation of $\mathcal{L}_{n + 1,c + 1} (p)$, $\mathcal{L}_{n,c} (p)$ as probabilities of accepting a lot. It is shown that the Poisson approximation $\mathcal{F}_{n,c} (p)$ preserves this essential feature of the binomial operating characteristic function, i.e. that an analogous inequality holds for the Poisson operating characteristic function, too.
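Both operating characteristic functions are short sums, so the inequality can be probed numerically. The sketch assumes the usual acceptance-sampling reading, in which a lot is accepted when at most c defectives are found among n items, so that the binomial OC is P(Bi(n, p) ≤ c) and the Poisson OC is P(Po(np) ≤ c); the function names are hypothetical.

```python
import math


def binomial_oc(n, c, p):
    """P(X <= c) for X ~ Bi(n, p): acceptance probability of a lot."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))


def poisson_oc(n, c, p):
    """P(X <= c) for X ~ Po(n p): Poisson approximation to binomial_oc."""
    lam = n * p
    return math.exp(-lam) * sum(lam**k / math.factorial(k) for k in range(c + 1))


# Spot check of the inequality on one plan (values chosen arbitrarily).
n, c, p = 10, 1, 0.2
binomial_gain = binomial_oc(n + 1, c + 1, p) - binomial_oc(n, c, p)
poisson_gain = poisson_oc(n + 1, c + 1, p) - poisson_oc(n, c, p)
```

A grid check of this kind is no substitute for the proof, but it is a quick way to see both gains stay positive across plans and defect rates.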

15.

On Two Types of Breakdown Points of Weighted L²-median

Caiya Zhang Yan Luo 《Communications in Statistics - Theory and Methods》2013,42(7):1131-1141

Zuo (2004) investigated the simplified replacement finite sample breakdown point of weighted L^p-depth and L^p-median for some appropriate weight functions. The addition breakdown point of weighted L^p-depth functions is first studied in this article. In addition, for some other weight functions different from those in Zuo (2004; Robustness of weighted L^p-depth and L^p-median, Allgemeines Statistisches Archiv 88:215–234), we establish the lower bounds of these two types of breakdown point of the weighted L²-median.

16.

Efficient Bayesian estimation of the multivariate Double Chain Markov Model

Matthew Fitzpatrick Dobrin Marchev 《Statistics and Computing》2013,23(4):467-480

The Double Chain Markov Model (DCMM) is used to model an observable process $Y = \{Y_{t}\}_{t=1}^{T}$ as a Markov chain with transition matrix, $P_{x_{t}}$, dependent on the value of an unobservable (hidden) Markov chain $\{X_{t}\}_{t=1}^{T}$. We present and justify an efficient algorithm for sampling from the posterior distribution associated with the DCMM, when the observable process Y consists of independent vectors of (possibly) different lengths. Convergence of the Gibbs sampler, used to simulate the posterior density, is improved by adding a random permutation step. Simulation studies are included to illustrate the method. The problem that motivated our model is presented at the end. It is an application to real data, consisting of the credit rating dynamics of a portfolio of financial companies where the (unobserved) hidden process is the state of the broader economy.
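The generative structure of the DCMM can be sketched as a forward simulation. This is only an illustration of the model definition, not the posterior sampler of the paper; `simulate_dcmm`, the initial states and the example matrices are assumptions made for the sketch. At each step the hidden state moves according to its own transition matrix `A`, and the observed state then moves according to the matrix `P[x]` selected by the current hidden state.

```python
import random


def simulate_dcmm(A, P, T, rng, x0=0, y0=0):
    """Simulate T steps of a hidden chain X and an observed chain Y.

    A[i][j]    : P(X_t = j | X_{t-1} = i)
    P[x][i][j] : P(Y_t = j | Y_{t-1} = i, X_t = x)
    """
    def draw(probs):
        return rng.choices(range(len(probs)), weights=probs)[0]

    xs, ys = [x0], [y0]
    for _ in range(T - 1):
        x = draw(A[xs[-1]])        # hidden chain moves first
        y = draw(P[x][ys[-1]])     # observed chain uses the matrix for x
        xs.append(x)
        ys.append(y)
    return xs, ys


# Two hidden regimes: a "sticky" one and a "volatile" one (toy numbers).
A = [[0.9, 0.1], [0.2, 0.8]]
P = [
    [[0.95, 0.05], [0.05, 0.95]],  # regime 0: observed state rarely changes
    [[0.50, 0.50], [0.50, 0.50]],  # regime 1: observed state is a coin flip
]
xs, ys = simulate_dcmm(A, P, T=20, rng=random.Random(0))
```

In the credit rating application described above, `xs` would play the role of the unobserved state of the economy and `ys` the rating path of one company.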

17.

The variable parameters $T^{2}$ chart with run rules

Alireza Faraz Giovanni Celano Erwin Saniga C. Heuchenne S. Fichera 《Statistical Papers》2014,55(4):933-950

The Hotelling’s $T^{2}$ control chart with variable parameters (VP $T^{2}$) has been shown to have better statistical performance than other adaptive control schemes in detecting small to moderate process mean shifts. In this paper, we investigate the statistical performance of the VP $T^{2}$ control chart coupled with run rules. We consider two well-known run rules schemes. Statistical performance is evaluated by using a Markov chain modeling the random shock mechanism of the monitored process. The in-control time interval of the process is assumed to follow an exponential distribution. A genetic algorithm has been designed to select the optimal chart design parameters. We provide an extensive numerical analysis indicating that the VP $T^{2}$ control chart with run rules outperforms other charts for small sizes of the mean shift expressed through the Mahalanobis distance.
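For orientation, the statistic plotted on such a chart can be computed as follows. The sketch assumes a bivariate process with known in-control mean `mu0` and covariance `sigma`, for which the subgroup statistic is T² = n (x̄ − μ₀)′ Σ⁻¹ (x̄ − μ₀); the variable-parameter scheme and the run rules themselves are not implemented, and `hotelling_t2` is a hypothetical helper name.

```python
def hotelling_t2(sample, mu0, sigma):
    """T^2 = n (xbar - mu0)' Sigma^{-1} (xbar - mu0) for 2-dimensional data."""
    n = len(sample)
    xbar = [sum(obs[j] for obs in sample) / n for j in range(2)]
    d = [xbar[0] - mu0[0], xbar[1] - mu0[1]]
    (a, b), (c, e) = sigma
    det = a * e - b * c
    inv = [[e / det, -b / det], [-c / det, a / det]]  # inverse of a 2x2 matrix
    quad = sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))
    return n * quad


subgroup = [[1.0, 0.0], [3.0, 2.0]]          # subgroup mean is (2, 1)
t2 = hotelling_t2(subgroup, [0.0, 0.0], [[2.0, 0.0], [0.0, 1.0]])
# t2 == 2 * (2**2 / 2 + 1**2 / 1) == 6.0
```

The quadratic form without the factor n is the squared Mahalanobis distance of the subgroup mean from the target, the same scale on which the abstract expresses the size of the mean shift.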

18.

Bayesian variable selection for correlated covariates via colored cliques

Stefano Monni 《AStA Advances in Statistical Analysis》2014,98(2):143-163

We propose a Bayesian method to select groups of correlated explanatory variables in a linear regression framework. We do this by introducing in the prior distribution assigned to the regression coefficients a random matrix $G$ that encodes the group structure. The groups can thus be inferred by sampling from the posterior distribution of $G$. We then give a graph-theoretic interpretation of this random matrix $G$ as the adjacency matrix of cliques. We discuss the extension of the groups from cliques to more general random graphs, so that the proposed approach can be viewed as a method to find networks of correlated covariates that are associated with the response.
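The graph-theoretic reading of $G$ can be illustrated with a fixed grouping. In the sketch below, covariates that share a group label are joined by an edge, so each group forms a clique; the zero diagonal and the helper name `clique_adjacency` are assumptions for illustration, and the prior and posterior sampling of the paper are not reproduced.

```python
def clique_adjacency(labels):
    """Adjacency matrix in which covariates with equal labels form a clique."""
    p = len(labels)
    return [
        [1 if i != j and labels[i] == labels[j] else 0 for j in range(p)]
        for i in range(p)
    ]


# Five covariates: {0, 1, 3} form one group, {2, 4} another (toy labels).
G = clique_adjacency([0, 0, 1, 0, 1])
```

Reading the connected blocks of such a matrix back off recovers the groups, which is how a posterior draw of $G$ would translate into a selected grouping.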

19.

On the strong convergence for weighted sums of ρ*-mixing random variables

Soo Hak Sung 《Statistical Papers》2013,54(3):773-781

A complete convergence result is obtained for weighted sums of identically distributed ρ*-mixing random variables with $E|X_1|^{\alpha}\log(1+|X_1|) < \infty$ for some 0 < α ≤ 2. This result partially extends the result of Sung (Stat Papers 52: 447–454, 2011) for negatively associated random variables to ρ*-mixing random variables. It also settles the open problem posed by Zhou et al. (J Inequal Appl, 2011, doi:10.1155/2011/157816).

20.

Order selection in finite mixtures of linear regressions

Nicolas Depraetere Martina Vandebroek 《Statistical Papers》2014,55(3):871-911

Finite mixture models can adequately model population heterogeneity when this heterogeneity arises from a finite number of relatively homogeneous clusters. An example of such a situation is market segmentation. Order selection in mixture models, i.e. selecting the correct number of components, however, is a problem which has not been satisfactorily resolved. Existing simulation results in the literature do not completely agree with each other. Moreover, it appears that the performance of different selection methods is affected by the type of model and the parameter values. Furthermore, most existing results are based on simulations where the true generating model is identical to one of the models in the candidate set. In order to partly fill this gap we carried out a (relatively) large simulation study for finite mixture models of normal linear regressions. We included several types of model (mis)specification to study the robustness of 18 order selection methods. Furthermore, we compared the performance of these selection methods based on unpenalized and penalized estimates of the model parameters. The results indicate that order selection based on penalized estimates greatly improves the success rates of all order selection methods. The most successful methods were $MDL2$, $MRC$, $MRC_k$, $ICL$–$BIC$, $ICL$, $CAIC$, $BIC$ and $CLC$ but not one method was consistently good or best for all types of model (mis)specification.