1.

Maximum expected entropy transformed Latin hypercube designs

Chong Sheng Matthias Hwai Yong Tan Lu Zou 《Journal of applied statistics》2021,48(12):2152

Existing projection designs (e.g. maximum projection designs) attempt to achieve good space-filling properties in all projections. However, when using a Gaussian process (GP), a model-based design criterion such as the entropy criterion is more appropriate. We employ the entropy criterion averaged over a set of projections, called the expected entropy criterion (EEC), to generate projection designs. We show that maximum EEC designs are invariant to monotonic transformations of the response, i.e. they are optimal for a wide class of stochastic process models. We also demonstrate that transformation of each column of a Latin hypercube design (LHD) based on a monotonic function can substantially improve the EEC. Two types of input transformations are considered: a quantile function of a symmetric Beta distribution chosen to optimize the EEC, and a nonparametric transformation corresponding to the quantile function of a symmetric density chosen to optimize the EEC. Numerical studies show that the proposed transformations of the LHD are efficient and effective for building robust maximum EEC designs. These designs give projections with markedly higher entropies and lower maximum prediction variances (MPVs) at the cost of small increases in average prediction variances (APVs) compared to state-of-the-art space-filling designs over wide ranges of covariance parameter values.
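
The idea of transforming LHD columns and scoring the result by an averaged entropy can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: a midpoint LHD, a Gaussian correlation with `theta = 20`, a small nugget, equal weighting of all non-empty column subsets, and a fixed symmetric Beta(1/2, 1/2) quantile (which has the closed form sin²(πu/2)) in place of an optimized Beta parameter. The entropy of a GP at the design points is, up to constants, the log-determinant of the correlation matrix.

```python
import itertools
import math
import random


def midpoint_lhd(n, d, rng):
    """Return an n-run, d-factor Latin hypercube with levels at cell midpoints."""
    levels = [(i + 0.5) / n for i in range(n)]
    columns = [rng.sample(levels, n) for _ in range(d)]  # one permutation per factor
    return [[columns[j][i] for j in range(d)] for i in range(n)]


def beta_half_quantile(u):
    """Quantile function of the symmetric Beta(1/2, 1/2) distribution."""
    return math.sin(math.pi * u / 2.0) ** 2


def log_det_correlation(design, cols, theta=20.0, nugget=1e-8):
    """Log-determinant of a Gaussian correlation matrix on the given columns."""
    n = len(design)
    r = [[math.exp(-theta * sum((design[i][c] - design[j][c]) ** 2 for c in cols))
          for j in range(n)] for i in range(n)]
    for i in range(n):
        r[i][i] += nugget  # assumed nugget for numerical stability
    # Cholesky factorisation R = L L^T, so log det R = 2 * sum(log L_ii).
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = r[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = math.sqrt(s) if i == j else s / chol[j][j]
    return 2.0 * sum(math.log(chol[i][i]) for i in range(n))


def expected_entropy(design):
    """Average log-determinant over all non-empty subsets of columns (assumed weighting)."""
    d = len(design[0])
    subsets = [c for k in range(1, d + 1)
               for c in itertools.combinations(range(d), k)]
    return sum(log_det_correlation(design, c) for c in subsets) / len(subsets)


rng = random.Random(0)
lhd = midpoint_lhd(8, 2, rng)
# A monotone transformation of each column keeps the Latin structure (ranks are preserved).
transformed = [[beta_half_quantile(u) for u in row] for row in lhd]
print("EEC, original LHD:   ", expected_entropy(lhd))
print("EEC, transformed LHD:", expected_entropy(transformed))
```

Whether the transformed design scores higher depends on the correlation parameter and on the Beta parameter; the abstract describes choosing the latter to optimize the EEC rather than fixing it as done here.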

2.

Optimal regular graph designs

Sera Aylin Cakiroglu 《Statistics and Computing》2018,28(1):103-112

A typical problem in optimal design theory is finding an experimental design that is optimal with respect to some criteria in a class of designs. The most popular criteria include the A- and D-criteria. Regular graph designs occur in many optimality results, and if the number of blocks is large enough, an A-optimal (or D-optimal) design is among them (if any exist). To explore the landscape of designs with a large number of blocks, we introduce extensions of regular graph designs. These are constructed by adding the blocks of a balanced incomplete block design repeatedly to the original design. We present the results of an exact computer search for the best regular graph designs and the best extended regular graph designs with up to 20 treatments v, block size \(k \le 10\), replication \(r \le 10\) and \(r(k-1)-(v-1)\lfloor r(k-1)/(v-1)\rfloor \le 9\).

3.

Robust extrapolation designs and weights for biased regression models with heteroscedastic errors

Zhide Fang Douglas P. Wiens 《Revue canadienne de statistique》1999,27(4):751-770

We consider the construction of designs for the extrapolation of regression responses, allowing both for possible heteroscedasticity in the errors and for imprecision in the specification of the response function. We find minimax designs and correspondingly optimal estimation weights in the context of the following problems: (1) for ordinary least squares estimation, determine a design to minimize the maximum value of the integrated mean squared prediction error (IMSPE), with the maximum being evaluated over both types of departure; (2) for weighted least squares estimation, determine both weights and a design to minimize the maximum IMSPE; (3) choose weights and design points to minimize the maximum IMSPE, subject to a side condition of unbiasedness. Solutions to (1) and (2) are given for multiple linear regression with no interactions, a spherical design space and an annular extrapolation space. For (3) the solution is given in complete generality; as one example we consider polynomial regression. Applications to a dose-response problem for bioassays are discussed. Numerical comparisons, including a simulation study, indicate that, as well as being easily implemented, the designs and weights for (3) perform as well as those for (1) and (2) and outperform some common competitors for moderate but undetectable amounts of model bias.

4.

Comparing and generating Latin Hypercube designs in Kriging models

Giovanni Pistone Grazia Vicario 《AStA Advances in Statistical Analysis》2010,94(4):353-366

In Computer Experiments (CE), a careful selection of the design points is essential for predicting the system response at untried points, based on the values observed at tried points. In physical experiments, the protocol is based on Design of Experiments, a methodology whose basic principles are questioned in CE. When the responses of a CE are modeled as jointly Gaussian random variables with their covariance depending on the distance between points, the use of the so-called space-filling designs (random designs, stratified designs and Latin Hypercube designs) is a common choice, because it is expected that the nearer the untried point is to the design points, the better the prediction. In this paper we focus on the class of Latin Hypercube (LH) designs. The behavior of various LH designs is examined according to the Gaussian assumption with exponential correlation, in order to minimize the total prediction error at the points of a regular lattice. In such a special case, the problem is reduced to an algebraic statistical model, which is solved using both symbolic algebraic software and statistical software. We provide closed-form computation of the variance of the Gaussian linear predictor as a function of the design, in order to make a comparison between LH designs. In principle, the method applies to any number of factors and any number of levels, and also to classes of designs other than LHs. In our current implementation, the applicability is limited by the high computational complexity of the algorithms involved.

5.

Importance tempering

Robert Gramacy Richard Samworth Ruth King 《Statistics and Computing》2010,20(1):1-7

Simulated tempering (ST) is an established Markov chain Monte Carlo (MCMC) method for sampling from a multimodal density π(θ). Typically, ST involves introducing an auxiliary variable k taking values in a finite subset of [0,1] and indexing a set of tempered distributions, say π_k(θ) ∝ π(θ)^k. In this case, small values of k encourage better mixing, but samples from π are only obtained when the joint chain for (θ,k) reaches k=1. However, the entire chain can be used to estimate expectations under π of functions of interest, provided that importance sampling (IS) weights are calculated. Unfortunately this method, which we call importance tempering (IT), can disappoint. This is partly because the most immediately obvious implementation is naïve and can lead to high variance estimators. We derive a new optimal method for combining multiple IS estimators and prove that the resulting estimator has a highly desirable property related to the notion of effective sample size. We briefly report on the success of the optimal combination in two modelling scenarios requiring reversible-jump MCMC, where the naïve approach fails.
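
The importance weights and the combination of per-temperature estimators can be sketched on a toy problem. The following is an illustration under stated assumptions, not the authors' estimator: the target π is taken to be standard normal, so π^k is proportional to a N(0, 1/k) density and is sampled directly instead of by MCMC, and the per-temperature estimates are combined with weights proportional to their effective sample sizes, which is one natural reading of the "property related to the notion of effective sample size". All function names are hypothetical.

```python
import math
import random


def tempered_samples(k, n, rng):
    """Draw n samples from pi(theta)^k, where pi is the standard normal density.

    pi^k is proportional to a N(0, 1/k) density, so it is sampled directly here;
    a real application would use the tempered MCMC chain instead.
    """
    return [rng.gauss(0.0, 1.0 / math.sqrt(k)) for _ in range(n)]


def is_weights(samples, k):
    """Unnormalised importance weights pi(theta) / pi(theta)^k = pi(theta)^(1 - k)."""
    return [math.exp(-0.5 * (1.0 - k) * x * x) for x in samples]


def effective_sample_size(weights):
    """ESS = (sum w)^2 / sum w^2, which never exceeds the number of samples."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def combined_estimate(h, temperatures, n, rng):
    """Combine self-normalised IS estimates across temperatures, weighting by ESS."""
    estimates, ess = [], []
    for k in temperatures:
        xs = tempered_samples(k, n, rng)
        ws = is_weights(xs, k)
        estimates.append(sum(w * h(x) for w, x in zip(ws, xs)) / sum(ws))
        ess.append(effective_sample_size(ws))
    lam = [e / sum(ess) for e in ess]  # assumed combination weights
    return sum(l * est for l, est in zip(lam, estimates)), ess


rng = random.Random(1)
estimate, ess = combined_estimate(lambda x: x * x, [0.5, 0.75, 1.0], 5000, rng)
print("estimate of E[theta^2] under pi:", estimate)
print("ESS per temperature:", ess)
```

At k=1 every weight equals one, so that temperature contributes its full sample size; colder-to-hotter temperatures contribute less as their weights become more uneven.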

6.

Rate of uniform consistency for a class of mode regression on functional stationary ergodic data

Mohamed Chaouch Naâmane Laïb Djamal Louani 《Statistical Methods and Applications》2017,26(1):19-47

The aim of this paper is to study the asymptotic properties of a class of kernel conditional mode estimates whenever functional stationary ergodic data are considered. More precisely, in the ergodic data setting, we consider a random element (X, Z) taking values in some semi-metric abstract space \(E\times F\). For a real function \(\varphi \) defined on the space F and \(x\in E\), we consider the conditional mode of the real random variable \(\varphi (Z)\) given the event “\(X=x\)”. While estimating the conditional mode function, say \(\theta _\varphi (x)\), using the well-known kernel estimator, we establish the strong consistency with rate of this estimate uniformly over Vapnik–Chervonenkis classes of functions \(\varphi \). Notice that the ergodic setting offers a more general framework than the usual mixing structure. Two applications to energy data are provided to illustrate the proposed approach in a time series forecasting framework. The first one consists in forecasting the daily peak of electricity demand in France (measured in Giga-Watt). The second one deals with the short-term forecasting of the electrical energy (measured in Giga-Watt per Hour) that may be consumed over some time intervals that cover the peak demand.

7.

Approximations for weighted Kolmogorov–Smirnov distributions via boundary crossing probabilities

Nino Kordzakhia Alexander Novikov Bernard Ycart 《Statistics and Computing》2017,27(6):1513-1523

A statistical application to Gene Set Enrichment Analysis implies calculating the distribution of the maximum of a certain Gaussian process, which is a modification of the standard Brownian bridge. Using the transformation into a boundary crossing problem for the Brownian motion and a piecewise linear boundary, it is proved that the desired distribution can be approximated by an n-dimensional Gaussian integral. Fast approximations are defined and validated by Monte Carlo simulation. The performance of the method for the genomics application is discussed.

8.

Optimal designs for treatment comparisons represented by graphs

Samuel Rosa 《AStA Advances in Statistical Analysis》2018,102(4):479-503

Consider an experiment for comparing a set of treatments: in each trial, one treatment is chosen and its effect determines the mean response of the trial. We examine the optimal approximate designs for the estimation of a system of treatment contrasts under this model. These designs can be used to provide optimal treatment proportions in more general models with nuisance effects. For any system of pairwise treatment comparisons, we propose to represent such a system by a graph. Then, we represent the designs by the inverses of the vertex weights in the corresponding graph and we show that the values of the eigenvalue-based optimality criteria can be expressed using the Laplacians of the vertex-weighted graphs. We provide a graph theoretic interpretation of D-, A- and E-optimality for estimating sets of pairwise comparisons. We apply the obtained graph representation to provide optimality results for these criteria as well as for ‘symmetric’ systems of treatment contrasts.

9.

Adaptive grid semidefinite programming for finding optimal designs

Belmiro P. M. Duarte Weng Kee Wong Holger Dette 《Statistics and Computing》2018,28(2):441-460

We find optimal designs for linear models using a novel algorithm that iteratively combines a semidefinite programming (SDP) approach with adaptive grid techniques. The proposed algorithm is also adapted to find locally optimal designs for nonlinear models. The search space is first discretized, and SDP is applied to find the optimal design based on the initial grid. The points in the next grid set are points that maximize the dispersion function of the SDP-generated optimal design using nonlinear programming. The procedure is repeated until a user-specified stopping rule is reached. The proposed algorithm is broadly applicable, and we demonstrate its flexibility using (i) models with one or more variables and (ii) differentiable design criteria, such as A- and D-optimality, and non-differentiable criteria such as E-optimality, including the mathematically more challenging case when the minimum eigenvalue of the information matrix of the optimal design has geometric multiplicity larger than 1. Our algorithm is computationally efficient because it is based on mathematical programming tools and so optimality is assured at each stage; it also exploits the convexity of the problems whenever possible. Using several linear and nonlinear models with one or more factors, we show the proposed algorithm can efficiently find optimal designs.

10.

Random projections for Bayesian regression

Leo N. Geppert Katja Ickstadt Alexander Munteanu Jens Quedenfeld Christian Sohler 《Statistics and Computing》2017,27(1):79-101

This article deals with random projections applied as a data reduction technique for Bayesian regression analysis. We show sufficient conditions under which the entire d-dimensional distribution is approximately preserved under random projections by reducing the number of data points from n to \(k\in O({\text {poly}}(d/\varepsilon ))\) in the case \(n\gg d\). Under mild assumptions, we prove that evaluating a Gaussian likelihood function based on the projected data instead of the original data yields a \((1+O(\varepsilon ))\)-approximation in terms of the \(\ell _2\) Wasserstein distance. Our main result shows that the posterior distribution of Bayesian linear regression is approximated up to a small error depending on only an \(\varepsilon \)-fraction of its defining parameters. This holds when using arbitrary Gaussian priors or the degenerate case of uniform distributions over \(\mathbb {R}^d\) for \(\beta \). Our empirical evaluations involve different simulated settings of Bayesian linear regression. Our experiments underline that the proposed method is able to recover the regression model up to small error while considerably reducing the total running time.
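
The reduction from n to k data points can be sketched for a two-parameter Bayesian linear regression. This is a rough illustration under assumptions not taken from the abstract: a dense k × n matrix with i.i.d. N(0, 1/k) entries is used as the projection (the paper's embeddings may differ), the noise standard deviation 0.5 and prior standard deviation 10 are arbitrary, and only the posterior mean under a N(0, τ²I) prior is compared. All names are hypothetical.

```python
import random


def solve_2x2(a, b):
    """Solve the 2 x 2 linear system a @ x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]


def posterior_mean(xs, ys, ridge):
    """Posterior mean of beta under a N(0, tau^2 I) prior; ridge = sigma^2 / tau^2."""
    gram = [[ridge if i == j else 0.0 for j in range(2)] for i in range(2)]
    rhs = [0.0, 0.0]
    for x, y in zip(xs, ys):
        for i in range(2):
            rhs[i] += x[i] * y
            for j in range(2):
                gram[i][j] += x[i] * x[j]
    return solve_2x2(gram, rhs)


def gaussian_sketch(xs, ys, k, rng):
    """Apply a k x n matrix with i.i.d. N(0, 1/k) entries to the data [X, y]."""
    scale = 1.0 / k ** 0.5
    sx, sy = [], []
    for _ in range(k):
        row = [rng.gauss(0.0, scale) for _ in xs]  # one row of the projection
        sx.append([sum(s * x[j] for s, x in zip(row, xs)) for j in range(2)])
        sy.append(sum(s * y for s, y in zip(row, ys)))
    return sx, sy


rng = random.Random(2)
beta_true = [1.0, -2.0]
xs = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(2000)]
ys = [x[0] * beta_true[0] + x[1] * beta_true[1] + rng.gauss(0.0, 0.5) for x in xs]
ridge = 0.25 / 100.0  # sigma^2 / tau^2 with sigma = 0.5 and tau = 10 (assumed values)

sx, sy = gaussian_sketch(xs, ys, 200, rng)  # n = 2000 rows reduced to k = 200
full = posterior_mean(xs, ys, ridge)
sketched = posterior_mean(sx, sy, ridge)
print("posterior mean, full data:    ", full)
print("posterior mean, sketched data:", sketched)
```

The same projection must be applied to X and y together, since the approximation relies on preserving the geometry of the joint column space.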

11.

Objective Bayesian analysis for the multivariate skew-t model

Antonio Parisi B. Liseo 《Statistical Methods and Applications》2018,27(2):277-295

We propose a novel Bayesian analysis of the p-variate skew-t model, providing a new parameterization, a set of non-informative priors and a sampler specifically designed to explore the posterior density of the model parameters. Extensions, such as the multivariate regression model with skewed errors and the stochastic frontiers model, are easily accommodated. A novelty introduced in the paper is given by the extension of the bivariate skew-normal model given in Liseo and Parisi (2013) to a more realistic p-variate skew-t model. We also introduce the R package mvst, which produces a posterior sample for the parameters of a multivariate skew-t model.

12.

Minimax versions of the two-stage t test

Wolf Krumbholz Andreas Rohr Eno Vangjeli 《Statistical Papers》2012,53(2):311-321

Let X be a N(μ, σ²) distributed characteristic with unknown σ. We present the minimax version of the two-stage t test having minimal maximal average sample size among all two-stage t tests obeying the classical two-point-condition on the operation characteristic. We give several examples. Furthermore, the minimax version of the two-stage t test is compared with the corresponding two-stage Gauß test.

13.

Automated selection of r for the r largest order statistics approach with adjustment for sequential testing

Brian Bader Jun Yan Xuebin Zhang 《Statistics and Computing》2017,27(6):1435-1451

The r largest order statistics approach is widely used in extreme value analysis because it may use more information from the data than just the block maxima. In practice, the choice of r is critical. If r is too large, bias can occur; if too small, the variance of the estimator can be high. The limiting distribution of the r largest order statistics, denoted by GEV\(_r\), extends that of the block maxima. Two specification tests are proposed to select r sequentially. The first is a score test for the GEV\(_r\) distribution. Due to the special characteristics of the GEV\(_r\) distribution, the classical chi-square asymptotics cannot be used. The simplest approach is to use the parametric bootstrap, which is straightforward to implement but computationally expensive. An alternative fast weighted bootstrap or multiplier procedure is developed for computational efficiency. The second test uses the difference in estimated entropy between the GEV\(_r\) and GEV\(_{r-1}\) models, applied to the r largest order statistics and the \(r-1\) largest order statistics, respectively. The asymptotic distribution of the difference statistic is derived. In a large scale simulation study, both tests held their size and had substantial power to detect various misspecification schemes. A new approach to address the issue of multiple, sequential hypotheses testing is adapted to this setting to control the false discovery rate or familywise error rate. The utility of the procedures is demonstrated with extreme sea level and precipitation data.

14.

On the optimal designs for the prediction of complex Ornstein-Uhlenbeck processes

Kinga Sikolya Sándor Baran 《Communications in Statistics - Theory and Methods》2020,49(20):4859-4870

15.

Discussion of “The power of monitoring: how to make the most of a contaminated multivariate sample” by Andrea Cerioli, Marco Riani, Anthony C. Atkinson and Aldo Corbellini

Valentin Todorov 《Statistical Methods and Applications》2018,27(4):595-602

This paper discusses the contribution of Cerioli et al. (Stat Methods Appl, 2018), where robust monitoring based on high breakdown point estimators is proposed for multivariate data. The results follow years of development in robust diagnostic techniques. We discuss the issues of extending data monitoring to other models with complex structure, e.g. factor analysis, mixed linear models for which S and MM-estimators exist or deviating data cells. We emphasise the importance of robust testing that is often overlooked despite robust tests being readily available once S and MM-estimators have been defined. We mention open questions like out-of-sample inference or big data issues that would benefit from monitoring.

16.

Nonlinear surface regression with dimension reduction method

Takuma Yoshida 《AStA Advances in Statistical Analysis》2017,101(1):29-50

This paper considers nonlinear regression analysis with a scalar response and multiple predictors. An unknown regression function is approximated by radial basis function models. The coefficients are estimated in the context of M-estimation. It is known that ordinary M-estimation leads to overfitting in nonlinear regression. The purpose of this paper is to construct a smooth estimator. The proposed method is a two-step procedure. First, sufficient dimension reduction methods are applied to the response and radial basis functions to transform the large number of radial bases into a small number of linear combinations of the radial bases without loss of information. In the second step, a multiple linear regression model between the response and the transformed radial bases is assumed and ordinary M-estimation is applied. Thus, the final estimator is also obtained as a linear combination of radial bases. The validity and an asymptotic study of the proposed method are explored. A simulation and a data example are presented to confirm the behavior of the proposed method.

17.

Conditional density estimation using the local Gaussian correlation

Håkon Otneim Dag Tjøstheim 《Statistics and Computing》2018,28(2):303-321

Let \(\mathbf {X} = (X_1,\ldots ,X_p)\) be a stochastic vector having joint density function \(f_{\mathbf {X}}(\mathbf {x})\) with partitions \(\mathbf {X}_1 = (X_1,\ldots ,X_k)\) and \(\mathbf {X}_2 = (X_{k+1},\ldots ,X_p)\). A new method for estimating the conditional density function of \(\mathbf {X}_1\) given \(\mathbf {X}_2\) is presented. It is based on locally Gaussian approximations, but simplified in order to tackle the curse of dimensionality in multivariate applications, where both response and explanatory variables can be vectors. We compare our method to some available competitors, and the error of approximation is shown to be small in a series of examples using real and simulated data, and the estimator is shown to be particularly robust against noise caused by independent variables. We also present examples of practical applications of our conditional density estimator in the analysis of time series. Typical values for k in our examples are 1 and 2, and we include simulation experiments with values of p up to 6. Large sample theory is established under a strong mixing condition.

18.

I-robust and D-robust designs on a finite design space

Douglas P. Wiens 《Statistics and Computing》2018,28(2):241-258

We present and discuss the theory of minimax I- and D-robust designs on a finite design space, and detail three methods for their construction that are new in this context: (i) a numerical search for the optimal parameters in a provably minimax robust parametric class of designs, (ii) a first-order iterative algorithm similar to that of Wynn (Ann Math Stat 5:1655–1664, 1970), and (iii) response-adaptive designs. These designs minimize a loss function, based on the mean squared error of the predicted responses or the parameter estimates, when the regression response is possibly misspecified. The loss function being minimized has first been maximized over a neighbourhood of the approximate and possibly inadequate response being fitted by the experimenter. The methods presented are all vastly more economical, in terms of the computing time required, than previously available algorithms.

19.

State-dependent swap strategies and automatic reduction of number of temperatures in adaptive parallel tempering algorithm

Mateusz Krzysztof Łącki Błażej Miasojedow 《Statistics and Computing》2016,26(5):951-964

In this paper we present extensions to the original adaptive Parallel Tempering algorithm. Two different approaches are presented. In the first one we introduce state-dependent strategies using current information to perform a swap step. It encompasses a wide family of potential moves including the standard one and Equi-Energy type move, without any loss in tractability. In the second one, we introduce online trimming of the number of temperatures. Numerical experiments demonstrate the effectiveness of the proposed method.

20.

Estimating a sparse reduction for general regression in high dimensions

Tao Wang Mengjie Chen Hongyu Zhao Lixing Zhu 《Statistics and Computing》2018,28(1):33-46

Although the concept of sufficient dimension reduction has been around for a long time, studies in the literature have largely focused on properties of estimators of dimension-reduction subspaces in the classical “small p, large n” setting. Rather than the subspace, this paper considers directly the set of reduced predictors, which we believe are more relevant for subsequent analyses. A principled method is proposed for estimating a sparse reduction, which is based on a new, revised representation of an existing well-known method called the sliced inverse regression. A fast and efficient algorithm is developed for computing the estimator. The asymptotic behavior of the new method is studied when the number of predictors, p, exceeds the sample size, n, providing a guide for choosing the number of sufficient dimension-reduction predictors. Numerical results, including a simulation study and a cancer-drug-sensitivity data analysis, are presented to examine the performance.