Similar literature
Found 20 similar documents; search took 562 ms.
1.
In this paper, we seek to establish asymptotic results for selective inference procedures removing the assumption of Gaussianity. The class of selection procedures we consider are determined by affine inequalities, which we refer to as affine selection procedures. Examples of affine selection procedures include selective inference along the solution path of the least absolute shrinkage and selection operator (LASSO), as well as selective inference after fitting the LASSO at a fixed value of the regularization parameter. We also consider some tests in penalized generalized linear models. Our result proves asymptotic convergence in the high-dimensional setting where n < p, and n can be of a logarithmic factor of the dimension p for some procedures.

2.
3.
The traditional non-parametric bootstrap (referred to as the n-out-of-n bootstrap) is a widely applicable and powerful tool for statistical inference, but in important situations it can fail. It is well known that by using a bootstrap sample of size m, different from n, the resulting m-out-of-n bootstrap provides a method for rectifying the traditional bootstrap inconsistency. Moreover, recent studies have shown that interesting cases exist where it is better to use the m-out-of-n bootstrap in spite of the fact that the n-out-of-n bootstrap works. In this paper, we discuss another case by considering its application to hypothesis testing. Two new data-based choices of m are proposed in this set-up. The results of simulation studies are presented to provide empirical comparisons between the performance of the traditional bootstrap and the m-out-of-n bootstrap, based on the two data-dependent choices of m, as well as on an existing method in the literature for choosing m. These results show that the m-out-of-n bootstrap, based on our choice of m, generally outperforms the traditional bootstrap procedure as well as the procedure based on the choice of m proposed in the literature.
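
The resampling step behind the m-out-of-n bootstrap described in this abstract can be sketched as follows. This is only a minimal illustration, not the paper's procedure: the function name `m_out_of_n_bootstrap` and its parameters are hypothetical, and m is supplied by the caller rather than chosen by the data-based rules the abstract proposes.

```python
import random


def m_out_of_n_bootstrap(data, statistic, m, n_boot=1000, seed=0):
    """Return n_boot bootstrap replicates of `statistic`, each computed
    from m observations drawn with replacement from `data`.

    Hypothetical helper for illustration; m is chosen by the caller.
    Setting m = len(data) gives the traditional n-out-of-n bootstrap.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    return [statistic(rng.choices(data, k=m)) for _ in range(n_boot)]


if __name__ == "__main__":
    sample_rng = random.Random(42)
    data = [sample_rng.random() for _ in range(100)]
    # The sample maximum is a textbook case where the n-out-of-n
    # bootstrap is inconsistent; here m = 10 is an arbitrary choice.
    reps = m_out_of_n_bootstrap(data, max, m=10, n_boot=500)
    print(len(reps), min(reps), max(reps))
```

Taking m equal to the sample size recovers the traditional bootstrap, so the same helper could be used to compare the two schemes side by side.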

4.
We consider the situation where there is a known regression model that can be used to predict an outcome, Y, from a set of predictor variables X. A new variable B is expected to enhance the prediction of Y. A dataset of size n containing Y, X and B is available, and the challenge is to build an improved model for Y | X, B that uses both the available individual-level data and some summary information obtained from the known model for Y | X. We propose a synthetic data approach, which consists of creating m additional synthetic data observations, and then analyzing the combined dataset of size n + m to estimate the parameters of the Y | X, B model. This combined dataset of size n + m now has missing values of B for m of the observations, and is analyzed using methods that can handle missing data (e.g., multiple imputation). We present simulation studies and illustrate the method using data from the Prostate Cancer Prevention Trial. Though the synthetic data method is applicable to a general regression context, to provide some justification, we show in two special cases that the asymptotic variances of the parameter estimates in the Y | X, B model are identical to those from an alternative constrained maximum likelihood estimation approach. This correspondence in special cases and the method's broad applicability make it appealing for use across diverse scenarios. The Canadian Journal of Statistics 47: 580–603; 2019 © 2019 Statistical Society of Canada

5.
The generalized skew-normal distribution introduced by Balakrishnan (2002; discussion of 'Skew multivariate models related to hidden truncation and/or selective reporting' by B. C. Arnold and R. J. Beaver, Test 11: 37–39) is used to obtain new generalizations of the univariate Cauchy distribution with two parameters, denoted by $GC_{m,n}(a, b)$, with m and n non-negative integers and a, b ∈ R. For the cases (m, n) = (1, 2), (m, n) = (2, 1), (m, n) = (0, 3) and (m, n) = (3, 0), explicit forms of the density functions are derived and compared to previous generalizations of Cauchy and skew-Cauchy distributions.

6.
Orthogonal arrays with mixed levels have become widely used in fractional factorial designs. It is highly desirable to know when such designs with resolution III or IV have clear two-factor interaction components (2fic's). In this paper, we give a complete classification of the existence of clear 2fic's in regular $2^m 4^n$ designs with resolution III or IV. The necessary and sufficient conditions for a $2^m 4^n$ design to have clear 2fic's are given. Also, $2^m 4^n$ designs of 32 runs with the most clear 2fic's are given for n = 1, 2.

7.
The authors propose nonparametric tests for the hypothesis of no direct treatment effects, as well as for the hypothesis of no carryover effects, for balanced crossover designs in which the number of treatments equals the number of periods p, where p ≥ 3. They suppose that the design consists of n replications of balanced crossover designs, each formed by m Latin squares of order p. Their tests are permutation tests which are based on the n vectors of least squares estimators of the parameters of interest obtained from the n replications of the experiment. They obtain both the exact and limiting distribution of the test statistics, and they show that the tests have, asymptotically, the same power as the F-ratio test.

8.
The general minimum lower-order confounding (GMC) criterion is used to choose optimal designs and is based on the aliased effect-number pattern (AENP). The AENP and the GMC criterion have been developed to form GMC theory. Zhang et al. (2015; Zhang, T.F., Yang, J.F., Li, Z.M., Zhang, R.C., Construction of regular $2^n 4^1$ designs with general minimum lower-order confounding, Commun. Stat. - Theory Methods 46:2724–2735) introduced the GMC $2^n 4^m$ criterion for choosing optimal designs and constructed all GMC $2^n 4^1$ designs with N/4 + 1 ≤ n + 2 ≤ 5N/16. In this article, we analyze the properties of $2^n 4^1$ designs and construct GMC $2^n 4^1$ designs with 5N/16 + 1 ≤ n + 2 < N − 1, where n and N are, respectively, the numbers of two-level factors and runs. Further, GMC $2^n 4^1$ designs with 16 and 32 runs are tabulated.

9.
10.
We consider the problem of efficiently estimating multivariate densities and their modes for moderate dimensions and an abundance of data. We propose polynomial histograms to solve this estimation problem. We present first- and second-order polynomial histogram estimators for a general d-dimensional setting. Our theoretical results include pointwise bias and variance of these estimators, their asymptotic mean integrated square error (AMISE), and optimal binwidth. The asymptotic performance of the first-order estimator matches that of the kernel density estimator, while the second-order estimator has the faster rate of $O(n^{-6/(d+6)})$. For a bivariate normal setting, we present explicit expressions for the AMISE constants which show the much larger binwidths of the second-order estimator and hence also more efficient computations of multivariate densities. We apply polynomial histogram estimators to real data from biotechnology and find the number and location of modes in such data.

11.
This article describes a cutpoint sampling method for efficiently sampling from an n-point discrete distribution that preserves the monotone relationship between a uniform deviate and the random variate it generates. This property is useful for developing a sampling plan to reduce variance in a Monte Carlo or simulation study. The expected number of comparisons with this method is derived and shown to be bounded above by (m + n − 1)/n, where m denotes the number of cutpoints. The alias sampling method, which is regarded as the most efficient table sampling technique, generally lacks the monotone property and requires 2n storage locations, whereas the proposed cutpoint sampling method requires m + n storage locations. The article describes two modifications for cases in which n is large and possibly infinite. It is shown that circumstances arise in which the cutpoint method requires fewer comparisons on average than the alias method does for exactly the same space requirement. The article also describes an algorithm to implement the proposed method.
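
The idea of cutpoint sampling summarized in this abstract can be sketched as below. This is a minimal illustration under stated assumptions, not the article's algorithm: the function names `build_cutpoints` and `sample_from_uniform` are hypothetical, outcomes are indexed from 0, and the table stores n cumulative probabilities plus m cutpoint indices (m + n storage locations, as the abstract states).

```python
import itertools
import random


def build_cutpoints(probs, m):
    """Return (cdf, cut) for an n-point distribution and m cutpoints.

    cut[j] is the smallest index i with cdf[i] > j/m, so a search for a
    deviate u in [j/m, (j+1)/m) can start at cut[j] instead of index 0.
    """
    cdf = list(itertools.accumulate(probs))
    cdf[-1] = 1.0  # guard against floating-point round-off in the total
    cut = []
    i = 0
    for j in range(m):
        while cdf[i] <= j / m:
            i += 1
        cut.append(i)
    return cdf, cut


def sample_from_uniform(u, cdf, cut):
    """Map a uniform deviate u in [0, 1) to the smallest index i with
    cdf[i] > u. The map is nondecreasing in u (the monotone property)."""
    m = len(cut)
    j = min(int(m * u), m - 1)  # which of the m equal-width cells u is in
    i = cut[j]
    while u >= cdf[i]:  # short forward scan from the cutpoint
        i += 1
    return i


if __name__ == "__main__":
    cdf, cut = build_cutpoints([0.1, 0.2, 0.3, 0.4], m=4)
    rng = random.Random(0)
    draws = [sample_from_uniform(rng.random(), cdf, cut) for _ in range(10)]
    print(draws)
```

Because the outcome is a nondecreasing function of the uniform deviate, the same helper could be driven by antithetic or common random numbers, which is the variance-reduction use the abstract mentions.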

12.
Let $X_{2:n}$ and $Y_{2:m}$ be the second order statistics from n independent exponential variables with hazards $\lambda_1, \ldots, \lambda_n$, and an independent exponential sample of size m with hazard change to $\lambda$, respectively. When m ? n, we obtain necessary and sufficient conditions for comparing $X_{2:n}$ and $Y_{2:m}$ in mean residual life, dispersive, hazard rate, and likelihood ratio orderings based on some inequalities between the $\lambda_i$'s and $\lambda$. The established results show how one can compare an (n − 1)-out-of-n system consisting of heterogeneous components with exponential lifetimes with any (m − 1)-out-of-m system consisting of homogeneous components with exponential lifetimes.

13.
By means of a search design one is able to search for and estimate a small set of non-zero elements from the set of higher order factorial interactions in addition to estimating the lower order factorial effects. One may be interested in estimating the general mean and main effects, in addition to searching for and estimating a non-negligible effect in the set of 2- and 3-factor interactions, assuming 4- and higher-order interactions are all zero. Such a search design is called a 'main effect plus one plan' and is denoted by MEP.1. Construction of such a plan, for $2^m$ factorial experiments, has been considered and developed by several authors and leads to MEP.1 plans for an odd number m of factors. These designs are generally determined by two arrays, one specifying a main effect plan and the other specifying a follow-up. In this paper we develop the construction of search designs for an even number of factors m, m ≠ 6. The new series of MEP.1 plans is a set of single array designs with a well structured form. Such a structure allows for flexibility in arriving at an appropriate design with optimum properties for search and estimation.

14.
Consider a parallel system with n independent components. Assume that the lifetime of the jth component follows an exponential distribution with a constant but unknown parameter $\lambda_j$, 1 ≤ j ≤ n. We test $r_j$ components of type j for failure and compute the total time $T_j$ of $r_j$ failures for the jth component. Based on $T = (T_1, T_2, \ldots, T_n)$ and $r = (r_1, r_2, \ldots, r_n)$, we derive optimal reliability test plans which ensure the usual probability requirements on system reliability. Further, we solve the associated nonlinear integer programming problem by a simple enumeration of integers over the feasible range. An algorithm is developed to obtain integer solutions with minimum cost. Finally, some examples are discussed for various levels of producer's and consumer's risk to illustrate the approach. Our optimal plans lead to considerable savings in costs over the available plans in the literature.

15.
Multivariate control charts are used to monitor stochastic processes for changes and unusual observations. Hotelling's $T^2$ statistic is calculated for each new observation and an out-of-control signal is issued if it goes beyond the control limits. However, this classical approach becomes unreliable as the number of variables p approaches the number of observations n, and impossible when p exceeds n. In this paper, we devise an improvement to the monitoring procedure in high-dimensional settings. We regularise the covariance matrix to estimate the baseline parameter and incorporate a leave-one-out re-sampling approach to estimate the empirical distribution of future observations. An extensive simulation study demonstrates that the new method outperforms the classical Hotelling $T^2$ approach in power, and maintains appropriate false positive rates. We demonstrate the utility of the method using a set of quality control samples collected to monitor a gas chromatography–mass spectrometry apparatus over a period of 67 days.

16.
We consider the problem of constructing search designs for $3^m$ factorial designs. By using projection properties of some three-level orthogonal arrays, some search designs are obtained for 3 ≤ m ≤ 11. The newly obtained orthogonal search designs are capable of searching for and identifying up to four two-factor interactions and estimating them along with the general mean and main effects. The resulting designs have very high searching probabilities; that is, besides the well-known orthogonal structure, they have a high ability to find the true effects.

17.
In this paper, we consider the asymptotic distributions of functionals of the sample covariance matrix and the sample mean vector obtained under the assumption that the matrix of observations has a matrix-variate location mixture of normal distributions. The central limit theorem is derived for the product of the sample covariance matrix and the sample mean vector. Moreover, we consider the product of the inverse sample covariance matrix and the mean vector for which the central limit theorem is established as well. All results are obtained under the large-dimensional asymptotic regime, where the dimension p and the sample size n approach infinity such that $p/n \to c \in [0, +\infty)$ when the sample covariance matrix does not need to be invertible and $p/n \to c \in [0, 1)$ otherwise.

18.
With the rapid growth of modern technology, many biomedical studies are being conducted to collect massive datasets with volumes of multi-modality imaging, genetic, neurocognitive and clinical information from increasingly large cohorts. Simultaneously extracting and integrating rich and diverse heterogeneous information in neuroimaging and/or genomics from these big datasets could transform our understanding of how genetic variants impact brain structure and function, cognitive function and brain-related disease risk across the lifespan. Such understanding is critical for diagnosis, prevention and treatment of numerous complex brain-related disorders (e.g., schizophrenia and Alzheimer's disease). However, the development of analytical methods for the joint analysis of both high-dimensional imaging phenotypes and high-dimensional genetic data, a big data squared (BD²) problem, presents major computational and theoretical challenges for existing analytical methods. Besides the high-dimensional nature of BD², various neuroimaging measures often exhibit strong spatial smoothness and dependence and genetic markers may have a natural dependence structure arising from linkage disequilibrium. We review some recent developments of various statistical techniques for imaging genetics, including massive univariate and voxel-wise approaches, reduced rank regression, mixture models and group sparse multi-task regression. By doing so, we hope that this review may encourage others in the statistical community to enter into this new and exciting field of research. The Canadian Journal of Statistics 47: 108–131; 2019 © 2019 Statistical Society of Canada

19.
Consider the randomly weighted sums $S_m(\theta) = \sum_{i=1}^{m} \theta_i X_i$, $1 \le m \le n$, and their maxima $M_n(\theta) = \max_{1 \le m \le n} S_m(\theta)$, where $X_i$, $1 \le i \le n$, are real-valued and dependent according to a wide type of dependence structure, and $\theta_i$, $1 \le i \le n$, are nonnegative and arbitrarily dependent, but independent of $X_i$, $1 \le i \le n$. Under some mild conditions on the right tails of the weights $\theta_i$, $1 \le i \le n$, we establish some asymptotic equivalence formulas for the tail probabilities of $S_n(\theta)$ and $M_n(\theta)$ in the cases where $X_i$, $1 \le i \le n$, have dominatedly varying, long-tailed and subexponential distributions, respectively.

20.
This paper is concerned with the estimation of a shift parameter $\delta_0$, based on some nonnegative functional Hg1 of the pair (DδN(x), f?δN(x)), where DδN(x) = KN/b {$F_{2,n}(x) - F_{1,m}(x + \delta)$}, +δN(x) = {$m F_{1,m}(x + \delta) + n F_{2,n}(x)$}/N, where $F_{1,m}$ and $F_{2,n}$ are the empirical distribution functions of two independent random samples (N = m + n), and where $K_N^2 = mn/N$. First, an estimator $\delta_N$ is defined as a value of $\delta$ minimizing a functional H of the type of $H_1$. A second estimator $\delta_{1N}$ is also defined, which is a linearized version of the first. Finite and asymptotic properties of these estimators are considered. It is also shown that most well-known test statistics of the Kolmogorov-Smirnov type are particular cases of such functionals $H_1$. The asymptotic distribution and the asymptotic efficiency of some estimators are given.
