Similar documents (20 results)
1.
Summary. We consider the problem of estimating the noise variance in homoscedastic nonparametric regression models. For low-dimensional covariates t ∈ ℝ^d, d = 1, 2, difference-based estimators have been investigated in a series of papers. For a given length of such an estimator, difference schemes which minimize the asymptotic mean-squared error can be computed for d = 1 and d = 2. However, numerical studies show that for finite sample sizes the performance of these estimators may be deficient owing to a large finite-sample bias. We provide theoretical support for these findings. In particular, we show that with increasing dimension d this becomes more drastic: if d ≥ 4, these estimators even fail to be consistent. A different class of estimators is discussed which allows better control of the bias and remains consistent when d ≥ 4. These estimators are compared numerically with kernel-type estimators (which are asymptotically efficient), and some guidance is given about when their use becomes necessary.
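For readers new to these objects, here is a minimal sketch of the simplest difference-based estimator (the first-order, d = 1 case on an equally spaced design); the trend function and noise level are illustrative assumptions, not taken from the paper.

    import numpy as np

    def first_difference_variance(y):
        """Rice-type estimator: E[(y_{i+1} - y_i)^2] ~ 2*sigma^2 for a smooth trend."""
        d = np.diff(y)
        return np.mean(d ** 2) / 2.0

    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 200)
    y = np.sin(4 * np.pi * x) + rng.normal(0.0, 0.3, size=x.size)  # sigma = 0.3
    print(first_difference_variance(y), "vs true sigma^2 =", 0.3 ** 2)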

2.
If events are scattered in ℝ^n in accordance with a homogeneous Poisson process and if X is the location of the event with minimal l_p norm, then in the case p = n the nth absolute powers of the coordinates of X form a sample of size n from a gamma distribution with shape parameter 1/n. In an age of parallel computing, this fact may lead to some attractive simulation methods. One possibility is to generate R = ‖X‖ and U = X/‖X‖ independently, perhaps by setting U = Y/‖Y‖ where Y has any p.d.f. which is a function only of ‖Y‖. We consider for example Y having the uniform distribution in an l_p ball.
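As a plausibility check of the gamma characterization, the following sketch (unit Poisson intensity and a box truncation are assumptions for illustration; the abstract fixes neither) scatters events for n = p = 2, keeps the event of minimal Euclidean norm, and fits a gamma shape to the squared coordinates, which should land near 1/n = 0.5.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n = 2                      # dimension, with p = n (Euclidean norm)
    half, lam = 4.0, 1.0       # assumed box half-width and unit intensity
    powered = []
    for _ in range(5000):
        m = rng.poisson(lam * (2 * half) ** n)
        if m == 0:
            continue
        pts = rng.uniform(-half, half, size=(m, n))
        x = pts[np.argmin(np.linalg.norm(pts, axis=1))]  # minimal-norm event
        powered.extend(np.abs(x) ** n)                   # the |X_i|^n values

    shape, _, _ = stats.gamma.fit(powered, floc=0)       # scale left free
    print(f"fitted gamma shape: {shape:.3f} (claimed 1/n = {1 / n:.3f})")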

3.
Summary. Microarrays are a powerful new technology that allow for the measurement of the expression of thousands of genes simultaneously. Owing to relatively high costs, sample sizes tend to be quite small. If investigators apply a correction for multiple testing, a very small p-value will be required to declare significance. We use modifications to Chebyshev's inequality to develop a testing procedure that is nonparametric and yields p-values on the interval [0, 1]. We evaluate its properties via simulation and show that it both holds the type I error rate below nominal levels in almost all conditions and can yield p-values denoting significance even with very small sample sizes and stringent corrections for multiple testing.
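The abstract does not spell out the modifications used, but the flavour of a Chebyshev-based p-value can be seen from the unmodified inequality applied to a sample mean; the one-sample setting and the plug-in variance below are illustrative assumptions, not the authors' procedure.

    import numpy as np

    def chebyshev_p_value(x, mu0):
        """Bound P(|Xbar - mu0| >= t_obs) by Var(Xbar) / t_obs^2, capped at 1."""
        x = np.asarray(x, dtype=float)
        t_obs = abs(x.mean() - mu0)
        if t_obs == 0.0:
            return 1.0
        return min(1.0, x.var(ddof=1) / (len(x) * t_obs ** 2))

    # Even n = 5 can clear a stringent multiple-testing cutoff if the shift is large.
    rng = np.random.default_rng(1)
    print(chebyshev_p_value(rng.normal(3.0, 0.1, size=5), mu0=0.0))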

4.
Consider the distribution of Z = Σ_{i=1}^n d_i, where the d_i's are differences independently, identically and symmetrically distributed with mean zero. The problem is to determine properties of this sum-of-differences distribution (the sdd) given the distribution of the d_i's and the sample size n. The standardized moments of the sdd are developed as functions of the moments of the d_i's. A variance reduction technique for estimating the quantiles of the sdd using Monte Carlo methods is developed, based on using the randomization sample consisting of the 2^n values of Σ_{i=1}^n ±d_i rather than the single observation Σ_{i=1}^n d_i corresponding to each sample d_1, …, d_n. The randomization sample is shown to produce unbiased and consistent estimators.
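A literal small-n rendering of that randomization sample (function names and the target quantile are illustrative):

    import itertools
    import numpy as np

    def randomization_sample(d):
        """All 2^n values of sum(s_i * d_i) over sign vectors s in {-1, +1}^n."""
        d = np.asarray(d, dtype=float)
        signs = np.array(list(itertools.product([-1.0, 1.0], repeat=len(d))))
        return signs @ d

    rng = np.random.default_rng(2)
    d = rng.standard_normal(10)          # symmetric about zero, as assumed above
    z_all = randomization_sample(d)      # 2^10 = 1024 values from one sample
    print(np.quantile(z_all, 0.95))      # sdd quantile estimate vs. one lone Z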

5.
The “What If” analysis is applicable in research and heuristic situations that utilize statistical significance testing. One use is pedagogical: the “What If” analysis gives professors an interactive tool that visually represents what statistical significance testing entails and the variables that affect the commonly misinterpreted p_CALCULATED value. To develop a strong understanding of what affects the p_CALCULATED value, students tangibly manipulate data within the Excel sheet to create a visual representation that explicitly demonstrates how variables affect the p_CALCULATED value. The second use is primarily for researchers. A “What If” analysis contributes to research in two ways: (1) run a priori, it estimates the sample size a researcher may wish to use for a study; and (2) run a posteriori, it aids in the interpretation of results. If used, the “What If” analysis gives researchers another tool for conducting high-quality research and disseminating results accurately.
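The same manipulation can be scripted outside Excel. The sketch below (a one-sample t test with an assumed effect size and SD) holds the observed effect fixed and lets sample size alone drive the calculated p-value, which is precisely the dependence the exercise is meant to make tangible.

    import numpy as np
    from scipy import stats

    effect, sd = 0.5, 1.0                 # assumed illustrative values
    for n in (5, 10, 25, 50, 100):
        t = effect / (sd / np.sqrt(n))    # one-sample t statistic
        p = 2 * stats.t.sf(t, df=n - 1)   # two-sided calculated p-value
        print(f"n = {n:3d}   p = {p:.4f}")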

6.
Benoît Cadre, Statistics, 2013, 47(4): 509–521
Let E be a separable Banach space which is the dual of a Banach space F. If X is an E-valued random variable, the set of L1-medians of X is Argmin_{α∈E} E[‖X − α‖ − ‖X‖]. Assume that this set contains only one element. From any sequence of probability measures {μ_n, n ≥ 1} on E which converges in law to X, we give two approximating sequences of the L1-median, for the weak* topology induced by F.

7.
Estimation in the presence of censoring is an important problem. In the linear model, the Buckley-James method proceeds iteratively by estimating the censored values and then re-estimating the regression coefficients. A large-scale Monte Carlo simulation technique has been developed to test the performance of the Buckley-James (denoted B-J) estimator. One hundred and seventy-two randomly generated data sets, each with three thousand replications, based on four failure distributions, four censoring patterns, three sample sizes and four censoring rates have been investigated, and the results are presented. It is found that, except for Type II censoring, the B-J estimator is essentially unbiased, even when data sets with small sample sizes are subjected to a high censoring rate. The variance formula suggested by Buckley and James (1979) is shown to be sensitive to the failure distribution. If the censoring rate is kept constant along the covariate line, the sample variance of the estimator appears to be insensitive to the censoring pattern for a given failure distribution. Oscillation of the convergence values associated with the B-J estimator is illustrated and thoroughly discussed.

8.
Motivated by a study comparing the sensitivities and specificities of two diagnostic tests in a paired design with a small sample size, we first derive an Edgeworth expansion for the studentized difference between two binomial proportions of paired data. The Edgeworth expansion helps explain why the usual Wald interval for the difference has poor coverage at small sample sizes. Based on the expansion, we then derive a transformation-based confidence interval for the difference. The new interval removes the skewness in the Edgeworth expansion; it is easy to compute, and its coverage probability converges to the nominal level at a rate of O(n^{−1/2}). Numerical results indicate that the new interval has average coverage probability very close to the nominal level even for sample sizes as small as 10, and better average coverage accuracy than the best existing intervals in finite sample sizes.
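The transformation-based interval itself is not reproduced here; the sketch below merely sets up the paired-data Wald interval that the paper improves on and exhibits its small-n undercoverage by simulation (the cell probabilities are illustrative assumptions).

    import numpy as np

    def wald_paired_diff_ci(n11, n10, n01, n00, z=1.96):
        """Wald CI for p1 - p2 from a paired 2x2 table (n10, n01 discordant)."""
        n = n11 + n10 + n01 + n00
        d = (n10 - n01) / n
        se = np.sqrt((n10 + n01) / n - d ** 2) / np.sqrt(n)
        return d - z * se, d + z * se

    rng = np.random.default_rng(3)
    cell_p = [0.4, 0.2, 0.1, 0.3]            # (p11, p10, p01, p00), assumed
    true_diff = cell_p[1] - cell_p[2]        # difference in proportions, p10 - p01
    reps, cover = 5000, 0
    for _ in range(reps):
        n11, n10, n01, n00 = rng.multinomial(10, cell_p)
        lo, hi = wald_paired_diff_ci(n11, n10, n01, n00)
        cover += lo <= true_diff <= hi
    print(f"Wald coverage at n = 10: {cover / reps:.3f} (nominal 0.95)")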

9.
Optimal group definitions, maximizing retained information, are well known for grouping normal distributions with known parameters. When these group definitions are used with sample estimates substituted for unknown parameters, there is a shrinkage of retained information. If there are k groups, and if the size of the sample yielding parameter estimates is n, then this shrinkage is less than 5% for n ≤ 30, and minimization of shrinkage yields less than 1% improvement for n ≤ 5, when k ≥ 12.

10.
In 2008, this group published a paper on approaches for two-stage crossover bioequivalence (BE) studies that allowed for the reestimation of the second-stage sample size based on the variance estimated from the first-stage results. The sequential methods considered used an assumed GMR of 0.95 as part of the method for determining power and sample size. This note adds results for an assumed GMR = 0.90. Two of the methods recommended for GMR = 0.95 in the earlier paper have some unacceptable increases in Type I error rate when the GMR is changed to 0.90. If a sponsor wants to assume 0.90 for the GMR, Method D is recommended.

11.
Polynomial regression of degree p in one independent variable x is considered. Numerically large sample correlations between x^α and x^β, α < β, α, β = 1, …, p, may cause ill-conditioning in the matrix to be inverted in application of the method of least squares. These sample correlations are investigated. It is confirmed that centering of the independent variable to have zero sample mean removes nonessential ill-conditioning. If the sample values of x are placed symmetrically about their mean, the sample correlation between x^α and x^β is reduced to zero by centering when α + β is odd, but may remain large when α + β is even. Some examples and recommendations are given.
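A quick numerical illustration of both claims (the symmetric design below is an arbitrary choice):

    import numpy as np

    def corr(a, b):
        return np.corrcoef(a, b)[0, 1]

    x = np.linspace(1.0, 9.0, 9)   # placed symmetrically about the mean 5
    xc = x - x.mean()              # centered variable

    print("raw       corr(x,    x^2):", round(corr(x, x ** 2), 4))          # large
    print("centered  corr(xc,  xc^2):", round(corr(xc, xc ** 2), 4))        # zero: 1+2 odd
    print("centered  corr(xc^2, xc^4):", round(corr(xc ** 2, xc ** 4), 4))  # large: 2+4 even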

12.
The least-absolute-deviation estimate of a monotone regression function on an interval has been studied in the literature. If the observation points become dense in the interval, the almost sure rate of convergence has been shown to be O(n^{−1/4}). Applying the techniques used by Brunk (1970, Nonparametric Techniques in Statistical Inference, Cambridge Univ. Press), the asymptotic distribution of the l_1 estimator at a point is obtained. If the underlying regression function has positive slope at the point, the rate of convergence is seen to be O(n^{−1/3}). Monotone percentile regression estimates are also considered.

13.
Random samples are assumed for the univariate two-sample problem. Sometimes this assumption may be violated in that an observation in one “sample”, of size m, is from a population different from that yielding the remaining m − 1 observations (which are a random sample). Then the interest is in whether this random sample of size m − 1 is from the same population as the other random sample. If such a violation occurs and can be recognized, and also the non-conforming observation can be identified (without imposing conditional effects), then that observation could be removed and a two-sample test applied to the remaining samples. Unfortunately, satisfactory procedures for such a removal do not seem to exist. An alternative approach is to use two-sample tests whose significance levels remain the same when a non-conforming observation occurs, and is removed, as for the case where the samples were both truly random. The equal-tail median test is shown to have this property when the two “samples” are of the same size (and ties do not occur).
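“Equal-tail” is read below as doubling the smaller one-sided Fisher tail of the 2 x 2 table of counts above the pooled median; this reading, like the simulated data, is an illustrative assumption rather than the paper's exact construction.

    import numpy as np
    from scipy import stats

    def equal_tail_median_test(x, y):
        med = np.median(np.concatenate([x, y]))
        table = [[int(np.sum(x > med)), int(np.sum(y > med))],
                 [int(np.sum(x <= med)), int(np.sum(y <= med))]]
        p_lo = stats.fisher_exact(table, alternative="less")[1]
        p_hi = stats.fisher_exact(table, alternative="greater")[1]
        return min(1.0, 2.0 * min(p_lo, p_hi))   # equal-tail two-sided p-value

    rng = np.random.default_rng(4)
    print(equal_tail_median_test(rng.normal(0, 1, 20), rng.normal(1, 1, 20)))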

14.
In this paper, we consider the problem of estimating the Laplace transform of volatility within a fixed time interval [0, T] using high-frequency sampling, where we assume that the discretized observations of the latent process are contaminated by microstructure noise. We use the pre-averaging approach to deal with the effect of microstructure noise. Under the high-frequency scenario, we obtain a consistent estimator whose convergence rate is n^{−1/4}, which is known as the optimal convergence rate for estimating integrated volatility functionals in the presence of microstructure noise. The related central limit theorem is established. Simulation studies confirm the finite-sample performance of the proposed estimator.
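For concreteness, a bare-bones pre-averaging computation on simulated noisy prices follows; the weight function, window length, and the omission of the usual noise-bias correction are simplifying assumptions, so this sketches the idea rather than the paper's estimator.

    import numpy as np

    rng = np.random.default_rng(5)
    n, sigma, noise_sd = 23400, 0.2, 0.0005
    dy = sigma * rng.standard_normal(n) / np.sqrt(n)       # efficient-price increments
    y = np.cumsum(dy) + noise_sd * rng.standard_normal(n)  # observed = price + noise

    kn = int(np.sqrt(n))                                   # window length ~ n^{1/2}
    j = np.arange(1, kn)
    g = np.minimum(j / kn, 1.0 - j / kn)                   # common weight choice
    ybar = np.convolve(np.diff(y), g[::-1], mode="valid")  # pre-averaged increments
    psi2 = np.sum(g ** 2) / kn
    iv_hat = np.sum(ybar ** 2) / (kn * psi2)               # crude, no noise correction
    print(iv_hat, "vs true integrated volatility", sigma ** 2)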

15.
The Fisher distribution is frequently used as a model for the probability distribution of directional data, which may be specified either in terms of unit vectors or angular co-ordinates (co-latitude and azimuth). If, in practical situations, only the co-latitudes can be observed, the available data must be regarded as a sample from the corresponding marginal distribution. This paper discusses the estimation by Maximum Likelihood (ML) and the Method of Moments of the two parameters of this marginal Fisher distribution. The moment estimators are generally simpler to compute than the ML estimators, and have high asymptotic efficiency.

16.
The asymptotic distribution of the likelihood ratio under noncontiguous alternatives is shown to be normal for the exponential family of distributions. The rate of convergence of the parameters to the hypothetical value is specified where the asymptotic noncentral chi-square distribution no longer holds. It is only a little slower than O(n^{−1/2}). The result provides compact power approximation formulae and is shown to work reasonably well even for moderate sample sizes.
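The normal (rather than noncentral chi-square) limit under a fixed alternative is easy to see by simulation; the exponential model and the parameter values below are arbitrary illustrations.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(6)
    n, theta0, theta1 = 200, 1.0, 1.5                  # H0 rate vs fixed alternative
    lrs = []
    for _ in range(4000):
        x = rng.exponential(1.0 / theta1, size=n)      # data from the alternative
        mle = 1.0 / x.mean()
        loglik = lambda th: n * np.log(th) - th * x.sum()
        lrs.append(2.0 * (loglik(mle) - loglik(theta0)))
    z = (np.array(lrs) - np.mean(lrs)) / np.std(lrs)
    print("skewness of standardized LR:", round(stats.skew(z), 3))  # near 0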

17.
Clinical phase II trials in oncology are conducted to determine whether the activity of a new anticancer treatment is promising enough to merit further investigation. Two-stage designs are commonly used for this situation to allow for early termination. Designs proposed in the literature so far have the common drawback that the sample sizes for the two stages have to be specified in the protocol and adhered to strictly during the course of the trial. As a consequence, designs that allow a higher extent of flexibility are desirable. In this article, we propose a new adaptive method that allows an arbitrary modification of the sample size of the second stage using the results of the interim analysis or external information while controlling the type I error rate. If the sample size is not changed during the trial, the proposed design shows very similar characteristics to the optimal two-stage design of Chang et al. (Biometrics 1987; 43:865–874). However, the new design allows the use of mid-course information for the planning of the second stage, thus meeting practical requirements when performing clinical phase II trials in oncology.

18.
If the power spectral density of a continuous-time stationary stochastic process is not limited to a finite bandwidth, data sampled from that process at any uniform sampling rate leads to biased and inconsistent spectrum estimators, which are unsuitable for constructing confidence intervals. In this paper, we use the smoothed periodogram estimator to construct asymptotic confidence intervals shrinking to the true spectrum, by allowing the sampling rate to go to infinity suitably fast as the sample size goes to infinity. The proposed method requires minimal computation, as it does not involve bootstrap or other resampling. The method is illustrated through a Monte Carlo simulation study, and its performance is compared with that of the corresponding method based on uniform sampling at a fixed rate.
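The estimator at the heart of the paper is cheap to compute; a boxcar-smoothed version (the bandwidth and the white-noise test signal are illustrative choices) looks like this:

    import numpy as np

    def smoothed_periodogram(x, dt, half_window):
        """Periodogram of samples taken at spacing dt, boxcar-smoothed in frequency."""
        n = len(x)
        freqs = np.fft.rfftfreq(n, d=dt)
        per = dt * np.abs(np.fft.rfft(x - np.mean(x))) ** 2 / n
        kernel = np.ones(2 * half_window + 1) / (2 * half_window + 1)
        return freqs, np.convolve(per, kernel, mode="same")

    rng = np.random.default_rng(7)
    x = rng.standard_normal(2048)                  # white noise sampled at spacing dt
    freqs, est = smoothed_periodogram(x, dt=0.01, half_window=8)
    print(freqs[:3], est[:3])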

19.
The capture-recapture method is applied to estimate the size of a target population from ascertainment data in epidemiological applications. We generalize the three-list case of Chao & Tsay (1998) to situations where more than three lists are available. An estimation procedure is presented using the concept of sample coverage, which can be interpreted as a measure of overlap information among multiple list records. When there is enough overlap, an estimator of the total population size is proposed. The bootstrap method is used to construct a variance estimator and confidence interval. If the overlap rate is relatively low, then the population size cannot be precisely estimated and only a lower (upper) bound is proposed for positively (negatively) dependent lists. The proposed method is applied to two data sets, one with a high and one with a low overlap rate.

20.
Cook's distance is widely used for outlier diagnosis in regression models, but because the sample variance appearing in the formula is itself sensitive to outliers, the statistic lacks robustness and its diagnostic performance suffers. To address this, this paper takes the median absolute deviation as a robust estimator of the sample standard deviation, obtains from it a robust estimator of the sample variance, and thereby constructs a robust Cook's distance. Drawing on the classical theory of Cook's-distance outlier diagnosis for regression models, the robust Cook's distance is then applied to outlier diagnosis in time series, extending the domain in which the traditional formula can be used. Simulations with sample sizes of 50, 100 and 200 and contamination rates of 0, 1%, 5% and 10%, on ARMA(1,1) series and on real financial time series, show that: (1) with no contamination, the robust and the conventional Cook's distance both attain a 100% correct diagnosis rate, with no false alarms; (2) as the sample size and contamination rate grow, the correct diagnosis rate of the conventional Cook's distance drops sharply, with essentially no diagnostic power once the contamination rate reaches 5% or more, whereas the robust Cook's distance retains high diagnostic power. The robust method applies not only to time-series outlier diagnosis but also to outlier diagnosis in regression analysis.
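A sketch of the core idea in the regression setting (the MAD scaling constant 1.4826 and the demo data are standard or illustrative choices; the paper's exact formula may differ in detail):

    import numpy as np

    def cook_distances(X, y, robust=False):
        """Cook's distance for an OLS fit of y on X (X includes the intercept)."""
        n, p = X.shape
        H = X @ np.linalg.inv(X.T @ X) @ X.T             # hat matrix
        e = y - H @ y                                    # residuals
        h = np.diag(H)
        if robust:
            scale2 = (1.4826 * np.median(np.abs(e - np.median(e)))) ** 2  # MAD-based
        else:
            scale2 = e @ e / (n - p)                     # classical s^2
        return (e ** 2 / (p * scale2)) * h / (1.0 - h) ** 2

    rng = np.random.default_rng(8)
    m = 50
    X = np.column_stack([np.ones(m), rng.normal(size=m)])
    y = 2.0 + 3.0 * X[:, 1] + rng.normal(size=m)
    y[0] += 15.0                                         # one gross outlier
    print("flagged index:", np.argmax(cook_distances(X, y, robust=True)))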
