Similar articles (20 found)
1.
This article introduces a parametric robust way of comparing two population means and two population variances. With large samples, the comparison of two means under model misspecification is less of a problem, because the validity of inference is protected by the central limit theorem. However, normality is generally required for inference on the ratio of two variances to be carried out with the familiar F statistic. A parametric robust approach that is insensitive to the distributional assumption is proposed here. More specifically, it is demonstrated that the normal likelihood function can be adjusted to give asymptotically valid inferences for all underlying distributions with finite fourth moments. The normal likelihood function, on the other hand, is itself robust for the comparison of two means, so no adjustment is needed.

2.
We propose a new summary statistic for inhomogeneous, intensity‐reweighted moment stationary spatio‐temporal point processes. The statistic is defined in terms of the n‐point correlation functions of the point process, and it generalizes the J‐function when stationarity is assumed. We show that our statistic can be represented in terms of the generating functional and that it is related to the spatio‐temporal K‐function. We further discuss its explicit form under some specific model assumptions and derive ratio‐unbiased estimators. We finally illustrate the use of our statistic in practice. © 2014 Board of the Foundation of the Scandinavian Journal of Statistics

3.
We studied the asymptotic distribution and finite sample properties of a randomly weighted permutation statistic. The asymptotic normality and the finite sample simulations derived from our studies provide theoretical and numerical justification for the distributional assumptions behind many useful test statistics used to identify spatial autocorrelation in mapped data. We compared a new method for computing the mean and the approximate variance of the randomly weighted D statistic, a special permutation statistic, with Walter's conditional method. In the numerical illustration of the method, we calculated standardized values of the D statistic, by subtracting the mean from the D statistic and dividing the difference by the standard deviation, for the standardized mortality ratios (SMRs) and the life expectancies among the 48 states of the continental USA. Spatial autocorrelations of the SMRs and the life expectancies were found to be statistically significant.
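
The standardization described in this abstract amounts to a z-score. As a sketch only, writing E(D) and Var(D) for the mean and approximate variance of the D statistic under the randomly weighted permutation distribution (this notation is assumed here and does not come from the source), the standardized value would be:

```latex
z_D = \frac{D - \mathrm{E}(D)}{\sqrt{\operatorname{Var}(D)}}
```

Given the asymptotic normality stated in the abstract, z_D would then be compared with standard normal quantiles.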

4.
Multivariate control charts are used to monitor stochastic processes for changes and unusual observations. Hotelling's T² statistic is calculated for each new observation, and an out‐of‐control signal is issued if it goes beyond the control limits. However, this classical approach becomes unreliable as the number of variables p approaches the number of observations n, and impossible when p exceeds n. In this paper, we devise an improvement to the monitoring procedure in high‐dimensional settings. We regularise the covariance matrix to estimate the baseline parameter and incorporate a leave‐one‐out re‐sampling approach to estimate the empirical distribution of future observations. An extensive simulation study demonstrates that the new method outperforms the classical Hotelling T² approach in power and maintains appropriate false positive rates. We demonstrate the utility of the method using a set of quality control samples collected to monitor a gas chromatography–mass spectrometry apparatus over a period of 67 days.
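
To make the role of regularisation concrete, here is a minimal sketch of a Hotelling-type T² score computed against a baseline sample. The abstract does not say which regulariser the authors use; the ridge term added to the covariance diagonal below is an assumption chosen for illustration, and the names `hotelling_t2` and `ridge` are hypothetical. The leave-one-out calibration of control limits is not shown.

```python
def mean_vector(rows):
    """Column means of a list of equal-length observation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance(rows, mean):
    """Sample covariance matrix with the usual n - 1 divisor."""
    n, p = len(rows), len(mean)
    cov = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [v - m for v, m in zip(row, mean)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += d[i] * d[j] / (n - 1)
    return cov


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: try a positive ridge")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def hotelling_t2(x, baseline, ridge=0.0):
    """T^2 = (x - mean)' (S + ridge * I)^{-1} (x - mean).

    ridge is an assumed, illustrative regulariser, not the paper's method.
    """
    mean = mean_vector(baseline)
    cov = covariance(baseline, mean)
    for i in range(len(mean)):
        cov[i][i] += ridge
    d = [xi - mi for xi, mi in zip(x, mean)]
    z = solve(cov, d)
    return sum(di * zi for di, zi in zip(d, z))
```

With `ridge=0.0` this reduces to the classical statistic and fails when the sample covariance is singular, which is what happens when p exceeds n; a positive `ridge` keeps the matrix invertible in that case.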

5.
Communications in Statistics - Theory and Methods, 2012, 41(13-14): 2545-2569
We study the general linear model (GLM) with doubly exchangeable distributed error for m observed random variables. The doubly exchangeable general linear model (DEGLM) arises when the m-dimensional error vectors are “doubly exchangeable,” jointly normally distributed, which is a much weaker assumption than the independent and identically distributed error vectors as in the case of GLM or classical GLM (CGLM). We estimate the parameters in the model and also find their distributions. We show that the tests of intercept and slope are possible in DEGLM as a particular case using parametric bootstrap as well as multivariate Satterthwaite approximation.

6.
The mode of a distribution provides an important summary of data and is often estimated on the basis of some non‐parametric kernel density estimator. This article develops a new data analysis tool called modal linear regression in order to explore high‐dimensional data. Modal linear regression models the conditional mode of a response Y given a set of predictors x as a linear function of x. It differs from standard linear regression, which models the conditional mean (as opposed to the mode) of Y as a linear function of x. We propose an expectation–maximization algorithm to estimate the regression coefficients of modal linear regression. We also provide asymptotic properties for the proposed estimator without assuming a symmetric error density. Our empirical studies with simulated data and real data demonstrate that the proposed modal regression gives shorter predictive intervals than mean linear regression, median linear regression and MM‐estimators.
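
The abstract does not spell out the expectation–maximization iteration, so the following is only a sketch of one common form of such a scheme, under stated assumptions: a Gaussian kernel with a fixed bandwidth `h`, a single predictor with an intercept, and an ordinary least squares start. The function names are hypothetical. Each pass reweights the observations by the kernel evaluated at their current residuals and then refits by weighted least squares.

```python
import math


def weighted_ls(xs, ys, w):
    """Weighted least squares fit of y = b0 + b1 * x."""
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    b1 = sxy / sxx
    return my - b1 * mx, b1


def modal_linear_fit(xs, ys, h=1.0, iters=50):
    """EM-style iteration for a linear conditional-mode fit (illustrative)."""
    b0, b1 = weighted_ls(xs, ys, [1.0] * len(xs))  # OLS starting value
    for _ in range(iters):
        # E-step: Gaussian kernel weights at the current residuals.
        w = [math.exp(-0.5 * ((y - b0 - b1 * x) / h) ** 2)
             for x, y in zip(xs, ys)]
        total = sum(w)
        if total == 0.0:  # all weights underflowed; keep the last fit
            break
        w = [wi / total for wi in w]
        # M-step: weighted least squares with those weights.
        b0, b1 = weighted_ls(xs, ys, w)
    return b0, b1
```

In a quick check where most points lie on a line and a few sit far above it, the kernel weights of the distant points shrink towards zero, so the fit follows the bulk of the data rather than the mean. The result depends on the bandwidth and the starting value, neither of which this sketch tunes.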

7.
Pseudo‐values have proven very useful in censored data analysis in complex settings such as multi‐state models. They were originally suggested by Andersen et al., Biometrika, 90, 2003, 335, who also suggested estimating standard errors using classical generalized estimating equation results. These results were studied more formally in Graw et al., Lifetime Data Anal., 15, 2009, 241, who derived some key results based on a second‐order von Mises expansion. However, the large sample properties of estimates based on regression models for pseudo‐values still seem unclear. In this paper, we study these large sample properties in the simple setting of survival probabilities and show that the estimating function can be written as a U‐statistic of second order, giving rise to an additional term that does not vanish asymptotically. We further show that previously advocated standard error estimates will typically be too large, although in many practical applications the difference will be of minor importance. We show how to estimate the variability of the estimator correctly. This is further examined in simulation studies.

8.
We introduce a class of Toeplitz‐band matrices for simple goodness of fit tests for parametric regression models. For a given length r of the band matrix, the asymptotically optimal solution is derived. Asymptotic normality of the corresponding test statistic is established under fixed and random design assumptions, for both linear and non‐linear models. This allows testing of any parametric assumption as well as the computation of confidence intervals for a quadratic measure of discrepancy between the parametric model and the true signal g. Furthermore, the connection between testing the parametric goodness of fit and estimating the error variance is highlighted. As a by‐product we obtain a much simpler proof of a result of [34] concerning the optimality of an estimator for the variance. Our results unify and generalize recent results by [9] and [15, 16] in several directions. Extensions to multivariate predictors and unbounded signals are discussed. A simulation study shows that a simple jackknife correction of the proposed test statistics leads to reasonable finite sample approximations.

9.
Partially linear regression models are semiparametric models that contain both linear and nonlinear components. They are extensively used in many scientific fields for their flexibility and convenient interpretability. In such analyses, testing the significance of the regression coefficients in the linear component is typically a key focus. Under the high-dimensional setting, i.e., “large p, small n,” the conventional F-test strategy does not apply because the coefficients need to be estimated through regularization techniques. In this article, we develop a new test using a U-statistic of order two, relying on a pseudo-estimate of the nonlinear component from the classical kernel method. Using the martingale central limit theorem, we prove the asymptotic normality of the proposed test statistic under some regularity conditions. We further demonstrate our proposed test's finite-sample performance by simulation studies and by analyzing some breast cancer gene expression data.

10.
Kernel discriminant analysis translates the original classification problem into feature space and solves the problem with dimension and sample size interchanged. In high‐dimension low sample size (HDLSS) settings, this reduces the ‘dimension’ to that of the sample size. For HDLSS two‐class problems we modify Mika's kernel Fisher discriminant function which, in general, remains ill‐posed even in a kernel setting; see Mika et al. (1999). We propose a kernel naive Bayes discriminant function and its smoothed version, using first‐ and second‐degree polynomial kernels. For fixed sample size and increasing dimension, we present asymptotic expressions for the kernel discriminant functions, for the discriminant directions and for the error probability of our kernel discriminant functions. The theoretical calculations are complemented by simulations which show the convergence of the estimators to the population quantities as the dimension grows. We illustrate the performance of the new discriminant rules, which are easy to implement, on real HDLSS data. For such data, our results clearly demonstrate the superior performance of the new discriminant rules, and especially their smoothed versions, over Mika's kernel Fisher version, and typically also over the commonly used naive Bayes discriminant rule.

11.
Central limit theorems play an important role in the study of statistical inference for stochastic processes. However, when the non‐parametric local polynomial threshold estimator, especially in the local linear case, is employed to estimate the diffusion coefficients of diffusion processes, the adaptive and predictable structure of the estimator, conditional on the σ‐field generated by the diffusion processes, is destroyed, so the classical central limit theorem for martingale difference sequences does not apply. For high‐frequency data, we prove central limit theorems for local polynomial threshold estimators of the volatility function in diffusion processes with jumps, using Jacod's stable convergence theorem. We believe that our proof procedure for local polynomial threshold estimators provides a new method in this field, especially in the local linear case.

12.
Tests for redundancy of variables in linear two-group discriminant analysis are well known and frequently used. We give a survey of similar tests, including the one-sample T² as a special case, in the situation in which only the mean vector (but no covariance matrix) is available in one sample. Then we show that a relation between linear regression and discriminant functions found by Fisher (1936) can be generalized to this situation. Relating regression and discriminant analysis to a multivariate linear model sheds more light on the relationship between them. Practical and didactical advantages of the regression approach to T² tests and discriminant analysis are outlined.

13.
In the multinomial regression model, we consider the methodology for simultaneous model selection and parameter estimation using the shrinkage and LASSO (least absolute shrinkage and selection operator) [R. Tibshirani, Regression shrinkage and selection via the LASSO, J. R. Statist. Soc. Ser. B 58 (1996), pp. 267–288] strategies. The shrinkage estimators (SEs) provide significant improvement over their classical counterparts in the case where some of the predictors may or may not be active for the response of interest. The asymptotic properties of the SEs are developed using the notion of asymptotic distributional risk. We then compare the relative performance of the LASSO estimator with two SEs in terms of simulated relative efficiency. A simulation study shows that the shrinkage and LASSO estimators dominate the full model estimator. Further, both SEs perform better than the LASSO estimator when there are many inactive predictors in the model. A real-life data set is used to illustrate the suggested shrinkage and LASSO estimators.

14.
In data collection for environmental science and bioassay, left censoring because of nondetects is a problem. Similarly, right censoring frequently occurs in reliability and life data analysis. There is a need for goodness of fit tests that can adapt to left or right censored data and be used to check important distributional assumptions without becoming too difficult to implement regularly in practice. A new test statistic is derived from a plot of the standardized spacings between the order statistics versus their ranks. Any linear or curvilinear pattern is evidence against the null distribution. When testing the Weibull or extreme value null hypothesis, this statistic has a null distribution that is approximately F for most combinations of sample size and censoring of practical interest. Our statistic is compared to the Mann-Scheuer-Fertig statistic, which also uses the standardized spacings between the order statistics. The results of a simulation study show the two tests are competitive in terms of power. Although the Mann-Scheuer-Fertig statistic is somewhat easier to compute, our test enjoys advantages in the accuracy of the F approximation and the availability of a graphical diagnostic.

15.
Early investigations of the effects of non-normality indicated that skewness has a greater effect on the distribution of the t-statistic than does kurtosis. When the distribution is skewed, the actual p-values can be larger than the values calculated from the t-tables. Transformation of data to normality has shown good results in the case of the univariate t-test. In order to reduce the effect of skewness on the normal-based t-test, one can transform the data and perform the t-test on the transformed scale. This method is not only a remedy for satisfying the distributional assumption; it also turns out that one can achieve greater efficiency of the test. We investigate the efficiency of tests after a Box-Cox transformation. In particular, we consider the one-sample test of location and study the gains in efficiency for the one-sample t-test following a Box-Cox transformation. Under some conditions, we prove that the asymptotic relative efficiency of the transformed t-test and Hotelling's T² test of multivariate location, with respect to the same statistic based on untransformed data, is at least one.
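
The transform-then-test procedure in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the Box-Cox parameter `lam` is taken as given rather than estimated, the hypothesised location `mu0_transformed` is assumed to be supplied already on the transformed scale, and the function names are hypothetical.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of a positive value: (x**lam - 1) / lam, or log(x) if lam == 0."""
    if x <= 0:
        raise ValueError("Box-Cox requires positive data")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def one_sample_t(values, mu0):
    """Classical one-sample t statistic: (mean - mu0) / (s / sqrt(n))."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return (mean - mu0) / math.sqrt(var / n)


def t_after_box_cox(values, lam, mu0_transformed):
    """t statistic computed on the Box-Cox scale (lam assumed known)."""
    transformed = [box_cox(v, lam) for v in values]
    return one_sample_t(transformed, mu0_transformed)
```

The statistic is referred to a t distribution with n - 1 degrees of freedom as usual; in practice `lam` would normally be estimated from the data, which this sketch leaves out.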

16.
We consider permutation tests based on a likelihood ratio like statistic for the one way or k sample design used in an example in Kolassa and Robinson [(2011), ‘Saddlepoint Approximations for Likelihood Ratio Like Statistics with Applications to Permutation Tests’, Annals of Statistics, 39, 3357–3368]. We give explicitly the region in which the statistic exists, obtaining results which permit calculation of the statistic on the boundary of this region. Numerical examples are given to illustrate improvement in the power of the tests compared to the classical statistics for long-tailed error distributions and no loss of power for normal error distributions.

17.
The threshold diffusion model assumes a piecewise linear drift term and a piecewise smooth diffusion term, which constitutes a rich model for analyzing nonlinear continuous-time processes. We consider the problem of testing for threshold nonlinearity in the drift term. We do this by developing a quasi-likelihood test derived under the working assumption of a constant diffusion term, which circumvents the problem of the generally unknown functional form of the diffusion term. The test is first developed for testing for one threshold at which the drift term breaks into two linear functions. We show that, under some mild regularity conditions, the asymptotic null distribution of the proposed test statistic is given by the distribution of a certain functional of a centered Gaussian process. We develop a computationally efficient method for calibrating the p-value of the test statistic by bootstrapping its asymptotic null distribution. The local power function is also derived, which establishes the consistency of the proposed test. The test is then extended to testing for multiple thresholds. We demonstrate the efficacy of the proposed test by simulations. Using the proposed test, we examine the evidence of nonlinearity in the term structure of a long time series of U.S. interest rates.

18.
Data‐analytic tools for models other than the normal linear regression model are relatively rare. Here we develop plots and diagnostic statistics for nonconstant variance for the random‐effects model (REM). REMs for longitudinal data include both within‐ and between‐subject variances. A basic assumption is that the two variance terms are constant across subjects. However, we often find that these variances are functions of covariates, and the data set has what we call explainable heterogeneity, which needs to be allowed for in the model. We characterize several types of heterogeneity of variance in REMs and develop three diagnostic tests using the score statistic: one for each of the two variance terms, and the third for a form of multivariate nonconstant variance. For each test we present an adjusted residual plot which can identify cases that are unusually influential on the outcome of the test.

19.
In this article we study a linear discriminant function of multiple m-variate observations at u-sites and over v-time points under the assumption of multivariate normality. We assume that the m-variate observations have a separable mean vector structure and a “jointly equicorrelated covariance” structure. The new discriminant function is very effective in discriminating individuals in a small sample scenario. No closed-form expression exists for the maximum likelihood estimates of the unknown population parameters, and their direct computation is nontrivial. An iterative algorithm is proposed to calculate the maximum likelihood estimates of these unknown parameters. A discriminant function is also developed for unstructured mean vectors. The new discriminant functions are applied to simulated data sets as well as to a real data set. Results illustrating the benefits of the new classification methods over the traditional one are presented.

20.
This paper derives the distribution of the probabilities of misclassification that arise from the use of the linear discriminant function. The statistical background is two independent doubly truncated t populations with distinct location parameters and a common scale parameter and degrees of freedom. The behavior of the linear discriminant function is studied by comparing the distribution function of the errors of misclassification under the truncated t and truncated normal models.
