Similar articles
 20 similar articles found (search time: 62 ms)
1.
In the last few years, two adaptive tests for paired data have been proposed. One test proposed by Freidlin et al. [On the use of the Shapiro–Wilk test in two-stage adaptive inference for paired data from moderate to very heavy tailed distributions, Biom. J. 45 (2003), pp. 887–900] is a two-stage procedure that uses a selection statistic to determine which of three rank scores to use in the computation of the test statistic. Another test, proposed by O'Gorman [Applied Adaptive Statistical Methods: Tests of Significance and Confidence Intervals, Society for Industrial and Applied Mathematics, Philadelphia, 2004], uses a weighted t-test with the weights determined by the data. These two methods, and an earlier rank-based adaptive test proposed by Randles and Hogg [Adaptive Distribution-free Tests, Commun. Stat. 2 (1973), pp. 337–356], are compared with the t-test and Wilcoxon's signed-rank test. For sample sizes between 15 and 50, the results show that the adaptive test proposed by Freidlin et al. and the adaptive test proposed by O'Gorman have higher power than the other tests over a range of moderate to long-tailed symmetric distributions. The results also show that the test proposed by O'Gorman has greater power than the other tests for short-tailed distributions. For sample sizes greater than 50 and for small sample sizes, the adaptive test proposed by O'Gorman has the highest power for most distributions.

2.
The demand for reliable statistics in subpopulations, when only reduced sample sizes are available, has promoted the development of small area estimation methods. In particular, an approach that is now widely used is based on the seminal work by Battese et al. [An error-components model for prediction of county crop areas using survey and satellite data, J. Am. Statist. Assoc. 83 (1988), pp. 28–36] that uses linear mixed models (MM). We investigate alternatives when a linear MM does not hold because, on the one hand, linearity may not be assumed and/or, on the other, normality of the random effects may not be assumed. In particular, Opsomer et al. [Nonparametric small area estimation using penalized spline regression, J. R. Statist. Soc. Ser. B 70 (2008), pp. 265–283] propose an estimator that uses penalized spline regression to extend the linear MM approach to the case in which a linear relationship may not be assumed. From a very different perspective, Chambers and Tzavidis [M-quantile models for small area estimation, Biometrika 93 (2006), pp. 255–268] have recently proposed an approach for small area estimation that is based on M-quantile (MQ) regression. This allows for models robust to outliers and to distributional assumptions on the errors and the area effects. However, when the functional form of the relationship between the qth MQ and the covariates is not linear, it can lead to biased estimates of the small area parameters. Pratesi et al. [Semiparametric M-quantile regression for estimating the proportion of acidic lakes in 8-digit HUCs of the Northeastern US, Environmetrics 19(7) (2008), pp. 687–701] apply an extended version of this approach for the estimation of the small area distribution function using a non-parametric specification of the conditional MQ of the response variable given the covariates [M. Pratesi, M.G. Ranalli, and N. Salvati, Nonparametric m-quantile regression using penalized splines, J. Nonparametric Stat. 21 (2009), pp. 287–304].
We derive the small area estimator of the mean under this model, together with its mean-squared error estimator, and compare its performance with that of the other estimators via simulations on both real and simulated data.

3.
Quantile regression (QR), proposed by Koenker and Bassett [Regression quantiles, Econometrica 46(1) (1978), pp. 33–50], is a statistical technique that estimates conditional quantiles. It has been widely studied and applied in economics. Meinshausen [Quantile regression forests, J. Mach. Learn. Res. 7 (2006), pp. 983–999] proposed quantile regression forests (QRF), a non-parametric method based on random forests. QRF performs well in terms of prediction accuracy, but it struggles with noisy data sets. This motivates us to propose a multi-step QR tree method using GUIDE (Generalized, Unbiased, Interaction Detection and Estimation), developed by Loh [Regression trees with unbiased variable selection and interaction detection, Statist. Sinica 12 (2002), pp. 361–386]. Our simulation study shows that the multi-step QR tree performs better than a single tree or QRF, especially when the data sets contain many irrelevant variables.
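
As a small illustration of what "estimating quantiles" means here, the sketch below uses the check (pinball) loss of Koenker and Bassett in the simplest possible setting: a constant fit with no covariates, where the minimizer is a sample quantile. It is not the multi-step QR tree or QRF described in the abstract; the function names and the toy data are assumptions made for illustration only.

```python
def check_loss(u, q):
    """Check (pinball) loss rho_q(u) = u * (q - 1{u < 0})."""
    return u * (q - (1.0 if u < 0 else 0.0))


def total_check_loss(c, ys, q):
    """Sum of check losses of the residuals y - c for a constant fit c."""
    return sum(check_loss(y - c, q) for y in ys)


def sample_quantile_by_check_loss(ys, q):
    """Return a data point that minimizes the total check loss.

    With no covariates, a minimizer of the check loss is a q-th sample
    quantile, and one can always be found among the observations.
    """
    return min(sorted(ys), key=lambda c: total_check_loss(c, ys, q))


# Hypothetical toy data, chosen only to keep the arithmetic easy to follow.
ys = [2.0, 7.0, 1.0, 9.0, 4.0]
median = sample_quantile_by_check_loss(ys, 0.5)  # minimizer for q = 0.5
upper = sample_quantile_by_check_loss(ys, 0.9)   # minimizer for q = 0.9
```

For q = 0.5 the loss is half the absolute error, so the minimizer is the median; larger q penalizes under-prediction more heavily and pushes the fit toward the upper tail.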

4.
This paper demonstrates that cross-validation (CV) and Bayesian adaptive bandwidth selection can be applied in the estimation of associated kernel discrete functions. This idea was originally proposed by Brewer [A Bayesian model for local smoothing in kernel density estimation, Stat. Comput. 10 (2000), pp. 299–309] to derive variable bandwidths in adaptive kernel density estimation. Our approach considers the adaptive binomial kernel estimator and treats the variable bandwidths as parameters with a beta prior distribution. The best variable bandwidth selector is estimated by the posterior mean in the Bayesian sense under squared error loss. Monte Carlo simulations are conducted to examine the performance of the proposed Bayesian adaptive approach in comparison with the performance of the asymptotic mean integrated squared error estimator and the CV technique for selecting a global (fixed) bandwidth proposed in Kokonendji and Senga Kiessé [Discrete associated kernels method and extensions, Stat. Methodol. 8 (2011), pp. 497–516]. The Bayesian adaptive bandwidth estimator performs better than the global bandwidth, in particular for small and moderate sample sizes.

5.
This paper considers the estimation of the regression coefficients in the Cox proportional hazards model with left-truncated and interval-censored data. Using the approaches of Pan [A multiple imputation approach to Cox regression with interval-censored data, Biometrics 56 (2000), pp. 199–203] and Heller [Proportional hazards regression with interval censored data using an inverse probability weight, Lifetime Data Anal. 17 (2011), pp. 373–385], we propose two estimates of the regression coefficients. The first estimate is based on a multiple imputation methodology. The second estimate uses an inverse probability weight to select event time pairs where the ordering is unambiguous. A simulation study is conducted to investigate the performance of the proposed estimators. The proposed methods are illustrated using the Centers for Disease Control and Prevention (CDC) acquired immunodeficiency syndrome (AIDS) Blood Transfusion Data.

6.
In this article, the restricted r–k class estimator and the restricted r–d class estimator are introduced; they generalize the r–k class estimator of Baye and Parker [Combining ridge and principal component regression: A money demand illustration, Commun. Stat. Theory Methods 13(2) (1984), pp. 197–205] and the r–d class estimator of Kaçıranlar and Sakallıoğlu [Combining the Liu estimator and the principal component regression estimator, Commun. Stat. Theory Methods 30(12) (2001), pp. 2699–2705], respectively. For the two cases in which the restrictions are true and not true, the superiority of the restricted r–k class estimator and r–d class estimator over the restricted ridge regression estimator of Sarkar [A new estimator combining the ridge regression and the restricted least squares methods of estimation, Commun. Stat. Theory Methods 21 (1992), pp. 1987–2000] and the restricted Liu estimator of Kaçıranlar et al. [A new biased estimator in linear regression and a detailed analysis of the widely analysed dataset on Portland cement, Sankhya - Indian J. Stat. 61B(3) (1999), pp. 443–459] is discussed with respect to the mean squared error matrix criterion. Furthermore, a Monte Carlo evaluation of the estimators is given to illustrate some of the theoretical results.

7.
This article investigates confidence regions for semiparametric nonlinear reproductive dispersion models (SNRDMs), which are an extension of nonlinear regression models. Based on the local linear estimate of the nonparametric component and the generalized profile likelihood estimate of the parameter in SNRDMs, a modified geometric framework of Bates and Watts is proposed. Within this geometric framework, we present three kinds of improved approximate confidence regions for the parameters and parameter subsets in terms of curvatures. The work extends the previous results of Hamilton et al. [in Accounting for intrinsic nonlinearity in nonlinear regression parameter inference regions, Ann. Statist. 10, pp. 386–393, 1982], Hamilton [in Confidence regions for parameter subset in nonlinear regression, Biometrika, 73, pp. 57–64, 1986], Wei [in On confidence regions of embedded models in regular parameter families (a geometric approach), Austral. J. Statist. 36, pp. 327–338, 1994], Tang et al. [in Confidence regions in quasi-likelihood nonlinear models: a geometric approach, J. Biomath. 15, pp. 55–64, 2000b] and Zhu et al. [in On confidence regions of semiparametric nonlinear regression models, Acta. Math. Scient. 20, pp. 68–75, 2000].

8.
In disease screening and diagnosis, multiple markers are often measured and combined to improve the accuracy of diagnosis. McIntosh and Pepe [Combining several screening tests: optimality of the risk score, Biometrics 58 (2002), pp. 657–664] showed that the risk score, defined as the probability of disease conditional on multiple markers, is the optimal function for classification based on the Neyman–Pearson lemma. They proposed a two-step procedure to approximate the risk score. However, the resulting receiver operating characteristic (ROC) curve is only defined in a subrange (L, h) of false-positive rates in (0,1), and the determination of the lower limit L needs extra prior information. In practice, most diagnostic tests are not perfect, and it is rare that a single marker is uniformly better than the other tests. Using simulation, I show that the multivariate adaptive regression spline is a useful tool to approximate the risk score when combining multiple markers, especially when ROC curves from multiple tests cross. The resulting ROC curve is defined over the whole range (0,1), is easy to implement, and has an intuitive interpretation. Sample code for the application is shown in the appendix.

9.
Hu Yang. Statistics, 2013, 47(6): 759-766
In this paper, we introduce a stochastic restricted k–d class estimator for the vector of parameters in a linear model when additional linear restrictions on the parameter vector are assumed to hold. The stochastic restricted k–d class estimator is a generalization of the ordinary mixed estimator and the k–d class estimator. We show that our new biased estimator is superior in the mean squared error matrix sense to the k–d class estimator [S. Sakallıoğlu and S. Kaçıranlar, A new biased estimator based on ridge estimation, Statist. Papers 49 (2008), pp. 669–689] and the stochastic restricted Liu estimator [H. Yang and J.W. Xu, An alternative stochastic restricted Liu estimator in linear regression, Statist. Papers 50 (2009), pp. 639–647]. Finally, a numerical example is given to illustrate the theoretical results.

10.
We investigate the instability problem of the covariance structure of time series by combining the non-parametric approach based on the evolutionary spectral density theory of Priestley [Evolutionary spectra and non-stationary processes, J. R. Statist. Soc., 27 (1965), pp. 204–237; Wavelets and time-dependent spectral analysis, J. Time Ser. Anal., 17 (1996), pp. 85–103] and the parametric approach based on linear regression models of Bai and Perron [Estimating and testing linear models with multiple structural changes, Econometrica 66 (1998), pp. 47–78]. A Monte Carlo study is presented to evaluate the performance of some parametric testing and estimation procedures for models characterized by breaks in variance. We attempt to see whether these procedures perform in the same way as they do for models characterized by mean shifts, as investigated by Bai and Perron [Multiple structural change models: a simulation analysis, in: Econometric Theory and Practice: Frontiers of Analysis and Applied Research, D. Corbae, S. Durlauf, and B.E. Hansen, eds., Cambridge University Press, 2006, pp. 212–237]. We also provide an analysis of financial data series for which the stability of the covariance function is doubtful.

11.
The varying coefficient model (VCM) is an important generalization of the linear regression model, and many existing estimation procedures for the VCM were built on the L2 loss, which is popular for its mathematical beauty but is not robust to non-normal errors and outliers. In this paper, we address both the robustness and the efficiency of estimation and variable selection procedures for the VCM that are based on the convex combined loss of L1 and L2 instead of only the quadratic loss. Using the local linear modeling method, the asymptotic normality of the estimator is derived and a useful selection method is proposed for the weight of the composite L1 and L2 loss. The variable selection procedure is then given by combining local kernel smoothing with the adaptive group LASSO. With appropriate selection of tuning parameters by the Bayesian information criterion (BIC), the theoretical properties of the new procedure, including consistency in variable selection and the oracle property in estimation, are established. The finite sample performance of the new method is investigated through simulation studies and the analysis of body fat data. Numerical studies show that the new method is better than, or at least as good as, the least squares-based method in terms of both robustness and efficiency for variable selection.
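
To make the idea of a convex combined L1 and L2 loss concrete, here is a minimal sketch for a plain location estimate rather than a varying coefficient model. The parameterization w·|u| + (1 − w)·u², the grid search, and the toy data are assumptions made for illustration; the abstract does not state the exact form or the weight-selection rule used by the authors.

```python
def combined_loss(u, w):
    """Convex combination of absolute (L1) and squared (L2) loss.

    The weight w in [0, 1] is an assumed parameterization:
    w = 1 gives pure L1 loss, w = 0 gives pure L2 loss.
    """
    return w * abs(u) + (1.0 - w) * u * u


def location_estimate(ys, w, grid_step=0.01):
    """Minimize the summed combined loss over a grid of constant fits."""
    lo, hi = min(ys), max(ys)
    n_steps = int(round((hi - lo) / grid_step))
    candidates = [lo + k * grid_step for k in range(n_steps + 1)]
    return min(candidates,
               key=lambda c: sum(combined_loss(y - c, w) for y in ys))


# Hypothetical toy data with one gross outlier (100.0).
ys = [1.0, 2.0, 3.0, 4.0, 100.0]
l2_fit = location_estimate(ys, w=0.0)     # close to the mean, pulled by the outlier
l1_fit = location_estimate(ys, w=1.0)     # close to the median, resistant to it
mixed_fit = location_estimate(ys, w=0.5)  # lies between the two
```

The pure L2 fit follows the outlier and the pure L1 fit ignores it; intermediate weights trade robustness against efficiency, which is the trade-off the weight-selection step is meant to resolve.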

12.
In this paper, the three-decision procedures for classifying p treatments as better than or worse than one control, proposed in the literature for normal/symmetric probability models [Bohrer, Multiple three-decision rules for parametric signs. J. Amer. Statist. Assoc. 74 (1979), pp. 432–437; Bohrer et al., Multiple three-decision rules for factorial simple effects: Bonferroni wins again!, J. Amer. Statist. Assoc. 76 (1981), pp. 119–124; Liu, A multiple three-decision procedure for comparing several treatments with a control, Austral. J. Statist. 39 (1997), pp. 79–92 and Singh and Mishra, Classifying logistic populations using sample medians, J. Statist. Plann. Inference 137 (2007), pp. 1647–1657], are extended to asymmetric two-parameter exponential probability models to classify p (p ≥ 1) treatments as better than or worse than the best of q (q ≥ 1) control treatments in terms of location parameters. Critical constants required for the implementation of the proposed procedure are tabulated for some pre-specified values of the probability of no misclassification. The power function of the proposed procedure is also defined, and the common sample sizes necessary to guarantee various pre-specified power levels are tabulated. An optimal allocation scheme is also discussed. Finally, the implementation of the proposed methodology is demonstrated through a numerical example.

13.
As an alternative to estimation based on a simple random sample (BLUE-SRS) for the simple linear regression model, Moussa-Hamouda and Leone [E. Moussa-Hamouda and F.C. Leone, The o-blue estimators for complete and censored samples in linear regression, Technometrics, 16 (3) (1974), pp. 441–446.] discussed the best linear unbiased estimators based on order statistics (BLUE-OS), and showed that BLUE-OS is more efficient than BLUE-SRS for normal data. Using ranked set sampling, Barreto and Barnett [M.C.M. Barreto and V. Barnett, Best linear unbiased estimators for the simple linear regression model using ranked set sampling. Environ. Ecol. Stat. 6 (1999), pp. 119–133.] derived the best linear unbiased estimators (BLUE-RSS) for the simple linear regression model and showed that BLUE-RSS is more efficient than BLUE-SRS for the estimation of the regression parameters (intercept and slope) for normal data, but not for the estimation of the residual standard deviation when the sample size is small. As an alternative to RSS, this paper considers the best linear unbiased estimators based on order statistics from a ranked set sample (BLUE-ORSS) and shows that BLUE-ORSS is uniformly more efficient than BLUE-RSS and BLUE-OS for normal data.

14.
An important problem in statistics is the study of longitudinal data taking into account the effect of other explanatory variables such as treatments and time. In this paper, a new Bayesian approach for analysing longitudinal data is proposed. This approach takes into account the possibility of having nonlinear regression structures on the mean and linear regression structures on the variance–covariance matrix of normal observations, and it is based on the modelling strategy suggested by Pourahmadi [M. Pourahmadi, Joint mean-covariance models with applications to longitudinal data: Unconstrained parameterizations, Biometrika, 87 (1999), pp. 667–690.]. We first extend the classical methodology to accommodate the fitting of nonlinear mean models; we then propose our Bayesian approach, based on a generalization of the Metropolis–Hastings algorithm of Cepeda [E.C. Cepeda, Variability modeling in generalized linear models, Unpublished Ph.D. Thesis, Mathematics Institute, Universidade Federal do Rio de Janeiro, 2001]. Finally, we illustrate the proposed methodology by analysing one example, the cattle data set, which is used to study cattle growth.

15.
In this paper, we translate variable selection for linear regression into a multiple testing problem, and select significant variables according to the testing results. New variable selection procedures are proposed based on the optimal discovery procedure (ODP) in multiple testing. Owing to the ODP's optimality, for a guaranteed number of significant variables included, it will include fewer non-significant variables than marginal p-value based methods. Consistency of our procedures is established in theory and confirmed by simulation. Simulation results suggest that procedures based on multiple testing improve on procedures based on selection criteria, and that our new procedures perform better than marginal p-value based procedures.

16.
In this paper, a nonparametric discriminant analysis procedure that is less sensitive than traditional procedures to deviations from the usual assumptions is proposed. The procedure uses the projection pursuit methodology where the projection index is the two-group transvariation probability. Montanari [A. Montanari, Linear discriminant analysis and transvariation, J. Classification 21 (2004), pp. 71–88] proposed and used this projection index to measure group separation but allocated the new observation using distances. Our procedure employs a method of allocation based on group–group transvariation probability to classify the new observation. A simulation study shows that the procedure proposed in this paper provides lower misclassification error rates than classical procedures like linear discriminant analysis and quadratic discriminant analysis and recent procedures like maximum depth and Montanari's transvariation-based classifiers, when the underlying distributions are skewed and/or the prior probabilities are unequal.

17.
In this paper, we consider the bootstrap procedure for the augmented Dickey–Fuller (ADF) unit root test by implementing the modified divergence information criterion (MDIC, Mantalos et al. [An improved divergence information criterion for the determination of the order of an AR process, Commun. Statist. Comput. Simul. 39(5) (2010a), pp. 865–879; Forecasting ARMA models: A comparative study of information criteria focusing on MDIC, J. Statist. Comput. Simul. 80(1) (2010b), pp. 61–73]) for the selection of the optimum number of lags in the estimated model. The asymptotic distribution of the resulting bootstrap ADF/MDIC test is established and its finite sample performance is investigated through Monte Carlo simulations. The proposed bootstrap tests are found to have finite sample sizes that are generally much closer to their nominal values than those of tests that rely on other information criteria, like the Akaike information criterion [H. Akaike, Information theory and an extension of the maximum likelihood principle, in Proceedings of the 2nd International Symposium on Information Theory, B.N. Petrov and F. Csáki, eds., Akadémiai Kiadó, Budapest, 1973, pp. 267–281]. The simulations reveal that the proposed procedure is quite satisfactory even for models with large negative moving average coefficients.

18.
In this paper, we give matrix formulae of order 𝒪(n⁻¹), where n is the sample size, for the first two moments of Pearson residuals in exponential family nonlinear regression models [G.M. Cordeiro and G.A. Paula, Improved likelihood ratio statistic for exponential family nonlinear models, Biometrika 76 (1989), pp. 93–100.]. The formulae are applicable to many regression models in common use and generalize the results by Cordeiro [G.M. Cordeiro, On Pearson's residuals in generalized linear models, Statist. Prob. Lett. 66 (2004), pp. 213–219.] and Cook and Tsai [R.D. Cook and C.L. Tsai, Residuals in nonlinear regression, Biometrika 72 (1985), pp. 23–29.]. We suggest adjusted Pearson residuals for these models having, to this order, expected value zero and variance one. We show that the adjusted Pearson residuals can be easily computed by weighted linear regressions. Some numerical results from simulations indicate that the adjusted Pearson residuals are better approximated by the standard normal distribution than the Pearson residuals.
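
For readers unfamiliar with the quantity being adjusted, the sketch below computes ordinary (unadjusted) Pearson residuals, (y − μ) / sqrt(V(μ)), using the Poisson variance function V(μ) = μ as the default. The fitted means are made-up numbers, and the 𝒪(n⁻¹) mean and variance corrections derived in the paper are not reproduced here.

```python
import math


def pearson_residuals(ys, mus, variance=lambda mu: mu):
    """Unadjusted Pearson residuals (y - mu) / sqrt(V(mu)).

    The default variance function V(mu) = mu corresponds to a Poisson
    model; pass another function for a different exponential family.
    """
    return [(y - mu) / math.sqrt(variance(mu)) for y, mu in zip(ys, mus)]


# Hypothetical observed counts and fitted means, for illustration only.
ys = [3, 0, 7]
mus = [4.0, 1.0, 4.0]
residuals = pearson_residuals(ys, mus)
```

In finite samples these residuals do not have exactly zero mean and unit variance, which is the gap the adjusted residuals in the abstract are designed to close.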

19.
Quantile regression methods have been used to estimate upper and lower quantile reference curves as functions of several covariates. In this article, it is demonstrated that the estimating equation of Zhou [A weighted quantile regression for randomly truncated data, Comput. Stat. Data Anal. 55 (2011), pp. 554–566.] can be extended to analyse left-truncated and right-censored data. We evaluate the finite sample performance of the proposed estimators through simulation studies. The proposed estimator β̂(q) is applied to the Veteran's Administration lung cancer data reported by Prentice [Exponential survival with censoring and explanatory variables, Biometrika 60 (1973), pp. 279–288].

20.
This paper introduces a new shrinkage estimator for the negative binomial regression model that is a generalization of the estimator proposed for the linear regression model by Liu [A new class of biased estimate in linear regression, Comm. Stat. Theor. Meth. 22 (1993), pp. 393–402]. This shrinkage estimator is proposed in order to solve the problem of an inflated mean squared error of the classical maximum likelihood (ML) method in the presence of multicollinearity. Furthermore, the paper presents some methods of estimating the shrinkage parameter. By means of Monte Carlo simulations, it is shown that if the Liu estimator is applied with these shrinkage parameters, it always outperforms ML. The benefit of the new estimation method is also illustrated in an empirical application. Finally, based on the results from the simulation study and the empirical application, a recommendation is given regarding which estimator of the shrinkage parameter should be used.
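
The linear-regression estimator of Liu that this abstract generalizes can be written as (X'X + I)⁻¹(X'y + d·β̂_OLS) with shrinkage parameter d. The sketch below implements that linear version only, for two predictors and no intercept; it is not the negative binomial estimator of the paper, and the helper names and toy data are assumptions made for illustration.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a @ x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]


def ols_estimator(xs, ys):
    """Ordinary least squares for two predictors, no intercept."""
    xtx = [[sum(r[i] * r[j] for r in xs) for j in range(2)] for i in range(2)]
    xty = [sum(r[i] * y for r, y in zip(xs, ys)) for i in range(2)]
    return solve_2x2(xtx, xty)


def liu_estimator(xs, ys, d):
    """Liu estimator (X'X + I)^{-1} (X'y + d * beta_ols) for linear regression."""
    xtx = [[sum(r[i] * r[j] for r in xs) for j in range(2)] for i in range(2)]
    xty = [sum(r[i] * y for r, y in zip(xs, ys)) for i in range(2)]
    beta_ols = solve_2x2(xtx, xty)
    shifted = [[xtx[i][j] + (1.0 if i == j else 0.0) for j in range(2)]
               for i in range(2)]
    rhs = [xty[i] + d * beta_ols[i] for i in range(2)]
    return solve_2x2(shifted, rhs)


# Hypothetical toy data with two nearly collinear predictors.
xs = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9]]
ys = [2.1, 3.9, 6.2, 7.9]
beta_ols = ols_estimator(xs, ys)
beta_liu = liu_estimator(xs, ys, d=0.5)
```

Setting d = 1 recovers ordinary least squares, while d < 1 shrinks the coefficient vector; how to choose d from the data is the question the abstract's shrinkage-parameter estimators address.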
