Similar Articles (20 results)
1.
Expectile regression [Newey W, Powell J. Asymmetric least squares estimation and testing. Econometrica. 1987;55:819–847] is a useful tool for estimating the conditional expectiles of a response variable given a set of covariates. Expectile regression at the 50% level is classical conditional mean regression. In many real applications, having multiple expectiles at different levels provides a more complete picture of the conditional distribution of the response variable. The multiple linear expectile regression model has been well studied [Newey and Powell, 1987; Efron B. Regression percentiles using asymmetric squared error loss. Stat Sin. 1991;1:93–125], but it can be too restrictive for many real applications. In this paper, we derive a regression tree-based gradient boosting estimator for nonparametric multiple expectile regression. The new estimator, referred to as ER-Boost, is implemented in an R package erboost, publicly available at http://cran.r-project.org/web/packages/erboost/index.html. We use two random-function-generator models (one homoscedastic, one heteroscedastic) in simulation to show the high predictive accuracy of ER-Boost. As an application, we apply ER-Boost to analyse North Carolina county crime data. From the nonparametric expectile regression analysis of this dataset, we draw several interesting conclusions that are consistent with a previous study using the economic model of crime. This real-data example also demonstrates some attractive features of ER-Boost, such as its ability to handle different types of covariates and its model-interpretation tools.
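The engine of ER-Boost is the asymmetric squared-error (expectile) loss L_τ(y, f) = |τ − I(y < f)| (y − f)²; a gradient-boosting machine needs only this loss and its negative gradient, to which each regression tree is fitted. Below is a minimal illustrative sketch in Python; it is not the erboost implementation, and the function names are ours.

```python
import numpy as np

def expectile_loss(y, f, tau):
    """Asymmetric squared error of Newey & Powell (1987): weight tau
    on positive residuals, (1 - tau) on negative ones."""
    r = y - f
    w = np.where(r >= 0, tau, 1.0 - tau)
    return np.mean(w * r**2)

def negative_gradient(y, f, tau):
    """Negative gradient of the expectile loss; in gradient boosting,
    each regression tree is fitted to these pseudo-residuals."""
    r = y - f
    w = np.where(r >= 0, tau, 1.0 - tau)
    return 2.0 * w * r
```

At tau = 0.5 both functions reduce (up to a constant) to ordinary squared-error loss and its residuals, recovering mean regression.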

2.

We propose a semiparametric framework based on sliced inverse regression (SIR) to address the issue of variable selection in functional regression. SIR is an effective method for dimension reduction which computes a linear projection of the predictors into a low-dimensional space, without loss of information on the regression. To deal with the high dimensionality of the predictors, we consider penalized versions of SIR: ridge and sparse. We extend variable-selection approaches developed for multidimensional SIR to select intervals that form a partition of the definition domain of the functional predictors. Selecting entire intervals rather than separate evaluation points improves the interpretability of the estimated coefficients in the functional framework. A fully automated iterative procedure is proposed to find the critical (interpretable) intervals. The approach proves effective on simulated and real data. The method is implemented in the R package SISIR, available on CRAN at https://cran.r-project.org/package=SISIR.
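For orientation, here is a minimal numpy sketch of the unpenalized SIR step that the ridge and sparse variants build on; it is not the SISIR package API, and the function name and defaults are illustrative.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, n_components=2):
    """Basic sliced inverse regression: standardize X, average it
    within slices of sorted y, and take the leading eigenvectors of
    the between-slice covariance of those slice means."""
    n, p = X.shape
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    L = np.linalg.cholesky(np.linalg.inv(cov))
    Z = (X - mu) @ L                       # whitened predictors
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)
    M = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    # Leading eigenvectors span the estimated dimension-reduction space
    vals, vecs = np.linalg.eigh(M)
    beta = L @ vecs[:, ::-1][:, :n_components]  # back to the X scale
    return beta
```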


3.
Principal component regression uses principal components (PCs) as regressors. It is particularly useful in prediction settings with high-dimensional covariates. The existing literature on Bayesian approaches is relatively sparse. We introduce a Bayesian approach that is robust to outliers in both the dependent variable and the covariates. Outliers can be thought of as observations that are not in line with the general trend. The proposed approach automatically penalises these observations so that their impact on the posterior gradually vanishes as they move further and further away from the general trend, corresponding to a concept in Bayesian statistics called whole robustness. The predictions produced are thus consistent with the bulk of the data. The approach also exploits the geometry of PCs to efficiently identify those that are significant. Individual predictions obtained from the resulting models are consolidated according to model-averaging mechanisms to account for model uncertainty. The approach is evaluated on real data and compared to its nonrobust Bayesian counterpart, the traditional frequentist approach, and a commonly employed robust frequentist method. Detailed guidelines to automate the entire statistical procedure are provided. All required code is made available; see arXiv:1711.06341.
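The robust Bayesian machinery does not fit in a short snippet, but the classical PCR baseline that the paper compares against is compact. A plain (nonrobust) sketch, with illustrative names:

```python
import numpy as np

def pcr_fit_predict(X, y, X_new, n_components):
    """Plain principal component regression: project the centred
    predictors onto their leading right singular vectors and run
    least squares on the resulting scores."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    # SVD of the centred design: rows of Vt are the PC loadings
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T
    scores = Xc @ V
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    return y_mean + (X_new - x_mean) @ V @ gamma
```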

4.
Doubly truncated data appear in a number of applications, including astronomy and survival analysis. For doubly truncated data, the lifetime T is observable only when U ≤ T ≤ V, where U and V are the left and right truncation times, respectively. Based on the empirical likelihood approach of Zhou [Zhou, M. (2005). Empirical likelihood ratio with arbitrarily censored/truncated data by EM algorithm. J. Comput. Graph. Statist. 14:643–656], we propose a modified version of the EM algorithm of Turnbull [Turnbull, B. W. (1976). The empirical distribution function with arbitrarily grouped censored and truncated data. J. R. Stat. Soc. Ser. B 38:290–295] to construct an interval estimator of the distribution function of T. Simulation results indicate that the empirical likelihood method can be more efficient than the bootstrap method.
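As background, the conditional likelihood under double truncation leads to an EM-type self-consistency fixed point for the NPMLE of F. The version below is a generic illustration (names and defaults ours), not the paper's modified Turnbull EM or its empirical-likelihood interval:

```python
import numpy as np

def npmle_doubly_truncated(t, u, v, n_iter=500, tol=1e-10):
    """Self-consistency iteration for the NPMLE of F under double
    truncation: subject i is observed only when u[i] <= t[i] <= v[i].
    Returns the support points and their estimated masses."""
    support, d = np.unique(t, return_counts=True)  # distinct lifetimes
    # inc[i, j] = 1 if support point j is observable for subject i
    inc = (u[:, None] <= support) & (support <= v[:, None])
    f = d / d.sum()                                # initial masses
    for _ in range(n_iter):
        F_i = inc @ f                              # P(U_i <= T <= V_i)
        f_new = d / (inc / F_i[:, None]).sum(axis=0)
        f_new /= f_new.sum()
        if np.max(np.abs(f_new - f)) < tol:
            break
        f = f_new
    return support, f
```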

5.
Abstract

In this article, we revisit the problem of fitting a mixture model under the assumption that the mixture components are symmetric and log-concave. To this end, we first study the nonparametric maximum likelihood estimation (MLE) of a monotone log-concave probability density. To fit the mixture model, we propose a semiparametric EM (SEM) algorithm, which can be adapted to other semiparametric mixture models. In our numerical experiments, we compare our algorithm to that of Balabdaoui and Doss [Balabdaoui, F., and C. R. Doss (2018). Inference for a two-component mixture of symmetric distributions under log-concavity. Bernoulli 24(2):1053–71] and to other mixture models, on both simulated and real-world datasets.
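To fix ideas, here is an EM loop for a two-component location mixture in which parametric Gaussian components stand in for the symmetric log-concave component density; the SEM algorithm instead re-estimates that density nonparametrically in its M-step. A hedged toy version, not the authors' code:

```python
import numpy as np
from scipy.stats import norm

def em_two_component(x, n_iter=200):
    """EM for a two-component location mixture with a common scale.
    Gaussian components are a parametric stand-in for the symmetric
    log-concave component density of the SEM algorithm."""
    # Crude initial values from the sample
    pi, mu1, mu2, sigma = 0.5, x.min(), x.max(), x.std()
    for _ in range(n_iter):
        # E-step: posterior probability of component 1
        p1 = pi * norm.pdf(x, mu1, sigma)
        p2 = (1 - pi) * norm.pdf(x, mu2, sigma)
        w = p1 / (p1 + p2)
        # M-step: update weight, locations and common scale
        pi = w.mean()
        mu1 = np.sum(w * x) / np.sum(w)
        mu2 = np.sum((1 - w) * x) / np.sum(1 - w)
        var = np.sum(w * (x - mu1)**2 + (1 - w) * (x - mu2)**2) / len(x)
        sigma = np.sqrt(var)
    return pi, mu1, mu2, sigma
```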

6.
This article addresses the density estimation problem using a nonparametric Bayesian approach. We consider hierarchical mixture models in which the uncertainty about the mixing measure is modeled using the Dirichlet process. The main goal is to build more flexible models for density estimation. Our extensions of the Dirichlet process normal mixture model previously introduced in the literature are twofold. First, Dirichlet mixtures of skew-normal distributions are considered; that is, in the first stage of the hierarchical model, the normal distribution is replaced by the skew-normal one. Second, we assume a skew-normal distribution as the centre measure in the Dirichlet mixture of normal distributions. Some important results related to Bayesian inference in the location-scale skew-normal family are introduced. In particular, we obtain stochastic representations for the full conditional distributions of the location and skewness parameters. The algorithm introduced by MacEachern and Müller [MacEachern, S. N., and Müller, P. (1998). Estimating mixture of Dirichlet process models. J. Comput. Graph. Statist. 7(2):223–238] is used to sample from the posterior distributions. The models are compared on simulated data sets. Finally, the well-known Old Faithful Geyser data set is analyzed using the proposed models and the Dirichlet mixture of normal distributions. The model based on the Dirichlet mixture of skew-normal distributions captures the bimodality and skewness shown in the empirical distribution.
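The full DP-mixture sampler is lengthy, but its skew-normal building block is simple. A sketch of the Azzalini skew-normal density and the standard stochastic representation used for sampling (function names ours):

```python
import numpy as np
from scipy.stats import norm

def skew_normal_pdf(x, loc=0.0, scale=1.0, alpha=0.0):
    """Azzalini skew-normal density: (2/w) phi(z) Phi(alpha*z),
    with z = (x - loc)/scale; alpha = 0 recovers the normal."""
    z = (x - loc) / scale
    return 2.0 / scale * norm.pdf(z) * norm.cdf(alpha * z)

def skew_normal_rvs(n, loc=0.0, scale=1.0, alpha=0.0, rng=None):
    """Sampling via the stochastic representation
    Z = delta*|U| + sqrt(1 - delta^2)*V, U, V iid N(0,1),
    where delta = alpha / sqrt(1 + alpha^2)."""
    rng = np.random.default_rng(rng)
    delta = alpha / np.sqrt(1.0 + alpha**2)
    u, v = rng.standard_normal(n), rng.standard_normal(n)
    z = delta * np.abs(u) + np.sqrt(1.0 - delta**2) * v
    return loc + scale * z
```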

7.
8.
A new modeling approach called 'recursive segmentation' is proposed to support the supervised exploration and identification of subgroups or clusters. It is based on the frameworks of recursive partitioning and the Patient Rule Induction Method (PRIM). By combining these methods, recursive segmentation aims to exploit their respective strengths while reducing their weaknesses. Consequently, recursive segmentation can be applied in a very general way, that is, in any (multivariate) regression, classification or survival (time-to-event) problem, using conditional inference, evolutionary learning or the CART algorithm, with predictor variables of any scale and with missing values. Furthermore, results of a synthetic example and a benchmark study comprising 26 data sets suggest that recursive segmentation achieves competitive prediction accuracy and provides more accurate definitions of subgroups with less complex models than recursive partitioning and PRIM. An application to the German Breast Cancer Study Group data demonstrates the improved interpretability and reliability of results produced by the new approach. The method is made publicly available through the R package rseg (http://rseg.r-forge.r-project.org/).
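As a sketch of the PRIM half of the method, the peeling pass below repeatedly trims a small fraction of observations from one box edge so as to maximize the mean response inside the box. It is a simplified illustration, not the rseg implementation:

```python
import numpy as np

def prim_peel(X, y, alpha=0.05, min_support=0.1):
    """One PRIM-style peeling pass: trim an alpha-fraction from the
    box edge whose removal maximizes the mean of y inside the box,
    until the support falls below min_support or no peel helps."""
    box = np.column_stack([X.min(axis=0), X.max(axis=0)])
    inside = np.ones(len(y), dtype=bool)
    while inside.mean() > min_support:
        best_mean, best = -np.inf, None
        for j in range(X.shape[1]):
            xj = X[inside, j]
            for side, cut in ((0, np.quantile(xj, alpha)),
                              (1, np.quantile(xj, 1 - alpha))):
                keep = (X[:, j] >= cut) if side == 0 else (X[:, j] <= cut)
                trial = inside & keep
                if trial.sum() > 0 and y[trial].mean() > best_mean:
                    best_mean, best = y[trial].mean(), (j, side, cut, trial)
        if best is None or best_mean <= y[inside].mean():
            break  # no peel improves the box mean
        j, side, cut, inside = best
        box[j, side] = cut
    return box, inside
```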

9.
Multivariate skew-normal (SN) distributions [Azzalini, A., and Dalla Valle, A. (1996). The multivariate skew-normal distribution. Biometrika 83:715–726] enjoy some of the useful properties of normal distributions and have nonlinear heteroscedastic predictors, but lack the closure property of normal distributions (the sum of independent SN random variables is not SN). Recently, there has been a proliferation of classes of SN distributions with certain closure properties, one of the most promising being the closed skew-normal (CSN) distributions of González-Farías et al. [González-Farías, G., Dominguez-Molina, J. A., and Gupta, A. K. (2004). Additive properties of skew-normal random vectors. J. Statist. Plann. Infer. 126:521–534]. We study the construction of stationary SN ARMA models for colored SN noise and show that their finite-dimensional distributions are skew-normal but seldom strictly stationary, and that their covariance functions differ from their normal ARMA counterparts in that they do not converge to zero for large lags. The situation is better for ARMA models driven by CSN noise, but at the additional cost of considerable computational complexity and a less explicit skewness parameter. In view of these results, the widespread use of such classes of SN distributions in the framework of ARMA models seems doubtful.

10.
Abstract

When the mixed chart proposed by Aslam et al. [Aslam, M., Azam, M., Khan, N., and Jun, C.-H. (2015). A mixed control chart to monitor the process. International Journal of Production Research 53(15):4684–93. doi:10.1080/00207543.2015.1031354] is in use, the sample items are classified as defective or non-defective and, depending on the number of defectives, the quality characteristic X of the sample items is also measured. In this case, an Xbar chart decides the state of the process. The preceding conforming/non-conforming classification truncates the distribution of X and, because of that, the mathematical development to obtain the ARLs is complex. Aslam et al. (2015) did not take into account the fact that the distribution of X is truncated and, as a result, obtained incorrect ARLs.
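To illustrate the point numerically: the ARL of a chart is 1/p, where p is the per-sample signal probability, and replacing the normal distribution by a truncated normal changes p and hence the ARL. A hypothetical Monte Carlo sketch (limits set for in-control N(0,1) data; all names and defaults ours):

```python
import numpy as np
from scipy.stats import truncnorm

def arl_xbar(mu, sigma, n, L=3.0, trunc=None, n_sim=200_000):
    """ARL of an Xbar chart as 1/p, with p estimated by simulation.
    With trunc=(a, b), X is drawn from a normal truncated to [a, b],
    mimicking the effect of a prior conforming/non-conforming screen."""
    lcl, ucl = -L / np.sqrt(n), L / np.sqrt(n)  # limits for N(0,1) data
    rng = np.random.default_rng(0)
    if trunc is None:
        xbar = rng.normal(mu, sigma, size=(n_sim, n)).mean(axis=1)
    else:
        a, b = [(t - mu) / sigma for t in trunc]  # standardized bounds
        xbar = truncnorm.rvs(a, b, loc=mu, scale=sigma,
                             size=(n_sim, n), random_state=rng).mean(axis=1)
    p = np.mean((xbar < lcl) | (xbar > ucl))
    return np.inf if p == 0 else 1.0 / p
```

Comparing arl_xbar(0, 1, 5) with arl_xbar(0, 1, 5, trunc=(-2, 2)) shows how ignoring the truncation misstates the ARL.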

11.
Abstract

In this paper, we discuss how to model the mean and covariance structures in linear mixed models (LMMs) simultaneously. We propose a data-driven method to model the covariance structures of the random effects and random errors in LMMs. Parameter estimation for the mean and covariances is carried out using the EM algorithm, and standard errors of the parameter estimates are calculated through Louis' information principle [Louis, T. A. (1982). Finding observed information using the EM algorithm. J. R. Stat. Soc. B 44:98–130]. Kenward's cattle data sets [Kenward, M. G. (1987). A method for comparing profiles of repeated measurements. Appl. Stat. 36:296–308] are analyzed for illustration, and comparisons with existing work are made through simulation studies. Our numerical analysis confirms the superiority of the proposed method over existing approaches in terms of the Akaike information criterion.
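Louis' identity assembles the observed information from complete-data scores inside EM. As a generic stand-in showing the end product, the observed information can also be formed numerically from the observed-data log-likelihood at the EM solution; a sketch with illustrative names (loglik is a user-supplied callable):

```python
import numpy as np

def em_standard_errors(loglik, theta_hat, eps=1e-5):
    """Standard errors from the observed information at the EM fixed
    point, via a central-difference Hessian of the observed-data
    log-likelihood (a numerical stand-in for Louis' identity)."""
    k = len(theta_hat)
    H = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            t = np.array(theta_hat, dtype=float)
            tpp = t.copy(); tpp[i] += eps; tpp[j] += eps
            tpm = t.copy(); tpm[i] += eps; tpm[j] -= eps
            tmp = t.copy(); tmp[i] -= eps; tmp[j] += eps
            tmm = t.copy(); tmm[i] -= eps; tmm[j] -= eps
            H[i, j] = (loglik(tpp) - loglik(tpm)
                       - loglik(tmp) + loglik(tmm)) / (4 * eps**2)
    observed_info = -H
    return np.sqrt(np.diag(np.linalg.inv(observed_info)))
```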

12.
ABSTRACT

This article considers three practical hypotheses involving the equicorrelation matrix for grouped normal data. We obtain statistics and computing formulae for common test procedures such as the score test and the likelihood ratio test. In addition, statistics and computing formulae are obtained for various small-sample procedures proposed in Skovgaard [Skovgaard, I. M. (2001). Likelihood asymptotics. Scand. J. Statist. 28:3–32]. The properties of the tests for each of the three hypotheses are compared using Monte Carlo simulations.

13.
This article studies some properties of a mixture periodically correlated n-variate vector autoregressive (MPVAR) time-series model, which extends the mixture time-invariant parameter n-variate autoregressive (MVAR) model recently studied by Fong et al. [Fong, P. W., Li, W. K., Yau, C. W., and Wong, C. S. (2007). On a mixture vector autoregressive model. The Canadian Journal of Statistics 35:135–150]. Our main contributions are, on the one hand, obtaining the second-moment periodically stationary condition for the n-variate MPVARS(n; K; 2, …, 2) model, together with the closed form of the second moment, and, on the other hand, estimating the coefficient matrices and the error variance matrix via the Expectation-Maximization (EM) algorithm.

14.
15.
16.
Over the last 25 years, increasing attention has been given to the problem of analysing data arising from circular distributions. The most important circular distribution was introduced by von Mises (1918) and takes the form

f(θ) = [2π I₀(κ)]⁻¹ exp{κ cos(θ − μ₀)},  0 ≤ θ < 2π,

where I₀(κ) is the modified Bessel function of the first kind and order zero, μ₀ is the mean direction and κ is the concentration parameter of the distribution. Watson & Williams (1956) laid the foundation of analysis-of-variance-type techniques for the two-dimensional case of circular data using the von Mises distribution. Stephens (1962a,b, 1969, 1972), Upton (1974) and Stephens (1982) made further improvements to Watson & Williams' work. In this paper the authors discuss the pitfalls of the methods adopted by Stephens (1982) and present a unified analysis-of-variance-type approach for circular data.
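For reference, the density is a one-liner given the Bessel function I₀ (a sketch using scipy's i0):

```python
import numpy as np
from scipy.special import i0

def von_mises_pdf(theta, mu0, kappa):
    """Von Mises density on [0, 2*pi): exp(kappa*cos(theta - mu0))
    normalized by 2*pi*I0(kappa), with I0 the modified Bessel
    function of the first kind and order zero."""
    return np.exp(kappa * np.cos(theta - mu0)) / (2 * np.pi * i0(kappa))
```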


17.
We deal with sampling by variables with two-way protection in the case of an N(μ, σ²)-distributed characteristic with unknown σ. The LR sampling plan proposed by Lieberman and Resnikoff (JASA 50:457–516, 1955) and the BSK sampling plan proposed by Bruhn-Suhr and Krumbholz (Stat. Papers 31:195–207, 1990) are based on the UMVU and the plug-in estimator, respectively. For given p₁ (AQL), p₂ (RQL) and α, β (type I and II errors), we present an algorithm to determine the optimal LR and BSK plans, i.e. those having minimal sample size among all plans satisfying the corresponding two-point condition on the OC. An R package, ExLiebeRes (Krumbholz and Steuer, ExLiebeRes: calculating exact LR- and BSK-plans, R package version 0.9.9, http://exlieberes.r-forge.r-project.org, 2012), implementing the algorithm is provided to the public.
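The two-point condition requires OC(p₁) ≥ 1 − α and OC(p₂) ≤ β, and the algorithm searches for the smallest n (with a matching acceptance constant k) meeting both. A generic Monte Carlo sketch of the OC of a one-sided variables plan with unknown σ, not the exact LR/BSK computation in ExLiebeRes (names and conventions ours):

```python
import numpy as np
from scipy.stats import norm

def oc_curve(p, n, k, n_sim=100_000, rng=None):
    """Monte Carlo operating characteristic of a one-sided variables
    plan: accept the lot when xbar + k*s <= U. Without loss of
    generality sigma = 1 and the upper spec U = 0, so the fraction
    defective p fixes the lot mean at -z_{1-p}."""
    rng = np.random.default_rng(rng)
    mu = -norm.ppf(1 - p)              # mean that yields P(X > U) = p
    x = rng.normal(mu, 1.0, size=(n_sim, n))
    xbar = x.mean(axis=1)
    s = x.std(axis=1, ddof=1)
    return np.mean(xbar + k * s <= 0.0)   # acceptance probability
```

A plan (n, k) is then admissible when oc_curve(p1, n, k) >= 1 - alpha and oc_curve(p2, n, k) <= beta.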

18.
ABSTRACT

Stratification of distribution functions is an important issue in the study of income distributions. Two distribution functions form a perfect stratification if they occupy disjoint ranges on the horizontal axis; otherwise, there is overlapping. A measure quantifying the amount of stratification was introduced by Yitzhaki [Yitzhaki, S. (1994). Economic distance and overlapping of distributions. J. Econometrics 61:147–159], but no procedure for drawing inference was suggested. We develop a consistent estimator of the degree of overlapping and offer a nonparametric procedure for inference. Its limiting distribution, properly standardized, is normal. The asymptotic variance can be estimated using the jackknife method, and simulations show that the suggested procedure works well for sample sizes of 50 (100 in some cases).
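Yitzhaki's overlapping index has its own formula; as a generic sketch of the variance-estimation step, here is a delete-one jackknife standard error for an arbitrary two-sample statistic (the overlap proxy in the final comment is a hypothetical placeholder, not Yitzhaki's measure):

```python
import numpy as np

def jackknife_se_two_sample(stat, x, y):
    """Delete-one jackknife standard error for a two-sample statistic
    stat(x, y); the two independent samples contribute separate
    variance terms."""
    rx = np.array([stat(np.delete(x, i), y) for i in range(len(x))])
    ry = np.array([stat(x, np.delete(y, j)) for j in range(len(y))])
    vx = (len(x) - 1) / len(x) * np.sum((rx - rx.mean())**2)
    vy = (len(y) - 1) / len(y) * np.sum((ry - ry.mean())**2)
    return np.sqrt(vx + vy)

# Example with a crude overlap proxy (hypothetical):
# overlap = lambda x, y: np.mean((x >= y.min()) & (x <= y.max()))
# se = jackknife_se_two_sample(overlap, x, y)
```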

19.
The article investigates diagnostic procedures for finite mixture models. The problem is to decide whether given data stem from an exponential distribution or from a finite mixture of such distributions. Recently, three new test approaches have been proposed: the modified likelihood ratio test (MLRT) of Chen et al. [Chen, H., Chen, J., and Kalbfleisch, J. D. (2001). A modified likelihood ratio test for homogeneity in finite mixture models. Journal of the Royal Statistical Society B 63:19–29], the ADDS test of Mosler and Seidel [Mosler, K., and Seidel, W. (2001). Testing for homogeneity in an exponential mixture model. Australian and New Zealand Journal of Statistics 43:231–247], and the D-test of Charnigo and Sun [Charnigo, R., and Sun, J. (2004). Testing homogeneity in a mixture distribution via the L2 distance between competing models. Journal of the American Statistical Association 99:488–498]. The size and power of these tests are determined by Monte Carlo simulation and their relative merits are evaluated. We conclude that the ADDS test never has much less power than its competitors and, under some alternatives, in particular low contaminations, considerably more. New tables for the ADDS test are also provided.
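The Monte Carlo evaluation itself is generic: simulate under the null to estimate size, under an alternative to estimate power. A sketch with placeholder arguments (test_stat, gen_null and gen_alt are hypothetical callables, not the MLRT/ADDS/D-test implementations):

```python
import numpy as np

def mc_size_power(test_stat, crit, gen_null, gen_alt,
                  n_rep=5000, rng=None):
    """Monte Carlo size and power of a test rejecting when
    test_stat(sample) > crit; gen_null/gen_alt draw one sample each."""
    rng = np.random.default_rng(rng)
    size = np.mean([test_stat(gen_null(rng)) > crit
                    for _ in range(n_rep)])
    power = np.mean([test_stat(gen_alt(rng)) > crit
                     for _ in range(n_rep)])
    return size, power
```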

20.
Consider a skewed population. Suppose an intelligent guess can be made about an interval that contains the population mean. Within such an interval there may exist biased estimators with smaller mean squared error than the arithmetic mean. This article indicates when it is advisable to shrink the arithmetic mean towards a guessed interval using root estimators. The goal is to obtain an estimator that is better near the average of natural origins. An estimator is proposed that contains the ordinary shrinkage estimator of Thompson [Thompson, J. R. (1968). Accuracy borrowing in the estimation of the mean by shrinkage towards an interval. J. Amer. Statist. Assoc. 63:953–963], the square-root estimator of Jenkins et al. [Jenkins, O. C., Ringer, L. J., and Hartley, H. O. (1973). Root estimators. J. Amer. Statist. Assoc. 68:414–419], and the arithmetic sample mean as special cases. The bias and mean squared error of the proposed, more general estimator are compared with those of the three special cases. Shrinkage coefficients that yield minimum mean squared error estimators are obtained. The proposed estimator is considerably more efficient than the three special cases, and this remains true for highly skewed populations. The merits of the proposed shrinkage square-root estimator are supported by the results of numerical and simulation studies.
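A toy Monte Carlo comparison of the sample mean with a simple shrinkage estimator pulled towards the midpoint of a guessed interval illustrates the bias-variance trade-off; the paper's root estimators generalize the shrinkage weight. All names and constants below are illustrative:

```python
import numpy as np

def mse_comparison(mu=5.0, guess=(4.0, 6.0), c=0.5, n=20,
                   n_sim=200_000):
    """Monte Carlo MSE of the sample mean versus a simple shrinkage
    estimator t = xbar + c*(mid - xbar), which pulls xbar towards
    the midpoint of the guessed interval. A toy version only."""
    rng = np.random.default_rng(1)
    mid = 0.5 * (guess[0] + guess[1])
    x = rng.exponential(mu, size=(n_sim, n))   # a skewed population
    xbar = x.mean(axis=1)
    shrunk = xbar + c * (mid - xbar)
    return np.mean((xbar - mu)**2), np.mean((shrunk - mu)**2)
```

When the guessed interval actually covers mu, the shrunken estimator's MSE is markedly smaller; the advantage erodes as the guess moves away from the truth.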
