Similar articles
 Found 20 similar articles; search took 730 ms
1.
Jae Keun Yoo 《Statistics》2016,50(5):1086-1099
The purpose of this paper is to define the central informative predictor subspace, which contains the central subspace, and to develop methods for estimating it. A potential advantage of the proposed methods is that their methodological development requires no linearity, constant variance, or coverage conditions. The central informative predictor subspace therefore allows the central subspace to be recovered exhaustively even when these conditions fail. Numerical studies confirm the theory, and real data analyses are presented.

2.
In this article, we propose a new method for sufficient dimension reduction when both the response and the predictor are vectors. The new method, which uses distance covariance, keeps the model-free advantage and can fully recover the central subspace even when many predictors are discrete. We then extend this method to the dual central subspace, including a special case of canonical correlation analysis. We illustrate the estimators through extensive simulations and real datasets and compare them with some existing methods, showing that our estimators are competitive and robust.
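The abstract above relies on distance covariance as its dependence measure. As a rough illustration only, the sketch below computes the squared sample distance covariance in its standard V-statistic form (double-centered pairwise distance matrices, averaged elementwise product). The function names are hypothetical, and this is not the paper's dimension-reduction estimator, only the underlying statistic.

```python
import math


def _centered_distances(points):
    """Double-centered Euclidean distance matrix for a list of vectors."""
    n = len(points)
    d = [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in d]
    grand_mean = sum(row_mean) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [d[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_covariance_sq(x, y):
    """Squared sample distance covariance (V-statistic form).

    x and y are equal-length lists of tuples (one tuple per observation).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of observations")
    a = _centered_distances(x)
    b = _centered_distances(y)
    n = len(x)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)
```

Observations are passed as tuples so that vector-valued responses and predictors are handled uniformly, for example `distance_covariance_sq([(0.0,), (1.0,)], [(2.0, 1.0), (0.5, 3.0)])`.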

3.
Variable selection is a very important tool when dealing with high dimensional data. However, most popular variable selection methods are model based, which might provide misleading results when the model assumption is not satisfied. Sufficient dimension reduction provides a general framework for model-free variable selection methods. In this paper, we propose a model-free variable selection method via sufficient dimension reduction, which incorporates the grouping information into the selection procedure for multi-population data. Theoretical properties of our selection methods are also discussed. Simulation studies suggest that our method greatly outperforms those ignoring the grouping information.

4.
Sliced average variance estimation (SAVE) is a method for constructing sufficient summary plots in regressions with many predictors. The summary plots are designed to capture all the information about the response that is available from the predictors, and do not require a model for their construction. They can be particularly helpful for guiding the choice of a first model. Methodological aspects of SAVE are studied in this article.

5.
6.
Accurate diagnosis of disease is a critical part of health care. New diagnostic and screening tests must be evaluated based on their abilities to discriminate diseased conditions from non-diseased conditions. For a continuous-scale diagnostic test, a popular summary index of the receiver operating characteristic (ROC) curve is the area under the curve (AUC). However, when our focus is on a certain region of false positive rates, we often use the partial AUC instead. In this paper we have derived the asymptotic normal distribution for the non-parametric estimator of the partial AUC with an explicit variance formula. The empirical likelihood (EL) ratio for the partial AUC is defined and it is shown that its limiting distribution is a scaled chi-square distribution. Hybrid bootstrap and EL confidence intervals for the partial AUC are proposed by using the newly developed EL theory. We also conduct extensive simulation studies to compare the relative performance of the proposed intervals and existing intervals for the partial AUC. A real example is used to illustrate the application of the recommended intervals. The Canadian Journal of Statistics 39: 17–33; 2011 © 2011 Statistical Society of Canada
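To make the partial AUC concrete, here is a minimal sketch of one common non-parametric estimator over the false-positive-rate range from 0 to `max_fpr`. The function name, the convention of using the `int(max_fpr * n0)` largest non-diseased scores, and the half-weight for ties are all assumptions made for illustration; the paper's exact estimator and its variance formula are not reproduced here.

```python
def partial_auc(nondiseased, diseased, max_fpr):
    """Empirical partial AUC for false positive rates in [0, max_fpr].

    Only the largest int(max_fpr * n0) non-diseased scores contribute,
    because those are the ones that generate false positives at the
    thresholds inside the FPR range of interest (assumed convention).
    """
    if not 0.0 <= max_fpr <= 1.0:
        raise ValueError("max_fpr must lie in [0, 1]")
    n0, n1 = len(nondiseased), len(diseased)
    k = int(max_fpr * n0)
    top_nondiseased = sorted(nondiseased, reverse=True)[:k]
    total = 0.0
    for x in top_nondiseased:
        for y in diseased:
            if y > x:
                total += 1.0   # diseased score correctly ranked higher
            elif y == x:
                total += 0.5   # ties count one half (assumed convention)
    return total / (n0 * n1)
```

For a test that separates the two groups perfectly, this estimate equals `max_fpr`, the largest value the partial AUC can take on that range.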

7.
This paper deals with the nonparametric estimation of the mean and variance functions of univariate time series data. We propose a nonparametric dimension reduction technique for both mean and variance functions of time series. This method does not require any model specification and instead we seek directions in both the mean and variance functions such that the conditional distribution of the current observation given the vector of past observations is the same as that of the current observation given a few linear combinations of the past observations without loss of inferential information. The directions of the mean and variance functions are estimated by maximizing the Kullback–Leibler distance function. The consistency of the proposed estimators is established. A computational procedure is introduced to detect lags of the conditional mean and variance functions in practice. Numerical examples and simulation studies are performed to illustrate and evaluate the performance of the proposed estimators.

8.
As new diagnostic tests are developed and marketed, it is very important to be able to compare the accuracy of two given continuous-scale diagnostic tests. An effective method to evaluate the difference between the diagnostic accuracy of two tests is to compare partial areas under the receiver operating characteristic curves (AUCs). In this paper, we review existing parametric methods. Then, we propose a new semiparametric method and a new nonparametric method to investigate the difference between two partial AUCs. For the difference between two partial AUCs under each method, we derive a normal approximation, define an empirical log-likelihood ratio, and show that the empirical log-likelihood ratio follows a scaled chi-square distribution. We construct five confidence intervals for the difference based on normal approximation, bootstrap, and empirical likelihood methods. Finally, extensive simulation studies are conducted to compare the finite-sample performances of these intervals, and a real example is used as an application of our recommended intervals. The simulation results indicate that the proposed hybrid bootstrap and empirical likelihood intervals outperform other existing intervals in most cases.

9.
Time series are often affected by interventions such as strikes, earthquakes, or policy changes. In the current paper, we build a practical nonparametric intervention model using the central mean subspace in time series. We estimate the central mean subspace for time series, taking into account known interventions, by using the Nadaraya–Watson kernel estimator. We use the modified Bayesian information criterion to estimate the unknown lag and dimension. Finally, we demonstrate that this nonparametric approach for intervened time series performs well in simulations and in a real data analysis of the monthly average of the oxidant.
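The Nadaraya–Watson kernel estimator named in this abstract is a locally weighted average of the responses. The sketch below shows it for a single scalar predictor with a Gaussian kernel; the function name, the kernel choice, and the scalar setting are assumptions for illustration and do not reproduce the paper's central mean subspace procedure or its handling of interventions.

```python
import math


def nadaraya_watson(x_train, y_train, x0, bandwidth):
    """Nadaraya-Watson estimate of E[Y | X = x0] with a Gaussian kernel."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    # Kernel weight of each training point relative to the query point x0.
    weights = [math.exp(-0.5 * ((x0 - xi) / bandwidth) ** 2) for xi in x_train]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: x0 is too far from the data for this bandwidth.
        raise ValueError("no training point has positive weight at x0")
    return sum(w * yi for w, yi in zip(weights, y_train)) / total
```

In a time series setting, `x_train` would typically hold a lagged value (or a linear combination of lagged values) and `y_train` the current observation; the bandwidth controls how local the average is.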

10.
Summary.  Because highly correlated data arise from many scientific fields, we investigate parameter estimation in a semiparametric regression model with a diverging number of predictors that are highly correlated. For this, we first develop a distribution-weighted least squares estimator that can recover directions in the central subspace, then use the distribution-weighted least squares estimator as a seed vector and project it onto a Krylov space by partial least squares to avoid computing the inverse of the covariance of predictors. Thus, distribution-weighted partial least squares can handle cases with high dimensional and highly correlated predictors. Furthermore, we also suggest an iterative algorithm for obtaining a better initial value before implementing partial least squares. For theoretical investigation, we obtain strong consistency and asymptotic normality when the dimension p of the predictors grows at rates $O\{n^{1/2}/\log(n)\}$ and $o(n^{1/3})$, respectively, where n is the sample size. When there are no other constraints on the covariance of predictors, the rates $n^{1/2}$ and $n^{1/3}$ are optimal. We also propose a Bayesian information criterion type of criterion to estimate the dimension of the Krylov space in the partial least squares procedure. Illustrative examples with a real data set and comprehensive simulations demonstrate that the method is robust to non-ellipticity and works well even in 'small n–large p' problems.

11.
Dimension reduction in regression is an efficient way of overcoming the curse of dimensionality in non-parametric regression. Motivated by recent developments in dimension reduction for time series, this paper carries out an empirical extension of the central mean subspace in time series to a single-input transfer function model. Here, we use the central mean subspace as a dimension reduction tool for bivariate time series when the dimension and lag are known, and estimate the central mean subspace through the Nadaraya–Watson kernel smoother. Furthermore, we develop a data-dependent approach based on a modified Schwarz Bayesian criterion to estimate the unknown dimension and lag. Finally, we show that the approach works well for bivariate time series using an expository demonstration, two simulations, and a real data analysis of El Niño and fish population.

12.
S. S. Wulff 《Statistics》2013,47(1):53-65
In a variance components model for normally distributed data, for a specified vector of linear combinations of the variance components, necessary and sufficient conditions are given under which the vector has a uniformly minimum variance unbiased translation-invariant estimator. The competing class of estimators is not restricted to those that are quadratic. For classification models, the conditions are translated into easy-to-check partial balance requirements on the incidence array.

13.
Abstract. A non-parametric rank-based test of exchangeability for bivariate extreme-value copulas is first proposed. The two key ingredients of the suggested approach are the non-parametric rank-based estimators of the Pickands dependence function recently studied by Genest and Segers, and a multiplier technique for obtaining approximate p-values for the derived statistics. The proposed approach is then extended to left-tail decreasing dependence structures that are not necessarily extreme-value copulas. Large-scale Monte Carlo experiments are used to investigate the level and power of the various versions of the test and show that the proposed procedure can be substantially more powerful than tests of exchangeability derived directly from the empirical copula. The approach is illustrated on well-known financial data.

14.
The process comparing the empirical cumulative distribution function of the sample with a parametric estimate of the cumulative distribution function is known as the empirical process with estimated parameters and has been extensively employed in the literature for goodness-of-fit testing. The simplest way to carry out such goodness-of-fit tests, especially in a multivariate setting, is to use a parametric bootstrap. Although very easy to implement, the parametric bootstrap can become very computationally expensive as the sample size, the number of parameters, or the dimension of the data increases. An alternative resampling technique based on a fast weighted bootstrap is proposed in this paper, and is studied both theoretically and empirically. The outcome of this work is a generic and computationally efficient multiplier goodness-of-fit procedure that can be used as a large-sample alternative to the parametric bootstrap. In order to approximately determine how large the sample size needs to be for the parametric and weighted bootstraps to have roughly equivalent powers, extensive Monte Carlo experiments are carried out in dimensions one, two and three, and for models containing up to nine parameters. The computational gains resulting from the use of the proposed multiplier goodness-of-fit procedure are illustrated on trivariate financial data. A by-product of this work is a fast large-sample goodness-of-fit procedure for the bivariate and trivariate t distribution whose degrees of freedom are fixed. The Canadian Journal of Statistics 40: 480–500; 2012 © 2012 Statistical Society of Canada
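The parametric bootstrap that this abstract uses as its baseline can be sketched in a few lines for a one-parameter model. The example below assumes an exponential null model, a Kolmogorov–Smirnov type statistic, and hypothetical function names; it illustrates the baseline procedure only and is not the multiplier (weighted bootstrap) procedure the paper proposes. Note that the parameter is re-estimated on every bootstrap sample, which is where the computational cost comes from.

```python
import math
import random


def ks_statistic_exponential(sample):
    """KS distance between the empirical CDF and the fitted exponential CDF.

    Returns the statistic together with the maximum likelihood rate estimate.
    """
    n = len(sample)
    rate = n / sum(sample)  # MLE of the exponential rate
    distance = 0.0
    for i, x in enumerate(sorted(sample), start=1):
        fitted = 1.0 - math.exp(-rate * x)
        distance = max(distance, i / n - fitted, fitted - (i - 1) / n)
    return distance, rate


def parametric_bootstrap_pvalue(sample, n_boot=200, seed=0):
    """Approximate p-value for H0: the sample is exponentially distributed."""
    rng = random.Random(seed)
    observed, rate = ks_statistic_exponential(sample)
    n = len(sample)
    exceed = 0
    for _ in range(n_boot):
        # Simulate from the fitted model, then refit and recompute the statistic.
        resample = [rng.expovariate(rate) for _ in range(n)]
        boot_stat, _boot_rate = ks_statistic_exponential(resample)
        if boot_stat >= observed:
            exceed += 1
    # The +1 correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_boot + 1)
```

Each bootstrap replicate requires a full simulation and refit, so the cost grows with the sample size, the number of replicates, and the difficulty of fitting the model, which is the expense the abstract refers to.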

15.
Semiparametric maximum likelihood estimation with estimating equations (SMLE) is more flexible than traditional methods; it has fewer restrictions on distributions and regression models. The required information about distribution and regression structures is incorporated in estimating equations of the SMLE to improve the estimation quality of non-parametric methods. The likelihood of SMLE for censored data involves complicated implicit functions without closed-form expressions, and the first derivatives of the log-profile-likelihood cannot be expressed as summations of independent and identically distributed random variables; it is challenging to derive asymptotic properties of the SMLE for censored data. For group-censored data, the paper shows that all the implicit functions are well defined and obtains the asymptotic distributions of the SMLE for model parameters and lifetime distributions. With several examples the paper compares the SMLE, the regular non-parametric likelihood estimation method and the parametric MLEs in terms of their asymptotic efficiencies, and illustrates application of SMLE. Various asymptotic distributions of the likelihood ratio statistics are derived for testing the adequacy of estimating equations and a partial set of parameters equal to some known values.

16.
In the past decades, the number of variables explaining observations in different practical applications has increased gradually. This has led to heavy computational tasks, despite the wide use of provisional variable selection methods in data processing. Therefore, more methodological techniques have appeared to reduce the number of explanatory variables without losing much of the information. In these techniques, two distinct approaches are apparent: 'shrinkage regression' and 'sufficient dimension reduction'. Surprisingly, there has not been any communication or comparison between these two methodological categories, and it is not clear when each of these two approaches is appropriate. In this paper, we fill some of this gap by first reviewing each category in brief, paying special attention to the most commonly used methods in each category. We then compare commonly used methods from both categories based on their accuracy, computation time, and their ability to select effective variables. A simulation study on the performance of the methods in each category is conducted as well. The selected methods are concurrently tested on two sets of real data, which allows us to recommend conditions under which one approach is more appropriate for high-dimensional data.

17.
Length-biased sampling data are often encountered in the studies of economics, industrial reliability, epidemiology, genetics and cancer screening. The complication of this type of data is due to the fact that the observed lifetimes suffer from left truncation and right censoring, where the left truncation variable has a uniform distribution. In the Cox proportional hazards model, Huang & Qin (Journal of the American Statistical Association, 107, 2012, p. 107) proposed a composite partial likelihood method which not only has the simplicity of the popular partial likelihood estimator, but also can be easily performed by the standard statistical software. The accelerated failure time model has become a useful alternative to the Cox proportional hazards model. In this paper, by using the composite partial likelihood technique, we study this model with length-biased sampling data. The proposed method has a very simple form and is robust when the assumption that the censoring time is independent of the covariate is violated. To ease the difficulty of calculations when solving the non-smooth estimating equation, we use a kernel smoothed estimation method (Heller; Journal of the American Statistical Association, 102, 2007, p. 552). Large sample results and a re-sampling method for the variance estimation are discussed. Some simulation studies are conducted to compare the performance of the proposed method with other existing methods. A real data set is used for illustration.

18.
It is the main purpose of this paper to study the asymptotics of certain variants of the empirical process in the context of survey data. Precisely, Functional Central Limit Theorems are established under usual conditions when the sample is drawn from a Poisson or a rejective sampling design. The framework we develop encompasses sampling designs with non-uniform first order inclusion probabilities, which can be chosen so as to optimize estimation accuracy. Applications to Hadamard differentiable functionals are considered.

19.
Elimination of a nuisance variable is often non-trivial and may involve the evaluation of an intractable integral. One approach to evaluate these integrals is to use the Laplace approximation. This paper concentrates on a new approximation, called the partial Laplace approximation, that is useful when the integrand can be partitioned into two multiplicative disjoint functions. The technique is applied to the linear mixed model and shows that the approximate likelihood obtained can be partitioned to provide a conditional likelihood for the location parameters and a marginal likelihood for the scale parameters equivalent to restricted maximum likelihood (REML). Similarly, the partial Laplace approximation is applied to the t-distribution to obtain an approximate REML for the scale parameter. A simulation study reveals that, in comparison to maximum likelihood, the scale parameter estimates of the t-distribution obtained from the approximate REML show reduced bias.
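For orientation, the ordinary (not partial) Laplace approximation mentioned in this abstract replaces the integral of exp(h(x)) by exp(h(x̂)) · sqrt(2π / (−h''(x̂))), where x̂ maximizes h. The sketch below implements that one-dimensional formula with a Newton search for the mode; the function name and the requirement that the caller supply the first and second derivatives are assumptions made for illustration, and the paper's partial Laplace approximation is not reproduced.

```python
import math


def laplace_approximation(h, dh, d2h, x_start, tol=1e-10, max_iter=100):
    """Approximate the integral of exp(h(x)) over the real line.

    h, dh and d2h are the log-integrand and its first two derivatives.
    The mode is located by Newton's method started at x_start.
    """
    x = x_start
    for _ in range(max_iter):
        step = dh(x) / d2h(x)
        x -= step
        if abs(step) < tol:
            break
    curvature = -d2h(x)
    if curvature <= 0:
        raise ValueError("h is not locally concave at the point found")
    return math.exp(h(x)) * math.sqrt(2.0 * math.pi / curvature)
```

When h is exactly quadratic (a Gaussian integrand) the approximation is exact; for h(x) = a·log(x) − x it reproduces Stirling's formula for the gamma function Γ(a + 1).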

20.
This paper deals with the problem of predicting a real-valued response variable using explanatory variables that contain both a multivariate random variable and a random curve. The proposed functional partial linear single-index model treats the multivariate random variable as the linear part and the random curve as the functional single-index part. To estimate the non-parametric link function, the functional single-index and the parameters in the linear part, a two-stage estimation procedure is proposed. Compared with existing semi-parametric methods, the proposed approach requires no initial estimation and no iteration. Asymptotic properties are established for both the parameters in the linear part and the functional single-index. The convergence rate for the non-parametric link function is also given. In addition, asymptotic normality of the error variance estimator is obtained, which facilitates the construction of confidence regions and hypothesis testing for the unknown parameter. Numerical experiments including simulation studies and a real-data analysis are conducted to evaluate the empirical performance of the proposed method.
