Similar Articles
20 similar articles found
1.
Abstract.  Prediction error is critical for assessing model fit and evaluating model prediction. We propose cross-validation (CV) and approximated CV methods for estimating prediction error under the Bregman divergence (BD), which embeds nearly all of the loss functions commonly used in the regression, classification, and machine learning literature. The approximated CV formulas are derived analytically, which allows fast estimation of prediction error under BD. We then study a data-driven optimal bandwidth selector for local-likelihood estimation that minimizes the overall prediction error or, equivalently, the covariance penalty. It is shown that the covariance-penalty and CV methods converge to the same mean prediction error criterion. We also propose a lower-bound scheme for computing the local logistic regression estimates and demonstrate that the algorithm monotonically increases the target local likelihood and converges. The ideas and methods are extended to generalized varying-coefficient models and additive models.
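The paper's closed-form CV approximations are not reproduced here; the sketch below only illustrates, under simple assumptions, what "cross-validated prediction error under a Bregman divergence" means in practice, using the Bernoulli deviance and squared error as two members of the BD family and an ordinary logistic regression fit as the prediction rule. The data-generating model and fold count are illustrative choices.

```python
# Illustrative sketch (not the authors' approximated CV formulas): K-fold
# cross-validated prediction error under two Bregman divergences, with a
# logistic regression fit used as the prediction rule.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

def bernoulli_deviance(y, p, eps=1e-12):
    """Bregman divergence generated by the Bernoulli log-likelihood."""
    p = np.clip(p, eps, 1 - eps)
    return -2 * (y * np.log(p) + (1 - y) * np.log(1 - p))

def squared_error(y, p):
    """Bregman divergence generated by q(mu) = mu^2."""
    return (y - p) ** 2

def cv_prediction_error(X, y, loss, n_splits=5, seed=0):
    errs = []
    for tr, te in KFold(n_splits, shuffle=True, random_state=seed).split(X):
        model = LogisticRegression(max_iter=1000).fit(X[tr], y[tr])
        p = model.predict_proba(X[te])[:, 1]
        errs.append(loss(y[te], p).mean())
    return np.mean(errs)

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0] + 0.5 * X[:, 1])))
print("CV deviance:      ", cv_prediction_error(X, y, bernoulli_deviance))
print("CV squared error: ", cv_prediction_error(X, y, squared_error))
```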

2.
Two nonparametric classification rules for univariate populations are proposed: one in which the probability of correct classification is a specified number, and one in which the probability of correct classification has to be evaluated. In each case the classification is based on the Chernoff and Savage (1958) class of statistics, with possible specialization to populations differing by a location shift or a change of scale. An optimum property, namely consistency of the classification procedure, is established for the second rule when the distributions are either fixed or “near” in the Pitman sense and tend to a common distribution at a specified rate. A measure of asymptotic efficiency is defined for the second rule, and its asymptotic efficiency based on the Chernoff-Savage class of statistics relative to the parametric competitors in the case of location shifts and scale changes is shown to equal the analogous Pitman efficiency.

3.
We consider the problem of choosing the ridge parameter. Two penalized maximum likelihood (PML) criteria, based on a distribution-free and a data-dependent penalty function respectively, are proposed. These PML criteria can be regarded as “continuous” versions of AIC. A systematic simulation is conducted to compare the suggested criteria with several existing methods. The simulation results strongly support the use of our method. The method is also applied to two real data sets.
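As a minimal, generic illustration (not the paper's PML criteria, whose penalty functions are specific to it), the sketch below selects a ridge parameter in linear regression by minimizing an AIC-type score in which the effective degrees of freedom are taken as the trace of the ridge hat matrix. The grid and data are illustrative.

```python
# A minimal sketch of a "continuous AIC"-type criterion for the ridge
# parameter: penalized fit plus an effective-degrees-of-freedom term
# tr(H(lambda)). The authors' PML penalty functions are not reproduced.
import numpy as np

def ridge_aic(X, y, lam):
    n, p = X.shape
    XtX = X.T @ X
    H = X @ np.linalg.solve(XtX + lam * np.eye(p), X.T)   # ridge hat matrix
    resid = y - H @ y
    df = np.trace(H)                                      # effective degrees of freedom
    sigma2 = resid @ resid / n
    return n * np.log(sigma2) + 2 * df                    # AIC-style score

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
beta = np.r_[2.0, -1.5, np.zeros(8)]
y = X @ beta + rng.normal(size=100)

grid = np.logspace(-3, 3, 25)
best = min(grid, key=lambda lam: ridge_aic(X, y, lam))
print("selected ridge parameter:", best)
```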

4.
With growing interest in using non-representative samples to train prediction models for numerous outcomes, it is necessary to account for the sampling design that gives rise to the data in order to assess the generalized predictive utility of a proposed prediction rule. After learning a prediction rule from a non-uniform sample, it is of interest to estimate the rule's error rate when applied to unobserved members of the population. Efron (1986) proposed a general class of covariance-penalty-inflated prediction error estimators that assume the available training data are representative of the target population to which the prediction rule will be applied. We extend Efron's estimator to the complex sample context by incorporating Horvitz–Thompson sampling weights and show that it is consistent for the true generalization error rate in the underlying superpopulation. The resulting Horvitz–Thompson–Efron estimator is equivalent to dAIC, a recent extension of Akaike's information criterion to survey sampling data, but is more widely applicable. The proposed methodology is assessed with simulations and is applied to models predicting renal function obtained from the large-scale National Health and Nutrition Examination Survey. The Canadian Journal of Statistics 48: 204–221; 2020 © 2019 Statistical Society of Canada
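The following is not the paper's estimator; it is a sketch of the underlying idea in the simplest setting in which it can be stated: a weighted least-squares fit with Horvitz-Thompson weights w_i = 1/pi_i, a weighted apparent error, and an Efron-style covariance penalty for squared-error loss under homoskedastic errors. The variance estimate and all data-generating choices are illustrative assumptions.

```python
# Sketch only: Horvitz-Thompson-weighted apparent error plus an Efron-style
# covariance penalty, for squared-error loss in a linear model with known
# inclusion probabilities. The paper's estimator covers general error measures.
import numpy as np

def ht_weighted_prediction_error(X, y, w):
    n, p = X.shape
    W = np.diag(w)
    A = X.T @ W @ X
    beta = np.linalg.solve(A, X.T @ (w * y))            # weighted least-squares fit
    H = X @ np.linalg.solve(A, X.T @ W)                 # weighted hat matrix: y -> y_hat
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - np.trace(H))          # crude residual variance estimate
    apparent = (w * resid**2).sum() / w.sum()           # HT-weighted apparent error
    optimism = 2 * sigma2 * np.trace(W @ H) / w.sum()   # Efron-style covariance penalty
    return apparent + optimism

rng = np.random.default_rng(2)
X = np.c_[np.ones(200), rng.normal(size=(200, 3))]
y = X @ np.array([1.0, 2.0, 0.0, -1.0]) + rng.normal(size=200)
pi = rng.uniform(0.2, 0.9, size=200)                    # known inclusion probabilities
print(ht_weighted_prediction_error(X, y, 1.0 / pi))
```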

5.
Cox’s proportional hazards model is the most common way to analyze survival data. The model can be extended with a ridge penalty in the presence of collinearity, or when a very large number of coefficients (e.g. with microarray data) have to be estimated. To maximize the penalized likelihood, an optimal weight for the ridge penalty has to be obtained, yet there is no definite rule for choosing it. One approach chooses the weight by maximizing the leave-one-out cross-validated partial likelihood; however, this is time consuming and computationally expensive, especially for large datasets. We suggest modelling survival data through a Poisson model, whose log-likelihood is maximized by standard iteratively reweighted least squares. We illustrate this simple approach, which includes smoothing of the hazard function, then add a ridge term to the likelihood and maximize it using tools from generalized linear mixed models. We show that the optimal value of the penalty is found simply by computing the hat matrix of the system of linear equations and dividing its trace by a product of the estimated coefficients.
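The mixed-model rule described above for setting the penalty is not reproduced here. The sketch below only shows, under simplifying assumptions, the building blocks: a ridge-penalized Poisson log-likelihood maximized by iteratively reweighted least squares, with effective degrees of freedom read off as the trace of the hat matrix; the AIC-type score is used purely for illustration.

```python
# Minimal sketch: ridge-penalized Poisson regression fitted by iteratively
# reweighted least squares, with effective degrees of freedom taken as the
# trace of the hat matrix. The authors' mixed-model rule for the penalty is
# not reproduced; an AIC-type score is shown only for illustration.
import numpy as np

def penalized_poisson_irls(X, y, lam, n_iter=50):
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        W = mu                                          # Poisson working weights
        z = eta + (y - mu) / mu                         # working response
        A = X.T @ (W[:, None] * X) + lam * np.eye(p)
        beta = np.linalg.solve(A, X.T @ (W * z))
    H = X @ np.linalg.solve(A, (W[:, None] * X).T)      # hat matrix of the final WLS step
    df = np.trace(H)
    loglik = np.sum(y * (X @ beta) - np.exp(X @ beta))
    return beta, df, -2 * loglik + 2 * df               # AIC-style criterion

rng = np.random.default_rng(3)
X = np.c_[np.ones(150), rng.normal(size=(150, 2))]
y = rng.poisson(np.exp(X @ np.array([0.5, 0.8, -0.4])))
for lam in (0.0, 1.0, 10.0):
    beta, df, aic = penalized_poisson_irls(X, y, lam)
    print(f"lambda={lam:5.1f}  effective df={df:5.2f}  AIC={aic:8.2f}")
```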

6.
Based on the SCAD penalty and the area under the ROC curve (AUC), we propose a new method for selecting and combining biomarkers for disease classification and prediction. The proposed estimator of the biomarker combination has an oracle property: in terms of discriminative power, the estimated combination performs as well as it would have if the biomarkers significantly associated with the outcome had been known in advance. The proposed estimator is computationally feasible, n^{1/2}-consistent and asymptotically normal. Simulation studies show that the proposed method performs better than existing methods. We illustrate the proposed methodology in the acoustic startle response study. The Canadian Journal of Statistics 39: 324–343; 2011 © 2011 Statistical Society of Canada

7.
The least squares estimate of the autoregressive coefficient in the AR(1) model is known to be biased towards zero, especially for parameters close to the stationarity boundary. Several methods for correcting the autoregressive parameter estimate for the bias have been suggested. Using simulations, we study the bias and the mean square error of the least squares estimate and the bias-corrections proposed by Kendall and Quenouille.

We also study the mean square forecast error and the coverage of the 95% prediction interval when using the biased least squares estimate or one of its bias-corrected versions. We find that the estimation bias matters little for point forecasts, but that it affects the coverage of the prediction intervals. Prediction intervals for forecasts more than one step ahead, when calculated with the biased least squares estimate, are too narrow.
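A small simulation in the spirit of the study, assuming Kendall's first-order correction rho_hat + (1 + 3*rho_hat)/n (one of the corrections the abstract refers to); Quenouille's correction and the forecast-interval comparison are not reproduced.

```python
# Simulation sketch: bias of the least-squares AR(1) estimate near the
# stationarity boundary, and Kendall's first-order bias correction
# rho_hat + (1 + 3*rho_hat)/n.
import numpy as np

def simulate_ar1(rho, n, rng):
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = rho * x[t - 1] + rng.normal()
    return x

def ls_ar1(x):
    y, z = x[1:] - x[1:].mean(), x[:-1] - x[:-1].mean()
    return (y @ z) / (z @ z)                 # least-squares slope with intercept

rng = np.random.default_rng(4)
n, rho, reps = 50, 0.9, 5000
ls = np.array([ls_ar1(simulate_ar1(rho, n, rng)) for _ in range(reps)])
kendall = ls + (1 + 3 * ls) / n
print("LS bias:               ", ls.mean() - rho)
print("Kendall-corrected bias:", kendall.mean() - rho)
```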

8.
Nonparametric approaches to classification have gained significant attention in the last two decades. In this paper, we propose a classification methodology based on multivariate rank functions and show that it is a Bayes rule for spherically symmetric distributions with a location shift. We show that the rank-based classifier is equivalent to the optimal Bayes rule under suitable conditions. We also present an affine-invariant version of the classifier. To accommodate different covariance structures, we construct a classifier based on the central rank region. Asymptotic properties of these classification methods are studied. We illustrate the performance of our proposed methods in comparison with some other depth-based classifiers using simulated and real data sets.
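The authors' exact rule (and its affine-invariant and central-rank-region variants) is not reproduced below; the sketch only conveys the flavour of rank-based classification, assigning a point to the class in which its spatial rank vector has the smaller norm, i.e. the class in which it lies most centrally. The two-class Gaussian example is illustrative.

```python
# Illustrative sketch (not the authors' exact rule): classification via the
# multivariate spatial rank R_k(x) = mean over class-k points X_i of
# (x - X_i)/||x - X_i||; x goes to the class with the smaller ||R_k(x)||.
import numpy as np

def spatial_rank(x, sample, eps=1e-12):
    diffs = x - sample
    norms = np.linalg.norm(diffs, axis=1)
    keep = norms > eps                       # skip exact ties with sample points
    return (diffs[keep] / norms[keep, None]).mean(axis=0)

def rank_classify(x, samples):
    scores = [np.linalg.norm(spatial_rank(x, s)) for s in samples]
    return int(np.argmin(scores))            # most central class wins

rng = np.random.default_rng(5)
class0 = rng.normal(loc=0.0, size=(100, 2))
class1 = rng.normal(loc=1.5, size=(100, 2))
test = rng.normal(loc=1.5, size=(50, 2))     # drawn from class 1
preds = [rank_classify(x, [class0, class1]) for x in test]
print("proportion assigned to class 1:", np.mean(np.array(preds) == 1))
```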

9.
Fan Xinyan et al. 《统计研究》(Statistical Research) 2021, 38(2): 99-113
Traditional credit scoring relies mainly on statistical classification methods, which can only predict whether a borrower will default, not when the default will occur. The cure rate model is a mixture of binary classification and survival analysis; it predicts both whether and when default occurs, and therefore provides more information than traditional binary classification. Moreover, with the growth of big data there are ever more data sources, and multiple datasets can be collected for the same or similar tasks. This paper proposes an integrative cure rate model that fuses multi-source data: it models multiple datasets and estimates their parameters jointly, performs two-level (between-group and within-group) variable selection through a composite penalty function, and improves interpretability by encouraging the regression coefficients of the two sub-models to share the same sign. Simulation studies show that the proposed method has clear advantages in both variable selection and parameter estimation. Finally, the method is applied to predicting the time of default for credit loans, where it performs well.

10.
Conventional approaches to inference about efficiency in parametric stochastic frontier (PSF) models are based on percentiles of the estimated distribution of the one-sided error term, conditional on the composite error. When these percentile intervals are used as prediction intervals, coverage is poor when the signal-to-noise ratio is low and improves only slowly as the sample size increases. We show that prediction intervals estimated by bagging yield much better coverage than the conventional approach, even with low signal-to-noise ratios. We also present a bootstrap method that gives confidence interval estimates for (conditional) expectations of efficiency, with good coverage properties that improve with sample size. In addition, researchers who estimate PSF models typically reject models, samples, or both when residuals have skewness in the “wrong” direction, i.e., in a direction that would seem to indicate absence of inefficiency. We show that correctly specified models can generate samples with “wrongly” skewed residuals, even when the variance of the inefficiency process is nonzero. Both our bagging and bootstrap methods provide useful information about inefficiency and model parameters irrespective of whether residuals have skewness in the desired direction.

11.
The support vector machine (SVM) has been successfully applied to various classification areas with great flexibility and a high level of classification accuracy. However, the SVM is not well suited to large or imbalanced datasets because of significant computational problems and a classification bias toward the dominant class. The SVM combined with k-means clustering (KM-SVM) is a fast algorithm developed to accelerate both the training and the prediction of SVM classifiers by using the cluster centers obtained from k-means clustering. In the KM-SVM algorithm, however, the penalty for misclassification is treated equally for each cluster center even though different cluster centers can contribute differently to the classification. To improve classification accuracy, we propose the WKM-SVM algorithm, which imposes different penalties for the misclassification of cluster centers by using the number of data points within each cluster as a weight. As an extension of WKM-SVM, a recovery process based on WKM-SVM is suggested to incorporate the information near the optimal boundary. Furthermore, the proposed WKM-SVM can be successfully applied to imbalanced datasets with an appropriate weighting strategy. Experiments show the effectiveness of our proposed methods.
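A minimal sketch of the weighted-center idea with scikit-learn: k-means is run within each class, an SVM is trained on the cluster centers only, and each center's misclassification penalty is weighted by its cluster size. The recovery step near the boundary and the paper's specific weighting strategy for imbalance are not included; all parameter values are illustrative.

```python
# Sketch of the WKM-SVM idea: train an SVM on per-class k-means centers,
# weighting each center by the number of points in its cluster.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def wkm_svm(X, y, k_per_class=20, C=1.0, random_state=0):
    centers, labels, weights = [], [], []
    for cls in np.unique(y):
        Xc = X[y == cls]
        km = KMeans(n_clusters=min(k_per_class, len(Xc)),
                    random_state=random_state, n_init=10).fit(Xc)
        counts = np.bincount(km.labels_, minlength=km.n_clusters)
        centers.append(km.cluster_centers_)
        labels.append(np.full(km.n_clusters, cls))
        weights.append(counts)
    centers = np.vstack(centers)
    labels = np.concatenate(labels)
    weights = np.concatenate(weights).astype(float)
    clf = SVC(C=C, kernel="rbf")
    clf.fit(centers, labels, sample_weight=weights)   # cluster-size-weighted penalties
    return clf

rng = np.random.default_rng(6)
X = np.vstack([rng.normal(0, 1, (2000, 2)), rng.normal(2, 1, (200, 2))])
y = np.r_[np.zeros(2000), np.ones(200)]               # imbalanced classes
clf = wkm_svm(X, y)
print("training accuracy on full data:", clf.score(X, y))
```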

12.
We consider the challenging problem of testing for any possible association between a response variable and a set of predictors when the dimensionality of the predictors is much greater than the number of observations. In the context of generalized linear models, a new approach is proposed for testing against high-dimensional alternatives. Our method uses soft-thresholding to suppress stochastic noise and applies the independence rule to borrow strength across the predictors. Moreover, the method can provide a ranked predictor list and automatically select “important” features to retain in the test statistic. We compare the performance of this method with some competing approaches on real data and in simulation studies, demonstrating that our method maintains relatively high power against a wide family of alternatives.
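A rough sketch of the general idea only, not the paper's exact test statistic: standardize marginal association statistics for each predictor, soft-threshold them to suppress noise, aggregate as a sum of squares, and calibrate by permuting the response. The fixed threshold and the data-generating model are illustrative assumptions.

```python
# Sketch: soft-thresholded sum of marginal statistics with a permutation
# calibration, in a p >> n regression setting.
import numpy as np

def soft(z, lam):
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def marginal_stats(X, y):
    Xc = (X - X.mean(0)) / X.std(0)
    yc = y - y.mean()
    return Xc.T @ yc / np.sqrt(len(y))        # roughly centred and scaled under the null

def threshold_test(X, y, lam=1.0, n_perm=500, seed=0):
    rng = np.random.default_rng(seed)
    stat = np.sum(soft(marginal_stats(X, y), lam) ** 2)
    null = [np.sum(soft(marginal_stats(X, rng.permutation(y)), lam) ** 2)
            for _ in range(n_perm)]
    pval = (1 + np.sum(np.array(null) >= stat)) / (n_perm + 1)
    return stat, pval

rng = np.random.default_rng(7)
n, p = 80, 500                                # many more predictors than observations
X = rng.normal(size=(n, p))
y = X[:, :3] @ np.array([0.8, -0.8, 0.6]) + rng.normal(size=n)
print("statistic, permutation p-value:", threshold_test(X, y))
```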

13.
Incorporating historical data has great potential to improve the efficiency of phase I clinical trials and to accelerate drug development. For model-based designs, such as the continual reassessment method (CRM), this can be conveniently carried out by specifying a “skeleton,” that is, the prior estimate of the dose-limiting toxicity (DLT) probability at each dose. In contrast, little work has been done to incorporate historical data into model-assisted designs, such as the Bayesian optimal interval (BOIN), Keyboard, and modified toxicity probability interval (mTPI) designs. This has led to the misconception that model-assisted designs cannot incorporate prior information. In this paper, we propose a unified framework for incorporating historical data into model-assisted designs. The proposed approach uses the well-established “skeleton” approach combined with the concept of prior effective sample size, so it is easy to understand and use. More importantly, our approach maintains the hallmark of model-assisted designs, namely simplicity: the dose escalation/de-escalation rule can still be tabulated before the trial is conducted. Extensive simulation studies show that the proposed method can effectively incorporate prior information to improve the operating characteristics of model-assisted designs, similarly to model-based designs.
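One way to render the described idea, shown as a sketch rather than the paper's exact algorithm: a skeleton value p_j together with a prior effective sample size n0 defines a Beta(n0*p_j, n0*(1-p_j)) prior at dose j, which is pooled with the observed DLT counts to give a posterior toxicity estimate that a model-assisted rule could then compare with its interval boundaries. The skeleton, effective sample size, and trial data below are made-up illustrations; the BOIN/Keyboard/mTPI decision boundaries themselves are not reproduced.

```python
# Sketch: turning a skeleton plus a prior effective sample size into
# pseudo-observations pooled with observed DLT data in a beta-binomial model.
import numpy as np

def posterior_dlt(skeleton, prior_ess, n_treated, n_dlt):
    skeleton = np.asarray(skeleton, dtype=float)
    a = prior_ess * skeleton + np.asarray(n_dlt, dtype=float)
    b = prior_ess * (1 - skeleton) + np.asarray(n_treated, dtype=float) - n_dlt
    return a / (a + b)                        # posterior mean DLT probability per dose

skeleton = [0.05, 0.10, 0.20, 0.30, 0.45]     # prior DLT probability at each dose
n_treated = [3, 3, 6, 0, 0]                   # patients treated so far (illustrative)
n_dlt = [0, 0, 2, 0, 0]                       # DLTs observed so far (illustrative)
print(posterior_dlt(skeleton, prior_ess=3, n_treated=n_treated, n_dlt=n_dlt))
```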

14.
The procedure of steepest ascent consists of performing a sequence of sets of trials, each set obtained by proceeding sequentially along the path of maximum increase in response. Until now there has been no formal stopping rule. When response values are subject to random error, the decision to stop can be premature because of a “false” drop in the observed response.

A new stopping rule for steepest ascent is introduced that takes into account the random error variation in response values. The new procedure protects against taking too many observations when the true mean response is decreasing; it also protects against stopping prematurely when the true mean response is increasing. A numerical example illustrating the method is given.
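For background only (the paper's stopping rule is not reproduced), a small sketch of how candidate points along the path of steepest ascent are generated from a fitted first-order model in coded units; the toy design and responses are made up for illustration.

```python
# Background sketch: candidate points along the path of steepest ascent from
# a fitted first-order response-surface model in coded units.
import numpy as np

def steepest_ascent_path(X, y, n_steps=5, step_size=1.0):
    Xd = np.c_[np.ones(len(y)), X]
    b = np.linalg.lstsq(Xd, y, rcond=None)[0][1:]       # first-order coefficients
    direction = b / np.linalg.norm(b)                   # unit direction of maximum increase
    return np.array([k * step_size * direction for k in range(1, n_steps + 1)])

# Toy 2^2 factorial design in coded units with an upward trend in both factors.
X = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1]], dtype=float)
y = np.array([54.0, 60.0, 64.0, 73.0])
print(steepest_ascent_path(X, y, n_steps=3))
```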

15.
In many economic models, theory restricts the shape of functions through conditions such as monotonicity or curvature. This article reviews and presents a framework for constrained estimation and inference to test for shape conditions in parametric models. We show that “regional” shape-restricting estimators have important advantages in terms of model fit and flexibility over standard “local” or “global” shape-restricting estimators. In our empirical illustration, this is the first article to impose and test simultaneously for all shape restrictions required by economic theory in the “Berndt and Wood” data. We find that this dataset is consistent with “duality theory,” whereas previous studies have found violations of economic theory. We discuss policy consequences for key parameters, such as whether energy and capital are complements or substitutes.

16.
We consider a heteroscedastic convolution density model under the “ordinary smooth” assumption. We introduce a new adaptive wavelet estimator based on a term-by-term hard thresholding rule. Its asymptotic properties are explored via the minimax approach under the mean integrated squared error over Besov balls. We prove that our estimator attains near-optimal rates of convergence (lower bounds are also determined). Simulation results are reported to support our theoretical findings.

17.
Many Bayes factors have been proposed for comparing population means in two-sample (independent samples) studies. Recently, Wang and Liu presented an “objective” Bayes factor (BF) as an alternative to a “subjective” one presented by Gönen et al. Their report was evidently intended to show the superiority of their BF based on “undesirable behavior” of the latter. A wonderful aspect of Bayesian models is that they provide an opportunity to “lay all cards on the table.” What distinguishes the various BFs in the two-sample problem is the choice of priors (cards) for the model parameters. This article discusses desiderata of BFs that have been proposed, and proposes a new criterion to compare BFs, whether subjectively or objectively determined. A BF may be preferred if it correctly classifies the data as coming from the correct model most often. The criterion is based on a famous result in classification theory to minimize the total probability of misclassification. This criterion is objective, easily verified by simulation, shows clearly the effects (positive or negative) of assuming particular priors, provides new insights into the appropriateness of BFs in general, and provides a new answer to the question, “Which BF is best?”
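A sketch of the proposed comparison criterion only: a BIC-based approximate Bayes factor for the two-sample mean problem is used as a stand-in (the specific priors of Gönen et al. and of Wang and Liu are not reproduced), data are simulated under each hypothesis, each simulated data set is classified by whether BF_10 exceeds 1, and the total probability of misclassification is reported. Sample size and effect size are illustrative.

```python
# Sketch of comparing Bayes factors by total misclassification probability,
# using a BIC-based approximate BF for the two-sample mean problem as a
# stand-in for any particular prior choice.
import numpy as np

def neg2loglik_normal(resid):
    n = len(resid)
    s2 = np.mean(resid ** 2)
    return n * (np.log(2 * np.pi * s2) + 1)

def approx_bf10(x1, x2):
    y = np.concatenate([x1, x2])
    n = len(y)
    bic0 = neg2loglik_normal(y - y.mean()) + 2 * np.log(n)           # common mean: 2 params
    resid1 = np.concatenate([x1 - x1.mean(), x2 - x2.mean()])
    bic1 = neg2loglik_normal(resid1) + 3 * np.log(n)                 # two means: 3 params
    return np.exp((bic0 - bic1) / 2)

rng = np.random.default_rng(8)
n, delta, reps = 30, 0.8, 2000
errors = 0
for _ in range(reps):
    x1, x2 = rng.normal(size=n), rng.normal(size=n)                  # H0 true
    errors += approx_bf10(x1, x2) > 1
    x1, x2 = rng.normal(size=n), rng.normal(delta, 1.0, size=n)      # H1 true
    errors += approx_bf10(x1, x2) <= 1
print("total misclassification probability:", errors / (2 * reps))
```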

18.
ABSTRACT

Most statistical analyses use hypothesis tests or estimation about parameters to form inferential conclusions. I think this is noble, but misguided. The point of view expressed here is that observables are fundamental, and that the goal of statistical modeling should be to predict future observations given the current data and other relevant information. Further, the prediction of future observables provides multiple advantages to practicing scientists, and to science in general. These include an interpretable numerical summary of a quantity of direct interest to current and future researchers, a calibrated prediction of what’s likely to happen in future experiments, a prediction that can be either “corroborated” or “refuted” through experimentation, and avoidance of inference about parameters, quantities that exist only as convenient indices of hypothetical distributions. Finally, the predictive probability of a future observable can be used as a standard for communicating the reliability of the current work, regardless of whether confirmatory experiments are conducted. Adoption of this paradigm would improve our rigor for scientific accuracy and reproducibility by shifting our focus from “finding differences” among hypothetical parameters to predicting observable events based on our current scientific understanding.

19.
Bien and Tibshirani (Biometrika, 98(4):807–820, 2011) have proposed a covariance graphical lasso method that applies a lasso penalty on the elements of the covariance matrix. This method is definitely useful because it not only produces sparse and positive definite estimates of the covariance matrix but also discovers marginal independence structures by generating exact zeros in the estimated covariance matrix. However, the objective function is not convex, making the optimization challenging. Bien and Tibshirani (Biometrika, 98(4):807–820, 2011) described a majorize-minimize approach to optimize it. We develop a new optimization method based on coordinate descent. We discuss the convergence property of the algorithm. Through simulation experiments, we show that the new algorithm has a number of advantages over the majorize-minimize approach, including its simplicity, computing speed and numerical stability. Finally, we show that the cyclic version of the coordinate descent algorithm is more efficient than the greedy version.

20.
The purpose of this study is to highlight the application of sparse logistic regression models to predicting tumour pathological subtypes from lung cancer patients' genomic information. We consider sparse logistic regression models to deal with the high dimensionality of, and correlation between, genomic regions. In the hierarchical likelihood (HL) method, the random effects are assumed to follow a normal distribution whose variance in turn follows a gamma distribution; this formulation includes the ridge and lasso penalties as special cases. We extend the HL penalty to include a ridge penalty (called ‘HLnet’), following the same principle as the elastic net penalty, which is constructed from the lasso penalty. The results indicate that the HL penalty produces sparser estimates than the lasso penalty with comparable prediction performance, while the HLnet and elastic net penalties have the best prediction performance on real data. We illustrate the methods in a lung cancer study.
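The HL and HLnet penalties themselves are not available in standard libraries, so the sketch below only fits the lasso and elastic-net comparators mentioned above with scikit-learn on a synthetic high-dimensional data set; the data-generating choices and tuning grids are illustrative assumptions.

```python
# Sketch: lasso and elastic-net penalized logistic regression (the comparators
# discussed above) on correlated, high-dimensional synthetic data, with the
# amount of penalization chosen by cross-validation.
import numpy as np
from sklearn.linear_model import LogisticRegressionCV

rng = np.random.default_rng(9)
n, p = 100, 300                                    # p >> n, correlated features
X = rng.normal(size=(n, p)) + 0.5 * rng.normal(size=(n, 1))   # shared noise -> correlation
logit = X[:, :5] @ np.array([1.0, -1.0, 0.8, -0.8, 0.6])
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

lasso = LogisticRegressionCV(penalty="l1", solver="saga", Cs=5,
                             max_iter=5000).fit(X, y)
enet = LogisticRegressionCV(penalty="elasticnet", solver="saga", Cs=5,
                            l1_ratios=[0.3, 0.6, 0.9],
                            max_iter=5000).fit(X, y)
print("nonzero coefficients (lasso):      ", np.sum(lasso.coef_ != 0))
print("nonzero coefficients (elastic net):", np.sum(enet.coef_ != 0))
```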
