Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
Penalized likelihood methods provide a range of practical modelling tools, including spline smoothing, generalized additive models and variants of ridge regression. Selecting the correct weights for penalties is a critical part of using these methods and in the single-penalty case the analyst has several well-founded techniques to choose from. However, many modelling problems suggest a formulation employing multiple penalties, and here general methodology is lacking. A wide family of models with multiple penalties can be fitted to data by iterative solution of the generalized ridge regression problem: minimize $\|W^{1/2}(Xp - y)\|^2 \rho + \sum_{i=1}^{m} \theta_i p' S_i p$ ($p$ is a parameter vector, $X$ a design matrix, $S_i$ a non-negative definite coefficient matrix defining the $i$th penalty with associated smoothing parameter $\theta_i$, $W$ a diagonal weight matrix, $y$ a vector of data or pseudodata and $\rho$ an 'overall' smoothing parameter included for computational efficiency). This paper shows how smoothing parameter selection can be performed efficiently by applying generalized cross-validation to this problem and how this allows non-linear, generalized linear and linear models to be fitted using multiple penalties, substantially increasing the scope of penalized modelling methods. Examples of non-linear modelling, generalized additive modelling and anisotropic smoothing are given.
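
As a rough illustration of the objective in this abstract, the following minimal sketch solves the multiple-penalty ridge problem for fixed smoothing parameters and reports a generalized cross-validation (GCV) score. It is not the paper's efficient algorithm: the overall parameter rho is fixed at 1, the GCV score is the textbook form n * RSS / (n - tr(A))^2, and the function name `fit_penalized` and its arguments are hypothetical.

```python
import numpy as np

def fit_penalized(X, y, penalties, thetas, w=None):
    """Minimize ||W^(1/2)(Xp - y)||^2 + sum_i theta_i p' S_i p for fixed theta_i.

    Assumption: the overall parameter rho is fixed at 1.
    Returns the coefficient vector p and a GCV score.
    """
    n = X.shape[0]
    w = np.ones(n) if w is None else np.asarray(w, dtype=float)
    # Combined penalty matrix sum_i theta_i S_i
    S = sum(t * Si for t, Si in zip(thetas, penalties))
    XtW = X.T * w                                  # X' W for diagonal W
    solve_XtW = np.linalg.solve(XtW @ X + S, XtW)  # (X'WX + S)^(-1) X'W
    p = solve_XtW @ y
    edf = np.trace(X @ solve_XtW)                  # trace of the influence matrix
    rss = np.sum(w * (y - X @ p) ** 2)             # weighted residual sum of squares
    gcv = n * rss / (n - edf) ** 2
    return p, gcv
```

One simple, if inefficient, way to use this is to evaluate `fit_penalized` over a grid of `thetas` values and keep the combination with the smallest GCV score.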

2.
Summary.  Smoothing splines via the penalized least squares method provide versatile and effective nonparametric models for regression with Gaussian responses. The computation of smoothing splines is generally of the order $O(n^3)$, $n$ being the sample size, which severely limits its practical applicability. We study more scalable computation of smoothing spline regression via certain low dimensional approximations that are asymptotically as efficient. A simple algorithm is presented and the Bayes model that is associated with the approximations is derived, with the latter guiding the porting of Bayesian confidence intervals. The practical choice of the dimension of the approximating space is determined through simulation studies, and empirical comparisons of the approximations with the exact solution are presented. Also evaluated is a simple modification of the generalized cross-validation method for smoothing parameter selection, which to a large extent fixes the occasional undersmoothing problem that is suffered by generalized cross-validation.

3.
The penalized spline is a popular method for function estimation when the assumption of “smoothness” is valid. In this paper, methods for estimation and inference are proposed using penalized splines under additional constraints of shape, such as monotonicity or convexity. The constrained penalized spline estimator is shown to have the same convergence rates as the corresponding unconstrained penalized spline, although in practice the squared error loss is typically smaller for the constrained versions. The penalty parameter may be chosen with generalized cross-validation, which also provides a method for determining if the shape restrictions hold. The method is not a formal hypothesis test, but is shown to have nice large-sample properties, and simulations show that it compares well with existing tests for monotonicity. Extensions to the partial linear model, the generalized regression model, and the varying coefficient model are given, and examples demonstrate the utility of the methods. The Canadian Journal of Statistics 40: 190–206; 2012 © 2012 Statistical Society of Canada

4.
Summary. It is occasionally necessary to smooth data over domains in $\mathbb{R}^2$ with complex irregular boundaries or interior holes. Traditional methods of smoothing which rely on the Euclidean metric or which measure smoothness over the entire real plane may then be inappropriate. This paper introduces a bivariate spline smoothing function defined as the minimizer of a penalized sum-of-squares functional. The roughness penalty is based on a partial differential operator and is integrated only over the problem domain by using finite element analysis. The method is motivated by and applied to two sample smoothing problems and is compared with the thin plate spline.

5.
There has been much recent interest in supersaturated designs and their application in factor screening experiments. Supersaturated designs have mainly been constructed by using the $E(s^2)$-optimality criterion originally proposed by Booth and Cox in 1962. However, until now $E(s^2)$-optimal designs have only been established with certainty for $n$ experimental runs when the number of factors $m$ is a multiple of $n - 1$, and in adjacent cases where $m = q(n - 1) + r$ ($|r| \le 2$, $q$ an integer). A method of constructing $E(s^2)$-optimal designs is presented which allows a reasonably complete solution to be found for various numbers of runs $n$, including $n$ = 8, 12, 16, 20, 24, 32, 40, 48, 64.

6.
Many different methods have been proposed to construct nonparametric estimates of a smooth regression function, including local polynomial, (convolution) kernel and smoothing spline estimators. Each of these estimators uses a smoothing parameter to control the amount of smoothing performed on a given data set. In this paper an improved version of a criterion based on the Akaike information criterion (AIC), termed AICC, is derived and examined as a way to choose the smoothing parameter. Unlike plug-in methods, AICC can be used to choose smoothing parameters for any linear smoother, including local quadratic and smoothing spline estimators. The use of AICC avoids the large variability and tendency to undersmooth (compared with the actual minimizer of average squared error) seen when other 'classical' approaches (such as generalized cross-validation (GCV) or the AIC) are used to choose the smoothing parameter. Monte Carlo simulations demonstrate that the AICC-based smoothing parameter is competitive with a plug-in method (assuming that one exists) when the plug-in method works well but also performs well when the plug-in approach fails or is unavailable.
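
To make the idea of scoring a linear smoother concrete, here is a minimal sketch of the corrected criterion in the form it is commonly stated, log(sigma^2) + 1 + 2(tr(H) + 1)/(n - tr(H) - 2), where H is the smoother ("hat") matrix and sigma^2 the mean squared residual. The abstract does not give the formula, so that form is an assumption here, as are the Gaussian-kernel Nadaraya-Watson smoother and the names `aicc` and `nw_hat_matrix`.

```python
import numpy as np

def aicc(y, H):
    """Corrected AIC for a linear smoother with hat matrix H (assumed form)."""
    n = len(y)
    resid = y - H @ y
    sigma2 = np.mean(resid ** 2)   # mean squared residual
    tr = np.trace(H)               # effective number of parameters
    return np.log(sigma2) + 1.0 + 2.0 * (tr + 1.0) / (n - tr - 2.0)

def nw_hat_matrix(x, h):
    """Hat matrix of a Gaussian-kernel Nadaraya-Watson smoother, bandwidth h."""
    K = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    return K / K.sum(axis=1, keepdims=True)   # each row sums to one
```

A bandwidth could then be chosen by evaluating `aicc(y, nw_hat_matrix(x, h))` over a grid of `h` values and taking the minimizer; the same `aicc` function applies to any smoother whose fitted values can be written as `H @ y`.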

7.
Estimating smooth monotone functions   Total citations: 1; self-citations: 0; citations by others: 1
Many situations call for a smooth strictly monotone function $f$ of arbitrary flexibility. The family of functions defined by the differential equation $D^2 f = w\,Df$, where $w$ is an unconstrained coefficient function, comprises the strictly monotone twice differentiable functions. The solution to this equation is $f = C_0 + C_1 D^{-1}\{\exp(D^{-1} w)\}$, where $C_0$ and $C_1$ are arbitrary constants and $D^{-1}$ is the partial integration operator. A basis for expanding $w$ is suggested that permits explicit integration in the expression of $f$. In fitting data, it is also useful to regularize $f$ by penalizing the integral of $w^2$ since this is a measure of the relative curvature in $f$. Applications are discussed to monotone nonparametric regression, to the transformation of the dependent variable in non-linear regression and to density estimation.
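
The construction in this abstract can be sketched numerically: integrate w once, exponentiate, and integrate again. The sketch below replaces the integration operator with cumulative trapezoidal integration on a grid, which is an assumption made for illustration (the abstract describes a basis for w that permits explicit integration instead). The names `cumtrapz` and `monotone_from_w` are hypothetical.

```python
import numpy as np

def cumtrapz(values, t):
    """Cumulative trapezoidal integral from t[0]; a numerical stand-in for D^(-1)."""
    steps = 0.5 * (values[1:] + values[:-1]) * np.diff(t)
    return np.concatenate([[0.0], np.cumsum(steps)])

def monotone_from_w(w_values, t, c0=0.0, c1=1.0):
    """Evaluate f = c0 + c1 * D^(-1){exp(D^(-1) w)} on the grid t.

    Because exp(.) is positive, f is strictly increasing whenever c1 > 0,
    whatever the sign of w.
    """
    return c0 + c1 * cumtrapz(np.exp(cumtrapz(w_values, t)), t)
```

For example, taking w identically zero on a grid starting at 0 gives the straight line f(t) = c0 + c1 t, while an oscillating w bends the curve without ever making it decrease.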

8.
Smoothing splines are known to exhibit a type of boundary bias that can reduce their estimation efficiency. In this paper, a boundary corrected cubic smoothing spline is developed in a way that produces a uniformly fourth order estimator. The resulting estimator can be calculated efficiently using an O(n) algorithm that is designed for the computation of fitted values and associated smoothing parameter selection criteria. A simulation study shows that use of the boundary corrected estimator can improve estimation efficiency in finite samples. Applications to the construction of asymptotically valid pointwise confidence intervals are also investigated.

9.
Let $X = (X_1, \ldots, X_p)' \sim N_p(\mu, \Sigma)$, where $\mu = (\mu_1, \ldots, \mu_p)'$ and $\Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_p^2)$ are both unknown and $p \ge 3$. Let $(n_i - 2) w_i / \sigma_i^2 \sim \chi^2_{n_i}$, independent of $w_j$ ($i \ne j = 1, \ldots, p$). Assume that $(w_1, \ldots, w_p)$ and $X$ are independent. Define $W = \mathrm{diag}(w_1, \ldots, w_p)$ and $\|X\|_W^2 = X'W^{-1}Q^{-1}W^{-1}X$, where $Q = \mathrm{diag}(q_1, \ldots, q_p)$, $q_i > 0$, $i = 1, \ldots, p$. In this paper, the minimax estimator of Berger & Bock (1976), given by $\delta(X, W) = [I_p - r(X, W)\,\|X\|_W^{-2}\,Q^{-1}W^{-1}]X$, is shown to be minimax relative to the convex loss $(\delta - \mu)'[\alpha Q + (1 - \alpha)\Sigma^{-1}](\delta - \mu)/C$, where $C = \alpha\,\mathrm{tr}(\Sigma) + (1 - \alpha)p$ and $0 \le \alpha \le 1$, under certain conditions on $r(X, W)$. This generalizes the above-mentioned result of Berger & Bock.

10.
Three types of polynomial mixed model splines have been proposed: smoothing splines, P-splines and penalized splines using a truncated power function basis. The close connections between these models are demonstrated, showing that the default cubic form of the splines differs only in the penalty used. A general definition of the mixed model spline is given that includes general constraints and can be used to produce natural or periodic splines. The impact of different penalties is demonstrated by evaluation across a set of functions with specific features, and shows that the best penalty in terms of mean squared error of prediction depends on both the form of the underlying function and the signal:noise ratio.

11.
Summary.  Because highly correlated data arise from many scientific fields, we investigate parameter estimation in a semiparametric regression model with a diverging number of predictors that are highly correlated. For this, we first develop a distribution-weighted least squares estimator that can recover directions in the central subspace, then use the distribution-weighted least squares estimator as a seed vector and project it onto a Krylov space by partial least squares to avoid computing the inverse of the covariance of predictors. Thus, distribution-weighted partial least squares can handle the cases with high dimensional and highly correlated predictors. Furthermore, we also suggest an iterative algorithm for obtaining a better initial value before implementing partial least squares. For theoretical investigation, we obtain strong consistency and asymptotic normality when the dimension $p$ of predictors is of convergence rate $O\{n^{1/2}/\log(n)\}$ and $o(n^{1/3})$ respectively, where $n$ is the sample size. When there are no other constraints on the covariance of predictors, the rates $n^{1/2}$ and $n^{1/3}$ are optimal. We also propose a Bayesian information criterion type of criterion to estimate the dimension of the Krylov space in the partial least squares procedure. Illustrative examples with a real data set and comprehensive simulations demonstrate that the method is robust to non-ellipticity and works well even in 'small $n$–large $p$' problems.

12.
Regular smoothing splines are known to have a type of boundary bias problem that can reduce their estimation efficiency. In this paper, a boundary corrected smoothing spline with general order is designed in a way that the risk will decay at an optimal rate. An O(n) algorithm is also developed to compute the resultant estimator efficiently.

13.
In statistical models where jumps of a $d$-dimensional stable process $(S_t)_{t \ge 0}$ are observed in windows with certain asymptotic properties, and where parameters appearing in the Lévy measure of $S$ are to be estimated, we have asymptotically efficient estimators. If a Poisson random measure $\mu$ on $(0, \infty) \times (\mathbb{R}^d \setminus \{0\})$ with intensity $dt\,\Lambda(dx)$ replaces the jump measure of $S$, where $\Lambda$ is a σ-finite measure on $\mathbb{R}^d \setminus \{0\}$ admitting tail parameters in a suitable sense, we specify a notion of neighbourhood which allows efficiency in statistical experiments of the second type to be treated by switching to accompanying sequences of the stable process type considered first.

14.
It is shown that the least squares estimators of $B$ and $\Sigma$ in the multivariate linear model $\{E\,Y_i = X_i B,\ D(Y_i) = \Sigma,\ 1 \le i \le n,\ Y_1, \ldots, Y_n \text{ uncorrelated}\}$ subject to the constraints $Y_i M = X_i N$ are just the usual least squares estimators $\hat{B} = (X'X)^{-1}X'Y$ and $\hat{\Sigma} = \frac{1}{n}(Y - X\hat{B})'(Y - X\hat{B})$ in the unconstrained model where $\Sigma$ has full rank. Tests of hypotheses concerning $B$ are discussed for situations in which each $Y_i$ has a multivariate normal distribution, and examples of the applicability of the model are reviewed.
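
For reference, the usual unconstrained least squares estimators in a multivariate linear model, the coefficient matrix (X'X)^(-1) X'Y and the residual cross-product matrix divided by n, can be computed as in the minimal sketch below. It assumes X has full column rank and stacks the observations as rows of X (n by k) and Y (n by q); the function name `multivariate_ls` is hypothetical, and the constrained setting discussed in the abstract is not modelled.

```python
import numpy as np

def multivariate_ls(X, Y):
    """Unconstrained least squares estimators for the model E[Y] = X B.

    Assumes X (n x k) has full column rank; Y is n x q.
    Returns B_hat (k x q) and Sigma_hat (q x q), using the 1/n divisor.
    """
    B_hat = np.linalg.solve(X.T @ X, X.T @ Y)   # (X'X)^(-1) X'Y
    resid = Y - X @ B_hat
    Sigma_hat = resid.T @ resid / X.shape[0]    # (1/n) residual cross-products
    return B_hat, Sigma_hat
```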

15.
Summary.  We analyse data from a seroincident cohort of 457 homosexual men who were infected with the human immunodeficiency virus, followed within the multicentre Italian Seroconversion Study. These data include onset times to acquired immune deficiency syndrome (AIDS), longitudinal measurements of CD4+ T-cell counts taken on each subject during the AIDS-free period of observation and the period of administration of a highly active antiretroviral therapy (HAART), for the subset of individuals who received it. The aim of the study is to assess the effect of HAART on the course of the disease. We analyse the data by a Bayesian model in which the sequence of longitudinal CD4+ cell count observations and the associated time to AIDS are jointly modelled at an individual subject's level as depending on the treatment. We discuss the inferences obtained about the efficacy of HAART, as well as modelling and computation difficulties that were encountered in the analysis. These latter motivate a model criticism stage of the analysis, in which the model specification of CD4+ cell count progression and of the effect of treatment are checked. Our approach to model criticism is based on the notion of a counterfactual replicate data set $Z^c$. This is a data set with the same shape and size as the observed data, which we might have observed by rerunning the study in exactly the same conditions as the actual study if the treated patients had not been treated at all. We draw samples of $Z^c$ from a null model $M_0$, which assumes absence of treatment effect, conditioning on data collected in each subject before initiation of treatment. Model checking is performed by comparing the observed data with a set of samples of $Z^c$ drawn from $M_0$.

16.
Abstract.  We focus on a class of non-standard problems involving non-parametric estimation of a monotone function that is characterized by $n^{1/3}$ rate of convergence of the maximum likelihood estimator, non-Gaussian limit distributions and the non-existence of $\sqrt{n}$-regular estimators. We have shown elsewhere that under a null hypothesis of the type $\psi(z_0) = \theta_0$ ($\psi$ being the monotone function of interest) in non-standard problems of the above kind, the likelihood ratio statistic has a 'universal' limit distribution that is free of the underlying parameters in the model. In this paper, we illustrate its limiting behaviour under local alternatives of the form $\psi_n(z)$, where $\psi_n(\cdot)$ and $\psi(\cdot)$ vary in $O(n^{-1/3})$ neighbourhoods around $z_0$ and $\psi_n$ converges to $\psi$ at rate $n^{1/3}$ in an appropriate metric. Apart from local alternatives, we also consider the behaviour of the likelihood ratio statistic under fixed alternatives and establish the convergence in probability of an appropriately scaled version of the same to a constant involving a Kullback–Leibler distance.

17.
A new definition of asymptotic quasi-score sequence of estimating functions is given and studied. The relationship between asymptotic quasi-likelihood and quasi-likelihood estimates is investigated. A new practical approach for obtaining a good estimate of $\theta$ in the model $y_t = f_t(\theta) + m_t$ without any prior knowledge on the nature of $E(m_t^2 \mid \mathcal{F}_{t-1})$ is suggested, where $f_t$ is a predictable process and $m_t$ is a martingale difference process. Two examples are used to show that the approach is practicable.

18.
Approximate Representation of Estimators in Constrained Regression Problems   Total citations: 6; self-citations: 0; citations by others: 6
The estimators of inequality-constrained regression problems can be computed by iterative algorithms of mathematical programming, but they do not have analytical expressions in terms of the given data. This situation brings obstacles to further studies on the constrained regression. In this paper we derive approximate representations of the estimators with a remainder of magnitude $(N^{-1} \log\log N)^{1/2}$. From these representations one can clearly see the concrete structure of the estimators of these problems. It will be very helpful for further regression analysis.

19.
Simple Transformation Techniques for Improved Non-parametric Regression   Total citations: 2; self-citations: 0; citations by others: 2
We propose and investigate two new methods for achieving less bias in non-parametric regression. We show that the new methods have bias of order $h^4$, where $h$ is a smoothing parameter, in contrast to the basic kernel estimator's order $h^2$. The methods are conceptually very simple. At the first stage, perform an ordinary non-parametric regression on $\{x_i, Y_i\}$ to obtain $\hat{m}(x_i)$ (we use local linear fitting). In the first method, at the second stage, repeat the non-parametric regression but on the transformed dataset $\{\hat{m}(x_i), Y_i\}$, taking the estimator at $x$ to be this second stage estimator at $\hat{m}(x)$. In the second, and more appealing, method, again perform non-parametric regression on $\{\hat{m}(x_i), Y_i\}$, but this time make the kernel weights depend on the original $x$ scale rather than using the $\hat{m}(x)$ scale. We concentrate more of our effort in this paper on the latter because of its advantages over the former. Our emphasis is largely theoretical, but we also show that the latter method has practical potential through some simulated examples.

20.
Thin plate regression splines   Total citations: 2; self-citations: 0; citations by others: 2
Summary. I discuss the production of low rank smoothers for d ≥ 1 dimensional data, which can be fitted by regression or penalized regression methods. The smoothers are constructed by a simple transformation and truncation of the basis that arises from the solution of the thin plate spline smoothing problem and are optimal in the sense that the truncation is designed to result in the minimum possible perturbation of the thin plate spline smoothing problem given the dimension of the basis used to construct the smoother. By making use of Lanczos iteration the basis change and truncation are computationally efficient. The smoothers allow the use of approximate thin plate spline models with large data sets, avoid the problems that are associated with 'knot placement' that usually complicate modelling with regression splines or penalized regression splines, provide a sensible way of modelling interaction terms in generalized additive models, provide low rank approximations to generalized smoothing spline models, appropriate for use with large data sets, provide a means for incorporating smooth functions of more than one variable into non-linear models and improve the computational efficiency of penalized likelihood models incorporating thin plate splines. Given that the approach produces spline-like models with a sparse basis, it also provides a natural way of incorporating unpenalized spline-like terms in linear and generalized linear models, and these can be treated just like any other model terms from the point of view of model selection, inference and diagnostics.
