Similar articles
 Found 20 similar articles (search time: 93 ms)
1.
Abstract. We focus on a class of non-standard problems involving non-parametric estimation of a monotone function that is characterized by an $n^{1/3}$ rate of convergence of the maximum likelihood estimator, non-Gaussian limit distributions and the non-existence of $\sqrt{n}$-regular estimators. We have shown elsewhere that under a null hypothesis of the type $\psi(z_0) = \theta_0$ ($\psi$ being the monotone function of interest) in non-standard problems of the above kind, the likelihood ratio statistic has a 'universal' limit distribution that is free of the underlying parameters in the model. In this paper, we illustrate its limiting behaviour under local alternatives of the form $\psi_n(z)$, where $\psi_n(\cdot)$ and $\psi(\cdot)$ vary in $O(n^{-1/3})$ neighbourhoods around $z_0$ and $\psi_n$ converges to $\psi$ at rate $n^{1/3}$ in an appropriate metric. Apart from local alternatives, we also consider the behaviour of the likelihood ratio statistic under fixed alternatives and establish the convergence in probability of an appropriately scaled version of the same to a constant involving a Kullback–Leibler distance.

2.
Summary. Smoothing splines via the penalized least squares method provide versatile and effective nonparametric models for regression with Gaussian responses. The computation of smoothing splines is generally of the order $O(n^3)$, $n$ being the sample size, which severely limits its practical applicability. We study more scalable computation of smoothing spline regression via certain low dimensional approximations that are asymptotically as efficient. A simple algorithm is presented and the Bayes model that is associated with the approximations is derived, with the latter guiding the porting of Bayesian confidence intervals. The practical choice of the dimension of the approximating space is determined through simulation studies, and empirical comparisons of the approximations with the exact solution are presented. Also evaluated is a simple modification of the generalized cross-validation method for smoothing parameter selection, which to a large extent fixes the occasional undersmoothing problem that is suffered by generalized cross-validation.

3.
Summary. Partial least squares regression has been an alternative to ordinary least squares for handling multicollinearity in several areas of scientific research since the 1960s. It has recently gained much attention in the analysis of high dimensional genomic data. We show that known asymptotic consistency of the partial least squares estimator for a univariate response does not hold with the very large $p$ and small $n$ paradigm. We derive a similar result for a multivariate response regression with partial least squares. We then propose a sparse partial least squares formulation which aims simultaneously to achieve good predictive performance and variable selection by producing sparse linear combinations of the original predictors. We provide an efficient implementation of sparse partial least squares regression and compare it with well-known variable selection and dimension reduction approaches via simulation experiments. We illustrate the practical utility of sparse partial least squares regression in a joint analysis of gene expression and genomewide binding data.

4.
Estimating smooth monotone functions   Total citations: 1, self-citations: 0, citations by others: 1
Many situations call for a smooth strictly monotone function $f$ of arbitrary flexibility. The family of functions defined by the differential equation $D^2 f = w\,Df$, where $w$ is an unconstrained coefficient function, comprises the strictly monotone twice differentiable functions. The solution to this equation is $f = C_0 + C_1 D^{-1}\{\exp(D^{-1} w)\}$, where $C_0$ and $C_1$ are arbitrary constants and $D^{-1}$ is the partial integration operator. A basis for expanding $w$ is suggested that permits explicit integration in the expression of $f$. In fitting data, it is also useful to regularize $f$ by penalizing the integral of $w^2$ since this is a measure of the relative curvature in $f$. Applications are discussed to monotone nonparametric regression, to the transformation of the dependent variable in non-linear regression and to density estimation.
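The solution formula in this abstract can be illustrated numerically. The following is only a minimal sketch, not the paper's method: it assumes the partial integration operator is approximated by a cumulative trapezoid rule starting at the first grid point, and the function name `monotone_from_w`, the grid and the example coefficient function are all hypothetical choices made for illustration.

```python
import numpy as np

def monotone_from_w(w, t, c0=0.0, c1=1.0):
    """Evaluate f = c0 + c1 * D^{-1}{exp(D^{-1} w)} on the grid t.

    Assumption: D^{-1} is approximated by cumulative trapezoid
    integration with lower limit t[0].
    """
    w_vals = w(t)
    steps = np.diff(t)
    # Inner integral D^{-1} w, equal to 0 at t[0].
    inner = np.concatenate(([0.0], np.cumsum(0.5 * (w_vals[1:] + w_vals[:-1]) * steps)))
    # exp(.) is strictly positive, so it can serve as the derivative Df / c1.
    g = np.exp(inner)
    # Outer integral D^{-1}{exp(D^{-1} w)}, equal to 0 at t[0].
    outer = np.concatenate(([0.0], np.cumsum(0.5 * (g[1:] + g[:-1]) * steps)))
    return c0 + c1 * outer

t = np.linspace(0.0, 1.0, 201)
# Hypothetical unconstrained coefficient function w.
f = monotone_from_w(lambda s: 3.0 * np.sin(6.0 * s), t)
```

Because the exponential is positive, the result is strictly increasing on the grid whenever `c1 > 0`, whatever sign `w` takes; with `w` identically zero the sketch returns the straight line `c0 + c1 * (t - t[0])`.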

5.
Abstract. In this paper, a two-stage estimation method for non-parametric additive models is investigated. Differing from Horowitz and Mammen's two-stage estimation, our first-stage estimators are designed not only for dimension reduction but also as initial approximations to all of the additive components. The second-stage estimators are obtained by using one-dimensional non-parametric techniques to refine the first-stage ones. From this procedure, we can reveal a relationship between the regression function spaces and convergence rate, and then provide estimators that are optimal in the sense that, better than the usual one-dimensional mean-squared error (MSE) of the order $n^{-4/5}$, the MSE of the order $n^{-1}$ can be achieved when the underlying models are actually parametric. This shows that our estimation procedure is adaptive in a certain sense. Also it is proved that the bandwidth that is selected by cross-validation depends only on one-dimensional kernel estimation and maintains the asymptotic optimality. Simulation studies show that the new estimators of the regression function and all components outperform the existing estimators, and their behaviours are often similar to that of the oracle estimator.

6.
Suppose that subjects in a population follow the model $f(y^* \mid x^*; \theta)$ where $y^*$ denotes a response, $x^*$ denotes a vector of covariates and $\theta$ is the parameter to be estimated. We consider response-biased sampling, in which a subject is observed with a probability which is a function of its response. Such response-biased sampling frequently occurs in econometrics, epidemiology and survey sampling. The semiparametric maximum likelihood estimate of $\theta$ is derived, along with its asymptotic normality, efficiency and variance estimates. The estimate proposed can be used as a maximum partial likelihood estimate in stratified response-selective sampling. Some computation algorithms are also provided.

7.
There has been much recent interest in supersaturated designs and their application in factor screening experiments. Supersaturated designs have mainly been constructed by using the $E(s^2)$-optimality criterion originally proposed by Booth and Cox in 1962. However, until now $E(s^2)$-optimal designs have only been established with certainty for $n$ experimental runs when the number of factors $m$ is a multiple of $n-1$, and in adjacent cases where $m = q(n-1) + r$ ($|r| \le 2$, $q$ an integer). A method of constructing $E(s^2)$-optimal designs is presented which allows a reasonably complete solution to be found for various numbers of runs $n$ including $n$ = 8, 12, 16, 20, 24, 32, 40, 48, 64.

8.
Non-parametric Kernel Estimation of the Coefficient of a Diffusion   Total citations: 4, self-citations: 0, citations by others: 4
In this work we exhibit a non-parametric estimator of kernel type for the diffusion coefficient when one observes a one-dimensional diffusion process at times $i/n$ for $i$ = , ..., $n$ and study its asymptotics as $n \to \infty$. When the diffusion coefficient has regularity $r \ge 1$, we obtain a rate $1/n^{r/(1+2r)}$, both for pointwise estimation and for estimation on a compact subset of $\mathbb{R}$: this is the same rate as for non-parametric estimation of a density with i.i.d. observations.

9.
In biostatistical applications interest often focuses on the estimation of the distribution of time $T$ between two consecutive events. If the initial event time is observed and the subsequent event time is only known to be larger or smaller than an observed point in time, then the data is described by the well understood singly censored current status model, also known as interval censored data, case I. Jewell et al. (1994) extended this current status model by allowing the initial time to be unobserved, but with its distribution over an observed interval $[A, B]$ known to be uniform; the data is referred to as doubly censored current status data. These authors used this model to handle applications in AIDS partner studies focusing on the NPMLE of the distribution $G$ of $T$. The model is a submodel of the current status model, but the distribution $G$ is essentially the derivative of the distribution of interest $F$ in the current status model. In this paper we establish that the NPMLE of $G$ is uniformly consistent and that the resulting estimators for the $n^{1/2}$-estimable parameters are efficient. We propose an iterative weighted pool-adjacent-violator-algorithm to compute the estimator. It is also shown that, without smoothness assumptions, the NPMLE of $F$ converges at rate $n^{-2/5}$ in $L_2$-norm while the NPMLE of $F$ in the non-parametric current status data model converges at rate $n^{-1/3}$ in $L_2$-norm, which shows that there is a substantial gain in using the submodel information.
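For readers unfamiliar with the building block named in this abstract, a single weighted pool-adjacent-violators pass can be sketched as follows. This is an assumption-laden illustration of the generic algorithm for a non-decreasing weighted least squares fit, not the iterative version proposed by the authors; the function name `pava` and its interface are hypothetical.

```python
def pava(y, w=None):
    """Weighted pool-adjacent-violators: non-decreasing least squares fit to y."""
    y = [float(v) for v in y]
    w = [1.0] * len(y) if w is None else [float(v) for v in w]
    blocks = []  # each block holds [weighted mean, total weight, number of points]
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Pool the last two blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)  # expand each pooled block back to its points
    return fit
```

For example, `pava([1, 3, 2, 4])` pools the violating pair 3, 2 into their mean and returns `[1.0, 2.5, 2.5, 4.0]`.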

10.
Summary. We analyse data from a seroincident cohort of 457 homosexual men who were infected with the human immunodeficiency virus, followed within the multicentre Italian Seroconversion Study. These data include onset times to acquired immune deficiency syndrome (AIDS), longitudinal measurements of CD4+ T-cell counts taken on each subject during the AIDS-free period of observation and the period of administration of a highly active antiretroviral therapy (HAART), for the subset of individuals who received it. The aim of the study is to assess the effect of HAART on the course of the disease. We analyse the data by a Bayesian model in which the sequence of longitudinal CD4+ cell count observations and the associated time to AIDS are jointly modelled at an individual subject's level as depending on the treatment. We discuss the inferences obtained about the efficacy of HAART, as well as modelling and computation difficulties that were encountered in the analysis. These latter motivate a model criticism stage of the analysis, in which the model specification of CD4+ cell count progression and of the effect of treatment are checked. Our approach to model criticism is based on the notion of a counterfactual replicate data set $Z^c$. This is a data set with the same shape and size as the observed data, which we might have observed by rerunning the study in exactly the same conditions as the actual study if the treated patients had not been treated at all. We draw samples of $Z^c$ from a null model $M_0$, which assumes absence of treatment effect, conditioning on data collected in each subject before initiation of treatment. Model checking is performed by comparing the observed data with a set of samples of $Z^c$ drawn from $M_0$.

11.
Summary. For a binary treatment $\nu = 0, 1$ and the corresponding 'potential response' $Y_0$ for the control group ($\nu = 0$) and $Y_1$ for the treatment group ($\nu = 1$), one definition of no treatment effect is that $Y_0$ and $Y_1$ follow the same distribution given a covariate vector $X$. Koul and Schick have provided a non-parametric test for no distributional effect when the realized response $(1 - \nu) Y_0 + \nu Y_1$ is fully observed and the distribution of $X$ is the same across the two groups. This test is thus not applicable to censored responses, nor to non-experimental (i.e. observational) studies that entail different distributions of $X$ across the two groups. We propose '$X$-matched' non-parametric tests generalizing the test of Koul and Schick following an idea of Gehan. Our tests are applicable to non-experimental data with randomly censored responses. In addition to these motivations, the tests have several advantages. First, they have the intuitive appeal of comparing all available pairs across the treatment and control groups, instead of selecting a number of matched controls (or treated) in the usual pair or multiple matching. Second, whereas most matching estimators or tests have a non-overlapping support (of $X$) problem across the two groups, our tests have a built-in protection against the problem. Third, Gehan's idea allows the tests to make good use of censored observations. A simulation study is conducted, and an empirical illustration for a job training effect on the duration of unemployment is provided.

12.
Goodness-of-fit tests based on residual sums of squares are standard procedures used when fitting regression models. Often we have a smooth alternative in mind, a qualitative feature that the $\chi^2$-test does not take into account. We show that the power of detecting a smooth alternative increases when we smooth the current model as well. The proposed test is shown to be able to detect any continuous local alternative tending to zero slower than $n^{-1/2}$. Theoretical results also address minimax non-parametric hypothesis testing in Sobolev spaces. A simulation study is presented, and the procedure is applied to expenditure curve estimation.

13.
We consider the problem of estimating the mean of a multivariate distribution. As a general alternative to penalized least squares estimators, we consider minimax estimators for squared error over a restricted parameter space where the restriction is determined by the penalization term. For a quadratic penalty term, the minimax estimator among linear estimators can be found explicitly. It is shown that all symmetric linear smoothers with eigenvalues in the unit interval can be characterized as minimax linear estimators over a certain parameter space where the bias is bounded. The minimax linear estimator depends on smoothing parameters that must be estimated in practice. Using results in Kneip (1994), this can be done using Mallows' $C_L$-statistic and the resulting adaptive estimator is now asymptotically minimax linear. The minimax estimator is compared to the penalized least squares estimator both in finite samples and asymptotically.

14.
Summary. We consider the problem of estimating the noise variance in homoscedastic nonparametric regression models. For low dimensional covariates $t \in \mathbb{R}^d$, $d = 1, 2$, difference-based estimators have been investigated in a series of papers. For a given length of such an estimator, difference schemes which minimize the asymptotic mean-squared error can be computed for $d = 1$ and $d = 2$. However, from numerical studies it is known that for finite sample sizes the performance of these estimators may be deficient owing to a large finite sample bias. We provide theoretical support for these findings. In particular, we show that with increasing dimension $d$ this becomes more drastic. If $d \ge 4$, these estimators even fail to be consistent. A different class of estimators is discussed which allow better control of the bias and remain consistent when $d \ge 4$. These estimators are compared numerically with kernel-type estimators (which are asymptotically efficient), and some guidance is given about when their use becomes necessary.
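As a concrete instance of a difference-based variance estimator for $d = 1$, the sketch below uses the simplest first-order difference scheme (commonly attributed to Rice), $\hat\sigma^2 = \frac{1}{2(n-1)} \sum_{i=1}^{n-1} (y_{i+1} - y_i)^2$. It is offered only as an illustration under the assumption that the responses are ordered by the covariate; it is not one of the optimal difference schemes or the new class of estimators discussed in the abstract, and the function name `rice_variance` is hypothetical.

```python
import numpy as np

def rice_variance(y):
    """First-order difference estimator of the noise variance.

    Assumes y is sorted by a one-dimensional covariate, so that successive
    differences mostly cancel the smooth regression function.
    """
    y = np.asarray(y, dtype=float)
    d = np.diff(y)  # y[i+1] - y[i]
    # Each difference of two independent errors has variance 2 * sigma^2.
    return float(np.sum(d ** 2) / (2.0 * (len(y) - 1)))
```

The estimator is exactly zero for constant data and, for a smooth regression function on a dense design, its bias comes from the squared increments of that function, which is the finite sample bias the abstract refers to.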

15.
Abstract. Let $\Omega$ be a space of densities with respect to some $\sigma$-finite measure $\mu$ and let $\Pi$ be a prior distribution having support $\Omega$ with respect to some suitable topology. Conditional on $f$, let $\mathbf{X}_n = (X_1, \ldots, X_n)$ be an independent and identically distributed sample of size $n$ from $f$. This paper introduces a Bayesian non-parametric criterion for sample size determination which is based on the integrated squared distance between posterior predictive densities. An expression for the sample size is obtained when the prior is a Dirichlet mixture of normal densities.

16.
It is shown that the least squares estimators of $B$ and $\Sigma$ in the multivariate linear model $\{E\,Y_i = X_i B,\ D(Y_i) = \Sigma,\ 1 \le i \le n,\ Y_1, \ldots, Y_n \text{ uncorrelated}\}$ subject to the constraints $Y_i M = X_i N$ are just the usual least squares estimators $\hat{B} = (X'X)^{-1} X'Y$ and $\hat{\Sigma} = \frac{1}{n}(Y - X\hat{B})'(Y - X\hat{B})$ in the unconstrained model where $\Sigma$ has full rank. Tests of hypotheses concerning $B$ are discussed for situations in which each $Y_i$ has a multivariate normal distribution, and examples of the applicability of the model are reviewed.

17.
Let $X = (X_1, \ldots, X_p)' \sim N_p(\mu, \Sigma)$ where $\mu = (\mu_1, \ldots, \mu_p)'$ and $\Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_p^2)$ are both unknown and $p \ge 3$. Let $(n_i - 2)\, w_i / \sigma_i^2 \sim \chi^2_{n_i}$, independent of $w_j$ ($i \ne j = 1, \ldots, p$). Assume that $(w_1, \ldots, w_p)$ and $X$ are independent. Define $W = \mathrm{diag}(w_1, \ldots, w_p)$ and $\|X\|_W^2 = X' W^{-1} Q^{-1} W^{-1} X$ where $Q = \mathrm{diag}(q_1, \ldots, q_p)$, $q_i > 0$, $i = 1, \ldots, p$. In this paper, the minimax estimator of Berger & Bock (1976), given by $\delta(X, W) = [I_p - r(X, W)\, \|X\|_W^{-2}\, Q^{-1} W^{-1}] X$, is shown to be minimax relative to the convex loss $(\delta - \mu)'[\alpha Q + (1 - \alpha) \Sigma^{-1}](\delta - \mu)/C$, where $C = \alpha\, \mathrm{tr}(\Sigma) + (1 - \alpha) p$ and $0 \le \alpha \le 1$, under certain conditions on $r(X, W)$. This generalizes the above mentioned result of Berger & Bock.

18.
We are concerned with estimators which improve upon the best invariant estimator, in estimating a location parameter $\theta$. If the loss function is $L(\theta - a)$ with $L$ convex, we give sufficient conditions for the inadmissibility of $\delta_0(X) = X$. If the loss is a weighted sum of squared errors, we find various classes of estimators $\delta$ which are better than $\delta_0$. In general, $\delta$ is the convolution of $\delta_1$ (an estimator which improves upon $\delta_0$ outside of a compact set) with a suitable probability density in $\mathbb{R}^p$. The critical dimension of inadmissibility depends on the estimator $\delta_1$. We also give several examples of estimators $\delta$ obtained in this way and state some open problems.

19.
Several estimators of squared prediction error have been suggested for use in model and bandwidth selection problems. Among these are cross-validation, generalized cross-validation and a number of related techniques based on the residual sum of squares. For many situations with squared error loss, e.g. nonparametric smoothing, these estimators have been shown to be asymptotically optimal in the sense that in large samples the estimator minimizing the selection criterion also minimizes squared error loss. However, cross-validation is known not to be asymptotically optimal for some 'easy' location problems. We consider selection criteria based on estimators of squared prediction risk for choosing between location estimators. We show that criteria based on adjusted residual sum of squares are not asymptotically optimal for choosing between asymptotically normal location estimators that converge at rate $n^{1/2}$ but are when the rate of convergence is slower. We also show that leave-one-out cross-validation is not asymptotically optimal for choosing between $\sqrt{n}$-differentiable statistics but leave-$d$-out cross-validation is optimal when $d \to \infty$ at the appropriate rate.

20.
Simple Transformation Techniques for Improved Non-parametric Regression   Total citations: 2, self-citations: 0, citations by others: 2
We propose and investigate two new methods for achieving less bias in non-parametric regression. We show that the new methods have bias of order $h^4$, where $h$ is a smoothing parameter, in contrast to the basic kernel estimator's order $h^2$. The methods are conceptually very simple. At the first stage, perform an ordinary non-parametric regression on $\{x_i, Y_i\}$ to obtain $\hat{m}(x_i)$ (we use local linear fitting). In the first method, at the second stage, repeat the non-parametric regression but on the transformed dataset $\{\hat{m}(x_i), Y_i\}$, taking the estimator at $x$ to be this second stage estimator at $\hat{m}(x)$. In the second, and more appealing, method, again perform non-parametric regression on $\{\hat{m}(x_i), Y_i\}$, but this time make the kernel weights depend on the original $x$ scale rather than using the $\hat{m}(x)$ scale. We concentrate more of our effort in this paper on the latter because of its advantages over the former. Our emphasis is largely theoretical, but we also show that the latter method has practical potential through some simulated examples.
