Found 20 similar articles (search time: 31 ms)
Functional data analysis involves the extension of familiar statistical procedures such as principal-components analysis, linear modelling and canonical correlation analysis to data where the raw observation is a function $x_i(t)$. An essential preliminary to a functional data analysis is often the registration or alignment of salient curve features by suitable monotone transformations $h_i(t)$. In effect, this conceptualizes variation among functions as being composed of two aspects: phase and amplitude. Registration aims to remove phase variation as a preliminary to statistical analyses of amplitude variation. A local nonlinear regression technique is described for identifying the smooth monotone transformations $h_i$, and is illustrated by analyses of simulated and actual data.

Simple Transformation Techniques for Improved Non-parametric Regression   Total citations: 2, self-citations: 0, cited by others: 2
We propose and investigate two new methods for achieving less bias in non-parametric regression. We show that the new methods have bias of order $h^4$, where $h$ is a smoothing parameter, in contrast to the basic kernel estimator's order $h^2$. The methods are conceptually very simple. At the first stage, perform an ordinary non-parametric regression on $\{x_i, Y_i\}$ to obtain $\hat m(x_i)$ (we use local linear fitting). In the first method, at the second stage, repeat the non-parametric regression but on the transformed dataset $\{\hat m(x_i), Y_i\}$, taking the estimator at $x$ to be this second stage estimator at $\hat m(x)$. In the second, and more appealing, method, again perform non-parametric regression on $\{\hat m(x_i), Y_i\}$, but this time make the kernel weights depend on the original $x$ scale rather than using the $\hat m(x)$ scale. We concentrate more of our effort in this paper on the latter because of its advantages over the former. Our emphasis is largely theoretical, but we also show that the latter method has practical potential through some simulated examples.

A subset $T$ of $S$ is said to be a Pareto Optimal subset of $m$ ordered attributes (factors) if for profiles (combinations of attribute levels) $(x_1, \dots, x_m)$ and $(y_1, \dots, y_m) \in T$, no profile 'dominates' another; that is, there exists no pair such that $x_i \le y_i$, for $i = 1, \dots, m$. Pareto Optimal designs have specific applications in economics, cognitive psychology, and marketing research where investigators use main effects linear models to infer how respondents value levels of costs and benefits from their preferences for sets of profiles offered to them. In such studies, it is desirable that no profile dominates the others in a set. This paper shows how to construct a Pareto Optimal subset, proves that a single Pareto Optimal subset is not a connected main effects plan, provides subsets of two or more attributes that are connected in symmetric designs and gives corresponding results for asymmetric designs.
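The dominance condition in this abstract can be checked mechanically. The following is a minimal sketch, not the paper's construction: the function names are hypothetical, attribute levels are assumed to be coded as integers 0, 1, 2, ..., and the constant-level-sum rule is used only as one simple way to produce a set in which no profile dominates another (two distinct profiles with equal level sums cannot be ordered componentwise).

```python
from itertools import combinations, product


def dominates(y, x):
    """Return True if profile y dominates profile x.

    Following the abstract's definition: x_i <= y_i for every attribute i.
    Identical profiles are not counted as dominating each other.
    """
    return x != y and all(xi <= yi for xi, yi in zip(x, y))


def is_pareto_optimal(profiles):
    """Return True if no profile in the collection dominates another."""
    return not any(
        dominates(a, b) or dominates(b, a)
        for a, b in combinations(profiles, 2)
    )


def constant_sum_profiles(n_levels, total):
    """All profiles whose coded levels add up to `total`.

    n_levels[i] is the number of levels of attribute i (coded 0..n-1).
    This is an illustrative construction, assumed here for the example.
    """
    ranges = [range(n) for n in n_levels]
    return [p for p in product(*ranges) if sum(p) == total]


# Three attributes with three levels each, level sum fixed at 3.
subset = constant_sum_profiles([3, 3, 3], 3)
print(len(subset), is_pareto_optimal(subset))
```

Adding a profile such as `(2, 2, 2)` to `subset` would break the property, because it dominates every other profile in the set.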

Estimating smooth monotone functions   Total citations: 1, self-citations: 0, cited by others: 1
Many situations call for a smooth strictly monotone function $f$ of arbitrary flexibility. The family of functions defined by the differential equation $D^2 f = w\,Df$, where $w$ is an unconstrained coefficient function, comprises the strictly monotone twice differentiable functions. The solution to this equation is $f = C_0 + C_1 D^{-1}\{\exp(D^{-1} w)\}$, where $C_0$ and $C_1$ are arbitrary constants and $D^{-1}$ is the partial integration operator. A basis for expanding $w$ is suggested that permits explicit integration in the expression of $f$. In fitting data, it is also useful to regularize $f$ by penalizing the integral of $w^2$ since this is a measure of the relative curvature in $f$. Applications are discussed to monotone nonparametric regression, to the transformation of the dependent variable in non-linear regression and to density estimation.
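The solution formula in this abstract can be evaluated numerically for any coefficient function $w$. The sketch below is an illustration only: it takes the lower integration limit to be 0 and uses the trapezoid rule on a uniform grid, whereas the abstract suggests a basis for $w$ that allows explicit integration. The function name and parameters are hypothetical.

```python
import math


def monotone_from_w(w, t_max, n_steps=1000, c0=0.0, c1=1.0):
    """Evaluate f(t) = c0 + c1 * int_0^t exp(int_0^u w(v) dv) du on a grid.

    Both integrals use the trapezoid rule with step t_max / n_steps.
    For c1 > 0 the result is strictly increasing whatever w is,
    because the integrand exp(...) is always positive.
    """
    h = t_max / n_steps
    ts = [i * h for i in range(n_steps + 1)]

    # Inner integral: D^{-1} w
    inner = [0.0]
    for i in range(1, n_steps + 1):
        inner.append(inner[-1] + 0.5 * h * (w(ts[i - 1]) + w(ts[i])))

    # exp(D^{-1} w) equals Df / c1 and is strictly positive
    slope = [math.exp(v) for v in inner]

    # Outer integral: D^{-1} exp(D^{-1} w)
    f = [c0]
    acc = 0.0
    for i in range(1, n_steps + 1):
        acc += 0.5 * h * (slope[i - 1] + slope[i])
        f.append(c0 + c1 * acc)
    return ts, f


# An oscillating w still yields a strictly increasing f.
ts, f = monotone_from_w(lambda t: 3.0 * math.sin(5.0 * t), 2.0)
print(all(b > a for a, b in zip(f, f[1:])))
```

With $w \equiv 0$ the formula reduces to a straight line, and with constant $w = 1$ to $C_0 + C_1(e^t - 1)$, which gives two easy sanity checks.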

Suppose the $p$-variate random vector $W$, partitioned into $q$ variables $W_1$ and $p - q$ variables $W_2$, follows a multivariate normal mixture distribution. If the investigator is mainly interested in estimation of the parameters of the distribution of $W_1$, there are two possibilities: (1) use only the data on $W_1$ for estimation, and (2) estimate the parameters of the $p$-variate mixture distribution, and then extract the estimates of the marginal distribution of $W_1$. In this article we study the choice between these two possibilities mainly for the case of two mixture components with identical covariance matrices. We find the asymptotic distribution of the linear discriminant function coefficients using the work of Efron (1975) and O'Neill (1978), and give a Wald test for redundancy of $W_2$. A simulation study gives further insights into conditions under which $W_2$ should be used in the analysis: in summary, the inclusion of $W_2$ seems justified if $\Delta_{2.1}$, the Mahalanobis distance between the two component distributions based on the conditional distribution of $W_2$ given $W_1$, is at least 2.

A two-phase sampling estimator of the ratio type for estimating the mean of a finite population is considered for the case where the value of $\rho C_y/C_x$ can be guessed or estimated in advance. Here $C_y$ and $C_x$ denote respectively the coefficients of variation of the characteristic under study, $y$, and the auxiliary characteristic, $x$, and $\rho$ denotes the coefficient of correlation between $y$ and $x$. When the value of $\rho C_y/C_x$ is guessed or estimated exactly, the estimator has a smaller large-sample variance compared with either an ordinary ratio estimator or an ordinary linear regression estimator in two-phase sampling in the case where the first-phase sample is drawn independently from the second-phase sample. If the sample at the second phase is a subsample of the first-phase sample, the estimator has variance equal to that of the linear regression estimator. The largest difference between the assumed value and the actual value of $\rho C_y/C_x$ for which the variance of the estimator does not exceed the variance of either an ordinary ratio estimator or an ordinary linear regression estimator has been obtained.

Summary.  Principal component analysis has become a fundamental tool of functional data analysis. It represents the functional data as $X_i(t) = \mu(t) + \sum_{1 \le l < \infty} \eta_{i,l}\, v_l(t)$, where $\mu$ is the common mean, $v_l$ are the eigenfunctions of the covariance operator and the $\eta_{i,l}$ are the scores. Inferential procedures assume that the mean function $\mu(t)$ is the same for all values of $i$. If, in fact, the observations do not come from one population, but rather their mean changes at some point(s), the results of principal component analysis are confounded by the change(s). It is therefore important to develop a methodology to test the assumption of a common functional mean. We develop such a test using quantities which can be readily computed in the R package fda. The null distribution of the test statistic is asymptotically pivotal with a well-known asymptotic distribution. The asymptotic test has excellent finite sample performance. Its application is illustrated on temperature data from England.

Abstract.  We focus on a class of non-standard problems involving non-parametric estimation of a monotone function that is characterized by $n^{1/3}$ rate of convergence of the maximum likelihood estimator, non-Gaussian limit distributions and the non-existence of $\sqrt{n}$-regular estimators. We have shown elsewhere that under a null hypothesis of the type $\psi(z_0) = \theta_0$ ($\psi$ being the monotone function of interest) in non-standard problems of the above kind, the likelihood ratio statistic has a 'universal' limit distribution that is free of the underlying parameters in the model. In this paper, we illustrate its limiting behaviour under local alternatives of the form $\psi_n(z)$, where $\psi_n(\cdot)$ and $\psi(\cdot)$ vary in $O(n^{-1/3})$ neighbourhoods around $z_0$ and $\psi_n$ converges to $\psi$ at rate $n^{1/3}$ in an appropriate metric. Apart from local alternatives, we also consider the behaviour of the likelihood ratio statistic under fixed alternatives and establish the convergence in probability of an appropriately scaled version of the same to a constant involving a Kullback–Leibler distance.

Consider the problem of covariance analysis based on regression models whose regression function is the sum of a linear and a non-parametric component. We propose a parametric and a non-parametric statistical test to compare the effects of the linear and non-parametric components, respectively, on the response variable in $L \ge 2$ groups. Serially correlated errors within each group are allowed. The first (second) test compares the differences between the estimates of the parametric (non-parametric) components of each group by means of a Mahalanobis ($L_2$) distance. The asymptotic distribution of each statistic under the null hypothesis is obtained. A modest simulation study and an application to a real data set illustrate our methodology.

Summary.  The process of quality control of micrometeorological and carbon dioxide (CO2) flux data can be subjective and may lack repeatability, which would undermine the results of many studies. Multivariate statistical methods and time series analysis were used together and independently to detect and replace outliers in CO2 flux data derived from a Bowen ratio energy balance system. The results were compared with those produced by five experts who applied the current and potentially subjective protocol. All protocols were tested on the same set of three 5-day periods, when measurements were conducted in an abandoned agricultural field. The concordance of the protocols was evaluated by using the experts' opinion (mean ± 1.96 standard deviations) as a reference interval (the Bland–Altman method). Analysing the 15 days together, the statistical protocol that combined multivariate distance, multiple linear regression and time series analysis showed a concordance of 93% on a 20-min flux basis and 87% on a daily basis (only 2 days fell outside the reference interval), and the overall flux differed only by 1.7% (3.2 g CO2 m⁻²). An automated version of this or a similar statistical protocol could be used as a standard way of filling gaps and processing data from Bowen ratio energy balance and other techniques (e.g. eddy covariance). This would enforce objectivity in comparisons of CO2 flux data that are generated by different research groups and streamline the protocols for quality control.
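The concordance measure described here, the share of periods in which a protocol's value falls inside the experts' mean ± 1.96 standard deviations, can be sketched in a few lines. The function names and the numbers below are hypothetical and are not taken from the study; the sample (n − 1) standard deviation is an assumption, since the abstract does not say which form was used.

```python
from statistics import mean, stdev


def reference_interval(expert_values, z=1.96):
    """Return (low, high) = mean -/+ z * sample standard deviation."""
    centre = mean(expert_values)
    spread = stdev(expert_values)  # sample SD, an assumption of this sketch
    return centre - z * spread, centre + z * spread


def concordance(protocol_values, expert_values_by_period):
    """Fraction of periods whose protocol value lies in the expert interval."""
    inside = 0
    for value, experts in zip(protocol_values, expert_values_by_period):
        low, high = reference_interval(experts)
        if low <= value <= high:
            inside += 1
    return inside / len(protocol_values)


# Made-up flux values: five experts, two periods.
experts = [
    [1.0, 1.2, 0.8, 1.1, 0.9],
    [2.0, 2.1, 1.9, 2.0, 2.0],
]
protocol = [1.05, 3.0]
print(concordance(protocol, experts))
```

In this toy input the first protocol value lies inside its interval and the second does not, so half of the periods are concordant.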

In sequential analysis it is often necessary to determine the distributions of $\sqrt{t}\,\bar Y_t$ and/or $\sqrt{a}\,\bar Y_t$, where $t$ is a stopping time of the form $t = \inf\{n \ge 1 : n + S_n + \xi_n > a\}$, $\bar Y_n$ is the sample mean of $n$ independent and identically distributed random variables (iidrvs) $Y_i$ with mean zero and variance one, $S_n$ is the partial sum of iidrvs $X_i$ with mean zero and a positive finite variance, and $\{\xi_n\}$ is a sequence of random variables that converges in distribution to a random variable $\xi$ as $n \to \infty$ and $\xi_n$ is independent of $(X_{n+1}, Y_{n+1}), (X_{n+2}, Y_{n+2}), \dots$ for all $n \ge 1$. Anscombe's (1952) central limit theorem asserts that both $\sqrt{t}\,\bar Y_t$ and $\sqrt{a}\,\bar Y_t$ are asymptotically normal for large $a$, but a normal approximation is not accurate enough for many applications. Refined approximations are available only for a few special cases of the general setting above and are often very complex. This paper provides some simple Edgeworth approximations that are numerically satisfactory for the problems it considers.

Nonparametric regression methods are used as exploratory tools for formulating, identifying and estimating non-linear models for the Canadian lynx data, which have attained benchmark status in the time series literature since the work of Moran in 1953. To avoid the curse of dimensionality in the nonparametric analysis of this short series with 114 observations, we confine attention to the restricted class of additive and projection pursuit regression (PPR) models and rely on the estimated prediction error variance to compare the predictive performance of various (non-)linear models. A PPR model is found to have the smallest (in-sample) estimated prediction error variance of all the models fitted to these data in the literature. We use a data perturbation procedure to assess and adjust for the effect of data mining on the estimated prediction error variances; this renders most models fitted to the lynx data comparable and nearly equivalent. However, on the basis of the mean-squared error of out-of-sample prediction error, the semiparametric model $X_t = 1.08 + 1.37 X_{t-1} + f(X_{t-2}) + e_t$ and Tong's self-exciting threshold autoregression model perform much better than the PPR and other models known for the lynx data.

Summary.  We consider the problem of multistep-ahead prediction in time series analysis by using nonparametric smoothing techniques. Forecasting is always one of the main objectives in time series analysis. Research has shown that non-linear time series models have certain advantages in multistep-ahead forecasting. Traditionally, nonparametric $k$-step-ahead least squares prediction for non-linear autoregressive AR($d$) models is done by estimating $E(X_{t+k} \mid X_t, \dots, X_{t-d+1})$ via nonparametric smoothing of $X_{t+k}$ on $(X_t, \dots, X_{t-d+1})$ directly. We propose a multistage nonparametric predictor. We show that the new predictor has smaller asymptotic mean-squared error than the direct smoother, though the convergence rate is the same. Hence, the predictor proposed is more efficient. Some simulation results, advice for practical bandwidth selection and a real data example are provided.
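For orientation, the traditional direct smoother that this abstract uses as its baseline can be sketched as follows. This is not the multistage predictor the paper proposes. It is a plain Nadaraya-Watson estimate of the conditional mean, and the Gaussian product kernel, the single shared bandwidth and the function name are all assumptions made for the illustration.

```python
import math


def direct_k_step_predictor(series, k, d, bandwidth):
    """Build a direct k-step-ahead predictor for an AR(d)-type series.

    Returns a function mapping x = (x_t, ..., x_{t-d+1}) to a
    Nadaraya-Watson estimate of E(X_{t+k} | X_t, ..., X_{t-d+1} = x).
    """
    # Pair every lag vector with the value observed k steps later.
    pairs = []
    for t in range(d - 1, len(series) - k):
        lagged = tuple(series[t - j] for j in range(d))
        pairs.append((lagged, series[t + k]))

    def predict(x):
        num = 0.0
        den = 0.0
        for lagged, target in pairs:
            dist2 = sum((a - b) ** 2 for a, b in zip(lagged, x))
            weight = math.exp(-0.5 * dist2 / bandwidth ** 2)
            num += weight * target
            den += weight
        # den can underflow to zero if x is far from every observed lag vector
        return num / den

    return predict


# Toy series that alternates 0, 1, 0, 1, ...
series = [0.0, 1.0] * 10
one_step = direct_k_step_predictor(series, k=1, d=1, bandwidth=0.1)
two_step = direct_k_step_predictor(series, k=2, d=1, bandwidth=0.1)
print(one_step((0.0,)), two_step((0.0,)))
```

On the alternating toy series the one-step prediction from 0 is close to 1 and the two-step prediction is close to 0, as expected.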

Summary.  The paper considers the double-autoregressive model $y_t = \phi y_{t-1} + \varepsilon_t$ with $\varepsilon_t =$ [expression missing in source]. Consistency and asymptotic normality of the estimated parameters are proved under the condition $E \ln|\phi + \sqrt{\alpha}\,\eta_t| < 0$, which includes the cases with $|\phi| = 1$ or $|\phi| > 1$ as well as [expression missing in source]. It is well known that all kinds of estimators of $\phi$ in these cases are not normal when $\varepsilon_t$ are independent and identically distributed. Our result is novel and surprising. Two tests are proposed for testing stationarity of the model and their asymptotic distributions are shown to be a function of bivariate Brownian motions. Critical values of the tests are tabulated and some simulation results are reported. An application to the US 90-day treasury bill rate series is given.

In the estimators $t_3$, $t_4$, $t_5$ of Mukerjee, Rao & Vijayan (1987), $b_{yx}$ and $b_{yz}$ are partial regression coefficients of $y$ on $x$ and $z$, respectively, based on the smaller sample. With the above interpretation of $b_{yx}$ and $b_{yz}$ in $t_3$, $t_4$, $t_5$, all the calculations in Mukerjee et al. (1987) are correct. In this connection, we also wish to make it explicit that $b_{xz}$ in $t_5$ is an ordinary and not a partial regression coefficient. The 'corrected' MSEs of $t_3$, $t_4$, $t_5$, as given in Ahmed (1998, Section 3), are computed assuming that our $b_{yx}$ and $b_{yz}$ are ordinary and not partial regression coefficients. Indeed, we had no intention of giving estimators using the corresponding ordinary regression coefficients, which would lead to estimators inferior to those given by Kiregyera (1984). We accept responsibility for any notational confusion created by us and express regret to readers who have been confused by our notation. Finally, in consideration of the above, it may be noted that Tripathi & Ahmed's (1995) estimator $t_0$, quoted also in Ahmed (1998), is no better than $t_5$ of Mukerjee et al. (1987).

Many different methods have been proposed to construct nonparametric estimates of a smooth regression function, including local polynomial, (convolution) kernel and smoothing spline estimators. Each of these estimators uses a smoothing parameter to control the amount of smoothing performed on a given data set. In this paper an improved version of a criterion based on the Akaike information criterion (AIC), termed AICC, is derived and examined as a way to choose the smoothing parameter. Unlike plug-in methods, AICC can be used to choose smoothing parameters for any linear smoother, including local quadratic and smoothing spline estimators. The use of AICC avoids the large variability and tendency to undersmooth (compared with the actual minimizer of average squared error) seen when other 'classical' approaches (such as generalized cross-validation (GCV) or the AIC) are used to choose the smoothing parameter. Monte Carlo simulations demonstrate that the AICC-based smoothing parameter is competitive with a plug-in method (assuming that one exists) when the plug-in method works well but also performs well when the plug-in approach fails or is unavailable.
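The idea of choosing a smoothing parameter for a linear smoother by an AICC-type score can be illustrated with a small kernel smoother. The abstract does not state the criterion's formula; the sketch assumes the commonly cited form log(σ̂²) + 1 + 2(tr(H) + 1)/(n − tr(H) − 2), where H is the smoother (hat) matrix and σ̂² is the mean squared residual. The Nadaraya-Watson smoother, the function names and the synthetic data are likewise assumptions of this example.

```python
import math


def nw_fit(x, y, h):
    """Nadaraya-Watson fit with a Gaussian kernel of bandwidth h.

    Returns the fitted values and the trace of the hat matrix.
    """
    fitted = []
    trace = 0.0
    for i, xi in enumerate(x):
        weights = [math.exp(-0.5 * ((xi - xj) / h) ** 2) for xj in x]
        total = sum(weights)
        fitted.append(sum(wj * yj for wj, yj in zip(weights, y)) / total)
        trace += weights[i] / total  # i-th diagonal entry of the hat matrix
    return fitted, trace


def aicc(x, y, h):
    """AICC-type score (assumed form, see the lead-in); smaller is better."""
    n = len(x)
    fitted, tr = nw_fit(x, y, h)
    sigma2 = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / n
    if n - tr - 2.0 <= 0.0 or sigma2 <= 0.0:
        return math.inf  # the fit (nearly) interpolates; score undefined
    return math.log(sigma2) + 1.0 + 2.0 * (tr + 1.0) / (n - tr - 2.0)


def choose_bandwidth(x, y, candidates):
    """Pick the candidate bandwidth with the smallest score."""
    return min(candidates, key=lambda h: aicc(x, y, h))


# Synthetic data: a sine curve plus a deterministic alternating disturbance.
x = [i / 49 for i in range(50)]
y = [math.sin(2 * math.pi * xi) + 0.3 * (-1) ** i for i, xi in enumerate(x)]
print(choose_bandwidth(x, y, [0.001, 0.05, 5.0]))
```

The guard in `aicc` returns infinity when the trace of the hat matrix approaches the sample size, which is what happens with a bandwidth so small that the smoother reproduces the data.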

Penalized likelihood methods provide a range of practical modelling tools, including spline smoothing, generalized additive models and variants of ridge regression. Selecting the correct weights for penalties is a critical part of using these methods and in the single-penalty case the analyst has several well-founded techniques to choose from. However, many modelling problems suggest a formulation employing multiple penalties, and here general methodology is lacking. A wide family of models with multiple penalties can be fitted to data by iterative solution of the generalized ridge regression problem: minimize $\|W^{1/2}(Xp - y)\|^2 \rho + \sum_{i=1}^{m} \theta_i\, p' S_i p$ ($p$ is a parameter vector, $X$ a design matrix, $S_i$ a non-negative definite coefficient matrix defining the $i$th penalty with associated smoothing parameter $\theta_i$, $W$ a diagonal weight matrix, $y$ a vector of data or pseudodata and $\rho$ an 'overall' smoothing parameter included for computational efficiency). This paper shows how smoothing parameter selection can be performed efficiently by applying generalized cross-validation to this problem and how this allows non-linear, generalized linear and linear models to be fitted using multiple penalties, substantially increasing the scope of penalized modelling methods. Examples of non-linear modelling, generalized additive modelling and anisotropic smoothing are given.
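For fixed smoothing parameters the generalized ridge objective in this abstract is quadratic in $p$, so each step of the iteration has a closed form. The lines below sketch that derivation. They assume that each $S_i$ is symmetric and that the combined matrix is invertible; the abstract states neither condition, and the symbol $Q$ is introduced here only for the illustration.

```latex
\begin{aligned}
Q(p) &= \rho\,(Xp - y)'\,W\,(Xp - y) + \sum_{i=1}^{m} \theta_i\, p' S_i p, \\
\nabla Q(p) &= 2\rho\, X' W (Xp - y) + 2 \sum_{i=1}^{m} \theta_i S_i\, p = 0, \\
\hat p &= \Bigl(\rho\, X' W X + \sum_{i=1}^{m} \theta_i S_i\Bigr)^{-1} \rho\, X' W y .
\end{aligned}
```

The first line uses $\|W^{1/2}(Xp - y)\|^2 = (Xp - y)' W (Xp - y)$, which holds because $W$ is diagonal with non-negative entries.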

A new definition of asymptotic quasi-score sequence of estimating functions is given and studied. The relationship between asymptotic quasi-likelihood and quasi-likelihood estimates is investigated. A new practical approach for obtaining a good estimate of $\theta$ in the model $y_t = f_t(\theta) + m_t$ without any prior knowledge on the nature of $E(m_t^2 \mid \mathcal{F}_{t-1})$ is suggested, where $f_t$ is a predictable process and $m_t$ is a martingale difference process. Two examples are used to show that the approach is practicable.

Non-parametric Regression with Dependent Censored Data   Total citations: 1, self-citations: 0, cited by others: 1
Abstract.  Let $(X_i, Y_i)$ ($i = 1, \dots, n$) be $n$ replications of a random vector $(X, Y)$, where $Y$ is supposed to be subject to random right censoring. The data $(X_i, Y_i)$ are assumed to come from a stationary $\alpha$-mixing process. We consider the problem of estimating the function $m(x) = E(\phi(Y) \mid X = x)$, for some known transformation $\phi$. This problem is approached in the following way: first, we introduce a transformed variable [symbol missing in source] that is not subject to censoring and satisfies the relation [formula missing in source], and then we estimate $m(x)$ by applying local linear regression techniques. As a by-product, we obtain a general result on the uniform rate of convergence of kernel type estimators of functionals of an unknown distribution function, under strong mixing assumptions.
