Similar Literature
 20 similar documents found (search time: 31 ms)
1.
Nonparametric regression techniques have been studied extensively in recent years because of their flexibility. In addition, robust versions of these techniques have become popular and have been incorporated into some of the standard statistical analysis packages. With new techniques comes the responsibility of using them properly and in appropriate situations. Often, as in the case presented here, model-fitting diagnostics such as cross-validation statistics are not available as tools for determining whether the smoothing parameter value being used is preferable to some other, arbitrarily chosen value.
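The cross-validation statistic the abstract mentions is easy to illustrate. The following is a minimal sketch of leave-one-out cross-validation for choosing the bandwidth of a Nadaraya-Watson kernel smoother; the simulated data and the bandwidth grid are illustrative choices, not anything from the paper.

```python
import numpy as np

def nw_smooth(x, y, x0, h):
    # Nadaraya-Watson estimate at the points x0, Gaussian kernel, bandwidth h
    w = np.exp(-0.5 * ((x0[:, None] - x[None, :]) / h) ** 2)
    return (w @ y) / w.sum(axis=1)

def loocv_score(x, y, h):
    # leave-one-out CV: predict each y_i from all the other observations
    n = len(x)
    errs = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i
        errs[i] = y[i] - nw_smooth(x[mask], y[mask], x[i:i + 1], h)[0]
    return float(np.mean(errs ** 2))

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 80))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, 80)
grid = [0.01, 0.05, 0.1, 0.3]          # candidate bandwidths (arbitrary)
scores = [loocv_score(x, y, h) for h in grid]
h_best = grid[int(np.argmin(scores))]  # heavily oversmoothed h = 0.3 loses
```

The CV score penalizes both undersmoothing (noisy leave-one-out predictions) and oversmoothing (biased predictions), so the minimizer is a data-driven compromise.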

2.
Summary.  We develop a general non-parametric approach to the analysis of clustered data via random effects. Assuming only that the link function is known, the regression functions and the distributions of both cluster means and observation errors are treated non-parametrically. Our argument proceeds by viewing the observation error at the cluster mean level as though it were a measurement error in an errors-in-variables problem, and using a deconvolution argument to access the distribution of the cluster mean. A Fourier deconvolution approach could be used if the distribution of the error-in-variables were known. In practice it is unknown, of course, but it can be estimated from repeated measurements, and in this way deconvolution can be achieved in an approximate sense. This argument might be interpreted as implying that large numbers of replicates are necessary for each cluster mean distribution, but that is not so; we avoid this requirement by incorporating statistical smoothing over values of nearby explanatory variables. Empirical rules are developed for the choice of smoothing parameter. Numerical simulations, and an application to real data, demonstrate the small-sample performance of this package of methodology. We also develop theory establishing statistical consistency.

3.
Beta-Bernstein Smoothing for Regression Curves with Compact Support
ABSTRACT. The problem of boundary bias is associated with kernel estimation for regression curves with compact support. This paper proposes a simple and unified approach for remedying boundary bias in non-parametric regression, without dividing the compact support into interior and boundary areas and without explicitly applying different smoothing treatments separately. The approach uses the beta family of density functions as kernels. The shapes of the kernels vary according to the position where the curve estimate is made: they are symmetric at the middle of the support interval, and become more and more asymmetric nearer the boundary points. The kernels never put any weight outside the data support interval, and thus avoid boundary bias. The method is a generalization of classical Bernstein polynomials, one of the earliest methods of statistical smoothing. The proposed estimator has optimal mean integrated squared error of order n^{-4/5}, equivalent to that of standard kernel estimators when the curve has an unbounded support.
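The core device is a kernel whose shape adapts to the estimation point. A minimal sketch of a beta-kernel regression smoother in this spirit (the exact parameterization here, Beta(t/b+1, (1-t)/b+1), follows the common Chen-style convention and is an assumption, not necessarily the paper's):

```python
import numpy as np
from scipy.stats import beta

def beta_kernel_smooth(x, y, x0, b):
    # Beta-kernel regression on [0, 1]: at estimation point t, data are weighted
    # by the Beta(t/b+1, (1-t)/b+1) density. The kernel is supported on [0, 1],
    # symmetric at t = 0.5 and increasingly skewed near the boundaries, so no
    # weight ever falls outside the data support -- avoiding boundary bias.
    est = np.empty(len(x0))
    for k, t in enumerate(x0):
        w = beta.pdf(x, t / b + 1, (1 - t) / b + 1)
        est[k] = np.sum(w * y) / np.sum(w)
    return est

# noise-free demo: estimating y = x^2 right up to both boundaries
x = np.linspace(0, 1, 201)
y = x ** 2
est = beta_kernel_smooth(x, y, np.array([0.0, 0.5, 1.0]), b=0.02)
```

Unlike a symmetric kernel truncated at the boundary, the estimate at t = 0 and t = 1 is not pulled toward the interior.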

4.
The main purpose of this study is to analyze the global and local statistical properties of nonparametric smoothers subject to an a priori fixed length restriction. To do so, we introduce a set of local statistical measures based on the shapes and values of their weighting systems. In this way, the local statistical measures of bias, variance and mean square error are intrinsic to the smoothers and independent of the data to which they will be applied. One major advantage of these statistical measures relative to the classical spectral ones is their ease of calculation; in this paper, however, we use both in a complementary manner. The smoothers studied are based on two broad classes of weight-generating functions: local polynomials and probability distributions. Within the first class we consider the locally weighted regression smoother (loess) of degree 1 and 2 (L1 and L2), the cubic smoothing spline (CSS), and the Henderson smoothing linear filter (H); within the second class, the Gaussian kernel (GK). The weighting systems of these estimators depend on a smoothing parameter that is traditionally estimated by data-dependent optimization criteria. However, by imposing on all of them the condition of an equal number of weights, it will be shown that some of their optimal statistical properties are no longer valid. Without loss of generality, the analysis is carried out for 13- and 9-term lengths, because these are the lengths most often selected for the Henderson filters in the context of monthly time series decomposition. We would like to thank an Associate Editor and an anonymous referee for their valuable comments on an earlier version of this paper. Financing from MURST is also gratefully acknowledged.

5.
Cross-validation has been widely used in the context of statistical linear models and multivariate data analysis. Recent technological advances have made it possible to collect new types of data that come in the form of curves. Statistical procedures for analysing these data, which are of infinite dimension, are provided by functional data analysis. In functional linear regression, estimation of the slope and intercept parameters using statistical smoothing is generally based on functional principal components analysis (FPCA), which allows a finite-dimensional analysis of the problem. The estimators of the slope and intercept parameters in this context, proposed by Hall and Hosseini-Nasab [On properties of functional principal components analysis, J. R. Stat. Soc. Ser. B: Stat. Methodol. 68 (2006), pp. 109–126], are based on FPCA and depend on a smoothing parameter that can be chosen by cross-validation. The cross-validation criterion given there is time-consuming and hard to compute. In this work, we approximate that criterion by another one which, in a sense, reduces the problem to a multivariate data analysis, and we evaluate its performance numerically. We also treat a real dataset consisting of two variables, temperature and amount of precipitation, and estimate the regression coefficients of the former in a model predicting the latter.
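The finite-dimensional reduction via FPCA that the abstract relies on can be sketched for densely observed curves: eigendecompose the sample covariance matrix, keep the leading eigenvectors as principal component functions, and project. This is a generic illustration of FPCA, not the Hall and Hosseini-Nasab estimator itself; the simulated two-basis curves are an assumption for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 50)
# simulated curves: random combinations of two basis functions plus small noise
scores_true = rng.normal(size=(40, 2))
basis = np.vstack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
curves = scores_true @ basis + 0.01 * rng.normal(size=(40, 50))

def fpca(curves, n_comp):
    # FPCA for densely observed curves (rows = curves, columns = time points):
    # the leading eigenvectors of the sample covariance are the principal
    # component functions; projections of centered curves give the scores
    mean = curves.mean(axis=0)
    centered = curves - mean
    cov = centered.T @ centered / curves.shape[0]
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1]
    comps = vecs[:, order[:n_comp]].T   # (n_comp, n_timepoints)
    scores = centered @ comps.T         # (n_curves, n_comp)
    return mean, comps, scores, vals[order]

mean, comps, scores, eigvals = fpca(curves, 2)
recon = mean + scores @ comps                 # rank-2 reconstruction
frac = eigvals[:2].sum() / eigvals.sum()      # variance explained by 2 components
```

With two true basis functions, two components capture essentially all the variation, which is exactly the dimension reduction that makes cross-validation over the truncation level feasible.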

6.
This paper describes inference methods for functional data under the assumption that the functional data of interest are smooth latent functions, characterized by a Gaussian process, which have been observed with noise over a finite set of time points. The methods we propose are completely specified in a Bayesian environment that allows all inferences to be performed through a simple Gibbs sampler. Our main focus is on estimating and describing uncertainty in the covariance function. However, these models also encompass functional data estimation, functional regression where the predictors are latent functions, and an automatic approach to smoothing parameter selection. Furthermore, these models require minimal assumptions on the data structure, as the time points for observations do not need to be equally spaced, the number and placement of observations are allowed to vary among functions, and special treatment is not required when the number of functional observations is less than the dimensionality of those observations. We illustrate the effectiveness of these models in estimating latent functional data, capturing variation in the functional covariance estimate, and selecting appropriate smoothing parameters in both a simulation study and a regression analysis of medfly fertility data.
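The smoothing performed by such models has a closed form at its core: the posterior mean of a latent function under a Gaussian-process prior with iid observation noise. The sketch below shows only that conditional-mean step (the paper's full Gibbs sampler over the covariance function is not reproduced); the squared-exponential kernel and the parameter values are illustrative assumptions.

```python
import numpy as np

def gp_posterior_mean(t, y, sigma2, length):
    # posterior mean of a latent smooth function f given y = f + noise, under a
    # squared-exponential GP prior; the length-scale plays the role of the
    # smoothing parameter that the model can select automatically
    K = np.exp(-0.5 * ((t[:, None] - t[None, :]) / length) ** 2)
    return K @ np.linalg.solve(K + sigma2 * np.eye(len(t)), y)

rng = np.random.default_rng(3)
t = np.linspace(0, 1, 100)       # time points need not be equally spaced
truth = np.sin(2 * np.pi * t)
y = truth + rng.normal(0, 0.3, 100)
fit = gp_posterior_mean(t, y, sigma2=0.09, length=0.1)
mse_fit = float(np.mean((fit - truth) ** 2))
mse_raw = float(np.mean((y - truth) ** 2))
```

In the fully Bayesian version, sigma2 and the covariance (here fixed) would themselves be sampled, which is what yields uncertainty statements about the covariance function.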

7.
Neuroimaging studies aim to analyze imaging data with complex spatial patterns in a large number of locations (called voxels) on a two-dimensional (2D) surface or in a 3D volume. Conventional analyses of imaging data include two sequential steps: spatially smoothing the imaging data and then independently fitting a statistical model at each voxel. However, conventional analyses suffer from applying the same amount of smoothing throughout the whole image, from the arbitrary choice of smoothing extent, and from low statistical power in detecting spatial patterns. We propose a multiscale adaptive regression model (MARM) to integrate the propagation-separation (PS) approach (Polzehl and Spokoiny, 2000, 2006) with statistical modeling at each voxel for spatial and adaptive analysis of neuroimaging data from multiple subjects. MARM has three features: it is spatial, hierarchical, and adaptive. We use a multiscale adaptive estimation and testing procedure (MAET) that utilizes imaging observations from the neighboring voxels of the current voxel to adaptively calculate parameter estimates and test statistics. Theoretically, we establish consistency and asymptotic normality of the adaptive parameter estimates and the asymptotic distribution of the adaptive test statistics. Our simulation studies and real data analysis confirm that MARM significantly outperforms conventional analyses of imaging data.

8.
Summary The Value-at-Risk calculation reduces the dimensionality of the risk-factor space. The main reasons for such simplifications are technical efficiency and the logical and statistical appropriateness of the model. In Chapter 2 we present three simple mappings: the mapping on the market index, the principal components model, and the model with equally correlated risk factors. The comparison of these models in Chapter 3 is based on the literature on the verification of weather forecasts (Murphy and Winkler, 1992; Murphy, 1997). Some considerations on the quantitative analysis are presented in the fourth chapter. In the last chapter, we present an empirical analysis of the DAX data using XploRe. We acknowledge the support of Deutsche Forschungsgemeinschaft, Sonderforschungsbereich 649 “Economic Risk”, MSM 0021620839 and 1K04018.

9.
Forecasting in economic data analysis is dominated by linear prediction methods, in which predicted values are calculated from a fitted linear regression model. With multiple predictor variables, multivariate nonparametric models have been proposed in the literature. However, empirical studies indicate that the prediction performance of multi-dimensional nonparametric models may be unsatisfactory. We propose a new semiparametric model average prediction (SMAP) approach for analysing panel data and investigate its prediction performance with numerical examples. Estimation of each individual covariate effect requires only univariate smoothing, and thus may be more stable than previous multivariate smoothing approaches. The estimation of the optimal weight parameters incorporates the longitudinal correlation, and the asymptotic properties of the estimated results are carefully studied in this paper.
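The basic model-averaging idea, one univariate smooth per covariate combined with estimated weights, can be sketched as below. This is only a schematic of the idea for cross-sectional data; the paper's SMAP estimator additionally incorporates longitudinal correlation into the weight estimation, which this sketch omits.

```python
import numpy as np

def nw(x, y, x0, h):
    # univariate Nadaraya-Watson smooth, Gaussian kernel
    w = np.exp(-0.5 * ((x0[:, None] - x[None, :]) / h) ** 2)
    return (w @ y) / w.sum(axis=1)

rng = np.random.default_rng(4)
n = 200
x1, x2 = rng.uniform(0, 1, n), rng.uniform(0, 1, n)
y = np.sin(2 * np.pi * x1) + 0.5 * x2 + rng.normal(0, 0.2, n)

# one univariate smooth per covariate -- only 1-D smoothing is ever needed --
# then least-squares weights to combine the candidate fits
m1 = nw(x1, y, x1, 0.05)
m2 = nw(x2, y, x2, 0.05)
M = np.column_stack([m1, m2])
w_opt = np.linalg.lstsq(M, y, rcond=None)[0]
comb = M @ w_opt
mse_comb = float(np.mean((y - comb) ** 2))
mse1 = float(np.mean((y - m1) ** 2))
mse2 = float(np.mean((y - m2) ** 2))
```

Because each candidate fit is itself in the feasible set of the weight optimization, the combined fit can never do worse in-sample than any single univariate smooth.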

10.
An important problem in fitting local linear regression is the choice of the smoothing parameter. As the smoothing parameter becomes large, the estimator tends to a straight line, which is the least squares fit in the ordinary linear regression setting. This property may be used to assess the adequacy of a simple linear model. Motivated by Silverman's (1981) work in kernel density estimation, a suitable test statistic is the critical smoothing parameter at which the estimate changes from nonlinear to linear, where linearity or non-linearity is judged via the approximate F-tests of Hastie and Tibshirani (1990), which we use to define the critical smoothing parameter. To assess significance, the “wild bootstrap” procedure is used to replicate the data, and the proportion of bootstrap samples that give a nonlinear estimate at the critical bandwidth is taken as the p-value. Simulation results show that the critical smoothing test is useful in detecting a wide range of alternatives.
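The limiting behaviour the test exploits is easy to verify numerically: as the bandwidth grows, a local linear fit collapses onto the ordinary least-squares line. The sketch below demonstrates only that limit with Gaussian weights (the F-test and wild-bootstrap machinery of the paper are not implemented); all data and bandwidth values are illustrative.

```python
import numpy as np

def local_linear(x, y, x0, h):
    # local linear regression: weighted least-squares line at each point t,
    # with Gaussian kernel weights of bandwidth h; the intercept of the
    # locally fitted line is the estimate at t
    est = np.empty(len(x0))
    for k, t in enumerate(x0):
        w = np.exp(-0.5 * ((x - t) / h) ** 2)
        X = np.column_stack([np.ones_like(x), x - t])
        XtW = X.T * w
        est[k] = np.linalg.solve(XtW @ X, XtW @ y)[0]
    return est

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 60)
y = np.sin(3 * x) + rng.normal(0, 0.1, 60)

# ordinary least-squares straight line, the large-bandwidth limit
A = np.column_stack([np.ones_like(x), x])
ols = A @ np.linalg.lstsq(A, y, rcond=None)[0]

fit_small = local_linear(x, y, x, 0.05)   # wiggly: follows sin(3x)
fit_large = local_linear(x, y, x, 100.0)  # indistinguishable from the OLS line
```

The critical bandwidth of the test is, informally, the value of h at which the fit crosses from the first regime into the second.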

11.
This paper considers linear and nonlinear regression with a response variable that is allowed to be “missing at random”. The only structural assumptions on the distribution of the variables are that the errors have mean zero and are independent of the covariates. The independence assumption is important: it enables us to construct an estimator for the response density that uses all the observed data, in contrast to the usual local smoothing techniques, and which therefore permits a faster rate of convergence. The idea is to write the response density as a convolution integral, which can be estimated by an empirical version with a weighted residual-based kernel estimator plugged in for the error density. For an appropriate class of regression functions, and a suitably chosen bandwidth, this estimator is consistent and converges at the optimal parametric rate n^{1/2}. Moreover, the estimator is proved to be efficient (in the sense of Hájek and Le Cam) if an efficient estimator is used for the regression parameter.
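The convolution idea can be sketched in a few lines for fully observed data with a linear regression function: estimate the response density by averaging a residual-based kernel density over all fitted values. This is a simplified illustration of the construction, not the paper's exact (weighted, missing-data) estimator.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
x = rng.uniform(0, 1, n)
y = 1.0 + 2.0 * x + rng.normal(0, 0.5, n)

# fit the regression (linear here for simplicity) and form residuals
A = np.column_stack([np.ones_like(x), x])
beta_hat = np.linalg.lstsq(A, y, rcond=None)[0]
m_hat = A @ beta_hat
resid = y - m_hat

# convolution-type density estimate: f_hat(t) = average over all n^2 pairs of
# K_h(t - m_hat(x_i) - resid_j), i.e. the empirical law of m_hat(X) convolved
# with a kernel estimate of the error density
h = 0.3
y_grid = np.linspace(-2.0, 7.0, 400)
centers = (m_hat[:, None] + resid[None, :]).ravel()
d = (y_grid[:, None] - centers[None, :]) / h
dens = np.exp(-0.5 * d ** 2).sum(axis=1) / (centers.size * h * np.sqrt(2 * np.pi))
total = float(dens.sum() * (y_grid[1] - y_grid[0]))
```

Because every residual is combined with every fitted value, the estimator pools information globally rather than smoothing locally in y, which is what permits the parametric rate.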

12.
We develop Bayesian models for density regression with emphasis on discrete outcomes. The problem of density regression is approached by considering methods for multivariate density estimation of mixed scale variables, and obtaining conditional densities from the multivariate ones. The approach to multivariate mixed scale outcome density estimation that we describe represents discrete variables, either responses or covariates, as discretised versions of continuous latent variables. We present and compare several models for obtaining these thresholds in the challenging context of count data analysis where the response may be over‐ and/or under‐dispersed in some of the regions of the covariate space. We utilise a nonparametric mixture of multivariate Gaussians to model the directly observed and the latent continuous variables. The paper presents a Markov chain Monte Carlo algorithm for posterior sampling, sufficient conditions for weak consistency, and illustrations on density, mean and quantile regression utilising simulated and real datasets.

13.
Nonparametric curve estimation is an extremely common statistical procedure. While its primary purpose has been exploratory, some advances in inference have been made. This paper provides a critical review of inferential tests that make fundamental use of a key element of nonparametric smoothing, the bandwidth, to determine the significance of certain features. A major focus is on two important problems that have been tackled using bandwidth-based inference: testing for the multimodality of a density and testing for the monotonicity of a regression curve. Early research in bandwidth-based inference is surveyed, as well as recent theoretical advances. Possible future directions in bandwidth-based inference are discussed.
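The canonical example of bandwidth-based inference surveyed here is Silverman's critical bandwidth for multimodality: the smallest bandwidth at which a Gaussian kernel density estimate has at most k modes. A minimal sketch (the simulated bimodal data, the grid resolution, and the bisection bounds are illustrative assumptions):

```python
import numpy as np

def n_modes(data, h, grid):
    # number of strict local maxima of a Gaussian KDE with bandwidth h
    d = (grid[:, None] - data[None, :]) / h
    f = np.exp(-0.5 * d ** 2).sum(axis=1)
    return int(np.sum((f[1:-1] > f[:-2]) & (f[1:-1] > f[2:])))

def critical_bandwidth(data, k=1, lo=0.01, hi=5.0, iters=40):
    # Silverman's h_crit: smallest h at which the KDE has at most k modes.
    # For the Gaussian kernel the mode count is nonincreasing in h, so a
    # simple bisection between lo (too wiggly) and hi (oversmoothed) works.
    grid = np.linspace(data.min() - 3 * hi, data.max() + 3 * hi, 1000)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if n_modes(data, mid, grid) <= k:
            hi = mid
        else:
            lo = mid
    return hi

rng = np.random.default_rng(2)
bimodal = np.concatenate([rng.normal(-3, 0.5, 60), rng.normal(3, 0.5, 60)])
h1 = critical_bandwidth(bimodal, k=1)  # large: much smoothing needed to merge modes
h2 = critical_bandwidth(bimodal, k=2)  # small: two modes appear readily
```

A large value of h_crit for unimodality is evidence against a unimodal density; in the full test, its significance is calibrated by bootstrap resampling.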

14.
The field of nonparametric function estimation has broadened its appeal in recent years with an array of new tools for statistical analysis. In particular, theoretical and applied research on wavelets has had a noticeable influence on statistical topics such as nonparametric regression, nonparametric density estimation, nonparametric discrimination and many other related topics. This is a survey article that attempts to synthesize a broad variety of work on wavelets in statistics and includes some recent developments in nonparametric curve estimation that have been omitted from review articles and books on the subject. After a short introduction to wavelet theory, wavelets are treated in the familiar context of estimation of “smooth” functions. Both “linear” and “nonlinear” wavelet estimation methods are discussed, and cross-validation methods for choosing the smoothing parameters are addressed. Finally, some areas of related research are mentioned, such as hypothesis testing, model selection, hazard rate estimation for censored data, and nonparametric change-point problems. The closing section formulates some promising research directions relating to wavelets in statistics.

15.
The density function is a fundamental concept in data analysis. Non-parametric methods, including the kernel smoothing estimate, are available when the data are completely observed. However, in studies such as diagnostic studies following a two-stage design, the membership of some subjects may be missing. Simply ignoring subjects with unknown membership is valid only in the missing-completely-at-random (MCAR) situation. In this paper, we consider kernel smoothing estimates of the density functions, using inverse probability approaches to address the missing values. We illustrate the approaches with simulation studies and real data from a mental health study.
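The inverse probability idea can be sketched directly: weight each completely observed subject by the reciprocal of its observation probability inside the kernel density estimate. A minimal illustration with known observation probabilities (in practice these would be estimated; the logistic selection mechanism below is an assumption for the demo):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 2000
x = rng.normal(0, 1, n)
pi = 1.0 / (1.0 + np.exp(-x))        # observation probability rises with x
observed = rng.uniform(size=n) < pi  # selective missingness (not MCAR)
x_obs, pi_obs = x[observed], pi[observed]

def ipw_kde(grid, x_obs, pi_obs, h):
    # Gaussian KDE with inverse-probability weights: each observed subject
    # counts 1/pi_i, compensating for the selective missingness
    w = 1.0 / pi_obs
    d = (grid[:, None] - x_obs[None, :]) / h
    k = np.exp(-0.5 * d ** 2) / (h * np.sqrt(2 * np.pi))
    return (k * w[None, :]).sum(axis=1) / w.sum()

grid = np.linspace(-5, 5, 500)
dens = ipw_kde(grid, x_obs, pi_obs, h=0.25)

# the weighting also corrects simple summaries of the biased sample:
naive_mean = float(x_obs.mean())                                  # biased upward
ipw_mean = float(np.sum(x_obs / pi_obs) / np.sum(1.0 / pi_obs))   # near zero
```

Ignoring the weights here would reproduce the complete-case estimate, which is biased because missingness depends on x.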

16.
Summary Nonparametric models have become more and more popular over the last two decades. One reason for their popularity is the availability of software that makes it easy to fit smooth but otherwise unspecified functions to data. A benefit of these models is that the functional shape of a regression function is not prespecified in advance but determined by the data. Clearly this allows for more insight, which can be interpreted at the subject-matter level. This paper gives an overview of available fitting routines, commonly called smoothing procedures. Moreover, a number of extensions to classical scatterplot smoothing are discussed, with examples illustrating the advantages of the routines.

17.
National statistical agencies and other data custodians collect and hold a vast amount of survey and census data, containing information vital for research and policy analysis. However, the problem of allowing analysis of these data while protecting respondent confidentiality has proved challenging to address. In this paper we focus on the remote analysis approach, under which a confidential dataset is held in a secure environment under the direct control of the data custodian agency. A computer system within the secure environment accepts a query from an analyst, runs it on the data, and then returns the results to the analyst. In particular, the analyst does not have direct access to the data at all, and cannot view any microdata records. We further focus on the fitting of linear regression models to confidential data in the presence of outliers and influential points, such as are often present in business data. We propose a new method for protecting confidentiality in linear regression via a remote analysis system that provides additional confidentiality protection for outliers and influential points in the data. The method we describe in this paper was designed for the prototype DataAnalyser system developed by the Australian Bureau of Statistics; however, the method would be suitable for similar remote analysis systems.

18.
Both kriging and non-parametric regression smoothing can model a non-stationary regression function with spatially correlated errors. However, comparisons have mainly been based on ordinary kriging and on smoothing with uncorrelated errors. Ordinary kriging attributes smoothness of the response to spatial autocorrelation, whereas non-parametric regression attributes trends to a smooth regression function. For spatial processes it is reasonable to suppose that the response is due to both trend and autocorrelation. This paper reviews methodology for non-parametric regression with autocorrelated errors, which is a natural compromise between the two methods. Re-analysis of the one-dimensional stationary spatial data of Laslett (1994) and of a clearly non-stationary time series demonstrates the rather surprising result that, for these data, ordinary kriging outperforms more computationally intensive models, including both universal kriging and correlated splines, for spatial prediction. For estimating the regression function, non-parametric regression provides adaptive estimation, but the autocorrelation must be accounted for when selecting the smoothing parameter.

19.
We propose a flexible semiparametric stochastic mixed effects model for bivariate cyclic longitudinal data. The model can handle either single cycle or, more generally, multiple consecutive cycle data. The approach models the mean of responses by parametric fixed effects and a smooth nonparametric function for the underlying time effects, and the relationship across the bivariate responses by a bivariate Gaussian random field and a joint distribution of random effects. The proposed model not only can model complicated individual profiles, but also allows for more flexible within-subject and between-response correlations. The fixed effects regression coefficients and the nonparametric time functions are estimated using maximum penalized likelihood, where the resulting estimator for the nonparametric time function is a cubic smoothing spline. The smoothing parameters and variance components are estimated simultaneously using restricted maximum likelihood. Simulation results show that the parameter estimates are close to the true values. The fit of the proposed model on a real bivariate longitudinal dataset of pre-menopausal women also performs well, both for a single cycle analysis and for a multiple consecutive cycle analysis. The Canadian Journal of Statistics 48: 471–498; 2020 © 2020 Statistical Society of Canada

20.
We propose an exploratory data analysis approach when data are observed as intervals in a nonparametric regression setting. The interval-valued data contain richer information than single-valued data in the sense that they provide both center and range information of the underlying structure. Conventionally, these two attributes have been studied separately as traditional tools can be readily used for single-valued data analysis. We propose a unified data analysis tool that attempts to capture the relationship between response and covariate by simultaneously accounting for variability present in the data. It utilizes a kernel smoothing approach, which is conducted in scale-space so that it considers a wide range of smoothing parameters rather than selecting an optimal value. It also visually summarizes the significance of trends in the data as a color map across multiple locations and scales. We demonstrate its effectiveness as an exploratory data analysis tool for interval-valued data using simulated and real examples.
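The scale-space device, computing the smooth for a whole family of bandwidths instead of one "optimal" value, can be sketched for single-valued data as below (the interval-valued extension and the significance color map of the paper are not reproduced; data and bandwidth grid are illustrative).

```python
import numpy as np

def scale_space_smooths(x, y, x_grid, bandwidths):
    # family of Gaussian-kernel smooths, one per bandwidth: the scale-space
    # view inspects all scales at once rather than selecting a single one
    fits = np.empty((len(bandwidths), len(x_grid)))
    for i, h in enumerate(bandwidths):
        w = np.exp(-0.5 * ((x_grid[:, None] - x[None, :]) / h) ** 2)
        fits[i] = (w @ y) / w.sum(axis=1)
    return fits

rng = np.random.default_rng(7)
x = np.sort(rng.uniform(0, 1, 150))
y = np.sin(4 * np.pi * x) + rng.normal(0, 0.3, 150)
x_grid = np.linspace(0.05, 0.95, 50)
bandwidths = [0.02, 0.05, 0.1, 0.3]
fits = scale_space_smooths(x, y, x_grid, bandwidths)
# small bandwidths preserve the oscillation; large ones flatten it away
```

Features that persist across many rows of `fits` are the ones a SiZer-style significance map would flag as real structure rather than noise.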


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号