Similar articles
 20 similar articles found
1.
The paper proposes a cross-validation method to address the question of specification search in a multiple nonlinear quantile regression framework. Linear parametric, spline-based partially linear and kernel-based fully nonparametric specifications are contrasted as competitors using cross-validated weighted L1-norm based goodness-of-fit and prediction error criteria. The aim is to provide a fair comparison with respect to estimation accuracy and/or predictive ability for different semi- and nonparametric specification paradigms. This is challenging as the model dimension cannot be estimated for all competitors and the meta-parameters such as kernel bandwidths, spline knot numbers and polynomial degrees are difficult to compare. General issues of specification comparability and automated data-driven meta-parameter selection are discussed. The proposed method further allows us to assess the balance between fit and model complexity. An extensive Monte Carlo study and an application to a well-known data set provide empirical illustration of the method.

2.
This paper is concerned with a semiparametric partially linear regression model with unknown regression coefficients, an unknown nonparametric function for the non-linear component, and unobservable Gaussian distributed random errors. We present a wavelet thresholding based estimation procedure to estimate the components of the partially linear model by establishing a connection between an l1-penalty based wavelet estimator of the nonparametric component and Huber’s M-estimation of a standard linear model with outliers. Some general results on the large sample properties of the estimates of both the parametric and the nonparametric part of the model are established. Simulations are used to illustrate the general results and to compare the proposed methodology with other methods available in the recent literature.

3.
A number of efficient computer codes are available for the simple linear L1 regression problem. However, a number of these codes can be made more efficient by utilizing the least squares solution. In fact, a couple of available computer programs already do so.

We report the results of a computational study comparing several openly available computer programs for solving the simple linear L1 regression problem with and without computing and utilizing a least squares solution.

4.
Recently, Kokonendji et al. adapted the well-known Nadaraya–Watson kernel estimator to estimate the count function m in the context of nonparametric discrete regression. The authors also investigated bandwidth selection using the cross-validation method. In this article, we propose a Bayesian approach in the context of nonparametric count regression for estimating the bandwidth and the variance of the model error, which was not estimated in Kokonendji et al. The model error is assumed to be Gaussian with mean zero and variance σ^2. The Bayes estimates cannot be obtained in closed form, so we use the well-known Markov chain Monte Carlo (MCMC) technique to compute them under the squared error loss function. The performance of the proposed approach and of the cross-validation method is compared through simulation and real count data.

5.
In this paper, the regression model with a nonnegativity constraint on the dependent variable is considered. Under weak conditions, L1 estimates of the regression coefficients are shown to be consistent.

6.
The Dirichlet process prior allows flexible nonparametric mixture modeling. The number of mixture components is not specified in advance and can grow as new data arrive. However, analyses based on the Dirichlet process prior are sensitive to the choice of the parameters, including an infinite-dimensional distributional parameter G_0. Most previous applications have either fixed G_0 as a member of a parametric family or treated G_0 in a Bayesian fashion, using parametric prior specifications. In contrast, we have developed an adaptive nonparametric method for constructing smooth estimates of G_0. We combine this method with a technique for estimating α, the other Dirichlet process parameter, that is inspired by an existing characterization of its maximum-likelihood estimator. Together, these estimation procedures yield a flexible empirical Bayes treatment of Dirichlet process mixtures. Such a treatment is useful in situations where smooth point estimates of G_0 are of intrinsic interest, or where the structure of G_0 cannot be conveniently modeled with the usual parametric prior families. Analysis of simulated and real-world datasets illustrates the robustness of this approach.

7.
Nonparametric regression can be considered as a problem of model choice. In this article, we present the results of a simulation study in which several nonparametric regression techniques, including wavelets and kernel methods, are compared with respect to their behavior on different test beds. We also include the taut-string method, whose aim is not to minimize the distance of an estimator to some “true” generating function f but to provide a simple adequate approximation to the data. Test beds are situations where a “true” generating f exists, so that it is possible to compare the estimates of f with f itself. The measures of performance we use are the L2- and L∞-norms and the ability to identify peaks.

8.
In this paper we define a finite mixture of quantile and M-quantile regression models for heterogeneous and/or dependent/clustered data. Components of the finite mixture represent clusters of individuals with homogeneous values of model parameters. Owing to their flexibility and ease of estimation, the proposed approaches can be extended to random coefficients with a higher dimension than the simple random intercept case. Estimation of model parameters is obtained through maximum likelihood, by implementing an EM-type algorithm. The standard error estimates for model parameters are obtained using the inverse of the observed information matrix, derived through the Oakes (J R Stat Soc Ser B 61:479–482, 1999) formula in the M-quantile setting, and through nonparametric bootstrap in the quantile case. We present a large scale simulation study to analyse the practical behaviour of the proposed model and to evaluate the empirical performance of the proposed standard error estimates for model parameters. We consider a variety of empirical settings in both the random intercept and the random coefficient case. The proposed modelling approaches are also applied to two well-known datasets, which give further insights into their empirical behaviour.

9.
A novel approach to quantile estimation in multivariate linear regression models with change-points is proposed: the change-point detection and the model estimation are both performed automatically, by adopting either the quantile-fused penalty or the adaptive version of the quantile-fused penalty. These two methods combine the idea of the check function used for quantile estimation with the L1 penalization principle known from signal processing. Unlike some standard approaches, the presented methods go beyond the assumptions usually required for the model errors, such as a sub-Gaussian or normal distribution. They can effectively handle heavy-tailed random error distributions and, in general, they offer a more complete view of the data, as one can obtain any conditional quantile of the target distribution, not just the conditional mean. The consistency of detection is proved and proper convergence rates for the parameter estimates are derived. The empirical performance is investigated via an extensive comparative simulation study and practical utilization is demonstrated using a real data example.
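As a rough illustration of how a check function and an L1 penalty on successive coefficient differences can be combined, the following Python sketch evaluates such an objective. It is not the authors' formulation: the scalar regressor, the function names `check_loss` and `fused_quantile_objective`, and the tuning parameter `lam` are assumptions made here for illustration only.

```python
def check_loss(u, tau):
    """Quantile check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def fused_quantile_objective(y, x, betas, tau, lam):
    """Check-loss fit plus an L1 penalty on successive coefficient differences.

    y[i], x[i] : response and (scalar, for simplicity) regressor of observation i
    betas[i]   : slope allowed for observation i; a change-point is a jump in betas
    lam        : hypothetical non-negative tuning parameter
    """
    fit = sum(check_loss(yi - bi * xi, tau) for yi, xi, bi in zip(y, x, betas))
    # Penalizing |beta_i - beta_{i-1}| favours piecewise-constant coefficients.
    penalty = sum(abs(b1 - b0) for b0, b1 in zip(betas, betas[1:]))
    return fit + lam * penalty
```

In this sketch, a larger `lam` makes jumps in `betas` more expensive and so favours fewer change-points, while `tau` selects which conditional quantile is targeted.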

10.
We consider nonparametric estimation based on interval-censored competing risks data with masked failure cause. The generalized maximum likelihood estimator of the joint survival function of the failure time and the failure cause is studied under mixed case interval censorship and random partition masking. Strong consistency in the L1(μ)-topology is established for some finite measure μ which is derived from the joint censoring and masking distribution. Under additional regularity assumptions we also establish strong consistency in the topologies of weak convergence, point-wise convergence, and uniform convergence.

11.
L1-type regularization provides a useful tool for variable selection in high-dimensional regression modeling. Various algorithms have been proposed to solve optimization problems for L1-type regularization. In particular, the coordinate descent algorithm has been shown to be effective in sparse regression modeling. Although the algorithm performs remarkably well on optimization problems for L1-type regularization, it suffers from outliers, since the procedure is based on the inner product of predictor variables and partial residuals obtained in a non-robust manner. To overcome this drawback, we propose a robust coordinate descent algorithm, focusing especially on high-dimensional regression modeling based on the principal components space. We show that the proposed robust algorithm converges to the minimum value of its objective function. Monte Carlo experiments and real data analysis are conducted to examine the efficiency of the proposed robust algorithm. We observe that our robust coordinate descent algorithm performs effectively for high-dimensional regression modeling even in the presence of outliers.

12.
Nonparametric density function estimation using sample observations which are contaminated with random noise is studied. The particular form of contamination under consideration is Y = X + Z, where Y is an observable random variable, Z is a random noise variable with known distribution, and X is an absolutely continuous random variable which cannot be observed directly. The finite sample size performance of a strongly consistent estimator for the density function of the random variable X is illustrated for different distributions. The estimator uses Fourier and kernel function estimation techniques and allows the user to choose constants which relate to bandwidth windows and limits on integration and which greatly affect the appearance and properties of the estimates. Numerical techniques for computation of the estimated densities and for optimal selection of the constants are given.

13.
This article describes a recursive nonparametric estimator of the local partial first derivative of an arbitrary function satisfying some regularity conditions, and establishes its consistency and asymptotic normality under the assumption of a strong mixing sequence. The proposed estimator is a variable window width version of the Watson-Nadaraya type of derivative estimator. Varying the window width as more data points become available enables a recursive algorithm that reduces computational complexity from order N^3, normally required by batch methods for kernel regression, to order N^2. This approach is computationally simple and attractive from a practical viewpoint, especially when the situation calls for frequent updating of first derivative estimates. For example, maintaining a delta-hedged position of a portfolio of equities with index options is one of many applications of such estimation.

14.
The problem addressed is that of smoothing parameter selection in kernel nonparametric regression in the fixed design regression model with dependent noise. An asymptotic expression for the optimum bandwidth parameter has been obtained in recent studies, where it takes the form h = C_0 n^{-1/5}. This paper proposes to use a plug-in methodology in order to obtain an optimum estimate of the bandwidth parameter, through preliminary estimation of the unknown value of C_0.
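To make the asymptotic bandwidth form concrete, here is a minimal Python sketch that evaluates h = C_0 n^{-1/5} once a preliminary estimate of C_0 is available. The function name `plug_in_bandwidth` and the argument `c0_hat` are hypothetical; how C_0 is estimated under dependent noise is the subject of the paper and is not reproduced here.

```python
def plug_in_bandwidth(c0_hat, n):
    """Evaluate h = c0_hat * n**(-1/5) for sample size n.

    c0_hat is assumed to be a preliminary estimate of the unknown constant C_0,
    obtained by some separate procedure.
    """
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return c0_hat * n ** (-1.0 / 5.0)
```

Because of the n^{-1/5} rate, the bandwidth shrinks slowly: with the constant held fixed, the sample size has to grow 32-fold for the bandwidth to halve.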

15.
This study concerns semiparametric approaches to estimating discrete multivariate count regression functions. The semiparametric approaches investigated combine discrete multivariate nonparametric kernel estimation and parametric estimation such that (i) prior knowledge of the conditional distribution of the model response may be incorporated and (ii) the bias of the traditional nonparametric Nadaraya-Watson kernel regression estimator may be reduced. We are specifically interested in the combination of the two estimation approaches and in some asymptotic properties of the resulting estimators. Asymptotic normality results are shown for the nonparametric correction terms of the parametric start function of the estimators. The performance of the discrete semiparametric multivariate kernel estimators studied is illustrated using simulations and real count data. In addition, diagnostic checks are performed to test the adequacy of the parametric start model to the true discrete regression model. Finally, using discrete semiparametric multivariate kernel estimators provides a bias reduction when the parametric multivariate regression model used as the start regression function belongs to a neighborhood of the true regression model.

16.
This article introduces a method of nonparametric bivariate density estimation based on a bivariate sample level crossing function, which leads to the construction of a bivariate level crossing empirical distribution function (BLCEDF). An efficiency function for this BLCEDF relative to the classical empirical distribution function (EDF) is derived. The BLCEDF gives more efficient estimates than the EDF in the tails of any underlying continuous distribution, for both small and large sample sizes. On the basis of the BLCEDF we define a bivariate level crossing kernel density estimator (BLCKDE) and study its properties. We apply the BLCEDF and BLCKDE to various distributions and provide results of simulations that confirm the theoretical properties. A real-world example is given.

17.
In healthcare studies, count data sets measured with covariates often exhibit heterogeneity and contain extreme values. To analyse such count data sets, we use a finite mixture of regression models framework and investigate a robust estimation approach, called the L2E [D.W. Scott, On fitting and adapting of density estimates, Comput. Sci. Stat. 30 (1998), pp. 124–133], to estimate the parameters. The L2E is based on an integrated L2 distance between parametric conditional and true conditional mass functions. In addition to studying the theoretical properties of the L2E estimator, we compare the performance of L2E with the maximum likelihood (ML) estimator and a minimum Hellinger distance (MHD) estimator via Monte Carlo simulations for correctly specified and gross-error contaminated mixture of Poisson regression models. These show that the L2E is a viable robust alternative to the ML and MHD estimators. More importantly, we use the L2E to perform a comprehensive analysis of a Western Australia hospital inpatient obstetrical length of stay (LOS) (in days) data set that contains extreme values. It is shown that the L2E provides a two-component Poisson mixture regression fit to the LOS data which is better than those based on the ML and MHD estimators. The L2E fit identifies admission type as a significant covariate that profiles the predominant subpopulation of normal-stayers as planned patients and the small subpopulation of long-stayers as emergency patients.

18.
Estimation of regression functions from independent and identically distributed data is considered. The L2 error with integration with respect to the design measure is used as an error criterion. Usually in the analysis of the rate of convergence of estimates a boundedness assumption on the explanatory variable X is made besides smoothness assumptions on the regression function and moment conditions on the response variable Y. In this article we consider the kernel estimate and show that by replacing the boundedness assumption on X by a proper moment condition the same (optimal) rate of convergence can be shown as for bounded data. This answers Question 1 in Stone [1982. Optimal global rates of convergence for nonparametric regression. Ann. Statist., 10, 1040–1053].

19.
The large nonparametric model in this note is a statistical model with the family ℱ of all continuous and strictly increasing distribution functions. In the abundant literature on the subject, there are many proposals for nonparametric estimators that are applicable in the model. Typically the kth order statistic X_{k:n} is taken as the simplest estimator, with k = [nq], or k = [(n + 1)q], or k = [nq] + 1, etc. Often a linear combination of two consecutive order statistics is considered. In more sophisticated constructions, different L-statistics (e.g., Harrell–Davis, Kaigh–Lachenbruch, Bernstein, kernel estimators) are proposed. Asymptotically the estimators do not differ substantially, but if the sample size n is fixed, which is the case of concern here, the differences may be serious. A unified treatment of quantile estimators in the large nonparametric statistical model is developed.
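The three index rules for the order-statistic estimator X_{k:n} can be compared directly on a small sample. The Python sketch below reads [·] as the integer part (floor); the function name `order_statistic_quantile`, the rule labels, and the clamping of k to the range 1..n are assumptions added here so that the function is defined for every q in [0, 1].

```python
import math


def order_statistic_quantile(sample, q, rule="nq"):
    """Return the k-th order statistic X_{k:n} for one of three index rules.

    rule "nq"     : k = [n q]
    rule "(n+1)q" : k = [(n + 1) q]
    rule "nq+1"   : k = [n q] + 1
    where [.] denotes the integer part.
    """
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must be non-empty")
    if rule == "nq":
        k = math.floor(n * q)
    elif rule == "(n+1)q":
        k = math.floor((n + 1) * q)
    elif rule == "nq+1":
        k = math.floor(n * q) + 1
    else:
        raise ValueError("unknown rule: " + rule)
    k = min(max(k, 1), n)  # clamp to a valid index (an assumption of this sketch)
    return xs[k - 1]       # order statistics are 1-indexed
```

For the sample 5, 1, 4, 2, 3 and q = 0.5, the first rule selects the second order statistic while the other two select the third, which illustrates how the choices can differ at a fixed sample size even though they agree asymptotically.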

20.
Recently, Bayesian nonparametric approaches in survival studies have attracted increasing attention. Because of multimodality in survival data, mixture models are very common. We introduce a Bayesian nonparametric mixture model with the Burr distribution (Burr type XII) as the kernel. Since the Burr distribution shares good properties of the distributions commonly used in survival analysis, it offers more flexibility than other distributions. By applying this model to simulated and real failure time datasets, we show its advantages and compare it with Dirichlet process mixture models with different kernels. Markov chain Monte Carlo (MCMC) simulation methods are used to compute the posterior distribution.
