Similar Articles
 20 similar articles found (search time: 78 ms)
1.
In this paper we present a unified discussion of different approaches to the identification of smoothing spline analysis of variance (ANOVA) models: (i) the “classical” approach (in the line of Wahba in Spline Models for Observational Data, 1990; Gu in Smoothing Spline ANOVA Models, 2002; Storlie et al. in Stat. Sin., 2011) and (ii) the State-Dependent Regression (SDR) approach of Young in Nonlinear Dynamics and Statistics (2001). The latter is a nonparametric approach which is very similar to smoothing splines and kernel regression methods, but based on recursive filtering and smoothing estimation (the Kalman filter combined with fixed interval smoothing). We will show that SDR can be effectively combined with the “classical” approach to obtain a more accurate and efficient estimation of smoothing spline ANOVA models for emulation purposes. We will also show that such an approach can compare favorably with kriging.
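The recursive filtering and smoothing estimation mentioned in this abstract can be illustrated with a minimal sketch. The following is not the SDR implementation itself: it applies a scalar Kalman filter for a simple local-level (random-walk) state model, followed by a Rauch–Tung–Striebel fixed-interval smoothing pass; the function name and the noise variances `q` and `r` are illustrative assumptions.

```python
import numpy as np

def kalman_rts_smooth(y, q, r, m0=0.0, p0=1e6):
    """Local-level Kalman filter followed by Rauch-Tung-Striebel
    fixed-interval smoothing (a sketch of the recursive machinery)."""
    n = len(y)
    m_pred = np.empty(n); p_pred = np.empty(n)
    m_filt = np.empty(n); p_filt = np.empty(n)
    m, p = m0, p0
    for t in range(n):
        # predict step: random-walk state evolution with variance q
        m_pred[t], p_pred[t] = m, p + q
        # update step: fold in observation y[t] with noise variance r
        k = p_pred[t] / (p_pred[t] + r)
        m = m_pred[t] + k * (y[t] - m_pred[t])
        p = (1.0 - k) * p_pred[t]
        m_filt[t], p_filt[t] = m, p
    # backward fixed-interval smoothing pass
    m_smooth = m_filt.copy()
    for t in range(n - 2, -1, -1):
        g = p_filt[t] / p_pred[t + 1]
        m_smooth[t] = m_filt[t] + g * (m_smooth[t + 1] - m_pred[t + 1])
    return m_smooth

rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 200)
truth = np.sin(t)
y = truth + rng.normal(scale=0.5, size=t.size)
xs = kalman_rts_smooth(y, q=0.01, r=0.25)
```

Tuning the ratio q/r controls the smoothness of the estimate, which is the role the smoothing parameter plays in the spline formulation.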

2.
We propose a more efficient version of the slice sampler for Dirichlet process mixture models described by Walker (Commun. Stat., Simul. Comput. 36:45–54, 2007). This new sampler allows for the fitting of infinite mixture models with a wide range of prior specifications. To illustrate this flexibility we consider priors defined through infinite sequences of independent positive random variables. Two applications are considered: density estimation using mixture models and hazard function estimation. In each case we show how the slice-efficient sampler can be applied to make inference in the models. In the mixture case, two submodels are studied in detail. The first one assumes that the positive random variables are Gamma distributed and the second assumes that they are inverse-Gaussian distributed. Both priors have two hyperparameters and we consider their effect on the prior distribution of the number of occupied clusters in a sample. Extensive computational comparisons with alternative “conditional” simulation techniques for mixture models using the standard Dirichlet process prior and our new priors are made. The properties of the new priors are illustrated on a density estimation problem.
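For intuition on how such a prior induces a distribution on the number of occupied clusters, here is a hedged sketch using the standard Dirichlet-process stick-breaking construction with Beta(1, α) sticks. The Gamma and inverse-Gaussian constructions studied in the paper are not reproduced, and the truncation level is an assumption of the sketch.

```python
import numpy as np

def occupied_clusters(alpha, n, trunc=500, rng=None):
    """Draw truncated stick-breaking weights for a Dirichlet process
    with concentration alpha, then count the clusters occupied by a
    sample of size n from the resulting discrete distribution."""
    rng = rng or np.random.default_rng()
    v = rng.beta(1.0, alpha, size=trunc)
    # w_k = v_k * prod_{j<k} (1 - v_j)
    w = v * np.cumprod(np.concatenate(([1.0], 1.0 - v[:-1])))
    w = w / w.sum()                      # renormalise the truncation
    labels = rng.choice(trunc, size=n, p=w)
    return len(np.unique(labels))

rng = np.random.default_rng(2)
counts = [occupied_clusters(alpha=1.0, n=100, rng=rng) for _ in range(200)]
```

For α = 1 and n = 100 the expected number of occupied clusters is roughly Σ_{k=0}^{99} 1/(1+k) ≈ 5.2, so the simulated counts concentrate around that value.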

3.
In order to guarantee confidentiality and privacy of firm-level data, statistical offices apply various disclosure limitation techniques. However, each anonymization technique has its protection limits, such that the probability of disclosing the individual information for some observations is not minimized. To overcome this problem, we propose combining two separate disclosure limitation techniques, blanking and multiplication of independent noise, in order to protect the original dataset. The proposed approach yields a decrease in the probability of reidentifying/disclosing individual information and can be applied to linear and nonlinear regression models. We show how to combine the blanking method with the multiplicative measurement error method, and how to estimate the model by combining the multiplicative Simulation-Extrapolation (M-SIMEX) approach of Nolte (2007) with either the Inverse Probability Weighting (IPW) approach going back to Horvitz and Thompson (J. Am. Stat. Assoc. 47:663–685, 1952) or, as an alternative to IPW, matching methods such as the semiparametric M-estimator proposed by Flossmann (2007). Based on Monte Carlo simulations, we show that multiplicative measurement error combined with blanking as a masking procedure does not necessarily lead to a severe reduction in the estimation quality, provided that its effects on the data generating process are known.

4.
In this article we develop a class of stochastic boosting (SB) algorithms, which build upon the work of Holmes and Pintore (Bayesian Stat. 8, Oxford University Press, Oxford, 2007). They introduce boosting algorithms which correspond to standard boosting (e.g. Bühlmann and Hothorn, Stat. Sci. 22:477–505, 2007) except that the optimization algorithms are randomized; this idea is placed within a Bayesian framework. We show that the inferential procedure in Holmes and Pintore (Bayesian Stat. 8, Oxford University Press, Oxford, 2007) is incorrect and further develop interpretational, computational and theoretical results which allow one to assess SB’s potential for classification and regression problems. To use SB, sequential Monte Carlo (SMC) methods are applied. As a result, it is found that SB can provide better predictions for classification problems than the corresponding boosting algorithm. A theoretical result is also given, which shows that the predictions of SB are not significantly worse than boosting, when the latter provides the best prediction. We also investigate the method on a real case study from machine learning.

5.
Time series arising in practice often have an inherently irregular sampling structure or missing values, arising, for example, from a faulty measuring device or from the complex time-dependent nature of the phenomenon. Spectral decomposition of time series is a traditionally useful tool for data variability analysis. However, existing methods for spectral estimation often assume a regularly-sampled time series, or require modifications to cope with irregular or ‘gappy’ data. Additionally, many techniques assume that the time series are stationary, which in the majority of cases is demonstrably not appropriate. This article addresses the topic of spectral estimation of a non-stationary time series sampled with missing data. The time series is modelled as a locally stationary wavelet process in the sense introduced by Nason et al. (J. R. Stat. Soc. B 62(2):271–292, 2000) and its realization is assumed to feature missing observations. Our work proposes an estimator (the periodogram) for the process wavelet spectrum, which copes with the missing data whilst relaxing the strong assumption of stationarity. At the centre of our construction are second generation wavelets built by means of the lifting scheme (Sweldens, Wavelet Applications in Signal and Image Processing III, Proc. SPIE, vol. 2569, pp. 68–79, 1995), designed to cope with irregular data. We investigate the theoretical properties of our proposed periodogram, and show that it can be smoothed to produce a bias-corrected spectral estimate by adopting a penalized least squares criterion. We demonstrate our method with real data and simulated examples.

6.
A new procedure is proposed to estimate the jump location curve and surface in the two-dimensional (2D) and three-dimensional (3D) nonparametric jump regression models, respectively. In each of the 2D and 3D cases, our estimation procedure is motivated by the fact that, under some regularity conditions, the ridge location of the rotational difference kernel estimate (RDKE; Qiu in Sankhyā Ser. A 59, 268–294, 1997, and J. Comput. Graph. Stat. 11, 799–822, 2002; Garlipp and Müller in Sankhyā Ser. A 69, 55–86, 2007) obtained from the noisy image is asymptotically close to the jump location of the true image. Accordingly, a computational procedure based on the kernel smoothing method is designed to find the ridge location of RDKE, and the result is taken as the jump location estimate. The sequence relationship among the points comprising our jump location estimate is obtained. Our jump location estimate is produced without the knowledge of the range or shape of jump region. Simulation results demonstrate that the proposed estimation procedure can detect the jump location very well, and thus it is a useful alternative for estimating the jump location in each of the 2D and 3D cases.

7.
The last decade saw enormous progress in the development of causal inference tools to account for noncompliance in randomized clinical trials. With survival outcomes, structural accelerated failure time (SAFT) models enable causal estimation of effects of observed treatments without making direct assumptions on the compliance selection mechanism. The traditional proportional hazards model has however rarely been used for causal inference. The estimator proposed by Loeys and Goetghebeur (2003, Biometrics vol. 59 pp. 100–105) is limited to the setting of all or nothing exposure. In this paper, we propose an estimation procedure for more general causal proportional hazards models linking the distribution of potential treatment-free survival times to the distribution of observed survival times via observed (time-constant) exposures. Specifically, we first build models for observed exposure-specific survival times. Next, using the proposed causal proportional hazards model, the exposure-specific survival distributions are backtransformed to their treatment-free counterparts, to obtain – after proper mixing – the unconditional treatment-free survival distribution. Estimation of the parameter(s) in the causal model is then based on minimizing a test statistic for equality in backtransformed survival distributions between randomized arms.

8.
In randomized clinical trials, we are often concerned with comparing two-sample survival data. Although the log-rank test is usually suitable for this purpose, it may result in substantial power loss when the two groups have nonproportional hazards. In a more general class of survival models of Yang and Prentice (Biometrika 92:1–17, 2005), which includes the log-rank test as a special case, we improve model efficiency by incorporating auxiliary covariates that are correlated with the survival times. In a model-free form, we augment the estimating equation with auxiliary covariates, and establish the efficiency improvement using the semiparametric theories in Zhang et al. (Biometrics 64:707–715, 2008) and Lu and Tsiatis (Biometrics 95:674–679, 2008). Under minimal assumptions, our approach produces an unbiased, asymptotically normal estimator with additional efficiency gain. Simulation studies and an application to a leukemia study show the satisfactory performance of the proposed method.

9.
Quantile regression, including median regression, as a more complete statistical model than mean regression, is now well known for its widespread applications. Bayesian inference on quantile regression, or Bayesian quantile regression, has attracted much interest recently. Most existing research in Bayesian quantile regression focuses on parametric quantile regression, though there are discussions on different ways of modeling the model error, either by a parametric distribution named the asymmetric Laplace distribution or by a nonparametric alternative named the scale mixture of asymmetric Laplace distributions. This paper discusses Bayesian inference for nonparametric quantile regression. This general approach fits quantile regression curves using piecewise polynomial functions with an unknown number of knots at unknown locations, all treated as parameters to be inferred through reversible jump Markov chain Monte Carlo (RJMCMC) of Green (Biometrika 82:711–732, 1995). Instead of drawing samples from the posterior, we use regression quantiles to create Markov chains for the estimation of the quantile curves. We also use approximate Bayes factors in the inference. This method extends the work in automatic Bayesian mean curve fitting to quantile regression. Numerical results show that this Bayesian quantile smoothing technique is competitive with quantile regression/smoothing splines of He and Ng (Comput. Stat. 14:315–337, 1999) and P-splines (penalized splines) of Eilers and de Menezes (Bioinformatics 21(7):1146–1153, 2005).

10.
Fairly general rational expectation (RE) models are proved to have linear vector autoregressive moving average (VARMA) models as their reduced forms, using Muth's method of undetermined coefficients (MUC). An advantage of using VARMAs is that RE estimation can benefit from well-developed theory of solutions of stochastic difference equations and computer packages for VARMA and state space (SS) models. For example, we report theoretical derivations of explicit dynamics associated with the RE structure for Muth's and Lucas-Sargent-Wallace's models, suggesting generalizations and new solutions. We illustrate an RE structure estimation by using energy prices and Fackler and Krieger's (J. Bus. Econom. Statist. 4 (1986), 71–80) study of five major macroeconomic variables.

11.
12.
For multiway contingency tables, Wall and Lienert (Biom. J. 18:259–264, 1976) considered the point-symmetry model. For square contingency tables, Tomizawa (Biom. J. 27:895–905, 1985) gave a theorem that the point-symmetry model holds if and only if both the quasi point-symmetry and the marginal point-symmetry models hold. This paper proposes some quasi point-symmetry models and marginal point-symmetry models for multiway tables, and extends Tomizawa’s (Biom. J. 27:895–905, 1985) theorem into multiway tables. We also show that for multiway tables the likelihood ratio statistic for testing goodness of fit of the point-symmetry model is asymptotically equivalent to the sum of those for testing the quasi point-symmetry model with some order and the marginal point-symmetry model with the corresponding order. An example is given.
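As a concrete illustration of the two-way case underlying these models, here is a minimal sketch of the likelihood-ratio (G²) statistic for point symmetry, p_ij = p_{R+1−i, C+1−j}, where the fitted counts under the model are simply the averages of point-symmetric cell pairs; the function name is hypothetical.

```python
import math

def point_symmetry_g2(table):
    """G^2 statistic for the point-symmetry model in an R x C table.
    Expected counts are averages of point-symmetric cell pairs."""
    R, C = len(table), len(table[0])
    g2 = 0.0
    for i in range(R):
        for j in range(C):
            n = table[i][j]
            # MLE under point symmetry: average of the symmetric pair
            e = (table[i][j] + table[R - 1 - i][C - 1 - j]) / 2.0
            if n > 0:
                g2 += 2.0 * n * math.log(n / e)
    # one degree of freedom per distinct point-symmetric pair of cells
    df = (R * C - (1 if (R * C) % 2 else 0)) // 2
    return g2, df

# a 3x3 table that is exactly point-symmetric: G^2 should be 0
sym = [[10, 5, 2], [7, 8, 7], [2, 5, 10]]
g2_sym, df = point_symmetry_g2(sym)
```

For an exactly point-symmetric table the fitted and observed counts coincide, so G² is zero; the multiway models discussed above decompose this single symmetry into quasi and marginal components.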

13.
Two-sample comparison problems are often encountered in practical projects and have been widely studied in the literature. Owing to practical demands, research on this topic under special settings, such as the semiparametric framework, has also attracted great attention. Zhou and Liang (Biometrika 92:271–282, 2005) proposed an empirical likelihood-based semiparametric inference for the comparison of treatment effects in a two-sample problem with censored data. However, their approach is actually a pseudo-empirical likelihood and the method may not be fully efficient. In this study, we develop a new empirical likelihood-based inference under a more general framework by using the hazard formulation of censored data for two-sample semiparametric hybrid models. We demonstrate that our empirical likelihood statistic converges to a standard chi-squared distribution under the null hypothesis. We further illustrate the use of the proposed test by testing the ROC curve with censored data, among others. Numerical performance of the proposed method is also examined.

14.
A fast new algorithm is proposed for numerical computation of (approximate) D-optimal designs. This cocktail algorithm extends the well-known vertex direction method (VDM; Fedorov in Theory of Optimal Experiments, 1972) and the multiplicative algorithm (Silvey et al. in Commun. Stat. Theory Methods 14:1379–1389, 1978), and shares their simplicity and monotonic convergence properties. Numerical examples show that the cocktail algorithm can lead to dramatically improved speed, sometimes by orders of magnitude, relative to either the multiplicative algorithm or the vertex exchange method (a variant of VDM). Key to the improved speed is a new nearest neighbor exchange strategy, which acts locally and complements the global effect of the multiplicative algorithm. Possible extensions to related problems such as nonparametric maximum likelihood estimation are mentioned.
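The multiplicative update at the heart of this family of algorithms is easy to state. Below is a sketch of the plain multiplicative algorithm only (no vertex-exchange or nearest-neighbor steps), applied to a quadratic regression example whose D-optimal design is known to put weight 1/3 on each of −1, 0, 1; function and variable names are illustrative.

```python
import numpy as np

def multiplicative_d_optimal(X, n_iter=500):
    """Multiplicative algorithm for an approximate D-optimal design:
    w_i <- w_i * d_i(w) / p, where d_i(w) = x_i' M(w)^{-1} x_i,
    M(w) = sum_i w_i x_i x_i', and p is the number of parameters."""
    K, p = X.shape
    w = np.full(K, 1.0 / K)          # start from the uniform design
    for _ in range(n_iter):
        M = X.T @ (w[:, None] * X)   # information matrix M(w)
        d = np.einsum('ij,jk,ik->i', X, np.linalg.inv(M), X)
        w = w * d / p                # multiplicative update
    return w, d

# quadratic regression on a grid of 21 points: f(t) = (1, t, t^2)
t = np.linspace(-1.0, 1.0, 21)
X = np.column_stack([np.ones_like(t), t, t * t])
w, d = multiplicative_d_optimal(X)
```

By the general equivalence theorem, max_i d_i(w) = p at the D-optimum, so the gap max_i d_i(w) − p is a natural stopping criterion; the weights concentrate on the support points −1, 0, 1.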

15.
Hailin Sang, Statistics (2015) 49(1):187–208
We propose a sparse coefficient estimation and automated model selection procedure for autoregressive processes with heavy-tailed innovations based on penalized conditional maximum likelihood. Under mild moment conditions on the innovation processes, the penalized conditional maximum likelihood estimator satisfies strong consistency, O_P(N^{-1/2}) consistency, and the oracle properties, where N is the sample size. The weak conditions imposed on the penalty functions leave considerable freedom in choosing them; two penalty functions, the least absolute shrinkage and selection operator (LASSO) and the smoothly clipped absolute deviation (SCAD), are compared. The proposed method provides distribution-based penalized inference for AR models, which is especially useful when other estimation methods fail or underperform for AR processes with heavy-tailed innovations [Feigin, Resnick. Pitfalls of fitting autoregressive models for heavy-tailed time series. Extremes. 1999;1:391–422]. A simulation study confirms our theoretical results. Finally, we apply our method to historical price data of the US Industrial Production Index for consumer goods, and obtain very promising results.

16.
The goal of this paper is to introduce a partially adaptive estimator for the censored regression model based on an error structure described by a mixture of two normal distributions. The model we introduce is easily estimated by maximum likelihood using an EM algorithm adapted from the work of Bartolucci and Scaccia (Comput Stat Data Anal 48:821–834, 2005). A Monte Carlo study is conducted to compare the small sample properties of this estimator to the performance of some common alternative estimators of censored regression models including the usual tobit model, the CLAD estimator of Powell (J Econom 25:303–325, 1984), and the STLS estimator of Powell (Econometrica 54:1435–1460, 1986). In terms of RMSE, our partially adaptive estimator performed well. The partially adaptive estimator is applied to data on wife’s hours worked from Mroz (1987). In this application we find support for the partially adaptive estimator over the usual tobit model.
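The mixture-of-two-normals error structure can be fitted by a textbook EM algorithm. The sketch below handles only the uncensored mixture step (the censoring correction of the actual estimator is omitted); the function name and starting values are illustrative assumptions.

```python
import numpy as np

def em_two_normal(x, n_iter=100):
    """EM for a two-component normal mixture: the error structure the
    partially adaptive censored-regression estimator builds on."""
    mu = np.array([x.min(), x.max()])   # spread-out starting means
    sd = np.array([x.std(), x.std()])
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior responsibilities of each component
        dens = np.stack([pi[k] / (sd[k] * np.sqrt(2 * np.pi))
                         * np.exp(-0.5 * ((x - mu[k]) / sd[k]) ** 2)
                         for k in range(2)])
        resp = dens / dens.sum(axis=0)
        # M-step: weighted moment updates
        nk = resp.sum(axis=1)
        pi = nk / len(x)
        mu = (resp * x).sum(axis=1) / nk
        sd = np.sqrt((resp * (x - mu[:, None]) ** 2).sum(axis=1) / nk)
    return pi, mu, sd

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(-2, 0.5, 300), rng.normal(2, 1.0, 700)])
pi, mu, sd = em_two_normal(x)
```

With well-separated components the algorithm recovers the component means and the 0.3/0.7 mixing proportions of the simulated sample.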

17.

Structured sparsity has recently become a very popular technique for dealing with high-dimensional data. In this paper, we mainly focus on theoretical problems for the overlapping group structure of generalized linear models (GLMs). Although the overlapping group Lasso method for GLMs has been widely applied, its theoretical properties are still largely unknown. Under some general conditions, we present oracle inequalities for the estimation and prediction error of the overlapping group Lasso method in the generalized linear model setting. We then apply these results to the Logistic and Poisson regression models. It is shown that the results of the Lasso and group Lasso procedures for GLMs can be recovered by specifying the group structures in our proposed method. The effect of overlap and the variable selection performance of our proposed method are both studied by numerical simulations. Finally, we apply our proposed method to two gene expression data sets: the p53 data and the lung cancer data.

18.

Joint models are statistical tools for estimating the association between time-to-event and longitudinal outcomes. One challenge to the application of joint models is its computational complexity. Common estimation methods for joint models include a two-stage method, Bayesian and maximum-likelihood methods. In this work, we consider joint models of a time-to-event outcome and multiple longitudinal processes and develop a maximum-likelihood estimation method using the expectation–maximization algorithm. We assess the performance of the proposed method via simulations and apply the methodology to a data set to determine the association between longitudinal systolic and diastolic blood pressure measures and time to coronary artery disease.

19.
Conformal predictors, introduced by Vovk et al. (Algorithmic Learning in a Random World, Springer, New York, 2005), serve to build prediction intervals by exploiting a notion of conformity of the new data point with previously observed data. We propose a novel method for constructing prediction intervals for the response variable in multivariate linear models. The main emphasis is on sparse linear models, where only a few of the covariates have significant influence on the response variable even if the total number of covariates is very large. Our approach is based on combining the principle of conformal prediction with the ℓ1-penalized least squares estimator (LASSO). The resulting confidence set depends on a parameter ε>0 and has a coverage probability larger than or equal to 1−ε. The numerical experiments reported in the paper show that the length of the confidence set is small. Furthermore, as a by-product of the proposed approach, we provide a data-driven procedure for choosing the LASSO penalty. The selection power of the method is illustrated on simulated and real data.
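A split-sample variant of this idea can be sketched compactly (the paper's exact construction is not reproduced; the coordinate-descent LASSO below and all names are illustrative): fit the ℓ1-penalized estimator on one half of the data and use the (1−ε) quantile of absolute residuals on the other half as the interval radius.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Plain coordinate-descent LASSO (soft thresholding); any point
    predictor could stand in here for conformal prediction."""
    n, p = X.shape
    b = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ b + X[:, j] * b[j]          # partial residual
            rho = X[:, j] @ r / n
            z = X[:, j] @ X[:, j] / n
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / z
    return b

def split_conformal_interval(X, y, x_new, lam=0.1, eps=0.1, seed=0):
    """Split-conformal interval: fit on one half, take the (1 - eps)
    quantile of absolute residuals on the other half as the radius."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    tr, cal = idx[: len(y) // 2], idx[len(y) // 2:]
    b = lasso_cd(X[tr], y[tr], lam)
    scores = np.abs(y[cal] - X[cal] @ b)            # conformity scores
    k = int(np.ceil((1 - eps) * (len(cal) + 1))) - 1
    radius = np.sort(scores)[min(k, len(cal) - 1)]
    mid = x_new @ b
    return mid - radius, mid + radius

# sparse linear model: only the first 2 of 10 covariates matter
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.3, size=200)
lo, hi = split_conformal_interval(X, y, X[0], lam=0.05)
```

The marginal coverage guarantee comes from exchangeability of the calibration scores, not from the quality of the LASSO fit; a better fit only shortens the interval.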

20.
This paper focuses on robust estimation and variable selection for partially linear models. We combine weighted least absolute deviation (WLAD) regression with the adaptive least absolute shrinkage and selection operator (LASSO) to achieve simultaneous robust estimation and variable selection for partially linear models. Compared with the LAD-LASSO method, the WLAD-LASSO method is resistant to heavy-tailed errors and outliers in the parametric components. In addition, we estimate the unknown smooth function by a robust local linear regression. Under some regularity conditions, the theoretical properties of the proposed estimators are established. We further examine the finite-sample performance of the proposed procedure by simulation studies and a real data example.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号