Similar Documents
20 similar documents found.
1.
Searching for regions of the input space where a statistical model is inappropriate is useful in many applications. This study proposes an algorithm for finding local departures from a regression-type prediction model. The algorithm returns low-dimensional hypercubes where the average prediction error clearly departs from zero. We describe the algorithm and show successful applications to simulated and real data from steel plate production. Algorithms originally developed for searching the input space for regions of high response are reviewed and considered as alternative methods for locating model departures. The proposed algorithm locates the model departure regions better than the compared alternatives, and it can be used for sequential follow-up of a model as new data are observed over time.

2.
Prediction error is critical for assessing model fit and evaluating model predictions. We propose cross-validation (CV) and approximated CV methods for estimating prediction error under the Bregman divergence (BD), which embeds nearly all of the loss functions commonly used in the regression, classification and machine learning literature. The approximated CV formulas are derived analytically, which facilitates fast estimation of prediction error under BD. We then study a data-driven optimal bandwidth selector for local-likelihood estimation that minimizes the overall prediction error or, equivalently, the covariance penalty. It is shown that the covariance penalty and CV methods converge to the same mean-prediction-error criterion. We also propose a lower-bound scheme for computing local logistic regression estimates and demonstrate that the algorithm monotonically increases the target local likelihood and converges. The ideas and methods are extended to generalized varying-coefficient models and additive models.
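As a toy illustration of the CV idea (not the paper's approximated formulas): squared error is the Bregman divergence generated by phi(u) = u^2, and its leave-one-out CV estimate can be computed directly for the simplest possible predictor, the sample mean. The data and predictor here are invented for illustration only.

```python
def loo_cv_squared_error(y):
    """Leave-one-out CV prediction error under squared-error loss,
    the Bregman divergence generated by phi(u) = u**2.
    The 'model' refit on each fold is just the sample mean."""
    n = len(y)
    total = 0.0
    for i in range(n):
        rest = y[:i] + y[i + 1:]
        pred = sum(rest) / len(rest)   # refit on the remaining n-1 points
        total += (y[i] - pred) ** 2    # held-out Bregman (squared) loss
    return total / n

print(loo_cv_squared_error([1.0, 2.0, 3.0, 4.0, 5.0]))  # 3.125
```

Replacing the squared loss with another Bregman generator (e.g. the deviance for binary responses) changes only the per-point loss line, which is what makes the BD framework attractive.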

3.
We consider a Bayesian deterministically trending dynamic time series model with heteroscedastic error variance, in which there are multiple structural changes in level, trend and error variance, but the number of change-points and their timings are unknown. For the Bayesian analysis, a truncated Poisson prior is used for the number of change-points and conjugate priors for the distributional parameters. To identify the best model and estimate the model parameters simultaneously, we propose a new method that sequentially combines the Gibbs sampler with stochastic approximation Monte Carlo simulations as an adaptive Monte Carlo algorithm. Numerical results favour our method in terms of the quality of the estimates.

4.
Classification error can lead to substantial biases in the estimation of gross flows from longitudinal data. We propose a method to adjust flow estimates for bias, based on fitting separate multinomial logistic models to the classification error probabilities and the true state transition probabilities using values of auxiliary variables. Our approach has several advantages: it does not require external information on misclassification rates, it permits the identification of factors related to misclassification and to true transitions, and it does not assume independence between classification errors at successive points in time. Constraining the predicted stocks to agree with the observed stocks protects against model misspecification. We apply the approach to data on women from the Panel Study of Income Dynamics with three categories of labour force status. The fitted model is shown to have interpretable coefficient estimates and to provide a good fit. Simulation results indicate good performance in predicting the true flows and robustness against departures from the postulated model.

5.
In multi-category response models, the categories are often ordered. For ordinal response models, the usual likelihood approach becomes unstable when the predictor space is ill-conditioned or when the number of parameters to be estimated is large relative to the sample size; the likelihood estimates do not exist when the number of observations is less than the number of parameters. The same problem arises if the ordering constraint on the intercepts is violated during the iterative procedure. Proportional odds models (POMs) are the most commonly used models for ordinal responses. In this paper, penalized likelihood with a quadratic penalty is used to address these issues, with a special focus on POMs. To avoid large differences between the parameter values of consecutive categories of an ordinal predictor, the differences between the parameters of adjacent categories are penalized. The penalized-likelihood function thus penalizes either the parameter estimates themselves or the differences between them, according to the type of predictor. Mean-squared error of the parameter estimates, deviance of the fitted probabilities and prediction error of the ridge estimates are compared with those of the usual likelihood estimates in a simulation study and an application.
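The stabilizing effect of a quadratic (ridge) penalty is easiest to see in the binary special case of the POM, ordinary logistic regression. The sketch below, with invented separable-looking data and a single slope, maximizes the penalized log-likelihood by gradient ascent; it is an illustration of the quadratic-penalty idea, not the paper's POM estimator.

```python
import math

def ridge_logistic(x, y, lam, lr=0.1, iters=2000):
    """Penalized-likelihood logistic fit (single slope, no intercept):
    maximize sum[y*b*x - log(1 + exp(b*x))] - lam*b**2 by gradient ascent."""
    b = 0.0
    for _ in range(iters):
        grad = sum((yi - 1.0 / (1.0 + math.exp(-b * xi))) * xi
                   for xi, yi in zip(x, y))
        grad -= 2.0 * lam * b          # gradient of the quadratic penalty
        b += lr * grad / len(x)
    return b

x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [0, 0, 1, 0, 1]
b_mle = ridge_logistic(x, y, lam=0.0)   # ordinary likelihood estimate
b_pen = ridge_logistic(x, y, lam=1.0)   # shrunk toward zero by the penalty
```

The penalized estimate is pulled toward zero, which is exactly what keeps the fit finite when the data are separable or the design is ill-conditioned.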

6.
Long-transported air pollution in Europe is monitored by a combination of a highly complex mathematical model and a limited number of measurement stations. The model predicts deposition on a 150 km × 150 km square grid covering the whole of the continent. These predictions can be regarded as spatial averages, with some spatially correlated model error. The measurement stations give a limited number of point estimates, regarded as error free. We combine these two sources of data by assuming that both are observations of an underlying true process. This true deposition is made up of a smooth deterministic trend, due to gradual changes in emissions over space and time, and two stochastic components. One is non-stationary and correlated over long distances; the other describes variation within a grid square. Our approach is through hierarchical modelling, with predictions and measurements being independent conditioned on the underlying non-stationary true deposition. We assume Gaussian processes and calculate maximum likelihood estimates through numerical optimization. We find that the variation within a grid square is by far the largest component of the variation in the true deposition. We assume that the mathematical model produces estimates of the mean over an area that is approximately equal to a grid square, and we find that it has an error that is similar to the long-range stochastic component of the true deposition, in addition to a large bias.

7.
This article is concerned with how the bootstrap can be applied to study conditional forecast error distributions and to construct prediction regions for future observations in periodic time-varying state-space models. We first derive an algorithm for assessing the precision of quasi-maximum likelihood estimates of the parameters. This algorithm is then used to evaluate numerically the conditional forecast accuracy of a periodic time series model expressed in state-space form. We propose a method that requires the backward, or reverse-time, representation of the model for assessing conditional forecast errors. The small-sample properties of the proposed procedures are investigated in simulation studies, and the results are illustrated by applying the proposed method to a real time series.

8.
We consider estimation of the unknown parameters of the Chen distribution [Chen Z. A new two-parameter lifetime distribution with bathtub shape or increasing failure rate function. Statist Probab Lett. 2000;49:155–161] with bathtub shape using progressively censored samples. Maximum likelihood estimates are obtained via an expectation–maximization (EM) algorithm. Different Bayes estimates are derived under the squared error and balanced squared error loss functions. Since the associated posterior distribution has an intractable form, an approximation method is used to compute these estimates; a Metropolis–Hastings (MH) algorithm is also proposed, yielding further approximate Bayes estimates. An asymptotic confidence interval is constructed using the observed Fisher information matrix, and bootstrap intervals are proposed as well. Samples generated from the MH algorithm are further used to construct HPD intervals. We also obtain prediction intervals and estimates for future observations in one- and two-sample situations. A numerical study compares the performance of the proposed methods using simulations, and real data sets are analysed for illustration.
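For complete (uncensored) data the Chen likelihood can be profiled in closed form in one parameter, which makes a simple grid-search MLE possible. The sketch below uses this profiling trick, not the paper's EM algorithm for progressive censoring, and the data and grid are invented for illustration.

```python
import math

def chen_profile_mle(x, betas):
    """Grid-profile MLE for the Chen (2000) distribution, complete data:
    F(t) = 1 - exp(lam*(1 - exp(t**beta))).  For fixed beta the likelihood
    is maximized at lam = n / sum(exp(t**beta) - 1), so we profile over beta."""
    n = len(x)
    best = None
    for beta in betas:
        s = sum(math.exp(t ** beta) - 1.0 for t in x)
        lam = n / s                              # closed-form profile MLE
        ll = (n * math.log(lam) + n * math.log(beta)
              + sum((beta - 1.0) * math.log(t) + t ** beta for t in x)
              + lam * sum(1.0 - math.exp(t ** beta) for t in x))
        if best is None or ll > best[0]:
            best = (ll, beta, lam)
    return best[1], best[2]

beta_hat, lam_hat = chen_profile_mle([0.2, 0.5, 0.8, 1.1, 1.4],
                                     [0.5 + 0.1 * i for i in range(16)])
```

With censoring the profile no longer closes, which is why the paper resorts to EM.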

9.
We postulate a dynamic spatio-temporal model with a constant covariate effect but with a spatial effect varying over time and a temporal effect varying across locations. To mitigate the effect of temporary structural change, the model can be estimated using the backfitting algorithm embedded with the forward search algorithm and the bootstrap. A simulation study is designed to evaluate the structural optimality of the model together with the estimation procedure. The fitted model exhibits superior predictive ability relative to the linear model, and the proposed algorithm consistently produces lower relative bias and standard errors for the spatial parameter estimates. While additional neighbourhoods do not necessarily improve the predictive ability of the model, they trim down the relative bias of the parameter estimates, especially for the spatial parameter. The location of the temporary structural change, along with its degree, contributes to lower relative bias of the parameter estimates and to better predictive ability of the model. The estimation procedure produces parameter estimates that are robust to the occurrence of temporary structural change.

10.
We postulate a spatio-temporal multilevel model and estimate it using the forward search algorithm and maximum likelihood embedded in the backfitting algorithm. The forward search algorithm ensures robustness of the estimates by filtering the effect of temporary structural changes in the estimation of the group-level covariates, the individual-level covariates and the spatial parameters. The backfitting algorithm provides computational efficiency of the estimation procedure under the assumption of an additive model. Simulation studies show that the estimates are robust even in the presence of structural changes induced, for example, by an epidemic outbreak. The model also produces robust estimates even for the small samples and short time series common in epidemiological settings.

11.
Computer experiments, consisting of a number of runs of a computer model with different inputs, are now commonplace in scientific research. Using a simple fire model for illustration, some guidelines are given for the size of a computer experiment. A graph relating the error of prediction to the sample size is provided, which should be of use when designing computer experiments.

Methods for augmenting computer experiments with extra runs are also described and illustrated. The simplest method adds one point at a time, choosing the point with the maximum prediction variance. Another method that appears to work well is to choose points from a candidate set so as to maximize the determinant of the variance–covariance matrix of the predictions.
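Under a stationary kernel, prediction variance grows with distance from the existing design, so a kernel-free proxy for the one-at-a-time rule is to pick the candidate farthest from its nearest design point (a maximin-distance criterion). The sketch below implements this proxy, not the actual predictive-variance computation, on an invented 1-D candidate set.

```python
def augment_design(design, candidates, k):
    """Sequentially add k runs to a 1-D design.  As a kernel-free proxy for
    'maximum prediction variance', pick the candidate whose distance to the
    nearest current design point is largest (maximin distance)."""
    design = list(design)
    for _ in range(k):
        best = max(candidates,
                   key=lambda c: min(abs(c - d) for d in design))
        design.append(best)
    return design

print(augment_design([0.0, 1.0], [0.1, 0.25, 0.5, 0.75, 0.9], 2))
# → [0.0, 1.0, 0.5, 0.25]
```

With a fitted Gaussian process one would replace the distance criterion by the actual predictive variance at each candidate; the greedy loop structure is unchanged.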

12.
This paper deals with the prediction of time series with missing data, using an alternative formulation of Holt's model with additive errors. This formulation simplifies both the computation of maximum likelihood estimates of all the unknowns in the model and the computation of point forecasts. In the presence of missing data, the EM algorithm is used to obtain maximum likelihood estimates and point forecasts. Based on this application, we propose a leave-one-out algorithm for the data transformation selection problem, which allows us to analyse Holt's model with multiplicative errors. Numerical results show the performance of these procedures in obtaining robust forecasts.
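For complete data, Holt's additive-error recursions and point forecasts are short enough to write out directly. The sketch below fixes the smoothing parameters and uses a naive initialization; the paper instead estimates all unknowns by maximum likelihood and handles missing observations via EM.

```python
def holt_forecast(y, alpha, beta, h):
    """Holt's linear method with additive errors: level/trend recursions
    followed by h-step-ahead point forecasts (complete data, fixed
    smoothing parameters; ML estimation and EM for gaps are omitted)."""
    level, trend = y[0], y[1] - y[0]    # naive initialization from first points
    for t in range(1, len(y)):
        prev = level
        level = alpha * y[t] + (1 - alpha) * (level + trend)
        trend = beta * (level - prev) + (1 - beta) * trend
    return [level + (j + 1) * trend for j in range(h)]

print(holt_forecast([10, 12, 14, 16], 0.5, 0.5, 2))  # → [18.0, 20.0]
```

On an exactly linear series the method reproduces the line, which is a handy sanity check for any implementation.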

13.
In this paper, we study inference in a heteroscedastic measurement error model with known error variances. Instead of assuming normal random components, we develop a model with a skew-t distribution for the true covariate and a centred Student's t distribution for the error terms. The proposed model accommodates skewness and heavy tails in the data, and the degrees of freedom of the two distributions may differ. Maximum likelihood estimates are computed via an EM-type algorithm, and the behaviour of the estimators is assessed in a simulation study. Finally, the approach is illustrated with a real data set from a methods-comparison study in analytical chemistry.

14.
The statistical properties of two closed-form estimators of the parameters of the quadratic time trend model are derived. The estimators are based on variables derived from the Buys-Ballot table and are obtained under the assumption that the error terms are independently and identically distributed. However, the validity of this assumption is sometimes difficult to verify. We therefore also study, through simulations, the impact of misspecifying the error distribution on the estimation and prediction accuracy of the quadratic time trend model. It is shown that the estimators are inconsistent under misspecification. The methods are illustrated with real-life examples.

15.
In this article, we use a latent class model (LCM), with prevalence modelled as a function of covariates, to assess diagnostic test accuracy when the true disease status is not observed but observations on three or more conditionally independent diagnostic tests are available. A fast Monte Carlo expectation–maximization (MCEM) algorithm with binary (disease) diagnostic data is implemented to estimate the parameters of interest, namely the sensitivity, specificity and prevalence of the disease as a function of covariates. To obtain standard errors for confidence interval construction, the missing information principle is applied to adjust the information matrix estimates. We compare the adjusted information matrix-based standard error estimates with bootstrap standard error estimates, both obtained using the fast MCEM algorithm, through an extensive Monte Carlo study. The simulations demonstrate that the adjusted information matrix approach estimates the standard errors similarly to the bootstrap method under certain scenarios, and that the bootstrap percentile intervals have satisfactory coverage probabilities. We then apply the LCM analysis to a real data set of 122 subjects from a Gynecologic Oncology Group study of significant cervical lesion diagnosis in women with atypical glandular cells of undetermined significance, comparing the diagnostic accuracy of a histology-based evaluation, a carbonic anhydrase-IX biomarker-based test and a human papillomavirus DNA test.

16.
Based on an ordered ranked set sample, Bayesian estimation of the model parameter and prediction of unobserved data from the Rayleigh distribution are studied. The Bayes estimates of the parameter are obtained under both squared error and asymmetric loss functions. A Bayesian prediction approach is considered for predicting unobserved lifetimes in a two-sample prediction problem. A real-life data set and a simulation study are used to illustrate the procedures.

17.
Lee and Carter proposed in 1992 the non-linear model m_{x,t} = exp(a_x + b_x k_t + ε_{x,t}) for fitting and forecasting age-specific mortality rates at age x and time t. For parameter estimation, they employed the singular value decomposition (SVD) to find a least-squares solution. However, the SVD algorithm does not provide the standard errors of the estimated parameters, making it impossible to assess their accuracy. This article describes the Lee-Carter model and the technical procedures for fitting and extrapolating it. To estimate the precision of the parameter estimates, we propose a binomial framework whose point estimates can be obtained by maximum likelihood and whose interval estimates can be obtained by a bootstrap approach. The model is used to fit mortality data for England and Wales from 1951 to 1990 and to forecast mortality change from 1991 to 2020. The Lee-Carter model fits these mortality data very well, with R^2 = 0.9980. The estimated overall age pattern of mortality a_x is very robust, whereas there is considerable uncertainty in b_x (changes in the age pattern over time) and k_t (overall change in mortality). The fitted log age-specific mortality rates declined linearly from 1951 to 1990 at different paces, and the projected rates continue to decline in this way over the 30-year prediction period.
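A common SVD-free approximation to the first-stage Lee-Carter fit follows directly from the usual identification constraints sum(b_x) = 1 and sum(k_t) = 0: a_x is the row mean of the log rates, k_t the column sum of the centred matrix, and b_x a least-squares slope. This sketch is that approximation, not the SVD solution the article describes, and the tiny log-rate matrix is invented.

```python
def lee_carter_fit(logm):
    """First-stage Lee-Carter fit without SVD.  logm[x][t] = log m_{x,t}.
    Under sum(b)=1, sum(k)=0: a_x = row mean, k_t = column sum of the
    centred matrix, b_x = least-squares slope of centred rates on k."""
    X, T = len(logm), len(logm[0])
    a = [sum(row) / T for row in logm]                       # age pattern
    k = [sum(logm[x][t] - a[x] for x in range(X))            # mortality index
         for t in range(T)]
    kk = sum(kt * kt for kt in k)
    b = [sum((logm[x][t] - a[x]) * k[t] for t in range(T)) / kk
         for x in range(X)]                                  # age sensitivities
    return a, b, k
```

The constraints hold exactly by construction, so checking sum(b) = 1 and sum(k) = 0 is a quick way to validate any implementation.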

18.
In this paper, a new hybrid of vector autoregressive moving average (VARMA) models and Bayesian networks is proposed to improve the forecasting performance of multivariate time series. In the proposed model, the VARMA model, a popular linear model in time series forecasting, is specified to capture the linear characteristics. The errors of the VARMA model are then clustered into trends by the K-means algorithm, with the Krzanowski–Lai cluster validity index determining the number of trends, and a Bayesian network is built to learn the relationship between the data and the trend of the corresponding VARMA error. Finally, the estimated values of the VARMA model are compensated by the probabilities, obtained from the Bayesian network, that their corresponding errors belong to each trend. Experimental results from a simulation study and two multivariate real-world data sets indicate that the proposed model effectively improves prediction performance relative to VARMA models.
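The residual-clustering step is plain K-means on scalar errors. The sketch below is a minimal 1-D version with a fixed number of clusters and invented residuals; the paper instead chooses the number of trends by the Krzanowski–Lai index and feeds the resulting labels into a Bayesian network.

```python
def kmeans_1d(values, centers, iters=10):
    """Plain 1-D K-means for clustering model errors into 'trends'.
    centers supplies the initial centres (and hence the cluster count)."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for v in values:   # assign each residual to its nearest trend centre
            j = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[j].append(v)
        centers = [sum(g) / len(g) if g else c   # recompute centres
                   for g, c in zip(groups, centers)]
    return centers

errs = [-1.2, -0.9, -1.1, 0.1, 0.0, 1.0, 1.3, 1.1]
print(kmeans_1d(errs, [-1.0, 0.0, 1.0]))
```

Each fitted centre represents one error trend; the cluster memberships are what the Bayesian network then learns to predict from the data.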

19.
In this article, we propose a family of bounded-influence robust estimates for the parametric and non-parametric components of a generalized partially linear mixed model subject to censored responses and missing covariates. The asymptotic properties of the proposed estimates are investigated. The estimates are obtained using a Monte Carlo expectation–maximization algorithm, and an approximate method that greatly reduces the computational time is also proposed. A simulation study shows that the two approaches perform similarly in terms of bias and mean square error. The analysis is illustrated through a study of the effect of environmental factors on phytoplankton cell counts.

20.
In studies that produce data with spatial structure, covariates of interest commonly vary spatially in addition to the error, so the error and covariate are often correlated. When this occurs, it is difficult to distinguish the covariate effect from residual spatial variation. In an i.i.d. normal error setting, it is well known that this type of correlation produces biased coefficient estimates while predictions remain unbiased. In a spatial setting, recent studies have shown that coefficient estimates remain biased, but spatial prediction has not been addressed. The purpose of this paper is to provide a more detailed study of coefficient estimation in spatial models when covariate and error are correlated, and then to begin a formal study of spatial prediction. This is carried out by investigating properties of the generalized least squares estimator and the best linear unbiased predictor when a spatial random effect and a covariate are jointly modelled. Under this setup, we demonstrate that the mean squared prediction error can be reduced when covariate and error are correlated.
