The paper addresses the problem of estimating missing observations in an infinite realization of a linear, possibly nonstationary, stochastic process when the model is known. The general case of any possible distribution of missing observations in the time series is considered, and analytical expressions for the optimal estimators and their associated mean squared errors are obtained. These expressions involve solely the elements of the inverse or dual autocorrelation function of the series.

This optimal estimator, the conditional expectation of the missing observations given the available ones, is equal to the estimator that results from filling the missing values in the series with arbitrary numbers, treating these numbers as additive outliers, and removing the outlier effects from the invented numbers with intervention analysis.
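As a small illustration of how the inverse autocorrelation function enters the estimator, the sketch below covers only the simplest special case: a stationary, zero-mean AR(1) series with known coefficient `phi` and a single missing value whose two neighbours are observed. The function name `ar1_interpolate` and its parameters are hypothetical, and the general (possibly nonstationary, arbitrary missing pattern) case treated in the paper is not reproduced here.

```python
def ar1_interpolate(z_prev, z_next, phi, sigma2=1.0):
    """Estimate one missing value in a zero-mean AR(1) series z_t = phi*z_{t-1} + a_t.

    Assumes |phi| < 1, phi known, and both neighbours of the gap observed.
    Returns (estimate, mean squared error).
    """
    # Lag-1 inverse autocorrelation of an AR(1) model: this is the lag-1
    # autocorrelation of the dual MA(1) model z_t = (1 - phi*B) a_t.
    rho_inv_1 = -phi / (1.0 + phi ** 2)
    # All higher-lag inverse autocorrelations are zero for AR(1), so only
    # the two adjacent observations contribute to the estimate.
    estimate = -rho_inv_1 * (z_prev + z_next)
    # MSE is the innovation variance divided by the variance of the dual
    # process (in innovation units), which equals 1 + phi^2 here.
    mse = sigma2 / (1.0 + phi ** 2)
    return estimate, mse


if __name__ == "__main__":
    est, mse = ar1_interpolate(z_prev=1.0, z_next=3.0, phi=0.5)
    print(est, mse)
```

With `phi = 0.5` the weight on each neighbour is 0.4, so the estimate is a shrunken average of the two adjacent values rather than their plain mean.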

We propose a nonlinear mixed-effects framework to jointly model longitudinal and repeated time-to-event data. A parametric nonlinear mixed-effects model is used for the longitudinal observations and a parametric mixed-effects hazard model for repeated event times. We show that properly calculating the conditional density of the observations (given the individual parameters) in the presence of interval and/or right censoring is important for parameter estimation. Parameters are estimated by maximizing the exact joint likelihood with the stochastic approximation expectation–maximization algorithm. This workflow for joint models is now implemented in the Monolix software, and illustrated here on five simulated and two real datasets.

In this paper, we study the maximum likelihood estimation of a model with mixed binary responses and censored observations. The model is very general and includes the Tobit model and the binary choice model as special cases. We show that, by using additional binary choice observations, our method is more efficient than the traditional Tobit model. Two iterative procedures are proposed to compute the maximum likelihood estimator (MLE) for the model, based on the EM algorithm (Dempster et al., 1977) and the Newton-Raphson method. The uniqueness of the MLE is proved. The simulation results show that the inconsistency and inefficiency can be significant when the Tobit method is applied to the present mixed model. The experimental results also suggest that the EM algorithm is much faster than the Newton-Raphson method for the present mixed model. The method also allows one to combine two data sets, a smaller data set with more detailed observations and a larger data set with less detailed binary choice observations, in order to improve the efficiency of estimation. This may entail substantial savings when one conducts surveys.
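For orientation, the standard Tobit special case mentioned above can be written down directly. The sketch below evaluates the log-likelihood of a Tobit model censored at zero for given fitted means; it is not the mixed binary/censored model or the EM procedure of the paper, and the names `tobit_loglik`, `ys`, `mus` and `sigma` are assumptions made for this example.

```python
import math


def norm_logpdf(z):
    """Log density of the standard normal distribution at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_loglik(ys, mus, sigma):
    """Log-likelihood of a Tobit model y = max(0, mu + sigma * eps), eps ~ N(0, 1).

    ys  : observed responses (values <= 0 are treated as censored at zero)
    mus : linear predictors x'beta, one per observation
    """
    total = 0.0
    for y, mu in zip(ys, mus):
        if y <= 0.0:
            # Censored observation: contributes P(latent value <= 0).
            total += math.log(norm_cdf(-mu / sigma))
        else:
            # Uncensored observation: contributes the normal density.
            total += norm_logpdf((y - mu) / sigma) - math.log(sigma)
    return total


if __name__ == "__main__":
    print(tobit_loglik([0.0, 1.2, 2.5], [0.3, 1.0, 2.0], sigma=1.0))
```

A numerical optimizer (or Newton-Raphson, as in the abstract) could then maximize this function over the regression coefficients and `sigma`.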


This paper analyses the behaviour of goodness-of-fit tests for regression models. To this end, it uses statistics based on an estimation of the integrated regression function with missing observations either in the response variable or in some of the covariates. It proposes several versions of one empirical process, constructed from a previous estimation, that uses only the complete observations or replaces the missing observations with imputed values. In the case of missing covariates, a link model is used to fill the missing observations with other complete covariates. In all the situations, bootstrap methodology is used to calibrate the distribution of the test statistics. A broad simulation study compares the different procedures based on empirical regression methodology with smoothed tests previously studied in the literature. The comparison reflects the effect of the correlation between the covariates in the tests based on the imputed sample for missing covariates. In addition, the paper proposes a computational binning strategy to evaluate the tests based on an empirical process for large data sets. Finally, two applications to real data illustrate the performance of the tests.

Partition models     
Product partition models assume that observations in different components of a random partition of the data are independent given the partition. If the probability distribution of random partitions is in a certain product form prior to making the observations, it is also in product form given the observations. The product model thus provides a convenient machinery for allowing the data to weight the partitions likely to hold; and inference about particular future observations may then be made by first conditioning on the partition, and then averaging over all partitions. This model is applied to fatalities in manned rocket launches, using data from the SOYUZ, APOLLO, SHUTTLE, and post-Challenger SHUTTLE programs in the Soviet Union and the United States. The combination of these data suggests that the chance of a fatality in the next shuttle launch is about 0.03, after allowing for the possibility that the older programs are of slight relevance to the present shuttle program.

Detecting outliers or influential observations is an important task in statistical modeling, especially for correlated time series data. In this paper we propose a new procedure to detect a patch of influential observations in the generalized autoregressive conditional heteroskedasticity (GARCH) model. First, we compare the performance of the innovative perturbation scheme, the additive perturbation scheme and the data perturbation scheme in local influence analysis. We find that the innovative perturbation scheme gives better results than the other two schemes, although it may suffer from masking effects. We then use the stepwise local influence method under the innovative perturbation scheme to detect a patch of influential observations and uncover the masking effects. The simulation studies show that the new technique can successfully detect a patch of influential observations or outliers under the innovative perturbation scheme. The analysis based on simulation studies and two real data sets shows that the stepwise local influence method under the innovative perturbation scheme is efficient for detecting multiple influential observations and dealing with masking effects in the GARCH model.

A general canonical variate model is derived when the observations are spatially correlated. For spatial covariance structures resulting from dependence of a pixel on its nearest neighbours, the solution reduces to an analysis of neighbour-corrected values. The usual analysis, in which spatial correlation is ignored, gives similar canonical vectors but over-estimates the canonical roots. A formula for approximating the reduction in the canonical roots to adjust for the spatial correlation is given.

This article proposes a semiparametric nonlinear reproductive dispersion model (SNRDM), which is an extension of the nonlinear reproductive dispersion model and the semiparametric regression model. Maximum penalized likelihood estimators (MPLEs) of unknown parameters and nonparametric functions in SNRDMs are presented. Some novel diagnostic statistics, such as Cook distance and difference deviance for parametric and nonparametric parts, are developed to identify influential observations in SNRDMs on the basis of the case-deletion method, and some formulae for the diagnostic measures that are readily computed with the MPLE algorithm are given. The equivalence of case-deletion models and mean-shift outlier models in SNRDMs is investigated. A simulation study and a real example are used to illustrate the proposed diagnostic measures.

We propose a method for specifying the distribution of random effects included in a model for cluster data. The class of models we consider includes mixed models and frailty models whose random effects and explanatory variables are constant within clusters. The method is based on cluster residuals obtained by assuming that the random effects are equal between clusters. We exhibit an asymptotic relationship between the cluster residuals and variations of the random effects as the number of observations increases and the variance of the random effects decreases. The asymptotic relationship is used to specify the random-effects distribution. The method is applied to a frailty model and a model used to describe the spread of plant diseases.

The forecasting stage in the analysis of a univariate threshold-autoregressive model with an exogenous threshold variable is developed in this paper via the computation of the so-called predictive distributions. The procedure permits one to forecast the response and exogenous variables simultaneously. An important issue in this work is the treatment of any missing observations present in the two time series before obtaining forecasts.

When the elements of a design matrix are rational numbers and the variances of the observations are rational multiples of a common real constant, the covariances being zero, the design matrix may be factorised into a product of matrices which have useful statistical interpretations. The main factor matrices have integer elements, while the other factor matrices are diagonal with rational elements. Weights which are rational numbers, and missing observations, are readily accommodated. A computer is usually needed to find the factors. This paper shows how, once the factors have been found, they may be employed for any suitable set of observations without further need for such assistance.

In this paper, we propose nonlinear elliptical models for correlated data with heteroscedastic and/or autoregressive structures. Our aim is to extend the models proposed by Russo et al. [22] by considering a more sophisticated scale structure to deal with variations in data dispersion and/or a possible autocorrelation among measurements taken throughout the same experimental unit. Moreover, to avoid the possible influence of outlying observations or to take into account the non-normal symmetric tails of the data, we assume elliptical contours for the joint distribution of random effects and errors, which allows us to attribute different weights to the observations. We propose an iterative algorithm to obtain the maximum-likelihood estimates for the parameters and derive the local influence curvatures for some specific perturbation schemes. The motivation for this work comes from a pharmacokinetic indomethacin data set, which was analysed previously by Bocheng and Xuping [1] under normality.

This paper documents situations where the variance inflation model for outliers has undesirable properties. The model is commonly used to accommodate outliers in a Bayesian analysis of regression and time series models. The alternative approach provided here does not suffer from these undesirable properties but gives inferences similar to those of the variance inflation model when this is appropriate. It can be used with regression, time series, and regression with correlated errors in a unified way, and adheres to the scientific principle that inference should be based on the data after obvious outliers have been discarded. Only one parameter is required for outliers; it is interpretable as the a priori willingness to remove observations from the analysis.

The problem of spuriosity has been dealt with from a Bayesian perspective by, among others, Box and Tiao (1968) and in several papers by Guttman with various co-authors, beginning with Guttman (1973). The main objective of these papers has been to obtain posterior distributions of parameters, and to base inference on these distributions. In the current paper, the Bayesian argument is carried one step further by deriving predictive distributions of future observations. Inferences are then based on these distributions. We obtain predictive results for several models. First, we consider the univariate normal case with one spurious observation. This is then generalized to several spurious observations. The multivariate normal situation is studied next. Finally, we consider the general linear model with normal errors.


In this article, we propose a new model for binary time series involving an autoregressive moving average structure. The proposed model, which is an extension of the GARMA model, can be used for calculating the forecast probability of an occurrence of an event of interest in cases where these probabilities are dependent on previous observations in the near term. The proposed model is used to analyze a real dataset involving a series that contains only the values 0 and 1, indicating the absence or presence of rain in a city located in the central region of São Paulo state, Brazil.
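To make the idea of a forecast probability that depends on recent observations concrete, here is a deliberately simplified sketch: a first-order autoregressive logistic model in which the probability of an event at time t depends only on the previous binary value. This is not the GARMA extension proposed in the article; the function name `forecast_probability` and the parameters `beta0` and `phi` are hypothetical.

```python
import math


def forecast_probability(y_prev, beta0, phi):
    """One-step-ahead event probability under a logit link.

    y_prev : previous binary observation (0 or 1)
    beta0  : intercept on the logit scale (assumed known or already estimated)
    phi    : autoregressive coefficient on the previous observation
    """
    # Linear predictor on the logit scale.
    eta = beta0 + phi * y_prev
    # Inverse logit maps the predictor to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-eta))


if __name__ == "__main__":
    # Probability of rain tomorrow given a dry day and given a wet day.
    print(forecast_probability(0, beta0=-1.0, phi=2.0))
    print(forecast_probability(1, beta0=-1.0, phi=2.0))
```

A positive `phi` makes an event more likely after an event on the previous day, which is the kind of short-term dependence the abstract describes.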

In this article, a general approach to latent variable models based on an underlying generalized linear model (GLM) with a factor analysis observation process is introduced. We call these models Generalized Linear Factor Models (GLFM). The observations are produced from a general model framework that involves observed and latent variables that are assumed to be distributed in the exponential family. More specifically, we concentrate on situations where the observed variables are both discretely measured (e.g., binomial, Poisson) and continuously distributed (e.g., gamma). The common latent factors are assumed to be independent with a standard multivariate normal distribution. Practical details of training such models with a new local expectation-maximization (EM) algorithm, which can be considered as a generalized EM-type algorithm, are also discussed. In conjunction with an approximated version of the Fisher score algorithm (FSA), we show how to calculate maximum likelihood estimates of the model parameters, and how to draw inferences about the unobservable path of the common factors. The methodology is illustrated by an extensive Monte Carlo simulation study and the results show promising performance.

This paper analyzes the impact of some kinds of contaminant on model selection in graphical Gaussian models. We investigate four different kinds of contaminants, in order to consider the effect of gross errors, model deviations, and model misspecification. The aim of the work is to assess against which kinds of contaminant a model selection procedure for graphical Gaussian models has a more robust behavior. The analysis is based on simulated data. The simulation study shows that relatively few contaminated observations in even just one of the variables can have a significant impact on correct model selection, especially when the contaminated variable is a node in a separating set of the graph.

We consider models based on multivariate counting processes, including multi-state models. These models are specified semi-parametrically by a set of functions and real parameters. We consider inference for these models based on coarsened observations, focusing on families of smooth estimators such as those produced by penalized likelihood. An important issue is the choice of model structure, for instance, the choice between a Markov and some non-Markov models. We define in a general context the expected Kullback–Leibler criterion and we show that the likelihood-based cross-validation (LCV) is a nearly unbiased estimator of it. We give a general form of an approximation of the leave-one-out LCV. The approach is studied by simulations, and it is illustrated by estimating a Markov and two semi-Markov illness–death models with an application to dementia using data from a large cohort study.

The modified zero order approach to estimating coefficients in the face of missing observations treats them as parameters to be estimated simultaneously with the missing observations. The paper then investigates (in the context of Han's generalized regression model) (i) when parameter estimators do not vary between using the partial data points and using only the complete ones (the informationless result), and (ii) large sample properties of the modified zero order estimator. It is found that the sequential cut property is crucial to the informationless result for coefficient estimators; consistency of the modified zero order estimator depends on the percentage of observations with missing elements for large sample sizes or on the sequential cut property.

The central topic of this article is the estimation of parameters of the generalized partially linear single-index model (GPLSIM). Two numerical optimization procedures are presented and an S-plus program based on these procedures is compared to a program by Wand in a simulation setting. The results from these simulations indicate that the estimates for the new procedures are as good as, if not better than, Wand's. Also, this program is much more flexible than Wand's since it can handle more general models. Other simulations are also conducted. The first compares the effects of using linear interpolation versus spline interpolation in an optimization procedure. The results indicate that by using spline interpolation one gets more stable estimates at a cost of increased computational time. A second simulation was conducted to assess the performance of a method for estimating the variance of alpha. A third set of simulations is carried out to determine the best criterion for testing that one of the elements of alpha is equal to zero. The GPLSIM is applied to a water quality data set and the results indicate an interesting relationship between gastrointestinal illness and turbidity (cloudiness) of drinking water.
