Similar literature
 Found 20 similar documents; search took 15 ms
1.
Inference for semi-Markov models under panel data presents considerable computational difficulties. In general the likelihood is intractable, but a tractable likelihood with the form of a hidden Markov model can be obtained if the sojourn times in each of the states are assumed to have phase-type distributions. However, using phase-type distributions directly may be undesirable as they require estimation of parameters which may be poorly identified. In this article, an approach to fitting semi-Markov models with standard parametric sojourn distributions is developed. The method involves establishing a family of Coxian phase-type distribution approximations to the parametric distribution and merging approximations for different states to obtain an approximate semi-Markov process with a tractable likelihood. Approximations are developed for Weibull and Gamma distributions and demonstrated on data relating to post-lung-transplantation patients.

2.
Matrix-analytic Models and their Analysis   Total citations: 2, self-citations: 0, citations by others: 2
We survey phase-type distributions and Markovian point processes, aspects of how to use such models in applied probability calculations and how to fit them to observed data. A phase-type distribution is defined as the time to absorption in a finite continuous time Markov process with one absorbing state. This class of distributions is dense and contains many standard examples like all combinations of exponentials in series/parallel. A Markovian point process is governed by a finite continuous time Markov process (typically ergodic), such that points are generated at a Poisson intensity depending on the underlying state and at transitions; a main special case is a Markov-modulated Poisson process. In both cases, the analytic formulas typically contain matrix-exponentials, and the matrix formalism carries over when the models are used in applied probability calculations as in problems in renewal theory, random walks and queueing. The statistical analysis is typically based upon the EM algorithm, viewing the whole sample path of the background Markov process as the latent variable.
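
The definition above (time to absorption in a finite continuous-time Markov process) leads to the standard closed form F(t) = 1 - alpha exp(T t) 1, where alpha is the initial distribution over the transient states and T is the sub-generator restricted to them. The following is a minimal sketch of that formula, not code from the surveyed paper; the function names and the truncated-series matrix exponential are illustrative assumptions.

```python
import math


def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def mat_exp(a, squarings=10, terms=20):
    """Approximate exp(a) by a truncated Taylor series with scaling and squaring.

    Illustrative only: adequate for small, well-scaled matrices.
    """
    n = len(a)
    scale = 2.0 ** squarings
    scaled = [[x / scale for x in row] for row in a]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term holds scaled**k / k! after this step
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def phase_type_cdf(alpha, sub_generator, t):
    """CDF of a phase-type distribution: 1 - alpha * exp(T t) * 1."""
    e = mat_exp([[x * t for x in row] for row in sub_generator])
    survival = sum(alpha[i] * sum(e[i]) for i in range(len(alpha)))
    return 1.0 - survival


# Two exponential phases in series with rate 2 (an Erlang-2 distribution).
erlang2_cdf = phase_type_cdf([1.0, 0.0], [[-2.0, 2.0], [0.0, -2.0]], 1.5)
```

For this series example the result can be checked against the closed-form Erlang-2 CDF, 1 - exp(-2t)(1 + 2t).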

3.
Most statistical models arising in real life applications as well as in interdisciplinary research are complex in their designs, sampling plans, and associated probability laws, which in turn are often constrained by inequality, order, functional, shape or other restraints. Optimality of conventional likelihood ratio based statistical inference may not be tenable here, although the use of restricted or quasi-likelihood has been spurred in such environments. S.N. Roy's ingenious union–intersection principle provides an alternative avenue, often having some computational advantages, increased scope of adaptability, and flexibility beyond conventional likelihood paradigms. This scenario is appraised here with some illustrative examples, and with some interesting problems of inference on stochastic ordering (dominance) in parametric as well as beyond parametric setups.

4.
In this paper we are concerned with variable selection in finite mixtures of semiparametric regression models. This task consists of model selection for the nonparametric component and variable selection for the parametric part. Thus, we encounter separate model selections for every nonparametric component of each sub-model. To overcome this computational burden, we introduce a class of variable selection procedures for finite mixtures of semiparametric regression models using a penalized approach for variable selection. It is shown that the new method is consistent for variable selection. Simulations show that the performance of the proposed method is good, and it consequently improves on previous work in this area and also requires much less computing power than existing methods.

5.
As a useful extension of partially linear models and varying coefficient models, the partially linear varying coefficient model is useful in statistical modelling. This paper considers statistical inference for the semiparametric model when the covariates in the linear part are measured with additive error and some additional linear restrictions on the parametric component are available. We propose a restricted modified profile least-squares estimator for the parametric component, and prove the asymptotic normality of the proposed estimator. To test hypotheses on the parametric component, we propose a test statistic based on the difference between the corrected residual sums of squares under the null and alternative hypotheses, and show that its limiting distribution is a weighted sum of independent chi-square distributions. We also develop an adjusted test statistic, which has an asymptotically standard chi-squared distribution. Some simulation studies are conducted to illustrate our approaches.

6.
As a compromise between parametric regression and non-parametric regression models, partially linear models are frequently used in statistical modelling. This paper is concerned with the estimation of the partially linear regression model in the presence of multicollinearity. Based on the profile least-squares approach, we propose a novel principal components regression (PCR) estimator for the parametric component. When some additional linear restrictions on the parametric component are available, we construct a corresponding restricted PCR estimator. Some simulations are conducted to examine the performance of our proposed estimators and the results are satisfactory. Finally, a real data example is analysed.

7.
As a compromise between parametric regression and nonparametric regression, partially linear models are frequently used in statistical modelling. This article considers statistical inference for this semiparametric model when the linear covariate is measured with additive error and some additional linear restrictions on the parametric component are assumed to hold. We propose a restricted corrected profile least-squares estimator for the parametric component, and study the asymptotic normality of the estimator. To test hypotheses on the parametric component, we construct a Wald test statistic and obtain its limiting distribution. Some simulation studies are conducted to illustrate our approaches.

8.
The computational demand required to perform inference using Markov chain Monte Carlo methods often obstructs a Bayesian analysis. This may be a result of large datasets, complex dependence structures, or expensive computer models. In these instances, the posterior distribution is replaced by a computationally tractable approximation, and inference is based on this working model. However, the error that is introduced by this practice is not well studied. In this paper, we propose a methodology that allows one to examine the impact on statistical inference by quantifying the discrepancy between the intractable and working posterior distributions. This work provides a structure to analyse model approximations with regard to the reliability of inference and computational efficiency. We illustrate our approach through a spatial analysis of yearly total precipitation anomalies where covariance tapering approximations are used to alleviate the computational demand associated with inverting a large, dense covariance matrix.

9.
A number of statistical problems use the moment generating function (mgf) for purposes other than determining the moments of a distribution. If the distribution is not completely specified, then the mgf must be estimated from available data. The empirical mgf makes no assumptions concerning the underlying distribution except for the existence of the mgf. In contrast to the nonparametric approach provided by the empirical mgf, alternative estimators can be formed based on an assumed parametric model. Comparison of these approaches is considered for two parametric models: the normal and a one-parameter gamma. Comparison criteria are efficiency and empirical confidence interval coverage. In general the parametric estimators outperform the empirical mgf when the model is correct. The comparisons are extended to underlying models which are two-component mixtures from the distributional family assumed by the parametric estimators. Under the mixture models the superiority of the parametric estimator depends upon the model, the value of the argument of the mgf, and the comparison criterion. The empirical mgf is the better estimator in some cases.
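
The two kinds of estimator contrasted above can be sketched side by side. This is an illustrative sketch, not the paper's code: the empirical mgf is the sample average of exp(t x), and the parametric alternative shown here plugs maximum likelihood estimates into the normal mgf exp(mu t + sigma^2 t^2 / 2). The function names and the choice of the maximum likelihood variance (divisor n) are assumptions.

```python
import math


def empirical_mgf(data, t):
    """Nonparametric estimate: the sample mean of exp(t * x)."""
    return sum(math.exp(t * x) for x in data) / len(data)


def normal_mgf_estimate(data, t):
    """Parametric estimate under an assumed normal model.

    Plugs the sample mean and the maximum likelihood variance (divisor n)
    into the normal mgf exp(mu * t + sigma^2 * t^2 / 2).
    """
    n = len(data)
    mean = sum(data) / n
    variance = sum((x - mean) ** 2 for x in data) / n
    return math.exp(mean * t + 0.5 * variance * t * t)


sample = [0.0, 1.0]
empirical_at_one = empirical_mgf(sample, 1.0)
parametric_at_one = normal_mgf_estimate(sample, 1.0)
```

Both estimators equal 1 at t = 0 for any sample; they differ elsewhere, which is where a comparison of efficiency and coverage becomes meaningful.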

10.
The likelihood ratio is used for measuring the strength of statistical evidence. The probability of observing strong misleading evidence, along with that of observing weak evidence, is used to evaluate the performance of this measure. When the corresponding likelihood function is expressed in terms of a parametric statistical model that fails, the likelihood ratio retains its evidential value if the likelihood function is robust [Royall, R., Tsou, T.S., 2003. Interpreting statistical evidence by using imperfect models: robust adjusted likelihood functions. J. Roy. Statist. Soc. Ser. B 65, 391–404]. In this paper, we extend the theory of Royall and Tsou [2003. Interpreting statistical evidence by using imperfect models: robust adjusted likelihood functions. J. Roy. Statist. Soc., Ser. B 65, 391–404] to the case when the assumed working model is a characteristic model for two-way contingency tables (the model of independence, association and correlation models). We observe that association and correlation models are not equivalent in terms of statistical evidence. The association models are bounded by the maximum of the bump function while the correlation models are not.

11.
We propose a new semiparametric Weibull cure rate model for fitting nonlinear effects of explanatory variables on the mean, scale and cure rate parameters. The regression model is based on the generalized additive models for location, scale and shape, for which any or all distribution parameters can be modeled as parametric linear and/or nonparametric smooth functions of explanatory variables. We present methods for selecting additive terms and for model estimation and validation, where all computational codes are presented in a simple way such that any R user can fit the new model. Biases of the parameter estimates caused by models specified erroneously are investigated through Monte Carlo simulations. We illustrate the usefulness of the new model by means of two applications to real data. We provide computational codes to fit the new regression model in the R software.

12.
A variety of statistical regression models have been proposed for the comparison of ROC curves for different markers across covariate groups. Pepe developed parametric models for the ROC curve that induce a semiparametric model for the marker distributions to relax the strong assumptions in fully parametric models. We investigate the analysis of the power ROC curve using these ROC-GLM models compared to the parametric exponential model and the estimating equations derived from the usual partial likelihood methods in time-to-event analyses. In exploring the robustness to violations of distributional assumptions, we find that the ROC-GLM provides an extra measure of robustness.

13.
The process comparing the empirical cumulative distribution function of the sample with a parametric estimate of the cumulative distribution function is known as the empirical process with estimated parameters and has been extensively employed in the literature for goodness‐of‐fit testing. The simplest way to carry out such goodness‐of‐fit tests, especially in a multivariate setting, is to use a parametric bootstrap. Although very easy to implement, the parametric bootstrap can become very computationally expensive as the sample size, the number of parameters, or the dimension of the data increases. An alternative resampling technique based on a fast weighted bootstrap is proposed in this paper, and is studied both theoretically and empirically. The outcome of this work is a generic and computationally efficient multiplier goodness‐of‐fit procedure that can be used as a large‐sample alternative to the parametric bootstrap. In order to approximately determine how large the sample size needs to be for the parametric and weighted bootstraps to have roughly equivalent powers, extensive Monte Carlo experiments are carried out in dimensions one, two and three, and for models containing up to nine parameters. The computational gains resulting from the use of the proposed multiplier goodness‐of‐fit procedure are illustrated on trivariate financial data. A by‐product of this work is a fast large‐sample goodness‐of‐fit procedure for the bivariate and trivariate t distribution whose degrees of freedom are fixed. The Canadian Journal of Statistics 40: 480–500; 2012 © 2012 Statistical Society of Canada

14.
Previously, we developed a modeling framework which classifies individuals with respect to their length of stay (LOS) in the transient states of a continuous-time Markov model with a single absorbing state; phase-type models are used for each class of the Markov model. We here add costs and obtain results for moments of total costs in (0, t], for an individual, a cohort arriving at time zero and when arrivals are Poisson. Based on stroke patient data from the Belfast City Hospital we use the overall modelling framework to obtain results for total cost in a given time interval.

15.
The statistical literature on the analysis of discrete variate time series has concentrated mainly on parametric models, that is, the conditional probability mass function is assumed to belong to a parametric family. Generally, these parametric models impose strong assumptions on the relationship between the conditional mean and variance. To relax these implausible assumptions, this paper instead considers a more realistic semiparametric model, called the random rounded integer-valued autoregressive conditional heteroskedastic (RRINARCH) model, where there are essentially no assumptions on the relationship between the conditional mean and variance. The new model has several advantages: (a) it provides a coherent semiparametric framework for discrete variate time series, in which the conditional mean and variance can be modeled separately; (b) it allows negative values both for the series and its autocorrelation function; (c) its autocorrelation structure is the same as that of a standard autoregressive (AR) process; (d) standard software for its estimation is directly applicable. For the new model, conditions for stationarity, ergodicity and the existence of moments are established and the consistency and asymptotic normality of the conditional least squares estimator are proved. Simulation experiments are carried out to assess the performance of the model. The analyses of real data sets illustrate the flexibility and usefulness of the RRINARCH model for obtaining more realistic forecast means and variances.

16.
This paper investigates the estimation of parameters in a multivariate quantile regression model when the investigator wants to evaluate the associated distribution function. It proposes a new directional quantile estimator with the following properties: (1) it applies to an arbitrary number of random variables; (2) it is equivalent to estimating the distribution function allowing for non-convex distribution contours; (3) it satisfies nice equivariance properties; (4) it has desirable statistical properties (i.e., consistency and asymptotic normality); and (5) its implementation involves a modest computational burden: our proposed estimator can be obtained by solving parametric linear programming problems. As such, this paper expands the range of applications of quantile estimation for multivariate regression models.

17.
The Fisher information is intricately linked to the asymptotic (first-order) optimality of maximum likelihood estimators for parametric complete-data models. When data are missing completely at random in a multivariate setup, it is shown that information in a single observation is well-defined and it plays the same role as in the complete-data model in characterizing the first-order asymptotic optimality properties of associated maximum likelihood estimators; computational aspects are also thoroughly appraised. As an illustration, the logistic regression model with incomplete binary responses and an incomplete categorical covariate is worked out.

18.
We introduce a combined two-stage least-squares (2SLS)–expectation maximization (EM) algorithm for estimating vector-valued autoregressive conditional heteroskedasticity models with standardized errors generated by Gaussian mixtures. The procedure incorporates the identification of the parametric settings as well as the estimation of the model parameters. Our approach does not require a priori knowledge of the Gaussian densities. The parametric settings of the 2SLS_EM algorithm are determined by the genetic hybrid algorithm (GHA). We test the GHA-driven 2SLS_EM algorithm on some simulated cases and on international asset pricing data. The statistical properties of the estimated models and the derived mixture densities indicate good performance of the algorithm. We conduct tests on a massively parallel processor supercomputer to cope with situations involving numerous mixtures. We show that the algorithm is scalable.

19.
Computational methods for local regression   Total citations: 1, self-citations: 0, citations by others: 1
Local regression is a nonparametric method in which the regression surface is estimated by fitting parametric functions locally in the space of the predictors using weighted least squares in a moving fashion similar to the way that a time series is smoothed by moving averages. Three computational methods for local regression are presented. First, fast surface fitting and evaluation is achieved by building a k-d tree in the space of the predictors, evaluating the surface at the corners of the tree, and then interpolating elsewhere by blending functions. Second, surfaces are made conditionally parametric in any proper subset of the predictors by a simple alteration of the weighting scheme. Third, degree-of-freedom quantities that would be extremely expensive to compute exactly are approximated, not by numerical methods, but through a statistical model that predicts the quantities from the trace of the hat matrix, which can be computed easily.
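
The core idea described above, fitting a parametric function locally by weighted least squares, can be sketched for a single predictor as follows. This is a direct, unaccelerated fit at one point and does not implement the k-d tree interpolation, the conditionally parametric weighting, or the degrees-of-freedom approximation from the abstract. The tricube weight function, the nearest-neighbour bandwidth and all names are assumptions chosen for illustration.

```python
import math


def tricube(u):
    """Tricube weight: (1 - u^3)^3 for 0 <= u < 1, zero otherwise."""
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def local_linear_fit(xs, ys, x0, span=0.5):
    """Fit a weighted least-squares line around x0 and evaluate it at x0.

    The bandwidth is the distance to the k-th nearest point, where k is
    the fraction `span` of the sample size (at least 2).
    """
    n = len(xs)
    k = min(n, max(2, math.ceil(span * n)))
    dists = [abs(x - x0) for x in xs]
    h = sorted(dists)[k - 1]
    weights = [tricube(d / h) if h > 0.0 else 0.0 for d in dists]
    if sum(weights) == 0.0:
        # Degenerate neighbourhood: fall back to equal weights within h.
        weights = [1.0 if d <= h else 0.0 for d in dists]
    sw = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / sw
    mean_y = sum(w * y for w, y in zip(weights, ys)) / sw
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx if sxx > 0.0 else 0.0
    return mean_y + slope * (x0 - mean_x)


xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
fitted = local_linear_fit(xs, ys, 4.5)
```

On exactly linear data a local linear fit reproduces the line, which gives a simple sanity check; repeating the fit over a grid of x0 values traces out the smoothed curve.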

20.
The multinomial logit model (MNL) is one of the most frequently used statistical models in marketing applications. It allows one to relate an unordered categorical response variable, for example representing the choice of a brand, to a vector of covariates such as the price of the brand or variables characterising the consumer. In its classical form, all covariates enter in strictly parametric, linear form into the utility function of the MNL model. In this paper, we introduce semiparametric extensions, where smooth effects of continuous covariates are modelled by penalised splines. A mixed model representation of these penalised splines is employed to obtain estimates of the corresponding smoothing parameters, leading to a fully automated estimation procedure. To validate semiparametric models against parametric models, we utilise different scoring rules as well as predicted market share and compare parametric and semiparametric approaches for a number of brand choice data sets.
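
The classical form mentioned above, in which covariates enter the utility linearly, gives choice probabilities proportional to the exponential of each alternative's utility. The sketch below shows only that classical parametric form, not the penalised-spline extension the paper introduces; the function name, the single price covariate and the coefficient value are hypothetical.

```python
import math


def mnl_probabilities(covariates, beta):
    """Choice probabilities for a multinomial logit with linear utilities.

    covariates: one list of covariate values per alternative (e.g. brand).
    beta: coefficient vector shared across alternatives.
    """
    utilities = [sum(b * x for b, x in zip(beta, row)) for row in covariates]
    # Subtract the maximum utility before exponentiating for stability.
    top = max(utilities)
    exps = [math.exp(u - top) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


# Two brands priced at 1.0 and 2.0 with a hypothetical price coefficient of -1.
probs = mnl_probabilities([[1.0], [2.0]], [-1.0])
```

A semiparametric extension of the kind described would replace the linear term for a continuous covariate with a smooth function of that covariate, leaving the normalisation step unchanged.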
