Similar documents
20 similar documents found.
1.
A new method is proposed for analysing time to ankylosis complication on a dataset of replanted teeth. In this context of left-censored, interval-censored and right-censored data, a Cox model with a piecewise constant baseline hazard is introduced. Estimation is carried out with the expectation-maximisation (EM) algorithm by treating the true event times as unobserved variables. This estimation procedure is shown to produce a block-diagonal Hessian matrix of the baseline parameters. Taking advantage of this feature of the EM algorithm, an L0-penalised likelihood method is implemented to determine automatically the number and locations of the cuts of the baseline hazard. This procedure makes it possible to detect specific time periods in which patients are at greater risk of ankylosis. The method extends directly to the inclusion of exact observations and of a cure fraction. Theoretical results are obtained that allow statistical inference on the model parameters to be derived from asymptotic likelihood theory. Simulation studies show that the penalisation technique provides a good fit of the baseline hazard and precise estimates of the resulting regression parameters.
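As a rough illustration of the piecewise-constant baseline hazard underlying such a Cox model, the Python sketch below evaluates a step hazard and the corresponding baseline survival function; the cut points and hazard levels are hypothetical and not taken from the paper.

```python
import numpy as np

def piecewise_hazard(t, cuts, levels):
    """Step baseline hazard: `cuts` are interior cut points, `levels` the
    constant hazard on each of the len(cuts) + 1 resulting intervals."""
    idx = np.searchsorted(cuts, np.asarray(t, dtype=float), side="right")
    return np.asarray(levels)[idx]

def cumulative_hazard(t, cuts, levels):
    """Integral of the step hazard from 0 to t, used for the baseline survival."""
    edges = np.concatenate(([0.0], cuts, [np.inf]))
    t = np.asarray(t, dtype=float)
    H = np.zeros_like(t)
    for k, lam in enumerate(levels):
        # time spent in interval k, capped at the interval width
        H += lam * np.clip(t - edges[k], 0.0, edges[k + 1] - edges[k])
    return H

# hypothetical cuts and hazard levels
cuts, levels = [2.0, 5.0], [0.05, 0.20, 0.08]
t = np.array([1.0, 3.0, 7.0])
print(piecewise_hazard(t, cuts, levels))            # hazard at each time
print(np.exp(-cumulative_hazard(t, cuts, levels)))  # baseline survival S0(t)
```

In the Cox model above, a subject's hazard would be this baseline multiplied by exp(x'beta) for covariates x.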

2.
A transformation is proposed to convert the nonlinear constraints on the parameters of the mixture transition distribution (MTD) model into box constraints. The proposed transformation removes the difficulties associated with the maximum likelihood estimation (MLE) process in MTD modeling, so that the MLEs of the parameters can be obtained easily via a hybrid algorithm combining evolutionary algorithms and/or quasi-Newton algorithms for global optimization. Simulation studies are conducted to demonstrate MTD modeling by the proposed approach through a global search algorithm in the R environment. Finally, the proposed approach is used for MTD modeling of three real data sets.

3.
Grouped data are frequently used in several fields of study. In this work, we use the expectation-maximization (EM) algorithm to fit the skew-normal (SN) mixture model to grouped data. Implementing the EM algorithm requires computing a one-dimensional integral for each group or class. Our simulation study and real data analyses reveal that the EM algorithm not only always converges but also can be run in just a few seconds even when the number of components is large, in contrast to the Bayesian paradigm, which is computationally expensive. The accuracy of the EM algorithm and the superiority of the SN mixture model over the traditional normal mixture model in modelling grouped data are demonstrated through the simulation and three real data illustrations. For implementing the EM algorithm, we use the ForestFit package for the R environment, available at https://cran.r-project.org/web/packages/ForestFit/index.html.
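A minimal sketch of the one-dimensional integrals mentioned above: under a skew-normal mixture, each class probability is a difference of component cdf values. The class limits, counts and parameter values below are made up for illustration and are not taken from the ForestFit package or the paper.

```python
import numpy as np
from scipy.stats import skewnorm

def grouped_loglik(breaks, counts, weights, locs, scales, shapes):
    """Log-likelihood of grouped data under a skew-normal mixture: the
    probability of each class (a, b] is the integral F(b) - F(a), summed
    over components with their mixing weights."""
    breaks = np.asarray(breaks, dtype=float)
    probs = np.zeros(len(counts))
    for w, m, s, sh in zip(weights, locs, scales, shapes):
        cdf = skewnorm.cdf(breaks, sh, loc=m, scale=s)
        probs += w * np.diff(cdf)          # mixture probability of each class
    return float(np.sum(np.asarray(counts) * np.log(probs)))

# hypothetical class limits and frequencies
breaks = [0, 5, 10, 15, 20, 30]
counts = [12, 40, 55, 30, 8]
print(grouped_loglik(breaks, counts,
                     weights=[0.6, 0.4], locs=[8.0, 15.0],
                     scales=[3.0, 4.0], shapes=[2.0, -1.0]))
```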

4.
In recent years, joint analysis of longitudinal measurements and survival data has received much attention. However, previous work has primarily focused on a single failure type for the event time. In this article, we consider joint modeling of repeated measurements and competing risks failure time data, allowing more than one distinct failure type in the survival endpoint: we fit a cause-specific hazards sub-model to accommodate competing risks, with a separate latent association between the longitudinal measurements and each cause of failure. Moreover, previous work has not addressed hypothesis testing of a separate latent association between the longitudinal measurements and each cause of failure. In this article, we derive a score test to identify longitudinal biomarkers or surrogates for a time-to-event outcome in competing risks data. With a carefully chosen definition of complete data, maximum likelihood estimation of the cause-specific hazard functions is performed via an EM algorithm. We extend this work by allowing random effects to be present in both the longitudinal biomarker and the underlying survival function. The random effects in the biomarker are introduced via an explicit term, while the random effect in the underlying survival function is introduced through a frailty in the model.

We use simulations to explore how the number of individuals, the number of time points per individual, and the functional form of the random effects from the longitudinal biomarkers, under heterogeneous baseline hazards across individuals, influence the power to detect the association between a longitudinal biomarker and the survival time.
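The sketch below simulates data in the spirit of this joint model: a shared subject-level random effect drives both the longitudinal biomarker trajectory and two cause-specific (competing risks) hazards. The exponential hazard form and all parameter values are illustrative assumptions, not the paper's specification.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_joint(n=200, n_times=5, beta=0.5, gammas=(0.8, -0.4),
                   base_rates=(0.05, 0.03)):
    """Shared random intercept b links the biomarker and two cause-specific
    hazards; cause k has rate base_rates[k] * exp(gammas[k] * b)."""
    b = rng.normal(0.0, 1.0, n)                          # shared random effect
    times = np.arange(n_times)
    # longitudinal biomarker: common slope + random intercept + noise
    y = beta * times + b[:, None] + rng.normal(0.0, 0.5, (n, n_times))
    # latent failure time for each competing cause, exponential for simplicity
    latent = np.column_stack([
        rng.exponential(1.0 / (r * np.exp(g * b)))
        for r, g in zip(base_rates, gammas)
    ])
    event_time = latent.min(axis=1)
    cause = latent.argmin(axis=1) + 1                    # observed cause: 1 or 2
    return y, event_time, cause

y, t, cause = simulate_joint()
print("events per cause:", np.bincount(cause)[1:])
```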


5.
The REBMIX algorithm is presented and applied to the estimation of finite univariate mixture densities. The algorithm identifies the component parameters, mixing weights, and number of components successively. A significant improvement is achieved by replacing the rigid restraints with loose ones, which enables improved modelling of overlapping components. The algorithm is controlled by the extreme relative deviations, the total of positive relative deviations, and information criteria. It also enables the modelling of multivariate finite mixtures; however, this article considers only univariate normal, lognormal, and Weibull finite mixtures. The REBMIX software is available at http://www.fs.uni-lj.si/lavek.

6.
Tree algorithms are a well-known class of random access algorithms with a provable maximum stable throughput under the infinite population model (as opposed to ALOHA or the binary exponential backoff algorithm). In this article, we propose a tree algorithm for opportunistic spectrum usage in cognitive radio networks. A channel in such a network is shared among so-called primary and secondary users, where the secondary users are allowed to use the channel only if there is no primary user activity. The tree algorithm designed in this article can be used by the secondary users to share the channel capacity left by the primary users.

We analyze the maximum stable throughput and mean packet delay of the secondary users by developing a tree-structured quasi-birth-death Markov chain, under the assumption that the primary user activity can be modeled by a finite state Markov chain and that packet lengths follow a discrete phase-type distribution.

Numerical experiments provide insight into the effect of various system parameters and indicate that the proposed algorithm is able to make good use of the bandwidth left by the primary users.
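For intuition, here is a small simulation of the classical binary tree (splitting) algorithm on which such schemes build: after a collision, each packet joins the left subtree with probability p, and the right subtree is served once the left one is resolved. This is only the textbook algorithm; the primary-user dynamics and the quasi-birth-death analysis of the article are not reproduced.

```python
import random

rng = random.Random(0)

def resolve(n, p=0.5):
    """Slots needed by the basic binary tree algorithm to resolve a collision
    among n packets (1 slot if the group is empty or holds a single packet)."""
    if n <= 1:
        return 1
    left = sum(rng.random() < p for _ in range(n))   # coin flip per packet
    return 1 + resolve(left, p) + resolve(n - left, p)

# average collision-resolution length for different collision sizes
for n in (2, 4, 8, 16):
    slots = [resolve(n) for _ in range(2000)]
    print(n, sum(slots) / len(slots))
```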


7.
This paper proposes an extension of the general location model using a joint model for analyzing inflated count outcomes and skewed continuous outcomes. A zero-inflated binomial with batches of binomials, or a zero-inflated Poisson with batches of Poissons, is proposed for the count outcome, and a skew-normal distribution is assumed for the continuous outcome. An EM algorithm is developed for parameter estimation. The accuracy of the estimates is evaluated using a simulation study. An application of our models to the joint analysis of the number of cigarettes smoked per day and the weights of respondents in the Americans' Changing Lives study is included.
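As a building block of such models, the sketch below evaluates a zero-inflated Poisson pmf; the inflation probability and rate are arbitrary illustrative values.

```python
import numpy as np
from scipy.stats import poisson

def zip_pmf(k, pi, lam):
    """Zero-inflated Poisson: with probability pi the count is a structural
    zero, otherwise it comes from a Poisson(lam) distribution."""
    k = np.asarray(k)
    base = (1.0 - pi) * poisson.pmf(k, lam)
    return np.where(k == 0, pi + base, base)

k = np.arange(6)
print(zip_pmf(k, pi=0.3, lam=2.5))   # extra mass at zero...
print(poisson.pmf(k, 2.5))           # ...compared with a plain Poisson(2.5)
```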

8.
9.
Hidden semi-Markov models (HSMMs) were introduced to overcome the constraint of a geometric sojourn time distribution for the different hidden states in classical hidden Markov models. Several variants of HSMMs have been proposed that model the sojourn times by a parametric or a nonparametric family of distributions. In this article, we concentrate on the nonparametric case, where the duration distributions are attached to transitions and not, as in most published papers on HSMMs, to states. It is therefore worth noting that we treat the underlying hidden semi-Markov chain in its general probabilistic structure. For that case, Barbu and Limnios (2008, Semi-Markov Chains and Hidden Semi-Markov Models Toward Applications: Their Use in Reliability and DNA Analysis, Springer, New York) proposed an Expectation-Maximization (EM) algorithm to estimate the semi-Markov kernel and the emission probabilities that characterize the dynamics of the model. In this article, we consider an improved version of Barbu and Limnios' EM algorithm which is faster than the original one. Moreover, we propose a stochastic version of the EM algorithm that achieves estimates comparable with those of the EM algorithm in less execution time. Numerical examples are provided that illustrate the efficient performance of the proposed algorithms.

10.
The joint analysis of longitudinal measurements and survival data is useful in clinical trials and other medical studies. In this paper, we consider a joint model which assumes a linear mixed t model for longitudinal measurements and a promotion time cure model for survival data and links these two models through a latent variable. A semiparametric inference procedure with an EM algorithm implementation is developed for the parameters in the joint model. The proposed procedure is evaluated in a simulation study and applied to analyze the quality of life and time to recurrence data from a clinical trial on women with early breast cancer. The Canadian Journal of Statistics 40: 207–224; 2012 © 2012 Statistical Society of Canada
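A minimal sketch of the promotion time cure model used for the survival part, assuming the standard form S(t) = exp(-theta * F(t)) with a Weibull F chosen purely for illustration; the linkage to the longitudinal sub-model via the latent variable is not shown.

```python
import numpy as np
from scipy.stats import weibull_min

def promotion_time_survival(t, theta, F=lambda t: weibull_min.cdf(t, 1.5, scale=2.0)):
    """Population survival S(t) = exp(-theta * F(t)); exp(-theta) is the
    cure fraction, reached as F(t) approaches 1."""
    return np.exp(-theta * F(np.asarray(t, dtype=float)))

theta = 1.2
t = np.array([0.0, 1.0, 5.0, 50.0])
print(promotion_time_survival(t, theta))
print("cure fraction:", np.exp(-theta))
```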

11.
Multivariate data are present in many research areas. Their analysis is challenging when assumptions of normality are violated and the data are discrete. Poisson counts are perhaps the most common type of discrete data, but their inflated and doubly inflated counterparts are gaining popularity (Sengupta, Chaganty, and Sabo 2015; Lee, Jung, and Jin 2009; Agarwal, Gelfand, and Citron-Pousty 2002).

Our aim is to build a statistical model that is tractable and can be used to estimate the parameters of the multivariate doubly inflated Poisson model. To preserve the correlation structure, we incorporate ideas from copula distributions. A multivariate doubly inflated Poisson distribution using a Gaussian copula is introduced. Data simulation and parameter estimation algorithms are also provided. Residual checks are carried out to assess any substantial biases. The model dimensionality is increased to test the performance of the proposed estimation method. All results show high efficiency and promising outcomes in the modeling of discrete data, and particularly of doubly inflated Poisson count data, under a novel modified algorithm.
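To make the copula idea concrete, the sketch below draws correlated Poisson counts through a Gaussian copula; the doubly inflated part (extra mass at zero and at a second point) would be mixed in on top of this step and is omitted here. The rates and correlation matrix are illustrative.

```python
import numpy as np
from scipy.stats import norm, poisson

def sample_copula_poisson(n, lams, corr, seed=0):
    """Sample n vectors of correlated Poisson counts: multivariate normal
    draws are pushed through the normal cdf to get dependent uniforms,
    which are then inverted through the Poisson cdf of each margin."""
    rng = np.random.default_rng(seed)
    z = rng.multivariate_normal(np.zeros(len(lams)), corr, size=n)
    u = norm.cdf(z)
    return poisson.ppf(u, lams).astype(int)

corr = np.array([[1.0, 0.6], [0.6, 1.0]])
x = sample_copula_poisson(5000, lams=[2.0, 4.0], corr=corr)
print(np.corrcoef(x, rowvar=False))   # dependence induced between the counts
```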


12.
Efficient, accurate, and fast Markov chain Monte Carlo estimation methods based on the Implicit approach are proposed. In this article, we introduce the notion of the Implicit method for estimating the parameters of stochastic volatility models.

Implicit estimation offers a substantial computational advantage for learning from observations without prior knowledge, and thus provides a good alternative to classical Bayesian inference when priors are missing.

Both the Implicit and the Bayesian approaches are illustrated using simulated data and are applied to analyze daily stock return data on the CAC40 index.


13.
14.
The EM algorithm is a popular method for parameter estimation in situations where the data can be viewed as incomplete. As each E-step visits every data point on a given iteration, the EM algorithm requires considerable computation time when applied to large data sets. Two versions, the incremental EM (IEM) algorithm and a sparse version of the EM algorithm, were proposed by Neal and Hinton (in Jordan, M.I. (Ed.), Learning in Graphical Models, Kluwer, Dordrecht, 1998, pp. 355–368) to reduce the computational cost of applying the EM algorithm. With the IEM algorithm, the available n observations are divided into B (B ≤ n) blocks, and the E-step is implemented for only one block of observations at a time before the next M-step is performed. With the sparse version of the EM algorithm for fitting mixture models, only those posterior probabilities of component membership of the mixture that are above a specified threshold are updated; the remaining component-posterior probabilities are held fixed. In this paper, simulations are performed to assess the relative performance of the IEM algorithm with various numbers of blocks and of the standard EM algorithm. In particular, we propose a simple rule for choosing the number of blocks with the IEM algorithm. For the IEM algorithm in the extreme case of one observation per block, we provide efficient updating formulas which avoid the direct calculation of the inverses and determinants of the component-covariance matrices. Moreover, a sparse version of the IEM algorithm (SPIEM) is formulated by combining the sparse E-step of the EM algorithm and the partial E-step of the IEM algorithm. This SPIEM algorithm can further reduce the computation time of the IEM algorithm.
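A compact sketch of the IEM idea for a one-dimensional Gaussian mixture, assuming equal-sized blocks: only one block's responsibilities are refreshed per iteration, while the M-step reuses the cached responsibilities of all other blocks. It does not include the sparse E-step or the efficient covariance-update formulas discussed above.

```python
import numpy as np
from scipy.stats import norm

def iem_gmm_1d(x, K=2, B=10, iters=200, seed=0):
    """Incremental EM for a 1-D Gaussian mixture: partial E-step on one block
    per iteration, full M-step from the cached responsibilities."""
    rng = np.random.default_rng(seed)
    n = len(x)
    blocks = np.array_split(np.arange(n), B)
    w = np.full(K, 1.0 / K)
    mu = rng.choice(x, K, replace=False)
    sig = np.full(K, x.std())
    resp = np.full((n, K), 1.0 / K)          # cached responsibilities
    for it in range(iters):
        b = blocks[it % B]                   # block visited in this iteration
        dens = w * norm.pdf(x[b, None], mu, sig)
        resp[b] = dens / dens.sum(axis=1, keepdims=True)     # partial E-step
        Nk = resp.sum(axis=0)                                 # M-step
        w = Nk / n
        mu = (resp * x[:, None]).sum(axis=0) / Nk
        sig = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / Nk)
    return w, mu, sig

rng = np.random.default_rng(42)
x = np.concatenate([rng.normal(0.0, 1.0, 500), rng.normal(4.0, 1.0, 500)])
print(iem_gmm_1d(x))
```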

15.
This article provides a procedure for the detection and identification of outliers in the spectral domain, where the Whittle maximum likelihood estimator of the panel data model proposed by Chen [W.D. Chen, Testing for spurious regression in a panel data model with the individual number and time length growing, J. Appl. Stat. 33(88) (2006b), pp. 759–772] is implemented. We extend the approach of Chang and co-workers [I. Chang, G.C. Tiao, and C. Chen, Estimation of time series parameters in the presence of outliers, Technometrics 30(2) (1988), pp. 193–204] to the spectral domain, and through the Whittle approach we can quickly detect and identify the type of outliers. A fixed effects panel data model is used, in which the remainder disturbance is assumed to be a fractional autoregressive integrated moving-average (ARFIMA) process, and the likelihood ratio criterion is obtained directly through the modified inverse Fourier transform. This saves considerable time, especially when the model is fitted to a huge data set.

Through Monte Carlo experiments, the consistency of the estimator is examined as the number of individuals N and the time length T grow, with the long-memory remainder disturbances contaminated by two types of outliers: additive outliers and innovation outliers. The power tests show that the estimators are quite successful and powerful.

In the empirical study, we apply the model to Taiwan's computer motherboard industry, using weekly data on nine well-known companies from 1 January 2000 to 31 October 2006. The proposed model has a smaller mean squared error and shows more distinctive aggressive properties than the raw data model.
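To illustrate the Whittle approach at the heart of the procedure, the sketch below evaluates the Whittle negative log-likelihood of a plain ARFIMA(0, d, 0) series from its periodogram; the panel structure, fixed effects and outlier-detection steps of the article are not reproduced.

```python
import numpy as np

def whittle_negloglik(x, d, sigma2=1.0):
    """Whittle negative log-likelihood of an ARFIMA(0, d, 0) series: compare
    the periodogram with the spectral density
    f(l) = sigma2 / (2*pi) * (2*sin(l/2))**(-2*d) at the Fourier frequencies."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    j = np.arange(1, (T - 1) // 2 + 1)
    lam = 2.0 * np.pi * j / T
    # periodogram at the Fourier frequencies
    I = np.abs(np.fft.fft(x)[j]) ** 2 / (2.0 * np.pi * T)
    f = sigma2 / (2.0 * np.pi) * (2.0 * np.sin(lam / 2.0)) ** (-2.0 * d)
    return np.sum(np.log(f) + I / f)

# for white noise, d close to 0 should typically give the smallest value
x = np.random.default_rng(0).normal(size=512)
for d in (-0.2, 0.0, 0.2):
    print(d, round(whittle_negloglik(x, d), 2))
```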


16.
The concept of ranked set sampling (RSS) is applicable whenever ranking of a set of sampling units can be done easily by a judgement method or based on an auxiliary variable. In this paper, we consider a study variable Y correlated with the auxiliary variable X, which is used to rank the sampling units. Further, (X, Y) is assumed to have a Cambanis-type bivariate uniform (CTBU) distribution. We obtain an unbiased estimator of a scale parameter associated with the study variable Y based on different RSS schemes. We compare the efficiency of the proposed estimators numerically and present the trends in their efficiency under various RSS schemes with respect to the parameters through line and surface plots. Further, we develop a Matlab function to simulate data from the CTBU distribution and present the performance of the proposed estimators through a simulation study. The results are also applied to real-life data.
Keywords: ranked set sampling; concomitants of order statistics; Cambanis-type bivariate uniform distribution; best linear unbiased estimator. Subject classifications: 62D05, 62F07, 62G30.
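A short sketch of the RSS mechanics described above: for each rank i, a set of m units is drawn, ranked on the auxiliary X, and only the Y concomitant of the i-th ranked unit is measured. The (X, Y) generator below is an arbitrary correlated pair, not the Cambanis-type bivariate uniform distribution of the paper.

```python
import numpy as np

def ranked_set_sample(draw_pair, m=4, cycles=10, seed=0):
    """RSS over several cycles: for rank i, draw a set of m (X, Y) pairs,
    rank the set on X, and keep the Y attached to the i-th ranked X
    (judgement ranking is replaced here by exact ranking on X)."""
    rng = np.random.default_rng(seed)
    ys = []
    for _ in range(cycles):
        for i in range(m):
            pairs = np.array([draw_pair(rng) for _ in range(m)])
            order = np.argsort(pairs[:, 0])       # rank the set by X
            ys.append(pairs[order[i], 1])         # keep the i-th ranked unit's Y
    return np.array(ys)

# illustrative correlated (X, Y) pair generator
def draw_pair(rng):
    x = rng.uniform()
    y = 0.7 * x + 0.3 * rng.uniform()
    return x, y

sample = ranked_set_sample(draw_pair, m=4, cycles=50)
print(sample.mean(), sample.var())
```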

17.
A new modeling approach called ‘recursive segmentation’ is proposed to support the supervised exploration and identification of subgroups or clusters. It is based on the frameworks of recursive partitioning and the Patient Rule Induction Method (PRIM). By combining these methods, recursive segmentation aims to exploit their respective strengths while reducing their weaknesses. Consequently, recursive segmentation can be applied very generally, that is, in any (multivariate) regression, classification or survival (time-to-event) problem, using conditional inference, evolutionary learning or the CART algorithm, with predictor variables of any scale and with missing values. Furthermore, results of a synthetic example and a benchmark application study comprising 26 data sets suggest that recursive segmentation achieves competitive prediction accuracy and provides more accurate definitions of subgroups with models of lower complexity than recursive partitioning and PRIM. An application to the German Breast Cancer Study Group data demonstrates the improved interpretability and reliability of the results produced by the new approach. The method is made publicly available through the R package rseg (http://rseg.r-forge.r-project.org/).

18.
In this study an attempt is made to assess statistically the validity of two theories as to the origin of comets. This subject still leads to great controversy amongst astronomers but recently two main schools of thought have developed.

These are that comets are of

(i) planetary origin,

(ii) interstellar origin.

Many theories have been expanded within each school of thought, but at present one theory in each is generally accepted. This paper sets out to identify the statistical implications of each theory and to evaluate each theory in terms of those implications.


19.
Structural breaks in the level as well as in the volatility are often exhibited by economic time series. In this paper, we propose new unit root tests for a time series that has multiple shifts in its level and the corresponding volatility. The proposed tests are Lagrange multiplier type tests based on the residuals' marginal likelihood, which is free from the nuisance mean parameters. The limiting null distributions of the proposed tests are χ² distributions and are affected not by the size and the location of the breaks but only by the number of breaks.

We specify the structural breaks under both the null and the alternative hypotheses to avoid possible vagueness in interpreting test results in empirical work. The null hypothesis implies a unit root process with level shifts, and the alternative implies a stationary process with level shifts. The Monte Carlo simulation shows that our tests are locally more powerful than the OLSE-based tests, and that the power of our tests, in a fixed time span, remains stable regardless of the number of breaks. In our application, we employ the data analyzed by Perron (1990), and some of our results differ from Perron's.


20.
Latent class analysis (LCA) has been found to have important applications in social and behavioral sciences for modeling categorical response variables, and nonresponse is typical when collecting data. In this study, the nonresponse mainly included “contingency questions” and real “missing data.” The primary objective of this research was to evaluate the effects of some potential factors on model selection indices in LCA with nonresponse data.

We simulated missing data with contingency questions and evaluated the accuracy rates of eight information criteria for selecting the correct models. The results showed that the main factors are the latent class proportions, the conditional probabilities, the sample size, the number of items, the missing data rate, and the contingency data rate. Interactions of the conditional probabilities with the class proportions, the sample size, and the number of items are also significant. Our simulation results indicate that the impact of missing data and contingency questions can be mitigated by increasing the sample size or the number of items.

