1.

Partially Accelerated Life Tests for the Weibull Distribution Under Multiply Censored Data

F. K. Wang Y. F. Cheng W. L. Lu 《Communications in Statistics - Simulation and Computation》2013,42(9):1667-1678

This article aims to estimate the parameters of the Weibull distribution in step-stress partially accelerated life tests under multiply censored data. In a step-stress partially accelerated life test, all test units are first run simultaneously under normal conditions for a pre-specified time, and the surviving units are then run under accelerated conditions until a predetermined censoring time. Maximum likelihood estimation is used to obtain the parameters of the Weibull distribution and the acceleration factor under multiply censored data. Additionally, confidence intervals for the estimators are obtained. Simulation results show that the maximum likelihood estimates perform well in most cases in terms of mean bias, root mean square error, and coverage rate. An example is used to illustrate the performance of the proposed approach.

2.

Clustering Gene Expression Data using a Posterior Split‐Merge‐Birth Procedure

ERLANDSON F. SARAIVA LUIS A. MILAN 《Scandinavian Journal of Statistics》2012,39(3):399-415

Abstract. DNA array technology is an important tool for genomic research due to its capacity for measuring simultaneously the expression levels of a great number of genes or fragments of genes in different experimental conditions. An important point in gene expression data analysis is to identify clusters of genes which present similar expression levels. We propose a new procedure for estimating the mixture model for clustering of gene expression data. The proposed method is a posterior split‐merge‐birth MCMC procedure which does not require the specification of the number of components, since it is estimated jointly with component parameters. The strategy for splitting is based on data and on the posterior distribution from the previously allocated observations. This procedure defines a quick split proposal, in contrast to other split procedures, which require substantial computational effort. The performance of the method is verified using real and simulated datasets.

3.

Estimation in the multipath change point problem for correlated data

Lawrence Joseph Alain C. Vandal David B. Wolfson 《Revue canadienne de statistique》1996,24(1):37-53

In many experiments, several measurements on the same variable are taken over time, a geographic region, or some other index set. It is often of interest to know if there has been a change over the index set in the parameters of the distribution of the variable. Frequently, the data consist of a sequence of correlated random variables, and there may also be several experimental units under observation, each providing a sequence of data. A problem in ascertaining the boundaries between the layers in geological sedimentary beds is used to introduce the model and then to illustrate the proposed methodology. It is assumed that, conditional on the change point, the data from each sequence arise from an autoregressive process that undergoes a change in one or more of its parameters. Unconditionally, the model then becomes a mixture of nonstationary autoregressive processes. Maximum-likelihood methods are used, and results of simulations to evaluate the performance of these estimators under practical conditions are given.

4.

Near-optimal designs for dual channel microarray studies (cited by: 2; self-citations: 0; citations by others: 2)

Ernst Wit Agostino Nobile Raya Khanin 《Journal of the Royal Statistical Society. Series C, Applied statistics》2005,54(5):817-830

Summary. Much biological and medical research employs microarray studies to monitor gene expression levels across a wide range of organisms and under many experimental conditions. Dual channel microarrays are a common platform and allow two samples to be measured simultaneously. A frequently used design uses a common reference sample to make conditions across different arrays comparable. Our aim is to formulate microarray experiments in the experimental design context and to use simulated annealing to search for near-optimal designs. We identify a subclass of designs, the so-called interwoven loop designs, that seems to have good optimality properties compared with the near-optimal designs that are found by simulated annealing. Commonly used reference designs and dye swap designs are shown to be highly inefficient.

5.

Small area estimation under a multivariate linear model for repeated measures data

Innocent Ngaruye Joseph Nzabanita Dietrich von Rosen Martin Singull 《Communications in Statistics - Theory and Methods》2017,46(21):10835-10850

In this article, small area estimation under a multivariate linear model for repeated measures data is considered. The proposed model borrows strength both across small areas and over time. The model accounts for repeated surveys, grouped response units, and random effects variations. Estimation of model parameters is discussed within a likelihood-based approach. Predictions of random effects, small area means across time points, and per group units are derived. A parametric bootstrap method is proposed for estimating the mean squared error of the predicted small area means. Results are supported by a simulation study.

6.

Modeling Three-Dimensional Chromosome Structures Using Gene Expression Data

Xiao G Wang X Khodursky AB 《Journal of the American Statistical Association》2011,106(493):61-72

7.

Discrete Duration Models Combining Dynamic and Random Effects

Biller C 《Lifetime data analysis》2000,6(4):375-390

Survival data may include two different sources of variation, namely variation over time and variation over units. If both of these variations are present, neglecting one of them can cause serious bias in the estimations. Here we present an approach for discrete duration data that includes both time-varying and unit-specific effects to model these two variations simultaneously. The approach is a combination of a dynamic survival model with dynamic time-varying baseline and covariate effects and a frailty model measuring unobserved heterogeneity with random effects varying independently over units. Estimation is based on posterior modes, i.e., we maximize the joint posterior distribution of the unknown parameters to avoid numerical integration and simulation techniques that are necessary in a full Bayesian analysis. Estimation of unknown hyperparameters is achieved by an EM-type algorithm. Finally, the proposed method is applied to data of the Veteran's Administration Lung Cancer Trial.

8.

Some state space models of HIV epidemic and its applications for the estimation of HIV infection and incubation

Wai-Yuan Tan Zhengzheng Ye 《Communications in Statistics - Theory and Methods》2013,42(5-6):1059-1088

In this paper we have developed some state space models for the HIV epidemic for populations at risk for AIDS. By using these state space models, we have developed a general Bayesian procedure for estimating simultaneously the unknown parameters and the state variables. The unknown parameters include the immigration and recruitment rates, the death and retirement rates, the incidence of HIV infection (and hence the HIV infection distribution) and the incidence of HIV incubation (and hence the HIV incubation distribution). The state variables are the numbers of susceptible people (S people), HIV-infected people (I people) and AIDS incidence over time. The basic approach is through a multi-level Gibbs sampler combined with the weighted bootstrap method. We have applied the methods to the Swiss AIDS homosexual and IV drug data to estimate simultaneously the unknown parameters and the state variables. Our results show that in both populations, both the HIV infection and HIV incubation distributions have multiple peaks, indicating the mixture nature of these distributions. Our results have also shown that the estimates of the death and retirement rates for I people are greater than those of S people, suggesting that the infection by HIV may have increased the death and retirement rates of the individuals.

9.

Hidden Markov mesh random field models in image analysis

Pierre A. Devijver 《Journal of applied statistics》1993,20(5-6):187-227

This paper addresses the image modeling problem under the assumption that images can be represented by third-order, hidden Markov mesh random field models. The range of applications of the techniques described hereafter comprises the restoration of binary images, the modeling and compression of image data, as well as the segmentation of gray-level or multi-spectral images, and image sequences under the short-range motion hypothesis. We outline coherent approaches to both the problems of image modeling (pixel labeling) and estimation of model parameters (learning). We derive a real-time labeling algorithm, based on a maximum marginal a posteriori probability criterion, for a hidden third-order Markov mesh random field model. Our algorithm achieves minimum time and space complexities simultaneously, and we describe what we believe to be the most appropriate data structures to implement it. Critical aspects of the computer simulation of a real-time implementation are discussed, down to the computer code level. We develop an (unsupervised) learning technique by which the model parameters can be estimated without ground truth information. We lay bare the conditions under which our approach can be made time-adaptive in order to be able to cope with short-range motion in dynamic image sequences. We present extensive experimental results for both static and dynamic images from a wide variety of sources. They comprise standard, infra-red and aerial images, as well as a sequence of ultrasound images of a fetus and a series of frames from a motion picture sequence. These experiments demonstrate that the method is subjectively relevant to the problems of image restoration, segmentation and modeling.

10.

A continuous latent spatial model for crack initiation in bone cement

Elizabeth A. Heron Cathal D. Walsh 《Journal of the Royal Statistical Society. Series C, Applied statistics》2008,57(1):25-42

Summary. Hip replacements provide a means of achieving a higher quality of life for individuals who have, through aging or injury, accumulated damage to their natural joints. This is a very common operation, with over a million people a year benefiting from the procedure. The replacements themselves fail mainly as a result of the mechanical loosening of the components of the artificial joint due to damage accumulation. This damage accumulation consists of the initiation and growth of cracks in the bone cement which is used to fixate the replacement in the human body. The data come from laboratory experiments that are designed to assess the effectiveness of the bone cement in resisting damage. We examine the properties of the bone cement, with the aim being to estimate the effect that both observable and unobservable spatially varying factors have on causing crack initiation. To do this, an explicit model for the damage process is constructed taking into account the tension and compression at different locations in the specimens. A gamma random field is used to model any latent spatial factors that may be influential in crack initiation. Bayesian inference is carried out for the parameters of this field and related covariates by using Markov chain Monte Carlo techniques.

11.

Incorporating gene functional annotations in detecting differential gene expression

Wei Pan 《Journal of the Royal Statistical Society. Series C, Applied statistics》2006,55(3):301-316

Summary. The importance of incorporating existing biological knowledge, such as gene functional annotations in gene ontology, in analysing high throughput genomic and proteomic data is being increasingly recognized. In the context of detecting differential gene expression, however, the current practice of using gene annotations is limited primarily to validations. Here we take a direct approach to incorporating gene annotations into mixture models for analysis. First, in contrast with a standard mixture model assuming that each gene of the genome has the same distribution, we study stratified mixture models allowing genes with different annotations to have different distributions, such as prior probabilities. Second, rather than treating parameters in stratified mixture models independently, we propose a hierarchical model to take advantage of the hierarchical structure of most gene annotation systems, such as gene ontology. We consider a simplified implementation for the proof of concept. An application to a mouse microarray data set and a simulation study demonstrate the improvement of the two new approaches over the standard mixture model.

12.

Covariates and Random Effects in a Gamma Process Model with Application to Degradation and Failure (cited by: 7; self-citations: 0; citations by others: 7)

Lawless J Crowder M 《Lifetime data analysis》2004,10(3):213-227

The gamma process is a natural model for degradation processes in which deterioration is supposed to take place gradually over time in a sequence of tiny increments. When units or individuals are observed over time it is often apparent that they degrade at different rates, even though no differences in treatment or environment are present. Thus, in applying gamma-process models to such data, it is necessary to allow for such unexplained differences. In the present paper this is accomplished by constructing a tractable gamma-process model incorporating a random effect. The model is fitted to some data on crack growth and corresponding goodness-of-fit tests are carried out. Prediction calculations for failure times defined in terms of degradation level passages are developed and illustrated.
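
For orientation, the basic idea of a gamma process and of a failure time defined by a degradation level passage can be sketched in a few lines. This is a minimal illustration of a plain stationary gamma process only, not the authors' random-effects model; the function names, the parameters `shape_rate` and `scale`, and the fixed time step `dt` are all assumptions made for the example.

```python
import random


def simulate_gamma_process(shape_rate, scale, dt, n_steps, seed=None):
    """Simulate one degradation path of a stationary gamma process.

    Over each step of length dt the increment is an independent
    Gamma(shape=shape_rate * dt, scale=scale) draw, so the path is
    non-decreasing. All parameter names are illustrative assumptions.
    """
    rng = random.Random(seed)
    level = 0.0
    path = [level]
    for _ in range(n_steps):
        level += rng.gammavariate(shape_rate * dt, scale)
        path.append(level)
    return path


def first_passage_time(path, dt, threshold):
    """Return the first grid time at which the path reaches the threshold,
    or None if the threshold is never reached on the simulated horizon."""
    for step, level in enumerate(path):
        if level >= threshold:
            return step * dt
    return None


if __name__ == "__main__":
    path = simulate_gamma_process(shape_rate=2.0, scale=0.5, dt=0.1,
                                  n_steps=200, seed=1)
    print(first_passage_time(path, dt=0.1, threshold=5.0))
```

Unit-to-unit heterogeneity of the kind the abstract describes could be mimicked by drawing a different `scale` for each unit before simulating, but that choice is again an assumption of this sketch rather than something the abstract specifies.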

13.

Exact inference for a simple step-stress model with competing risks for failure from exponential distribution under Type-II censoring

N. Balakrishnan Donghoon Han 《Journal of statistical planning and inference》2008

In reliability analysis, accelerated life-testing allows for gradual increment of stress levels on test units during an experiment. In a special class of accelerated life tests known as step-stress tests, the stress levels increase discretely at pre-fixed time points, and this allows the experimenter to obtain information on the parameters of the lifetime distributions more quickly than under normal operating conditions. Moreover, when a test unit fails, there is often more than one fatal cause for the failure, such as mechanical or electrical. In this article, we consider the simple step-stress model under Type-II censoring when the lifetime distributions of the different risk factors are independently exponentially distributed. Under this setup, we derive the maximum likelihood estimators (MLEs) of the unknown mean parameters of the different causes under the assumption of a cumulative exposure model. The exact distributions of the MLEs of the parameters are then derived through the use of conditional moment generating functions. Using these exact distributions as well as the asymptotic distributions and the parametric bootstrap method, we discuss the construction of confidence intervals for the parameters and assess their performance through Monte Carlo simulations. Finally, we illustrate the methods of inference discussed here with an example.

14.

A SPARSE CONDITIONAL GAUSSIAN GRAPHICAL MODEL FOR ANALYSIS OF GENETICAL GENOMICS DATA

J Yin H Li 《The annals of applied statistics》2011,5(4):2630-2650

15.

Bayesian curve fitting and clustering with Dirichlet process mixture models for microarray data

Ju-Hyun Park Minjung Kyung 《Journal of the Korean Statistical Society》2019,48(2):207-220

In the field of molecular biology, it is often of interest to analyze microarray data for clustering genes based on similar profiles of gene expression to identify genes that are differentially expressed under multiple biological conditions. One of the notable characteristics of a gene expression profile is that it shows a cyclic curve over a course of time. To group sequences of similar molecular functions, we propose a Bayesian Dirichlet process mixture of linear regression models with a Fourier series for the regression coefficients, for each of which a spike and slab prior is assumed. A full Gibbs-sampling algorithm is developed for an efficient Markov chain Monte Carlo (MCMC) posterior computation. Due to the so-called “label-switching” problem and different numbers of clusters during the MCMC computation, a post-process approach of Fritsch and Ickstadt (2009) is additionally applied to MCMC samples for an optimal single clustering estimate by maximizing the posterior expected adjusted Rand index with the posterior probabilities of two observations being clustered together. The proposed method is illustrated with two simulated data sets and one real data set on the physiological response of fibroblasts to serum from Iyer et al. (1999).

16.

False Discovery Rate Control With Groups

Hu JX Zhao H Zhou HH 《Journal of the American Statistical Association》2010,105(491):1215-1227

In the context of large-scale multiple hypothesis testing, the hypotheses often possess certain group structures based on additional information such as Gene Ontology in gene expression data and phenotypes in genome-wide association studies. It is hence desirable to incorporate such information when dealing with multiplicity problems to increase statistical power. In this article, we demonstrate the benefit of considering group structure by presenting a p-value weighting procedure which utilizes the relative importance of each group while controlling the false discovery rate under weak conditions. The procedure is easy to implement and shown to be more powerful than the classical Benjamini-Hochberg procedure in both theoretical and simulation studies. By estimating the proportion of true null hypotheses, the data-driven procedure controls the false discovery rate asymptotically. Our analysis on one breast cancer dataset confirms that the procedure performs favorably compared with the classical method.
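
As a point of reference for the baseline named in this abstract, the classical Benjamini-Hochberg step-up rule can be sketched as follows. This is the textbook unweighted procedure only; it is not the authors' group-weighted method, whose weights are not given in the abstract. The function name `benjamini_hochberg` and the default level `q=0.05` are assumptions for the example.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Classical Benjamini-Hochberg step-up procedure (unweighted).

    Returns a list of booleans, one per input p-value, marking which
    hypotheses are rejected at false discovery rate level q.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k with p_(k) <= k * q / m.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k = rank

    # Reject the k hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5], q=0.05))
```

A weighted variant would typically rescale each p-value by a group-specific weight before applying the same step-up rule; how those weights are chosen is exactly what the article addresses, so it is left out here.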

17.

Semi‐parametric small‐area estimation by combining time‐series and cross‐sectional data methods

Farhad Shokoohi Mahmoud Torabi 《Australian & New Zealand Journal of Statistics》2018,60(3):323-342

In survey sampling, policymaking regarding the allocation of resources to subgroups (called small areas) or the determination of subgroups with specific properties in a population should be based on reliable estimates. Information, however, is often collected at a different scale than that of these subgroups; hence, the estimation can only be obtained on finer scale data. Parametric mixed models are commonly used in small‐area estimation. The relationship between predictors and response, however, may not be linear in some real situations. Recently, small‐area estimation using a generalised linear mixed model (GLMM) with a penalised spline (P‐spline) regression model, for the fixed part of the model, has been proposed to analyse cross‐sectional responses, both normal and non‐normal. However, there are many situations in which the responses in small areas are serially dependent over time. Such a situation is exemplified by a data set on the annual number of visits to physicians by patients seeking treatment for asthma, in different areas of Manitoba, Canada. In cases where covariates that can possibly predict physician visits by asthma patients (e.g. age and genetic and environmental factors) may not have a linear relationship with the response, new models for analysing such data sets are required. In the current work, using both time‐series and cross‐sectional data methods, we propose P‐spline regression models for small‐area estimation under GLMMs. Our proposed model covers both normal and non‐normal responses. In particular, the empirical best predictors of small‐area parameters and their corresponding prediction intervals are studied with the maximum likelihood estimation approach being used to estimate the model parameters. The performance of the proposed approach is evaluated using some simulations and also by analysing two real data sets (precipitation and asthma).

18.

Investigation of mixed model repeated measures analyses and non‐linear random coefficient models in the context of long‐term efficacy data

Bruno Delafont Kevin Carroll Claire Vilain Emmanuel Pham 《Pharmaceutical statistics》2018,17(5):515-526

The longitudinal data from 2 published clinical trials in adult subjects with upper limb spasticity (a randomized placebo‐controlled study [NCT01313299] and its long‐term open‐label extension [NCT01313312]) were combined. Their study designs involved repeat intramuscular injections of abobotulinumtoxinA (Dysport®), and efficacy endpoints were collected accordingly. With the objective of characterizing the pattern of response across cycles, Mixed Model Repeated Measures analyses and Non‐Linear Random Coefficient (NLRC) analyses were performed and their results compared. The Mixed Model Repeated Measures analyses, commonly used in the context of repeated measures with missing dependent data, did not involve any parametric shape for the curve of changes over time. Based on clinical expectations, the NLRC included a negative exponential function of the number of treatment cycles, with its asymptote and rate included as random coefficients in the model. Our analysis focused on 2 specific efficacy parameters reflecting complementary aspects of efficacy in the study population. A simulation study based on a similar study design was also performed to further assess the performance of each method under different patterns of response over time. This highlighted a gain of precision with the NLRC model, and most importantly the need for its assumptions to be verified to avoid potentially biased estimates. These analyses describe a typical situation and the conditions under which non‐linear mixed modeling can provide additional insights on the behavior of efficacy parameters over time. Indeed, the resulting estimates from the negative exponential NLRC can help determine the expected maximal effect and the treatment duration required to reach it.

19.

A Simulation Comparison of Approximate Tests for Fixed Effects in Random Coefficients Growth Curve Models

Julia Volaufova Lynn Roy Lamotte 《Communications in Statistics - Simulation and Computation》2013,42(2):344-359

Often, the response variables on sampling units are observed repeatedly over time. The sampling units may come from different populations, such as treatment groups. This setting is routinely modeled by a random coefficients growth curve model, and the techniques of general linear mixed models are applied to address the primary research aim. An alternative approach is to reduce each subject’s data to summary measures, such as within-subject averages or regression coefficients. One may then test for equality of means of the summary measures (or functions of them) among treatment groups. Here, we compare by simulation the performance characteristics of three approximate tests based on summary measures and one based on the full data, focusing mainly on accuracy of p-values. We find that performances of these procedures can be quite different for small samples in several different configurations of parameter values. The summary-measures approach performed at least as well as the full-data mixed models approach.

20.

Investigate Data Dependency for Dynamic Gene Regulatory Network Identification through High-dimensional Differential Equation Approach

Tao Lu Min Wang 《Communications in Statistics - Simulation and Computation》2016,45(7):2377-2391

Gene regulation plays a fundamental role in biological activities. The gene regulation network (GRN) is a high-dimensional complex system, which can be represented by various mathematical or statistical models. The ordinary differential equation (ODE) model is one of the popular dynamic GRN models. We proposed a comprehensive statistical procedure for the ODE model to identify the dynamic GRN. In this article, we applied this model to different segments of time course gene expression data from a simulation experiment and a yeast cell cycle study. We found that the two cell cycle and one cell cycle data provided consistent results, but half cell cycle data produced biased estimation. Therefore, we may conclude that the proposed model can quantify both two cell cycle and one cell cycle gene expression dynamics, but not half cycle dynamics. The findings suggest that the model can identify the dynamic GRN correctly if the time course gene expression data are sufficient to capture the overall dynamics of the underlying biological mechanism.