Similar Articles (20 results)
1.
We consider a two-component mixture model in which one component distribution is known while the mixing proportion and the other component distribution are unknown. Models of this kind were first introduced in biology to study differences in expression between genes. The estimation methods proposed so far have all assumed that the unknown distribution belongs to a parametric family. In this paper, we show how this assumption can be relaxed. First, we note that the model is generally not identifiable, but we show that under moment and symmetry conditions some 'almost everywhere' identifiability results can be obtained. Where such identifiability conditions are fulfilled, we propose an estimation method for the unknown parameters which is shown to be strongly consistent under mild conditions. We discuss applications of our method to microarray data analysis and to the training data problem. We compare our method to the parametric approach using simulated data and, finally, we apply our method to real data from microarray experiments.

2.
The dynamic properties and independence structure of stochastic kinetic models (SKMs) are analyzed. An SKM is a highly multivariate jump process used to model chemical reaction networks, particularly those in biochemical and cellular systems. We identify SKM subprocesses with the corresponding counting processes and propose a directed, cyclic graph (the kinetic independence graph or KIG) that encodes the local independence structure of their conditional intensities. Given a partition [A, D, B] of the vertices, the graphical separation A ⊥ B|D in the undirected KIG has an intuitive chemical interpretation and implies that A is locally independent of B given A ∪ D. It is proved that this separation also results in global independence of the internal histories of A and B conditional on a history of the jumps in D which, under conditions we derive, corresponds to the internal history of D. The results enable mathematical definition of a modularization of an SKM using its implied dynamics. Graphical decomposition methods are developed for the identification and efficient computation of nested modularizations. Application to an SKM of the red blood cell advances understanding of this biochemical system.
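As a concrete, deliberately simple illustration of the kind of jump process an SKM describes, the sketch below simulates a one-species birth-death reaction network with Gillespie's exact stochastic simulation algorithm. The network and rate constants are hypothetical; the KIG and modularization machinery of the abstract are not shown.

```python
import random

def gillespie_birth_death(k_birth, k_death, x0, t_max, rng):
    """Exact simulation (Gillespie SSA) of the birth-death SKM
    0 -> A at rate k_birth, A -> 0 at rate k_death * x."""
    t, x = 0.0, x0
    times, states = [t], [x]
    while t < t_max:
        a1 = k_birth            # propensity of the birth reaction
        a2 = k_death * x        # propensity of the death reaction
        a0 = a1 + a2
        t += rng.expovariate(a0)  # exponential waiting time to next jump
        if t >= t_max:
            break
        # pick which reaction fires, proportionally to its propensity
        x += 1 if rng.random() < a1 / a0 else -1
        times.append(t)
        states.append(x)
    return times, states

rng = random.Random(1)
times, states = gillespie_birth_death(10.0, 0.5, 0, 50.0, rng)
```

The copy number fluctuates around the stationary mean k_birth / k_death once the process has burned in.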

3.
We propose a method to obtain several streams of pseudorandom numbers based on a backbone generator of the generalized shift register type. The method is based on inverting one cycle in a de Bruijn digraph into many sequences in a higher-order de Bruijn graph via an appropriate graph homomorphism. We apply this technique to twisted generalized feedback shift register generators and to the Mersenne Twister MT19937. Positive results of statistical testing are reported.
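To make the de Bruijn structure underlying this construction concrete, the following toy sketch generates a de Bruijn sequence B(k, n) by the standard Lyndon-word concatenation algorithm and verifies that every length-n word appears exactly once as a cyclic window. This illustrates the single-cycle object being "inverted", not the authors' homomorphism method itself.

```python
def de_bruijn(k, n):
    """de Bruijn sequence B(k, n) via Lyndon-word concatenation:
    every length-n word over {0, ..., k-1} occurs exactly once
    as a cyclic window of the result."""
    a = [0] * (k * n)
    seq = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return seq

seq = de_bruijn(2, 4)  # one full cycle of length 2**4 = 16
# collect all cyclic windows of width 4
windows = {tuple((seq + seq)[i:i + 4]) for i in range(len(seq))}
```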

4.
We propose a semiparametric estimator for single-index models with censored responses due to detection limits. In the presence of left censoring, the mean function cannot be identified without parametric distributional assumptions, but the quantile function is still identifiable at upper quantile levels. To avoid parametric distributional assumptions, we propose to fit censored quantile regression and combine information across quantile levels to estimate the unknown smooth link function and the index parameter. Under some regularity conditions, we show that the estimated link function achieves the nonparametric optimal convergence rate and that the estimated index parameter is asymptotically normal. A simulation study shows that the proposed estimator is competitive with the omniscient least squares estimator based on the latent uncensored responses for data with normal errors, but much more efficient for heavy-tailed data under light and moderate censoring. The practical value of the proposed method is demonstrated through the analysis of a human immunodeficiency virus antibody data set.

5.
Finding the maximum a posteriori (MAP) assignment of a discrete-state distribution specified by a graphical model requires solving an integer program. The max-product algorithm, also known as the max-plus or min-sum algorithm, is an iterative method for (approximately) solving such a problem on graphs with cycles. We provide a novel perspective on the algorithm, which is based on the idea of reparameterizing the distribution in terms of so-called pseudo-max-marginals on nodes and edges of the graph. This viewpoint provides conceptual insight into the max-product algorithm in application to graphs with cycles. First, we prove the existence of max-product fixed points for positive distributions on arbitrary graphs. Next, we show that the approximate max-marginals computed by max-product are guaranteed to be consistent, in a suitable sense to be defined, over every tree of the graph. We then turn to characterizing the nature of the approximation to the MAP assignment computed by max-product. We generalize previous work by showing that for any graph, the max-product assignment satisfies a particular optimality condition with respect to any subgraph containing at most one cycle per connected component. We use this optimality condition to derive upper bounds on the difference between the log probability of the true MAP assignment, and the log probability of a max-product assignment. Finally, we consider extensions of the max-product algorithm that operate over higher-order cliques, and show how our reparameterization analysis extends in a natural manner.
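On a tree (here, a three-node chain), max-product is exact, which makes it easy to check against brute-force enumeration. The following minimal sketch, with hypothetical node potentials `psi` and edge potentials `phi`, computes the MAP assignment both ways; it illustrates the message-passing recursion, not the cyclic-graph analysis of the abstract.

```python
import itertools

# chain MRF with 3 nodes, 2 states each; psi[i] are node potentials,
# phi[(i, j)] are edge potentials (hypothetical numbers)
psi = [[1.0, 2.5], [1.5, 0.5], [1.0, 1.0]]
phi = {(0, 1): [[2.0, 1.0], [1.0, 2.0]],
       (1, 2): [[1.0, 3.0], [2.0, 1.0]]}

def brute_force_map():
    """Enumerate all joint states and keep the most probable one."""
    best, best_p = None, -1.0
    for x in itertools.product(range(2), repeat=3):
        p = psi[0][x[0]] * psi[1][x[1]] * psi[2][x[2]]
        p *= phi[(0, 1)][x[0]][x[1]] * phi[(1, 2)][x[1]][x[2]]
        if p > best_p:
            best, best_p = x, p
    return best

def max_product_map():
    """Max-product on the chain: backward messages, then forward decode."""
    # m2[a]: best contribution of node 2 given node 1 is in state a
    m2 = [max(phi[(1, 2)][a][b] * psi[2][b] for b in range(2)) for a in range(2)]
    # m1[a]: best contribution of nodes 1..2 given node 0 is in state a
    m1 = [max(phi[(0, 1)][a][b] * psi[1][b] * m2[b] for b in range(2)) for a in range(2)]
    x0 = max(range(2), key=lambda a: psi[0][a] * m1[a])
    x1 = max(range(2), key=lambda b: phi[(0, 1)][x0][b] * psi[1][b] * m2[b])
    x2 = max(range(2), key=lambda c: phi[(1, 2)][x1][c] * psi[2][c])
    return (x0, x1, x2)
```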

6.
We consider parametric regression problems with some covariates missing at random. It is shown that the regression parameter remains identifiable under natural conditions. When the always observed covariates are discrete, we propose a semiparametric maximum likelihood method, which does not require parametric specification of the missing data mechanism or the covariate distribution. The global maximum likelihood estimator (MLE), which maximizes the likelihood over the whole parameter set, is shown to exist under simple conditions. For ease of computation, we also consider a restricted MLE which maximizes the likelihood over covariate distributions supported by the observed values. Under regularity conditions, the two MLEs are asymptotically equivalent and strongly consistent for a class of topologies on the parameter set.

7.
While conjugate Bayesian inference in decomposable Gaussian graphical models is largely solved, the non-decomposable case still poses difficulties concerned with the specification of suitable priors and the evaluation of normalizing constants. In this paper we derive the DY-conjugate prior (Diaconis & Ylvisaker, 1979) for non-decomposable models and show that it can be regarded as a generalization to an arbitrary graph G of the hyper inverse Wishart distribution (Dawid & Lauritzen, 1993). In particular, if G is an incomplete prime graph it constitutes a non-trivial generalization of the inverse Wishart distribution. Inference based on marginal likelihood requires the evaluation of a normalizing constant and we propose an importance sampling algorithm for its computation. Examples of structural learning involving non-decomposable models are given. In order to deal efficiently with the set of all positive definite matrices with non-decomposable zero-pattern we introduce the operation of triangular completion of an incomplete triangular matrix. Such a device turns out to be extremely useful both in the proof of theoretical results and in the implementation of the Monte Carlo procedure.

8.
We propose a method for selecting edges in undirected Gaussian graphical models. Our algorithm builds on our previous work, an extension of Least Angle Regression (LARS), and is based on the information geometry of dually flat spaces. Non-diagonal elements of the inverse of the covariance matrix, the concentration matrix, play an important role in edge selection. Our iterative method estimates these elements and selects covariance models simultaneously. A sequence of pairs of estimates of the concentration matrix and an independence graph is generated, whose length is the same as the number of non-diagonal elements of the matrix. In our algorithm, the next estimate of the graph is the graph nearest to the latest estimate of the concentration matrix. The next estimate of the concentration matrix is not simply the projection of the latest estimate; it is also shrunk towards the origin. We describe the algorithm and show results for some datasets. Furthermore, we give some remarks on model identification and prediction.
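The role the off-diagonal concentration entries play in edge selection can be seen in a much simpler stand-in for the iterative LARS-type method: estimate the concentration matrix by inverting the sample covariance, convert its off-diagonal entries to partial correlations, and keep edges whose magnitude exceeds a threshold. The data, graph, and threshold below are illustrative assumptions.

```python
import numpy as np

def select_edges(X, threshold=0.1):
    """Naive edge selection for a Gaussian graphical model via
    thresholded partial correlations (a stand-in sketch, not the
    information-geometric algorithm of the text)."""
    S = np.cov(X, rowvar=False)
    K = np.linalg.inv(S)            # estimated concentration matrix
    d = np.sqrt(np.diag(K))
    partial = -K / np.outer(d, d)   # partial correlations off the diagonal
    p = S.shape[0]
    return {(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(partial[i, j]) > threshold}

rng = np.random.default_rng(0)
# true graph: chain 0 - 1 - 2, with variable 3 independent of the rest
K_true = np.array([[1.0, 0.5, 0.0, 0.0],
                   [0.5, 1.5, 0.5, 0.0],
                   [0.0, 0.5, 1.0, 0.0],
                   [0.0, 0.0, 0.0, 1.0]])
X = rng.multivariate_normal(np.zeros(4), np.linalg.inv(K_true), size=2000)
edges = select_edges(X, threshold=0.15)
```

With 2000 observations the sampling noise in the null partial correlations is far below the threshold, so the recovered edge set matches the chain.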

9.
In this paper, we investigate the problem of determining block designs which are optimal under type 1 optimality criteria within various classes of designs having v treatments arranged in b blocks of size k. The solutions to two optimization problems are given which are related to a general result obtained by Cheng (1978) and which are useful in this investigation. As one application of the solutions obtained, the definition of a regular graph design given in Mitchell and John (1977) is extended to that of a semi-regular graph design, and some sufficient conditions are derived for the existence of a semi-regular graph design which is optimal under a given type 1 criterion. A result is also given which shows how the sufficient conditions derived can be used to establish the optimality under a specific type 1 criterion of some particular types of semi-regular graph designs having both equal and unequal numbers of replicates. Finally, some sufficient conditions are obtained for the dual of an A- or D-optimal design to be A- or D-optimal within an appropriate class of dual designs.

10.
We propose an extension of graphical log-linear models to allow for symmetry constraints on some interaction parameters that represent homologous factors. The conditional independence structure of such quasi-symmetric (QS) graphical models is described by an undirected graph with coloured edges, in which a particular colour corresponds to a set of equality constraints on a set of parameters. Unlike standard QS models, the proposed models apply to contingency tables for which only some variables or sets of the variables have the same categories. We study the graphical properties of such models, including conditions for decomposition of model parameters and of maximum likelihood estimates.

11.
Huang J, Ma S, Li H, Zhang CH. Annals of Statistics 2011, 39(4): 2021–2046.
We propose a new penalized method for variable selection and estimation that explicitly incorporates the correlation patterns among predictors. This method is based on a combination of the minimax concave penalty and the Laplacian quadratic associated with a graph as the penalty function. We call it the sparse Laplacian shrinkage (SLS) method. The SLS uses the minimax concave penalty for encouraging sparsity and the Laplacian quadratic penalty for promoting smoothness among coefficients associated with the correlated predictors. The SLS has a generalized grouping property with respect to the graph represented by the Laplacian quadratic. We show that the SLS possesses an oracle property in the sense that it is selection consistent and equal to the oracle Laplacian shrinkage estimator with high probability. This result holds in sparse, high-dimensional settings with p ≫ n under reasonable conditions. We derive a coordinate descent algorithm for computing the SLS estimates. Simulation studies are conducted to evaluate the performance of the SLS method and a real data example is used to illustrate its application.
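A minimal coordinate descent sketch for the SLS-type objective (1/2n)||y − Xb||² + Σⱼ MCP(bⱼ; λ₁, γ) + (λ₂/2) b'Lb is given below, assuming standardized columns with xⱼ'xⱼ/n = 1. The per-coordinate update is the usual MCP thresholding rule with a Laplacian-adjusted quadratic coefficient; the design, graph, and tuning values are hypothetical, and this is not the authors' implementation.

```python
import numpy as np

def sls_coordinate_descent(X, y, L, lam1, lam2, gamma=3.0, n_iter=200):
    """Coordinate descent for MCP + Laplacian quadratic penalized
    least squares (a sketch of the SLS objective). Assumes the
    columns of X are standardized so that x_j'x_j / n = 1."""
    n, p = X.shape
    b = np.zeros(p)
    r = y.copy()                            # residual y - Xb
    for _ in range(n_iter):
        for j in range(p):
            v = 1.0 + lam2 * L[j, j]        # quadratic coefficient for b_j
            # partial-residual correlation plus pull from graph neighbours
            z = X[:, j] @ r / n + b[j] - lam2 * (L[j] @ b - L[j, j] * b[j])
            if abs(z) <= gamma * lam1 * v:
                # MCP region: soft-threshold, then rescale
                bj = np.sign(z) * max(abs(z) - lam1, 0.0) / (v - 1.0 / gamma)
            else:
                bj = z / v                  # beyond gamma*lam1: no MCP shrinkage
            r += X[:, j] * (b[j] - bj)      # keep residual consistent
            b[j] = bj
    return b

rng = np.random.default_rng(2)
n, p = 200, 4
Q, _ = np.linalg.qr(rng.standard_normal((n, p)))
X = Q * np.sqrt(n)                          # orthonormal columns, x_j'x_j/n = 1
beta = np.array([2.0, 2.0, 0.0, 0.0])
y = X @ beta + 0.3 * rng.standard_normal(n)
L = np.array([[1.0, -1.0, 0.0, 0.0],        # Laplacian: predictors 0 and 1 linked
              [-1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 0.0]])
b = sls_coordinate_descent(X, y, L, lam1=0.4, lam2=0.1)
```

The null coefficients are set exactly to zero by the MCP step, while the two linked signal coefficients are shrunk towards each other by the Laplacian term.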

12.
This article deals with two problems concerning the probabilities of causation defined by Pearl (Causality: Models, Reasoning, and Inference, 2nd edn, 2009, Cambridge University Press, New York): namely, the probability that one observed event was a necessary (or sufficient, or both) cause of another. One problem is to derive new bounds, and the other is to provide covariate selection criteria. Tian & Pearl (Ann. Math. Artif. Intell., 28, 2000, 287–313) showed how to bound the probabilities of causation using information from experimental and observational studies, with minimal assumptions about the data-generating process, and gave identifiability conditions for these probabilities. In this article, we derive narrower bounds using covariate information that is available from those studies. In addition, we propose the conditional monotonicity assumption so as to further narrow the bounds. Moreover, we discuss the covariate selection problem from the viewpoint of estimation accuracy, and show that selecting a covariate that has a direct effect on an outcome variable cannot always improve the estimation accuracy, which is contrary to the situation in linear regression models. These results provide more accurate information for public policy, legal determination of responsibility and personal decision making.

13.
In this paper, we propose a graphical representation of data and a test statistic based on it for testing the goodness of fit of a completely specified null distribution. The graph is constructed as a linked line chart given by vectors which reflect the pattern of the order statistics. The test statistic is defined as the area determined by our chart, and its asymptotic distribution is derived under the null hypothesis. Computer simulations performed to study the power properties of the test indicate that it is powerful against scale alternatives. Furthermore, it is shown that our test is closely related to the Watson test.
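Since the abstract notes a close connection to the Watson test, it may help to recall how Watson's U² statistic is computed for a completely specified null: transform the data by the null CDF (probability integral transform) and apply the formula below. The check uses perfectly spaced values, for which U² reduces to 1/(12n).

```python
def watson_u2(u):
    """Watson's U^2 statistic for PIT values against the uniform null:
    U^2 = W^2 - n * (u_bar - 1/2)^2, where W^2 is the Cramer-von Mises
    statistic computed from the sorted values."""
    u = sorted(u)
    n = len(u)
    w2 = 1.0 / (12 * n) + sum((ui - (2 * i + 1) / (2 * n)) ** 2
                              for i, ui in enumerate(u))
    ubar = sum(u) / n
    return w2 - n * (ubar - 0.5) ** 2

n = 10
u_perfect = [(2 * i + 1) / (2 * n) for i in range(n)]  # ideal uniform spacings
stat = watson_u2(u_perfect)   # minimal possible value, 1/(12n)
```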

14.
15.
Identifiability is a primary assumption in virtually all classical statistical theory. However, this assumption may be violated in a variety of statistical models. We consider parametric models in which the assumption of identifiability is violated but which otherwise satisfy standard assumptions. We propose an analytic method for constructing new parameters under which the model will be at least locally identifiable. This method is based on solving a system of linear partial differential equations involving the Fisher information matrix. Some consequences, and valid inference procedures under non-identifiability, are discussed. The method of reparametrization is illustrated with an example.

16.
The application of certain Bayesian techniques, such as the Bayes factor and model averaging, requires the specification of prior distributions on the parameters of alternative models. We propose a new method for constructing compatible priors on the parameters of models nested in a given directed acyclic graph model, using a conditioning approach. We define a class of parameterizations that is consistent with the modular structure of the directed acyclic graph and derive a procedure, invariant within this class, which we name reference conditioning.

17.
In this article, we focus on a pseudo-coefficient of determination for generalized linear models with binary outcome. Although numerous coefficients of determination have been proposed in the literature, none of them is identified as the best in terms of estimation accuracy, or incorporates all desired characteristics of a precise coefficient of determination. Considering this, we propose a new coefficient of determination using a computational Monte Carlo approach, and exhibit the main characteristics of the proposed coefficient both analytically and numerically. We evaluate and compare the performance of the proposed and nine existing coefficients of determination in a comprehensive Monte Carlo simulation study. The proposed measure is found to be superior to the existing measures when the dependent variable is balanced or moderately unbalanced, for probit, logit, and complementary log-log link functions and a wide range of sample sizes. Due to the extensive design space of our simulation study, we also identify new conditions under which previously recommended coefficients of determination should be used with care.

18.
When functional data are not homogeneous, for example, when there are multiple classes of functional curves in the dataset, traditional estimation methods may fail. In this article, we propose a new estimation procedure for the mixture of Gaussian processes to incorporate both the functional and the inhomogeneous properties of the data. Our method can be viewed as a natural extension of high-dimensional normal mixtures. The key difference, however, is that smoothed structures are imposed on both the mean and covariance functions. The model is shown to be identifiable and can be estimated efficiently by a combination of the ideas from the expectation-maximization (EM) algorithm, kernel regression, and functional principal component analysis. Our methodology is empirically justified by Monte Carlo simulations and illustrated by an analysis of a supermarket dataset.
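The EM part of such a procedure can be sketched with a deliberately simplified working model: two component mean curves on a common grid with i.i.d. Gaussian noise, omitting the kernel smoothing and FPCA steps of the text. The simulated two-cluster data and the crude initialization are assumptions of this sketch.

```python
import numpy as np

def em_curve_mixture(Y, n_iter=50):
    """Toy EM for a two-component mixture of curves observed on a
    common grid (working-independence simplification of a mixture of
    Gaussian processes; no smoothing step)."""
    n, T = Y.shape
    s = Y @ Y[0]                         # similarity to the first curve
    resp = np.where(s > 0, 0.9, 0.1)     # crude initial responsibilities
    for _ in range(n_iter):
        # M-step: weighted mean curves, mixing weights, common variance
        w = np.column_stack([resp, 1.0 - resp])           # n x 2
        mu = (w.T @ Y) / w.sum(axis=0)[:, None]           # 2 x T
        pi = w.mean(axis=0)
        resid2 = np.stack([((Y - mu[k]) ** 2).sum(axis=1) for k in range(2)])
        sigma2 = (w.T * resid2).sum() / (n * T)
        # E-step: posterior probability that each curve is from component 0
        loglik = -resid2 / (2.0 * sigma2) + np.log(pi)[:, None]
        loglik -= loglik.max(axis=0)     # numerical stabilization
        lik = np.exp(loglik)
        resp = lik[0] / lik.sum(axis=0)
    return resp, mu

# two clusters of 25 curves each, around sin and -sin mean functions
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 30)
Y = np.vstack([ np.sin(2 * np.pi * t) + 0.3 * rng.standard_normal((25, 30)),
               -np.sin(2 * np.pi * t) + 0.3 * rng.standard_normal((25, 30))])
resp, mu = em_curve_mixture(Y)
labels = (resp > 0.5).astype(int)
```

With this separation the posterior responsibilities cluster the curves essentially perfectly, up to label switching.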

19.
We propose an objective Bayesian method for the comparison of all Gaussian directed acyclic graphical models defined on a given set of variables. The method, which is based on the notion of fractional Bayes factor (BF), requires a single default (typically improper) prior on the space of unconstrained covariance matrices, together with a prior sample size hyper-parameter, which can be set to its minimal value. We show that our approach produces genuine BFs. The implied prior on the concentration matrix of any complete graph is a data-dependent Wishart distribution, and this in turn guarantees that Markov equivalent graphs are scored with the same marginal likelihood. We specialize our results to the smaller class of Gaussian decomposable undirected graphical models and show that in this case they coincide with those recently obtained using limiting versions of hyper-inverse Wishart distributions as priors on the graph-constrained covariance matrices.

20.
Supremum score test statistics are often used to evaluate hypotheses with unidentifiable nuisance parameters under the null hypothesis. Although these statistics provide an attractive framework to address non-identifiability under the null hypothesis, little attention has been paid to their distributional properties in small to moderate sample size settings. In situations where there are identifiable nuisance parameters under the null hypothesis, these statistics may behave erratically in realistic samples as a result of a non-negligible bias induced by substituting these nuisance parameters by their estimates under the null hypothesis. In this paper, we propose an adjustment to the supremum score statistics by subtracting the expected bias from the score processes and show that this adjustment does not alter the limiting null distribution of the supremum score statistics. Using a simple example from the class of zero-inflated regression models for count data, we show empirically and theoretically that the adjusted tests are superior in terms of size and power. The practical utility of this methodology is illustrated using count data in HIV research.

