Similar articles
20 similar articles found.
1.
We consider the assessment of deoxyribonucleic acid (DNA) profiles from biological samples containing a mixture of DNA from more than one person. The problem has been investigated in the context of likelihood ratios by Weir and co-workers under the assumption of independent alleles in DNA profiles. However, uncertainty about independence may arise from various factors such as population substructure and relatedness. This issue has received considerable attention in recent years. Ignoring this uncertainty may seriously overstate the strength of the evidence and therefore disadvantage innocent suspects. Taking this uncertainty into account, we develop a general formula for calculating the match probabilities of DNA profiles. Thus, we extend the result derived by Weir and co-workers to the dependence situation, which is often more to the benefit of the defendant in comparison with the simple product rule result based on an independence assumption. The effect of dependence of alleles on likelihood ratio estimates can be seen in the analysis of two real data sets.

2.
The purpose of this paper is to prove, through the analysis of the behaviour of a standard kernel density estimator, that the notion of weak dependence defined in a previous paper (cf. Doukhan & Louhichi, 1999) has sufficiently sharp properties to be used in various situations. More precisely, we investigate the asymptotics of high order losses, asymptotic distributions and uniform almost sure behaviour of kernel density estimates. We prove that they are the same as for independent samples (with some restrictions for a.s. behaviours). Recall finally that this weak dependence condition extends previously defined ones such as mixing and association, and that it allows the consideration of new classes such as weak shifts processes based on independent sequences as well as some non-mixing Markov processes.

3.
Recurrent event data arise in longitudinal studies where each study subject may experience multiple events during the follow-up. In many situations in survival studies, pairs of individuals can potentially experience recurrent events. The analysis of such data is not straightforward as it involves two kinds of dependence, namely, dependence between the individuals in the same pair and dependence among a sequence of pairs. In the present paper, we introduce a new stochastic model for the analysis of such recurrent event data. Nonparametric estimators for a bivariate survivor function are developed. Asymptotic properties of the estimators are discussed. Simulation studies are carried out to assess the finite sample properties of the estimators. We illustrate the procedure with real-life data on eye disease.

4.
Non-coding deoxyribonucleic acid (DNA) can typically be modelled by a sequence of Bernoulli random variables by coding one base, e.g. T, as 1 and other bases as 0. If a segment of a sequence is functionally important, the probability of a 1 will be different in this changed segment from that in the surrounding DNA. It is important to be able to see whether such a segment occurs in a particular DNA sequence and to pin-point it so that a molecular biologist can investigate its possible function. Here we discuss methods for testing the occurrence of such a changed segment and how to estimate the end points of it. Maximum-likelihood-based methods are not very tractable and so a nonparametric method based on the approach of Pettitt has been developed. The problem and its solution are illustrated by a specific DNA example.
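The Bernoulli coding described in this abstract can be sketched in a few lines of Python. The segment locator below is only an illustrative cumulative-sum heuristic and is not the Pettitt-based method of the paper; the function names `encode` and `cusum_segment` and the toy sequence are assumptions made for the example.

```python
def encode(seq, base="T"):
    """Code the chosen base as 1 and every other base as 0."""
    return [1 if ch == base else 0 for ch in seq.upper()]


def cusum_segment(x):
    """Heuristic end points (as a half-open slice) of a segment whose rate of 1s differs.

    Illustrative only: this is a plain cumulative-sum heuristic, not the paper's test.
    """
    n, total = len(x), sum(x)
    # Scaled centred cumulative sums: n * S_k = n * (x_1 + ... + x_k) - k * total, S_0 = 0.
    # Integer arithmetic avoids floating-point ties.
    sums, running = [0], 0
    for k, xi in enumerate(x, start=1):
        running += xi
        sums.append(n * running - k * total)
    lo = min(range(n + 1), key=sums.__getitem__)
    hi = max(range(n + 1), key=sums.__getitem__)
    # The centred sums turn at the two boundaries of the changed segment.
    return tuple(sorted((lo, hi)))


x = encode("AAAATTTTAAAA")  # hypothetical toy sequence
start, end = cusum_segment(x)
print(x[start:end])
```

On the toy sequence, the centred sums reach their minimum just before the T-rich block and their maximum at its end, so the slice covers the block. A significance test for whether a changed segment exists at all is a separate step that this sketch does not attempt.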

5.
Fitting Markov chain models to discrete state series such as DNA sequences   (Cited by: 1; self-citations: 0; citations by others: 1)
Discrete state series such as DNA sequences can often be modelled by Markov chains. The analysis of such series is discussed in the context of log-linear models. The data produce contingency tables with similar margins due to the dependence of the observations. However, despite the unusual structure of the tables, the analysis is equivalent to that for data from multinomial sampling. The reason why the standard number of degrees of freedom is correct is explained by using theoretical arguments and the asymptotic distribution of the deviance is verified empirically. Problems involved with fitting high order Markov chain models, such as reduced power and computational expense, are also discussed.
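As a rough illustration of the contingency tables mentioned in this abstract, the sketch below tabulates first-order transitions in a sequence and turns each row into estimated transition probabilities. It assumes a first-order chain and a hypothetical toy sequence; the log-linear modelling and deviance analysis of the paper are not reproduced.

```python
from collections import Counter, defaultdict


def transition_counts(seq):
    """Table of first-order transition counts: counts[a][b] = number of times b follows a."""
    counts = defaultdict(Counter)
    for a, b in zip(seq, seq[1:]):
        counts[a][b] += 1
    return counts


def transition_probs(counts):
    """Row-normalise the count table to obtain estimated transition probabilities."""
    return {
        a: {b: c / sum(row.values()) for b, c in row.items()}
        for a, row in counts.items()
    }


counts = transition_counts("ACGTACGT")  # hypothetical toy sequence
probs = transition_probs(counts)
```

Because every interior symbol is counted once as a "from" state and once as a "to" state, the row and column totals of the table differ by at most one, which is the "similar margins" structure the abstract refers to.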

6.
In this paper we investigate the impact of model mis-specification, in terms of the dependence structure in the extremes of a spatial process, on the estimation of key quantities that are of interest to hydrologists and engineers. For example, it is often the case that severe flooding occurs as a result of the observation of rainfall extremes at several locations in a region simultaneously. Thus, practitioners might be interested in estimates of the joint exceedance probability of some high levels across these locations. It is likely that there will be spatial dependence present between the extremes, and this should be properly accounted for when estimating such probabilities. We compare the use of standard models from the geostatistics literature with max-stable models from extreme value theory. We find that, in some situations, using an incorrect spatial model for our extremes results in a significant under-estimation of these probabilities which – in flood defence terms – could lead to substantial under-protection.

7.
Genomic alterations have been linked to the development and progression of cancer. The technique of comparative genomic hybridization (CGH) yields data consisting of fluorescence intensity ratios of test and reference DNA samples. The intensity ratios provide information about the number of copies in DNA. Practical issues such as the contamination of tumor cells in tissue specimens and normalization errors necessitate the use of statistics for learning about the genomic alterations from array CGH data. As increasing amounts of array CGH data become available, there is a growing need for automated algorithms for characterizing genomic profiles. Specifically, there is a need for algorithms that can identify gains and losses in the number of copies based on statistical considerations, rather than merely detect trends in the data. We adopt a Bayesian approach, relying on the hidden Markov model to account for the inherent dependence in the intensity ratios. Posterior inferences are made about gains and losses in copy number. Localized amplifications (associated with oncogene mutations) and deletions (associated with mutations of tumor suppressors) are identified using posterior probabilities. Global trends such as extended regions of altered copy number are detected. Because the posterior distribution is analytically intractable, we implement a Metropolis-within-Gibbs algorithm for efficient simulation-based inference. Publicly available data on pancreatic adenocarcinoma, glioblastoma multiforme, and breast cancer are analyzed, and comparisons are made with some widely used algorithms to illustrate the reliability and success of the technique.

8.
The class of bivariate copulas that are invariant under truncation with respect to one variable is considered. A simulation algorithm for the members of the class and a novel construction method are presented. Moreover, inspired by a stochastic interpretation of the members of such a class, a procedure is suggested to check whether the dependence structure of a given data set is truncation invariant. The overall performance of the procedure has been illustrated on both simulated and real data.

9.
Analysis of quantal models is a particular aspect of the general problem of investigating multimodality. The distinction is that the spacings between modes are integral multiples of some unspecified fundamental unit and that the number of modes is not defined. Such semi-structured models arise in a wide variety of contexts such as biology, cosmology, archaeology and molecular physics. This paper presents a brief review of their historical development in such areas as an aid to their recognition in other contexts as well as giving guidance to their analysis from the statistical viewpoint. The available methodology for their analysis is collated into a coherent and self-contained account, establishing various optimality properties under particular parametric distributional assumptions. An illustrative power study shows how dependence on sample size and failure of assumptions such as underlying distribution, origin of measurements and independence affect the power of various analyses. These aspects are illustrated by an example from developmental biology.

10.
Study of a Markov model for a high-quality dependent process   (Cited by: 1; self-citations: 0; citations by others: 1)
For high-quality processes, non-conforming items are seldom observed and the traditional p (or np) charts are not suitable for monitoring the state of the process. A type of chart based on the count of cumulative conforming items has recently been introduced and it is especially useful for automatically collected one-at-a-time data. However, in such a case, it is common that the process characteristics become dependent as items produced one after another are inspected. In this paper, we study the problem of process monitoring when the process is of high quality and measurement values possess a certain serial dependence. The problem of assuming independence is examined and a Markov model for this type of process is studied, upon which suitable control procedures can be developed.
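The two ingredients named in this abstract, the cumulative count of conforming items and a simple check for serial dependence, can be sketched as follows. This is a minimal illustration under assumed conventions (items coded 0 for conforming and 1 for non-conforming, a two-state first-order chain); the function names and the inspection record are hypothetical, and no control limits from the paper are implemented.

```python
def conforming_run_lengths(items):
    """Number of conforming items (0) observed before each non-conforming item (1)."""
    runs, current = [], 0
    for item in items:
        if item:
            runs.append(current)
            current = 0
        else:
            current += 1
    return runs  # a trailing run with no non-conforming item yet is not reported


def lag1_rates(items):
    """Estimate P(next item non-conforming | state of previous item) from consecutive pairs."""
    pairs = {0: [0, 0], 1: [0, 0]}  # previous state -> [number of pairs, non-conforming next]
    for prev, nxt in zip(items, items[1:]):
        pairs[prev][0] += 1
        pairs[prev][1] += nxt
    return {prev: (hits / n if n else None) for prev, (n, hits) in pairs.items()}


items = [0, 0, 1, 0, 0, 0, 1, 0]  # hypothetical inspection record
runs = conforming_run_lengths(items)
rates = lag1_rates(items)
```

Under independence the two conditional rates returned by `lag1_rates` would estimate the same quantity, so a marked difference between them on a long record is one informal sign that the independence assumption is questionable.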

11.
There is often more structure in the way two random variables are associated than a single scalar dependence measure, such as correlation, can reflect. Local dependence functions such as that of Holland and Wang (1987) are, therefore, useful. However, it can be argued that estimated local dependence functions convey information that is too detailed to be easily interpretable. We seek to remedy this difficulty, and hence make local dependence a more readily interpretable practical tool, by introducing dependence maps. Via local permutation testing, dependence maps simplify the estimated local dependence structure between two variables by identifying regions of (significant) positive, (not significant) zero and (significant) negative local dependence. When viewed in conjunction with an estimate of the joint density, a comprehensive picture of the joint behaviour of the variables is provided. A little theory, many implementational details and several examples are given.

12.
Bootstrapping the conditional copula   (Cited by: 1; self-citations: 0; citations by others: 1)
This paper is concerned with inference about the dependence or association between two random variables conditionally upon the given value of a covariate. A way to describe such a conditional dependence is via a conditional copula function. Nonparametric estimators for a conditional copula then lead to nonparametric estimates of conditional association measures such as a conditional Kendall's tau. The limiting distributions of nonparametric conditional copula estimators are rather involved. In this paper we propose a bootstrap procedure for approximating these distributions and their characteristics, and establish its consistency. We apply the proposed bootstrap procedure for constructing confidence intervals for conditional association measures, such as a conditional Blomqvist beta and a conditional Kendall's tau. The performances of the proposed methods are investigated via a simulation study involving a variety of models, ranging from models in which the dependence (weak or strong) on the covariate is only through the copula and not through the marginals, to models in which this dependence appears in both the copula and the marginal distributions. As a conclusion we provide practical recommendations for constructing bootstrap-based confidence intervals for the discussed conditional association measures.

13.
Global sensitivity analysis with variance-based measures suffers from several theoretical and practical limitations, since such measures focus only on the variance of the output and handle multivariate variables in a limited way. In this paper, we introduce a new class of sensitivity indices based on dependence measures which overcomes these insufficiencies. Our approach originates from the idea of comparing the output distribution with its conditional counterpart when one of the input variables is fixed. We establish that this comparison yields previously proposed indices when it is performed with Csiszár f-divergences, as well as sensitivity indices which are well-known dependence measures between random variables. This leads us to investigate completely new sensitivity indices based on recent state-of-the-art dependence measures, such as distance correlation and the Hilbert–Schmidt independence criterion. We also emphasize the potential of feature selection techniques relying on such dependence measures as alternatives to screening in high dimension.

14.
Traditional reliability models do not adequately reflect the effect of performance dependence among subsystems on system reliability, and they neglect the problems of initial reliability and standby redundancy. In this paper, the reliability of a parallel system with multiple active components and a single cold-standby unit is investigated. The simultaneously working components are dependent, and the dependence is expressed by a copula function. Based on the theory of conditional probability, explicit expressions for the reliability and the MTTF of the system, in terms of the copula function and marginal lifetime distributions, are obtained. Taking the copula to be the FGM copula and the marginal lifetime distributions to be exponential, a system with two parallel dependent units and a single cold-standby unit is considered as an example. The effect of different degrees of dependence among components on system reliability is analyzed, and the system reliability can be expressed as a linear combination of exponential reliability functions with different failure rates. To investigate how the degree of dependence affects the mean lifetime, parallel systems with a single cold standby and different numbers of active components are also presented. The effectiveness of the modeling method is verified, and the method provides a theoretical basis for the reliability design of engineering systems and the physics of failure.
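To make the role of the FGM copula concrete, the sketch below computes the reliability of two dependent exponential units working in parallel. It is a simplified illustration, not the paper's model: the cold-standby unit is omitted, the copula is assumed to link the lifetime distribution functions (rather than the survival functions), and the function names and parameter values are hypothetical.

```python
import math


def fgm_copula(u, v, theta):
    """Farlie-Gumbel-Morgenstern copula C(u, v) = u v [1 + theta (1 - u)(1 - v)]."""
    if not -1.0 <= theta <= 1.0:
        raise ValueError("the FGM copula requires -1 <= theta <= 1")
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))


def parallel_reliability(t, rate1, rate2, theta):
    """Reliability at time t of two dependent exponential units in parallel.

    A parallel pair fails only when both units have failed, so
    R(t) = 1 - C(F1(t), F2(t)), with the copula applied to the lifetime CDFs
    (an assumption of this sketch; the cold-standby unit is not modelled).
    """
    f1 = 1.0 - math.exp(-rate1 * t)
    f2 = 1.0 - math.exp(-rate2 * t)
    return 1.0 - fgm_copula(f1, f2, theta)


independent = parallel_reliability(1.0, 1.0, 2.0, theta=0.0)
dependent = parallel_reliability(1.0, 1.0, 2.0, theta=0.5)
```

With `theta = 0` the expression reduces to the usual independent-components formula. Under the convention assumed here, a positive `theta` makes joint failure more likely and therefore lowers the reliability of the parallel pair. Expanding the product also shows that R(t) is a linear combination of exponential terms, in line with the statement in the abstract.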

15.
Data arising from a randomized double-masked clinical trial for multiple sclerosis have provided particularly variable longitudinal repeated measurements responses. Specific models for such data, other than those based on the multivariate normal distribution, would be a valuable addition to the applied statistician's toolbox. A useful family of multivariate distributions can be generated by substituting the integrated intensity of one distribution into a second (outer) distribution. The parameters in the second distribution are then used to create a dependence structure among observations on a unit. These may either be a form of serial dependence for longitudinal data or of uniform dependence within clusters. These are respectively analogous to the Kalman filter of state space models and to copulas, but they have the major advantage that they do not require any explicit integration. One useful outer distribution for constructing such multivariate distributions is the Pareto distribution. Certain special models based on it have previously been used in event history analysis, but those considered here have much wider application.

16.
It is very well known that analyses for missing data depend on untestable assumptions. As a consequence, in such settings, sensitivity analyses are often sensible. One such class of analyses assesses the dependence of conclusions on an explicit missing value mechanism. Inevitably, there is an association between such dependence and the actual (but unknown) distribution of the missing data. In a particular parametric framework for dropout in this paper, an approach is presented that reduces (but never removes) the impact of incorrect assumptions on the form of the association. It is shown how these models can be formulated and fitted relatively simply using hierarchical likelihood. These are applied directly to an example involving mastitis in dairy cattle, and an extensive simulation study is described to show the properties of the methods.

17.
In this paper, we study the change-point inference problem motivated by genomic data collected for the purpose of monitoring DNA copy number changes. DNA copy number changes or copy number variations (CNVs) correspond to chromosomal aberrations and signify abnormality of a cell. Cancer development and other related diseases are usually associated with DNA copy number changes on the genome. Such data contain inherent random noise, so an appropriate statistical model is needed to identify statistically significant DNA copy number changes. This type of statistical inference is crucial in cancer research, clinical diagnostic applications, and other related genomic research. For the high-throughput genomic data resulting from DNA copy number experiments, a mean and variance change point model (MVCM) for detecting the CNVs is appropriate. We propose a Bayesian approach to study the MVCM for the case of one change, and a sliding window to search for all CNVs on a given chromosome. We carry out simulation studies to evaluate the estimate of the locus of the DNA copy number change using the derived posterior probability. These simulation results show that the approach is suitable for identifying copy number changes. The approach is also illustrated on several chromosomes from array-based comparative genomic hybridization data on nine fibroblast cancer cell lines. All DNA copy number aberrations that have been identified and verified by karyotyping are detected by our approach on these cell lines.

18.
The author suggests a heuristic method for detecting the dependence of random time series that can be used in the case when this dependence is relatively weak, such that the traditional methods are not effective. The method requires comparison of some special functionals on the sample characteristic functions with the same functionals computed for the benchmark time series with a known degree of correlation. Some experiments for financial time series are presented.

19.
Finite memory sources and variable‐length Markov chains have recently gained popularity in data compression and mining, in particular, for applications in bioinformatics and language modelling. Here, we consider denser data compression and prediction with a family of sparse Bayesian predictive models for Markov chains in finite state spaces. Our approach lumps transition probabilities into classes composed of invariant probabilities, such that the resulting models need not have a hierarchical structure as in context tree‐based approaches. This can lead to a substantially higher rate of data compression, and such non‐hierarchical sparse models can be motivated for instance by data dependence structures existing in the bioinformatics context. We describe a Bayesian inference algorithm for learning sparse Markov models through clustering of transition probabilities. Experiments with DNA sequence and protein data show that our approach is competitive in both prediction and classification when compared with several alternative methods on the basis of variable memory length.

20.
In this article, we study the effect of dependence on the distributional properties of functions of two random variables. Expressions for the cumulative distribution functions of the linear combinations, products, and ratios of two dependent random variables in terms of their associated copula are derived. We discuss the effect of dependence on quantities such as the variances of linear combinations of functions, the value-at-risk measure, and the stress–strength parameter. Several examples, a simulation study, and a real data analysis are provided to illustrate the results.
