Similar Articles
 20 similar articles found (search time: 31 ms)
1.
Evaluation of trace evidence in the form of multivariate data (total citations: 1; self: 0; other: 1)
Summary. The evaluation of measurements on characteristics of trace evidence found at a crime scene and on a suspect is an important part of forensic science. Five methods of assessment for the value of the evidence for multivariate data are described. Two are based on significance tests and three on the evaluation of likelihood ratios. The likelihood ratio which compares the probability of the measurements on the evidence assuming a common source for the crime scene and suspect evidence with the probability of the measurements on the evidence assuming different sources for the crime scene and suspect evidence is a well-documented measure of the value of the evidence. One of the likelihood ratio approaches transforms the data to a univariate projection based on the first principal component. The other two versions of the likelihood ratio for multivariate data account for correlation among the variables and for two levels of variation: that between sources and that within sources. One version assumes that between-source variability is modelled by a multivariate normal distribution; the other version models the variability with a multivariate kernel density estimate. Results are compared from the analysis of measurements on the elemental composition of glass.
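The same-source versus different-source likelihood ratio described in this abstract can be sketched in a simplified univariate form with known variance components (the paper's model is multivariate with estimated between- and within-source variation; all values below are invented for illustration):

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def bivariate_normal_pdf(x, y, mean, var, cov):
    # density of (x, y) jointly normal with equal means/variances and covariance cov
    det = var * var - cov * cov
    q = (var * (x - mean) ** 2 - 2 * cov * (x - mean) * (y - mean)
         + var * (y - mean) ** 2) / det
    return math.exp(-q / 2) / (2 * math.pi * math.sqrt(det))

def likelihood_ratio(x, y, mu, var_between, var_within):
    """LR = p(x, y | common source) / p(x, y | different sources).

    Under a common source the two measurements share a latent source mean,
    giving covariance var_between; under different sources they are independent.
    """
    v = var_between + var_within          # marginal variance of one measurement
    numerator = bivariate_normal_pdf(x, y, mu, v, var_between)
    denominator = normal_pdf(x, mu, v) * normal_pdf(y, mu, v)
    return numerator / denominator
```

With a large between-source spread, two nearby measurements (relative to the within-source variation) yield an LR above one, and two distant measurements an LR far below one.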

2.
The likelihood ratio (LR) measures the relative weight of forensic data regarding two hypotheses. Several levels of uncertainty arise if frequentist methods are chosen for its assessment: the assumed population model only approximates the true one, and its parameters are estimated through a limited database. Moreover, it may be wise to discard part of the data, especially that only indirectly related to the hypotheses. Different reductions define different LRs. Therefore, it is more sensible to talk about ‘a’ LR instead of ‘the’ LR, and the error involved in the estimation should be quantified. Two frequentist methods are proposed in the light of these points for the ‘rare type match problem’, that is, when a match between the perpetrator's and the suspect's DNA profile, never observed before in the database of reference, is to be evaluated.
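This abstract's two estimators are not reproduced here; as a loosely related illustration of the ingredients (estimating the probability mass of never-observed types from a reference database, and quantifying the estimation error), the following sketch uses the classical Good–Turing unseen-mass estimate with a bootstrap spread. The database, type names, and population weights are all synthetic:

```python
import random
from collections import Counter

def good_turing_unseen_mass(sample):
    # the proportion of singletons estimates the total probability of unseen types
    counts = Counter(sample)
    n1 = sum(1 for c in counts.values() if c == 1)
    return n1 / len(sample)

random.seed(1)
# synthetic "database": draws from a long-tailed population of 400 types
population = [f"type{i}" for i in range(1, 401)]
weights = [1.0 / i for i in range(1, 401)]
database = random.choices(population, weights=weights, k=500)

p0_hat = good_turing_unseen_mass(database)

# bootstrap resampling to quantify the estimation error, echoing the paper's
# point that an estimated LR should come with a statement of its uncertainty
boot = []
for _ in range(200):
    resample = random.choices(database, k=len(database))
    boot.append(good_turing_unseen_mass(resample))
boot.sort()
ci = (boot[4], boot[194])  # rough 95% percentile interval
```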

3.
The mean past lifetime (MPL) function (also known as the expected inactivity time function) is of interest in many fields such as reliability theory and survival analysis, actuarial studies and forensic science. Several procedures for estimating the MPL function have been proposed in the literature. In this paper, we give a central limit theorem for the estimator of the MPL function based on a right-censored random sample from an unknown distribution. The limiting distribution is used to construct a normal approximation-based confidence interval for the MPL. Furthermore, we use the empirical likelihood ratio procedure to obtain a confidence interval for the MPL function. These two intervals are compared with each other through a simulation study in terms of coverage probability. Finally, a couple of numerical examples illustrating the theory are also given.
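In the uncensored special case, the MPL at time t is simply the expected time elapsed since failure among units that have failed by t, and its empirical version is a plain conditional average (the paper's estimator handles right-censoring, which this sketch does not):

```python
import random

def empirical_mpl(sample, t):
    """Empirical mean past lifetime at t: average of (t - x) over
    observations x <= t (uncensored special case)."""
    past = [t - x for x in sample if x <= t]
    if not past:
        raise ValueError("no observations at or below t")
    return sum(past) / len(past)

random.seed(0)
# For X ~ Uniform(0, 1), X | X <= t is Uniform(0, t), so MPL(t) = t / 2.
sample = [random.random() for _ in range(20000)]
mpl = empirical_mpl(sample, 0.8)
```

The uniform example gives a closed-form target, MPL(0.8) = 0.4, against which the estimator can be checked.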

4.
This article analyses diffusion-type processes from a new point of view. Consider two statistical hypotheses on a diffusion process. We do not use a classical Neyman–Pearson test to reject or accept one hypothesis, nor do we take a Bayesian approach. As an alternative, we propose using a likelihood paradigm to characterize the statistical evidence in support of these hypotheses. The method is based on evidential inference as introduced and described by Royall [Royall R. Statistical evidence: a likelihood paradigm. London: Chapman and Hall; 1997]. In this paper, we extend Royall's theory to the case where the data are observations from a diffusion-type process rather than iid observations. The empirical distribution of the likelihood ratio is used to formulate the probabilities of strong, misleading and weak evidence. Since the strength of evidence can be affected by the sampling characteristics, we present a simulation study that demonstrates these effects, and we attempt to control and reduce misleading evidence by adjusting these characteristics. As an illustration, we apply the method to Microsoft stock prices.
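Royall's probability of misleading evidence, which this abstract extends to diffusions, is easiest to see in the simplest iid normal setting: simulate data under one hypothesis and count how often the likelihood ratio strongly favours the other. All constants here are illustrative choices, not the paper's:

```python
import math, random

random.seed(42)
k = 8            # threshold for "strong" evidence
n = 10           # sample size per experiment
reps = 20000
misleading = 0

# H0: N(0,1) is true; H1: N(1,1). Evidence is misleading when the
# likelihood ratio favours H1 by a factor of at least k.
for _ in range(reps):
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]
    log_lr = sum(xs) - n / 2.0   # log LR of H1 vs H0 for unit-variance normals
    if log_lr >= math.log(k):
        misleading += 1

prob_misleading = misleading / reps
```

The simulated probability stays well below Royall's universal bound 1/k, consistent with the theory.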

5.
How often would investigators be misled if they took advantage of the likelihood principle and used likelihood ratios—which need not be adjusted for multiple looks at the data—to frequently examine accumulating data? The answer, perhaps surprisingly, is not often. As expected, the probability of observing misleading evidence does increase with each additional examination. However, the amount by which this probability increases converges to zero as the sample size grows. As a result, the probability of observing misleading evidence remains bounded—and therefore controllable—even with an infinite number of looks at the data. Here we use boundary crossing results to detail how often misleading likelihood ratios arise in sequential designs. We find that the probability of observing a misleading likelihood ratio is often much less than its universal bound. Additionally, we find that in the presence of fixed-dimensional nuisance parameters, profile likelihoods are to be preferred over estimated likelihoods which result from replacing the nuisance parameters by their global maximum likelihood estimates.
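The sequential claim can be checked by simulation: examine the likelihood ratio after every observation and record how often it ever crosses the strong-evidence threshold under the true hypothesis. This is a sketch with invented constants, not the paper's boundary-crossing analysis:

```python
import math, random

random.seed(7)
k = 8                 # strong-evidence threshold; the universal bound is 1/k
max_looks = 50
reps = 4000
ever_misleading = 0

# H0: N(0,1) is true; H1: N(1,1). Look at the LR after every observation.
for _ in range(reps):
    log_lr = 0.0
    for _ in range(max_looks):
        x = random.gauss(0.0, 1.0)
        log_lr += x - 0.5        # one-step log LR increment for H1 vs H0
        if log_lr >= math.log(k):
            ever_misleading += 1
            break

prob_ever = ever_misleading / reps
```

Even with fifty looks, the frequency of ever observing misleading strong evidence stays below the universal bound 1/k, as the abstract states.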

6.
In forensic science, the rare type match problem arises when the matching characteristic from the suspect and the crime scene is not in the reference database; hence, it is difficult to evaluate the likelihood ratio that compares the defense and prosecution hypotheses. A recent solution consists of modeling the ordered population probabilities according to the two-parameter Poisson–Dirichlet distribution, which is a well-known Bayesian nonparametric prior, and plugging the maximum likelihood estimates of the parameters into the likelihood ratio. We demonstrate that this approximation produces a systematic bias that fully Bayesian inference avoids. Motivated by this forensic application, we consider the need to learn the posterior distribution of the parameters that govern the two-parameter Poisson–Dirichlet using two sampling methods: Markov chain Monte Carlo and approximate Bayesian computation. These methods are evaluated in terms of accuracy and efficiency. Finally, we compare the likelihood ratio that is obtained by our proposal with the existing solution using a database of Y-chromosome haplotypes.

7.
This paper describes an EM algorithm for maximum likelihood estimation in generalized linear models (GLMs) with continuous measurement error in the explanatory variables. The algorithm is an adaptation of that for nonparametric maximum likelihood (NPML) estimation in overdispersed GLMs described in Aitkin (Statistics and Computing 6: 251–262, 1996). The measurement error distribution can be of any specified form, though the implementation described assumes normal measurement error. Neither the reliability nor the distribution of the true score of the variables with measurement error has to be known, nor are instrumental variables or replication required. Standard errors can be obtained by omitting individual variables from the model, as in Aitkin (1996). Several examples are given, of normal and Bernoulli response variables.

8.
Summary. The strength of statistical evidence is measured by the likelihood ratio. Two key performance properties of this measure are the probability of observing strong misleading evidence and the probability of observing weak evidence. For the likelihood function associated with a parametric statistical model, these probabilities have a simple large sample structure when the model is correct. Here we examine how that structure changes when the model fails. This leads to criteria for determining whether a given likelihood function is robust (continuing to perform satisfactorily when the model fails), and to a simple technique for adjusting both likelihoods and profile likelihoods to make them robust. We prove that the expected information in the robust adjusted likelihood cannot exceed the expected information in the likelihood function from a true model. We note that the robust adjusted likelihood is asymptotically fully efficient when the working model is correct, and we show that in some important examples this efficiency is retained even when the working model fails. In such cases the Bayes posterior probability distribution based on the adjusted likelihood is robust, remaining correct asymptotically even when the model for the observable random variable does not include the true distribution. Finally we note a link to standard frequentist methodology—in large samples the adjusted likelihood functions provide robust likelihood-based confidence intervals.

9.
Kang (2006) and Kang and Larsen (in press) used the log-likelihood function with Lagrangian multipliers for estimation of cell probabilities in two-way incomplete contingency tables. This paper extends those results and simulations to three-way and multi-way tables. Numerous studies cross-classify subjects by three or more categorical factors. Constraints on cell probabilities are incorporated through Lagrangian multipliers. Variances of the MLEs are derived from the matrix of second derivatives of the log-likelihood with respect to the cell probabilities and the Lagrangian multiplier. Wald and likelihood ratio tests of independence are derived using the estimates and estimated variances. In the simulation results of Kang and Larsen (in press), for data missing at random, maximum likelihood estimation (MLE) produced more efficient estimates of population proportions than either multiple imputation (MI) based on data augmentation or complete-case (CC) analysis. Neither MLE nor MI, however, leads to an improvement over CC analysis with respect to the power of tests for independence in two-way tables. Results are extended to multidimensional tables with arbitrary patterns of missing data when the variables are recorded on individual subjects. In three-way and higher-way tables, however, partially classified observations carry information relevant for judging independence, as long as two or more variables are jointly observed. The simulations study three-dimensional tables with three patterns of association and two levels of missing information.
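The likelihood ratio test of independence mentioned here reduces, for a complete-data two-way table, to the familiar G² statistic comparing observed counts with the expected counts under independence (a minimal sketch; the paper's setting is three-way tables with missing data and Lagrangian constraints):

```python
import math

def g2_independence(table):
    """Likelihood ratio (G^2) statistic for independence in a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if obs > 0:
                g2 += 2.0 * obs * math.log(obs / expected)
    return g2
```

For a 2x2 table G² is compared with the chi-squared critical value on one degree of freedom (3.84 at the 5% level); a perfectly independent table gives G² = 0.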

10.
One important type of question in statistical inference is how to interpret data as evidence. The law of likelihood provides a satisfactory answer in interpreting data as evidence for simple hypotheses, but remains silent for composite hypotheses. This article examines how the law of likelihood can be extended to composite hypotheses within the scope of the likelihood principle. From a system of axioms, we conclude that the strength of evidence for composite hypotheses should be represented by an interval between the lower and upper profile likelihoods. This article is intended to reveal the connection between profile likelihoods and the law of likelihood under the likelihood principle rather than argue in favor of the use of profile likelihoods in addressing general questions of statistical inference. The interpretation of the result is also discussed.
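A profile likelihood is obtained by maximising the full likelihood over the nuisance parameters at each value of the parameter of interest; the set of values whose profile likelihood is within a factor 1/k of the maximum forms a likelihood interval. A minimal sketch for a normal mean with the variance profiled out (the grid, sample, and k = 8 are illustrative choices):

```python
import math, random

def profile_loglik(sample, mu):
    """Profile log-likelihood of a normal mean, with the variance
    maximised out (up to an additive constant)."""
    n = len(sample)
    s2 = sum((x - mu) ** 2 for x in sample) / n
    return -0.5 * n * math.log(s2)

def likelihood_interval(sample, k=8.0, grid_width=3.0, steps=2000):
    """All mu whose profile likelihood is within a factor 1/k of the maximum."""
    n = len(sample)
    xbar = sum(sample) / n
    best = profile_loglik(sample, xbar)   # the MLE of mu is the sample mean
    lo = hi = xbar
    for i in range(steps + 1):
        mu = xbar - grid_width + 2 * grid_width * i / steps
        if profile_loglik(sample, mu) >= best - math.log(k):
            lo = min(lo, mu)
            hi = max(hi, mu)
    return lo, hi

random.seed(3)
sample = [random.gauss(2.0, 1.0) for _ in range(200)]
lo, hi = likelihood_interval(sample)
```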

11.
Mixture separation for mixed-mode data (total citations: 3; self: 0; other: 3)
One possible approach to cluster analysis is the mixture maximum likelihood method, in which the data to be clustered are assumed to come from a finite mixture of populations. The method has been well developed, and much used, for the case of multivariate normal populations. Practical applications, however, often involve mixtures of categorical and continuous variables. Everitt (1988) and Everitt and Merette (1990) extended the normal model to deal with such data by incorporating the use of thresholds for the categorical variables. The computations involved in this model are so extensive, however, that it is only feasible for data containing very few categorical variables. In the present paper we consider an alternative model, known as the homogeneous Conditional Gaussian model in graphical modelling and as the location model in discriminant analysis. We extend this model to the finite mixture situation, obtain maximum likelihood estimates for the population parameters, and show that computation is feasible for an arbitrary number of variables. Some data sets are clustered by this method, and a small simulation study demonstrates characteristics of its performance.
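The mixture maximum likelihood method referred to above is usually fitted by EM. As a minimal illustration (a univariate two-component Gaussian mixture, not the paper's location model; initial values and data are invented):

```python
import math, random

def normal_pdf(x, mean, sd):
    return math.exp(-((x - mean) ** 2) / (2 * sd * sd)) / (sd * math.sqrt(2 * math.pi))

def em_two_normals(data, iters=60):
    """EM for a two-component univariate normal mixture."""
    data = sorted(data)
    n = len(data)
    # crude initialisation from the lower and upper deciles
    m1, m2 = data[n // 10], data[9 * n // 10]
    s1 = s2 = (data[-1] - data[0]) / 4 or 1.0
    pi1 = 0.5
    for _ in range(iters):
        # E-step: responsibility of component 1 for each point
        r = []
        for x in data:
            a = pi1 * normal_pdf(x, m1, s1)
            b = (1 - pi1) * normal_pdf(x, m2, s2)
            r.append(a / (a + b))
        # M-step: weighted parameter updates
        w1 = sum(r)
        w2 = n - w1
        m1 = sum(ri * x for ri, x in zip(r, data)) / w1
        m2 = sum((1 - ri) * x for ri, x in zip(r, data)) / w2
        s1 = math.sqrt(sum(ri * (x - m1) ** 2 for ri, x in zip(r, data)) / w1) or 1e-6
        s2 = math.sqrt(sum((1 - ri) * (x - m2) ** 2 for ri, x in zip(r, data)) / w2) or 1e-6
        pi1 = w1 / n
    return (m1, s1), (m2, s2), pi1

random.seed(5)
data = [random.gauss(0.0, 1.0) for _ in range(500)] + \
       [random.gauss(5.0, 1.0) for _ in range(500)]
comp1, comp2, weight1 = em_two_normals(data)
```

On well-separated simulated clusters the fitted component means recover the generating values closely.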

12.
Non-Gaussian spatial responses are usually modeled using a spatial generalized linear mixed model with spatial random effects. The likelihood function of this model usually cannot be given in closed form, so the maximum likelihood approach is very challenging. There are numerical ways to maximize the likelihood function, such as the Monte Carlo Expectation Maximization and Quadrature Pairwise Expectation Maximization algorithms, but in such cases they may be computationally very slow or even prohibitive. The Gauss–Hermite quadrature approximation is only suitable for low-dimensional latent variables, and its accuracy depends on the number of quadrature points. Here, we propose a new approximate pairwise maximum likelihood method for inference in the spatial generalized linear mixed model. This approximate method is fast and deterministic, using no sampling-based strategies. The performance of the proposed method is illustrated through two simulation examples, and practical aspects are investigated through a case study on a rainfall data set.

13.
Abstract. We consider a semi-nonparametric specification for the density of latent variables in Generalized Linear Latent Variable Models (GLLVM). This specification is flexible enough to allow for an asymmetric, multi-modal, heavy- or light-tailed smooth density. The degree of flexibility required by many applications of GLLVMs can be achieved through this semi-nonparametric specification with a finite number of parameters estimated by maximum likelihood. Even with this additional flexibility, we obtain an explicit expression for the likelihood for conditionally normal manifest variables. We show by simulation that the estimated density of the latent variables captures the true one with a good degree of accuracy and is easy to visualize. By analysing two real data sets we show that a flexible distribution of latent variables is a useful tool for exploring the adequacy of the GLLVM in practice.

14.
Latent variable models are widely used for the joint modeling of mixed data, including nominal, ordinal, count and continuous data. In this paper, we consider a latent variable model for jointly modeling relationships between mixed binary, count and continuous variables with some observed covariates. We assume that, given a latent variable, the mixed variables of interest are independent, and that the count and continuous variables have Poisson and normal distributions, respectively. As such data may be extracted from different subpopulations, unobserved heterogeneity has to be taken into account; a mixture distribution for the latent variable is considered, which accounts for this heterogeneity. A generalized EM algorithm, which uses the Newton–Raphson algorithm inside the EM algorithm, is used to compute the maximum likelihood estimates of the parameters. The standard errors of the maximum likelihood estimates are computed using the supplemented EM algorithm. An analysis of the primary biliary cirrhosis data is presented as an application of the proposed model.

15.
Non-linear functional relationships between two variables (curve fitting with errors in both variables) may be defined in a similar way to linear functional relationships. The ratio of the error variances must be fixed to obtain a solution to the likelihood equations, but this solution may not be unique. The best fitting solution may change abruptly as the error variance ratio passes through a critical value. Some examples are given.
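In the linear case that this abstract builds on, fixing the error-variance ratio yields a closed-form maximum likelihood slope (Deming regression). A minimal sketch with a simulated relationship of slope 2 and equal error variances in both variables (all constants invented):

```python
import math, random

def deming_slope(xs, ys, ratio=1.0):
    """ML slope of a linear functional relationship when the ratio
    var(error in y) / var(error in x) is fixed at `ratio`."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs) / n
    syy = sum((y - ybar) ** 2 for y in ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / n
    d = syy - ratio * sxx
    return (d + math.sqrt(d * d + 4 * ratio * sxy ** 2)) / (2 * sxy)

random.seed(11)
true_x = [random.uniform(0.0, 10.0) for _ in range(2000)]
xs = [t + random.gauss(0.0, 0.5) for t in true_x]               # errors in x
ys = [2.0 * t + 1.0 + random.gauss(0.0, 0.5) for t in true_x]   # errors in y

slope = deming_slope(xs, ys, ratio=1.0)
```

Ordinary least squares would be attenuated toward zero here; fixing the variance ratio recovers the true slope.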

16.
The k largest order statistics in a random sample from a common heavy-tailed parent distribution with a regularly varying tail can be characterized as Fréchet extremes. This paper establishes that consecutive ratios of such Fréchet extremes are mutually independent and distributed as functions of beta random variables. The maximum likelihood estimator of the tail index based on these ratios is derived, and the exact distribution of the maximum likelihood estimator is determined for fixed k, as well as the asymptotic distribution as k → ∞. Inferential procedures based upon the maximum likelihood estimator are shown to be optimal. The Fréchet extremes are not directly observable, but a feasible version of the maximum likelihood estimator is equivalent to Hill's statistic. A simple diagnostic is presented that can be used to decide on the largest value of k for which an assumption of Fréchet extremes is sustainable. The results are illustrated using data on commercial insurance claims arising from fires and explosions, and from hurricanes.
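Hill's statistic, to which the feasible estimator above is equivalent, averages the log-ratios of the k largest order statistics to the (k+1)-th largest. A minimal sketch on simulated Pareto data with tail index gamma = 1/alpha = 0.5 (sample size and k are arbitrary choices):

```python
import math, random

def hill_estimator(sample, k):
    """Hill's estimator of the tail index gamma = 1/alpha from the k
    largest order statistics."""
    xs = sorted(sample)
    top = xs[-k:]               # the k largest observations
    threshold = xs[-k - 1]      # the (k+1)-th largest
    return sum(math.log(x / threshold) for x in top) / k

random.seed(13)
alpha = 2.0
# Pareto(alpha) variates: U ** (-1/alpha) has tail index gamma = 1/alpha = 0.5
sample = [random.random() ** (-1.0 / alpha) for _ in range(5000)]
gamma_hat = hill_estimator(sample, k=200)
```

For exact Pareto data the log-ratios are iid exponential with mean gamma, which is the beta/Fréchet structure the paper exploits.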

17.
The maximum likelihood approach to the estimation of factor analytic model parameters most commonly deals with outcomes that are assumed to be multivariate Gaussian random variables in a homogeneous input space. In many practical settings, however, studies needing factor analytic modeling involve data that not only are non-Gaussian but also come from a partitioned input space. This article introduces an extension of maximum likelihood factor analysis that handles multivariate outcomes made up of attributes with different probability distributions, originating from a partitioned input space. An EM algorithm combined with Fisher scoring is used to estimate the parameters of the derived model.

18.
Beta Regression for Modelling Rates and Proportions (total citations: 9; self: 0; other: 9)
This paper proposes a regression model where the response is beta distributed using a parameterization of the beta law that is indexed by mean and dispersion parameters. The proposed model is useful for situations where the variable of interest is continuous and restricted to the interval (0, 1) and is related to other variables through a regression structure. The regression parameters of the beta regression model are interpretable in terms of the mean of the response and, when the logit link is used, of an odds ratio, unlike the parameters of a linear regression that employs a transformed response. Estimation is performed by maximum likelihood. We provide closed-form expressions for the score function, for Fisher's information matrix and its inverse. Hypothesis testing is performed using approximations obtained from the asymptotic normality of the maximum likelihood estimator. Some diagnostic measures are introduced. Finally, practical applications that employ real data are presented and discussed.
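The key device in this paper is the mean/precision parameterization of the beta law: with mean mu and precision phi, the usual shape parameters are p = mu*phi and q = (1 - mu)*phi. A minimal sketch that writes the log-likelihood in this parameterization and recovers (mu, phi) from simulated data by a crude grid search (the paper fits full regressions by score-based maximum likelihood; the sample and grid here are invented):

```python
import math, random

def beta_loglik(n, sum_log_y, sum_log_1my, mu, phi):
    """Beta log-likelihood under the mean/precision parameterisation:
    shape parameters p = mu*phi and q = (1 - mu)*phi."""
    p, q = mu * phi, (1.0 - mu) * phi
    return (n * (math.lgamma(phi) - math.lgamma(p) - math.lgamma(q))
            + (p - 1.0) * sum_log_y + (q - 1.0) * sum_log_1my)

random.seed(17)
mu_true, phi_true = 0.3, 10.0
data = [random.betavariate(mu_true * phi_true, (1.0 - mu_true) * phi_true)
        for _ in range(2000)]

# sufficient statistics for the beta log-likelihood
n = len(data)
s_log = sum(math.log(y) for y in data)
s_log1m = sum(math.log(1.0 - y) for y in data)

# crude grid-search ML over (mu, phi); a real fit would use the score
# function and Fisher information that the paper derives in closed form
candidates = ((m / 100.0, phi)
              for m in range(5, 96)
              for phi in (2.0, 5.0, 8.0, 10.0, 12.0, 15.0, 20.0))
mu_hat, phi_hat = max(candidates,
                      key=lambda t: beta_loglik(n, s_log, s_log1m, t[0], t[1]))
```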

19.
In this paper, local quasi-likelihood regression is considered for stationary random fields of dependent variables. In the case of independent data, local polynomial quasi-likelihood regression is known to have several appealing features such as minimax efficiency, design adaptivity and good boundary behaviour. These properties are shown to carry over to the case of random fields. The asymptotic normality of the regression estimator is established and explicit formulae for its asymptotic bias and variance are derived for strongly mixing stationary random fields. The extension to multi-dimensional covariates is also provided in full generality. Moreover, evaluation of the finite sample performance is made through a simulation study.
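For a Gaussian response, local polynomial quasi-likelihood regression reduces to kernel-weighted least squares. A minimal local linear sketch in the iid setting (the paper's setting is dependent random fields; the function, bandwidth, and sample below are invented):

```python
import math, random

def local_linear(xs, ys, x0, bandwidth):
    """Local linear fit at x0 with a Gaussian kernel (weighted least squares)."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for x, y in zip(xs, ys):
        w = math.exp(-0.5 * ((x - x0) / bandwidth) ** 2)
        d = x - x0
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    # solve the 2x2 weighted normal equations; the intercept is the fit at x0
    det = s0 * s2 - s1 * s1
    return (s2 * t0 - s1 * t1) / det

random.seed(19)
xs = [random.uniform(0.0, 3.0) for _ in range(2000)]
ys = [math.sin(x) + random.gauss(0.0, 0.2) for x in xs]
fit = local_linear(xs, ys, 1.0, bandwidth=0.2)
```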

20.
Inference in generalized linear mixed models with multivariate random effects is often made cumbersome by the high-dimensional intractable integrals involved in the marginal likelihood. This article presents an inferential methodology based on the GEE approach, involving approximations of the marginal likelihood and joint moments of the variables. Approximate Akaike and Bayesian information criteria are also proposed, based on the approximate marginal likelihood with parameters estimated by the GEE approach. The results are illustrated with a simulation study and with an analysis of real data on health-related quality of life.

