Similar Documents
1.
In this paper, a new procedure is described for evaluating the probability that all elements of a normally distributed vector are non-negative, known as the non-centered orthant probability. This probability is defined by a multivariate integral of the density function. The definition is simple, and the probability arises frequently in statistics because the normal distribution is prevalent. Evaluating it, however, is not straightforward, because direct numerical integration is impractical except in low-dimensional cases. In the procedure proposed in this paper, the problem is decomposed into sub-problems of lower dimension. By considering projections onto subspaces, the solutions of the sub-problems can be shared in the evaluation of higher-dimensional problems, so the sub-problems form a lattice structure. This reduces the computational time from factorial order, where interim results are not shared, to order \(p^{2}2^{p}\), which is faster than procedures previously reported in the literature.
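The lattice-based decomposition itself is not spelled out in the abstract. As a point of reference for what is being computed, here is a minimal Monte Carlo sketch of a non-centered orthant probability (the function name and example inputs are hypothetical; the paper's algorithm is exact up to numerical error and far more efficient in high dimensions):

```python
import numpy as np

def orthant_prob_mc(mean, cov, n_draws=500_000, seed=0):
    """Naive Monte Carlo estimate of P(X >= 0) for X ~ N(mean, cov).
    A brute-force baseline; the paper's lattice decomposition shares
    lower-dimensional sub-problems and runs in order p^2 * 2^p time."""
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(mean, cov, size=n_draws)
    return np.mean(np.all(draws >= 0.0, axis=1))

# A 3-dimensional non-centered example (non-zero mean vector).
mean = np.array([0.5, -0.2, 0.1])
cov = np.array([[1.0, 0.3, 0.2],
                [0.3, 1.0, 0.4],
                [0.2, 0.4, 1.0]])
print(orthant_prob_mc(mean, cov))
```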

2.
When constructing models to summarize clinical data for use in simulations, it is good practice to evaluate the models for their capacity to reproduce the data. This can be done by means of Visual Predictive Checks (VPC), which consist of several reproductions of the original study by simulation from the model under evaluation, calculating estimates of interest for each simulated study and comparing the distribution of those estimates with the estimate from the original study. This procedure is a generic method that is, in general, straightforward to apply. Here we consider the application of the method to time-to-event data, and in particular the special case in which a time-varying covariate is not known or cannot be approximated after the event time. In this case, simulations cannot be conducted beyond the end of the follow-up time (event or censoring time) in the original study, so the simulations must be censored at the end of the follow-up time. Since this censoring is not random, the standard Kaplan-Meier (KM) estimates from the simulated studies, and the resulting VPC, will be biased. We propose to use the inverse probability of censoring weighting (IPoC) method to correct the KM estimator for the simulated studies and obtain unbiased VPCs. For analyzing the Cantos study, the IPoC weighting as described here proved valuable and enabled the generation of VPCs to qualify PKPD models for simulations. Here, we use a generated data set, which allows illustration of the different situations and evaluation against the known truth.
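As a sketch of the correction being described, the following implements the standard inverse-probability-of-censoring-weighted form of the KM estimator, with the censoring distribution itself estimated by KM on the flipped indicator; the paper's VPC-specific implementation may differ, and all names here are hypothetical:

```python
import numpy as np

def km_curve(times, indicator):
    """Kaplan-Meier survivor curve for the events flagged by `indicator`."""
    ts = np.unique(times[indicator == 1])
    surv, s = np.ones(len(ts)), 1.0
    for j, t in enumerate(ts):
        at_risk = np.sum(times >= t)
        d = np.sum((times == t) & (indicator == 1))
        s *= 1.0 - d / at_risk
        surv[j] = s
    return ts, surv

def ipoc_survival(times, events, grid):
    """IPoC-weighted survival estimate: each observed event is weighted by
    the inverse of the censoring-survival probability just before its time."""
    c_ts, c_surv = km_curve(times, 1 - events)   # KM of the censoring process
    def g_minus(t):                              # left limit of that curve
        k = np.searchsorted(c_ts, t, side="left") - 1
        return c_surv[k] if k >= 0 else 1.0
    n = len(times)
    w = np.array([e / max(g_minus(t), 1e-12) for t, e in zip(times, events)])
    return np.array([1.0 - np.sum(w * (times <= t)) / n for t in grid])

# Demo on synthetic right-censored data.
rng = np.random.default_rng(1)
t_ev, t_cn = rng.exponential(1.0, 300), rng.exponential(1.5, 300)
times, events = np.minimum(t_ev, t_cn), (t_ev <= t_cn).astype(int)
s_hat = ipoc_survival(times, events, np.linspace(0.0, 2.0, 41))
```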

3.
Expert opinion plays an important role when selecting promising clusters of chemical compounds in the drug discovery process. Indeed, experts can qualitatively assess the potential of each cluster, and with appropriate statistical methods, these qualitative assessments can be quantified into a success probability for each of them. However, one crucial element often overlooked is the procedure by which the clusters are assigned to/selected by the experts for evaluation. In the present work, the impact such a procedure may have on the statistical analysis and the entire evaluation process is studied. It has been shown that some implementations of the selection procedure may seriously compromise the validity of the evaluation even when the rating and selection processes are independent. Consequently, the fully random allocation of the clusters to the experts is strongly advocated.

4.
An optimal Bayesian decision procedure for testing hypotheses in normal linear models based on intrinsic model posterior probabilities is considered. It is proven that these posterior probabilities are simple functions of the classical F-statistic, so the evaluation of the procedure can be carried out analytically through the frequentist analysis of the posterior probability of the null. An asymptotic analysis proves that, under mild conditions on the design matrix, the procedure is consistent. For any testing hypothesis it is also seen that there is a one-to-one mapping, which we call the calibration curve, between the posterior probability of the null hypothesis and the classical p-value. This curve adds substantial knowledge about the possible discrepancies between the Bayesian and the p-value measures of evidence for testing hypotheses. It permits a better understanding of the serious difficulties encountered in interpreting p-values in linear models. A specific illustration of the variable selection problem is given.

5.
Models with large numbers of parameters (hundreds or thousands) often behave as if they depend on only a few of them, with the rest having comparatively little influence. One challenge of sensitivity analysis with such models is screening the parameters to identify the influential ones, and then characterizing their influences.

Large models often require significant computing resources to evaluate their output, and so a good screening mechanism should be efficient: it should minimize the number of times a model must be exercised. This paper describes an efficient procedure to perform sensitivity analysis on deterministic models with specified ranges or probability distributions for each parameter.

It is based on repeated exercising of the model, which can be treated as a black box. Statistical checks can ensure that the screening has identified the parameters that account for the bulk of the model variation. Subsequent sensitivity analysis can use the screening information to reduce the investment required to characterize the influence of the influential and remaining parameters.

The procedure exploits simplifications in the dependence of a model output on model inputs. It works best where a small number of parameters are much more influential than all the rest. The method is much more sensitive to the number of influential parameters than to the total number of parameters. It is most effective when linear or quadratic effects dominate higher order effects and complex interactions.

The paper presents a set of Mathematica functions that can be used to create a variety of experimental designs useful for sensitivity analysis, including simple random, Latin hypercube, and fractional factorial sampling. Each sampling method can use discretization, folding, grouping, and replication to create composite designs. These techniques have been combined in a composite approach called Iterated Fractional Factorial Design (IFFD).
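The paper's design-generation functions are written in Mathematica; purely as an illustration of the simplest design in the list, here is a Python sketch of Latin hypercube sampling on the unit hypercube (names are hypothetical):

```python
import numpy as np

def latin_hypercube(n_samples, n_params, rng=None):
    """Basic Latin hypercube design on [0, 1]^p: each parameter's range is
    split into n_samples equal strata, each stratum is sampled exactly once,
    and the strata are independently permuted across parameters."""
    rng = np.random.default_rng(rng)
    u = rng.uniform(size=(n_samples, n_params))  # jitter within each stratum
    strata = np.array([rng.permutation(n_samples)
                       for _ in range(n_params)]).T
    return (strata + u) / n_samples

design = latin_hypercube(8, 3, rng=42)   # 8 runs of a 3-parameter model
```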

The procedure is applied to a model of nuclear fuel waste disposal, and to simplified example models to demonstrate the concepts involved.

6.
This paper is concerned with a model averaging procedure for varying-coefficient partially linear models with missing responses. The profile least-squares estimation process and the inverse probability weighted method are employed to estimate the regression coefficients of the partially restricted models, in which the propensity score is estimated by the covariate balancing propensity score method. The estimators of the linear parameters are shown to be asymptotically normal. We then develop the focused information criterion, formulate the frequentist model averaging estimators and construct the corresponding confidence intervals. Simulation studies are conducted to examine the finite-sample performance of the proposed methods. We find that the covariate balancing propensity score improves the performance of the inverse probability weighted estimator. We also demonstrate the superiority of the proposed model averaging estimators over existing strategies in terms of mean squared error and coverage probability. Finally, our approach is applied to a real data example.
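To make the inverse probability weighting step concrete, here is a minimal sketch for a missing-response mean, with an ordinary logistic regression standing in for the covariate balancing propensity score the paper actually uses (all names are hypothetical; assumes scikit-learn is available):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Responses y are missing at random given covariates x; the probability of
# being observed (the propensity) is fitted, and observed cases are
# reweighted by its inverse.
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=(n, 2))
y = 1.0 + x @ np.array([0.5, -0.3]) + rng.normal(size=n)
p_obs = 1.0 / (1.0 + np.exp(-(0.5 + x[:, 0])))    # true missingness mechanism
r = rng.uniform(size=n) < p_obs                   # r = 1 if y is observed

prop = LogisticRegression().fit(x, r)             # estimated propensity score
pi_hat = prop.predict_proba(x)[:, 1]
ipw_mean = np.sum(r * y / pi_hat) / np.sum(r / pi_hat)  # Hajek-type IPW mean
print(ipw_mean, y.mean())  # IPW estimate vs. full-data benchmark
```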

7.
An identification procedure for multivariate autoregressive moving average (ARMA) echelon-form models is proposed. It is based on the study of the linear dependence between rows of the Hankel matrix of serial correlations. To that end, we define a statistical test for checking the linear dependence between vectors of serial correlations. It is shown that the test statistic \(\hat{t}_N\) considered is distributed asymptotically as a finite linear combination of independent chi-square random variables with one degree of freedom under the null hypothesis, whereas under the alternative hypothesis, \(\hat{t}_N/N\) converges in probability to a positive constant. These results allow us, in particular, to compute the asymptotic probability of making a specification error with the proposed procedure. Links to other methods based on the application of canonical analysis are discussed. A simulation experiment was done in order to study the performance of the procedure. It is seen that the graphical representation of \(\hat{t}_N\), as a function of N, can be very useful in identifying the dynamic structure of ARMA models. Furthermore, for the model considered, the proposed identification procedure performs very well for series of 100 observations or more and reasonably well with short series of 50 observations.
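The asymptotic null distribution described here, a weighted sum of independent one-degree-of-freedom chi-squares, has no simple closed form; one easy way to obtain its tail probabilities is simulation. A sketch, with hypothetical weights:

```python
import numpy as np

def mixture_chi2_tail(lams, q, n_draws=200_000, seed=0):
    """Monte Carlo estimate of P(sum_k lam_k * Z_k^2 > q), where the Z_k are
    independent standard normals, i.e. the tail of a finite linear
    combination of independent chi-square(1) variables."""
    rng = np.random.default_rng(seed)
    z2 = rng.chisquare(1, size=(n_draws, len(lams)))
    return np.mean(z2 @ np.asarray(lams) > q)

# Hypothetical weights and critical value, for illustration only.
print(mixture_chi2_tail([0.6, 0.3, 0.1], q=1.5))
```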

8.
Confidence intervals are constructed for real-valued parameter estimation in a general regression model with normal errors. When the error variance is known, these intervals are optimal (in the sense of minimizing length subject to a guaranteed probability of coverage) among all interval estimates centered at a linear estimate of the parameter. When the error variance is unknown and the regression model is an approximately linear model (a class of models permitting bounded systematic departures from an underlying ideal model), an independent estimate of the variance is found and the intervals can then be appropriately scaled.

9.
Stochastic compartmental (e.g., SIR) models have proven useful for studying epidemics of childhood diseases while taking into account the variability of the epidemic dynamics. Here, we present a method for estimating balanced simultaneous confidence sets for the mean sample path of a stochastic SIR model, thus providing a simple representation of both the typical behavior and the variability of the epidemic. The confidence sets are estimated by a bootstrap procedure, using asymptotic properties of density-dependent jump Markov processes. The method is applied to chickenpox epidemics in France, and the coverage probability of the confidence sets is estimated in that context.
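The paper's bootstrap rests on asymptotics for density-dependent jump Markov processes; as a simplified stand-in, the following sketch builds a balanced simultaneous band around the mean path from a matrix of simulated epidemic trajectories, using the empirical distribution of the studentized sup-deviation (all names are hypothetical, and the demo paths are generic placeholders rather than SIR trajectories):

```python
import numpy as np

def simultaneous_band(paths, level=0.95):
    """Balanced simultaneous confidence band for the mean sample path.
    `paths` has one simulated trajectory per row, one time point per column;
    a single critical value is applied to all time points (balance)."""
    mean = paths.mean(axis=0)
    sd = paths.std(axis=0, ddof=1)
    sd = np.where(sd > 0, sd, 1.0)                 # guard constant columns
    sup_dev = np.max(np.abs(paths - mean) / sd, axis=1)
    c = np.quantile(sup_dev, level)                # common critical value
    return mean - c * sd, mean + c * sd

# Placeholder trajectories; in practice these would be repeated stochastic
# SIR simulations.
paths = np.random.default_rng(0).normal(size=(500, 60)).cumsum(axis=1)
lo, hi = simultaneous_band(paths)
```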

10.
The Fourier amplitude sensitivity test (FAST) can be used to calculate the relative variance contribution of model input parameters to the variance of predictions made with functional models, and it is widely used in the analysis of complicated process modeling systems. This study provides an improved transformation procedure for FAST when non-uniform distributions are used to represent the input parameters. Here it is proposed that the cumulative probability be used instead of the probability density when transforming non-uniform distributions for FAST. This improvement increases the accuracy of the transformation by reducing errors, and makes the transformation more convenient to use in practice. In an evaluation of the procedure, the improved procedure was demonstrated to have very high accuracy in comparison with the procedure currently in wide use.
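The proposed cumulative-probability route amounts to mapping the classic FAST search curve, which is uniform on (0, 1), through the target distribution's inverse CDF. A sketch under that reading (the frequency choice and resampling details follow the general FAST literature, not necessarily this paper):

```python
import numpy as np
from scipy import stats

def fast_search_curve(omega, s, dist):
    """Map the classic FAST search curve onto `dist` via the cumulative
    probability: the curve below is uniform on (0, 1), and the percent
    point function (inverse CDF) carries it onto the target distribution."""
    u = 0.5 + np.arcsin(np.sin(omega * s)) / np.pi   # uniform in (0, 1)
    return dist.ppf(u)

s_grid = np.linspace(-np.pi, np.pi, 513)
x = fast_search_curve(11, s_grid, stats.lognorm(0.5))  # lognormal input factor
```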

11.
Multi-state Models: A Review
Multi-state models are models for a process, for example one describing the life history of an individual, which at any time occupies one of a few possible states. They can describe several possible events for a single individual, or the dependence between several individuals; the events are the transitions between the states. This class of models allows for an extremely flexible approach that can model almost any kind of longitudinal failure-time data. It is particularly relevant for modeling different events that have an event-related dependence, such as the occurrence of disease changing the risk of death, and it can also model paired data. It is useful for recurrent events, but has limitations. Markov models stand out as much simpler than other models from a probability point of view, which simplifies the likelihood evaluation. However, in many cases Markov models do not fit satisfactorily, and fortunately it is reasonably simple to study non-Markov models, in particular Markov extension models. This also makes it possible to consider whether the dependence is of a short-term or long-term nature. Applications include the effect of heart transplantation on mortality and the mortality among Danish twins.

12.
Event history models typically assume that the entire population is at risk of experiencing the event of interest throughout the observation period. However, there will often be individuals, referred to as long-term survivors, who may be considered a priori to have a zero hazard throughout the study period. In this paper, a discrete-time mixture model is proposed in which the probability of long-term survivorship and the timing of event occurrence are modelled jointly. Another feature of event history data that often needs to be considered is that they may come from a population with a hierarchical structure. For example, individuals may be nested within geographical regions and individuals in the same region may have similar risks of experiencing the event of interest due to unobserved regional characteristics. Thus, the discrete-time mixture model is extended to allow for clustering in the likelihood and timing of an event within regions. The model is further extended to allow for unobserved individual heterogeneity in the hazard of event occurrence. The proposed model is applied in an analysis of contraceptive sterilization in Bangladesh. The results show that a woman's religion and education level affect her probability of choosing sterilization, but not when she gets sterilized. There is also evidence of community-level variation in sterilization timing, but not in the probability of sterilization.
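A minimal sketch of the likelihood structure of such a discrete-time mixture (cure) model, with a logistic model for ever experiencing the event and a constant per-period hazard; the paper adds covariates in the timing part, region-level random effects, and individual frailty on top of this (all names are hypothetical):

```python
import numpy as np

def cure_model_loglik(params, t_obs, event, x):
    """Log-likelihood of a minimal discrete-time mixture (cure) model:
    logistic regression for the probability of ever experiencing the
    event, plus a constant per-period hazard h for those at risk."""
    *beta, logit_h = params
    pi = 1.0 / (1.0 + np.exp(-(x @ np.array(beta))))   # P(susceptible | x)
    h = 1.0 / (1.0 + np.exp(-logit_h))                 # discrete-time hazard
    surv = (1.0 - h) ** (t_obs - 1)                    # survive t-1 periods
    lik_event = pi * surv * h                          # event in period t
    lik_cens = (1.0 - pi) + pi * surv * (1.0 - h)      # still event-free at t
    return np.sum(np.where(event == 1, np.log(lik_event), np.log(lik_cens)))
```

The log-likelihood can then be maximized numerically, for example with scipy.optimize.minimize applied to its negative.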

13.
Classical methods based on Gaussian likelihood or least squares cannot identify non-invertible moving average processes, while recent non-Gaussian results are based on full-likelihood considerations. Since the error distribution is rarely known, a quasi-likelihood approach is desirable, but its consistency properties have been unknown. In this paper we study the quasi-likelihood associated with the Laplacian model, a convenient non-Gaussian model that yields a modified \(L_1\) procedure. We show that consistency holds for all standard heavy-tailed errors, but not for light-tailed errors, showing that a quasi-likelihood procedure cannot be applied blindly to estimate non-invertible models. This is an interesting contrast to the standard results for quasi-likelihood in regression models, where consistency usually holds much more generally. Similar results hold for the estimation of non-causal non-invertible ARMA processes. Various simulation studies are presented to validate the theory and to show the effect of the error distribution, and an analysis of the US unemployment series is given as an illustration.
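A sketch of the modified \(L_1\) criterion for an MA(1) model, valid in the invertible region only; the non-invertible case that motivates the paper requires a different residual construction (all names are hypothetical):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def l1_criterion(theta, x):
    """Laplacian quasi-likelihood (modified L1) criterion for an MA(1):
    residuals from the recursion e_t = x_t - theta * e_{t-1} are scored
    by their absolute values."""
    e = np.zeros(len(x))
    for t in range(len(x)):
        e[t] = x[t] - theta * (e[t - 1] if t > 0 else 0.0)
    return np.sum(np.abs(e))

rng = np.random.default_rng(0)
eps = rng.laplace(size=500)               # heavier-than-Gaussian errors
x = eps[1:] + 0.6 * eps[:-1]              # MA(1) with theta = 0.6
fit = minimize_scalar(l1_criterion, bounds=(-0.99, 0.99),
                      args=(x,), method="bounded")
print(fit.x)  # should land near 0.6
```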

14.
There are models for which the evaluation of the likelihood is infeasible in practice, and for these models the Metropolis–Hastings acceptance probability cannot be easily computed. This is the case, for instance, when only departure times from a G/G/1 queue are observed and inference on the arrival and service distributions is required. Indirect inference is a method to estimate a parameter θ in models whose likelihood function does not have an analytical closed form, but from which random samples can be drawn for fixed values of θ. First an auxiliary model is chosen whose parameter β can be directly estimated. Next, the parameters in the auxiliary model are estimated for the original data, leading to an estimate \(\hat{\beta}\). The parameter β is also estimated by using several sampled data sets, simulated from the original model for different values of the original parameter θ. Finally, the parameter θ which leads to the best match to \(\hat{\beta}\) is chosen as the indirect inference estimate. We analyse which properties an auxiliary model should have to give satisfactory indirect inference. We look at the situation where the data are summarized in a vector statistic T, and the auxiliary model is chosen so that inference on β is drawn from T only. Under appropriate assumptions the asymptotic covariance matrix of the indirect estimators is proportional to the asymptotic covariance matrix of T and componentwise inversely proportional to the square of the derivative, with respect to θ, of the expected value of T. We discuss how these results can be used in selecting good estimating functions. We apply our findings to the queuing problem.
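A toy sketch of the indirect inference loop, using an AR(1) as a stand-in simulable model and the lag-1 autocorrelation as the auxiliary parameter (both choices are hypothetical illustrations, not the paper's G/G/1 queue application):

```python
import numpy as np

def auxiliary_stat(x):
    """Auxiliary 'parameter': the lag-1 autocorrelation, chosen because it
    is easy to estimate directly from a series."""
    return np.corrcoef(x[:-1], x[1:])[0, 1]

def simulate(theta, n, rng):
    """Stand-in for a model that can be simulated but whose likelihood is
    awkward; here simply an AR(1), so the answer is known."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = theta * x[t - 1] + rng.normal()
    return x

rng = np.random.default_rng(0)
data = simulate(0.5, 2000, rng)
beta_hat = auxiliary_stat(data)           # auxiliary fit to the real data

# Indirect inference: choose the theta whose simulated auxiliary estimate
# best matches beta_hat, averaging over several simulated data sets.
grid = np.linspace(0.0, 0.9, 91)
dist = [abs(np.mean([auxiliary_stat(simulate(th, 2000, rng))
                     for _ in range(10)]) - beta_hat) for th in grid]
print(grid[int(np.argmin(dist))])         # indirect estimate of theta
```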

15.
This paper describes a nonparametric approach to inference for aggregate loss models in the insurance framework. We assume that an insurance company provides a historical sample of claims, given by claim occurrence times and claim sizes; information may be incomplete, as claims may be censored and/or truncated. In this context, the main goal of this work is to fit a probability model for the total amount that will be paid on all claims during a fixed future time period. To solve this prediction problem, we propose a new methodology based on nonparametric estimators of the density functions with censored and truncated data, the use of Monte Carlo simulation methods, and bootstrap resampling. The developed methodology is useful for comparing alternative pricing strategies in different insurance decision problems. The proposed procedure is illustrated with a real dataset provided by the insurance department of an international commercial company.
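As a sketch of the simulation step, here is a compound Poisson model of the total claim amount over one period, with a parametric severity sampler standing in for the paper's nonparametric, censoring- and truncation-aware density estimates (all names are hypothetical):

```python
import numpy as np

def aggregate_loss_sample(lam, severity_sampler, n_sims=100_000, seed=0):
    """Monte Carlo sample of the total claim amount over one period under a
    compound Poisson model: claim counts ~ Poisson(lam), claim sizes drawn
    independently from severity_sampler."""
    rng = np.random.default_rng(seed)
    counts = rng.poisson(lam, size=n_sims)
    return np.array([severity_sampler(rng, k).sum() for k in counts])

totals = aggregate_loss_sample(
    lam=30.0,
    severity_sampler=lambda rng, k: rng.lognormal(7.0, 1.2, size=k))
print(np.quantile(totals, [0.5, 0.95, 0.995]))  # pricing/reserving quantiles
```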

16.
We propose a general procedure for constructing nonparametric priors for Bayesian inference. Under very general assumptions, the proposed prior selects absolutely continuous distribution functions, hence it can be useful with continuous data. We use the notion of Feller-type approximation, with a random scheme based on the natural exponential family, in order to construct a large class of distribution functions. We show how one can assign a probability to such a class and discuss the main properties of the proposed prior, named the Feller prior. Feller priors are related to mixture models with an unknown number of components or, more generally, to mixtures with an unknown weight distribution. Two illustrations, relative to the estimation of a density and of a mixing distribution, are carried out with respect to well-known data sets in order to evaluate the performance of our procedure. Computations are performed using a modified version of an MCMC algorithm, which is briefly described.

17.
We introduce a Bayesian approach to test linear autoregressive moving-average (ARMA) models against threshold autoregressive moving-average (TARMA) models. First, the marginal posterior densities of all parameters of a TARMA model, including the threshold and delay, are obtained by using a Gibbs sampler with a Metropolis–Hastings algorithm. Second, the reversible-jump Markov chain Monte Carlo (RJMCMC) method is adopted to calculate the posterior probabilities for ARMA and TARMA models: posterior evidence in favor of TARMA models indicates threshold nonlinearity. Finally, based on the RJMCMC scheme and the Akaike information criterion (AIC) or Bayesian information criterion (BIC), a procedure for building TARMA models is developed. Simulation experiments and a real data example show that our method works well for distinguishing an ARMA from a TARMA model and for building TARMA models.

18.
Usually in latent class (LC) analysis, external predictors are taken to be cluster conditional probability predictors (LC models with external predictors) and/or score conditional probability predictors (LC regression models). In such cases, their distribution is not of interest. The class-specific distribution is of interest in the distal outcome model, where the distribution of the external variables is assumed to depend on LC membership. In this paper, we consider a more general formulation that embeds both the LC regression and the distal outcome models, as is typically done in cluster-weighted modelling. This allows us to investigate (1) whether the distribution of the external variables differs across classes, and (2) whether there are significant direct effects of the external variables on the indicators, by modelling jointly the relationship between the external and the latent variables. We show the advantages of the proposed modelling approach through a set of artificial examples, an extensive simulation study and an empirical application about psychological contracts among employees and employers in Belgium and the Netherlands.

19.
In this study, an evaluation of Bayesian hierarchical models is made based on simulation scenarios to compare single-stage and multi-stage Bayesian estimations. Simulated datasets of lung cancer disease counts for men aged 65 and older across 44 wards in the London Health Authority were analysed using a range of spatially structured random effect components. The goals of this study are to determine which of these single-stage models performs best given a certain simulating model, how the estimation methods (single- vs. multi-stage) compare in yielding posterior estimates of fixed effects in the presence of spatially structured random effects, and which of two spatial prior models, the Leroux or the ICAR model, performs best in a multi-stage context under different assumptions concerning spatial correlation. Among the fitted single-stage models without covariates, we found that when there is a low amount of variability in the distribution of disease counts, the BYM model is relatively robust to misspecification in terms of DIC, while the Leroux model is the least robust to misspecification. When these models were fit to data generated from models with covariates, we found that when there was one set of covariates, either spatially correlated or non-spatially correlated, changing the values of the fixed coefficients affected the ability of either the Leroux or ICAR model to fit the data well in terms of DIC. When there were multiple sets of spatially correlated covariates in the simulating model, however, we could not distinguish the goodness of fit to the data between these single-stage models. We found that the multi-stage modelling process via the Leroux and ICAR models generally reduced the variance of the posterior estimated fixed effects, for data generated from models with covariates and a UH term, compared to analogous single-stage models. Finally, we found that the multi-stage Leroux model compares favourably to the multi-stage ICAR model in terms of DIC. We conclude that the multi-stage Leroux model should be seriously considered in applications of Bayesian disease mapping when an investigator desires to fit a model with both fixed effects and spatially structured random effects to Poisson count data.

20.
We introduce a new multivariate GARCH model with multivariate thresholds in conditional correlations and develop a two-step estimation procedure that is feasible in large-dimensional applications. Optimal threshold functions are estimated endogenously from the data, and the model's conditional covariance matrix is ensured to be positive definite. We study the empirical performance of our model in two applications using U.S. stock and bond market data. In both applications our model has, in terms of statistical and economic significance, higher forecasting power than several other multivariate GARCH models for conditional correlations.

