Similar Documents
20 similar records retrieved (search time: 828 ms)
1.
The methods of moments and probability-weighted moments are the most commonly used methods for estimating the parameters of the generalized Pareto distribution and generalized extreme-value distributions. These methods, however, frequently lead to nonfeasible estimates in the sense that the supports inferred from the estimates fail to contain all observations. In this paper, we propose a hybrid estimator which is derived by incorporating a simple auxiliary constraint on feasibility into the estimates. The hybrid estimator is very easy to use, always feasible, and also has smaller bias and mean square error in many cases. Its advantages are further illustrated through the analyses of two real data sets.
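To make the feasibility issue concrete, here is a minimal Python sketch (not the authors' hybrid estimator) that computes method-of-moments estimates for a zero-threshold generalized Pareto distribution and checks whether the implied support contains all observations; the repair step shown is a simple placeholder rule assumed for illustration only.

```python
import numpy as np

def gpd_mom(x):
    """Method-of-moments estimates (shape xi, scale sigma) of a generalized Pareto
    distribution with threshold 0 (valid when xi < 1/2)."""
    xbar, s2 = x.mean(), x.var(ddof=1)
    xi = 0.5 * (1.0 - xbar**2 / s2)
    sigma = xbar * (1.0 - xi)
    return xi, sigma

def is_feasible(x, xi, sigma):
    """For xi < 0 the support is [0, -sigma/xi]; every observation must lie inside it."""
    return xi >= 0 or x.max() <= -sigma / xi

rng = np.random.default_rng(1)
xi0, sigma0 = -0.4, 1.0                      # short-tailed case with a bounded support
u = rng.uniform(size=200)
x = sigma0 / xi0 * (u**(-xi0) - 1.0)         # inverse-CDF sampling from the GPD

xi_hat, sigma_hat = gpd_mom(x)
if not is_feasible(x, xi_hat, sigma_hat):
    # Illustrative repair only (not the paper's hybrid rule): rescale sigma so the
    # implied upper endpoint -sigma/xi equals the sample maximum.
    sigma_hat = -xi_hat * x.max()
print(xi_hat, sigma_hat, is_feasible(x, xi_hat, sigma_hat))
```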

2.
A full likelihood method is proposed to analyse continuous longitudinal data with non-ignorable (informative) missing values and non-monotone patterns. The problem arose in a breast cancer clinical trial where repeated assessments of quality of life were collected: patients rated their coping ability during and after treatment. We allow the missingness probabilities to depend on unobserved responses, and we use a multivariate normal model for the outcomes. A first-order Markov dependence structure for the responses is a natural choice and facilitates the construction of the likelihood; estimates are obtained via the Nelder–Mead simplex algorithm. Computations are difficult and become intractable with more than three or four assessments. Applying the method to the quality-of-life data results in easily interpretable estimates, confirms the suspicion that the data are non-ignorably missing and highlights the likely bias of standard methods. Although treatment comparisons are not affected here, the methods are useful for obtaining unbiased means and estimating trends over time.

3.
Models for which the likelihood function can be evaluated only up to a parameter-dependent unknown normalizing constant, such as Markov random field models, are used widely in computer science, statistical physics, spatial statistics, and network analysis. However, Bayesian analysis of these models using standard Monte Carlo methods is not possible due to the intractability of their likelihood functions. Several methods that permit exact, or close to exact, simulation from the posterior distribution have recently been developed. However, estimating the evidence and Bayes factors for these models remains challenging in general. This paper describes new random weight importance sampling and sequential Monte Carlo methods for estimating Bayes factors that use simulation to circumvent the evaluation of the intractable likelihood, and compares them to existing methods. In some cases we observe an advantage in the use of biased weight estimates. An initial investigation into the theoretical and empirical properties of this class of methods is presented. Some support for the use of biased estimates is presented, but we advocate caution in the use of such estimates.

4.
Based on the 2008 SNA, this paper describes the accounting treatment, basic classification, and basic methods for capitalizing R&D expenditure within the GDP accounting framework. From the perspectives of the production, income, and expenditure approaches to GDP, it systematically examines how capitalizing different types of R&D activity affects GDP and the main related variables, and discusses in detail the distinctions between own-account production and production for sale, and between market and non-market producers. The results show that the capitalization treatment differs across types of R&D activity, and so does its effect on GDP. Overall, after R&D expenditure is capitalized: under the production approach, total output and value added increase while intermediate consumption decreases; under the income approach, compensation of employees and net taxes on production remain unchanged while consumption of fixed capital and operating surplus increase; under the expenditure approach, gross capital formation increases, government consumption decreases, and household consumption and net exports remain unchanged. The paper also discusses the recording of R&D exports and imports in an open economy.
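A stylized numerical illustration (hypothetical figures, not taken from the paper) of the directional effects described above, for a market producer that performs own-account R&D worth 60 and purchases R&D services worth 40:

```python
# Hypothetical figures for one market producer; currency units are arbitrary.
own_account_rd, purchased_rd = 60, 40
before = {"output": 1000, "intermediate_consumption": 400}   # purchased R&D sits in IC

after = {
    # Own-account R&D is now recognized as output of a fixed asset produced for own use.
    "output": before["output"] + own_account_rd,
    # Purchased R&D moves from intermediate consumption to gross fixed capital formation.
    "intermediate_consumption": before["intermediate_consumption"] - purchased_rd,
}

va_before = before["output"] - before["intermediate_consumption"]   # 600
va_after = after["output"] - after["intermediate_consumption"]      # 700

# Production approach: output up, intermediate consumption down, value added up by 100.
assert va_after - va_before == own_account_rd + purchased_rd
# Income approach: compensation of employees and net production taxes unchanged; the
# extra value added is split between consumption of fixed capital and operating surplus.
# Expenditure approach: gross fixed capital formation rises by 100.
print(va_before, va_after)
```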

5.
The data that are used in constructing empirical Bayes estimates can properly be regarded as arising in a two-stage sampling scheme. In this setting it is possible to modify the conventional parameter estimates so that a reduction in expected squared error is effected. In the empirical Bayes approach this is done through the use of Bayes's theorem. The alternative approach proposed in this paper specifies a class of modified estimates and then seeks to identify that member of the class which yields the minimum squared error. One advantage of this approach relative to the empirical Bayes approach is that certain problems involving multiple parameters are easily overcome. Further, it permits the use of relatively efficient methods of non-parametric estimation, such as those based on quantiles or ranks; this has not been achieved by empirical Bayes methods.

6.
We compare minimum Hellinger distance and minimum Hellinger disparity estimates for U-shaped beta distributions. Given suitable density estimates, both methods are known to be asymptotically efficient when the data come from the assumed model family, and robust to small perturbations from the model family. Most implementations use kernel density estimates, which may not be appropriate for U-shaped distributions. We compare fixed-binwidth histograms, percentile mesh histograms, and averaged shifted histograms. Minimum disparity estimates are less sensitive to the choice of density estimate than are minimum distance estimates, and the percentile mesh histogram gives the best results for both minimum distance and minimum disparity estimates. Minimum distance estimates are biased and a bias-corrected method is proposed. Minimum disparity estimates and bias-corrected minimum distance estimates are comparable to maximum likelihood estimates when the model holds, and give better results than either method of moments or maximum likelihood when the data are discretized or contaminated. Although our results are for the beta density, the implementations are easily modified for other U-shaped distributions such as the Dirichlet or normal-generated distribution.
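As a rough illustration of the distance criterion (a sketch only, using a fixed-binwidth histogram rather than the percentile-mesh estimate favoured in the paper), the following fits a Beta(a, b) model by minimizing the Hellinger distance to a histogram density estimate:

```python
import numpy as np
from scipy import stats, optimize

def min_hellinger_beta(x, bins=20):
    """Minimum Hellinger distance fit of Beta(a, b) to data in (0, 1), using a
    fixed-binwidth histogram as the nonparametric density estimate."""
    g, edges = np.histogram(x, bins=bins, range=(0.0, 1.0), density=True)
    mids = 0.5 * (edges[:-1] + edges[1:])
    width = edges[1] - edges[0]

    def neg_affinity(log_ab):
        a, b = np.exp(log_ab)                       # keep the parameters positive
        f = stats.beta.pdf(mids, a, b)
        # Squared Hellinger distance = 2 - 2 * affinity, so minimize minus the affinity.
        return -np.sum(np.sqrt(f * g)) * width

    res = optimize.minimize(neg_affinity, x0=np.log([0.5, 0.5]), method="Nelder-Mead")
    return np.exp(res.x)

rng = np.random.default_rng(0)
x = rng.beta(0.4, 0.6, size=500)                    # U-shaped beta sample
print(min_hellinger_beta(x))
```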

7.
Statistical agencies make changes to the data collection methodology of their surveys to improve the quality of the data collected or to improve the efficiency with which they are collected. For reasons of cost it may not be possible to estimate the effect of such a change on survey estimates or response rates reliably without conducting an experiment embedded in the survey, in which some respondents are enumerated by the new method and some by the existing method. Embedded experiments are often designed for repeated and overlapping surveys; however, previous methods use sample data from only one occasion. The paper focuses on estimating the effect of a methodological change on estimates in the case of repeated surveys with overlapping samples from several occasions. Efficient design of an embedded experiment that covers more than one time point is also discussed. All inference is unbiased over an assumed measurement model, the experimental design and the complex sample design. Other benefits of the proposed approach include the following: it exploits the correlation between the samples on each occasion to improve estimates of treatment effects; treatment effects are allowed to vary over time; it is robust against incorrectly rejecting the null hypothesis of no treatment effect; and it allows a wide set of alternative experimental designs. This paper applies the proposed methodology to the Australian Labour Force Survey to measure the effect of replacing pen-and-paper interviewing with computer-assisted interviewing. This application considered alternative experimental designs in terms of their statistical efficiency and their risks to maintaining a consistent series. The proposed approach is significantly more efficient than using only one month of sample data in estimation.

8.
In a 2 × 2 contingency table, when the sample size is small, there may be a number of cells that contain few or no observations, usually referred to as sparse data. In such cases, a common recommendation in conventional frequentist methods is to add a small constant to every cell of the observed table before estimating the unknown parameters. However, this approach is based on asymptotic properties of the estimates and may work poorly for small samples. An alternative is to use Bayesian methods, which provide better insight into the problem of sparse data coupled with few centers, for which the analysis would otherwise be difficult to carry out. In this article, a hierarchical Bayesian model is applied to multicenter data on the effect of a surgical treatment with standard foot care among leprosy patients with posterior tibial nerve damage, summarized as seven 2 × 2 tables. Markov chain Monte Carlo (MCMC) techniques are applied to estimate the parameters of interest under this sparse data setup.
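The sketch below is a minimal random-walk Metropolis sampler for one plausible hierarchical model of several sparse 2 × 2 tables (centre-specific log-odds with exchangeable log-odds ratios); the tables, priors, and proposal scale are all hypothetical and need not match the paper's specification.

```python
import numpy as np
from scipy.special import expit
from scipy.stats import binom, norm

# Hypothetical sparse tables: (events_trt, n_trt, events_ctl, n_ctl) for each centre.
tables = np.array([[1, 8, 0, 7], [0, 6, 2, 9], [3, 10, 1, 11], [0, 9, 0, 8],
                   [2, 7, 1, 10], [1, 12, 3, 9], [0, 10, 1, 7]])
K = len(tables)

def log_post(par):
    """Centre log-odds alpha_k; log-odds ratios theta_k ~ N(mu, tau^2); vague priors."""
    mu, log_tau = par[0], par[1]
    alpha, theta = par[2:2 + K], par[2 + K:]
    tau = np.exp(log_tau)
    yt, nt, yc, nc = tables.T
    lp = binom.logpmf(yc, nc, expit(alpha)).sum()            # control arms
    lp += binom.logpmf(yt, nt, expit(alpha + theta)).sum()   # treatment arms
    lp += norm.logpdf(theta, mu, tau).sum()                  # exchangeable centre effects
    lp += norm.logpdf(alpha, 0, 10).sum() + norm.logpdf(mu, 0, 10) + norm.logpdf(log_tau, 0, 1)
    return lp

rng = np.random.default_rng(42)
par = np.zeros(2 + 2 * K)
lp_cur, draws = log_post(par), []
for it in range(20000):                                      # random-walk Metropolis
    prop = par + rng.normal(scale=0.15, size=par.size)
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp_cur:
        par, lp_cur = prop, lp_prop
    if it >= 5000 and it % 10 == 0:
        draws.append(par.copy())

draws = np.array(draws)
print("posterior mean of the overall log-odds ratio mu:", draws[:, 0].mean())
```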

9.
Recurrent events in clinical trials have typically been analysed using either a multiple time-to-event method or a direct approach based on the distribution of the number of events. An area of application for these methods is exacerbation data from respiratory clinical trials. The different approaches to the analysis and the issues involved are illustrated for a large trial (n = 1465) in chronic obstructive pulmonary disease (COPD). For exacerbation rates, clinical interest centres on a direct comparison of rates for each treatment, which favours the distribution-based analysis rather than a time-to-event approach. Poisson regression has often been employed and has recently been recommended as the appropriate method of analysis for COPD exacerbations, but its key assumptions often appear unreasonable for this analysis. By contrast, use of a negative binomial model, which corresponds to assuming a separate Poisson parameter for each subject, offers a more appealing approach. Non-parametric methods avoid some of the assumptions required by these models, but do not provide appropriate estimates of treatment effects because of the discrete and bounded nature of the data.
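The appeal of the negative binomial model can be seen in a small simulation: give each subject its own Poisson rate drawn from a gamma distribution and the resulting counts are overdispersed in exactly the negative binomial way (all numbers below are hypothetical).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1465                                    # trial size quoted in the abstract
mean_rate, shape = 1.2, 0.8                 # hypothetical exacerbation rate and gamma shape

# A separate Poisson rate per subject (gamma frailty) yields negative binomial counts.
rates = rng.gamma(shape, mean_rate / shape, size=n)
counts = rng.poisson(rates)

m, v = counts.mean(), counts.var(ddof=1)
# Variance exceeds the mean: overdispersion that a single shared Poisson rate cannot capture.
print(f"mean = {m:.2f}, variance = {v:.2f}")

# Moment-based negative binomial dispersion, from Var = mu + mu^2 / k.
k_hat = m**2 / (v - m) if v > m else float("inf")
print(f"estimated dispersion k = {k_hat:.2f} (gamma shape used to simulate: {shape})")
```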

10.
Pragmatic trials offer practical means of obtaining real-world evidence to help improve decision-making in comparative effectiveness settings. Unfortunately, incomplete adherence is a common problem in pragmatic trials. The methods commonly used in randomized controlled trials often cannot handle the added complexity imposed by incomplete adherence, resulting in biased estimates. Several naive methods and advanced causal inference methods (e.g., inverse probability weighting and instrumental variable-based approaches) have been used in the literature to deal with incomplete adherence. Practitioners and applied researchers are often confused about which method to consider under a given setting. This work reviews commonly used statistical methods for dealing with non-adherence, along with their key assumptions, advantages, and limitations, with a particular focus on pragmatic trials. We have listed the applicable settings for these methods and provided a summary of available software. All methods were applied to two hypothetical datasets to demonstrate how they perform in a given scenario, along with the accompanying R code. The key considerations include the type of intervention strategy (point treatment settings, where treatment is administered only once, versus sustained treatment settings, where treatment must be continued over time) and the availability of data (e.g., the extent of measured or unmeasured covariates associated with adherence, dependent confounding impacted by past treatment, and potential violation of assumptions). This study will guide practitioners and applied researchers in using the appropriate statistical method to address incomplete adherence in pragmatic trial settings for both point and sustained treatment strategies.
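As a small illustration of one of the methods discussed, the sketch below applies inverse probability weighting in a point-treatment setting with simulated data (the covariates, effect sizes, and model are hypothetical; the paper's own worked examples are in R).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=(n, 2))                        # measured covariates driving adherence
assigned = rng.integers(0, 2, size=n)              # randomized assignment
p_take = 1 / (1 + np.exp(-(0.5 * assigned + x[:, 0] - 0.5 * x[:, 1])))
treated = rng.binomial(1, p_take)                  # treatment actually received
y = 1.0 * treated + x[:, 0] + 0.5 * x[:, 1] + rng.normal(size=n)   # true effect = 1

# Naive as-treated comparison is confounded by the covariates that drive adherence.
naive = y[treated == 1].mean() - y[treated == 0].mean()

# Inverse probability weighting: model P(treated | x) and weight by 1/p or 1/(1-p).
ps = LogisticRegression().fit(x, treated).predict_proba(x)[:, 1]
w = np.where(treated == 1, 1 / ps, 1 / (1 - ps))
ipw = (np.average(y[treated == 1], weights=w[treated == 1])
       - np.average(y[treated == 0], weights=w[treated == 0]))
print(f"naive as-treated: {naive:.2f}, IPW-adjusted: {ipw:.2f} (true effect 1.0)")
```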

11.
Monte Carlo simulation methods are increasingly being used to evaluate the properties of statistical estimators in a variety of settings. The utility of these methods depends upon the existence of an appropriate data-generating process. Observational studies are increasingly being used to estimate the effects of exposures and interventions on outcomes. Conventional regression models allow for the estimation of conditional or adjusted estimates of treatment effects. There is an increasing interest in statistical methods for estimating marginal or average treatment effects. However, in many settings, conditional treatment effects can differ from marginal treatment effects. Therefore, existing data-generating processes for conditional treatment effects are of little use in assessing the performance of methods for estimating marginal treatment effects. In the current study, we describe and evaluate the performance of two different data-generating processes for generating data with a specified marginal odds ratio. The first process is based upon computing Taylor series expansions of the probabilities of success for treated and untreated subjects. The expansions are then integrated over the distribution of the random variables to determine the marginal probabilities of success for treated and untreated subjects. The second process is based upon an iterative process of evaluating marginal odds ratios using Monte Carlo integration. The second method was found to be computationally simpler and to have superior performance compared to the first method.
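A minimal sketch of the second (iterative Monte Carlo) process, under an assumed logistic conditional model with hypothetical coefficients: solve for the conditional treatment log-odds increment whose implied marginal odds ratio, evaluated by Monte Carlo integration over the covariates, matches the target.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import expit

rng = np.random.default_rng(0)
target_or = 2.0
b0, b = -1.0, np.array([0.7, -0.4, 1.1])            # hypothetical conditional model
X = rng.normal(size=(200_000, 3))                    # Monte Carlo sample of covariates

def marginal_log_or(a):
    """Marginal log-odds ratio implied by conditional increment a, via MC integration."""
    p1 = expit(b0 + X @ b + a).mean()                # marginal P(Y = 1 | treated)
    p0 = expit(b0 + X @ b).mean()                    # marginal P(Y = 1 | untreated)
    return np.log(p1 / (1 - p1)) - np.log(p0 / (1 - p0))

# Find the conditional log-odds ratio that delivers the target marginal odds ratio.
a_star = brentq(lambda a: marginal_log_or(a) - np.log(target_or), 0.0, 5.0)
print(f"conditional log-OR {a_star:.3f} gives marginal OR "
      f"{np.exp(marginal_log_or(a_star)):.3f} (target {target_or})")

# Data with the specified marginal odds ratio can now be drawn from the conditional model.
treat = rng.integers(0, 2, size=len(X))
y = rng.binomial(1, expit(b0 + X @ b + a_star * treat))
```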

12.
We apply some log-linear modelling methods, which have been proposed for treating non-ignorable non-response, to some data on voting intention from the British General Election Survey. We find that, although some non-ignorable non-response models fit the data very well, they may generate implausible point estimates and predictions. Some explanation is provided for the extreme behaviour of the maximum likelihood estimates for the most parsimonious model. We conclude that point estimates for such models must be treated with great caution. To allow for the uncertainty about the non-response mechanism we explore the use of profile likelihood inference and find the likelihood surfaces to be very flat and the interval estimates to be very wide. To reduce the width of these intervals we propose constraining confidence regions to values where the parameters governing the non-response mechanism are plausible and study the effect of such constraints on inference. We find that the widths of these intervals are reduced but remain wide.

13.
Recently developed genotype imputation methods are a powerful tool for detecting untyped genetic variants that affect disease susceptibility in genetic association studies. However, existing imputation methods require individual-level genotype data, whereas in practice it is often the case that only summary data are available. For example, this may occur because, for reasons of privacy or politics, only summary data are made available to the research community at large; or because only summary data are collected, as in DNA pooling experiments. In this article, we introduce a new statistical method that can accurately infer the frequencies of untyped genetic variants in these settings, and indeed substantially improve frequency estimates at typed variants in pooling experiments where observations are noisy. Our approach, which predicts each allele frequency using a linear combination of observed frequencies, is statistically straightforward and related to a long history of the use of linear methods for estimating missing values (e.g., kriging). The main statistical novelty is our approach to regularizing the covariance matrix estimates, and the resulting linear predictors, which is based on methods from population genetics. We find that, besides being both fast and flexible (allowing new problems to be tackled that cannot be handled by existing imputation approaches purpose-built for the genetic context), these linear methods are also very accurate. Indeed, imputation accuracy using this approach is similar to that obtained by state-of-the-art imputation methods that use individual-level data, but at a fraction of the computational cost.
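The linear predictor has the familiar conditional-expectation (kriging/BLUP) form. The sketch below uses a simple shrink-toward-the-diagonal regularization of the panel covariance as a stand-in for the population-genetics regularization described in the article; the toy panel and tuning constant are hypothetical.

```python
import numpy as np

def impute_untyped_freqs(H_ref, typed_idx, f_typed_obs, lam=0.1):
    """Predict allele frequencies at untyped SNPs from observed frequencies at typed SNPs,
    using a linear (kriging-like) predictor built from a reference haplotype panel
    H_ref (haplotypes x SNPs, entries 0/1).  `lam` shrinks the covariance toward its
    diagonal as a simple placeholder for the paper's regularization."""
    f_ref = H_ref.mean(axis=0)                          # panel allele frequencies
    S = np.cov(H_ref, rowvar=False)                     # SNP-by-SNP covariance
    S = (1 - lam) * S + lam * np.diag(np.diag(S))       # dampen off-diagonal entries
    untyped_idx = np.setdiff1d(np.arange(H_ref.shape[1]), typed_idx)

    S_tt = S[np.ix_(typed_idx, typed_idx)]
    S_ut = S[np.ix_(untyped_idx, typed_idx)]
    adj = S_ut @ np.linalg.solve(S_tt, f_typed_obs - f_ref[typed_idx])
    return untyped_idx, f_ref[untyped_idx] + adj        # conditional-mean prediction

# Toy reference panel (no real linkage disequilibrium; purely to show the mechanics).
rng = np.random.default_rng(3)
H_ref = (rng.uniform(size=(100, 50)) < 0.3).astype(float)
typed = np.arange(0, 50, 2)                             # every other SNP is typed
obs = H_ref[:40].mean(axis=0)[typed]                    # "study" frequencies at typed SNPs
print(impute_untyped_freqs(H_ref, typed, obs)[1][:5])
```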

14.
Clinical studies aimed at identifying effective treatments to reduce the risk of disease or death often require long-term follow-up of participants in order to observe a sufficient number of events to precisely estimate the treatment effect. In such studies, observing the outcome of interest during follow-up may be difficult, and high rates of censoring may be observed, which often leads to reduced power when applying straightforward statistical methods developed for time-to-event data. Alternative methods have been proposed to take advantage of auxiliary information that may potentially improve efficiency when estimating marginal survival and improve power when testing for a treatment effect. Recently, Parast et al. (J Am Stat Assoc 109(505):384–394, 2014) proposed a landmark estimation procedure for the estimation of survival and treatment effects in a randomized clinical trial setting and demonstrated that significant gains in efficiency and power could be obtained by incorporating intermediate event information as well as baseline covariates. However, the procedure requires the assumption that the potential outcomes for each individual under treatment and control are independent of treatment group assignment, which is unlikely to hold in an observational study setting. In this paper we develop the landmark estimation procedure for use in an observational setting. In particular, we incorporate inverse probability of treatment weights (IPTW) in the landmark estimation procedure to account for selection bias on observed baseline (pretreatment) covariates. We demonstrate that consistent estimates of survival and treatment effects can be obtained by using IPTW and that efficiency is improved by using auxiliary intermediate event and baseline information. We compare our proposed estimates to those obtained using the Kaplan–Meier estimator, the original landmark estimation procedure, and the IPTW Kaplan–Meier estimator. We illustrate our resulting reduction in bias and gains in efficiency through a simulation study and apply our procedure to an AIDS dataset to examine the effect of previous antiretroviral therapy on survival.
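For contrast with the landmark procedure, here is a minimal sketch of the IPTW Kaplan–Meier comparator mentioned above, with a hand-rolled weighted Kaplan–Meier estimator and simulated data (covariates, effect sizes, and censoring are all hypothetical).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def weighted_km(time, event, w):
    """Weighted Kaplan-Meier: survival estimates at the distinct observed event times."""
    t_ev = np.unique(time[event == 1])
    S, out = 1.0, []
    for t in t_ev:
        at_risk = w[time >= t].sum()
        deaths = w[(time == t) & (event == 1)].sum()
        S *= 1.0 - deaths / at_risk
        out.append(S)
    return t_ev, np.array(out)

# Simulated observational data: baseline covariates affect both treatment and survival.
rng = np.random.default_rng(7)
n = 4000
x = rng.normal(size=(n, 2))
a = rng.binomial(1, 1 / (1 + np.exp(-(x[:, 0] - 0.5 * x[:, 1]))))
t_true = rng.exponential(np.exp(0.5 * a - 0.7 * x[:, 0]))      # treatment prolongs survival
c = rng.exponential(3.0, size=n)                               # independent censoring
time, event = np.minimum(t_true, c), (t_true <= c).astype(int)

# IPTW weights from a propensity model on the measured baseline (pretreatment) covariates.
ps = LogisticRegression().fit(x, a).predict_proba(x)[:, 1]
w = np.where(a == 1, 1 / ps, 1 / (1 - ps))

for arm in (0, 1):
    m = a == arm
    t_ev, S = weighted_km(time[m], event[m], w[m])
    print(f"arm {arm}: IPTW-adjusted S(t = 1) is roughly {S[t_ev <= 1.0][-1]:.3f}")
```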

15.
Sequential methods for choosing the better of two Bernoulli populations are discussed using a Bayesian framework and when the maximum number of observations is fixed. Performance characteristics of the designs are obtained by using Monte Carlo simulation. Several sampling rules are considered, together with a stopping rule due to Bechhofer and Kulkarni (1982) and some modifications which use posterior estimates of the unknown probabilities.

16.
This paper uses the decomposition framework from the economics literature to examine the statistical structure of treatment effects estimated with observational data compared to those estimated from randomized studies. It begins with the estimation of treatment effects using a dummy variable in regression models and then presents the decomposition method from economics, which estimates separate regression models for the comparison groups and recovers the treatment effect using bootstrapping methods. This method shows that the overall treatment effect is a weighted average of the structural relationships of patient features with outcomes within each treatment arm and of differences in the distributions of these features across the arms. In large randomized trials, it is assumed that the distribution of features across arms is very similar. Importantly, randomization balances not only observed features but also unobserved ones. Applying high-dimensional balancing methods such as propensity score matching to the observational data causes the distributional terms of the decomposition model to be eliminated, but unobserved features may still not be balanced in the observational data. Finally, a correction for non-random selection into the treatment groups is introduced via a switching regime model. Theoretically, the treatment effect estimates obtained from this model should be the same as those from a randomized trial. However, there are significant challenges in identifying instrumental variables that are necessary for estimating such models. At a minimum, decomposition models are useful tools for understanding the relationship between treatment effects estimated from observational versus randomized data.
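The core identity is easy to show with two regressions: the mean outcome gap splits into a structural (coefficient) part and a covariate-composition part. The sketch below uses simulated data with hypothetical coefficients and omits the bootstrap step described in the paper.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 3000
# Hypothetical observational data: covariate distributions differ between the two groups.
x1 = np.column_stack([np.ones(n), rng.normal(1.0, 1.0, n)])    # treated-group design
x0 = np.column_stack([np.ones(n), rng.normal(0.3, 1.0, n)])    # control-group design
y1 = x1 @ np.array([2.0, 1.5]) + rng.normal(size=n)            # treated outcome model
y0 = x0 @ np.array([1.0, 1.2]) + rng.normal(size=n)            # control outcome model

b1, *_ = np.linalg.lstsq(x1, y1, rcond=None)                   # separate regressions per arm
b0, *_ = np.linalg.lstsq(x0, y0, rcond=None)
xbar1, xbar0 = x1.mean(axis=0), x0.mean(axis=0)

gap = y1.mean() - y0.mean()
structural = xbar1 @ (b1 - b0)          # differences in structural relationships ("effect")
composition = (xbar1 - xbar0) @ b0      # differences in the distribution of features
print(f"gap {gap:.2f} = structural {structural:.2f} + composition {composition:.2f}")
```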

17.
We first consider a wildlife-related capture-recapture problem with a known total number of fish. Our objective is to estimate the number of healthy fish, H. We begin by using a rod that attracts only healthy fish, which are tagged and returned to the water. Later, we use a net that scoops up both sick and healthy fish. Three assumptions regarding the probability of being caught by the net, conditional on health status and being caught by the rod, lead to three different estimates of H. We give approximations to the expected values of the three estimates and a condition under which they bracket H.

A potential application of these methods is to the follow-up of World Trade Center responders. Responders are disease-free when they arrive at the clean-up site and are asked to report for a visit after a fixed period of time, but some fail to do so. Some responders, whether they come to the scheduled return visit or not, spontaneously report a disease before the scheduled visit, but absence of disease is never reported in this manner. We use the methods developed to estimate the total number of subjects with disease by the time of the scheduled return visit.
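The abstract does not spell out the three assumptions, but the flavour of the calculation can be seen from the classical Lincoln-Petersen estimate, computed here with hypothetical counts under the single assumption that, among healthy fish, being caught by the net is independent of having been caught by the rod.

```python
# Hypothetical counts (illustration only).
tagged = 120        # healthy fish caught and tagged by the rod
net_healthy = 300   # healthy fish in the net catch
net_tagged = 45     # net-caught healthy fish that carry a tag

# If net capture is independent of rod capture among healthy fish,
# tagged / H is approximately net_tagged / net_healthy, so:
H_hat = tagged * net_healthy / net_tagged
print(H_hat)        # 800.0
```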

18.
In this paper, the estimation of parameters for a three-parameter Weibull distribution based on progressively Type-II right-censored samples is studied. Different estimation procedures for complete samples are generalized to the case of progressively censored data. These methods include the maximum likelihood estimators (MLEs), corrected MLEs, weighted MLEs, maximum product of spacings estimators and least squares estimators. We also propose the use of a censored estimation method with one-step bias correction to obtain reliable initial estimates for iterative procedures. These methods are compared via a Monte Carlo simulation study in terms of their biases, root mean squared errors and their rates of obtaining reliable estimates. Recommendations are made from the simulation results and a numerical example is presented to illustrate all of the methods of inference developed here.
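A minimal sketch of the maximum likelihood piece: under progressive Type-II right censoring, each observed failure contributes its log density and each batch of R_i withdrawn units contributes R_i times the log survival function at that failure time. The censoring scheme, starting values, and true parameters below are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(par, x, R):
    """Negative log-likelihood of a three-parameter Weibull (location g, scale s, shape b)
    under progressive Type-II right censoring: R[i] survivors withdrawn at failure x[i]."""
    g, log_s, log_b = par
    s, b = np.exp(log_s), np.exp(log_b)
    if g >= x.min():
        return np.inf
    z = (x - g) / s
    logf = np.log(b / s) + (b - 1) * np.log(z) - z**b    # log density at observed failures
    logS = -(z**b)                                       # log survival at withdrawal times
    return -(logf + R * logS).sum()

# Simulate a progressively censored sample under a simple hypothetical scheme.
rng = np.random.default_rng(11)
n, m = 50, 30
R = np.zeros(m, dtype=int); R[-1] = n - m                # withdraw all survivors at the end
g0, s0, b0 = 2.0, 1.5, 1.8
alive = list(np.sort(g0 + s0 * rng.weibull(b0, size=n)))
x = []
for Ri in R:
    x.append(alive.pop(0))                               # next failure among survivors
    for _ in range(Ri):
        alive.pop(rng.integers(len(alive)))              # random withdrawals
x = np.array(x)

start = np.array([x.min() - 0.5, 0.0, 0.0])              # crude starting values
fit = minimize(neg_loglik, start, args=(x, R), method="Nelder-Mead")
g_hat = fit.x[0]
s_hat, b_hat = np.exp(fit.x[1:])
print(g_hat, s_hat, b_hat)
```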

19.
The analysis of non-Gaussian time series by using state space models is considered from both classical and Bayesian perspectives. The treatment in both cases is based on simulation using importance sampling and antithetic variables; Markov chain Monte Carlo methods are not employed. Non-Gaussian disturbances for the state equation as well as for the observation equation are considered. Methods for estimating conditional and posterior means of functions of the state vector given the observations, and the mean-square errors of their estimates, are developed. These methods are extended to cover the estimation of conditional and posterior densities and distribution functions. The choice of importance sampling densities and antithetic variables is discussed. The techniques work well in practice and are computationally efficient. Their use is illustrated by applying them to a univariate discrete time series, a series with outliers and a volatility series.

20.
As part of the EUREDIT project, new methods to detect multivariate outliers in incomplete survey data have been developed. These methods are the first to work with sampling weights and to be able to cope with missing values. Two of these methods are presented here. The epidemic algorithm simulates the propagation of a disease through a population and uses extreme infection times to find outlying observations. Transformed rank correlations are robust estimates of the centre and the scatter of the data. They use a geometric transformation that is based on the rank correlation matrix. The estimates are used to define a Mahalanobis distance that reveals outliers. The two methods are applied to a small data set and to one of the evaluation data sets of the EUREDIT project.
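The sketch below captures the spirit of the second method in simplified form: a robust scatter matrix is built from Spearman rank correlations and MAD scales and used to compute Mahalanobis distances, flagging points beyond a chi-squared cutoff. Unlike the transformed rank correlations method itself, this sketch ignores sampling weights and missing values.

```python
import numpy as np
from scipy import stats

def rank_corr_outliers(X, alpha=0.99):
    """Flag multivariate outliers via Mahalanobis distances based on a robust scatter
    matrix built from Spearman rank correlations and MAD scales (no weights, no NAs)."""
    p = X.shape[1]
    center = np.median(X, axis=0)
    scale = stats.median_abs_deviation(X, axis=0, scale="normal")
    rs = stats.spearmanr(X).correlation                  # rank correlation matrix
    r = 2.0 * np.sin(np.pi * rs / 6.0)                   # consistent for the normal model
    cov = r * np.outer(scale, scale)                     # robust covariance estimate
    z = X - center
    d2 = np.einsum("ij,jk,ik->i", z, np.linalg.inv(cov), z)   # squared Mahalanobis distances
    return d2 > stats.chi2.ppf(alpha, df=p)

# Toy example: a few gross outliers planted in otherwise well-behaved correlated data.
rng = np.random.default_rng(2)
X = rng.multivariate_normal([0, 0, 0], [[1, .8, .5], [.8, 1, .3], [.5, .3, 1]], size=500)
X[:5] += 8.0
print(rank_corr_outliers(X)[:10])
```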

