Similar Documents
20 similar documents found (search time: 206 ms)
1.
Abstract. We introduce a flexible spatial point process model for spatial point patterns exhibiting linear structures, without incorporating a latent line process. The model is given by an underlying sequential point process model. Under this model, the points can be of one of three types: a ‘background point’, an ‘independent cluster point’ or a ‘dependent cluster point’. The background and independent cluster points are thought to exhibit ‘complete spatial randomness’, whereas the dependent cluster points are likely to occur close to previous cluster points. We demonstrate the flexibility of the model for producing point patterns with linear structures and propose to use the model as the likelihood in a Bayesian setting when analysing a spatial point pattern exhibiting linear structures. We illustrate this methodology by analysing two spatial point pattern datasets (locations of Bronze Age graves in Denmark and locations of mountain tops in Spain).

2.
An Analysis of Research Hotspots in American Statistics Education   Cited: 1 (self-citations: 0, other citations: 1)
Liu Ruiying. Statistical Education, 2010, (11): 45-48, 15
This paper presents a statistical analysis of the academic articles published over the past decade (2000-2009) in the American Journal of Statistics Education. The results show that three areas account for a relatively large share of recent American statistics education research: ‘theoretical explorations in statistics education’, ‘applications of new tools in statistics education’ and ‘applications of statistics in everyday life’, which together cover 81.9% of all the articles. The paper classifies and analyses these three hot topics in order to identify the focal points and trends of American statistics education research and to provide reference and guidance for statistics education research in China.

3.
Indices of population ‘health need’ are often used to distribute health resources or assess equity in service provision. This article describes a spatial structural equation model incorporating multiple indicators of need and multiple population health risks that affect need (analogous to multiple indicators–multiple causes models). More specifically, the multiple indicator component of the model involves health outcomes such as hospital admissions or mortality, whereas the multiple risk component models the impact on need of area social and demographic indicators, which proxy population-level risk factors for different diseases. The latent need construct is allowed (under a Bayesian approach) to be spatially correlated, though the prior assumed for need allows a mix of spatially structured and unstructured influences. A case study considers variations in need for coronary heart disease (CHD) care over 625 small areas in London, using recent mortality and hospitalization data (the ‘indicators’) and measures of general ill-health, income and unemployment, which proxy variations in population risk for CHD.

4.
An objective of randomized placebo-controlled preventive HIV vaccine efficacy (VE) trials is to assess the relationship between vaccine effects to prevent HIV acquisition and continuous genetic distances of the exposing HIVs to multiple HIV strains represented in the vaccine. The set of genetic distances, only observed in failures, is collectively termed the ‘mark.’ The objective has motivated a recent study of a multivariate mark-specific hazard ratio model in the competing risks failure time analysis framework. Marks of interest, however, are commonly subject to substantial missingness, largely due to rapid post-acquisition viral evolution. In this article, we investigate the mark-specific hazard ratio model with missing multivariate marks and develop two inferential procedures based on (i) inverse probability weighting (IPW) of the complete cases, and (ii) augmentation of the IPW estimating functions by leveraging auxiliary data predictive of the mark. Asymptotic properties and finite-sample performance of the inferential procedures are presented. This research also provides general inferential methods for semiparametric density ratio/biased sampling models with missing data. We apply the developed procedures to data from the HVTN 502 ‘Step’ HIV VE trial.
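The inverse-probability-weighting step in procedure (i) can be illustrated with a hypothetical Horvitz–Thompson style mean, where each complete case (a failure whose mark was observed) is up-weighted by the inverse of its estimated observation probability. This is a generic sketch of the IPW idea only, not the paper's mark-specific hazard estimator; all names and values are illustrative.

```python
def ipw_mean(marks, observed, obs_prob):
    """Horvitz-Thompson style IPW mean of a mark that is observed only
    for some cases: each complete case is weighted by 1 / P(observed).
    A minimal sketch of the IPW idea, not the paper's hazard estimator."""
    if not marks:
        raise ValueError("no cases supplied")
    # Incomplete cases contribute nothing; complete cases are up-weighted.
    total = sum(m / p for m, o, p in zip(marks, observed, obs_prob) if o)
    return total / len(marks)
```

If the observation probabilities are correct, the up-weighting compensates, on average, for the cases whose mark is missing.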

5.
ABSTRACT

Background: Instrumental variables (IVs) have become much easier to find in the “Big data era”, which has increased the number of applications of the Two-Stage Least Squares (TSLS) model. With the increased availability of IVs, the possibility that these IVs are weak has also increased. Prior work has suggested a ‘rule of thumb’ that IVs with a first-stage F statistic of at least ten will avoid a relative bias in point estimates greater than 10%. We investigated whether this threshold is also an efficient guarantee of low false rejection rates in null hypothesis tests in TSLS applications with many IVs.

Objective: To test how the ‘rule of thumb’ for weak instruments performs in predicting low false rejection rates in the TSLS model when the number of IVs is large.

Method: We used a Monte Carlo approach to create 28 original data sets for different models, with the number of IVs varying from 3 to 30. For each model, we generated 2000 observations per iteration and ran 50,000 iterations to reach convergence in rejection rates. The true coefficient was set to 0, and the probability of rejecting this (true) null hypothesis was recorded for each model as a measure of the false rejection rate. The relationship between the endogenous variable and the IVs was carefully adjusted to hold the first-stage F statistic equal to ten, thus simulating the ‘rule of thumb.’

Results: We found that the false rejection rates (type I errors) increased as the number of IVs in the TSLS model increased while the first-stage F statistic was held equal to 10. The false rejection rate exceeds 10% when the TSLS model has 24 IVs and exceeds 15% when it has 30 IVs.

Conclusion: When more instrumental variables were applied in the model, the ‘rule of thumb’ was no longer an efficient guarantee of good performance in hypothesis testing. A more restrictive threshold for the first-stage F statistic is recommended to replace the ‘rule of thumb,’ especially when the number of instrumental variables is large.
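A miniature version of this Monte Carlo design can be sketched as follows. All concrete choices here (error correlation 0.5, the 1.96 critical value, the reduced simulation sizes, and the calibration that sets the expected first-stage F near the target) are illustrative choices of this sketch, not the paper's exact settings.

```python
import numpy as np

def tsls_rejection_rate(n_iv, n_obs=2000, n_sims=500, target_f=10.0, seed=0):
    """Monte Carlo false rejection rate of the 2SLS t-test of H0: beta = 0
    when the true beta is 0 and the first-stage strength is calibrated so
    that the expected first-stage F is near target_f (the 'rule of thumb')."""
    rng = np.random.default_rng(seed)
    # With k equal coefficients pi on iid N(0,1) instruments, E[F] ~ n*pi^2 + 1.
    pi = np.sqrt((target_f - 1.0) / n_obs)
    rejections = 0
    for _ in range(n_sims):
        z = rng.standard_normal((n_obs, n_iv))
        u = rng.standard_normal(n_obs)                     # outcome error
        v = 0.5 * u + np.sqrt(0.75) * rng.standard_normal(n_obs)
        x = z @ np.full(n_iv, pi) + v                      # endogenous regressor
        y = u                                              # true beta = 0
        x_hat = z @ np.linalg.lstsq(z, x, rcond=None)[0]   # first stage fit
        beta = (x_hat @ y) / (x_hat @ x)                   # 2SLS estimate
        resid = y - beta * x
        se = np.sqrt((resid @ resid) / (n_obs - 2) / (x_hat @ x_hat))
        rejections += abs(beta / se) > 1.96
    return rejections / n_sims
```

Running this with increasing `n_iv` while the first-stage F stays near 10 reproduces the qualitative finding: the false rejection rate climbs as instruments are added.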

6.
This article provides alternative circular smoothing methods for the nonparametric estimation of periodic functions. By treating the data as ‘circular’, we solve the ‘boundary issue’ that arises in nonparametric estimation treating the data as ‘linear’. By redefining the distance metric and the signed distance, we modify many estimators used in situations involving periodic patterns. From the perspective of ‘nonparametric estimation of periodic functions’, we present examples of nonparametric estimation of (1) a periodic function, (2) multiple periodic functions, (3) an evolving function, (4) a periodically varying-coefficient model and (5) a generalized linear model with a periodically varying coefficient. From the perspective of ‘circular statistics’, we provide alternative approaches to calculating the weighted average and evaluating ‘linear/circular–linear/circular’ association and regression. Simulation studies and an empirical study of an electricity price index illustrate and compare our methods with other methods in the literature.
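The core ‘treat the data as circular’ device, redefining the distance metric so that the two ends of a period are neighbours, can be sketched as a wrap-around Nadaraya–Watson smoother. The Gaussian kernel, the bandwidth and the 24-unit period below are illustrative choices of this sketch, not the paper's.

```python
import math

def circular_nw(t0, ts, ys, period=24.0, h=1.5):
    """Nadaraya-Watson estimate at t0 using a circular (wrap-around)
    distance, so points just past the end of a period still count as
    neighbours of points near its start -- avoiding the boundary issue
    of linear smoothing."""
    num = den = 0.0
    for t, y in zip(ts, ys):
        d = abs(t - t0) % period
        d = min(d, period - d)              # shortest arc on the circle
        w = math.exp(-0.5 * (d / h) ** 2)   # Gaussian kernel weight
        num += w * y
        den += w
    return num / den
```

At a point like t0 = 0, a linear smoother would only see data on one side; the circular distance lets observations near t = 24 contribute symmetrically.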

7.
The choice of multi-state models is natural in the analysis of survival data, e.g., when the subjects in a study pass through different states such as ‘healthy’, ‘in a state of remission’, ‘relapse’ or ‘dead’ in a health-related quality of life study. Competing risks is another common instance of the use of multi-state models. Statistical inference for such event history data can be carried out by assuming a stochastic process model. Under such a setting, comparison of the event history data generated by two different treatments calls for testing equality of the corresponding transition probability matrices. The present paper proposes a solution to this class of problems by assuming a non-homogeneous Markov process to describe the transitions among the health states. A class of test statistics is derived for comparison of k treatments by using a ‘weight process’. This class, in particular, yields generalisations of the log-rank, Gehan, Peto–Peto and Harrington–Fleming tests. For an intrinsic comparison of the treatments, the ‘leave-one-out’ jackknife method is employed for identifying influential observations. The proposed methods are then used to develop Kolmogorov–Smirnov type supremum tests corresponding to the various extended tests. To demonstrate the usefulness of the test procedures developed, a simulation study was carried out and an application to the Trial V data provided by the International Breast Cancer Study Group is discussed.

8.
Let Y be distributed symmetrically about Xβ. Natural generalizations of odd location statistics, say T(Y), and even location-free statistics, say W(Y), that were used by Hogg (1960, 1967) are introduced. We show that T(Y) is distributed symmetrically about β, and thus E[T(Y)] = β, and that each element of T(Y) is uncorrelated with each element of W(Y). Applications of this result are made to R-estimators, and the result is extended to a multivariate linear model situation.

9.
Time series within fields such as finance and economics are often modelled using long memory processes. Alternative studies on the same data can suggest that series may actually contain a ‘changepoint’ (a point within the time series where the data generating process has changed). These models have been shown to have elements of similarity, such as within their spectrum. Without prior knowledge this leads to an ambiguity between these two models, meaning it is difficult to assess which model is most appropriate. We demonstrate that considering this problem in a time-varying environment using the time-varying spectrum removes this ambiguity. Using the wavelet spectrum, we then use a classification approach to determine the most appropriate model (long memory or changepoint). Simulation results are presented across a number of models followed by an application to stock cross-correlations and US inflation. The results indicate that the proposed classification outperforms an existing hypothesis testing approach on a number of models and performs comparatively across others.

10.
Latent class analysis (LCA) has been found to have important applications in the social and behavioural sciences for modelling categorical response variables, and non-response is typical when collecting data. In this study, the non-response mainly comprised ‘contingency questions’ and real ‘missing data’. The primary objective of this study was to evaluate the effects of some potential factors on model selection indices in LCA with non-response data. We simulated missing data with contingency questions and evaluated the accuracy rates of eight information criteria for selecting the correct models. The results showed that the main factors are latent class proportions, conditional probabilities, sample size, the number of items, the missing data rate and the contingency data rate. Interactions of the conditional probabilities with class proportions, sample size and the number of items are also significant. From our simulation results, the impact of missing data and contingency questions can be mitigated by increasing the sample size or the number of items.

11.
This paper presents a method of fitting factorial models to recidivism data consisting of the (possibly censored) times to ‘fail’ of individuals, in order to test for differences between groups. Here ‘failure’ means rearrest, reconviction or reincarceration, etc. A proportion P of the sample is assumed to be ‘susceptible’ to failure, i.e. to fail eventually, while the remaining 1−P are ‘immune’ and never fail. Thus failure may be described in two ways: by the probability P that an individual ever fails again (the ‘probability of recidivism’), and by the rate of failure Λ for the susceptibles. Related analyses have been proposed previously; this paper argues that a factorial approach, as opposed to the regression approaches advocated previously, offers simplified analysis and interpretation of these kinds of data. The methods proposed, which are also applicable in medical statistics and reliability analyses, are demonstrated on data sets in which the factors are Parole Type (released to freedom or on parole), Age group (≤ 20 years, 20–40 years, > 40 years) and Marital Status. The outcome (failure) is a return to prison following first or second release.
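The two-way description of failure above (a probability P of ever failing, and a rate Λ for the susceptibles) corresponds to a split-population likelihood. A minimal sketch for an exponential failure time follows; the factorial structure that the paper places on P and Λ is omitted, and the parameterisation is illustrative.

```python
import math

def cure_loglik(p, lam, times, failed):
    """Log-likelihood of a split-population ('cure') model: a proportion
    p is susceptible with exponential failure rate lam; the remaining
    1 - p are immune and never fail.  failed[i] is True for an observed
    failure at times[i], False for a censored follow-up time."""
    ll = 0.0
    for t, d in zip(times, failed):
        if d:
            # Susceptible and failed at t: density p * lam * exp(-lam * t).
            ll += math.log(p) + math.log(lam) - lam * t
        else:
            # Immune, or susceptible but not yet failed by time t.
            ll += math.log((1.0 - p) + p * math.exp(-lam * t))
    return ll
```

Maximising this over (p, lam), e.g. with a coarse grid search, separates the ‘probability of recidivism’ from the failure rate of the susceptibles.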

12.
Consider repeated event-count data from a sequence of exposures, during each of which a subject can experience some number of events, reported at ‘visits’ following each exposure. Within-subject heterogeneity not accounted for by visit-varying covariates is called ‘visit-level’ heterogeneity. Using generalized linear mixed models with a log link for longitudinal Poisson regression, I model visit-level heterogeneity by cumulatively adding ‘disturbances’ to the random intercept of each subject over visits, creating a ‘disturbed-random-intercept’ model. I also create a ‘disturbed-random-slope’ model, where the slope is over visits and both intercept and slope are random but only the slope is disturbed. Simulation studies compare fixed-effect estimation for these models in data with 15 visits, large visit-level heterogeneity and large multiplicative overdispersion. These studies show statistically significant superiority of the disturbed-random-intercept model. Examples with epidemiological data compare results of this model with those from other published models.
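A minimal simulation of the disturbed-random-intercept idea (log-link Poisson counts whose subject intercept accumulates a fresh disturbance at every visit) might look as follows; all parameter values and names are illustrative choices of this sketch, not the paper's.

```python
import math
import random

def simulate_disturbed_intercept(n_subjects=200, n_visits=15,
                                 sd_intercept=0.5, sd_disturb=0.2,
                                 base_rate=math.log(2.0), seed=1):
    """Simulate (subject, visit, count) triples from a log-link Poisson
    model whose subject random intercept accumulates a new 'disturbance'
    at each visit, so visit-level heterogeneity builds up over time."""
    rng = random.Random(seed)

    def poisson(lam):
        # Knuth's multiplication method; adequate for the modest rates here.
        limit, k, prod = math.exp(-lam), 0, 1.0
        while True:
            prod *= rng.random()
            if prod <= limit:
                return k
            k += 1

    data = []
    for s in range(n_subjects):
        b = rng.gauss(0.0, sd_intercept)        # subject random intercept
        for v in range(n_visits):
            b += rng.gauss(0.0, sd_disturb)     # cumulative visit-level disturbance
            data.append((s, v, poisson(math.exp(base_rate + b))))
    return data
```

Because the disturbances accumulate, counts at later visits are more dispersed across subjects than a plain random-intercept model would produce.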

13.
Cui, Ruifei; Groot, Perry; Heskes, Tom. Statistics and Computing, 2019, 29(2): 311-333

We consider the problem of causal structure learning from data with missing values, assumed to be drawn from a Gaussian copula model. First, we extend the ‘Rank PC’ algorithm, designed for Gaussian copula models with purely continuous data (so-called nonparanormal models), to incomplete data by applying rank correlation to pairwise complete observations and replacing the sample size with an effective sample size in the conditional independence tests to account for the information loss from missing values. When the data are missing completely at random (MCAR), we provide an error bound on the accuracy of ‘Rank PC’ and show its high-dimensional consistency. However, when the data are missing at random (MAR), ‘Rank PC’ fails dramatically. Therefore, we propose a Gibbs sampling procedure to draw correlation matrix samples from mixed data that still works correctly under MAR. These samples are translated into an average correlation matrix and an effective sample size, resulting in the ‘Copula PC’ algorithm for incomplete data. A simulation study shows that: (1) ‘Copula PC’ estimates a more accurate correlation matrix and causal structure than ‘Rank PC’ under MCAR and, even more so, under MAR and (2) the usage of the effective sample size significantly improves the performance of ‘Rank PC’ and ‘Copula PC.’ We illustrate our methods on two real-world datasets: riboflavin production data and chronic fatigue syndrome data.
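The first ingredient of the extended ‘Rank PC’ (rank correlation over pairwise complete observations, with the number of complete pairs standing in for the sample size) can be sketched as below. This is a simplified illustration without tie handling, and the paper's exact effective-sample-size definition may differ; here the count of complete pairs plays that role.

```python
import math

def pairwise_spearman(x, y):
    """Spearman rank correlation over pairwise complete observations
    (entries where both values are present, None marking a missing
    value), returned together with the number of complete pairs --
    the quantity used here in place of the full sample size.
    Simplification: ties are not midranked."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)

    def ranks(v):
        order = sorted(range(n), key=lambda i: v[i])
        r = [0.0] * n
        for rank, i in enumerate(order):
            r[i] = rank + 1.0
        return r

    xs, ys = zip(*pairs)
    rx, ry = ranks(list(xs)), ranks(list(ys))
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy), n
```

Under MCAR the complete pairs are a random subsample, which is why rescaling the tests by an effective sample size of this kind is enough; under MAR it is not, motivating the Gibbs-sampling ‘Copula PC’ extension.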


14.
Comparisons are made between the two values of the chi-square statistic in three-dimensional contingency tables as defined respectively by the ‘multiplicative’ and ‘additive’ models of zero second-order interaction. It is shown that in practice the two definitions frequently give comparable values for the statistics, and it is concluded that interaction measures, and partitioning of the overall association chi-square, are more useful than the considerable writing on the models' deficiencies would seem to indicate. There seems to be a slight bias in favour of the multiplicative model.

15.
Two types of bivariate models for categorical response variables are introduced to deal with special categories such as ‘unsure’ or ‘unknown’ in combination with other ordinal categories, while taking additional hierarchical data structures into account. The latter is achieved by the use of different covariance structures for a trivariate random effect. The models are applied to data from the INSIDA survey, where interest goes to the effect of covariates on the association between HIV risk perception (quadrinomial with an ‘unknown risk’ category) and HIV infection status (binary). The final model combines continuation-ratio with cumulative link logits for the risk perception, together with partly correlated and partly shared trivariate random effects for the household level. The results indicate that only age has a significant effect on the association between HIV risk perception and infection status. The proposed models may be useful in various fields of application such as social and biomedical sciences, epidemiology and public health.

16.
This paper describes a technique for building compact models of the shape and appearance of flexible objects seen in two-dimensional images. The models are derived from the statistics of sets of images of example objects with ‘landmark’ points labelled on each object. Each model consists of a flexible shape template, describing how the landmark points can vary, and a statistical model of the expected grey levels in regions around each point. Such models have proved useful in a wide variety of applications. We describe how the models can be used in local image search and give examples of their application.
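The ‘flexible shape template’ part of such a model is, in essence, a mean shape plus the principal modes of landmark variation. A minimal sketch follows; it omits the alignment (Procrustes) step and the grey-level models around each landmark, and the function name is an assumption of this sketch.

```python
import numpy as np

def shape_model(landmarks, n_modes=2):
    """Build a minimal point-distribution model from example shapes:
    the mean shape plus the leading eigenvectors of the landmark
    covariance ('modes of variation') and their variances.  Each row
    of `landmarks` is one example, flattened as (x1, y1, x2, y2, ...)."""
    X = np.asarray(landmarks, dtype=float)
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1][:n_modes]   # largest variances first
    return mean, vecs[:, order], vals[order]
```

New shape instances are then generated as `mean + modes @ b` for small coefficient vectors `b`, which is what constrains the template to plausible deformations during image search.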

17.
Ma Xiaojun et al. Statistical Research, 2022, 38(1): 146-160
In 2019, the United Nations released the System of Environmental-Economic Accounting for Energy (SEEA-Energy 2019), providing an international standard for compiling energy satellite accounts. As an advanced method in international energy statistics, energy satellite accounts can better meet the data needs of China's pursuit of sustainable energy development. This paper therefore explores the background from which energy satellite accounts evolved, traces their development history, distils the accounting framework and research perspective of SEEA-Energy 2019, summarises the common and the country-specific features of national practices in compiling energy satellite accounts, discusses the applicability of energy satellite accounts to China in light of the current state of Chinese energy statistics, and looks ahead to their future development.

18.
Liu Hong, Jin Lin. Statistical Research, 2012, 29(10): 99-104
Grounded in economic growth theory, this paper builds a semiparametric regression model relating China's GDP data for 1953-2010 to labour input, capital input, human capital and other factors. Statistical diagnostics are then applied to the model: the relevant diagnostic statistics are computed and used to identify outlying observations, on the basis of which the accuracy of China's GDP data is discussed. The outliers in China's GDP data are concentrated in two periods, 1958-1961 and 1991-1994. The paper concludes with an assessment of this approach of evaluating the accuracy of statistical data through the statistical diagnostics of a semiparametric regression model.

19.
Rethinking Model Selection in Spatial Regression   Cited: 1 (self-citations: 0, other citations: 1)
Spatial econometrics has two basic models: the spatial lag model and the spatial error model. This paper rethinks the choice between these two spatial regression models and reaches the following conclusions. Moran's I can be used to judge whether the residuals of a regression model exhibit spatial dependence. In empirical work, the Lagrange multiplier (LM) test is the most common way to choose between the two models; however, the test is based on statistical inference alone and ignores the underlying theory, so it may select the wrong model. The spatial error model is often passed over in empirical analysis, even though it is more widely applicable than the spatial lag model. Finally, empirical studies rarely discuss the specification of spatial regression models; Anselin proposed three statistics for this purpose, and if the model is correctly specified they should obey the ordering Wald statistic > log-likelihood statistic > LM statistic.

20.
The classical problem of testing treatment versus control is revisited by considering a class of test statistics based on a kernel that depends on a constant ‘a’. The proposed class includes the celebrated Wilcoxon–Mann–Whitney statistic as a special case when ‘a’ = 1. It is shown that, with an optimal choice of ‘a’ depending on the underlying distribution, the optimal member performs better (in terms of Pitman efficiency) than the Wilcoxon–Mann–Whitney and median tests for a wide range of underlying distributions. An extended Hodges–Lehmann type point estimator of the shift parameter corresponding to the proposed ‘optimal’ test statistic is also derived.
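The ‘a’ = 1 special case named above is the ordinary Wilcoxon–Mann–Whitney statistic. Since the abstract does not reproduce the general kernel, only this special case is sketched here, in its proportion-of-wins form.

```python
def wilcoxon_mann_whitney(treatment, control):
    """The Wilcoxon-Mann-Whitney statistic (the a = 1 member of the
    abstract's kernel class): the proportion of (treatment, control)
    pairs in which the treatment value exceeds the control value,
    with ties counted as half a win."""
    wins = sum((t > c) + 0.5 * (t == c)
               for t in treatment for c in control)
    return wins / (len(treatment) * len(control))
```

A value near 0.5 indicates no treatment effect; values near 1 indicate the treatment observations stochastically dominate the controls.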


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号