Similar Literature
20 similar articles found (search time: 890 ms)
1.
The magnitude–frequency distribution (MFD) of earthquakes is a fundamental statistic in seismology, and the so-called b-value in the MFD is of particular interest in geophysics. A continuous-time hidden Markov model (HMM) is proposed for characterizing the variability of b-values. The HMM-based approach to modeling the MFD has some appealing properties over the widely used sliding-window approach: window-size tuning often introduces large variability into the b-value estimate, which can make b-value heterogeneities difficult to interpret. Continuous-time hidden Markov models (CT-HMMs) are widely applied in various fields. They have an advantage over their discrete-time counterparts in that they can characterize heterogeneities in a time series on a finer time scale, particularly for highly irregularly spaced series such as earthquake occurrences. We present an expectation–maximization algorithm for estimating a general exponential-family CT-HMM and, in parallel with discrete-time hidden Markov models, develop a continuous-time version of the Viterbi algorithm to retrieve the overall optimal path of the latent Markov chain. The methods are applied to New Zealand deep earthquakes. Before the analysis, we assess the completeness of the catalogue to ensure that the analysis is not biased by missing data. The resulting b-value estimates are stable over the choice of magnitude threshold, which is ideal for interpreting b-value variability.
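As background for the b-value discussion, here is a minimal sketch of the classical Aki (1965) maximum-likelihood b-value estimator for magnitudes above a completeness threshold; the CT-HMM the paper develops is not reproduced, and the demo catalogue is synthetic:

```python
import numpy as np

def b_value_mle(mags, m_c):
    """Aki (1965) maximum-likelihood b-value, continuous-magnitude form,
    for events at or above the completeness threshold m_c."""
    m = np.asarray(mags, dtype=float)
    m = m[m >= m_c]
    return np.log10(np.e) / (m.mean() - m_c)

# Stability check over several thresholds, in the spirit of the paper's
# final remark; `mags` is a synthetic placeholder catalogue.
mags = np.random.default_rng(1).exponential(scale=0.4, size=5000) + 4.0
print([round(b_value_mle(mags, mc), 2) for mc in (4.0, 4.2, 4.4)])
```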

2.
This paper deals with the Bayesian analysis of additive mixed model experiments. Consider b randomly chosen subjects who respond once to each of t treatments. The subjects are treated as random effects and the treatment effects are fixed. Suppose that some prior information is available, thus motivating a Bayesian analysis. The Bayesian computation, however, can be difficult in this situation, especially when a large number of treatments is involved. Three computational methods are suggested to perform the analysis. The exact posterior density of any parameter of interest can be simulated based on random realizations taken from a restricted multivariate t distribution. The density can also be simulated using Markov chain Monte Carlo methods. The simulated density is accurate when a large number of random realizations is taken; however, it may take a substantial amount of computer time when many treatments are involved. An alternative Laplacian approximation is discussed. The Laplacian method produces smooth and very accurate approximations to posterior densities, and takes only seconds of computer time. An example of a pipeline cracks experiment is used to illustrate the Bayesian approaches and the computational methods.
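A generic sketch of the Laplacian (Laplace) approximation in its textbook form, not the paper's restricted multivariate-t machinery; `neg_log_post` is whatever negative log-posterior the analyst supplies:

```python
import numpy as np
from scipy import optimize, stats

def laplace_approx(neg_log_post, theta0):
    """Laplace approximation: a Gaussian centred at the posterior mode,
    with covariance given by the (BFGS estimate of the) inverse Hessian
    of the negative log posterior at the mode."""
    res = optimize.minimize(neg_log_post, theta0, method="BFGS")
    return stats.multivariate_normal(mean=res.x, cov=res.hess_inv)

# Tiny demo: posterior proportional to N(1, 1).
approx = laplace_approx(lambda th: 0.5 * np.sum((th - 1.0) ** 2), [0.0])
print(approx.pdf([1.0]))
```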

3.
The Buckley–James estimator (BJE) [J. Buckley and I. James, Linear regression with censored data, Biometrika 66 (1979), pp. 429–436] has been extended from right-censored (RC) data to interval-censored (IC) data by Rabinowitz et al. [D. Rabinowitz, A. Tsiatis, and J. Aragon, Regression with interval-censored data, Biometrika 82 (1995), pp. 501–513]. The BJE is defined to be a zero-crossing of a modified score function H(b), a point at which H(·) changes its sign. We discuss several approaches (for finding a BJE with IC data) which are extensions of the existing algorithms for RC data. However, these extensions may not be appropriate for some data; in particular, they are not appropriate for a cancer data set that we are analysing. In this note, we present a feasible iterative algorithm for obtaining a BJE. We apply the method to our data.
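The zero-crossing definition suggests a simple bracketing search; here is a hedged bisection sketch for locating a sign change of a score function H(·), which is a generic stand-in rather than the paper's iterative algorithm for IC data:

```python
def zero_crossing(H, lo, hi, tol=1e-8):
    """Locate a sign change of a (possibly discontinuous, step-like) score
    function H by bisection; returns a point where H changes sign, i.e. a
    BJE-type zero-crossing rather than an exact root."""
    f_lo = H(lo)
    assert f_lo * H(hi) <= 0, "need a sign change on [lo, hi]"
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f_lo * H(mid) <= 0:
            hi = mid                # sign change lies in [lo, mid]
        else:
            lo, f_lo = mid, H(mid)  # sign change lies in [mid, hi]
    return 0.5 * (lo + hi)
```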

4.
Stochastic Models, 2013, 29(2): 245–255
Consider a risk reserve process under which the reserve can generate interest. For constants a and b with a < b, we study the occupation time T_{a,b}(t), the total length of the time intervals up to time t during which the reserve is between a and b. We first present a general formula for piecewise-deterministic Markov processes, which is then used to compute the Laplace transform of T_{a,b}(t). Explicit results are given for the special case of exponentially distributed claim sizes, and the classical model is discussed in detail.
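A minimal sketch of the occupation-time quantity itself on a discretely sampled path; the paper's Laplace-transform formulas are not reproduced, and `path_t`, `path_x` are assumed sample times and reserve levels:

```python
import numpy as np

def occupation_time(path_t, path_x, a, b):
    """Approximate T_{a,b}(t): total time the sampled reserve path spends
    in [a, b], treating the path as constant between sample points."""
    t, x = np.asarray(path_t), np.asarray(path_x)
    dt = np.diff(t)
    inside = (x[:-1] >= a) & (x[:-1] <= b)
    return float(np.sum(dt[inside]))
```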

5.
Markov chain Monte Carlo techniques have revolutionized the field of Bayesian statistics. Their power is so great that they can even accommodate situations in which the structure of the statistical model itself is uncertain. However, the analysis of such trans-dimensional (TD) models is not easy and available software may lack the flexibility required for dealing with the complexities of real data, often because it does not allow the TD model to be simply part of some bigger model. In this paper we describe a class of widely applicable TD models that can be represented by a generic graphical model, which may be incorporated into arbitrary other graphical structures without significantly affecting the mechanism of inference. We also present a decomposition of the reversible jump algorithm into abstract and problem-specific components, which provides infrastructure for applying the method to all models in the class considered. These developments represent a first step towards a context-free method for implementing TD models that will facilitate their use by applied scientists for the practical exploration of model uncertainty. Our approach makes use of the popular WinBUGS framework as a sampling engine and we illustrate its use via two simple examples in which model uncertainty is a key feature.

6.
The generalized lambda distribution, GLD(λ1, λ2, λ3, λ4), is a four-parameter family that has been used for fitting distributions to a wide variety of data sets. The analysis of the λ3 and λ4 values that actually yield valid distributions has (until now) been incomplete. Moreover, because of computational problems and theoretical shortcomings, the moment space over which the GLD can be applied has been limited. This paper completes the analysis of the λ3 and λ4 values associated with valid distributions, improves previous computational methods to reduce the errors associated with fitting data, expands the parameter space over which the GLD can be used, and uses a four-parameter generalized beta distribution to cover the portion of the parameter space where the GLD is not applicable. In short, the paper extends the GLD to an EGLD system that can be used for fitting distributions to data sets that are cited in the literature as actually occurring in practice. Examples of use of the proposed system are included.
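For concreteness, the Ramberg–Schmeiser quantile function that defines the GLD, with inverse-transform sampling; the parameter values in the demo are the classic choice approximating a standard normal, not values taken from the paper:

```python
import numpy as np

def gld_quantile(u, lam1, lam2, lam3, lam4):
    """Ramberg-Schmeiser GLD quantile function Q(u); valid only for
    (lam3, lam4) combinations that yield a proper distribution, which is
    exactly the region the paper characterises."""
    return lam1 + (u**lam3 - (1.0 - u)**lam4) / lam2

# Inverse-transform sampling; GLD(0, 0.1975, 0.1349, 0.1349) ~ N(0, 1).
rng = np.random.default_rng(0)
samples = gld_quantile(rng.uniform(size=10_000), 0.0, 0.1975, 0.1349, 0.1349)
print(samples.mean(), samples.std())
```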

7.
The coefficient of determination, known also as R², is a common measure in regression analysis. Many scientists use R² and the adjusted R² on a regular basis. In most cases, researchers treat the coefficient of determination as an index of ‘usefulness’ or ‘goodness of fit,’ and in some cases they even treat it as a model selection tool. When the data are incomplete, most researchers and common statistical software will use complete-case analysis to estimate R², a procedure that can lead to biased results. In this paper, I introduce the use of multiple imputation for the estimation of R² and adjusted R² in incomplete data sets. I illustrate the methodology using a biomedical example.
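A hedged sketch of the idea, using scikit-learn's IterativeImputer as one possible imputation engine and assuming missingness in the predictors only; the simple averaging used for pooling is an illustrative assumption, not necessarily the paper's pooling rule:

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.linear_model import LinearRegression

def r2_multiple_imputation(X, y, m=20, seed=0):
    """R^2 under multiple imputation: impute the predictors m times,
    fit OLS on each completed data set, and pool the m R^2 values
    (plain averaging here; a Fisher-z-type rule may be preferable)."""
    r2s = []
    for i in range(m):
        imp = IterativeImputer(sample_posterior=True, random_state=seed + i)
        Xc = imp.fit_transform(X)
        r2s.append(LinearRegression().fit(Xc, y).score(Xc, y))
    return float(np.mean(r2s))
```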

8.
The analysis of a general k-factor factorial experiment having unequal numbers of observations per cell is complex. For the special case of a 2^k experiment with unequal numbers of observations per cell, the method of unweighted means provides a simple vehicle for analysis that requires no matrix inversion and can be used with existing software programs for the analysis of balanced data. All numerator sums of squares for testing main effects and interactions are χ² with one degree of freedom. In addition, for tests having one degree of freedom in any factorial experiment, the method of unweighted means may be modified to yield exact tests.

9.
In this paper, we prove a Hoeffding-like inequality for the survival function of a sum of symmetric, independent, identically distributed random variables taking values in a segment [−b, b] of the reals. The symmetric case is relevant to auditing practice and is an important case study for further investigations. The bounds given by Hoeffding in 1963 cannot be improved upon unless the class of random variables is restricted, for instance by assuming the law of the random variables to be symmetric with respect to their mean, which we may take to be zero. The main result of this paper is an improvement of the Hoeffding bound for bounded i.i.d. random variables with a known upper bound on the variance, under the further assumption that their law is symmetric.
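For reference, Hoeffding's 1963 bound that serves as the baseline, specialised here (my specialisation, not the paper's statement) to i.i.d. variables on [−b, b] with mean zero:

```latex
% Hoeffding (1963) for independent X_i \in [a_i, b_i]:
%   P(S_n - \mathbb{E} S_n \ge t) \le \exp\bigl(-2t^2 / \textstyle\sum_i (b_i - a_i)^2\bigr).
% With i.i.d. X_i \in [-b, b] and \mathbb{E} X_i = 0, so b_i - a_i = 2b:
P\!\left(\sum_{i=1}^{n} X_i \ge t\right) \le \exp\!\left(-\frac{t^2}{2nb^2}\right)
```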

10.
Contours may be viewed as the 2D outline of the image of an object. This type of data arises in medical imaging as well as in computer vision and can be modeled as data on a manifold and can be studied using statistical shape analysis. Practically speaking, each observed contour, while theoretically infinite dimensional, must be discretized for computations. As such, the coordinates for each contour as obtained at k sampling times, resulting in the contour being represented as a k-dimensional complex vector. While choosing large values of k will result in closer approximations to the original contour, this will also result in higher computational costs in the subsequent analysis. The goal of this study is to determine reasonable values for k so as to keep the computational cost low while maintaining accuracy. To do this, we consider two methods for selecting sample points and determine lower bounds for k for obtaining a desired level of approximation error using two different criteria. Because this process is computationally inefficient to perform on a large scale, we then develop models for predicting the lower bounds for k based on simple characteristics of the contours.  相似文献   
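To make the discretisation step concrete, a hedged sketch of one possible sampling scheme (uniform in arclength) and a crude approximation-error measure; the paper's two selection methods and its error criteria are not reproduced:

```python
import numpy as np

def resample_contour(z, k):
    """Resample a closed contour, given as a complex vector z, at k points
    equally spaced in arclength (one of many possible schemes)."""
    z = np.append(np.asarray(z), z[0])           # close the contour
    s = np.concatenate([[0.0], np.cumsum(np.abs(np.diff(z)))])
    grid = np.linspace(0.0, s[-1], k, endpoint=False)
    return np.interp(grid, s, z.real) + 1j * np.interp(grid, s, z.imag)

def approx_error(z, k):
    """Crude error proxy: mean distance from each original point to the
    nearest point of the k-point resampled contour."""
    z = np.asarray(z)
    zk = resample_contour(z, k)
    return float(np.mean(np.min(np.abs(z[:, None] - zk[None, :]), axis=1)))
```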

11.
A two-point estimator is proposed for the proportion of studies with positive trends among a collection of studies, some of which may demonstrate negative trends. The proposed estimator is the y-intercept of the secant line joining the points (a, F̂(a)) and (b, F̂(b)), where F̂(p) is the empirical distribution function of the p-values from one-tailed tests for positive trend derived from the individual studies. Although this estimator is negatively biased for any choice of the points 0 ≤ a < b ≤ 1, the bias is less than that of the previously proposed one-point estimator defined by setting b = 1. The bias of the two-point estimator is smallest when a and b approach the inflection point of the true distribution function, E[F̂(p)]. The utility of the two-point estimator is demonstrated by using it to estimate the number of male-mouse liver carcinogens among carcinogenicity studies conducted by the National Toxicology Program.
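A direct transcription of the geometric definition above; the choice of a and b is left to the user (the paper recommends values near the inflection point of E[F̂(p)]):

```python
import numpy as np

def two_point_estimate(pvals, a, b):
    """y-intercept of the secant through (a, Fhat(a)) and (b, Fhat(b)),
    where Fhat is the empirical CDF of the one-sided p-values."""
    p = np.sort(np.asarray(pvals, dtype=float))
    Fhat = lambda x: np.searchsorted(p, x, side="right") / p.size
    slope = (Fhat(b) - Fhat(a)) / (b - a)
    return Fhat(a) - slope * a   # setting b = 1 recovers the one-point form
```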

12.
The paper reviews a number of data models for aggregate statistical data that have appeared in the computer science literature over the last ten years. After a brief introduction to data models in general, the fundamental concepts of statistical data are introduced. These are called statistical objects because they are complex data structures (vectors, matrices, relations, time series, etc.) which may have different possible representations (e.g. tables, relations, vectors, pie charts, bar charts, graphs, and so on). For this reason a statistical object is defined by two different types of attribute: a summary attribute, with its own summary type and its own instances (called summary data), and a set of category attributes, which describe the summary attribute. Some conceptual models of statistical data (CSM, SDM4S), some semantic models (SCM, SAM*, OSAM*), and some graphical models (SUBJECT, GRASS, STORM) are also discussed.

13.
This study takes up inference in linear models with generalized error and generalized t distributions. For the generalized error distribution, two computational algorithms are proposed. The first is based on indirect Bayesian inference using an approximating finite scale mixture of normal distributions; the second is based on Gibbs sampling. The Gibbs sampler involves only drawing random numbers from standard distributions, which is important because the previous impression had been that an exact analysis of the generalized error regression model using Gibbs sampling is not possible. Next, we describe computational Bayesian inference for linear models with generalized t disturbances, again based on Gibbs sampling and exploiting the fact that the model is a mixture of generalized error distributions with inverse generalized gamma distributions for the scale parameter. The linear model with this specification had also been thought not to be amenable to exact Bayesian analysis. All computational methods are applied to actual data on the exchange rates of the British pound, the French franc, and the German mark relative to the U.S. dollar.
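The claim that the sampler draws only from standard distributions can be illustrated with the closely related, better-known normal scale-mixture Gibbs sampler for Student-t errors; this is a stand-in I supply, not the paper's generalized-error/generalized-t sampler, which uses generalized gamma scale draws instead:

```python
import numpy as np

def gibbs_t_regression(X, y, nu=5.0, n_iter=2000, seed=0):
    """Gibbs sampler for y = X @ beta + e with Student-t(nu) errors written
    as a normal scale mixture: e_i | w_i ~ N(0, sigma2/w_i) with
    w_i ~ Gamma(nu/2, rate nu/2). Flat prior on beta, 1/sigma2 on sigma2."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    sigma2, draws = 1.0, []
    for _ in range(n_iter):
        # w_i | rest ~ Gamma((nu+1)/2, rate (nu + e_i^2/sigma2)/2)
        e = y - X @ beta
        w = rng.gamma((nu + 1) / 2, 2.0 / (nu + e**2 / sigma2))
        # beta | rest ~ N(bhat, sigma2 * (X'WX)^{-1}), W = diag(w)
        XtW = X.T * w
        V = np.linalg.inv(XtW @ X)
        beta = rng.multivariate_normal(V @ (XtW @ y), sigma2 * V)
        # sigma2 | rest ~ Inv-Gamma(n/2, rate sum(w_i e_i^2)/2)
        e = y - X @ beta
        sigma2 = 1.0 / rng.gamma(n / 2, 2.0 / np.sum(w * e**2))
        draws.append((beta.copy(), sigma2))
    return draws
```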

14.
Two approximations to the F-distribution are evaluated in the context of testing for intraclass correlation in the analysis of family data. The evaluation is based on a computation of empirical significance levels and a comparison between p-values associated with these approximations and the corresponding exact p-values. It is found that the approximate methods may give very unsatisfactory results, and exact methods are therefore recommended for general use.

15.
Parameter reduction can enable otherwise infeasible design and uncertainty studies with modern computational science models that contain several input parameters. In statistical regression, techniques for sufficient dimension reduction (SDR) use data to reduce the predictor dimension of a regression problem. A computational scientist hoping to use SDR for parameter reduction encounters a problem: a computer prediction is best represented by a deterministic function of the inputs, so data consisting of computer simulation queries fail to satisfy the SDR assumptions. To address this problem, we interpret the SDR methods sliced inverse regression (SIR) and sliced average variance estimation (SAVE) as estimating the directions of a ridge function, which is a composition of a low-dimensional linear transformation with a nonlinear function. Under this interpretation, SIR and SAVE estimate matrices of integrals whose column spaces are contained in the span of the ridge directions; we analyze and numerically verify the convergence of these column spaces as the number of computer model queries increases. Moreover, we exhibit functions that are not ridge functions but whose inverse conditional moment matrices are low rank. Consequently, the computational scientist should beware when using SIR and SAVE for parameter reduction, since they may mistakenly suggest that truly important directions are unimportant.
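A minimal sketch of the SIR estimator discussed above, in its textbook form (Li, 1991) rather than the paper's ridge-function analysis; `X` holds sampled inputs and `y` the corresponding deterministic model outputs:

```python
import numpy as np

def sir_directions(X, y, n_slices=10, n_dirs=1):
    """Sliced inverse regression: standardise X, slice on the response,
    eigendecompose the covariance of within-slice means, and map the
    leading eigenvectors back to the original predictor scale."""
    n, p = X.shape
    evals, evecs = np.linalg.eigh(np.cov(X, rowvar=False))
    root_inv = evecs @ np.diag(evals**-0.5) @ evecs.T   # Sigma^{-1/2}
    Z = (X - X.mean(axis=0)) @ root_inv
    M = np.zeros((p, p))
    for sl in np.array_split(np.argsort(y), n_slices):
        m = Z[sl].mean(axis=0)
        M += (len(sl) / n) * np.outer(m, m)             # slice-mean cov.
    _, v = np.linalg.eigh(M)
    dirs = root_inv @ v[:, ::-1][:, :n_dirs]            # leading directions
    return dirs / np.linalg.norm(dirs, axis=0)
```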

16.
This paper deals with parameter estimation from truncated data assumed to follow truncated exponential distributions with a variety of truncation times: a_1 observations are obtained with truncation time b_1, a_2 observations with truncation time b_2, and so on, while the underlying exponential distribution is the same throughout. The purpose of the paper is to give existence conditions for the maximum likelihood estimators (MLEs) and to establish properties of the MLEs in two cases: (1) grouped and truncated data, where each datum records the number of values falling in a corresponding subinterval; and (2) continuous and truncated data.
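For the continuous-truncated case (2), a hedged numerical sketch of the MLE of the exponential rate; the existence conditions derived in the paper are not checked here:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def truncated_exp_mle(samples, trunc_times):
    """MLE of the exponential rate from data truncated at (possibly
    different) times: each x_i is observed conditional on x_i <= b_i.
    Maximises sum_i [log f(x_i; lam) - log F(b_i; lam)] numerically."""
    x = np.asarray(samples, dtype=float)
    b = np.asarray(trunc_times, dtype=float)

    def nll(lam):
        # log f = log(lam) - lam*x;  log F(b) = log(1 - exp(-lam*b))
        return -np.sum(np.log(lam) - lam * x - np.log1p(-np.exp(-lam * b)))

    return minimize_scalar(nll, bounds=(1e-8, 1e3), method="bounded").x
```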

17.
This article examines the evidence contained in t statistics that are marginally significant in 5% tests. The bases for evaluating evidence are likelihood ratios and integrated likelihood ratios, computed under a variety of assumptions regarding the alternative hypotheses in null hypothesis significance tests. Likelihood ratios and integrated likelihood ratios provide a useful measure of the evidence in favor of competing hypotheses because they can be interpreted as representing the ratio of the probabilities that each hypothesis assigns to observed data. When they are either very large or very small, they suggest that one hypothesis is much better than the other in predicting observed data. If they are close to 1.0, then both hypotheses provide approximately equally valid explanations for observed data. I find that p-values that are close to 0.05 (i.e., that are “marginally significant”) correspond to integrated likelihood ratios that are bounded by approximately 7 in two-sided tests, and by approximately 4 in one-sided tests.

The modest magnitude of integrated likelihood ratios corresponding to p-values close to 0.05 clearly suggests that higher standards of evidence are needed to support claims of novel discoveries and new effects.

18.
The problem of inference in Bayesian normal mixture models is known to be difficult. In particular, direct Bayesian inference (via quadrature) suffers from a combinatorial explosion in having to consider every possible partition of n observations into k mixture components, resulting in a computation time which is O(k^n). This paper explores the use of discretised parameters and shows that for equal-variance mixture models, direct computation time can be reduced to O(D^k n^k), where each relevant continuous parameter is divided into D regions. As a consequence, direct inference is now possible on genuine data sets for small k, where the quality of approximation is determined by the level of discretisation. For large problems, where the computational complexity is still too great in O(D^k n^k) time, discretisation can provide a convergence diagnostic for a Markov chain Monte Carlo analysis.

19.
The objective of this analysis of variance of paired data is to estimate positive random error variances for each of N = 2 measurement methods. The two methods measure the same item only once, without measurement repetition. The well-known unbiased Grubbs' estimators are not suitable for practical purposes because they can become negative. With the help of Chebyshev's inequality, the probability that the Grubbs' estimators become negative is determined. Based on the Grubbs' estimators, new estimators are derived. The new estimators are indeed always positive, but they are biased; it is shown that the biases are small. In case the Grubbs' estimators are positive, a bias correction of the new estimators may be envisaged.
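A minimal sketch of the classical Grubbs (1948) estimators whose possible negativity motivates the paper; the new, always-positive estimators are not reproduced here:

```python
import numpy as np

def grubbs_variances(x, y):
    """Grubbs estimators of the measurement-error variances of two methods
    that each measure the same items once:
        var_err(x) = s_x^2 - s_xy,   var_err(y) = s_y^2 - s_xy.
    Either estimate may come out negative, which is the problem the
    paper addresses."""
    S = np.cov(x, y)          # 2x2 sample covariance matrix
    return S[0, 0] - S[0, 1], S[1, 1] - S[0, 1]
```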

20.
This article proposes a method to estimate the degree of cointegration in bivariate series, and suggests a test statistic for testing non-cointegration, based on the determinant of the spectral density matrix at frequencies close to zero. The series are assumed to be I(d), 0 < d ≤ 1, with the parameter d supposed known. In this context, the order of integration of the error series is I(d − b), b ∈ [0, d], and the determinant of the spectral density matrix of the dth-difference series is a power function of b. The proposed estimator for b is obtained by regressing the logged determinant on a set of logged Fourier frequencies. Under the null hypothesis of non-cointegration, expressions for the bias and variance of the estimator are derived and its consistency is established. The asymptotic normality of the estimator, under Gaussian and non-Gaussian innovations, is also established. A Monte Carlo study shows that the suggested test has correct size and good power for moderate sample sizes when compared with other proposals in the literature. An advantage of the method over standard approaches is that it reveals the order of integration of the error series without estimating a regression equation. An application to real data illustrates the method.
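A hypothetical sketch of the regression idea, under the assumption (mine, for illustration) that the band-averaged periodogram determinant behaves like λ^{2b} near frequency zero; the band sizes and the factor 1/2 are illustrative choices, not the paper's exact estimator:

```python
import numpy as np

def estimate_b(x, y, n_bands=10, band_len=5):
    """Difference both series once (the d = 1 case), average the 2x2
    cross-periodogram matrix over small frequency bands near zero, and
    regress log det on log frequency; return slope/2 as the estimate
    of b under the assumed det ~ lambda^{2b} power law."""
    dx, dy = np.diff(np.asarray(x)), np.diff(np.asarray(y))
    n = len(dx)
    w = np.vstack([np.fft.rfft(dx), np.fft.rfft(dy)])   # 2 x (n//2 + 1)
    log_det, log_lam = [], []
    for j in range(n_bands):
        ks = np.arange(1 + j * band_len, 1 + (j + 1) * band_len)
        # band-averaged periodogram matrix (sum of rank-1 terms)
        I = sum(np.outer(w[:, k], np.conj(w[:, k])) for k in ks)
        I /= 2 * np.pi * n * band_len
        log_det.append(np.log(np.real(np.linalg.det(I))))
        log_lam.append(np.log(2 * np.pi * ks.mean() / n))
    return np.polyfit(log_lam, log_det, 1)[0] / 2.0
```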
