Similar Articles (20 results)
1.
The k largest order statistics in a random sample from a common heavy-tailed parent distribution with a regularly varying tail can be characterized as Fréchet extremes. This paper establishes that consecutive ratios of such Fréchet extremes are mutually independent and distributed as functions of beta random variables. The maximum likelihood estimator of the tail index based on these ratios is derived; its exact distribution is determined for fixed k, and its asymptotic distribution as k → ∞. Inferential procedures based upon the maximum likelihood estimator are shown to be optimal. The Fréchet extremes are not directly observable, but a feasible version of the maximum likelihood estimator is equivalent to Hill's statistic. A simple diagnostic is presented that can be used to decide on the largest value of k for which an assumption of Fréchet extremes is sustainable. The results are illustrated using data on commercial insurance claims arising from fires and explosions, and from hurricanes.
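The feasible estimator referred to above coincides with Hill's statistic. A minimal sketch of computing the Hill estimator from the k largest order statistics follows; the simulated Pareto sample, the choice of k and the variable names are illustrative assumptions, not part of the paper.

```python
import numpy as np

def hill_estimator(x, k):
    """Hill's estimator of the tail index gamma from the k largest order statistics."""
    x = np.sort(x)[::-1]                 # descending order statistics X_(1) >= X_(2) >= ...
    logs = np.log(x[:k + 1])
    return np.mean(logs[:k] - logs[k])   # average log-excess over the (k+1)-th largest value

rng = np.random.default_rng(0)
alpha = 2.0                                    # Pareto shape; true tail index gamma = 1/alpha = 0.5
sample = rng.pareto(alpha, size=5000) + 1.0    # regularly varying right tail
print(hill_estimator(sample, k=200))           # should be close to 0.5
```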

2.
An extended Gaussian max-stable process model for spatial extremes
The extremes of environmental processes are often of interest due to the damage that can be caused by extreme levels of the processes. These processes are often spatial in nature and modelling the extremes jointly at many locations can be important. In this paper, an extension of the Gaussian max-stable process is developed, enabling data from a number of locations to be modelled under a more flexible framework than in previous applications. The model is applied to annual maximum rainfall data from five sites in South-West England. For estimation we employ a pairwise likelihood within a Bayesian analysis, incorporating informative prior information.

3.
Local likelihood smoothing of sample extremes
Trends in sample extremes are of interest in many contexts, an example being environmental statistics. Parametric models are often used to model trends in such data, but they may not be suitable for exploratory data analysis. This paper outlines a semiparametric approach to smoothing sample extremes, based on local polynomial fitting of the generalized extreme value distribution and related models. The uncertainty of fits is assessed by using resampling methods. The methods are applied to data on extreme temperatures and on record times for the women's 3000 m race.
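A minimal sketch of the local-likelihood idea: kernel-weighted maximum likelihood fitting of a GEV whose location is locally linear in time. The Gaussian kernel, the bandwidth and the synthetic data are illustrative assumptions and not the paper's exact specification.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import genextreme

def local_gev_location(t, x, t0, bandwidth):
    """Kernel-weighted GEV fit with location linear in (t - t0); returns the fitted location at t0."""
    w = np.exp(-0.5 * ((t - t0) / bandwidth) ** 2)   # Gaussian kernel weights

    def nll(theta):
        mu0, mu1, log_sigma, xi = theta
        mu = mu0 + mu1 * (t - t0)
        # scipy's genextreme uses shape c = -xi
        logpdf = genextreme.logpdf(x, c=-xi, loc=mu, scale=np.exp(log_sigma))
        return -np.sum(w * logpdf)

    start = np.array([x.mean(), 0.0, np.log(x.std()), 0.1])
    fit = minimize(nll, start, method="Nelder-Mead", options={"maxiter": 2000})
    return fit.x[0]                                   # smoothed location at t0

rng = np.random.default_rng(1)
t = np.arange(1950, 2021)
x = genextreme.rvs(c=-0.1, loc=20 + 0.03 * (t - 1950), scale=2.0, size=t.size, random_state=rng)
print(local_gev_location(t, x, t0=2000, bandwidth=15.0))
```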

4.
Time series of daily mean temperature obtained from the European Climate Assessment data set are analyzed with respect to their extremal properties. A time-series clustering approach which combines Bayesian methodology, extreme value theory and classification techniques is adopted for the analysis of the regional variability of temperature extremes. The daily mean temperature records are clustered on the basis of their corresponding predictive distributions for 25-, 50- and 100-year return values. The results of the cluster analysis show a clear distinction between the highest-altitude stations, for which the return values are lowest, and the remaining stations. Furthermore, a clear distinction is also found between the northernmost stations in Scandinavia and the stations in central and southern Europe. This spatial structure of the 25-, 50- and 100-year return value distributions appears consistent with projected changes in the variability of temperature extremes over Europe, which point to different behavior in central Europe than in northern Europe and the Mediterranean area, possibly related to the effect of soil moisture and land-atmosphere coupling.
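A minimal sketch of the final clustering step under simplifying assumptions: per-station point estimates of GEV parameters (rather than full Bayesian predictive distributions) are converted into 25-, 50- and 100-year return levels and grouped with k-means. The station parameters and the number of clusters are synthetic, illustrative choices.

```python
import numpy as np
from scipy.stats import genextreme
from sklearn.cluster import KMeans

def return_level(mu, sigma, xi, T):
    """T-year return level of a GEV(mu, sigma, xi) fitted to annual maxima."""
    p = 1.0 - 1.0 / T
    return genextreme.ppf(p, c=-xi, loc=mu, scale=sigma)

rng = np.random.default_rng(2)
n_stations = 30
params = np.column_stack([
    rng.uniform(15, 35, n_stations),     # location mu
    rng.uniform(1.5, 4.0, n_stations),   # scale sigma
    rng.uniform(-0.2, 0.1, n_stations),  # shape xi
])
features = np.array([[return_level(mu, s, xi, T) for T in (25, 50, 100)]
                     for mu, s, xi in params])
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)
print(labels)
```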

5.
Summary. The number of people to select within selected households has significant consequences for the conduct and output of household surveys. The operational and data quality implications of this choice are carefully considered in many surveys, but the effect on statistical efficiency is not well understood. The usual approach is to select all people in each selected household, where operational and data quality concerns make this feasible. If not, one person is usually selected from each selected household. We find that this strategy is not always justified, and we develop intermediate designs between these two extremes. Current practices were developed when household survey field procedures needed to be simple and robust; however, more complex designs are now feasible owing to the increasing use of computer-assisted interviewing. We develop more flexible designs by optimizing survey cost, based on a simple cost model, subject to a required variance for an estimator of population total. The innovation lies in the fact that household sample sizes are small integers, which creates challenges in both design and estimation. The new methods are evaluated empirically by using census and health survey data, showing considerable improvement over existing methods in some cases.
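A minimal sketch of the cost-optimization idea, using a simple per-household plus per-person cost model and the usual design-effect approximation 1 + (m - 1)·rho for m persons taken per household. All numbers (costs, intra-household correlation, variance target) are illustrative assumptions, and the variance formula is a simplification of the paper's estimator for a population total.

```python
import math

def optimal_household_design(rho, unit_var, var_target, cost_household, cost_person, max_m=8):
    """Search over integer within-household sample sizes m for the cheapest design meeting a
    variance target, using var ~ unit_var * (1 + (m - 1) * rho) / (n * m)."""
    best = None
    for m in range(1, max_m + 1):
        deff = 1.0 + (m - 1) * rho
        n_households = math.ceil(unit_var * deff / (m * var_target))
        cost = n_households * (cost_household + m * cost_person)
        if best is None or cost < best[2]:
            best = (m, n_households, cost)
    return best   # (persons per household, number of households, total cost)

print(optimal_household_design(rho=0.5, unit_var=1.0, var_target=0.001,
                               cost_household=30.0, cost_person=10.0))
```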

6.
We present a flexible branching process model for cell population dynamics in synchrony/time-series experiments used to study important cellular processes. Its formulation is constructive, based on an accounting of the unique cohorts in the population as they arise and evolve over time, allowing it to be written in closed form. The model can attribute effects to subsets of the population, providing flexibility not available using the models historically applied to these populations. It provides a tool for in silico synchronization of the population and can be used to deconvolve population-level experimental measurements, such as temporal expression profiles. It also allows for the direct comparison of assay measurements made from multiple experiments. The model can be fit either to budding index or DNA content measurements, or both, and is easily adaptable to new forms of data. The ability to use DNA content data makes the model applicable to almost any organism. We describe the model and illustrate its utility and flexibility in a study of cell cycle progression in the yeast Saccharomyces cerevisiae.

7.
Estimation of the allele frequency at genetic markers is a key ingredient in biological and biomedical research, such as studies of human genetic variation or of the genetic etiology of heritable traits. As genetic data becomes increasingly available, investigators face a dilemma: when should data from other studies and population subgroups be pooled with the primary data? Pooling additional samples will generally reduce the variance of the frequency estimates; however, used inappropriately, pooled estimates can be severely biased due to population stratification. Because of this potential bias, most investigators avoid pooling, even for samples with the same ethnic background and residing on the same continent. Here, we propose an empirical Bayes approach for estimating allele frequencies of single nucleotide polymorphisms. This procedure adaptively incorporates genotypes from related samples, so that more similar samples have a greater influence on the estimates. In every example we have considered, our estimator achieves a mean squared error (MSE) that is smaller than either pooling or not, and sometimes substantially improves over both extremes. The bias introduced is small, as is shown by a simulation study that is carefully matched to a real data example. Our method is particularly useful when small groups of individuals are genotyped at a large number of markers, a situation we are likely to encounter in a genome-wide association study.
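A minimal sketch of the empirical-Bayes shrinkage idea in a beta-binomial form: the paper's adaptive weighting by sample similarity is replaced here by a single pooled prior of fixed strength, and all counts and the prior strength are hypothetical.

```python
def eb_allele_frequency(primary_alt, primary_n, pooled_alt, pooled_n, prior_strength=50.0):
    """Shrink the primary-sample allele frequency toward the pooled frequency.

    The pooled data define a Beta(a, b) prior with a + b = prior_strength;
    the posterior mean blends the primary and pooled estimates."""
    pooled_freq = pooled_alt / pooled_n
    a = prior_strength * pooled_freq
    b = prior_strength * (1.0 - pooled_freq)
    return (primary_alt + a) / (primary_n + a + b)

# Small primary sample, larger external sample at the same SNP (hypothetical allele counts).
print(eb_allele_frequency(primary_alt=6, primary_n=40, pooled_alt=180, pooled_n=1000))
```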

8.
In this article, we revisit the importance of the generalized jackknife in the construction of reliable semi-parametric estimates of some parameters of extreme or even rare events. The generalized jackknife statistic is applied to a minimum-variance reduced-bias estimator of a positive extreme value index—a primary parameter in statistics of extremes. A couple of refinements are proposed and a simulation study shows that these are able to achieve a lower mean square error. A real data illustration is also provided.
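A minimal sketch of the generalized-jackknife idea applied to the Hill estimator: combining Hill estimates computed at two levels, k and ⌊k/2⌋, cancels the dominant bias term under the assumption that the second-order parameter rho equals -1. This is a simplification of the reduced-bias estimators discussed in the article; the Fréchet sample (for which rho = -1) is an illustrative choice.

```python
import numpy as np

def hill(x, k):
    """Hill's estimator of the positive extreme value index from the k largest observations."""
    x = np.sort(x)[::-1]
    return np.mean(np.log(x[:k]) - np.log(x[k]))

def generalized_jackknife_hill(x, k):
    """Affine combination 2*H(k//2) - H(k), which removes the dominant bias when rho = -1."""
    return 2.0 * hill(x, k // 2) - hill(x, k)

rng = np.random.default_rng(3)
alpha = 2.0                                                  # Frechet shape; gamma = 0.5, rho = -1
sample = (-np.log(rng.uniform(size=20000))) ** (-1.0 / alpha)
print(hill(sample, 2000), generalized_jackknife_hill(sample, 2000))
```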

9.
In extreme value theory, the second-order shape parameter governs the speed of convergence of linearly normalized maxima towards their limit law. Adequate estimation of this parameter is vital for improving the estimation of the extreme value index, the primary parameter in statistics of extremes. In this article, we consider a recent class of semi-parametric estimators of the second-order shape parameter for heavy right-tailed models. These estimators, based on the largest order statistics, depend on a real tuning parameter, which makes them highly flexible and possibly unbiased for several underlying models. We are interested in the adaptive choice of this tuning parameter and of the number of top order statistics used in the estimation procedure. The performance of the methodology for the adaptive choice of parameters is evaluated through a Monte Carlo simulation study.

10.
When biological or physiological variables change over time, we are often interested in making predictions either of future measurements or of the time taken to reach some threshold value. On the basis of longitudinal data for multiple individuals, we develop Bayesian hierarchical models for making these predictions together with their associated uncertainty. Particular aspects addressed, which include some novel components, are handling curvature in individuals' trends over time, making predictions for both underlying and measured levels, making predictions from a single baseline measurement, making predictions from a series of measurements, allowing flexibility in the error and random-effects distributions, and including covariates. In the context of data on the expansion of abdominal aortic aneurysms over time, where reaching a certain threshold leads to referral for surgery, we discuss the practical application of these models to the planning of monitoring intervals in a national screening programme. Prediction of the time to reach a threshold was too imprecise to be practically useful, and we focus instead on limiting the probability of exceeding the threshold after given time intervals. Although more complex models can be shown to fit the data better, we find that relatively simple models seem to be adequate for planning monitoring intervals.
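A minimal sketch of the "probability of exceeding the threshold after a given interval" calculation, using a simple linear growth model with a normal predictive distribution for an individual's measured level. The paper's hierarchical models additionally handle curvature, flexible error distributions and covariates; everything below, including the posterior summaries and the 5.5 cm threshold, is an illustrative assumption.

```python
import numpy as np
from scipy.stats import norm

def exceedance_probability(intercept_mean, slope_mean, cov, resid_sd, t, threshold):
    """P(measured level at time t exceeds threshold) under a linear growth model with a
    bivariate normal posterior for (intercept, slope) and independent measurement error."""
    mean = intercept_mean + slope_mean * t
    var = cov[0, 0] + 2 * t * cov[0, 1] + t ** 2 * cov[1, 1] + resid_sd ** 2
    return 1.0 - norm.cdf(threshold, loc=mean, scale=np.sqrt(var))

# Hypothetical posterior summaries for one individual's aneurysm diameter (cm), growth per year.
post_cov = np.array([[0.02, 0.001], [0.001, 0.01]])
for years in (1, 2, 3):
    p = exceedance_probability(4.2, 0.3, post_cov, resid_sd=0.1, t=years, threshold=5.5)
    print(years, round(p, 4))
```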

11.
In the past decades, the number of variables explaining observations in practical applications has increased steadily. This has led to heavy computational tasks, despite the widespread use of provisional variable selection methods in data processing. As a result, more methodological techniques have appeared to reduce the number of explanatory variables without losing much of the information. Within these techniques, two distinct approaches are apparent: 'shrinkage regression' and 'sufficient dimension reduction'. Surprisingly, there has been no communication or comparison between these two methodological categories, and it is not clear when each approach is appropriate. In this paper, we fill some of this gap by first reviewing each category in brief, paying special attention to the most commonly used methods in each. We then compare commonly used methods from both categories in terms of their accuracy, computation time, and ability to select effective variables. A simulation study of the performance of the methods in each category is presented as well. The selected methods are also tested concurrently on two sets of real data, which allows us to recommend conditions under which each approach is more appropriate for high-dimensional data.
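A minimal sketch contrasting the two approaches on simulated data: the lasso (via scikit-learn) as a representative shrinkage-regression method, and a bare-bones sliced inverse regression (SIR) as a representative sufficient-dimension-reduction method. The data-generating model, slice count and standardization shortcut are illustrative choices, not the paper's study design.

```python
import numpy as np
from sklearn.linear_model import LassoCV

def sir_directions(X, y, n_slices=10, n_directions=1):
    """Bare-bones sliced inverse regression: leading eigen-directions of the covariance of
    slice means of the (per-column) standardized predictors."""
    n, p = X.shape
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    order = np.argsort(y)
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = Xs[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    eigvals, eigvecs = np.linalg.eigh(M)
    return eigvecs[:, ::-1][:, :n_directions]          # directions with largest eigenvalues

rng = np.random.default_rng(4)
n, p = 500, 50
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]                            # only three effective variables
y = X @ beta + rng.normal(scale=0.5, size=n)

lasso = LassoCV(cv=5).fit(X, y)
print("lasso nonzero coefficients:", np.flatnonzero(lasso.coef_))
print("largest loadings of leading SIR direction:",
      np.argsort(np.abs(sir_directions(X, y)[:, 0]))[::-1][:3])
```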

12.
In the traditional plan for assessing the reliability of a measurement system, a number of raters each measure the same group of subjects. If the system has a large number of raters, we recommend a new set of plans that has two advantages over the traditional plan. First, the proposed plans provide greater precision for estimating the intraclass correlation coefficient with the same total number of measurements. Second, the plans are flexible and can be adapted to constraints on the number of times any subject can be assessed or the number of times any rater can make an assessment. We provide a simple tool for planning a reliability study, access to the software for the planning in the case where there are constraints and an example to demonstrate the analysis of data from the proposed plans. The Canadian Journal of Statistics 39: 344–355; 2011 © 2011 Statistical Society of Canada
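A minimal sketch of estimating the intraclass correlation coefficient from the traditional complete design (every rater measures every subject) via a one-way ANOVA decomposition with subjects as groups; the proposed flexible plans in the article go beyond this. The simulated table and its variance components are illustrative assumptions.

```python
import numpy as np

def icc_oneway(data):
    """ICC(1) from a subjects x raters table via one-way ANOVA with subjects as groups."""
    n_subjects, n_raters = data.shape
    grand = data.mean()
    subject_means = data.mean(axis=1)
    ss_between = n_raters * np.sum((subject_means - grand) ** 2)
    ss_within = np.sum((data - subject_means[:, None]) ** 2)
    ms_between = ss_between / (n_subjects - 1)
    ms_within = ss_within / (n_subjects * (n_raters - 1))
    return (ms_between - ms_within) / (ms_between + (n_raters - 1) * ms_within)

rng = np.random.default_rng(5)
true_scores = rng.normal(50, 10, size=(30, 1))            # 30 subjects, between-subject sd 10
ratings = true_scores + rng.normal(0, 5, size=(30, 6))    # 6 raters, measurement error sd 5
print(round(icc_oneway(ratings), 3))                       # true ICC = 100 / (100 + 25) = 0.8
```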

13.
The Poisson distribution is commonly used to model the number of occurrences of independent rare events. However, many instances arise where dependence exists, for example, in counting the length of long head runs in coin tossing, or matches between two DNA sequences. The Chen-Stein method of Poisson approximation yields bounds on the error incurred when approximating the number of occurrences of possibly dependent events by a Poisson random variable of the same mean. In addition to the problems related to the motivating examples from molecular biology involving runs and matches, the method may be applied to questions as varied as calculating probabilities involving extremes of sequences of random variables and approximating the probability of general birthday coincidences.
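A small worked example of the birthday-coincidence application: the number of coinciding pairs among n people is approximately Poisson with mean C(n,2)/365, so P(at least one shared birthday) is approximately 1 - exp(-C(n,2)/365). The Chen-Stein method additionally bounds the error of this approximation; the bound itself is not computed in this sketch.

```python
import math

def birthday_poisson_approx(n, days=365):
    """Poisson approximation to P(at least one shared birthday among n people)."""
    lam = math.comb(n, 2) / days          # expected number of coinciding pairs
    return 1.0 - math.exp(-lam)

def birthday_exact(n, days=365):
    """Exact probability assuming uniform, independent birthdays."""
    p_none = 1.0
    for i in range(n):
        p_none *= (days - i) / days
    return 1.0 - p_none

for n in (10, 23, 40):
    print(n, round(birthday_poisson_approx(n), 4), round(birthday_exact(n), 4))
```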

14.
Extreme value models and techniques are widely applied in environmental studies to define protection systems against the effects of extreme levels of environmental processes. In climate science, changes in the hydrological cycle are of particular importance. Among hydrologic processes, rainfall is an especially important variable, as it is strongly related to flood risk assessment and mitigation, as well as to water resources availability and drought identification. We implement here a geoadditive model for extremes, assuming that the observations follow a generalized extreme value distribution with spatially dependent location. The analyzed territory is the catchment area of the Arno River in Tuscany, central Italy.
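A minimal sketch of the key modelling assumption, a GEV response whose location varies with the spatial coordinates. For brevity the location is linear in longitude and latitude rather than a geoadditive (spline) surface, and the station coordinates and rainfall maxima are synthetic illustrative values.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import genextreme

def fit_spatial_gev(lon, lat, x):
    """Maximum likelihood fit of a GEV with location mu = b0 + b1*lon + b2*lat."""
    def nll(theta):
        b0, b1, b2, log_sigma, xi = theta
        mu = b0 + b1 * lon + b2 * lat
        return -np.sum(genextreme.logpdf(x, c=-xi, loc=mu, scale=np.exp(log_sigma)))
    start = np.array([x.mean(), 0.0, 0.0, np.log(x.std()), 0.1])
    return minimize(nll, start, method="Nelder-Mead", options={"maxiter": 5000}).x

rng = np.random.default_rng(6)
lon = rng.uniform(10.0, 12.0, 200)        # rough coordinate ranges, for illustration only
lat = rng.uniform(43.0, 44.5, 200)
mu_true = 40.0 + 5.0 * (lon - 11.0) - 8.0 * (lat - 43.7)
x = genextreme.rvs(c=-0.1, loc=mu_true, scale=10.0, random_state=rng)
print(fit_spatial_gev(lon, lat, x))
```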

15.
Acceptance sampling techniques are used to monitor the accuracy of gas meters. Random samples of meters are taken from homogeneous lots, and two accuracy measurements are recorded for each meter. In the past, the two measurements were averaged, and an acceptance sampling test applied to the sample of averages. In 1987, the plan was modified so that virtually the same test is applied to the two measurements individually. This new procedure is more stringent than the old procedure. In a study of data sampled over three years, the new plan rejects more lots than does the old plan, leading to greatly increased costs to the gas industry and therefore to the consumer. Theoretical reasons are given for why this occurs, and an alternative plan is proposed.
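A minimal simulation sketch of why applying the same limits to the two measurements individually is more stringent than applying them to their average: the individual test flags far more meters, and therefore rejects far more lots, even when lots are in control. The error distribution, limits and acceptance number below are illustrative assumptions, not the actual 1987 plan.

```python
import numpy as np

rng = np.random.default_rng(7)
n_lots, n_sampled, limit, acceptance_number = 2000, 20, 2.0, 1

rejected_old = rejected_new = 0
for _ in range(n_lots):
    # two accuracy measurements (percent error) per sampled meter from an in-control lot
    errors = rng.normal(loc=0.0, scale=1.0, size=(n_sampled, 2))
    # old plan: a meter is nonconforming if its *average* error is outside +/- limit
    old_defects = np.sum(np.abs(errors.mean(axis=1)) > limit)
    # new plan: a meter is nonconforming if *either* measurement is outside +/- limit
    new_defects = np.sum(np.any(np.abs(errors) > limit, axis=1))
    rejected_old += int(old_defects > acceptance_number)
    rejected_new += int(new_defects > acceptance_number)

print("lot rejection rate, old plan:", rejected_old / n_lots)
print("lot rejection rate, new plan:", rejected_new / n_lots)
```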

16.
In the search for the best of n candidates, two-stage procedures of the following type are in common use. In the first stage, weak candidates are removed, and the subset of promising candidates is examined further. In the second stage, the best of the candidates in the subset is selected. In this article, optimization is aimed not at the parameter with the largest value but rather at the best performance of the selected candidates at Stage 2. Under a normal model, a new procedure based on posterior percentiles is derived using a Bayes approach, where nonsymmetric normal (proper and improper) priors are applied. Comparisons are made with two other procedures frequently used in selection decisions. The three procedures and their performances are illustrated with data from a recent recruitment process at a Midwestern university.

17.
The wide-ranging and rapidly evolving nature of ecological studies means that it is not possible to cover all existing and emerging techniques for analyzing multivariate data. Two important methods, however, have attracted many followers: Canonical Correspondence Analysis (CCA) and the STATICO analysis. Despite the particular characteristics of each, they have similarities and differences which, when analyzed properly, can together provide important complementary results beyond those usually exploited by researchers. On the one hand, the use of CCA is completely generalized and widely implemented, solving many problems formulated by ecologists; on the other hand, the method has some weaknesses, mainly the restrictions it imposes on the number of variables relative to the number of samples. The STATICO method has no such restrictions, but requires that the number of variables (species or environment) be the same at each time or location. The STATICO method also presents information that can be more detailed, since it allows the variability within groups (either in time or space) to be visualized. In this study, the data needed for implementing these methods are sketched, and a comparison is made showing the advantages and disadvantages of each method. The ecological data treated are a sequence of pairs of ecological tables, where species abundances and environmental variables are measured at different, specified locations over the course of time.

18.
Models for Dependent Extremes Using Stable Mixtures
Abstract. This paper unifies and extends results on a class of multivariate extreme value (EV) models studied by Hougaard, Crowder and Tawn. In these models, both unconditional and conditional distributions are themselves EV distributions, and all lower-dimensional marginals and maxima belong to the class. One interpretation of the models is as size mixtures of EV distributions, where the mixing is by positive stable distributions. A second interpretation is as exponential-stable location mixtures (for Gumbel) or as power-stable scale mixtures (for non-Gumbel EV distributions). A third interpretation is through a peaks over thresholds model with a positive stable intensity. The mixing variables are used as a modelling tool and for better understanding and model checking. We study EV analogues of components of variance models, and new time series, spatial and continuous parameter models for extreme values. The results are applied to data from a pitting corrosion investigation.
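A minimal sketch of the size-mixture construction for one member of this class, the symmetric logistic model with unit Fréchet margins: conditional on a positive alpha-stable variable S, the components are independent, and unconditionally they are unit Fréchet with logistic dependence. The Chambers-Mallows-Stuck formula is used to simulate S; parameter values and the empirical margin check are illustrative.

```python
import numpy as np

def positive_stable(alpha, size, rng):
    """Positive alpha-stable variables with Laplace transform exp(-t**alpha), 0 < alpha < 1
    (Chambers-Mallows-Stuck construction)."""
    u = rng.uniform(0.0, np.pi, size)
    e = rng.exponential(1.0, size)
    return (np.sin(alpha * u) / np.sin(u) ** (1.0 / alpha)
            * (np.sin((1.0 - alpha) * u) / e) ** ((1.0 - alpha) / alpha))

def logistic_max_stable_sample(alpha, dim, n, rng):
    """Dependent unit-Frechet vectors with the symmetric logistic dependence structure
    (dependence increases as alpha decreases), built as a positive stable size mixture."""
    s = positive_stable(alpha, (n, 1), rng)
    e = rng.exponential(1.0, (n, dim))
    return (s / e) ** alpha              # P(Z <= z | S) = exp(-S * z**(-1/alpha))

rng = np.random.default_rng(8)
z = logistic_max_stable_sample(alpha=0.5, dim=2, n=100000, rng=rng)
# empirical check of the unit Frechet margin: P(Z <= 1) should be close to exp(-1) ~ 0.368
print(np.mean(z[:, 0] <= 1.0))
```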

19.
Summary. Long-transported air pollution in Europe is monitored by a combination of a highly complex mathematical model and a limited number of measurement stations. The model predicts deposition on a 150 km × 150 km square grid covering the whole of the continent. These predictions can be regarded as spatial averages, with some spatially correlated model error. The measurement stations give a limited number of point estimates, regarded as error free. We combine these two sources of data by assuming that both are observations of an underlying true process. This true deposition is made up of a smooth deterministic trend, due to gradual changes in emissions over space and time, and two stochastic components. One is non-stationary and correlated over long distances; the other describes variation within a grid square. Our approach is through hierarchical modelling with predictions and measurements being independent conditioned on the underlying non-stationary true deposition. We assume Gaussian processes and calculate maximum likelihood estimates through numerical optimization. We find that the variation within a grid square is by far the largest component of the variation in the true deposition. We assume that the mathematical model produces estimates of the mean over an area that is approximately equal to a grid square, and we find that it has an error that is similar to the long-range stochastic component of the true deposition, in addition to a large bias.

20.
The authors propose graphical and numerical methods for checking the adequacy of the logistic regression model for matched case-control data. Their approach is based on the cumulative sum of residuals over the covariate or linear predictor. Under the assumed model, the cumulative residual process converges weakly to a centered Gaussian limit whose distribution can be approximated via computer simulation. The observed cumulative residual pattern can then be compared both visually and analytically to a certain number of simulated realizations of the approximate limiting process under the null hypothesis. The proposed techniques allow one to check the functional form of each covariate, the logistic link function as well as the overall model adequacy. The authors assess the performance of the proposed methods through simulation studies and illustrate them using data from a cardiovascular study.
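A minimal sketch of the cumulative-residual idea. The paper's setting is matched case-control (conditional) logistic regression and its reference distribution comes from simulating the approximate Gaussian limit; this sketch instead uses ordinary logistic regression and a parametric bootstrap under the fitted model, purely to illustrate how a cumulative sum of residuals over a covariate detects a misspecified functional form. All data and settings are synthetic assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def cumulative_residual_stat(x_check, y, p_hat):
    """Supremum of the cumulative sum of residuals (y - p_hat), ordered by the covariate x_check."""
    order = np.argsort(x_check)
    return np.max(np.abs(np.cumsum((y - p_hat)[order])))

rng = np.random.default_rng(9)
n = 500
X = rng.normal(size=(n, 2))
# True model is quadratic in X[:, 0]; we (mis)fit a logistic model linear in both covariates.
eta = -0.5 + 1.0 * X[:, 0] ** 2 + 0.5 * X[:, 1]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))

model = LogisticRegression(C=1e6).fit(X, y)          # large C: effectively no regularization
p_hat = model.predict_proba(X)[:, 1]
observed = cumulative_residual_stat(X[:, 0], y, p_hat)

# Parametric-bootstrap reference distribution under the fitted (null) model.
null_stats = []
for _ in range(200):
    y_star = rng.binomial(1, p_hat)
    p_star = LogisticRegression(C=1e6).fit(X, y_star).predict_proba(X)[:, 1]
    null_stats.append(cumulative_residual_stat(X[:, 0], y_star, p_star))

print("approximate p-value:", np.mean(np.array(null_stats) >= observed))
```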
