Similar literature
 Found 20 similar documents (search time: 343 ms)
1.
Summary. Posterior distributions for the joint projections of future temperature and precipitation trends and changes are derived by applying a Bayesian hierarchical model to a rich data set of simulated climate from general circulation models. The simulations that are analysed here constitute the future projections on which the Intergovernmental Panel on Climate Change based its recent summary report on the future of our planet's climate, albeit without any sophisticated statistical handling of the data. Here we quantify the uncertainty that is represented by the variable results of the various models and their limited ability to represent the observed climate both at global and at regional scales. We do so in a Bayesian framework, by estimating posterior distributions of the climate change signals in terms of trends or differences between future and current periods, and we fully characterize the uncertain nature of a suite of other parameters, like biases, correlation terms and model-specific precisions. Besides presenting our results in terms of posterior distributions of the climate signals, we offer the posterior predictive distribution of a new model's projections as an alternative representation of the uncertainties in climate change projections. The results from our analysis can find straightforward applications in impact studies, which necessitate not only best guesses but also a full representation of the uncertainty in climate change projections. For water resource and crop models, for example, it is vital to use joint projections of temperature and precipitation to represent the characteristics of future climate best, and our statistical analysis delivers just that.

2.
Abstract. In geophysical and environmental problems, it is common to have multiple variables of interest measured at the same location and time. These multiple variables typically have dependence over space (and/or time). As a consequence, there is a growing interest in developing models for multivariate spatial processes, in particular cross-covariance models. In addition, many current data sets, such as satellite data, cover a large portion of the Earth and therefore require valid covariance models on a globe. We present a class of parametric covariance models for multivariate processes on a globe. The covariance models are flexible in capturing non-stationarity in the data, yet they are computationally feasible and require moderate numbers of parameters. We apply our covariance model to surface temperature and precipitation data from an NCAR climate model output. We compare our model to the multivariate version of the Matérn cross-covariance function and models based on coregionalization and demonstrate the superior performance of our model in terms of AIC (and/or maximum loglikelihood values) and predictive skill. We also present some challenges in modelling the cross-covariance structure of the temperature and precipitation data. Based on the fitted results using full data, we give the estimated cross-correlation structure between the two variables.

3.
Abstract. We study statistical procedures to quantify uncertainty in multivariate climate projections based on several deterministic climate models. We introduce two different assumptions – called constant bias and constant relation respectively – for extrapolating the substantial additive and multiplicative biases present during the control period to the scenario period. There are also strong indications that the biases in the scenario period are different from the extrapolations from the control period. Including such changes in the statistical models leads to an identifiability problem that we solve in a frequentist analysis using a zero-sum side condition and in a Bayesian analysis using informative priors. The Bayesian analysis provides estimates of the uncertainty in the parameter estimates and takes this uncertainty into account for the predictive distributions. We illustrate the method by analysing projections of seasonal temperature and precipitation in the Alpine region from five regional climate models in the PRUDENCE project.

4.
We analyze the multivariate spatial distribution of plant species diversity across three ecologically distinct land uses: urban residential, urban non-residential, and desert. We model these data using a spatial generalized linear mixed model. Here plant species counts are assumed to be correlated within and among the spatial locations. We implement this model across the Phoenix metropolis and surrounding desert. Using a Bayesian approach, we utilize the Langevin–Hastings hybrid algorithm. Under a generalization of a spatial log-Gaussian Cox model, the log-intensities of the species count processes follow Gaussian distributions. The purely spatial components corresponding to these log-intensities are jointly modeled using a cross-convolution approach, in order to depict a valid cross-correlation structure. We observe that this approach yields non-stationarity of the model ensuing from different land use types. We obtain predictions of various measures of plant diversity, including plant richness and the Shannon–Wiener diversity, at observed locations. We also obtain a prediction framework for plant preferences in urban and desert plots.
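
The two diversity measures named in this abstract, species richness and the Shannon index, can be illustrated with a minimal Python sketch. It only computes the summaries from a vector of species counts at one location; it is not the authors' spatial model, and the function names and the example counts are hypothetical.

```python
import math


def richness(counts):
    """Number of species with at least one observed individual."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over species with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total  # observed proportion of this species
            h -= p * math.log(p)
    return h


# Hypothetical species counts for a single plot (not data from the paper).
plot_counts = [12, 7, 0, 3, 1]
print(richness(plot_counts), round(shannon_index(plot_counts), 3))
```

In a Bayesian analysis of the kind described, such summaries would typically be evaluated on each posterior draw of the counts or intensities, which yields a predictive distribution for the diversity measure rather than a single value.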

5.
We propose a novel Bayesian nonparametric (BNP) model, which is built on a class of species sampling models, for estimating density functions of temporal data. In particular, we introduce species sampling mixture models with temporal dependence. To accommodate temporal dependence, we define dependent species sampling models by modeling random support points and weights through an autoregressive model, and then we construct the mixture models based on the collection of these dependent species sampling models. We propose an algorithm to generate posterior samples and present simulation studies to compare the performance of the proposed models with competitors that are based on Dirichlet process mixture models. We apply our method to the estimation of densities for apartment prices in Seoul, the closing price of the Korea Composite Stock Price Index (KOSPI), and climate variables (daily maximum temperature and precipitation) around the Korean peninsula.

6.
The paper by Battaglia and Protopapas (Stat Method Appl 2012) is stimulating. It gives an elegant mathematical generalization of autoregressive models (the nine types). It explains state-of-the-art model fitting techniques (genetic algorithm combined with fitness function and least squares). It is written in a fluent and authoritative manner. Important for having a wider impact: it is accessible to non-statisticians. Finally, it has interesting results on the temperature evolution over the instrumental period (roughly the past 200 years). These merits make this paper an important contribution to applied statistics as well as climatology. As a climate researcher, coming from Physics and having had an affiliation with a statistical institute only as postdoc, I re-analyse here three data series with the aim of providing motivation for model selection and interpreting the results from the climatological perspective.

7.
Forecasting of future snow depths is useful for many applications, such as road safety, winter sport activities, avalanche risk assessment and hydrology. Motivated by the lack of statistical forecast models for snow depth, in this paper we present a set of models to fill this gap. First, we present a model for short-term forecasts under the assumption that reliable weather forecasts of air temperature and precipitation are available. The covariates are included nonlinearly in the model following basic physical principles of snowfall, snow aging and melting. Because of the large number of observations with snow depth equal to zero, we use a zero-inflated gamma regression model, which is commonly used in similar applications such as precipitation. We also produce long-term forecasts of snow depth, over horizons much longer than those of traditional weather forecasts for temperature and precipitation. The long-term forecasts are based on fitting models to historic time series of precipitation, temperature and snow depth. We fit the models to data from six locations in Norway with different climatic and vegetation properties. Forecasting five days into the future, the results showed that, given reliable weather forecasts of temperature and precipitation, the absolute forecast errors were between 3 and 7 cm for different locations in Norway. Forecasting three weeks into the future, the forecast errors were between 7 and 16 cm.

8.
A dynamic coupled modelling approach is investigated to take temperature into account in individual energy consumption forecasting. The objective is both to avoid the inherent complexity of exhaustive SARIMAX models and to take advantage of the usual linear relation between energy consumption and temperature for thermosensitive customers. We first recall some issues related to forecasting individual load curves. Then, we propose a dynamic coupled model that takes temperature into account as an exogenous contribution, study its properties, and apply it to the intraday prediction of energy consumption. Finally, these theoretical results are illustrated on a real individual load curve. The authors discuss the relevance of such an approach and anticipate that it could form a substantial alternative to the commonly used methods for energy consumption forecasting of individual customers.

9.
Abstract. Genetic data are frequently categorical and have complex dependence structures that are not always well understood. For this reason, clustering and classification based on genetic data, while highly relevant, are challenging statistical problems. Here we consider a versatile U-statistics-based approach for non-parametric clustering that allows for an unconventional way of solving these problems. In this paper we propose a statistical test to assess group homogeneity, taking into account multiple testing issues, and a clustering algorithm based on dissimilarities within and between groups that greatly speeds up the homogeneity test. We also propose a test to verify the significance of classifying a sample into one of two groups. We present Monte Carlo simulations that evaluate the size and power of the proposed tests under different scenarios. Finally, the methodology is applied to three different genetic data sets: global human genetic diversity, breast tumour gene expression and Dengue virus serotypes. These applications showcase this statistical framework's ability to answer diverse biological questions in the high dimension low sample size scenario while adapting to the specificities of the different data types.

10.
Birnbaum–Saunders (BS) models are receiving considerable attention in the literature. Multivariate regression models are a useful tool in multivariate analysis because they take into account the correlation between variables. Diagnostic analysis is an important aspect to be considered in statistical modeling. In this paper, we formulate multivariate generalized BS regression models and carry out a diagnostic analysis for these models. We consider the Mahalanobis distance as a global influence measure to detect multivariate outliers and use it for evaluating the adequacy of the distributional assumption. We also consider the local influence approach and study how a perturbation may affect the estimation of model parameters. We implement the obtained results in the R software and illustrate them with real-world multivariate data to show their potential applications.

11.
Differences between plant varieties are based on phenotypic observations, which are both space- and time-consuming. Moreover, the phenotypic data result from the combined effects of genotype and environment. In contrast, molecular data are easier to obtain and give direct access to the genotype. In order to save experimental trials and to concentrate efforts on the relevant comparisons between varieties, the relationship between phenotypic and genetic distances is studied. It appears that the classical genetic distances based on molecular data are not appropriate for predicting phenotypic distances. In the linear model framework, we define a new pseudo genetic distance, which is a prediction of the phenotypic one. The distribution of this distance given the pseudo genetic distance is established. Statistical properties of the predicted distance are derived when the parameters of the model are either given or estimated. We finally apply these results to distinguishing between 144 maize lines. This case study is very satisfactory because the use of anonymous molecular markers (RFLP) leads to saving 29% of the trials with an acceptable error risk. These results need to be confirmed on other varieties and species and would certainly be improved by using genes coding for phenotypic traits.

12.
In developed countries the effects of climate on health status are mainly due to temperature. Our analysis aims to examine statistically the relationship between summer climate conditions and the daily frequency of health episodes: deaths or hospital admissions. We expect to find a U-shaped relationship between temperature and the frequency of events occurring in summer among the elderly population resident in Milano and Brescia. We use as covariates hourly temperature records from observation sites located in Milano and Brescia. The analysis is performed using Generalized Additive Models (GAM), where the response variable is the daily number of events, which varies as a possibly non-linear function of meteorological variables measured on the same or previous day. We consider separate models for Milano and Brescia and then compare temperature effects between the two towns and among different age classes. Moreover, we consider separate models for all diagnosed events, for those due to respiratory disease and for those due to circulatory pathologies. Model selection is a central problem. The basic methods used are the UBRE and GCV criteria but, instead of conditioning all final conclusions on the best model according to the chosen criterion, we investigate the effect of model selection by implementing a bootstrap procedure.

13.
Assisting fund investors in making better investment decisions when faced with market climate change is an important subject. For this purpose, we adopt a genetic algorithm (GA) to search for an optimal decay factor for an exponentially weighted moving average model, which is used to calculate the value at risk combined with risk-adjusted return on capital (RAROC). We then propose a GA-based RAROC model. Next, using the model we find the optimal decay factor and investigate the performance and persistence of 31 Taiwanese open-end equity mutual funds over the period from November 2006 to October 2009, divided into three periods: November 2006–October 2007, November 2007–October 2008, and November 2008–October 2009, which includes the global financial crisis. We find that for the three periods, the optimal decay factors are 0.999, 0.951, and 0.990, respectively. The rankings of funds between bull and bear markets are quite different. Moreover, the proposed model improves performance persistence. That is, a fund's past performance will continue into the future.
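
The role of the decay factor in this abstract can be sketched in Python. The recursion below is the standard exponentially weighted variance update, and the value at risk assumes zero-mean, normally distributed returns; both are assumptions made for illustration, not details confirmed by the abstract. The genetic algorithm search and the RAROC step are not reproduced, and the function names and example returns are hypothetical.

```python
import math
from statistics import NormalDist


def ewma_variance(returns, decay):
    """Variance forecast from sigma2_t = decay * sigma2_{t-1} + (1 - decay) * r_{t-1}**2.

    The recursion is seeded with the first squared return (an assumption).
    """
    if not 0.0 < decay < 1.0:
        raise ValueError("decay must lie strictly between 0 and 1")
    if not returns:
        raise ValueError("returns must not be empty")
    var = returns[0] ** 2
    for r in returns[1:]:
        var = decay * var + (1.0 - decay) * r * r
    return var


def ewma_value_at_risk(returns, decay, confidence=0.99):
    """One-period value at risk as a positive loss fraction, assuming normal returns."""
    z = NormalDist().inv_cdf(confidence)
    return z * math.sqrt(ewma_variance(returns, decay))


# Hypothetical daily returns; 0.951 is one of the decay factors reported above.
daily_returns = [0.004, -0.012, 0.007, -0.021, 0.015]
print(ewma_value_at_risk(daily_returns, decay=0.951))
```

A decay factor close to 1 gives old observations a long memory, while a smaller value makes the variance estimate react faster to recent shocks, which is consistent with a lower optimal value being reported for the crisis period.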

14.
Summary. Traditional studies of school differences in educational achievement use multilevel modelling techniques to take into account the nesting of pupils within schools. However, educational data are known to have more complex non-hierarchical structures. The potential importance of such structures is apparent when considering the effect of pupil mobility during secondary schooling on educational achievement. Movements of pupils between schools suggest that we should model pupils as belonging to the series of schools that are attended and not just their final school. Since these school moves are strongly linked to residential moves, it is important to explore additionally whether achievement is also affected by the history of neighbourhoods that are lived in. Using the national pupil database, this paper combines multiple membership and cross-classified multilevel models to explore simultaneously the relationships between secondary school, primary school, neighbourhood and educational achievement. The results show a negative relationship between pupil mobility and achievement, the strength of which depends greatly on the nature and timing of these moves. Accounting for pupil mobility also reveals that schools and neighbourhoods are more important than shown by previous analysis. A strong primary school effect appears to last long after a child has left that phase of schooling. The additional effect of neighbourhoods, in contrast, is small. Crucially, the rank order of school effects across all types of pupil is sensitive to whether we account for the complexity of the multilevel data structure.

15.
The computational demand required to perform inference using Markov chain Monte Carlo methods often obstructs a Bayesian analysis. This may be a result of large datasets, complex dependence structures, or expensive computer models. In these instances, the posterior distribution is replaced by a computationally tractable approximation, and inference is based on this working model. However, the error that is introduced by this practice is not well studied. In this paper, we propose a methodology that allows one to examine the impact on statistical inference by quantifying the discrepancy between the intractable and working posterior distributions. This work provides a structure to analyse model approximations with regard to the reliability of inference and computational efficiency. We illustrate our approach through a spatial analysis of yearly total precipitation anomalies where covariance tapering approximations are used to alleviate the computational demand associated with inverting a large, dense covariance matrix.

16.
Measuring the accuracy of diagnostic tests is crucial in many application areas including medicine and health care. Good methods for determining diagnostic accuracy provide useful guidance on selection of patient treatment, and the ability to compare different diagnostic tests has a direct impact on quality of care. In this paper Nonparametric Predictive Inference (NPI) methods for accuracy of diagnostic tests with continuous test results are presented and discussed. For such tests, Receiver Operating Characteristic (ROC) curves have become popular tools for describing performance. We present the NPI approach to ROC curves, and some important summaries of these curves. As NPI does not aim at inference for an entire population but instead explicitly considers a future observation, it provides an attractive alternative to standard methods. We show how NPI can be used to compare two continuous diagnostic tests.
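
For orientation, the standard empirical ROC curve and its area for a continuous test can be sketched in Python as follows. This is the conventional estimator that the abstract contrasts with, not the NPI approach itself, which works with a future observation and is not reproduced here. The function names and the example scores are hypothetical, and higher scores are assumed to indicate disease.

```python
def empirical_auc(diseased, healthy):
    """Estimate P(diseased score > healthy score), counting ties as one half."""
    if not diseased or not healthy:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))


def roc_points(diseased, healthy):
    """(false positive rate, true positive rate) pairs, calling 'score >= t' positive."""
    thresholds = sorted(set(diseased) | set(healthy), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(d >= t for d in diseased) / len(diseased)
        fpr = sum(h >= t for h in healthy) / len(healthy)
        points.append((fpr, tpr))
    return points


# Hypothetical test results for two small groups.
diseased_scores = [2.9, 3.4, 4.1, 5.0]
healthy_scores = [1.2, 2.0, 2.9, 3.1]
print(empirical_auc(diseased_scores, healthy_scores))
print(roc_points(diseased_scores, healthy_scores))
```

Comparing two diagnostic tests with this estimator amounts to computing the area for each test on the same subjects; the predictive approach described in the abstract would instead report bounds for the next diseased and healthy observations.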

17.
A recently proposed model for describing the distribution of income over a population, based on the Burr distribution, has been shown to fit better than the commonly used lognormal or gamma distributions. The current article extends that analysis by deriving the large-sample properties of the maximum likelihood estimates for this three-parameter model. Consequently, resulting confidence intervals for some measures of income inequality (including the Gini index) are used to further test the model's validity, as well as to examine apparent trends in inequality over time. Since these properties depend on the way the income data are grouped and censored, implications for choosing data-report intervals can be analyzed. Specifically, a choice between two common methods of reporting the data is shown to have an important impact on Gini index estimates.
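
As a point of reference for the Gini index mentioned here, the following Python sketch computes the plain sample Gini coefficient from ungrouped incomes. It is not the Burr-model estimate or the grouped-data procedure of the article; the function name and the example incomes are hypothetical.

```python
def sample_gini(incomes):
    """Sample Gini coefficient: sum((2i - n - 1) * x_(i)) / (n * sum(x)), x sorted ascending."""
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total <= 0:
        raise ValueError("incomes must be non-empty with a positive total")
    if values[0] < 0:
        raise ValueError("incomes must be non-negative")
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(values, start=1))
    return weighted / (n * total)


# Hypothetical incomes in arbitrary units.
print(sample_gini([12, 18, 25, 40, 105]))
```

The estimator returns 0 when all incomes are equal and (n - 1) / n when one person holds everything, so values from small samples are bounded slightly below 1.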

18.
In family-based longitudinal genetic studies, investigators collect repeated measurements on a trait that changes with time along with genetic markers. Since repeated measurements are nested within subjects and subjects are nested within families, both the subject-level and measurement-level correlations must be taken into account in the statistical analysis to achieve more accurate estimation. In such studies, the primary interests include testing for a quantitative trait locus (QTL) effect and estimating the age-specific QTL effect and the residual polygenic heritability function. We propose flexible semiparametric models along with their statistical estimation and hypothesis testing procedures for longitudinal genetic designs. We employ penalized splines to estimate nonparametric functions in the models. We find that misspecifying the baseline function or the genetic effect function in a parametric analysis may lead to a substantially inflated or highly conservative type I error rate in testing and a large mean squared error in estimation. We apply the proposed approaches to examine age-specific effects of genetic variants reported in a recent genome-wide association study of blood pressure collected in the Framingham Heart Study.

19.
Abstract. Imagine we have two different samples and are interested in doing semi- or non-parametric regression analysis in each of them, possibly on the same model. In this paper, we consider the problem of testing whether a specific covariate has different impacts on the regression curve in these two samples. We compare the regression curves of different samples but are interested in specific differences instead of testing for equality of the whole regression function. Our procedure allows for random designs, different sample sizes, different variance functions, different sets of regressors with different impact functions, etc. As we use the marginal integration approach, this method can be applied to any strong, weak or latent separable model as well as to additive interaction models to compare the lower dimensional separable components between the different samples. Thus, in the case of separable models, our procedure includes the possibility of comparing the whole regression curves, thereby avoiding the curse of dimensionality. It is shown that the bootstrap fails in theory and practice. Therefore, we propose a subsampling procedure with automatic choice of subsample size. We present a complete asymptotic theory and an extensive simulation study.

20.
Most growth curves can only be used to model tumor growth under no intervention. To model the growth curves for treated tumors, both the growth delay due to the treatment and the regrowth of the tumor after the treatment need to be taken into account. In this paper, we consider two tumor regrowth models and determine the locally D- and c-optimal designs for these models. We then show that the locally D- and c-optimal designs are minimally supported. We also consider two equally spaced designs as alternative designs and evaluate their efficiencies.

