Similar Documents
20 similar records found
1.
Summary.  We consider a Bayesian forecasting system to predict the dispersal of contamination on a large-scale grid in the event of an accidental release of radioactivity. The statistical model is built on a physical model for atmospheric dispersion and transport called MATCH. Our spatiotemporal model is a dynamic linear model where the state parameters are the (essentially deterministic) predictions of MATCH; the distributions of these are updated sequentially in the light of monitoring data. One of the distinguishing features of the model is that the number of these parameters is very large (typically several hundreds of thousands), and we discuss practical issues arising in its implementation as a real-time model. Our procedures have been checked against a variational approach which is used widely in the atmospheric sciences. The results of the model are applied to test data from a tracer experiment.
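A minimal sketch of the sequential updating step such a dynamic linear model performs (a generic Kalman filter update in Python; the dimensions, noise levels, and the use of random numbers as stand-ins for MATCH predictions are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def kalman_update(m, C, y, F, V):
    """One sequential update of a dynamic linear model.

    m, C : prior mean and covariance of the state (e.g. MATCH predictions)
    y    : monitoring observations; F : observation matrix; V : obs. noise cov.
    """
    S = F @ C @ F.T + V                 # forecast covariance of y
    K = np.linalg.solve(S, F @ C).T     # Kalman gain, K = C F' S^{-1}
    m_post = m + K @ (y - F @ m)        # posterior mean
    C_post = C - K @ F @ C              # posterior covariance
    return m_post, C_post

# Toy example: 5 grid cells, 2 monitoring stations.
rng = np.random.default_rng(0)
m = rng.normal(size=5)                  # deterministic predictions as prior mean
C = np.eye(5)
F = rng.normal(size=(2, 5))
y = F @ m + rng.normal(scale=0.1, size=2)
m_post, C_post = kalman_update(m, C, y, F, 0.01 * np.eye(2))
```

With several hundreds of thousands of state parameters, forming C explicitly is infeasible; the sketch conveys only the update equations, not a scalable implementation.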

2.
The Peña–Box model is a type of dynamic factor model whose factors try to capture the time-effect movements of a multiple time series. The Peña–Box model can be expressed as a vector autoregressive (VAR) model with constraints. This article derives the maximum likelihood estimates and the likelihood ratio test of the VAR model for Gaussian processes. Then a test statistic constructed by canonical correlation coefficients is presented and adjusted for conditional heteroscedasticity. Simulations confirm the validity of adjustments for conditional heteroscedasticity, and show that the proposed statistics perform better than the statistics used in the existing literature.
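A sketch of the canonical-correlation ingredient of such a test statistic, computed between a multiple time series and its first lag (a generic QR/SVD construction on simulated data; the article's adjustment for conditional heteroscedasticity is not reproduced here):

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlation coefficients between the columns of X and Y."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    qx, _ = np.linalg.qr(X)
    qy, _ = np.linalg.qr(Y)
    return np.clip(np.linalg.svd(qx.T @ qy, compute_uv=False), 0.0, 1.0)

# Toy VAR(1) series; canonical correlations of y_t with y_{t-1}.
rng = np.random.default_rng(1)
A = np.array([[0.8, 0.1], [0.0, 0.3]])
y = np.zeros((500, 2))
for t in range(1, 500):
    y[t] = A @ y[t - 1] + rng.normal(size=2)
print(canonical_correlations(y[1:], y[:-1]))  # large values: strong time-effect structure
```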

3.
General mixed linear models for experiments conducted over a series of sites and/or years are described. The ordinary least squares (OLS) estimator is simple to compute, but is not the best unbiased estimator. Also, the usual formula for the variance of the OLS estimator is not correct and seriously underestimates the true variance. The best linear unbiased estimator is the generalized least squares (GLS) estimator. However, it requires an inversion of the variance-covariance matrix V, which is usually of large dimension. Also, in practice, V is unknown.

We presented an estimator V̂ of the matrix V using the estimators of variance components [for sites, blocks (sites), etc.]. We also presented a simple transformation of the data, such that an ordinary least squares regression of the transformed data gives the estimated generalized least squares (EGLS) estimator; a numerical sketch of this transformation appears after this abstract. The standard errors obtained from the transformed regression serve as asymptotic standard errors of the EGLS estimators. We also established that the EGLS estimator is unbiased.

An example of fitting a linear model to data for 18 sites (environments) located in Brazil is given. One of the site variables (soil test phosphorus) was measured by plot rather than by site, and this established the need for a covariance model such as the one used rather than the usual analysis of variance model. It is for this variable that the parameter estimates from the OLS and EGLS analyses did not correspond well. Regression statistics and the analysis of variance for the example are presented and summarized.
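A numerical sketch of the transformation idea (whitening by the Cholesky factor of V̂ so that OLS on the transformed data returns the (E)GLS estimate; for illustration the variance components, and hence V, are taken as known):

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

rng = np.random.default_rng(2)
n_sites, n_per = 6, 4
n = n_sites * n_per
X = np.column_stack([np.ones(n), rng.normal(size=n)])

# Compound-symmetric V built from site and residual variance components.
site = np.repeat(np.arange(n_sites), n_per)
V = 1.0 * np.eye(n) + 2.0 * (site[:, None] == site[None, :])
y = X @ np.array([1.0, 0.5]) + rng.multivariate_normal(np.zeros(n), V)

# Transform with V = L L': OLS on (L^{-1} X, L^{-1} y) is exactly GLS.
L = cholesky(V, lower=True)
Xt = solve_triangular(L, X, lower=True)
yt = solve_triangular(L, y, lower=True)
beta_egls, *_ = np.linalg.lstsq(Xt, yt, rcond=None)
print(beta_egls)
```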

4.
This work studies outlier detection and robust estimation with data that are naturally distributed into groups and which follow approximately a linear regression model with fixed group effects. For this, several methods are considered. First, the robust fitting method of Peña and Yohai [A fast procedure for outlier diagnostics in large regression problems. J Am Stat Assoc. 1999;94:434–445], called the principal sensitivity components (PSC) method, is adapted to the grouped data structure and the mentioned model. The robust methods RDL1 of Hubert and Rousseeuw [Robust regression with both continuous and binary regressors. J Stat Plan Inference. 1997;57:153–163] and M-S of Maronna and Yohai [Robust regression with both continuous and categorical predictors. Journal of Statistical Planning and Inference 2000;89:197–214] are also considered. These three methods are compared in terms of their effectiveness in outlier detection and their robustness through simulations, considering several contamination scenarios and growing contamination levels. Results indicate that the adapted PSC procedure is able to detect a high percentage of true outliers and a small number of false outliers. It is appropriate when the contamination is in the error term or in the covariates, also detecting possibly masked high-leverage points. Moreover, in simulations the final robust regression estimator preserved good efficiency under normality while keeping good robustness properties.
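A rough sketch of the sensitivity-matrix idea underlying the PSC method for a single ordinary regression, using the closed-form leave-one-out identity for the change in fitted values; the grouped-data adaptation studied in the paper is more involved, so this shows only the basic ingredient:

```python
import numpy as np

def principal_sensitivity_components(X, y):
    """Principal components of the matrix of leave-one-out sensitivity vectors."""
    H = X @ np.linalg.solve(X.T @ X, X.T)   # hat matrix
    e = y - H @ y                           # OLS residuals
    h = np.diag(H)
    R = H * (e / (1 - h))                   # column i = change in fits if case i deleted
    vals, vecs = np.linalg.eigh(R @ R.T)
    return vecs[:, ::-1], vals[::-1]        # PSCs in decreasing order

rng = np.random.default_rng(3)
X = np.column_stack([np.ones(60), rng.normal(size=60)])
y = X @ np.array([0.0, 1.0]) + rng.normal(scale=0.5, size=60)
y[:5] += 8.0                                # a small cluster of outliers
pscs, _ = principal_sensitivity_components(X, y)
print(np.argsort(-np.abs(pscs[:, 0]))[:5])  # extreme coordinates flag suspects
```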

5.
When thousands of tests are performed simultaneously to detect differentially expressed genes in microarray analysis, the number of Type I errors can be immense if a multiplicity adjustment is not made. However, due to the large scale, traditional adjustment methods require very stringent significance levels for individual tests, which yield low power for detecting alterations. In this work, we describe how two omnibus tests can be used in conjunction with a gene filtration process to circumvent difficulties due to the large scale of testing. These two omnibus tests, the D-test and the modified likelihood ratio test (MLRT), can be used to investigate whether a collection of P-values has arisen from the Uniform(0,1) distribution or whether the Uniform(0,1) distribution contaminated by another Beta distribution is more appropriate. In the former case, attention can be directed to a smaller part of the genome; in the latter event, parameter estimates for the contamination model provide a frame of reference for multiple comparisons. Unlike the likelihood ratio test (LRT), both the D-test and MLRT enjoy simple limiting distributions under the null hypothesis of no contamination, so critical values can be obtained from standard tables. Simulation studies demonstrate that the D-test and MLRT are superior to the AIC, BIC, and Kolmogorov-Smirnov test. A case study illustrates omnibus testing and filtration.
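A sketch of the contamination model these omnibus tests target, with density f(p) = (1 − λ) + λ·Beta(a, b), fitted here by direct maximum likelihood (generic scipy optimization on simulated p-values; this is the modelling setup, not the D-test or MLRT themselves):

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(4)
# 90% null p-values (Uniform) plus 10% from a Beta(0.3, 4) alternative.
p = np.concatenate([rng.uniform(size=900), rng.beta(0.3, 4.0, size=100)])

def negloglik(theta):
    lam, a, b = theta
    return -np.sum(np.log((1 - lam) + lam * stats.beta.pdf(p, a, b)))

res = optimize.minimize(negloglik, x0=[0.1, 0.5, 2.0],
                        bounds=[(1e-4, 0.999), (1e-2, 10), (1e-2, 50)])
# Under Uniform(0,1) the log-likelihood is exactly 0, so twice the fitted
# log-likelihood is the LRT-type statistic that the D-test and MLRT modify.
print(res.x, 2 * -res.fun)
```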

6.
The performance of tests in Aalen's linear regression model is studied using asymptotic power calculations and stochastic simulation. Aalen's original least squares test is compared to two modifications: a weighted least squares test with correct weights and a test where the variance is re-estimated under the null hypothesis. The test with re-estimated variance provides the highest power of the tests for the setting of this paper, and the gain is substantial for covariates following a skewed distribution like the exponential. It is further shown that Aalen's choice for weight function with re-estimated variance is optimal in the one-parameter case against proportional alternatives.

7.
A robust rank-based estimator for variable selection in linear models, with grouped predictors, is studied. The proposed estimation procedure extends the existing rank-based variable selection [Johnson, B.A., and Peng, L. (2008), 'Rank-based Variable Selection', Journal of Nonparametric Statistics, 20(3):241–252] and the WW-SCAD [Wang, L., and Li, R. (2009), 'Weighted Wilcoxon-type Smoothly Clipped Absolute Deviation Method', Biometrics, 65(2):564–571] to linear regression models with grouped variables. The resulting estimator is robust to contamination or deviations in both the response and the design space. The oracle property and asymptotic normality of the estimator are established under some regularity conditions. Simulation studies reveal that the proposed method performs better than the existing rank-based methods [Johnson, B.A., and Peng, L. (2008); Wang, L., and Li, R. (2009)] for grouped-variables models. This estimation procedure also outperforms the adaptive hierarchical lasso [Zhou, N., and Zhu, J. (2010), 'Group Variable Selection Via a Hierarchical Lasso and its Oracle Property', Interface, 3(4):557–574] in the presence of local contamination in the design space or for heavy-tailed error distributions.

8.
In this paper we propose test statistics for a general hypothesis concerning the adequacy of multivariate random-effects covariance structures in a multivariate growth curve model with differing numbers of random effects (Lange, N., N.M. Laird, J. Amer. Statist. Assoc. 84 (1989) 241–247). Since the exact likelihood ratio (LR) statistic for the hypothesis is complicated, it is suggested to use a modified LR statistic. An asymptotic expansion of the null distribution of the statistic is obtained. The exact LR statistic is also discussed.

9.
The aim of this article is to discuss homogeneity testing of the exponential distribution. We introduce the exact likelihood ratio test of homogeneity in the subpopulation model, ELR, and the exact likelihood ratio test of homogeneity against the two-components subpopulation alternative, ELR2. The ELR test is asymptotically optimal in the Bahadur sense when the alternative consists of sampling from a fixed number of components. Thus, in some setups the ELR is superior to frequently used tests for exponential homogeneity which are based on the EM algorithm (like the MLRT, ADDS, and D-tests). One important example of superiority of ELR and ELR2 tests is the case of lower contamination. We demonstrate this fact by both theoretical comparisons and simulations.

10.
Repeating measurements of efficacy variables in clinical trials may be desirable when the measurement may be affected by ambient conditions. When such measurements are repeated at baseline and at the end of therapy, statistical questions relate to: (1) the best summary measurement to use for a subject when there is a possibility that some observations are contaminated and have increased variances; and (2) the effect of screening procedures which exclude outliers based on within- and between-subject contamination tests. We study these issues in two stages, each using a different set of models. The first stage deals only with the choice of the summary measure. The simulation results show that in some cases of contamination, the power achieved by the tests based on the median exceeds that achieved by the tests based on the mean of the replicates. However, even when we use the median, there are cases when contamination leads to a considerable loss in power. The combined issue of the best summary measurement and the effect of screening is studied in the second stage. The tests use either the observed data or the data after screening for outliers. The simulation results demonstrate that the power depends on the screening procedure as well as on the test statistic used in the study. We found that for the extent and magnitude of contamination considered, within-subject screening has a minimal effect on the power of the tests when there are at least three replicates; as a result, we found no advantage in the use of screening procedures for within-subject contamination. On the other hand, the use of a between-subject screening for outliers increases the power of the test procedures. However, even with the use of screening procedures, heterogeneity of variances can greatly reduce the power of the study.
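A small simulation in the spirit of the first stage, comparing tests based on the mean and the median of the replicates under variance contamination (the contamination scheme, effect size, and test are illustrative assumptions):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

def power(summary, n_sub=30, n_rep=3, effect=0.5, contam=0.1, n_sim=2000):
    rejections = 0
    for _ in range(n_sim):
        x = rng.normal(effect, 1.0, size=(n_sub, n_rep))
        mask = rng.uniform(size=(n_sub, n_rep)) < contam
        x[mask] += rng.normal(0.0, 5.0, size=mask.sum())  # inflated-variance replicates
        _, pval = stats.ttest_1samp(summary(x, axis=1), 0.0)
        rejections += pval < 0.05
    return rejections / n_sim

print("mean summary  :", power(np.mean))
print("median summary:", power(np.median))
```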

11.
Compositional time series are multivariate time series which at each time point are proportions that sum to a constant. Accurate inference for such series which occur in several disciplines such as geology, economics and ecology is important in practice. Usual multivariate statistical procedures ignore the inherent constrained nature of these observations as parts of a whole and may lead to inaccurate estimation and prediction. In this article, a regression model with vector autoregressive moving average (VARMA) errors is fit to the compositional time series after an additive log ratio (ALR) transformation. Inference is carried out in a hierarchical Bayesian framework using Markov chain Monte Carlo techniques. The approach is illustrated on compositional time series of mortality events in Los Angeles in order to investigate dependence of different categories of mortality on air quality.
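A minimal sketch of the additive log-ratio step that precedes the VARMA modelling (the transform and its inverse only; the Bayesian regression with VARMA errors would then be fitted to the transformed series):

```python
import numpy as np

def alr(comp):
    """Additive log-ratio: log of the first D-1 parts over the last part."""
    comp = np.asarray(comp, dtype=float)
    return np.log(comp[..., :-1] / comp[..., -1:])

def alr_inverse(z):
    """Map ALR coordinates back to the simplex."""
    expz = np.exp(z)
    denom = 1.0 + expz.sum(axis=-1, keepdims=True)
    return np.concatenate([expz / denom, 1.0 / denom], axis=-1)

# Toy compositional series: three mortality categories summing to one.
x = np.array([[0.50, 0.30, 0.20], [0.45, 0.35, 0.20], [0.40, 0.30, 0.30]])
z = alr(x)                               # unconstrained, suitable for VARMA errors
print(np.allclose(alr_inverse(z), x))    # True: the transform is invertible
```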

12.
We derive a non-parametric test for testing the presence of V(Xᵢ, ξᵢ) in the non-parametric first-order autoregressive model Xᵢ₊₁ = T(Xᵢ) + V(Xᵢ, ξᵢ) + U(Xᵢ)εᵢ₊₁, where the function T(x) is assumed known. The test is constructed as a functional of a basic process for which we establish a weak invariance principle, under the null hypothesis and under stationarity and mixing assumptions. Bounds for the local and non-local powers are provided under a condition which ensures that the power tends to one as the sample size tends to infinity. The testing procedure can be applied, e.g., to bilinear models, ARCH models, EXPAR models and to some other uncommon models. Our results confirm the robustness of the test constructed in Ngatchou Wandji (1995) and in Diebolt & Ngatchou Wandji (1995).

13.
While most of epidemiology is observational, rather than experimental, the culture of epidemiology is still derived from agricultural experiments, rather than other observational fields, such as astronomy or economics. The mismatch is made greater as focus has turned to continuous risk factors, multifactorial outcomes, and outcomes with large variation unexplainable by available risk factors. The analysis of such data is often viewed as hypothesis testing with statistical control replacing randomization. However, such approaches often test restricted forms of the hypothesis being investigated, such as the hypothesis of a linear association, when there is no prior empirical or theoretical reason to believe that if an association exists, it is linear. In combination with the large nonstochastic sources of error in such observational studies, this suggests the more flexible alternative of exploring the association. Conclusions on the possible causal nature of any discovered association will rest on the coherence and consistency of multiple studies. Nonparametric smoothing in general, and generalized additive models in particular, represent an attractive approach to such problems. This is illustrated using data examining the relationship between particulate air pollution and daily mortality in Birmingham, Alabama; between particulate air pollution, ozone, and SO2 and daily hospital admissions for respiratory illness in Philadelphia; and between ozone and particulate air pollution and coughing episodes in children in six eastern U.S. cities. The results indicate that airborne particles and ozone are associated with adverse health outcomes at very low concentrations, and that there are likely no thresholds for these relationships.
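A hedged sketch of the generalized additive modelling approach for a pollution-mortality series (statsmodels' GLMGam with B-spline smooths on simulated data; the variable names, simulated relationships, and spline settings are all assumptions for illustration):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.gam.api import GLMGam, BSplines

rng = np.random.default_rng(6)
n = 365
pm10 = rng.gamma(3.0, 10.0, size=n)                       # daily particulates (toy)
temp = 15 + 10 * np.sin(2 * np.pi * np.arange(n) / 365)   # seasonal temperature
mu = np.exp(3.0 + 0.002 * pm10 + 0.01 * np.cos(temp / 5))
deaths = rng.poisson(mu)                                  # Poisson daily mortality

# Smooth (not pre-specified linear) terms for PM10 and temperature.
bs = BSplines(np.column_stack([pm10, temp]), df=[6, 6], degree=[3, 3])
gam = GLMGam(deaths, exog=np.ones((n, 1)), smoother=bs,
             family=sm.families.Poisson())
res = gam.fit()
print(res.summary())
```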

14.
This paper develops a space-time statistical model for local forecasting of surface-level wind fields in a coastal region with complex topography. The statistical model makes use of output from deterministic numerical weather prediction models which are able to produce forecasts of surface wind fields on a spatial grid. When predicting surface winds at observing stations, errors can arise due to sub-grid scale processes not adequately captured by the numerical weather prediction model, and the statistical model attempts to correct for these influences. In particular, it uses information from observing stations within the study region as well as topographic information to account for local bias. Bayesian methods for inference are used in the model, with computations carried out using Markov chain Monte Carlo algorithms. Empirical performance of the model is described, illustrating that a structured Bayesian approach to complicated space-time models of the type considered in this paper can be readily implemented and can lead to improvements in forecasting over traditional methods.
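A toy version of the core bias-correction idea, regressing station observations on the numerical forecast and a topographic covariate (plain least squares rather than the paper's Bayesian space-time model; all quantities are simulated):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200
nwp_wind = rng.gamma(2.0, 3.0, size=n)      # NWP forecast at station locations
elevation = rng.uniform(0, 800, size=n)     # topographic covariate (metres)
obs = 0.9 * nwp_wind + 0.002 * elevation + rng.normal(0, 0.5, size=n)

# Correction regression: observation ~ forecast + elevation.
X = np.column_stack([np.ones(n), nwp_wind, elevation])
coef, *_ = np.linalg.lstsq(X, obs, rcond=None)
corrected = X @ coef
print("raw RMSE      :", np.sqrt(np.mean((obs - nwp_wind) ** 2)))
print("corrected RMSE:", np.sqrt(np.mean((obs - corrected) ** 2)))
```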

15.
A spatial lattice model for binary data is constructed from two spatial scales linked through conditional probabilities. A coarse grid of lattice locations is specified, and all remaining locations (which we call the background) capture fine-scale spatial dependence. Binary data on the coarse grid are modelled with an autologistic distribution, conditional on the binary process on the background. The background behaviour is captured through a hidden Gaussian process after a logit transformation on its Bernoulli success probabilities. The likelihood is then the product of the (conditional) autologistic probability distribution and the hidden Gaussian–Bernoulli process. The parameters of the new model come from both spatial scales. A series of simulations illustrates the spatial-dependence properties of the model and likelihood-based methods are used to estimate its parameters. Presence–absence data of corn borers in the roots of corn plants are used to illustrate how the model is fitted.

16.
Early phase 2 tuberculosis (TB) trials are conducted to characterize the early bactericidal activity (EBA) of anti-TB drugs. The EBA of anti-TB drugs has conventionally been calculated as the rate of decline in colony forming unit (CFU) count during the first 14 days of treatment. The measurement of CFU count, however, is expensive and prone to contamination. As an alternative to CFU count, time to positivity (TTP), which is a potential biomarker for long-term efficacy of anti-TB drugs, can be used to characterize EBA. The current Bayesian nonlinear mixed-effects (NLME) regression model for TTP data, however, lacks robustness to gross outliers that often are present in the data. The conventional way of handling such outliers involves their identification by visual inspection and subsequent exclusion from the analysis. However, this process can be questioned because of its subjective nature. For this reason, we fitted robust versions of the Bayesian nonlinear mixed-effects regression model to a wide range of TTP datasets. The performance of the explored models was assessed through model comparison statistics and a simulation study. We conclude that fitting a robust model to TTP data obviates the need for explicit identification and subsequent "deletion" of outliers but ensures that gross outliers exert no undue influence on model fits. We recommend that the current practice of fitting conventional normal theory models be abandoned in favor of fitting robust models to TTP data.
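A sketch of what such a robust alternative typically amounts to, replacing the normal error distribution with a heavy-tailed Student-t one in a simple (non-hierarchical, non-Bayesian) nonlinear TTP-versus-time fit; the exponential mean curve, degrees of freedom, and all constants are illustrative assumptions:

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(8)
days = np.repeat(np.arange(15), 2).astype(float)
ttp = 100 * np.exp(0.08 * days) + rng.normal(0, 10, size=len(days))
ttp[5] += 400                                 # one gross outlier

def negloglik(theta, df=4.0):
    a, b, scale = theta
    mu = a * np.exp(b * days)                 # TTP rises as bacterial load falls
    return -np.sum(stats.t.logpdf(ttp, df=df, loc=mu, scale=abs(scale)))

res = optimize.minimize(negloglik, x0=[80.0, 0.05, 20.0], method="Nelder-Mead")
print("EBA-type slope estimate:", res.x[1])   # barely moved by the outlier
```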

17.
Data is rapidly increasing in volume and velocity, and the Internet of Things (IoT) is one important source of this data. The IoT is a collection of connected devices (things) which constantly record data from their surroundings using on-board sensors. These devices can record and stream data to the cloud at a very high rate, leading to high storage and analysis costs. In order to ameliorate these costs, the data is modelled as a stream and analysed online to learn about the underlying process, perform interpolation and smoothing and make forecasts and predictions. Conventional state space modelling tools assume the observations occur on a fixed regular time grid. However, many sensors change their sampling frequency, sometimes adaptively, or get interrupted and re-started out of sync with the previous sampling grid, or just generate event data at irregular times. It is therefore desirable to model the system as a partially and irregularly observed Markov process which evolves in continuous time. Both the process and the observation model are potentially non-linear. Particle filters therefore represent the simplest approach to online analysis. A functional Scala library of composable continuous time Markov process models has been developed in order to model the wide variety of data captured in the IoT.
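A minimal bootstrap particle filter over irregularly spaced observation times (in Python rather than the Scala library described, with an Ornstein-Uhlenbeck latent state as an assumed example of a continuous-time Markov process):

```python
import numpy as np

rng = np.random.default_rng(9)

def ou_propagate(x, dt, theta=1.0, mu=0.0, sigma=0.7):
    """Exact Ornstein-Uhlenbeck transition over a gap of length dt."""
    mean = mu + (x - mu) * np.exp(-theta * dt)
    var = sigma**2 * (1 - np.exp(-2 * theta * dt)) / (2 * theta)
    return mean + np.sqrt(var) * rng.normal(size=x.shape)

def particle_filter(times, obs, n_particles=1000, obs_sd=0.3):
    x = rng.normal(size=n_particles)                # particles at time zero
    t_prev, means = 0.0, []
    for t, y in zip(times, obs):
        x = ou_propagate(x, t - t_prev)             # handles irregular gaps
        w = np.exp(-0.5 * ((y - x) / obs_sd) ** 2)  # Gaussian observation weight
        w /= w.sum()
        x = x[rng.choice(n_particles, size=n_particles, p=w)]  # resample
        means.append(x.mean())
        t_prev = t
    return np.array(means)

# Irregular sampling grid, as from a sensor with varying frequency.
times = np.sort(rng.uniform(0, 10, size=40))
latent = np.cumsum(rng.normal(scale=0.3, size=40))  # stand-in latent signal
obs = latent + rng.normal(scale=0.3, size=40)
print(particle_filter(times, obs)[:5])
```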

18.
In many situations in which a variable is measured at locations in time or space the observed data can be regarded as incomplete, the missing data sites perhaps completing a regular pattern such as a rectangular grid. In this paper general methods not dependent on the sequential nature of time are considered for estimating the parameters of Gaussian processes. An example is given.
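A sketch of likelihood-based estimation from incomplete lattice data, evaluating the Gaussian likelihood at the observed sites only (a one-dimensional grid with an exponential covariance is an assumed example):

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(10)
grid = np.arange(30, dtype=float)
cov_true = np.exp(-np.abs(grid[:, None] - grid[None, :]) / 3.0)
z = rng.multivariate_normal(np.zeros(30), cov_true)
observed = rng.uniform(size=30) > 0.3          # roughly 30% of sites missing
zo, go = z[observed], grid[observed]

def negloglik(log_range):
    K = np.exp(-np.abs(go[:, None] - go[None, :]) / np.exp(log_range))
    return -stats.multivariate_normal(np.zeros(len(zo)), K,
                                      allow_singular=True).logpdf(zo)

res = optimize.minimize_scalar(negloglik, bounds=(-2, 3), method="bounded")
print("estimated range parameter:", np.exp(res.x))
```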

19.
Fisher's exact test for two-by-two contingency tables has repeatedly been criticized as being too conservative. These criticisms arise most frequently in the context of a planned experiment for which the numbers of successes in each of two experimental groups are assumed to be binomially distributed. It is argued here that the binomial model is often unrealistic, and that the departures from the binomial assumptions reduce the conservatism in Fisher's exact test. Further discussion supports a recent claim of Barnard (1989) that the residual conservatism is attributable, not to any additional information used by the competing method, but to the discrete nature of the test, and can be drastically reduced through the use of Lancaster's mid-p-value. The binomial model is not recommended in that it depends on extra, questionable assumptions.
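A short illustration of Lancaster's mid-p correction for a 2×2 table (one-sided hypergeometric computation via scipy; the example counts are arbitrary):

```python
from scipy import stats

def fisher_and_midp(table):
    """One-sided Fisher exact p and Lancaster's mid-p for a 2x2 table."""
    (a, b), (c, d) = table
    M, n, N = a + b + c + d, a + b, a + c   # total, row-1 total, column-1 total
    hg = stats.hypergeom(M, n, N)
    p_exact = hg.sf(a - 1)                  # P(X >= a)
    p_mid = hg.sf(a) + 0.5 * hg.pmf(a)      # half-weight on the observed table
    return p_exact, p_mid

p_exact, p_mid = fisher_and_midp([[7, 2], [3, 8]])
print(f"Fisher one-sided p = {p_exact:.4f}, mid-p = {p_mid:.4f}")
```

The mid-p is always smaller than the exact p, which is precisely how it reduces the conservatism caused by the discreteness of the test.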

20.
Classical techniques for modeling numerical data associated with a regular grid have been widely developed in the literature. When a trigonometric model for the data is considered, it is possible to use the corresponding least squares (classical) estimators, but when the data are not observed on a regular grid, these estimators do not retain appropriate properties. In this article we propose a novel way to model data that are not observed on a regular grid, and we establish a practical criterion, based on the mean squared error (MSE), to objectively decide which estimator should be used in each case: the inappropriate classical estimator or the new unbiased estimator, which has greater variance. Jackknife and cross-validation techniques are used to follow a similar criterion in practice, when the MSE is not known. Finally, we present an application of the methodology to univariate and bivariate data.
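A sketch of the contrast at issue, fitting a single known frequency on an irregular design (the classical formulas rely on orthogonality that holds only on a regular grid, while plain least squares on the actual design is unbiased; the frequency and design are assumptions, and the article's jackknife/cross-validation MSE criterion would wrap fits like these):

```python
import numpy as np

rng = np.random.default_rng(11)
omega = 2 * np.pi * 3                        # known frequency: 3 cycles on [0, 1]

def estimates(t, y):
    # Classical (regular-grid) Fourier formulas, biased off a regular grid:
    a_cl = 2 * np.mean(y * np.cos(omega * t))
    b_cl = 2 * np.mean(y * np.sin(omega * t))
    # Least squares on the actual design, unbiased:
    X = np.column_stack([np.cos(omega * t), np.sin(omega * t)])
    (a_ls, b_ls), *_ = np.linalg.lstsq(X, y, rcond=None)
    return (a_cl, b_cl), (a_ls, b_ls)

t = np.sort(rng.uniform(0, 1, size=200))     # irregular sampling times
y = 1.5 * np.cos(omega * t) - 0.8 * np.sin(omega * t) \
    + rng.normal(scale=0.5, size=200)
print(estimates(t, y))
```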
