Similar Articles
20 similar articles retrieved.
1.
Spatially correlated data appear in many environmental studies, and consequently there is an increasing demand for estimation methods that take account of spatial correlation and thereby improve the accuracy of estimation. In this paper we propose an iterative nonparametric procedure for modelling spatial data with general correlation structures. The asymptotic normality of the proposed estimators is established under mild conditions. We demonstrate, using both simulation and case studies, that the proposed estimators are more efficient than traditional local linear methods, which fail to account for spatial correlation.
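For reference, the baseline that the abstract's iterative procedure improves upon can be sketched as a plain kernel-weighted local linear smoother. The Gaussian kernel, bandwidth and simulated data below are illustrative assumptions; the paper's correlation-aware iteration is not reproduced here.

```python
# Minimal local linear (kernel-weighted) smoother -- the baseline estimator that the
# proposed iterative procedure refines by re-weighting with an estimated spatial
# correlation structure.  Kernel, bandwidth h and data are assumptions.
import numpy as np

def local_linear(x0, x, y, h=0.5):
    """Local linear estimate of E[Y | X = x0] with a Gaussian kernel."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)          # kernel weights
    X = np.column_stack([np.ones_like(x), x - x0])  # local design matrix
    W = np.diag(w)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return beta[0]                                   # intercept = fitted value at x0

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 10, 200))
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)
fit = np.array([local_linear(x0, x, y) for x0 in x])
```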

2.
We propose two distance-based methods and two likelihood-based methods for inversely regressing a linear predictor on a circular variable, and for inversely regressing a circular predictor on a linear variable. An asymptotic result on least circular distance estimators is provided in the Appendix. We present likelihood-based methods for symmetrical and asymmetrical errors in each situation. The utility of our methodology in each situation is illustrated by applying it to real data sets in engineering and environmental science. We then compare their performances using cross-validation.
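A minimal sketch of the least-circular-distance idea, assuming a tangent link mu(x) = mu0 + 2*arctan(b*x) and simulated data (neither taken from the paper): the fit minimises the circular distance sum(1 - cos(theta - mu(x))).

```python
# Least circular distance fit of a circular response theta on a linear predictor x.
# The tangent link and the simulated data are illustrative assumptions, not the
# specific models proposed in the paper.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, 150)
theta = (0.5 + 2 * np.arctan(1.2 * x) + rng.normal(scale=0.2, size=x.size)) % (2 * np.pi)

def circ_loss(params):
    mu0, b = params
    mu = mu0 + 2 * np.arctan(b * x)
    return np.sum(1 - np.cos(theta - mu))    # circular distance criterion

fit = minimize(circ_loss, x0=[0.0, 1.0], method="Nelder-Mead")
print(fit.x)                                 # estimated (mu0, b)
```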

3.
Clustered multinomial data with random cluster sizes commonly appear in health, environmental and ecological studies. Traditional approaches for analyzing clustered multinomial data rest on two assumptions: that cluster sizes are fixed, and that cluster sizes are positive. Randomness of the cluster sizes may be the determinant of the within-cluster correlation and between-cluster variation. We propose a baseline-category mixed model for clustered multinomial data with random cluster sizes based on Poisson mixed models. Our orthodox best linear unbiased predictor approach to this model depends only on the moment structure of unobserved distribution-free random effects, and it consolidates the marginal and conditional modeling interpretations. Unlike the traditional methods, our approach can accommodate both random and zero cluster sizes. Two real-life multinomial data examples, crime data and food contamination data, are used to illustrate the proposed methodology.

4.
Standard approaches for modelling dependence within joint tail regions are based on extreme value methods which assume max-stability, a particular form of joint tail dependence. We develop joint tail models based on a broader class of dependence structure which provides a natural link between max-stable models and weaker forms of dependence including independence and negative association. This approach overcomes many of the problems that are encountered with standard methods and is the basis for a Poisson process representation that generalizes existing bivariate results. We apply the new techniques to simulated and environmental data, and demonstrate the marked advantage that the new approach offers for joint tail extrapolation.

5.
In this paper we discuss constructing confidence intervals based on asymptotic generalized pivotal quantities (AGPQs). An AGPQ associates a distribution with the corresponding parameter, and then an asymptotically correct confidence interval can be derived directly from this distribution like Bayesian or fiducial interval estimates. We provide two general procedures for constructing AGPQs. We also present several examples to show that AGPQs can yield new confidence intervals with better finite-sample behaviors than traditional methods.
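To illustrate the pivotal-quantity idea behind AGPQs, here is a standard (exact, not asymptotic) generalized-pivotal-quantity interval for the mean of a lognormal distribution, exp(mu + sigma^2/2). The example and the simulated data are assumptions for illustration; the paper's two AGPQ constructions are more general.

```python
# Generalized pivotal quantity (GPQ) confidence interval for a lognormal mean,
# built by Monte Carlo from the pivotals Z ~ N(0,1) and U ~ chi^2_{n-1}.
import numpy as np

rng = np.random.default_rng(2)
y = np.log(rng.lognormal(mean=1.0, sigma=0.5, size=30))   # work on the log scale
n, ybar, s2 = y.size, y.mean(), y.var(ddof=1)

B = 100_000
Z = rng.standard_normal(B)
U = rng.chisquare(n - 1, B)
R_sigma2 = (n - 1) * s2 / U                 # GPQ for sigma^2
R_mu = ybar - Z * np.sqrt(R_sigma2 / n)     # GPQ for mu
R = np.exp(R_mu + R_sigma2 / 2)             # GPQ for the lognormal mean
ci = np.percentile(R, [2.5, 97.5])          # 95% generalized confidence interval
print(ci)
```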

6.
Statistical agencies that own different databases on overlapping subjects can benefit greatly from combining their data. These benefits are passed on to secondary data analysts when the combined data are disseminated to the public. Sometimes combining data across agencies or sharing these data with the public is not possible: one or both of these actions may break promises of confidentiality that have been given to data subjects. We describe an approach that is based on two stages of multiple imputation that facilitates data sharing and dissemination under restrictions of confidentiality. We present new inferential methods that properly account for the uncertainty that is caused by the two stages of imputation. We illustrate the approach by using artificial and genuine data.

7.
Method effects often occur when different methods are used for measuring the same construct. We present a new approach for modelling this kind of phenomenon, consisting of a definition of method effects and a first model, the method effect model, that can be used for data analysis. This model may be applied to multitrait–multimethod data or to longitudinal data where the same construct is measured with at least two methods at all occasions. In this new approach, the definition of the method effects is based on the theory of individual causal effects by Neyman and Rubin. Method effects are accordingly conceptualized as the individual effects of applying measurement method j instead of k. They are modelled as latent difference scores in structural equation models. A reference method needs to be chosen against which all other methods are compared; the model fit is invariant to the choice of the reference method. The model allows the estimation of the average of the individual method effects, their variance, their correlation with the traits (and other latent variables) and the correlation of different method effects among each other. Furthermore, since the definition of the method effects is in line with the theory of causality, the method effects may (under certain conditions) be interpreted as causal effects of the method. The method effect model is compared with traditional multitrait–multimethod models. An example illustrates the application of the model to longitudinal data, analysing the effect of negatively formulated items (such as 'feel bad') as compared with positively formulated items (such as 'feel good') measuring mood states.

8.
Maximum Likelihood Estimations and EM Algorithms with Length-biased Data
Length-biased sampling is well recognized in economics, industrial reliability, etiology, epidemiology, genetics and cancer screening studies. Length-biased right-censored data have a unique structure different from traditional survival data, and the nonparametric and semiparametric estimation and inference methods for traditional survival data are not directly applicable to them. We propose new expectation-maximization algorithms for estimation based on full likelihoods involving infinite-dimensional parameters under three settings for length-biased data: estimating the nonparametric distribution function, estimating the nonparametric hazard function under an increasing failure rate constraint, and jointly estimating the baseline hazard function and the covariate coefficients under the Cox proportional hazards model. Extensive empirical simulation studies show that the maximum likelihood estimators perform well with moderate sample sizes and are more efficient than the estimating equation approaches. The proposed estimators are also more robust to various right-censoring mechanisms. We prove the strong consistency of the estimators, and establish the asymptotic normality of the semiparametric maximum likelihood estimators under the Cox model using modern empirical process theory. We apply the proposed methods to a prevalent cohort medical study. Supplemental materials are available online.

9.
A new method of modeling coronary artery calcium (CAC) is needed in order to properly understand the probability of onset and growth of CAC. CAC remains a controversial indicator of cardiovascular disease (CVD) risk, but this may be due to ill-suited methods of specifying CAC in the analysis phase of studies where CAC is the primary outcome. The modern method of two-part latent growth modeling may represent a strong alternative to the myriad of existing methods for modeling CAC. We provide a brief overview of existing methods of analysis used for CAC before introducing the general latent growth curve model, how it extends into a two-part (semicontinuous) growth model, and how the ubiquitous problem of missing data can be effectively handled. We then present an example of how to model CAC using this framework. We demonstrate that utilizing this type of modeling strategy can result in traditional predictors of CAC (e.g. age, gender and high-density lipoprotein cholesterol) exerting a different impact on the two different, yet simultaneous, operationalizations of CAC. This method of analyzing CAC could inform future analyses and subsequent discussions about its potential to inform long-term CVD risk and heart events.

10.
Novel imaging techniques are playing an increasingly important role in drug development, providing insight into the mechanism of action of new chemical entities. The data sets obtained by these methods can be large, with complex inter-relationships, but the most appropriate statistical analysis for handling the data is often uncertain, precisely because of the exploratory nature of the way the data are collected. We present an example from a clinical trial using magnetic resonance imaging to assess changes in atherosclerotic plaques following treatment with a tool compound with established clinical benefit. We compared two specific approaches to handling the correlations due to physical location and repeated measurements: two-level and four-level multilevel models. The two methods identified similar structural variables, but the higher-level multilevel model had the advantage of explaining a greater proportion of variation, and its modeling assumptions appeared to be better satisfied.

11.
We present methods for modeling and estimation of a concurrent functional regression when the predictors and responses are two-dimensional functional datasets. The implementations use spline basis functions, and model fitting is based on smoothing penalties and mixed model estimation. The proposed methods are implemented in available statistical software, allow the construction of confidence intervals for the bivariate model parameters, and can be applied to completely or sparsely sampled responses. The methods are tested on simulated data and show favorable results in practice. Their usefulness is illustrated in an application to environmental data.

12.
The present work proposes a new integer-valued autoregressive model with Poisson marginal distribution based on mixing the Pegram and dependent Bernoulli thinning operators. Properties of the model are discussed. We consider several methods for estimating the unknown parameters of the model, and both classical and Bayesian approaches are used for forecasting. Simulations are performed to assess the performance of these estimators and forecasting methods. Finally, analyses of two real data sets are presented for illustrative purposes.
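For orientation, the classical benchmark in this model family is the Poisson INAR(1) with binomial thinning, X_t = alpha o X_{t-1} + eps_t with Poisson innovations, which also has a Poisson marginal. The sketch below simulates that benchmark and recovers the parameters by simple moment estimators; it is not the Pegram / dependent-Bernoulli-thinning mixture proposed in the paper, and the parameter values are assumptions.

```python
# Simulation of the classical binomial-thinning Poisson INAR(1) benchmark.
import numpy as np

rng = np.random.default_rng(3)
alpha, lam, T = 0.4, 5.0, 2000

x = np.empty(T, dtype=int)
x[0] = rng.poisson(lam)
for t in range(1, T):
    survivors = rng.binomial(x[t - 1], alpha)          # binomial thinning of previous count
    x[t] = survivors + rng.poisson(lam * (1 - alpha))  # Poisson innovations keep Poisson(lam) marginal

lam_hat = x.mean()                                     # moment estimate of lam
alpha_hat = np.corrcoef(x[:-1], x[1:])[0, 1]           # lag-1 autocorrelation estimates alpha
print(lam_hat, alpha_hat)
```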

13.
We consider three sorts of diagnostics for random imputations: displays of the completed data, which are intended to reveal unusual patterns that might suggest problems with the imputations, comparisons of the distributions of observed and imputed data values and checks of the fit of observed data to the model that is used to create the imputations. We formulate these methods in terms of sequential regression multivariate imputation, which is an iterative procedure in which the missing values of each variable are randomly imputed conditionally on all the other variables in the completed data matrix. We also consider a recalibration procedure for sequential regression imputations. We apply these methods to the 2002 environmental sustainability index, which is a linear aggregation of 64 environmental variables on 142 countries.
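A minimal sketch of a chained-equations (sequential regression) imputation, followed by the simplest of the diagnostics above: comparing observed and imputed values of one variable. scikit-learn's IterativeImputer is used here as a stand-in for the authors' sequential regression multivariate imputation software, and the simulated data and crude summary comparison are illustrative assumptions.

```python
# Chained-equations imputation plus an observed-vs-imputed distribution check.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(4)
n = 500
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(scale=0.6, size=n)
x3 = -0.5 * x1 + 0.3 * x2 + rng.normal(scale=0.5, size=n)
X = np.column_stack([x1, x2, x3])

mask = rng.uniform(size=X.shape) < 0.15        # 15% of values missing completely at random
X_miss = np.where(mask, np.nan, X)

imputer = IterativeImputer(sample_posterior=True, max_iter=20, random_state=0)
X_imp = imputer.fit_transform(X_miss)          # each variable imputed conditionally on the others

obs = X_miss[~np.isnan(X_miss[:, 2]), 2]       # observed values of variable 3
imp = X_imp[np.isnan(X_miss[:, 2]), 2]         # imputed values of variable 3
print(obs.mean(), obs.std(), imp.mean(), imp.std())
```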

14.
Threshold selection for regional peaks-over-threshold data
A hurdle in the peaks-over-threshold approach for analyzing extreme values is the selection of the threshold. A method is developed to reduce this obstacle in the presence of multiple, similar data samples. This is for instance the case in many environmental applications. The idea is to combine threshold selection methods into a regional method. Regionalized versions of the threshold stability and the mean excess plot are presented as graphical tools for threshold selection. Moreover, quantitative approaches based on the bootstrap distribution of the spatially averaged Kolmogorov–Smirnov and Anderson–Darling test statistics are introduced. It is demonstrated that the proposed regional method leads to an increased sensitivity for too low thresholds, compared to methods that do not take into account the regional information. The approach can be used for a wide range of univariate threshold selection methods. We test the methods using simulated data and present an application to rainfall data from the Dutch water board Vallei en Veluwe.
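A minimal sketch of the regionalised mean excess idea: compute the mean excess function e(u) = E[X - u | X > u] on a grid of candidate thresholds for each sample and average across sites. The simulated sites and threshold grid are assumptions; the bootstrap Kolmogorov–Smirnov and Anderson–Darling procedures are not reproduced.

```python
# Spatially averaged mean excess function over a grid of candidate thresholds.
import numpy as np

rng = np.random.default_rng(5)
sites = [rng.exponential(scale=s, size=1000) for s in (1.0, 1.2, 0.9)]  # similar samples

def mean_excess(x, u):
    exc = x[x > u] - u
    return exc.mean() if exc.size else np.nan

thresholds = np.linspace(0, 4, 41)
per_site = np.array([[mean_excess(x, u) for u in thresholds] for x in sites])
regional = np.nanmean(per_site, axis=0)      # regionalised mean excess plot values
# Above a suitable threshold (GPD tail), the mean excess is roughly linear in u.
print(np.column_stack([thresholds[:5], regional[:5]]))
```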

15.
The recently developed rolling year GEKS procedure makes maximum use of all matches in the data to construct nonrevisable price indexes that are approximately free from chain drift. A potential weakness is that unmatched items are ignored. In this article we use imputation Törnqvist price indexes as inputs into the rolling year GEKS procedure. These indexes account for quality changes by imputing the “missing prices” associated with new and disappearing items. Three imputation methods are discussed. The first method makes explicit imputations using a hedonic regression model which is estimated for each time period. The other two methods make implicit imputations; they are based on time dummy hedonic and time-product dummy regression models and are estimated on bilateral pooled data. We present empirical evidence for New Zealand from scanner data on eight consumer electronics products and find that accounting for quality change can make a substantial difference.
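For context, the matched-model building block of the procedure is the bilateral Törnqvist index, ln P = sum_i 0.5*(s_i0 + s_i1)*ln(p_i1/p_i0), with s the expenditure shares. The sketch below computes it for illustrative prices and quantities; the imputation of "missing prices" for new and disappearing items and the rolling year GEKS aggregation are not shown.

```python
# Bilateral Toernqvist price index for matched items between two periods.
import numpy as np

p0 = np.array([2.0, 5.0, 1.5])   # period-0 prices of matched items
p1 = np.array([2.2, 4.8, 1.6])   # period-1 prices
q0 = np.array([100, 40, 250])    # period-0 quantities
q1 = np.array([90, 45, 260])     # period-1 quantities

s0 = p0 * q0 / np.sum(p0 * q0)   # expenditure shares in period 0
s1 = p1 * q1 / np.sum(p1 * q1)   # expenditure shares in period 1
tornqvist = np.exp(np.sum(0.5 * (s0 + s1) * np.log(p1 / p0)))
print(tornqvist)                 # price level of period 1 relative to period 0
```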

16.
A NEW FAMILY OF NON-NEGATIVE DISTRIBUTIONS
We introduce a new, flexible family of distributions for non-negative data, defined by means of a quantile function. We describe some properties of this family, and discuss several methods for estimating the parameters. The distribution is applied to an example from environmental engineering.

17.
We present a new class of methods for high dimensional non-parametric regression and classification called sparse additive models. Our methods combine ideas from sparse linear modelling and additive non-parametric regression. We derive an algorithm for fitting the models that is practical and effective even when the number of covariates is larger than the sample size. Sparse additive models are essentially a functional version of the grouped lasso of Yuan and Lin. They are also closely related to the COSSO model of Lin and Zhang but decouple smoothing and sparsity, enabling the use of arbitrary non-parametric smoothers. We give an analysis of the theoretical properties of sparse additive models and present empirical results on synthetic and real data, showing that they can be effective in fitting sparse non-parametric models in high dimensional data.
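A minimal sketch of the backfitting idea behind sparse additive models: smooth the partial residual for each covariate, then soft-threshold the fitted component by its empirical norm so that irrelevant components are set exactly to zero. The Nadaraya–Watson smoother, bandwidth, penalty level and simulated data are assumptions, not the authors' implementation.

```python
# SpAM-style backfitting with a kernel smoother and norm soft-thresholding.
import numpy as np

def nw_smooth(x, r, h=0.3):
    """Nadaraya-Watson smoother of residuals r against covariate x."""
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    return (w @ r) / w.sum(axis=1)

rng = np.random.default_rng(6)
n, p, lam = 200, 5, 0.15
X = rng.uniform(-1, 1, (n, p))
y = np.sin(np.pi * X[:, 0]) + X[:, 1] ** 2 - 0.5 + rng.normal(scale=0.3, size=n)  # 2 true signals
y = y - y.mean()                                       # center the response; intercept handled separately

f = np.zeros((n, p))
for _ in range(20):                                    # backfitting iterations
    for j in range(p):
        r = y - f.sum(axis=1) + f[:, j]                # partial residual for covariate j
        g = nw_smooth(X[:, j], r)
        norm = np.sqrt(np.mean(g ** 2))
        f[:, j] = max(0.0, 1.0 - lam / norm) * g if norm > 0 else 0.0
        f[:, j] -= f[:, j].mean()                      # center each fitted component

norms = np.sqrt(np.mean(f ** 2, axis=0))
print(np.round(norms, 3))                              # near-zero norms => excluded covariates
```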

18.
We introduce a new family of distributions suitable for fitting positive data sets with high kurtosis which is called the Slashed Generalized Rayleigh Distribution. This distribution arises as the quotient of two independent random variables, one being a generalized Rayleigh distribution in the numerator and the other a power of the uniform distribution in the denominator. We present properties and carry out estimation of the model parameters by moment and maximum likelihood (ML) methods. Finally, we conduct a small simulation study to evaluate the performance of ML estimators and analyze real data sets to illustrate the usefulness of the new model.
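The ratio construction above is easy to simulate. The sketch below assumes the generalized Rayleigh CDF F(x) = (1 - exp(-(lam*x)^2))^alpha (a common parametrisation, not necessarily the paper's notation) and illustrative parameter values; the slashed variable is that draw divided by an independent uniform raised to a power.

```python
# Simulating the slashed generalized Rayleigh construction by inverse-CDF sampling.
import numpy as np

rng = np.random.default_rng(7)
alpha, lam, q, n = 2.0, 1.0, 5.0, 100_000     # q > 4 so that the kurtosis below exists

u = rng.uniform(size=n)
gr = np.sqrt(-np.log(1.0 - u ** (1.0 / alpha))) / lam   # generalized Rayleigh via inverse CDF
slash = rng.uniform(size=n) ** (1.0 / q)                # power of an independent uniform
t = gr / slash                                          # slashed generalized Rayleigh draw

kurt = np.mean(((t - t.mean()) / t.std()) ** 4)         # heavier tails => larger sample kurtosis
print(t.mean(), kurt)
```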

19.
Researchers often report point estimates of turning point(s) obtained in polynomial regression models but rarely assess the precision of these estimates. We discuss three methods to assess the precision of such turning point estimates. The first is the delta method that leads to a normal approximation of the distribution of the turning point estimator. The second method uses the exact distribution of the turning point estimator of quadratic regression functions. The third method relies on Markov chain Monte Carlo methods to provide a finite sample approximation of the exact distribution of the turning point estimator. We argue that the delta method may lead to misleading inference and that the other two methods are more reliable. We compare the three methods using two data sets from the environmental Kuznets curve literature, where the presence and location of a turning point in the income-pollution relationship is the focus of much empirical work.
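For a quadratic regression y = b0 + b1*x + b2*x^2 + e, the turning point is x* = -b1/(2*b2), and the delta method approximates its variance by g'Σg with g the gradient of x* in (b1, b2). The sketch below shows that first (and, per the abstract, potentially misleading) approach on simulated data; the exact-distribution and MCMC methods are not reproduced.

```python
# Delta-method standard error and interval for the turning point of a quadratic fit.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
x = rng.uniform(0, 10, 200)
y = 1.0 + 2.0 * x - 0.15 * x ** 2 + rng.normal(scale=1.0, size=x.size)  # true x* ~ 6.67

X = sm.add_constant(np.column_stack([x, x ** 2]))
res = sm.OLS(y, X).fit()
b0, b1, b2 = res.params
cov = res.cov_params()[1:, 1:]                        # covariance of (b1, b2)

x_star = -b1 / (2 * b2)                               # point estimate of the turning point
grad = np.array([-1 / (2 * b2), b1 / (2 * b2 ** 2)])  # gradient of x* with respect to (b1, b2)
se = np.sqrt(grad @ cov @ grad)                       # delta-method standard error
print(x_star, x_star - 1.96 * se, x_star + 1.96 * se)
```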

20.
When the survival distribution in a treatment group is a mixture of two distributions of the same family, traditional parametric methods that ignore the existence of mixture components, as well as nonparametric methods, may not be very powerful. We develop a modified likelihood ratio test (MLRT) for testing homogeneity in a two-sample problem with censored data, and compare the actual type I error and power of the MLRT with those of the nonparametric log-rank test and a parametric test through Monte Carlo simulations. The proposed test is also applied to analyze data from a clinical trial on early breast cancer.
