Similar Articles
(20 results)
1.
Shipping and shipping services are a key industry of great importance to the economy of Cyprus and the wider European Union. Assessment, management and future steering of the industry, and its associated economy, are carried out by a range of organisations and are of direct interest to a number of stakeholders. This article presents an analysis of shipping credit flow data: an important and archetypal series whose analysis is hampered by rapid changes of variance. Our analysis uses the recently developed data-driven Haar–Fisz transformation, which enables accurate trend estimation and successful prediction in these kinds of situations. Our trend estimation is augmented by bootstrap confidence bands, new in this context. The good performance of the data-driven Haar–Fisz transform contrasts with the poor performance exhibited by popular and established variance stabilisation alternatives: the Box–Cox, logarithm and square root transformations.
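
For reference, here is a minimal sketch of the baseline variance-stabilisation transforms the abstract compares (not the Haar–Fisz method itself); the series and its parameters are illustrative assumptions.

```python
# Sketch: classical variance-stabilising transforms used as baselines in
# the abstract. The data are a hypothetical positive series whose
# variance grows with the level; all values are assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = np.abs(rng.normal(loc=50, scale=np.linspace(1, 20, 200)))

log_x = np.log(x)                   # logarithm transform
sqrt_x = np.sqrt(x)                 # square-root transform
boxcox_x, lam = stats.boxcox(x)     # Box-Cox with estimated lambda

print(f"estimated Box-Cox lambda: {lam:.3f}")
```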

2.
The statistical problems associated with estimating the mean responding cell density in the limiting dilution assay (LDA) have largely been ignored. We evaluate techniques for analyzing LDA data from multiple biological samples, assumed to follow either a normal or gamma distribution. Simulated data are used to evaluate the performance of an unweighted mean, a log transform, and a weighted mean procedure described by Taswell (1987). In general, an unweighted mean with jackknife estimates will produce satisfactory results. In some cases, a log transform is more appropriate. Taswell's weighted mean algorithm is unable to estimate an accurate variance. We also show that methods which pool samples, or LDA data, are invalid. In addition, we show that optimization of the variance in multiple-sample LDAs depends on the estimator, the between-organism variance, the replicate well size, and the number of biological samples. However, this optimization is generally achieved by maximizing biological samples at the expense of well replicates.
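
A minimal sketch of a leave-one-out jackknife for the unweighted mean, with hypothetical cell-density values standing in for real LDA samples:

```python
import numpy as np

def jackknife_mean_se(x):
    """Leave-one-out jackknife estimate of the mean and its standard error."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    loo_means = np.array([np.delete(x, i).mean() for i in range(n)])
    est = loo_means.mean()
    se = np.sqrt((n - 1) / n * np.sum((loo_means - est) ** 2))
    return est, se

# Hypothetical responding-cell densities from 6 biological samples.
densities = [1.9, 2.4, 2.1, 3.0, 2.6, 2.2]
print(jackknife_mean_se(densities))
```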

3.
We consider detection of multiple changes in the distribution of periodic and autocorrelated data with known period. To account for periodicity, we transform the sequence of vector observations by arranging them in matrices, thereby producing a sequence of independently and identically distributed matrix observations. We propose methods of testing the equality of matrix distributions and present methods that can be applied to matrix observations using the E-divisive algorithm. We show that periodicity and autocorrelation degrade existing change detection methods because they blur the changes that these procedures aim to discover. Methods that ignore the periodicity have low power to detect changes in the mean and the variance of periodic time series when the periodic effects overwhelm the true changes, while the proposed methods detect such changes with high power. We illustrate the proposed methods by detecting changes in the water quality of Lake Kasumigaura in Japan. The Canadian Journal of Statistics 48: 518–534; 2020 © 2020 Statistical Society of Canada
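
A simplified illustration of the matrix arrangement, assuming a univariate series (the paper works with vector observations): each cycle of the known period becomes one row of a matrix observation.

```python
import numpy as np

def to_matrix_observations(x, period):
    """Arrange a series with known period into rows, one row per cycle;
    a simplified version of the arrangement described in the abstract."""
    x = np.asarray(x)
    n_cycles = len(x) // period
    return x[: n_cycles * period].reshape(n_cycles, period)

# Hypothetical monthly series with period 12: each row is one year.
obs = to_matrix_observations(np.arange(36), period=12)
print(obs.shape)  # (3, 12)
```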

4.
In most practical applications, the quality of count data is often compromised by errors-in-variables (EIVs). In this paper, we apply a Bayesian approach to reduce bias in estimating the parameters of count data regression models that have mismeasured independent variables. Furthermore, the exposure model is specified with a flexible distribution; hence our approach remains robust against any departures from normality in the true underlying exposure distribution. The proposed method is also useful in realistic situations, as the variance of the EIVs is estimated rather than assumed known, in contrast with other bias-correction methods for count data EIV regression models. We conduct simulation studies on synthetic data sets using Markov chain Monte Carlo simulation techniques to investigate the performance of our approach. Our findings show that the flexible Bayesian approach is able to estimate the values of the true regression parameters consistently and accurately.

5.
A flexible semi-parametric regression model is proposed for modelling the relationship between a response and multivariate predictor variables. The proposed multiple-index model includes smooth unknown link and variance functions that are estimated non-parametrically. Data-adaptive methods for automatic smoothing parameter selection and for the choice of the number of indices M are considered. This model adapts to complex data structures and provides efficient adaptive estimation through the variance function component, in the sense that the asymptotic distribution is the same as if the non-parametric components were known. We develop iterative estimation schemes, which include a constrained projection method for the case where the regression parameter vectors are mutually orthogonal. The proposed methods are illustrated with the analysis of data from a growth bioassay and a reproduction experiment with medflies. Asymptotic properties of the estimated model components are also obtained.

6.
Longitudinal or clustered response data arise in many applications such as biostatistics, epidemiology and environmental studies. The repeated responses cannot in general be assumed to be independent. One method of analysing such data is the generalized estimating equations (GEE) approach. The current GEE method for estimating regression effects in longitudinal data focuses on modelling the working correlation matrix while assuming a known variance function. However, correct choice of the correlation structure may not necessarily improve estimation efficiency for the regression parameters if the variance function is misspecified [Wang YG, Lin X. Effects of variance-function misspecification in analysis of longitudinal data. Biometrics. 2005;61:413–421]. Two problems then arise: finding a correct variance function and estimating the parameters of the chosen variance function. In this paper, we study the problem of estimating the parameters of the variance function, assuming that the form of the variance function is known, and then the effect of a misspecified variance function on the estimates of the regression parameters. We propose a GEE approach to estimate the parameters of the variance function. This estimation approach borrows the idea of Davidian and Carroll [Variance function estimation. J Amer Statist Assoc. 1987;82:1079–1091] by solving a nonlinear regression problem in which residuals are regarded as the responses and the variance function is regarded as the regression function. A limited simulation study shows that the proposed method performs at least as well as the modified pseudo-likelihood approach developed by Wang and Zhao [A modified pseudolikelihood approach for analysis of longitudinal data. Biometrics. 2007;63:681–689]. Both these methods perform better than the GEE approach.
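
A minimal sketch of the Davidian and Carroll idea the proposal builds on: squared residuals regressed on a variance function by nonlinear least squares. The power-of-the-mean variance form and the simulated data are assumptions, not the paper's specification.

```python
# Sketch: treat squared residuals as responses and fit the variance
# function g(mu; theta) = sigma2 * mu**theta by nonlinear least squares.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(1)
mu = np.linspace(1.0, 10.0, 300)                    # fitted means (assumed known here)
y = mu + rng.normal(scale=0.5 * mu, size=mu.size)   # true variance is 0.25 * mu**2
resid_sq = (y - mu) ** 2

def var_fun(m, sigma2, theta):
    return sigma2 * m ** theta

(sigma2_hat, theta_hat), _ = curve_fit(var_fun, mu, resid_sq, p0=(1.0, 1.0))
print(f"sigma2 = {sigma2_hat:.3f}, theta = {theta_hat:.3f} (truth: 0.25, 2)")
```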

7.
A new two-parameter distribution on the unit interval, called the Unit-Inverse Gaussian distribution, is introduced and studied in detail. The proposed distribution shares many properties with other known distributions on the unit interval, such as the Beta, Johnson SB, Unit-Gamma, and Kumaraswamy distributions. Estimates of the parameters of the proposed distribution are obtained by transforming the data to the inverse Gaussian distribution. Unlike most distributions on the unit interval, the maximum likelihood and method of moments estimators of the parameters of the proposed distribution are expressed in simple closed forms that require no iterative computation. Application of the proposed distribution to a real data set shows a better fit than many known two-parameter distributions on the unit interval.
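
The closed-form inverse Gaussian MLEs are standard; the sketch below applies them after an assumed unit-interval-to-positive-half-line map (y = -log x), which stands in for the paper's actual transform.

```python
import numpy as np

def ig_mle(y):
    """Closed-form maximum likelihood estimates for IG(mu, lambda)."""
    y = np.asarray(y, dtype=float)
    mu = y.mean()
    lam = len(y) / np.sum(1.0 / y - 1.0 / mu)
    return mu, lam

# Hypothetical unit-interval data; y = -log(x) is an assumed mapping to
# the positive half-line, standing in for the paper's own transform.
x = np.array([0.31, 0.52, 0.47, 0.66, 0.29, 0.58, 0.43])
print(ig_mle(-np.log(x)))
```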

8.
Multi-stage time-evolving models are common statistical models for biological systems, especially insect populations. In stage-duration distribution models, parameter estimation typically relies on the Laplace transform method. This method involves assumptions such as known constant shapes, known constant rates or the same overall hazard rate for all stages, which are strong and restrictive. The main aim of this paper is to weaken these assumptions by using a Bayesian approach. In particular, a Metropolis–Hastings algorithm based on deterministic transformations is used to estimate parameters. We use two models: one with no hazard rates and the other with stage-wise constant hazard rates. These methods are validated in simulation studies, followed by a case study of cattle parasites. The results show that the proposed methods estimate the parameters comparably well without resorting to the Laplace transform method.
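
For orientation, a minimal random-walk Metropolis–Hastings sketch; the paper's algorithm is a variant based on deterministic transformations, which this standard version does not implement. The target and proposal scale are assumptions.

```python
import numpy as np

def metropolis_hastings(log_target, x0, n_iter=5000, step=0.5, seed=0):
    rng = np.random.default_rng(seed)
    x, draws = x0, []
    lp = log_target(x)
    for _ in range(n_iter):
        prop = x + rng.normal(scale=step)         # symmetric random-walk proposal
        lp_prop = log_target(prop)
        if np.log(rng.uniform()) < lp_prop - lp:  # accept/reject step
            x, lp = prop, lp_prop
        draws.append(x)
    return np.array(draws)

samples = metropolis_hastings(lambda t: -0.5 * t**2, x0=0.0)  # N(0,1) target
print(samples.mean(), samples.std())
```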

9.
Projection techniques for nonlinear principal component analysis
Principal Component Analysis (PCA) is traditionally a linear technique for projecting multidimensional data onto lower-dimensional subspaces with minimal loss of variance. However, there are several applications where the data lie in a lower-dimensional subspace that is not linear; in these cases linear PCA is not the optimal method to recover this subspace and thus account for the largest proportion of variance in the data. Nonlinear PCA addresses the nonlinearity problem by relaxing the linear restrictions on standard PCA. We investigate both linear and nonlinear approaches to PCA, both exclusively and in combination. In particular, we introduce a combination of projection pursuit and nonlinear regression for nonlinear PCA. We compare the success of PCA techniques in variance recovery by applying linear, nonlinear and hybrid methods to some simulated and real data sets. We show that the best linear projection that captures the structure in the data (in the sense that the original data can be reconstructed from the projection) is not necessarily a (linear) principal component. We also show that the ability of certain nonlinear projections to capture data structure is affected by the choice of constraint in the eigendecomposition of a nonlinear transform of the data. Similar success in recovering data structure was observed for both linear and nonlinear projections.
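
A minimal sketch of the linear PCA baseline: project onto the top-k right singular vectors and measure the variance the reconstruction recovers. The correlated data are an assumption.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 5))  # correlated toy data
Xc = X - X.mean(axis=0)                                  # centre the columns

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 2
Z = Xc @ Vt[:k].T            # scores on the first k principal directions
X_hat = Z @ Vt[:k]           # linear reconstruction from k components
explained = 1 - ((Xc - X_hat) ** 2).sum() / (Xc ** 2).sum()
print(f"variance recovered by {k} linear components: {explained:.2%}")
```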

10.
In this paper, using an estimating function approach, a new optimal volatility estimator is introduced, and based on the recursive form of the estimator a data-driven generalized EWMA model for value at risk (VaR) forecasting is proposed. An appropriate data-driven model for volatility is identified by the relationship between absolute deviation and standard deviation for symmetric distributions with finite variance. It is shown that the asymptotic variance of the proposed volatility estimator is smaller than that of conventional estimators and is more appropriate for financial data with larger kurtosis. For IBM, Microsoft, and Apple stocks and the S&P 500 index, the proposed method is used to identify the model, estimate the volatility, and obtain minimum mean square error (MMSE) forecasts of VaR.
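
A minimal sketch of a conventional fixed-decay EWMA volatility recursion and the implied one-step VaR, which the paper's data-driven generalized EWMA would replace; the returns, decay value and normal quantile are assumptions.

```python
import numpy as np

def ewma_var(returns, lam=0.94):
    """sigma2_t = lam * sigma2_{t-1} + (1 - lam) * r_{t-1}^2 (RiskMetrics form)."""
    sigma2 = np.empty_like(returns)
    sigma2[0] = returns.var()
    for t in range(1, len(returns)):
        sigma2[t] = lam * sigma2[t - 1] + (1 - lam) * returns[t - 1] ** 2
    return sigma2

rng = np.random.default_rng(3)
r = rng.standard_t(df=5, size=1000) * 0.01   # heavy-tailed daily returns (assumed)
sigma = np.sqrt(ewma_var(r))
var_95 = 1.645 * sigma[-1]                   # one-day 95% VaR via the normal quantile
print(f"one-day 95% VaR forecast: {var_95:.4f}")
```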

11.
For estimation of time-varying coefficient longitudinal models, the widely used local least-squares (LS) or covariance-weighted local LS smoothing uses information from the local sample average. Motivated by the fact that a combination of multiple quantiles provides a more complete picture of the distribution, we investigate quantile regression-based methods to improve efficiency by optimally combining information across quantiles. Under the working independence scenario, the asymptotic variance of the proposed estimator approaches the Cramér–Rao lower bound. In the presence of dependence among within-subject measurements, we adopt a prewhitening technique to transform regression errors into independent innovations and show that the prewhitened optimally weighted quantile average estimator asymptotically achieves the Cramér–Rao bound for the independent innovations. Fully data-driven bandwidth selection and optimal weights estimation are implemented through a two-step procedure. Monte Carlo studies show that the proposed method delivers more robust and superior overall performance than existing methods.

12.
Although regression estimates are quite robust to slight departures from normality, symmetric prediction intervals that assume normality can be highly unsatisfactory and problematic if the residuals have a skewed distribution. For data with distributions outside the class covered by the generalized linear model, a common way to handle non-normality is to transform the response variable. Unfortunately, transforming the response variable often destroys the theoretical or empirical functional relationship connecting the mean of the response variable to the explanatory variables established on the original scale. Further complication arises if a single transformation cannot both stabilize variance and attain normality. Furthermore, practitioners often find the interpretation of highly transformed data far from obvious and prefer an analysis on the original scale. This paper presents an alternative approach for simultaneously handling heteroscedasticity and non-normality without resorting to data transformation. Unlike classical approaches, the proposed modeling allows practitioners to formulate the mean and variance relationships directly on the original scale, making data interpretation considerably easier. The modeled variance relationship and the form of non-normality in the proposed approach can be easily examined through a certain function of the standardized residuals. The proposed method remains consistent for estimating the regression parameters even if the variance function is misspecified. The method, along with some model checking techniques, is illustrated with a real example.

13.
Estimating the proportion of true null hypotheses, π0, has attracted much attention in the recent statistical literature. Besides its apparent relevance for a set of specific scientific hypotheses, an accurate estimate of this parameter is key for many multiple testing procedures. Most existing methods for estimating π0 in the literature are motivated by the independence assumption on the test statistics, which is often not true in reality. Simulations indicate that most existing estimators can perform poorly in the presence of dependence among test statistics, mainly due to the increased variation of these estimators. In this paper, we propose several data-driven methods for estimating π0 that incorporate the distribution pattern of the observed p-values as a practical way to address potential dependence among test statistics. Specifically, we use a linear fit to give a data-driven estimate of the proportion of true-null p-values in (λ, 1] over the whole range [0, 1], instead of using the expected proportion at 1 − λ. We find that the proposed estimators may substantially decrease the variance of the estimated true null proportion and thus improve the overall performance.
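
As a simplified stand-in for the linear-fit idea: compute the usual λ-threshold estimates π0(λ) over a grid and extrapolate a straight line to λ = 1. The simulated p-value mixture is an assumption.

```python
import numpy as np

rng = np.random.default_rng(4)
p = np.concatenate([rng.uniform(size=800),           # true nulls (pi0 = 0.8)
                    rng.beta(0.5, 8.0, size=200)])   # alternatives near zero

lambdas = np.arange(0.05, 0.95, 0.05)
pi0_lam = [(p > lam).mean() / (1 - lam) for lam in lambdas]  # Storey-type estimates
slope, intercept = np.polyfit(lambdas, pi0_lam, deg=1)       # linear fit over the grid
pi0_hat = min(1.0, slope * 1.0 + intercept)                  # extrapolate to lambda = 1
print(f"estimated pi0: {pi0_hat:.3f} (truth: 0.8)")
```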

14.
Wavelet shrinkage is an effective nonparametric regression technique, especially when the underlying curve has irregular features such as spikes or discontinuities. The basic idea is simple: take the discrete wavelet transform of data consisting of a signal corrupted by noise; shrink or remove the wavelet coefficients to remove the noise; then invert the discrete wavelet transform to form an estimate of the true underlying curve. Various researchers have proposed increasingly sophisticated methods of doing this using real-valued wavelets. Complex-valued wavelets exist but are rarely used. We propose two new complex-valued wavelet shrinkage techniques: one based on multiwavelet-style shrinkage and the other using Bayesian methods. Extensive simulations show that our methods almost always give significantly more accurate estimates than methods based on real-valued wavelets. Further, our multiwavelet-style shrinkage method is both simpler and dramatically faster than its competitors. To understand the excellent performance of this method we present a new risk bound on its hard-thresholded coefficients.
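
A minimal real-valued wavelet shrinkage sketch using PyWavelets (the paper's contribution is the complex-valued versions): transform, hard-threshold at the universal threshold, invert. The test signal and noise level are assumptions.

```python
import numpy as np
import pywt

rng = np.random.default_rng(5)
n = 1024
t = np.linspace(0, 1, n)
signal = np.where(t < 0.5, 0.0, 1.0)             # a jump discontinuity
y = signal + 0.1 * rng.normal(size=n)

coeffs = pywt.wavedec(y, "db4")
sigma = np.median(np.abs(coeffs[-1])) / 0.6745   # noise scale via MAD of finest details
thresh = sigma * np.sqrt(2 * np.log(n))          # universal threshold
coeffs[1:] = [pywt.threshold(c, thresh, mode="hard") for c in coeffs[1:]]
estimate = pywt.waverec(coeffs, "db4")
print(f"RMSE: {np.sqrt(np.mean((estimate - signal) ** 2)):.4f}")
```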

15.
Spectral analysis at frequencies other than zero plays an increasingly important role in econometrics. A number of alternative automated data-driven procedures for nonparametric spectral density estimation have been suggested in the literature, but little is known about their finite-sample accuracy. We compare five such procedures in terms of their mean-squared percentage error across frequencies. Our data generating processes (DGPs) include autoregressive-moving average (ARMA) models, fractionally integrated ARMA models and nonparametric models based on 16 commonly used macroeconomic time series. We find that for both quarterly and monthly data the autoregressive sieve estimator is the most reliable method overall.
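
A minimal autoregressive sieve sketch, assuming statsmodels is available: choose the AR order by AIC and evaluate the implied spectral density f(ω) = σ² / (2π |1 − Σⱼ φⱼ e^{−ijω}|²). The order grid and test series are assumptions.

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg, ar_select_order

rng = np.random.default_rng(6)
x = np.zeros(500)
for t in range(2, 500):                      # an AR(2) test series
    x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + rng.normal()

sel = ar_select_order(x, maxlag=10, ic="aic")
lags = sel.ar_lags or [1]                    # guard against an empty selection
res = AutoReg(x, lags=lags).fit()
phi, sigma2 = res.params[1:], res.sigma2     # drop the intercept

omega = np.linspace(0, np.pi, 256)
transfer = 1 - sum(p * np.exp(-1j * (k + 1) * omega) for k, p in enumerate(phi))
spec = sigma2 / (2 * np.pi * np.abs(transfer) ** 2)
print(spec[:3])
```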

16.
We derive the exact finite-sample distribution of the L1-version of the Fisz–Cramér–von Mises test statistic (FCvM1). We first characterize the set of all distinct sample p-p plots for two balanced samples of size n in the absence of ties. Next, we order this set according to the corresponding value of FCvM1. Finally, we link these values to the probabilities that the underlying p-p plots emerge. Comparing the finite-sample distribution with the (known) limiting distribution shows that the latter can always be used for hypothesis testing: although for finite samples the critical percentiles of the limiting distribution differ from the exact values, this does not lead to differences in the rejection of the underlying hypothesis.
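
A minimal sketch of an L1-type statistic built from the two empirical CDFs evaluated over the pooled sample; normalisation conventions vary, so this illustrates the construction rather than the paper's exact FCvM1.

```python
import numpy as np

def l1_cvm(x, y):
    """Average absolute difference of the two ECDFs over the pooled sample."""
    pooled = np.sort(np.concatenate([x, y]))
    fx = np.searchsorted(np.sort(x), pooled, side="right") / len(x)
    fy = np.searchsorted(np.sort(y), pooled, side="right") / len(y)
    return np.abs(fx - fy).sum() / len(pooled)

rng = np.random.default_rng(7)
print(l1_cvm(rng.normal(size=50), rng.normal(loc=1.0, size=50)))
```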

17.
Our article presents a general treatment of the linear regression model, in which the error distribution is modelled nonparametrically and the error variances may be heteroscedastic, thus eliminating the need to transform the dependent variable in many data sets. The mean and variance components of the model may be either parametric or nonparametric, with parsimony achieved through variable selection and model averaging. A Bayesian approach is used for inference with priors that are data-based so that estimation can be carried out automatically with minimal input by the user. A Dirichlet process mixture prior is used to model the error distribution nonparametrically; when there are no regressors in the model, the method reduces to Bayesian density estimation, and we show that in this case the estimator compares favourably with a well-regarded plug-in density estimator. We also consider a method for checking the fit of the full model. The methodology is applied to a number of simulated and real examples and is shown to work well.

18.
Data depths, which have been used to detect outliers or to extract a representative subset, can also be applied to classification. We propose a resampling-based classification method built on the fact that resampling techniques yield a consistent estimator of the distribution of a statistic. The performance of this method was evaluated on eight contaminated models in terms of correct classification rates (CCRs), and the results were compared with those of other known methods. The proposed method consistently showed higher average CCRs, up to 4% higher than other methods. In addition, the method was applied to the Berkeley data, where the average CCRs were between 0.79 and 0.85.
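
A minimal depth-based classification sketch using Mahalanobis depth, D(x) = 1 / (1 + (x − μ)ᵀ S⁻¹ (x − μ)), without the paper's resampling refinement; the two Gaussian classes are assumptions.

```python
import numpy as np

def mahalanobis_depth(x, sample):
    """Mahalanobis depth of point x with respect to a sample."""
    mu = sample.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(sample, rowvar=False))
    d = x - mu
    return 1.0 / (1.0 + d @ S_inv @ d)

rng = np.random.default_rng(8)
class_a = rng.normal(0, 1, size=(100, 2))
class_b = rng.normal(3, 1, size=(100, 2))
point = np.array([2.5, 2.8])
# Assign the point to the class in which it is deepest.
label = "A" if mahalanobis_depth(point, class_a) > mahalanobis_depth(point, class_b) else "B"
print(label)  # expected: B
```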

19.
Various methods have been proposed to estimate intra-cluster correlation coefficients (ICCs) for correlated binary data, and many are very sensitive to the type of design and the underlying distributional assumptions. We propose a new method to estimate the ICC and its 95% confidence interval based on resampling principles and U-statistics, in which we resample, with replacement, pairs of individuals from within and between clusters. We conclude from our simulation study that the resampling-based estimates approximate the population ICC more precisely than the analysis of variance and method of moments techniques across different event rates, varying numbers of clusters, and cluster sizes.
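
As a simplified stand-in for the pair-resampling U-statistic estimator, the sketch below cluster-bootstraps the balanced one-way ANOVA ICC on continuous outcomes (the paper targets binary data); all data and sizes are assumptions.

```python
import numpy as np

def anova_icc(data):
    """Balanced one-way ANOVA ICC; data has shape (clusters, members)."""
    n, k = data.shape
    msb = k * data.mean(axis=1).var(ddof=1)    # between-cluster mean square
    msw = data.var(axis=1, ddof=1).mean()      # within-cluster mean square
    return (msb - msw) / (msb + (k - 1) * msw)

rng = np.random.default_rng(9)
# 30 clusters of 5; cluster effect and residual both have variance 1 (true ICC = 0.5).
clusters = rng.normal(size=(30, 1)) + rng.normal(size=(30, 5))
boot = [anova_icc(clusters[rng.integers(0, 30, size=30)]) for _ in range(2000)]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"ICC: {anova_icc(clusters):.3f}, 95% CI: ({lo:.3f}, {hi:.3f})")
```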

20.
In some applications, the quality of a process or product is characterized and summarized by a functional relationship between a response variable and one or more explanatory variables. Profile monitoring is a technique for checking the stability of this relationship over time. Existing linear profile monitoring methods usually assume the error distribution to be normal. However, this assumption may not always hold in practice. To address this situation, we propose a method for profile monitoring under the framework of generalized linear models when the relationship between the mean and variance of the response variable is known. Two multivariate exponentially weighted moving average control schemes are proposed based on the estimated profile parameters obtained using a quasi-likelihood approach. The performance of the proposed methods is evaluated by simulation studies. Furthermore, the proposed method is applied to a real data set, and the R code for profile monitoring is made available to users.
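
The paper's own code is in R; as a language-neutral sketch, the following implements a standard MEWMA statistic on a stream of estimated profile-parameter deviations, with an assumed in-control covariance, smoothing constant and control limit.

```python
import numpy as np

def mewma_t2(X, sigma, lam=0.2):
    """Z_t = lam * X_t + (1 - lam) * Z_{t-1}, monitored via T2_t = Z_t' Cov(Z_t)^{-1} Z_t."""
    p = X.shape[1]
    z = np.zeros(p)
    t2 = []
    for t, x in enumerate(X, start=1):
        z = lam * x + (1 - lam) * z
        # exact covariance of Z_t for independent observations
        cov_z = sigma * lam * (1 - (1 - lam) ** (2 * t)) / (2 - lam)
        t2.append(z @ np.linalg.inv(cov_z) @ z)
    return np.array(t2)

rng = np.random.default_rng(10)
X = rng.normal(size=(100, 3))      # deviations of estimated profile parameters (assumed)
X[60:] += 1.0                      # a sustained shift at t = 60
t2 = mewma_t2(X, sigma=np.eye(3))
print(np.where(t2 > 10.0)[0][:5])  # first signals past an assumed control limit
```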
