Similar Documents (20 results)
1.
We present two new statistics for estimating the number of factors underlying a multivariate system. One of the two, the original NUMFACT, has been used in high-profile environmental studies. We first explain the two new methods from a geometrical viewpoint, then present an algebraic development and asymptotic cutoff points. Next we present a simulation study showing that for skewed data the new methods are typically superior to traditional methods, while for normally distributed data they are competitive with the best of the traditional methods. Finally, we compare the methods on two environmental data sets.

2.
Highly skewed, non-negative data can often be modeled by the delta-lognormal distribution in fisheries research. However, the coverage probabilities of existing interval estimation procedures are unsatisfactory for small sample sizes and highly skewed data. We propose a heuristic method for estimating confidence intervals for the mean of the delta-lognormal distribution, based on an asymptotic generalized pivotal quantity used to construct a generalized confidence interval for the mean. Simulation results show that the proposed interval estimation procedure yields satisfactory coverage probabilities, expected interval lengths, and reasonable relative biases. Finally, the proposed method is applied to red cod density data as a demonstration.
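As a point of reference, the quantity being interval-estimated above is the delta-lognormal mean, (1 − δ)·exp(μ + σ²/2), where δ is the probability of a zero observation and μ, σ² are the log-scale mean and variance of the positive observations. A minimal sketch of the plug-in point estimate (the function name and moment estimators are illustrative; this is not the paper's generalized-pivotal construction):

```python
import math

def delta_lognormal_mean(data):
    """Plug-in point estimate of the delta-lognormal mean:
    (1 - delta) * exp(mu + sigma^2 / 2), with delta estimated by the
    observed zero fraction and mu, sigma^2 by the sample mean and
    variance of log(x) over the positive observations."""
    n = len(data)
    positives = [x for x in data if x > 0]
    delta_hat = (n - len(positives)) / n
    logs = [math.log(x) for x in positives]
    mu_hat = sum(logs) / len(logs)
    # Sample variance of the logs (zero when fewer than 2 positives vary).
    var_hat = (sum((v - mu_hat) ** 2 for v in logs) / (len(logs) - 1)
               if len(logs) > 1 else 0.0)
    return (1 - delta_hat) * math.exp(mu_hat + var_hat / 2)
```

The generalized confidence interval of the abstract would be built from pivotal quantities for δ, μ, and σ², which are not shown here.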

3.
In this paper, we study a new Bayesian approach for the analysis of linearly mixed structures. In particular, we consider the case of hyperspectral images, which have to be decomposed into a collection of distinct spectra, called endmembers, and a set of associated proportions for every pixel in the scene. This problem, often referred to as spectral unmixing, is usually treated on the basis of the linear mixing model (LMM). In unsupervised approaches, the endmember signatures have to be calculated by an endmember extraction algorithm, which generally relies on the assumption that pure (unmixed) pixels are present in the image. In practice, this assumption may not hold for highly mixed data, and consequently the extracted endmember spectra differ from the true ones. A way out of this dilemma is to consider the problem under the normal compositional model (NCM). Contrary to the LMM, the NCM treats the endmembers as random Gaussian vectors rather than deterministic quantities. Existing Bayesian approaches for estimating the proportions under the NCM are restricted to the case in which the covariance matrix of the Gaussian endmembers is a multiple of the identity matrix. Consequently, this model is unsuitable when the variance differs from one spectral channel to another, a common phenomenon in practice. In this paper, we first propose a Bayesian strategy for estimating the mixing proportions under the assumption of varying variances across the spectral bands. We then generalize this model to handle a completely unknown covariance structure. For both algorithms, we present Gibbs sampling strategies and compare their performance with other state-of-the-art unmixing routines on synthetic as well as real hyperspectral fluorescence spectroscopy data.

4.
The traditional method for estimating or predicting linear combinations of the fixed effects and realized values of the random effects in mixed linear models is first to estimate the variance components and then to proceed as if the estimated values of the variance components were the true values. This two-stage procedure gives unbiased estimators or predictors of the linear combinations provided the data vector is symmetrically distributed about its expected value and provided the variance component estimators are translation-invariant and are even functions of the data vector. The standard procedures for estimating the variance components yield even, translation-invariant estimators.

5.
It is widely believed that the median “usually” lies between the mean and the mode for skewed unimodal distributions. However, this inequality does not always hold, especially for grouped data. The frequent unavailability of complete raw data makes it all the more important to evaluate this property for grouped data, yet there is a gap in the statistical literature on assessing the mean–median–mode inequality in this setting. This study evaluates the relationship between the mean, median, and mode for unimodal grouped data; derives conditions for their inequalities; and presents an application.
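The grouped-data setting above rests on the standard interpolation formulas: grouped mean Σf·m/n over class midpoints m, grouped median L + ((n/2 − CF)/f)·h, and grouped mode L + ((f₁ − f₀)/(2f₁ − f₀ − f₂))·h. A small sketch (function name and equal-width-class assumption are ours) that also exhibits the mode < median < mean ordering for a right-skewed frequency table:

```python
def grouped_stats(lower_bounds, width, freqs):
    """Mean, median, and mode from grouped (class-interval) data via
    the standard interpolation formulas.  lower_bounds are class
    lower limits, width the common class width, freqs the frequencies."""
    n = sum(freqs)
    mids = [lb + width / 2 for lb in lower_bounds]
    mean = sum(f * m for f, m in zip(freqs, mids)) / n

    # Median: L + ((n/2 - CF) / f) * h in the class containing n/2.
    cum = 0
    for i, f in enumerate(freqs):
        if cum + f >= n / 2:
            median = lower_bounds[i] + (n / 2 - cum) / f * width
            break
        cum += f

    # Mode: L + (f1 - f0) / (2*f1 - f0 - f2) * h in the modal class.
    i = freqs.index(max(freqs))
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0
    f2 = freqs[i + 1] if i + 1 < len(freqs) else 0
    mode = lower_bounds[i] + (f1 - f0) / (2 * f1 - f0 - f2) * width
    return mean, median, mode
```

For the right-skewed table with classes starting at 0, 10, 20, 30 (width 10) and frequencies 8, 5, 4, 3, this gives mode ≈ 7.27 < median = 14 < mean = 16, the textbook ordering whose exceptions the paper characterizes.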

6.
In most practical applications, the quality of count data is compromised by errors-in-variables (EIVs). In this paper, we apply a Bayesian approach to reduce bias in estimating the parameters of count-data regression models with mismeasured independent variables. The exposure model is specified with a flexible distribution, so our approach remains robust against departures from normality in the true underlying exposure distribution. The proposed method is also useful in realistic situations because the variance of the EIVs is estimated rather than assumed known, in contrast with other bias-correction methods for count-data EIV regression models. We conduct simulation studies on synthetic data sets using Markov chain Monte Carlo techniques to investigate the performance of our approach. Our findings show that the flexible Bayesian approach estimates the true regression parameters consistently and accurately.

7.
The quantile residual lifetime function provides comprehensive quantitative measures for residual life, especially when the distribution of the latter is skewed or heavy-tailed and/or when the data contain outliers. In this paper, we propose a general class of semiparametric quantile residual life models for length-biased right-censored data. We use the inverse probability weighted method to correct the bias due to length-biased sampling and informative censoring. Two estimating equations corresponding to the quantile regressions are constructed in two separate steps to obtain an efficient estimator. Consistency and asymptotic normality of the estimator are established. The main difficulty in implementing our proposed method is that the estimating equations associated with the quantiles are nondifferentiable; we therefore apply the majorize–minimize algorithm and estimate the asymptotic covariance using an efficient resampling method. We use simulation studies to evaluate the proposed method and illustrate its application with a real-data example.
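For intuition, with fully observed (uncensored) data the q-th quantile residual lifetime at time t is simply the q-quantile of the residuals {T − t : T > t}; the paper's contribution is handling length bias and informative censoring, which this sketch deliberately ignores (function name and quantile convention are ours):

```python
import math

def quantile_residual_life(times, t, q):
    """Empirical q-th quantile of residual life at time t for fully
    observed event times: the q-quantile of {T - t : T > t}, using
    the inverse-EDF (lower) sample quantile."""
    residuals = sorted(x - t for x in times if x > t)
    if not residuals:
        raise ValueError("no observations beyond t")
    # Index of the ceil(q*m)-th order statistic (1-based), clipped at 1.
    k = max(math.ceil(q * len(residuals)), 1) - 1
    return residuals[k]
```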

8.
Income- and expenditure-related data sets are often nonlinear, heteroscedastic, and skewed even after transformation, and they contain numerous outliers. We propose a class of robust nonlinear models that handle outlying observations effectively without removing them. For this purpose, case-specific parameters and a related penalty are employed to detect and modify outliers systematically. We show how existing nonlinear models such as smoothing splines and generalized additive models can be robustified by the case-specific parameters. We then extend the proposed methods to heterogeneous models by incorporating unequal weights, and provide details of estimating the weights. Two real data sets and simulated data sets show the potential of the proposed methods when the data are nonlinear with outlying observations.

9.
This paper contrasts two approaches to estimating quantile regression models: traditional semiparametric methods and partially adaptive estimators using flexible probability density functions (pdfs). While more general pdfs could have been used, the skewed Laplace was selected for pedagogical purposes. Monte Carlo simulations compare the behavior of the semiparametric and partially adaptive quantile estimators in the presence of possibly skewed and heteroskedastic data. Both approaches accommodate skewness and heteroskedasticity that are consistent with linear quantiles; however, the partially adaptive estimator considered also allows for nonlinear quantiles and provides simple tests for symmetry and heteroskedasticity. The methods are applied to estimating conditional quantile functions for wages corresponding to different levels of education.
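Quantile regression in both approaches rests on the Koenker–Bassett check (pinball) loss ρ_τ(u) = u·(τ − 1{u < 0}). A toy sketch (function names ours) showing that minimizing the summed check loss over a constant recovers the sample τ-quantile, the building block of the semiparametric estimator:

```python
def check_loss(residual, tau):
    """Koenker-Bassett check (pinball) loss rho_tau(u) = u*(tau - 1{u<0})."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))

def quantile_fit_intercept(y, tau):
    """Intercept-only quantile 'regression': minimize the summed check
    loss over a constant.  The minimizer is a sample tau-quantile, so
    a grid over the data points suffices."""
    return min(y, key=lambda c: sum(check_loss(v - c, tau) for v in y))
```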

10.
The empirical distribution function (EDF) is a commonly used estimator of the population cumulative distribution function, and the survival function is estimated as the complement of the EDF. However, the clinical diagnosis of an event is often subject to misclassification, by which the outcome is recorded with some uncertainty. In the presence of such errors, the true distribution of the time to first event is unknown. We develop a method to estimate the true survival distribution by incorporating negative and positive predictive values of the prediction process into a product-limit-style construction. This allows us to quantify the bias of the EDF estimates due to misclassified events in the observed data. We present an unbiased estimator of the true survival rates and its variance. Asymptotic properties of the proposed estimators are provided and examined through simulations. We evaluate our methods using data from the VIRAHEP-C study.
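As background, the EDF referred to above is F_n(t) = #{x_i ≤ t}/n, with the survival function estimated as 1 − F_n(t); the paper's PPV/NPV correction for misclassified events is not reproduced. A minimal sketch (function name ours):

```python
import bisect

def edf(sample):
    """Return the empirical distribution function F_n of a sample as a
    callable: F_n(t) = (# observations <= t) / n.  The survival
    function is estimated as its complement, 1 - F_n(t)."""
    data = sorted(sample)
    n = len(data)
    # bisect_right counts how many sorted points are <= t.
    return lambda t: bisect.bisect_right(data, t) / n
```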

11.
This article proposes estimating the mean of positively skewed distributions by a one-sided trimmed-mean estimator, called the upper-mean, and uses it to assess soil-contact exposures at sites with toxic contamination. An optimal upper-mean is found by maximising the probability that the estimator falls in a target range. Monte Carlo studies are conducted for several positively skewed distributions and for a distribution obtained from real data. The simulation results show that the upper-mean is a better estimator than the upper 95% confidence limit estimator currently used, because it is more likely to fall within the target range.
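One plausible reading of the upper-mean, sketched below, is a one-sided trimmed mean that discards a fixed lower fraction of the sample and averages the rest; the paper chooses the trim level by maximising a coverage probability, which is not shown here (the function name and fixed-trim simplification are ours):

```python
def upper_mean(sample, trim):
    """One-sided trimmed mean: discard the lowest `trim` fraction of
    the sample and average the remaining upper portion.  A hypothetical
    fixed-trim reading of the 'upper-mean'; the paper optimizes the
    trim level rather than fixing it."""
    data = sorted(sample)
    k = int(len(data) * trim)  # number of low observations to discard
    upper = data[k:]
    return sum(upper) / len(upper)
```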

12.
In this paper, we introduce shared gamma frailty models with two different baseline distributions, namely the generalized log-logistic and the generalized Weibull. We develop a Bayesian procedure to estimate the parameters involved in these models and present a simulation study comparing the true parameter values with their estimates. We apply these models to the bivariate survival data set of McGilchrist and Aisbett on kidney infection, and a better-fitting model is suggested for the data.

13.
We consider the problem of estimating regression coefficients when the distribution of the error terms is unknown but symmetric. We propose the use of reference distributions having various kurtosis values. It is assumed that the true error distribution is one of the reference distributions, but the indicator variable for the true distribution is missing. A generalized expectation–maximization algorithm combined with a line search is developed for estimating the regression coefficients. Simulation experiments compare the performance of the proposed approach with existing robust regression methods, including least absolute deviation, Lp, Huber M regression, and an approximation using normal mixtures, under various error distributions. When the error distribution is far from normal, the proposed method shows better performance than the other methods.

14.
A simulation study was conducted to assess how well the sample size needed to achieve a stipulated margin of error can be estimated prior to sampling. Our concern was particularly with performance when sampling from a very skewed distribution, a common feature of many biological, economic, and other populations. We examined two approaches for estimating sample size: the commonly used strategy aimed at regulating the average magnitude of the stipulated margin of error, and a previously proposed strategy that controls the tolerance probability with which the stipulated margin of error is exceeded. The simulation revealed that (1) skewness does not much affect the average estimated sample size but can greatly extend the range of estimated sample sizes; and (2) skewness does reduce the effectiveness of Kupper and Hafner's sample size estimator, though less through skewness directly than through the common practice of estimating the population variance via a pilot sample from the skewed population. Nonetheless, the simulations suggest that estimating sample size to control the probability with which the desired margin of error is achieved is a worthwhile alternative to the usual sample size formula, which controls only the average width of the confidence interval.
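The "usual sample size formula" mentioned above sets the half-width z·s/√n of a normal-theory confidence interval equal to the stipulated margin of error, giving n = (z·s/E)². A sketch (function name ours; in practice s comes from a pilot sample, the step the abstract identifies as most damaged by skewness):

```python
import math

def sample_size_for_margin(s, margin, z=1.96):
    """Sample size n such that the normal-theory confidence-interval
    half-width z*s/sqrt(n) equals the stipulated margin of error:
    n = (z*s/margin)^2, rounded up.  s is a (pilot) estimate of the
    population standard deviation; z defaults to the 95% quantile."""
    return math.ceil((z * s / margin) ** 2)
```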

15.
Estimating equations that are not necessarily likelihood-based score equations are becoming increasingly popular for estimating regression model parameters. This paper is concerned with estimation based on general estimating equations when true covariate data are missing for all the study subjects, but surrogate or mismeasured covariates are available instead. The method is motivated by the covariate measurement error problem in marginal or partly conditional regression of longitudinal data. We propose to base estimation on the expectation of the complete-data estimating equation conditioned on the available data. The regression parameters and other nuisance parameters are estimated simultaneously by solving the resulting estimating equations. The expected estimating equation (EEE) estimator equals the maximum likelihood estimator when the complete-data scores are likelihood scores and conditioning is with respect to all the available data. A pseudo-EEE estimator, which requires less computation, is also investigated. Asymptotic distribution theory is derived. Small-sample simulations are conducted for the case where the error process is an order-1 autoregressive model. Regression calibration is extended to this setting and compared with the EEE approach. We demonstrate the methods on data from a longitudinal study of the relationship between childhood growth and adult obesity.

16.
In this paper we present a simulation study comparing different methods for estimating the prediction error rate in a discrimination problem. We consider cross-validation, bootstrap, and Bayesian bootstrap methods for this problem, while also elaborating on both the simple and Bayesian bootstrap methods via smoothing techniques. We observe that the smoothing procedure leads to improvements in the estimation of the true error rate of the discrimination rule, especially in the case of the smooth Bayesian bootstrap estimator, whose reduction in MSE results from the high positive correlation between the true error rate and its estimates under this method.
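Of the estimators compared above, cross-validation is the most direct: each observation is classified by a rule trained on the remaining data, and the misclassification fraction estimates the true error rate. A leave-one-out sketch with a toy nearest-mean discrimination rule on 1-D features (both function names are ours, not the paper's):

```python
def loo_error_rate(X, y, classify):
    """Leave-one-out cross-validation estimate of the error rate of a
    discrimination rule.  `classify(train_X, train_y, x)` must return
    a predicted label for x."""
    n = len(X)
    errors = 0
    for i in range(n):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if classify(train_X, train_y, X[i]) != y[i]:
            errors += 1
    return errors / n

def nearest_mean(train_X, train_y, x):
    """Toy discrimination rule for 1-D features: assign x to the class
    whose training mean is closest."""
    labels = set(train_y)
    means = {c: sum(v for v, l in zip(train_X, train_y) if l == c)
                / sum(1 for l in train_y if l == c) for c in labels}
    return min(means, key=lambda c: abs(x - means[c]))
```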

17.
Illumina BeadArrays are becoming an increasingly popular microarray platform due to their high data quality and relatively low cost. One distinctive feature of Illumina BeadArrays is that each array has thousands of negative-control bead types containing oligonucleotide sequences that are not specific to any target genes in the genome. This design provides a way of directly estimating the distribution of the background noise. In the literature on background correction for BeadArray data, the information from negative-control beads is either ignored, used in a naive way that can lead to a loss of efficiency, or combined with the assumption that the noise is normally distributed. However, we show with real data that the noise can be skewed. In this study, we propose an exponential-gamma convolution model for background correction of Illumina BeadArray data. Using both simulated and real data examples, we show that the proposed method can improve signal estimation and the detection of differentially expressed genes when the signal-to-noise ratio is large and the noise has a skewed distribution.

18.
We extend the Bayesian Model Averaging (BMA) framework to dynamic panel data models with endogenous regressors using a Limited Information Bayesian Model Averaging (LIBMA) methodology. Monte Carlo simulations confirm the asymptotic performance of our methodology in both model averaging and selection, with high posterior inclusion probabilities for all relevant regressors and parameter estimates very close to their true values. In addition, we illustrate the use of LIBMA by estimating a dynamic gravity model for bilateral trade. Once model uncertainty, dynamics, and endogeneity are accounted for, we find several factors that are robustly correlated with bilateral trade. We also find that applying methodologies that do not account for dynamics or endogeneity (or both) results in different sets of robust determinants.

19.
The product-limit or Kaplan-Meier (KM) estimator is commonly used to estimate the survival function in the presence of incomplete time-to-event data. Applying this method inherently assumes that the occurrence of an event is known with certainty. However, the clinical diagnosis of an event is often subject to misclassification due to assay error or adjudication error, by which the event is assessed with some uncertainty. In the presence of such errors, the true distribution of the time to first event would not be estimated accurately by the KM method. We develop a method to estimate the true survival distribution by incorporating negative and positive predictive values into a KM-like method of estimation. This allows us to quantify the bias in the KM survival estimates due to misclassified events in the observed data. We present an unbiased estimator of the true survival function and its variance. Asymptotic properties of the proposed estimators are provided and examined through simulations. We demonstrate our methods using data from the Viral Resistance to Antiviral Therapy of Hepatitis C study.
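The product-limit construction referred to above multiplies, over the distinct event times, the factors (1 − d_t/n_t), where d_t is the number of events and n_t the number at risk at time t. A minimal sketch of the uncorrected KM estimator (function name ours; the paper's PPV/NPV adjustment is not included):

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate.  `events[i]` is 1 if an
    event was observed at `times[i]` and 0 if the subject was censored
    then.  Returns the distinct event times and the survival estimate
    S(t) just after each of them."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    s = 1.0
    out_times, out_surv = [], []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Group all subjects with the same observed time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            s *= 1 - deaths / at_risk
            out_times.append(t)
            out_surv.append(s)
        at_risk -= removed
    return out_times, out_surv
```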

20.
We consider asymmetric kernel estimates based on grouped data. We propose an iterated scheme for constructing such an estimator and apply an iterated smoothed bootstrap approach for bandwidth selection. We compare our approach with competing methods in estimating actuarial loss models using both simulations and data studies. The simulation results show that with this new method, the estimated density from grouped data matches the true density more closely than with competing approaches.
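For context, a standard asymmetric-kernel estimator for non-negative data is the gamma-kernel estimator, which averages Gamma(x/b + 1, b) densities evaluated at the data points, following Chen's gamma-kernel construction; the abstract's iterated grouped-data scheme and smoothed-bootstrap bandwidth selection are not reproduced. A sketch for ungrouped data (function names ours):

```python
import math

def gamma_pdf(x, shape, scale):
    """Density of the Gamma(shape, scale) distribution at x >= 0."""
    if x < 0:
        return 0.0
    return (x ** (shape - 1) * math.exp(-x / scale)
            / (math.gamma(shape) * scale ** shape))

def gamma_kernel_density(x, sample, b):
    """Asymmetric (gamma) kernel density estimate at x >= 0 with
    bandwidth b: the average of Gamma(x/b + 1, b) densities evaluated
    at the non-negative data points, which avoids the boundary bias of
    symmetric kernels at zero."""
    return sum(gamma_pdf(xi, x / b + 1.0, b) for xi in sample) / len(sample)
```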
