Similar Articles
20 similar articles found.
1.
Zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models are recommended for handling excessive zeros in count data. For various reasons, researchers may not address zero inflation. This paper helps educate researchers on (1) the importance of accounting for zero inflation and (2) the consequences of misspecifying the statistical model. Using simulations, we found that when the zero inflation in the data was ignored, estimation was poor and statistically significant findings were missed. When overdispersion within the zero-inflated data was ignored, poor estimation and inflated Type I errors resulted. Recommendations on when to use the ZINB and ZIP models are provided. In an illustration using a two-step model selection procedure (likelihood ratio test and the Vuong test), the procedure correctly identified the ZIP model only when the distributions had moderate means and sample sizes; it did not correctly identify the ZINB model or the zero inflation in the ZIP and ZINB distributions.
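A minimal sketch of such a comparison, assuming the zero-inflated count models in statsmodels (ZeroInflatedPoisson, ZeroInflatedNegativeBinomialP) are available; the simulated data, coefficients, and inflation probability are illustrative, not the paper's simulation design:

```python
# Simulate zero-inflated negative binomial counts, then fit ZIP and ZINB and
# compare their fits; ignoring the overdispersion (i.e., using ZIP here) is the
# kind of misspecification the paper studies.
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import (ZeroInflatedPoisson,
                                              ZeroInflatedNegativeBinomialP)

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
X = sm.add_constant(x)

mu = np.exp(0.5 + 0.8 * x)                     # negative binomial mean
r = 1.5                                        # dispersion (size) parameter
counts = rng.negative_binomial(r, r / (r + mu))
y = np.where(rng.random(n) < 0.3, 0, counts)   # 30% structural zeros

zip_fit = ZeroInflatedPoisson(y, X, exog_infl=np.ones((n, 1))).fit(disp=False)
zinb_fit = ZeroInflatedNegativeBinomialP(y, X, exog_infl=np.ones((n, 1))).fit(
    method="bfgs", maxiter=500, disp=False)

print("ZIP  log-likelihood:", zip_fit.llf, " AIC:", zip_fit.aic)
print("ZINB log-likelihood:", zinb_fit.llf, " AIC:", zinb_fit.aic)
```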

2.
Recursive partitioning algorithms separate a feature space into a set of disjoint rectangles. Then, usually, a constant is fitted in every partition. While this is a simple and intuitive approach, it may still lack interpretability as to how a specific relationship between dependent and independent variables may look. Or it may be that a certain model is assumed or of interest, and there are a number of candidate variables that may non-linearly give rise to different model parameter values. We present an approach that combines generalized linear models (GLM) with recursive partitioning, offering enhanced interpretability over classical trees as well as an explorative way to assess a candidate variable's influence on a parametric model. This method conducts recursive partitioning of a GLM by (1) fitting the model to the data set, (2) testing for parameter instability over a set of partitioning variables, and (3) splitting the data set with respect to the variable associated with the highest instability. The outcome is a tree in which each terminal node is associated with a GLM. We show the method's versatility and its suitability for gaining additional insight into the relationship between dependent and independent variables with two examples (modelling voting behaviour and a failure model for debt amortization), and compare it with alternative approaches.
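A simplified sketch of the recursive scheme described above; the paper uses formal parameter-instability (fluctuation) tests, whereas here a median split scored by the gain in GLM log-likelihood stands in as the split criterion, and all names and thresholds are illustrative:

```python
# Model-based recursive partitioning, simplified: fit a GLM, try a median split
# on each candidate partitioning variable, and recurse on the split that most
# improves the summed log-likelihood (if the gain is large enough).
import numpy as np
import statsmodels.api as sm

def glm_llf(y, X, family):
    return sm.GLM(y, sm.add_constant(X), family=family).fit().llf

def mob(y, X, Z, family=sm.families.Binomial(), min_gain=5.0, min_node=30,
        depth=0, max_depth=2):
    base = glm_llf(y, X, family)
    best_j, best_gain, best_cut = None, -np.inf, None
    for j in range(Z.shape[1]):
        cut = np.median(Z[:, j])
        left = Z[:, j] <= cut
        if left.sum() < min_node or (~left).sum() < min_node:
            continue
        gain = glm_llf(y[left], X[left], family) \
             + glm_llf(y[~left], X[~left], family) - base
        if gain > best_gain:
            best_j, best_gain, best_cut = j, gain, cut
    if best_j is None or best_gain < min_gain or depth >= max_depth:
        print("  " * depth + f"leaf (n={len(y)})")
        return
    print("  " * depth + f"split on Z[:, {best_j}] at {best_cut:.3f} (gain {best_gain:.1f})")
    left = Z[:, best_j] <= best_cut
    mob(y[left], X[left], Z[left], family, min_gain, min_node, depth + 1, max_depth)
    mob(y[~left], X[~left], Z[~left], family, min_gain, min_node, depth + 1, max_depth)
```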

3.
4.
In this research, we employ Bayesian inference and stochastic dynamic programming approaches to select the binomial population with the largest probability of success from n independent Bernoulli populations, based upon the sample information. To do this, we first define a probability measure, called a belief, for the event of selecting the best population. Second, we explain how to model the selection problem using Bayesian inference. Third, we clarify the model by which we improve the beliefs and prove that it converges to selecting the best population. In this iterative approach, we update the beliefs by taking new observations on the populations under study, using Bayes' rule and the prior beliefs. Fourth, we model the problem of making the decision in a predetermined number of decision stages using the stochastic dynamic programming approach. Finally, in order to understand and evaluate the proposed methodology, we provide two numerical examples and a comparison study by simulation. The results of the comparison study show that the proposed method performs better than that of Levin and Robbins (1981, Proc. Nat. Acad. Sci. USA 78: 4663–4666) for some values of the estimated probability of making a correct selection.
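A minimal sketch of the belief-updating step (the Beta–Bernoulli posterior update and a Monte Carlo estimate of the belief that each population is best); it does not implement the paper's stochastic dynamic program, and the true success probabilities below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
true_p = np.array([0.55, 0.60, 0.70])      # unknown Bernoulli success probabilities
alpha, beta = np.ones(3), np.ones(3)       # Beta(1, 1) priors on each probability

for stage in range(50):                    # decision stages: observe one draw per population
    obs = rng.random(3) < true_p
    alpha += obs
    beta += 1 - obs

# Belief that each population is the best, by sampling from the posteriors.
draws = rng.beta(alpha[:, None], beta[:, None], size=(3, 10_000))
belief = (draws.argmax(axis=0)[None, :] == np.arange(3)[:, None]).mean(axis=1)
print("posterior belief each population is best:", belief.round(3))
```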

5.
We provide an estimation procedure for the two-parameter Gamma distribution based on the Algorithmic Inference approach. As a key feature of this approach, we compute the joint probability distribution of these parameters without assuming any prior. To this end, we propose a numerical algorithm which often benefits from a highly efficient speed-up based on an approximate analytical expression of the probability distribution. We contrast our interval and point estimates with those recently obtained in Son and Oh (2006, Communications in Statistics – Simulation and Computation 35: 285–293) for the same problem. From this benchmark we find that our estimates are both unbiased and more accurate, albeit in some cases more dispersed than the competing methods, where the dispersion drawback is notably mitigated relative to Bayesian methods by a greater decorrelation of the estimates. We also briefly discuss the theoretical novelty of the adopted inference paradigm, which represents a brush-up of a Fisher perspective dating back almost a century, made feasible today by the available computational tools.

6.
In this paper, we propose a methodology to analyze longitudinal data through distances between pairs of observations (or individuals) with regard to the explanatory variables used to fit continuous response variables. Restricted maximum likelihood and generalized least squares are used to estimate the parameters in the model. We applied this new approach to study the effect of gender and exposure on tolerance of deviant behavior for a group of youths studied over a period of 5 years. We performed simulations comparing our distance-based method with classical longitudinal analysis under both AR(1) and compound-symmetry correlation structures. We compared them using the Akaike and Bayesian information criteria and the relative efficiency of the generalized variance of the errors of each model. We found small gains in fit for the proposed model relative to the classical methodology, particularly in small samples, regardless of the variance, correlation, autocorrelation structure and number of time measurements.
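A minimal sketch of the classical comparison referred to above, assuming statsmodels' GEE with Autoregressive and Exchangeable working correlation structures; the variable names (id, time, exposure, tolerance) and the simulated data are illustrative:

```python
# Fit the same longitudinal mean model under AR(1) and compound-symmetry
# (exchangeable) working correlation and compare the estimated coefficients.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.genmod.cov_struct import Autoregressive, Exchangeable

rng = np.random.default_rng(2)
n_subj, n_time = 50, 5
df = pd.DataFrame({
    "id": np.repeat(np.arange(n_subj), n_time),
    "time": np.tile(np.arange(n_time), n_subj),
    "exposure": np.repeat(rng.integers(0, 2, n_subj), n_time),
})
df["tolerance"] = 1 + 0.3 * df["time"] + 0.5 * df["exposure"] + rng.normal(0, 1, len(df))

X = sm.add_constant(df[["time", "exposure"]])
for name, cov in [("AR(1)", Autoregressive()), ("compound symmetry", Exchangeable())]:
    fit = sm.GEE(df["tolerance"], X, groups=df["id"], time=df["time"],
                 cov_struct=cov).fit()
    print(name, fit.params.values.round(3))
```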

7.
A mixed survival function is derived from a proportional hazards model with Gamma or Inverse Gaussian frailty and a Weibull baseline via a Bayesian approach, and maximum likelihood estimation (MLE) is proposed to estimate the parameters in the population. Through intensive simulations designed to generate right-censored data from the proposed survival function, we compare the MLE in this article with Cox estimates (in many different versions) of the covariate coefficient. Finally, the proposed survival model is applied to fit the leukemia data from Cox and Oakes (1992, Analysis of Survival Data, Chapman & Hall).
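A minimal sketch of the data-generating side (Weibull baseline, Gamma frailty, right censoring) paired with a standard Cox fit from lifelines for the covariate coefficient; the parameter values are illustrative and this is not the article's estimator:

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(3)
n = 1000
x = rng.integers(0, 2, n)                      # binary covariate
beta, shape, scale = 0.7, 1.5, 2.0             # true coefficient, Weibull baseline
frailty = rng.gamma(2.0, 0.5, n)               # Gamma frailty with mean 1

# Invert S(t | x, z) = exp(-z * (t / scale)**shape * exp(beta * x)) to draw event times.
u = rng.random(n)
t = scale * (-np.log(u) / (frailty * np.exp(beta * x))) ** (1 / shape)
c = rng.exponential(3.0, n)                    # independent right-censoring times
df = pd.DataFrame({"T": np.minimum(t, c), "E": (t <= c).astype(int), "x": x})

cph = CoxPHFitter().fit(df, duration_col="T", event_col="E")
print(cph.params_)                             # marginal Cox estimate of the coefficient
```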

8.
In this article, we discuss the estimation of linear functions of two Poisson means on which an order restriction is given. We give a necessary and sufficient condition on the coefficients of the linear function for the maximum likelihood estimator (MLE) satisfying the order restriction to dominate the unbiased estimator under squared error loss. Furthermore, simultaneous estimation of two ordered Poisson means is considered, and we suggest a Clevenson–Zidek type modification of the MLE which dominates the MLE under normalized squared error loss. We also improve the estimator proposed by Clevenson and Zidek (1975, J. Amer. Statist. Assoc. 70: 698–705), which ignores the order restriction.
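A minimal sketch of the order-restricted MLE for two Poisson means under the constraint lambda1 <= lambda2 (not the article's dominance results): keep the sample means when they already satisfy the order, otherwise replace both by the pooled mean, as isotonic regression prescribes.

```python
import numpy as np

def ordered_poisson_mle(x1, x2):
    """Order-restricted MLE of (lambda1, lambda2) with lambda1 <= lambda2,
    given count samples x1 and x2."""
    m1, m2 = np.mean(x1), np.mean(x2)
    if m1 <= m2:
        return m1, m2
    pooled = (np.sum(x1) + np.sum(x2)) / (len(x1) + len(x2))
    return pooled, pooled

rng = np.random.default_rng(4)
print(ordered_poisson_mle(rng.poisson(2.0, 20), rng.poisson(2.2, 20)))
```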

9.
The problem of density estimation arises naturally in many contexts. In this paper, we consider the approach of using a piecewise constant function to approximate the underlying density. We present a new density estimation method via the random forest method based on Bayesian Sequential Partitioning (BSP) (Lu, Jiang, and Wong 2013, Journal of the American Statistical Association 108(504): 1402–1410). Extensive simulations are carried out with comparison to the kernel density estimation method, the BSP method, and four local kernel density estimation methods. The experimental results show that the new method is capable of providing accurate and reliable density estimation, even at the boundary, especially for i.i.d. data. In addition, the likelihood of the out-of-bag density estimation, which is a byproduct of the training process, is an effective hyperparameter selection criterion.

10.
The Self-Healing Umbrella Sampling (SHUS) algorithm is an adaptive biasing algorithm which was proposed in Marsili et al. (J Phys Chem B 110(29):14011–14013, 2006) in order to efficiently sample a multimodal probability measure. We show that this method can be seen as a variant of the well-known Wang–Landau algorithm of Wang and Landau (Phys Rev E 64:056101, 2001a; Phys Rev Lett 86(10):2050–2053, 2001b). Adapting results on the convergence of the Wang–Landau algorithm obtained in Fort et al. (Math Comput 84(295):2297–2327, 2014a), we prove the convergence of the SHUS algorithm. We also compare the two methods in terms of efficiency. We finally propose a modification of the SHUS algorithm in order to increase its efficiency, and exhibit some similarities of SHUS with the well-tempered metadynamics method of Barducci et al. (Phys Rev Lett 100:020603, 2008).
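A minimal sketch of a Wang–Landau-style adaptive biasing run on a one-dimensional double-well density, much simpler than the SHUS analysis above but illustrating the same mechanism: a log-weight per stratum is learned adaptively so that all strata, including rarely visited ones, end up occupied roughly equally. All tuning constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
log_pi = lambda x: -((x * x - 1.0) ** 2) / 0.1     # bimodal target, wells at +-1
edges = np.linspace(-2.0, 2.0, 21)                 # 20 strata
theta = np.zeros(len(edges) - 1)                   # adaptive log-weights (the bias)
hist = np.zeros_like(theta)

def stratum(x):
    return int(np.clip(np.searchsorted(edges, x) - 1, 0, len(theta) - 1))

x, gamma = -1.0, 1.0
for it in range(200_000):
    y = x + rng.normal(0.0, 0.3)                   # symmetric random-walk proposal
    if -2.0 < y < 2.0:
        log_acc = (log_pi(y) - theta[stratum(y)]) - (log_pi(x) - theta[stratum(x)])
        if np.log(rng.random()) < log_acc:
            x = y
    i = stratum(x)
    theta[i] += gamma                              # penalize the currently visited stratum
    hist[i] += 1
    gamma = 1.0 / (1.0 + it / 1000.0)              # decreasing adaption step

print("occupation counts per stratum:", hist.astype(int))
```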

11.
It is common to have experiments in which it is not possible to observe the exact lifetimes but only the interval in which they occur. This sort of data presents a high number of ties and is called grouped or interval-censored survival data. Regression methods for grouped data are available in the statistical literature. The regression structure models the probability of a subject's survival past a visit time conditional on survival at the previous visit. Two approaches are presented, assuming that the lifetimes come from (1) a continuous proportional hazards model or (2) a logistic model. However, there may be situations in which neither model is adequate for a particular data set. This article proposes the generalized log-normal model as an alternative model for discrete survival data. This model was introduced by Chen (1995, Comput. Stat. Data Anal. 19) and is extended in this article to grouped survival data. A real example related to Chagas disease illustrates the proposed model.
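A minimal sketch of the discrete-time formulation referred to above, using the logistic variant (not the article's generalized log-normal model): each subject is expanded into one row per interval at risk, and the conditional probability of failing in an interval given survival up to it is modelled by logistic regression. Column names and the simulated hazard are illustrative.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
n, n_intervals = 300, 6
x = rng.integers(0, 2, n)                          # e.g., a treatment indicator
hazard = 0.15 * np.exp(0.6 * x)                    # per-interval discrete hazard

rows = []
for i in range(n):
    for t in range(1, n_intervals + 1):            # person-period expansion
        fail = rng.random() < hazard[i]
        rows.append({"id": i, "interval": t, "x": x[i], "event": int(fail)})
        if fail:
            break
pp = pd.DataFrame(rows)

# Discrete-time hazard model: logit P(fail in interval t | survived to t).
fit = smf.logit("event ~ C(interval) + x", data=pp).fit(disp=False)
print(fit.params)
```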

12.
Müller et al. (Stat Methods Appl, 2017) provide an excellent review of several classes of Bayesian nonparametric models which have found widespread application in a variety of contexts, successfully highlighting their flexibility in comparison with parametric families. Particular attention in the paper is dedicated to modelling spatial dependence. Here we contribute by concisely discussing general computational challenges that arise in posterior inference with Bayesian nonparametric models, as well as certain aspects of modelling temporal dependence.

13.
This article presents some applications of time-series procedures to solve two typical problems that arise when analyzing demographic information in developing countries: (1) unavailability of annual time series of population growth rates (PGRs) and their corresponding population time series, and (2) inappropriately defined population growth goals in official population programs. These problems are treated as situations that require combining information from population time series. First, we suggest the use of temporal disaggregation techniques to combine census data with vital statistics information in order to estimate annual PGRs. Second, we apply multiple restricted forecasting to combine the official targets on future PGRs with the disaggregated series. We then propose a mechanism to evaluate the compatibility of the demographic goals with the annual data. We apply these procedures to data of the Mexico City Metropolitan Zone divided into concentric rings and conclude that the targets established in the official program are not feasible. Hence, we derive future PGRs that are in line both with the official targets and with the historical demographic behavior. We conclude that population growth programs should be based on this kind of analysis in order to be supported empirically. Thus, through specialized multivariate time-series techniques, we propose first obtaining an optimal estimate of a disaggregated vector of population time series and then producing restricted forecasts in agreement with the data-based population policies derived here.

14.
Econometric Reviews, 2013, 32(4): 425–443
The integer-valued AR(1) model is generalized to encompass some of the more likely features of economic time series of count data. The generalizations come at the price of losing exact distributional properties. For most specifications, the first- and second-order conditional and unconditional moments can be obtained. Hence estimation, testing and forecasting are feasible and can be based on least squares or GMM techniques. An illustration based on the number of plants within an industrial sector is considered.
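A minimal sketch of the baseline integer-valued AR(1) model that the article generalizes: a binomial-thinning recursion with Poisson innovations, with conditional least squares estimates obtained by regressing X_t on X_{t-1} (slope estimating the thinning probability, intercept estimating the innovation mean). Parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
alpha, lam, T = 0.6, 2.0, 5000
x = np.zeros(T, dtype=int)
x[0] = rng.poisson(lam / (1 - alpha))          # start near the stationary mean
for t in range(1, T):
    # X_t = alpha o X_{t-1} + eps_t: binomial thinning plus a Poisson innovation.
    x[t] = rng.binomial(x[t - 1], alpha) + rng.poisson(lam)

# Conditional least squares: E[X_t | X_{t-1}] = alpha * X_{t-1} + lambda.
design = np.column_stack([np.ones(T - 1), x[:-1]])
intercept, slope = np.linalg.lstsq(design, x[1:], rcond=None)[0]
print(f"CLS estimates: alpha ~ {slope:.3f}, lambda ~ {intercept:.3f}")
```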

15.
In this article, we propose a robust statistical approach to selecting an appropriate error distribution in a classical multiplicative heteroscedastic model. In a first step, unlike the traditional approach, we do not use any GARCH-type estimation of the conditional variance. Instead, we propose to use a recently developed nonparametric procedure [D. Mercurio and V. Spokoiny, Statistical inference for time-inhomogeneous volatility models, Ann. Stat. 32 (2004), pp. 577–602]: local adaptive volatility estimation. The motivation for using this method is to avoid a possible model misspecification of the conditional variance. In a second step, we suggest a set of estimation and model selection procedures (Berk–Jones tests, kernel density-based selection, censored likelihood score, and coverage probability) based on the resulting residuals. These methods enable us to assess the global fit of a set of distributions as well as to focus on their behaviour in the tails, giving us the capacity to map the strengths and weaknesses of the candidate distributions. A bootstrap procedure is provided to compute the rejection regions in this semiparametric context. Finally, we illustrate our methodology through a small simulation study and an application to three time series of daily returns (UBS stock returns, BOVESPA returns and EUR/USD exchange rates).
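A heavily simplified sketch of the residual-based selection idea: a short rolling window stands in for the local adaptive volatility estimator, returns are standardized by it, and candidate error distributions are then scored by out-of-sample log-likelihood rather than the full battery of tests listed above. Data and window length are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
t_idx = np.arange(2000)
ret = rng.standard_t(5, size=2000) * 0.01 * (1 + 0.5 * np.sin(t_idx / 200))

window = 50
sigma = np.array([ret[t - window:t].std() for t in range(window, len(ret))])
z = ret[window:] / sigma                            # standardized residuals

split = len(z) // 2
train, test = z[:split], z[split:]
candidates = {
    "normal":    stats.norm(*stats.norm.fit(train)),
    "student-t": stats.t(*stats.t.fit(train)),
}
for name, dist in candidates.items():
    print(name, "out-of-sample log score:", round(dist.logpdf(test).sum(), 1))
```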

16.
Kadilar and Cingi [Ratio estimators in simple random sampling, Appl. Math. Comput. 151 (3) (2004), pp. 893–902] introduced some ratio-type estimators of the finite population mean under simple random sampling. Kadilar and Cingi [New ratio estimators using correlation coefficient, Interstat 4 (2006), pp. 1–11] later suggested another form of ratio-type estimators by modifying the estimator developed by Singh and Tailor [Use of known correlation coefficient in estimating the finite population mean, Stat. Transit. 6 (2003), pp. 655–560]. Kadilar and Cingi [Improvement in estimating the population mean in simple random sampling, Appl. Math. Lett. 19 (1) (2006), pp. 75–79] suggested yet another class of ratio-type estimators by taking a weighted average of the two known classes of estimators referenced above. In this article, we propose an alternative form of ratio-type estimators which is better than the competing ratio, regression, and other ratio-type estimators considered here. The results are also supported by the analysis of three real data sets that were considered by Kadilar and Cingi.
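A minimal sketch of the classical ratio estimator under simple random sampling that these modified estimators build on, together with its usual first-order variance approximation; the population and sample below are simulated for illustration.

```python
import numpy as np

def ratio_estimate(y, x, X_bar, N):
    """Ratio estimate of the population mean of y, using auxiliary variable x
    with known population mean X_bar, under SRS of size n from N units."""
    y, x = np.asarray(y, float), np.asarray(x, float)
    n = len(y)
    R = y.mean() / x.mean()
    estimate = R * X_bar
    resid = y - R * x
    variance = (1 - n / N) / n * resid.var(ddof=1)   # first-order approximation
    return estimate, variance

rng = np.random.default_rng(9)
N = 5000
x_pop = rng.gamma(4.0, 2.0, N)
y_pop = 3.0 * x_pop + rng.normal(0.0, 2.0, N)        # y roughly proportional to x
sample = rng.choice(N, size=100, replace=False)
print(ratio_estimate(y_pop[sample], x_pop[sample], x_pop.mean(), N))
```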

17.
Abstract

This article presents a general method of inference for the parameters of a continuous distribution with two unknown parameters. Except for a few distributions such as the normal distribution, the classical approach fails in this context to provide accurate inferences with small samples. Therefore, by taking the generalized approach to inference (cf. Weerahandi, 1995, Exact Statistical Methods for Data Analysis, Springer-Verlag), in this article we present a general method of inference to tackle practically useful two-parameter distributions such as the gamma distribution, as well as distributions of theoretical interest such as the two-parameter uniform distribution. The proposed methods are exact in the sense that they are based on exact probability statements and exact expected values. The advantage of the generalized approach over classical approximate inferences is shown via simulation studies.

This article has the potential to motivate much needed further research in non-normal regressions, multiparameter problems, and multivariate problems for which essentially only large-sample inferences are available. The approach that we take should pave the way for researchers to solve a variety of non-normal problems, including ANOVA and MANOVA problems, where even the Bayesian approach fails. In the context of testing of hypotheses, the proposed method provides a superior alternative to the classical generalized likelihood ratio method.

18.
We investigate the issue of bandwidth estimation in a functional nonparametric regression model with function-valued, continuous real-valued and discrete-valued regressors under the framework of unknown error density. Extending the recent work of Shang (2013) ['Bayesian Bandwidth Estimation for a Nonparametric Functional Regression Model with Unknown Error Density', Computational Statistics & Data Analysis, 67, 185–198], we approximate the unknown error density by a kernel density estimator of the residuals, where the regression function is estimated by the functional Nadaraya–Watson estimator, which admits mixed types of regressors. We derive a likelihood and posterior density for the bandwidth parameters under the kernel-form error density, and put forward a Bayesian bandwidth estimation approach that can estimate the bandwidths simultaneously. Simulation studies demonstrate the estimation accuracy of the regression function and error density for the proposed Bayesian approach. Illustrated by a spectroscopy data set from food quality control, we apply the proposed Bayesian approach to select the optimal bandwidths in a functional nonparametric regression model with mixed types of regressors.
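A heavily simplified sketch of the two-bandwidth idea: a Nadaraya–Watson fit with bandwidth h, a kernel-form error density with bandwidth b built from the residuals, and the pair (h, b) chosen by maximizing the resulting likelihood. A scalar regressor stands in for the functional one, and a grid search stands in for the Bayesian sampling; everything here is illustrative.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(10)
n = 200
x = rng.uniform(0, 1, n)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, n)

def nw_fit(h):
    """Leave-one-out Nadaraya-Watson fit with Gaussian kernel and bandwidth h."""
    w = norm.pdf((x[:, None] - x[None, :]) / h)
    np.fill_diagonal(w, 0.0)
    return (w @ y) / w.sum(axis=1)

def log_lik(h, b):
    """Log-likelihood of the kernel-form error density with bandwidth b."""
    e = y - nw_fit(h)
    k = norm.pdf((e[:, None] - e[None, :]) / b)
    np.fill_diagonal(k, 0.0)
    return np.sum(np.log(k.sum(axis=1) / ((n - 1) * b)))

grid = np.linspace(0.02, 0.3, 15)
h_best, b_best = max(((h, b) for h in grid for b in grid), key=lambda hb: log_lik(*hb))
print("selected bandwidths (h, b):", round(h_best, 3), round(b_best, 3))
```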

19.
Many situations, especially in Bayesian statistical inference, call for the use of a Markov chain Monte Carlo (MCMC) method as a way to draw approximate samples from an intractable probability distribution. With the use of any MCMC algorithm comes the question of how long the algorithm must run before it can be used to draw an approximate sample from the target distribution. A common method of answering this question involves verifying that the Markov chain satisfies a drift condition and an associated minorization condition (Rosenthal, J Am Stat Assoc 90:558–566, 1995; Jones and Hobert, Stat Sci 16:312–334, 2001). This is often difficult to do analytically, so as an alternative, it is typical to rely on output-based methods of assessing convergence. The work presented here gives a computational method of approximately verifying a drift condition and a minorization condition specifically for the symmetric random-scan Metropolis algorithm. Two examples of the use of the method described in this article are provided, and output-based methods of convergence assessment are presented in each example for comparison with the upper bound on the convergence rate obtained via the simulation-based approach.

20.
Econometric Reviews, 2013, 32(3): 269–287
Abstract

In many applications, a researcher must select an instrument vector from a candidate set of instruments. If the ultimate objective is to perform inference about the unknown parameters using conventional asymptotic theory, then we argue that it is desirable for the chosen instrument vector to satisfy four conditions which we refer to as orthogonality, identification, efficiency, and non-redundancy. It is impossible to verify a priori which elements of the candidate set satisfy these conditions; this can only be done using the data. However, once the data are used in this fashion, it is important that the selection process does not contaminate the limiting distribution of the parameter estimator. We refer to this requirement as the inference condition. In a recent paper, Andrews [(1999). Consistent moment selection procedures for generalized method of moments estimation. Econometrica 67:543–564] has proposed a method of moment selection based on an information criterion involving the overidentifying restrictions test. This method can be shown to select an instrument vector which satisfies the orthogonality condition with probability one in the limit. In this paper, we consider the problem of instrument selection based on a combination of the efficiency and non-redundancy conditions, which we refer to as the relevance condition. It is shown that, within a particular class of models, certain canonical correlations form the natural metric for relevancy, and this leads us to propose a canonical correlations information criterion (CCIC) for instrument selection. We establish conditions under which our method satisfies the inference condition. We also consider the properties of an instrument selection method based on the sequential application of Andrews' (1999) method and the CCIC.
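A minimal sketch of the relevance metric mentioned above: sample canonical correlations between a candidate instrument set Z and the endogenous regressors X, computed by whitening each block with a QR decomposition and taking the singular values of the cross-product. The CCIC itself adds a model-selection penalty on top of these correlations, which is not reproduced here; the data are illustrative.

```python
import numpy as np

def canonical_correlations(X, Z):
    """Sample canonical correlations between the columns of X and Z."""
    X = X - X.mean(axis=0)
    Z = Z - Z.mean(axis=0)
    qx, _ = np.linalg.qr(X)                  # orthonormal basis for col(X)
    qz, _ = np.linalg.qr(Z)                  # orthonormal basis for col(Z)
    return np.linalg.svd(qx.T @ qz, compute_uv=False)

rng = np.random.default_rng(11)
n = 500
z_strong = rng.normal(size=(n, 2))           # relevant instruments
z_weak = rng.normal(size=(n, 2))             # nearly irrelevant instruments
x = z_strong @ np.array([[0.8], [0.5]]) + 0.05 * z_weak[:, :1] + rng.normal(size=(n, 1))

print("strong instrument set:", canonical_correlations(x, z_strong).round(3))
print("weak instrument set:  ", canonical_correlations(x, z_weak).round(3))
```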
