Similar Literature
A total of 20 similar records found.
1.
The generalized negative binomial (GNB) distribution was defined by Jain and Consul (SIAM J. Appl. Math., 21 (1971)) and was obtained as a particular family of Lagrangian distributions by Consul and Shenton (SIAM J. Appl. Math., 23 (1973)). Consul and Shenton also gave the probability generating function (p.g.f.) and proved many properties of the GNBD. Consul and Gupta (SIAM J. Appl. Math., 39 (1980)) proved that the parameter β must be either zero or 1 ≤ β ≤ θ⁻¹ for the GNBD to be a true probability distribution, and proved some other properties. Numerous applications and properties of this model have been studied by various researchers. Considering two independent GNB variates X and Y, with parameters (m,β,θ) and (n,β,θ) respectively, the probability distribution of D = Y−X, its p.g.f. and its cumulant generating function have been obtained. A recurrence relation between the cumulants has been established, and the first four cumulants, β₁ and β₂ have been derived. Some moments of the absolute difference |Y−X| have also been obtained.
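For orientation, the derivation rests on two standard identities for a difference of independent variates; a sketch in generic form (not specific to the GNBD parametrization):

```latex
% For independent X and Y with p.g.f.s g_X, g_Y, and D = Y - X:
g_D(t) = \mathrm{E}\!\left(t^{\,Y-X}\right) = g_Y(t)\, g_X\!\left(t^{-1}\right),
\qquad
K_D(u) = K_Y(u) + K_X(-u),
% so the cumulants combine as
\kappa_r(D) = \kappa_r(Y) + (-1)^r\, \kappa_r(X), \quad r = 1, 2, \dots
```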

2.
In this article we deal with simultaneous two-sided tolerance intervals for a univariate linear regression model with independent normally distributed errors. We present a method for determining the intervals derived by the general confidence-set approach (GCSA), i.e. the intervals are constructed based on a specified confidence set for the unknown parameters of the model. The confidence set used in the new method is formed from a suggested hypothesis test about all parameters of the model. In a preliminary numerical comparison with all the existing GCSA-based methods, the simultaneous two-sided tolerance intervals determined by the presented method are found to be efficient and fast to compute.

3.
This article investigates the effect of estimating the unknown degrees of freedom on efficient estimation of the remaining parameters in Spanos’ conditional t heteroskedastic model. We compare by simulation three maximum likelihood estimators (MLEs) of the remaining parameters: the MLE when all parameters, including the degrees of freedom, are estimated by maximum likelihood; the MLE when the degrees of freedom is estimated by a method of moments estimator; and the MLE when the degrees of freedom is erroneously specified. The latter two methods are found to perform poorly compared with the first for inference on the variance parameters of the model. Thus, efficient estimation of the degrees of freedom by maximum likelihood is important for efficient estimation of the remaining variance parameters.
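As a point of reference, a minimal sketch of a moment-based degrees-of-freedom estimator of the kind compared above, using the standard kurtosis identity for a Student-t (excess kurtosis equals 6/(ν − 4) for ν > 4); the function name and the use of raw residuals are illustrative assumptions, not the article's exact estimator:

```python
# Hedged sketch: a method-of-moments estimator of the t degrees of
# freedom via excess kurtosis. For a Student-t with nu > 4 the excess
# kurtosis is 6/(nu - 4), which inverts to nu = 4 + 6/kurtosis.
import numpy as np
from scipy.stats import kurtosis

def mom_df(residuals):
    g2 = kurtosis(residuals, fisher=True, bias=False)  # excess kurtosis
    if g2 <= 0:
        return np.inf  # sample no heavier-tailed than a normal
    return 4.0 + 6.0 / g2

rng = np.random.default_rng(0)
sample = rng.standard_t(df=8, size=5000)
print(mom_df(sample))  # should be near 8 for large samples
```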

4.
The estimation of the parameters of a mixed model analysis of variance by maximum likelihood methods is discussed. The functional iteration method is studied and found to have good computational properties. The estimates are studied via Monte Carlo techniques and their small-sample properties are observed; it is found that the MLEs may be biased but that they have good mean squared error properties.

5.
Nowadays, Bayesian methods are routinely used for estimating parameters of item response theory (IRT) models. However, the marginal likelihoods are still rarely used for comparing IRT models due to their complexity and a relatively high dimension of the model parameters. In this paper, we review Monte Carlo (MC) methods developed in the literature in recent years and provide a detailed development of how these methods are applied to the IRT models. In particular, we focus on the “best possible” implementation of these MC methods for the IRT models. These MC methods are used to compute the marginal likelihoods under the one-parameter IRT model with the logistic link (1PL model) and the two-parameter logistic IRT model (2PL model) for a real English Examination dataset. We further use the widely applicable information criterion (WAIC) and deviance information criterion (DIC) to compare the 1PL model and the 2PL model. The 2PL model is favored by all three Bayesian model comparison criteria for the English Examination data.
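For context, the simplest baseline estimator of a marginal likelihood is the naive Monte Carlo average of the likelihood over prior draws; a hedged sketch for a 1PL model (not necessarily among the paper's "best possible" implementations, and the standard-normal priors are illustrative assumptions):

```python
# Hedged sketch: naive Monte Carlo estimate of a 1PL marginal likelihood,
# averaging the likelihood over draws from the priors (log-sum-exp for
# numerical stability). Illustrative only.
import numpy as np
from scipy.special import expit, logsumexp

def log_marginal_1pl(y, n_draws=20000, rng=None):
    """y: (n_persons, n_items) 0/1 response matrix."""
    rng = rng or np.random.default_rng(0)
    n, j = y.shape
    logliks = np.empty(n_draws)
    for s in range(n_draws):
        theta = rng.normal(0, 1, size=n)   # abilities ~ N(0,1) prior (assumed)
        b = rng.normal(0, 1, size=j)       # difficulties ~ N(0,1) prior (assumed)
        p = expit(theta[:, None] - b[None, :])
        logliks[s] = np.sum(y * np.log(p) + (1 - y) * np.log1p(-p))
    return logsumexp(logliks) - np.log(n_draws)
```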

6.
Bounds for the maximum deviation between parameters of a finite population and their corresponding sample estimates are found in the multiple regression model. The parameters considered are the vector of regression coefficients and the value of the regression function for given values of the independent variable (or variables). Applications to several widely employed sampling methods are considered.

7.
Various methods for clustering mixed-mode data are compared. It is found that a method based on a finite mixture model, in which the observed categorical variables are generated from underlying continuous variables, outperforms more conventional methods when applied to artificially generated data. This method also performs best when applied to Fisher's iris data in which two of the variables are categorized by applying thresholds.

8.
The analysis of infectious disease data presents challenges arising from the dependence in the data and the fact that only part of the transmission process is observable. These difficulties are usually overcome by making simplifying assumptions. The paper explores the use of Markov chain Monte Carlo (MCMC) methods for the analysis of infectious disease data, with the hope that they will permit analyses to be made under more realistic assumptions. Two important kinds of data sets are considered, containing temporal and non-temporal information, from outbreaks of measles and influenza. Stochastic epidemic models are used to describe the processes that generate the data. MCMC methods are then employed to perform inference in a Bayesian context for the model parameters. The MCMC methods used include standard algorithms, such as the Metropolis–Hastings algorithm and the Gibbs sampler, as well as a new method that involves likelihood approximation. It is found that standard algorithms perform well in some situations but can exhibit serious convergence difficulties in others. The inferences that we obtain are in broad agreement with estimates obtained by other methods where they are available. However, we can also provide inferences for parameters which have not been reported in previous analyses.
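As a generic illustration of the standard algorithms mentioned above, a minimal random-walk Metropolis–Hastings sketch; the epidemic models themselves (and the data augmentation over unobserved infection times) are not reproduced here, and the toy log-posterior is an assumption for demonstration only:

```python
# Hedged sketch: random-walk Metropolis-Hastings over a scalar parameter.
import numpy as np
from scipy.special import expit

def rw_metropolis(log_post, x0, n_iter=10000, step=0.5, rng=None):
    rng = rng or np.random.default_rng(0)
    x, lp = x0, log_post(x0)
    chain = np.empty(n_iter)
    for i in range(n_iter):
        prop = x + step * rng.normal()
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:  # accept/reject
            x, lp = prop, lp_prop
        chain[i] = x
    return chain

# Toy target (assumed): posterior of a log-odds infection parameter with a
# N(0,1) prior and 7 infections out of 10 exposures.
def log_post(x):
    p = expit(x)
    return -0.5 * x**2 + 7 * np.log(p) + 3 * np.log(1 - p)

draws = rw_metropolis(log_post, x0=0.0)
```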

9.

Ordinal data are often modeled using a continuous latent response distribution, which is partially observed through windows of adjacent intervals defined by cutpoints. In this paper we propose the beta distribution as a model for the latent response. The beta distribution has several advantages over the other common distributions used, e.g., normal and logistic. In particular, it enables separate modeling of location and dispersion effects, which is essential in the Taguchi method of robust design. First, we study the problem of estimating the location and dispersion parameters of a single beta distribution (representing a single treatment) from ordinal data assuming known equispaced cutpoints. Two methods of estimation are compared: the maximum likelihood method and the method of moments. Two methods of treating the data are considered: in raw discrete form and in smoothed continuousized form. A large scale simulation study is carried out to compare the different methods. The mean square errors of the estimates are obtained under a variety of parameter configurations. Comparisons are made based on the ratios of the mean square errors (called the relative efficiencies). No method is universally the best, but the maximum likelihood method using continuousized data is found to perform generally well, especially for estimating the dispersion parameter. This method is also computationally much faster than the other methods and does not experience convergence difficulties in case of sparse or empty cells. Next, the problem of estimating unknown cutpoints is addressed. Here the multiple treatments setup is considered since, in an actual application, cutpoints are common to all treatments and must be estimated from all the data. A two-step iterative algorithm is proposed for estimating the location and dispersion parameters of the treatments, and the cutpoints. The proposed beta model and McCullagh's (1980) proportional odds model are compared by fitting them to two real data sets.
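A minimal sketch of the single-treatment, known-cutpoints likelihood described above, using the raw discrete data; the shape parametrization (a, b) and the toy counts are illustrative assumptions, since the paper works with a location/dispersion parametrization:

```python
# Hedged sketch: ML fit of a beta latent-response model to ordinal counts
# with known equispaced cutpoints on (0, 1). Cell probabilities are
# differences of the beta CDF at the cutpoints.
import numpy as np
from scipy.stats import beta
from scipy.optimize import minimize

counts = np.array([5, 20, 40, 25, 10])        # toy ordinal category counts
cuts = np.linspace(0, 1, len(counts) + 1)     # known equispaced cutpoints

def negloglik(log_ab):
    a, b = np.exp(log_ab)                     # keep shape parameters positive
    cell = np.diff(beta.cdf(cuts, a, b))      # P(latent falls in each window)
    return -np.sum(counts * np.log(cell + 1e-300))

fit = minimize(negloglik, x0=np.log([2.0, 2.0]), method="Nelder-Mead")
print(np.exp(fit.x))                          # estimated (a, b)
```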

10.
We propose a Bayesian model for physiologically based pharmacokinetics of 1,3-butadiene (BD). BD is classified as a suspected human carcinogen and exposure to it is common, especially through cigarette smoke as well as in urban settings. The main aim of the methodology and analysis that are presented here is to quantify variability in the rates of BD metabolism by human subjects. A three-compartmental model is described, together with informative prior distributions for the population parameters, all of which represent real physiological variables. The model is described in detail along with the meanings and interpretations of the associated parameters. A four-compartment model is also given for comparison. Markov chain Monte Carlo methods are described for fitting the model proposed. The model is fitted to toxicokinetic data obtained from 133 healthy subjects (males and females) from the four major racial groups in the USA, with ages ranging from 19 to 62 years. Subjects were exposed to 2 parts per million of BD for 20 min through a face mask by using a computer-controlled exposure and respiratory monitoring system. Stratification by ethnic group results in major changes in the physiological parameters. Sex and age were also tested but not found to have a significant effect.

11.
This paper describes a new program, CORRECT, which takes words rejected by the Unix® SPELL program, proposes a list of candidate corrections, and sorts them by probability score. The probability scores are the novel contribution of this work. They are based on a noisy channel model: it is assumed that the typist knows what words he or she wants to type, but some noise is added on the way to the keyboard (in the form of typos and spelling errors). Using a classic Bayesian argument of the kind that is popular in recognition applications, especially speech recognition (Jelinek, 1985), one can often recover the intended correction, c, from a typo, t, by finding the correction c that maximizes Pr(c) Pr(t|c). The first factor, Pr(c), is a prior model of word probabilities; the second factor, Pr(t|c), is a model of the noisy channel that accounts for spelling transformations on letter sequences (insertions, deletions, substitutions and reversals). Both sets of probabilities were estimated using data collected from the Associated Press (AP) newswire over 1988 and 1989 as a training set. The AP generates about 1 million words and 500 typos per week. In evaluating the program, we found that human judges were extremely reluctant to cast a vote given only the information available to the program, and that they were much more comfortable when they could see a concordance line or two. The second half of this paper discusses some very simple methods of modeling the context using n-gram statistics. Although n-gram methods are much too simple (compared with much more sophisticated methods used in artificial intelligence and natural language processing), we have found that even these very simple methods illustrate some very interesting estimation problems that will almost certainly come up when we consider more sophisticated models of context. The problem is how to estimate the probability of a context that we have not seen. We compare several estimation techniques and find that some are useless. Fortunately, we have found that the Good-Turing method provides an estimate of contextual probabilities that produces a significant improvement in program performance. Context is helpful in this application, but only if it is estimated very carefully. At this point, we have a number of different knowledge sources (the prior, the channel and the context), and there will certainly be more in the future. In general, performance will be improved as more and more knowledge sources are added to the system, as long as each additional knowledge source provides some new (independent) information. As we shall see, it is important to think more carefully about combination rules, especially when there are a large number of different knowledge sources.
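A minimal sketch of the noisy-channel ranking rule described above: pick the candidate c maximizing Pr(c) Pr(t|c). The probability tables here are toy placeholders (assumed values), whereas CORRECT estimated them from the AP newswire training set:

```python
# Hedged sketch: score candidate corrections for a typo by
# log Pr(c) + log Pr(t|c) and sort in decreasing order of score.
import math

prior = {"actress": 1.2e-5, "across": 4.8e-5, "acres": 2.1e-5}   # Pr(c), toy
channel = {("acress", "actress"): 1e-4,                          # Pr(t|c), toy
           ("acress", "across"): 2e-5,
           ("acress", "acres"): 3e-5}

def rank_corrections(typo, candidates):
    scored = [(math.log(prior[c]) + math.log(channel[(typo, c)]), c)
              for c in candidates]
    return sorted(scored, reverse=True)

print(rank_corrections("acress", ["actress", "across", "acres"]))
```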

12.
Permutation Tests for Linear Models
Several approximate permutation tests have been proposed for tests of partial regression coefficients in a linear model based on sample partial correlations. This paper begins with an explanation and notation for an exact test. It then compares the distributions of the test statistics under the various permutation methods proposed, and shows that the partial correlations under permutation are asymptotically jointly normal with means 0 and variances 1. The method of Freedman & Lane (1983) is found to have asymptotic correlation 1 with the exact test, and the other methods are found to have smaller correlations with this test. Under local alternatives the critical values of all the approximate permutation tests converge to the same constant, so they all have the same asymptotic power. Simulations demonstrate these theoretical results.
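For orientation, a minimal sketch of the Freedman & Lane (1983) procedure: permute the residuals of the reduced model, rebuild pseudo-responses, and refit the full model. For simplicity it uses the raw coefficient of x rather than a studentized statistic, which is an illustrative simplification:

```python
# Hedged sketch: Freedman-Lane permutation test for the partial
# coefficient of x in a regression of y on (Z, x).
import numpy as np

def freedman_lane(y, x, Z, n_perm=5000, rng=None):
    rng = rng or np.random.default_rng(0)
    Z1 = np.column_stack([np.ones(len(y)), Z])
    gam, *_ = np.linalg.lstsq(Z1, y, rcond=None)     # reduced model: y ~ Z
    fitted, resid = Z1 @ gam, y - Z1 @ gam
    X = np.column_stack([Z1, x])                     # full model design

    def coef_x(yy):                                  # coefficient of x
        bhat, *_ = np.linalg.lstsq(X, yy, rcond=None)
        return bhat[-1]

    t_obs = coef_x(y)
    t_perm = np.array([coef_x(fitted + rng.permutation(resid))
                       for _ in range(n_perm)])
    return np.mean(np.abs(t_perm) >= abs(t_obs))     # two-sided p-value
```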

13.
Measuring Total Factor Productivity by the Dual Approach
项歌德  朱平芳 《统计研究》2010,27(11):47-52
Building on Griliches and Jorgenson (1967) and Hsieh (1999, 2002), and incorporating Lucas's (1988) human capital model, this paper develops a human-capital-based dual approach to measuring total factor productivity (TFP). Taking Shanghai since the start of reform and opening up as the study object, TFP is measured by both the dual approach and the traditional method. The results confirm that changes in the economic growth rate stem more from TFP than from changes in factor inputs, while the two methods yield significantly different measurements. Further analysis traces this divergence to a high degree of inconsistency between direct and indirect capital prices. A discussion of the applicability of the two methods suggests that the dual approach is a useful complement to the traditional measurement method.
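For orientation, the primal (Solow-residual) and dual growth-accounting identities that the comparison rests on, stated in generic notation rather than the paper's human-capital-augmented form:

```latex
% Hats denote growth rates; s_K and s_L are factor income shares
% (s_K + s_L = 1); r and w are the capital and labor prices.
\hat{A}_{\text{primal}} = \hat{Y} - s_K \hat{K} - s_L \hat{L},
\qquad
\hat{A}_{\text{dual}} = s_K \hat{r} + s_L \hat{w}.
% The two coincide when factor payments exhaust output; inconsistency
% between direct and indirect capital prices drives a gap between them.
```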

14.
The variable selection problem is one of the most important tasks in regression analysis, especially in a high-dimensional setting. In this paper, we study this problem in the context of the scalar response functional regression model, a linear model with scalar response and functional regressors. The functional model can be represented as a certain multiple linear regression model via basis expansions of the functional variables. Based on this model and the random subspace method of Mielniczuk and Teisseyre (Comput Stat Data Anal 71:725–742, 2014), two simple variable selection procedures for the scalar response functional regression model are proposed. The final functional model is selected using generalized information criteria. Monte Carlo simulation studies and a real data example show very satisfactory finite-sample performance of the new variable selection methods. Moreover, they suggest that the considered procedures outperform solutions found in the literature in terms of correct model selection, false discovery rate control and prediction error.

15.
We investigate simulation methodology for Bayesian inference in Lévy-driven stochastic volatility (SV) models. Typically, Bayesian inference for such models is performed using Markov chain Monte Carlo (MCMC); this is often a challenging task. Sequential Monte Carlo (SMC) samplers are methods that can improve over MCMC; however, there are many user-set parameters to specify. We develop a fully automated SMC algorithm, which substantially improves over the standard MCMC methods in the literature. To illustrate our methodology, we look at a model comprising a Heston model with an independent, additive, variance gamma process in the returns equation. The driving gamma process can capture the stylized behaviour of many financial time series, and a discretized version, fit in a Bayesian manner, has been found to be very useful for modelling equity data. We demonstrate that it is possible to draw exact inference, in the sense of no time-discretization error, from the Bayesian SV model.

16.
The use of parametric linear mixed models and generalized linear mixed models to analyze longitudinal data collected during randomized controlled trials (RCTs) is conventional. The application of these methods, however, is restricted by the various assumptions they require. When the number of observations per subject is sufficiently large, and individual trajectories are noisy, functional data analysis (FDA) methods serve as an alternative to parametric longitudinal data analysis techniques. However, the use of FDA in RCTs is rare. In this paper, the effectiveness of FDA and linear mixed models (LMMs) was compared by analyzing data from rural persons living with HIV and comorbid depression enrolled in a depression treatment randomized clinical trial. Interactive voice response systems were used for weekly administrations of the 10-item Self-Administered Depression Scale (SADS) over 41 weeks. Functional principal component analysis and functional regression analysis methods detected a statistically significant difference in SADS between telephone-administered interpersonal psychotherapy (tele-IPT) and controls, but linear mixed effects model results did not. Additional simulation studies were conducted to compare FDA and LMMs under a different nonlinear trajectory assumption. In this clinical trial, with sufficient per-subject measured outcomes and individual trajectories that are noisy and nonlinear, we found FDA methods to be a better alternative to LMMs.

17.
In a class action litigation, actual damages are not known exactly and must be estimated. Various estimators are proposed and assessed by using a model that identifies possible sources of error. Estimators that have been used in practice are shown to be seriously biased. An empirical Bayes estimator and an empirical minimal mean squared error estimator are found to be more satisfactory methods for estimating damages.

18.
When a vector of sample proportions is not obtained through simple random sampling, the covariance matrix for the sample vector can differ substantially from the one corresponding to the multinomial model (Wilson 1989). For example, clustering effects or subject effects in repeated-measure experiments can cause the variance of the observed proportions to be much larger than the variances under the multinomial model. The phenomenon is generally referred to as overdispersion. Tallis (1962) proposed a model for identically distributed multinomials with a common measure of correlation and referred to it as the generalized multinomial model. This generalized multinomial model is extended in this article to account for overdispersion by allowing the vectors of proportions to vary according to a Dirichlet distribution. The generalized Dirichlet-multinomial model (as it is referred to here) allows for a second order of pairwise correlation among units, a type of assumption found reasonable in some biological data (Kupper and Haseman 1978) and introduced here to business data. An alternative derivation allowing for two kinds of variation is also considered. Asymptotic normal properties of parameter estimators are used to construct Wald statistics for testing hypotheses. The methods are illustrated with applications to monthly performance-evaluation data and an integrated circuit yield analysis.
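For orientation, the overdispersion induced by Dirichlet mixing can be summarized by a standard identity, stated here in generic notation rather than the article's:

```latex
% Dirichlet-multinomial with parameters \alpha, \alpha_0 = \sum_j \alpha_j,
% p_i = \alpha_i/\alpha_0, and \rho = 1/(1+\alpha_0):
\operatorname{Var}(X_i) = n\,p_i(1-p_i)\bigl[1 + (n-1)\rho\bigr],
% versus n\,p_i(1-p_i) under the multinomial; the factor 1+(n-1)\rho
% captures the extra-multinomial variation.
```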

19.
Structural inference as a method of statistical analysis seems to have escaped the attention of many statisticians. This paper focuses on Fraser’s necessary analysis of structural models as a tool to derive classical distribution results.

A structural model analyzed by Zacks (1971) by means of conventional statistical methods and fiducial theory is re-examined by the structural method. It is shown that results obtained by the former methods come as easy consequences of the latter analysis of the structural model. In the process we also simplify Zacks’ methods of obtaining a minimum risk equivariant estimator of a parameter of the model.

A theorem of Basu (1955), often used to prove the independence of a complete sufficient statistic and an ancillary statistic, is also re-examined in the light of the structural method. It is found that for structural models more can be achieved by necessary analysis without the use of Basu’s theorem. Bain’s (1972) application of Basu’s theorem to constructing confidence intervals for Weibull reliability is given as an example.

20.
Image correlation spectroscopy (ICS) is an experimental technique used in the study of cell tissue, specifically to obtain information about protein aggregates. This paper discusses a filtered Poisson process model of ICS, and uses this to obtain the sampling distribution of an estimator of interest, using spectral methods. Integrals of third- and fourth-order spectra are involved, and methods proposed in the literature are studied numerically. They are found to be infeasible, requiring months of computing time. An alternative method is proposed and is illustrated on some actual ICS experiments. A novel feature in this study is the use of a SIMD-class parallel computer, a MasPar MP-2 machine, to obtain some statistical properties of the estimation method.
