Similar Articles
1.
A Bayesian network (BN) is a probabilistic graphical model that represents a set of variables and their probabilistic dependencies. Formally, BNs are directed acyclic graphs whose nodes represent variables and whose arcs encode the conditional dependencies among the variables. Nodes can represent any kind of variable, be it a measured parameter, a latent variable, or a hypothesis; they are not restricted to representing random variables, which forms the "Bayesian" aspect of a BN. Efficient algorithms exist that perform inference and learning in BNs. BNs that model sequences of variables are called dynamic BNs. In this context, [A. Harel, R. Kenett, and F. Ruggeri, Modeling web usability diagnostics on the basis of usage statistics, in Statistical Methods in eCommerce Research, W. Jank and G. Shmueli, eds., Wiley, 2008] provide a comparison between Markov chains and BNs in the analysis of web usability from e-commerce data. A comparison of regression models, structural equation models, and BNs is presented in Anderson et al. [R.D. Anderson, R.D. Mackoy, V.B. Thompson, and G. Harrell, A Bayesian network estimation of the service–profit chain for transport service satisfaction, Decision Sciences 35(4) (2004), pp. 665–689]. In this article we apply BNs to the analysis of customer satisfaction surveys and demonstrate the potential of the approach. In particular, BNs offer advantages in implementing models of cause and effect over other statistical techniques designed primarily for testing hypotheses. Other advantages include the ability to conduct probabilistic inference for prediction and diagnostic purposes, with output that can be intuitively understood by managers.
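As a concrete illustration of the kind of cause-and-effect model described above, here is a minimal sketch using the open-source pgmpy library; the variables and probabilities are invented for illustration and are not taken from the article's survey data.

```python
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

# Two hypothetical drivers of overall satisfaction.
model = BayesianNetwork([("Service", "Satisfaction"), ("Price", "Satisfaction")])

cpd_service = TabularCPD("Service", 2, [[0.7], [0.3]])   # P(Service)
cpd_price = TabularCPD("Price", 2, [[0.6], [0.4]])       # P(Price)
cpd_sat = TabularCPD(                                    # P(Satisfaction | Service, Price)
    "Satisfaction", 2,
    [[0.9, 0.6, 0.5, 0.1],   # P(Satisfaction=0 | each evidence combination)
     [0.1, 0.4, 0.5, 0.9]],  # P(Satisfaction=1 | ...)
    evidence=["Service", "Price"], evidence_card=[2, 2])
model.add_cpds(cpd_service, cpd_price, cpd_sat)
assert model.check_model()

# Diagnostic query: distribution of Satisfaction given good Service.
infer = VariableElimination(model)
print(infer.query(["Satisfaction"], evidence={"Service": 1}))
```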

2.
Structure learning for Bayesian networks is typically carried out heuristically in search of an optimal model, so as to avoid an explosive computational burden. In this process, a structural error that occurs at one point of learning may degrade all subsequent learning. We propose a remedy for this error-propagating process that uses marginal model structures: local errors in the structure are fixed by reference to the marginal structures, and in this sense we call the remedy a marginally corrective procedure. We devise a new score function for the procedure consisting of two components, the likelihood function of a model and a discrepancy measure on marginal structures. The proposed method compares favourably with two of the most popular algorithms, as shown in experiments with benchmark data sets.
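For context, the heuristic score-based search that such a corrective procedure refines can be sketched with pgmpy's hill-climbing estimator; the data are synthetic and the marginally corrective step itself is not shown.

```python
import numpy as np
import pandas as pd
from pgmpy.estimators import HillClimbSearch, BicScore

# Synthetic binary data with a noisy A -> B -> C dependence chain.
rng = np.random.default_rng(1)
a = rng.integers(0, 2, 2000)
b = (a ^ (rng.random(2000) < 0.2)).astype(int)
c = (b ^ (rng.random(2000) < 0.2)).astype(int)
df = pd.DataFrame({"A": a, "B": b, "C": c})

# Greedy hill climbing over DAGs scored by BIC; each accepted local move
# may later prove wrong -- the kind of error the paper aims to fix.
search = HillClimbSearch(df)
best_model = search.estimate(scoring_method=BicScore(df))
print(sorted(best_model.edges()))
```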

3.
Most statistical and data-mining algorithms assume that data come from a stationary distribution. However, in many real-world classification tasks, data arrive over time and the target concept to be learned from the data stream may change accordingly. Many algorithms have been proposed for learning such drifting concepts. To deal with the problem of learning when the distribution generating the data changes over time, dynamic weighted majority was proposed as an ensemble method for concept drift. Unfortunately, this technique considers neither the age of the classifiers in the ensemble nor their past correct classifications. In this paper, we propose a method that takes into account an expert's age as well as its contribution to the global accuracy of the algorithm. We evaluate the effectiveness of the proposed method by using m classifiers and training on a collection of n-fold partitions of the data. Experimental results on a benchmark data set show that our method outperforms existing ones.
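A minimal sketch of the general idea, in which each expert's vote is weighted by a product of its running accuracy and an age-dependent decay; the weighting scheme is illustrative, not the exact formula from the paper, and sklearn-style classifiers with a .predict method are assumed.

```python
import numpy as np

class AgeAccuracyEnsemble:
    """Weighted-majority ensemble whose weights blend accuracy and age."""

    def __init__(self, experts, decay=0.9):
        self.experts = experts                    # fitted classifiers with .predict
        self.correct = np.zeros(len(experts))
        self.seen = np.zeros(len(experts))
        self.age = np.zeros(len(experts))
        self.decay = decay

    def predict(self, x):
        votes = {}
        for i, clf in enumerate(self.experts):
            acc = self.correct[i] / self.seen[i] if self.seen[i] else 0.5
            weight = acc * self.decay ** self.age[i]   # older experts fade
            label = clf.predict([x])[0]
            votes[label] = votes.get(label, 0.0) + weight
        return max(votes, key=votes.get)

    def update(self, x, y_true):
        """Called after each new labelled instance from the stream."""
        for i, clf in enumerate(self.experts):
            self.seen[i] += 1
            self.age[i] += 1
            if clf.predict([x])[0] == y_true:
                self.correct[i] += 1
```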

4.
This paper presents a robust mixture modeling framework using the multivariate skew t distribution, an extension of the multivariate Student's t family with additional shape parameters to regulate skewness. The proposed model results in a very complicated likelihood. Two variants of Monte Carlo EM algorithms are developed to carry out maximum likelihood estimation of the mixture parameters. In addition, we offer a general information-based method for obtaining the asymptotic covariance matrix of the maximum likelihood estimates. Practical issues, including the selection of starting values and the stopping criterion, are also discussed. The proposed methodology is applied to a subset of the Australian Institute of Sport data for illustration.
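The information-based idea — invert the observed information (the Hessian of the negative log-likelihood at the MLE) to obtain an asymptotic covariance matrix — can be sketched in a much simpler univariate setting; the model and data below are illustrative only.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import t

rng = np.random.default_rng(0)
x = rng.standard_t(df=5, size=500) * 2.0 + 1.0   # synthetic heavy-tailed data

def nll(theta):
    """Negative log-likelihood of a location-scale Student-t with df = 5."""
    mu, log_sigma = theta
    z = (x - mu) / np.exp(log_sigma)
    return -np.sum(t.logpdf(z, df=5) - log_sigma)

res = minimize(nll, x0=[0.0, 0.0], method="BFGS")
cov = res.hess_inv            # BFGS approximation to the inverse observed
se = np.sqrt(np.diag(cov))    # information; a finite-difference Hessian
print("MLE:", res.x)          # would be more accurate.
print("standard errors:", se)
```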

5.
In practice one is often interested in situations where lifetime data are censored; censoring is a common phenomenon in lifetime data analysis, frequently arising from time constraints. In this paper, the flexible Weibull distribution proposed in Bebbington et al. [A flexible Weibull extension, Reliab. Eng. Syst. Safety 92 (2007), pp. 719–726] is studied using maximum likelihood techniques based on three different algorithms: Newton–Raphson, Levenberg–Marquardt, and trust-region reflective. The proposed parameter estimation method is introduced and shown to work from both theoretical and practical points of view. On the one hand, we apply the maximum likelihood estimation method to complete simulated and real data; on the other hand, we study for the first time the model on simulated and real type I censored samples. The estimation results are validated by a statistical test.
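A sketch of the type-I censored likelihood for the flexible Weibull, whose survival function is S(t) = exp(−e^{αt−β/t}) in Bebbington et al.'s parameterization, maximized here with a scipy trust-region optimizer standing in for the paper's trust-region reflective algorithm; data and settings are simulated and illustrative.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(42)
alpha_true, beta_true, c = 0.3, 1.5, 6.0        # c = fixed censoring time

# Simulate via inverse cdf: F(t) = 1 - exp(-exp(alpha*t - beta/t)),
# so alpha*t - beta/t = log(-log(1-u)) is a quadratic in t.
u = rng.random(300)
w = np.log(-np.log1p(-u))
t = (w + np.sqrt(w**2 + 4 * alpha_true * beta_true)) / (2 * alpha_true)
delta = t <= c                                  # True where failure observed
t_obs = np.minimum(t, c)

def neg_loglik(theta):
    a, b = np.exp(theta)                        # keep parameters positive
    g = a * t_obs - b / t_obs
    log_f = np.log(a + b / t_obs**2) + g - np.exp(g)   # log density
    log_S = -np.exp(g)                                 # log survival
    return -(np.sum(log_f[delta]) + np.sum(log_S[~delta]))

res = minimize(neg_loglik, x0=[0.0, 0.0], method="trust-constr")
print("alpha, beta estimates:", np.exp(res.x))
```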

6.
There is an increasing amount of literature on Bayesian computational methods for problems with intractable likelihoods. One approach is the set of algorithms known as approximate Bayesian computation (ABC) methods. A drawback of these algorithms is that their performance depends on the appropriate choice of summary statistics, distance measure, and tolerance level. To circumvent this problem, an alternative method based on the empirical likelihood has been introduced. This method can be easily implemented when a set of constraints, related to the moments of the distribution, is specified; however, the choice of the constraints is sometimes challenging. To overcome this difficulty, we propose an alternative method based on a bootstrap likelihood approach. The method is easy to implement and in some cases is actually faster than the other approaches considered. We illustrate the performance of our algorithm with examples from population genetics, time series, and stochastic differential equations, and we also test the method on a real dataset.
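For reference, the basic ABC rejection step whose tuning difficulties motivate the empirical- and bootstrap-likelihood alternatives looks like this (toy normal-mean example; the summary statistic, distance, and tolerance are all chosen by hand, which is exactly the problem described above).

```python
import numpy as np

rng = np.random.default_rng(7)
y_obs = rng.normal(3.0, 1.0, 100)        # "observed" data with unknown mean
s_obs = y_obs.mean()                     # hand-picked summary statistic

accepted = []
tolerance = 0.1                          # hand-picked tolerance level
for _ in range(20_000):
    theta = rng.uniform(-10, 10)         # draw from the prior
    y_sim = rng.normal(theta, 1.0, 100)  # simulate from the model
    if abs(y_sim.mean() - s_obs) < tolerance:   # distance on summaries
        accepted.append(theta)

print(f"posterior mean ~ {np.mean(accepted):.3f} from {len(accepted)} draws")
```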

7.
Lin, T.I., Lee, J.C., and Ni, H.F., Statistics and Computing, 2004, 14(2), pp. 119–130
A finite mixture model using the multivariate t distribution has been shown to be a robust extension of normal mixtures. In this paper, we present a Bayesian approach to inference about the parameters of t-mixture models. The prior distributions are specified to be weakly informative so as to avoid nonintegrable posterior distributions. We present two efficient EM-type algorithms for computing the joint posterior mode with the observed data and an incomplete future vector as the sample. Markov chain Monte Carlo sampling schemes are also developed to obtain the target posterior distribution of the parameters. The advantages of the Bayesian approach over the maximum likelihood method are demonstrated on a set of real data.
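A heavily hedged sketch of the MCMC route using the PyMC library — not the authors' code, and univariate rather than multivariate; the priors, data, and initial values (used here as a crude guard against label switching) are all invented.

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng(12)
y = np.concatenate([rng.standard_t(5, 100) - 2, rng.standard_t(5, 100) + 2])

with pm.Model():
    w = pm.Dirichlet("w", a=np.ones(2))                       # mixing weights
    mu = pm.Normal("mu", 0.0, 5.0, shape=2,
                   initval=np.array([-1.0, 1.0]))             # separated starts
    sigma = pm.HalfNormal("sigma", 2.0, shape=2)
    nu = pm.Exponential("nu", 1.0 / 10.0)                     # degrees of freedom
    components = pm.StudentT.dist(nu=nu, mu=mu, sigma=sigma)
    pm.Mixture("y", w=w, comp_dists=components, observed=y)
    idata = pm.sample(1000, tune=1000, chains=2)

print(idata.posterior["mu"].mean(dim=("chain", "draw")).values)
```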

8.
In this paper, a small-sample asymptotic method is proposed for higher-order inference in the stress–strength reliability model R = P(Y < X), where X and Y are distributed independently as Burr-type X distributions. In a departure from the current literature, we allow the scale parameters of the two distributions to differ, and the likelihood-based third-order inference procedure is applied to obtain inference for R. The difficulty in implementing the method lies in obtaining the constrained maximum likelihood estimates (MLEs); a penalized likelihood method is proposed to handle the numerical complications of maximizing the constrained likelihood. The proposed procedures are illustrated using a sample of carbon fibre strength data. Our simulation studies, comparing the coverage probabilities of the proposed small-sample asymptotic method with some existing large-sample asymptotic methods, show that the proposed method is very accurate even when the sample sizes are small.
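Assuming the common Burr-type X form F(x) = (1 − e^{−(λx)²})^θ, the quantity R = P(Y < X) can be checked by simulation via inverse-cdf sampling; the parameter values below are arbitrary and the scale parameters deliberately differ, as in the paper's setting.

```python
import numpy as np

def rburrx(theta, lam, size, rng):
    """Inverse-cdf draws from Burr type X: F(x) = (1 - exp(-(lam*x)^2))^theta."""
    u = rng.random(size)
    return np.sqrt(-np.log1p(-u ** (1.0 / theta))) / lam

rng = np.random.default_rng(0)
x = rburrx(theta=2.0, lam=1.0, size=1_000_000, rng=rng)   # strength X
y = rburrx(theta=1.5, lam=1.3, size=1_000_000, rng=rng)   # stress Y, other scale
print("Monte Carlo estimate of R = P(Y < X):", np.mean(y < x))
```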

9.
The use of Mathematica in deriving mean likelihood estimators is discussed. Comparisons are made between the mean likelihood estimator, the maximum likelihood estimator, and the Bayes estimator based on a Jeffreys noninformative prior. These estimators are compared using the mean-square error criterion and the Pitman measure of closeness. In some cases it is possible, using Mathematica, to derive exact results for these criteria, and simulation comparisons can be made for any model for which we can readily obtain estimators. In the binomial and exponential distribution cases, these criteria are evaluated exactly. In the first-order moving-average model, analytical comparisons are possible only for n = 2. In general, we find that for the binomial distribution and the first-order moving-average time series model the mean likelihood estimator outperforms both the maximum likelihood estimator and the Bayes estimator with a Jeffreys noninformative prior. Mathematica was used for symbolic and numeric computations as well as for the graphical display of results. A Mathematica notebook providing the code used in this article is available at http://www.stats.uwo.ca/mcleod/epubs/mele. The article concludes with our opinions on the relative merits of some popular computing environments for statistics researchers.
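Although the article works in Mathematica, the binomial case is easy to reproduce numerically in Python: the mean likelihood estimator is the likelihood-weighted average of the parameter, which for x successes in n trials (with a flat weighting) equals (x+1)/(n+2).

```python
from scipy.integrate import quad

n, x = 10, 3                                   # 3 successes in 10 trials

def likelihood(p):
    return p**x * (1 - p)**(n - x)             # binomial likelihood kernel

num, _ = quad(lambda p: p * likelihood(p), 0, 1)
den, _ = quad(likelihood, 0, 1)

print("mean likelihood estimate:", num / den)  # (x+1)/(n+2) = 1/3
print("maximum likelihood estimate:", x / n)   # 0.3
```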

10.
In this paper we present decomposable priors, a family of priors over the structure and parameters of tree belief nets for which Bayesian learning with complete observations is tractable, in the sense that the posterior is also decomposable and can be determined analytically in polynomial time. Ours is the first result in which computing the normalization constant and averaging over a super-exponential number of graph structures can be performed in polynomial time. This follows from two main results. First, we show that factored distributions over spanning trees in a graph can be integrated in closed form. Second, we examine priors over tree parameters and show that a set of assumptions similar to those of Heckerman, Geiger and Chickering (1995) constrains the tree parameter priors to be a compactly parametrized product of Dirichlet distributions. Besides allowing exact Bayesian learning, these results let us formulate a new class of tractable latent variable models in which the likelihood of a data point is computed through an ensemble average over tree structures.
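The closed-form averaging over spanning trees rests on the weighted Matrix Tree Theorem: the sum over all spanning trees of the products of their edge weights equals any cofactor of the weighted graph Laplacian. A small numerical check with arbitrary weights:

```python
import numpy as np

# Symmetric nonnegative edge weights on 4 nodes (diagonal unused).
W = np.array([[0.0, 2.0, 1.0, 0.5],
              [2.0, 0.0, 3.0, 0.0],
              [1.0, 3.0, 0.0, 1.5],
              [0.5, 0.0, 1.5, 0.0]])

# Weighted Laplacian L = diag(row sums) - W; any cofactor of L equals
# the sum over spanning trees of the product of their edge weights.
L = np.diag(W.sum(axis=1)) - W
tree_sum = np.linalg.det(L[1:, 1:])   # delete row/column 0
print("weighted spanning-tree sum:", tree_sum)
```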

11.
Although devised by Fisher in 1936, discriminant analysis is still rapidly evolving, as the complexity of contemporary data sets grows exponentially. Our classification rules address these complexities by modeling various correlations in higher-order data. Moreover, our classification rules are suitable for data sets where the number of response variables is comparable to or larger than the number of observations. We assume that the higher-order observations have a separable variance–covariance matrix and two different Kronecker product structures on the mean vector. In this article, we develop quadratic classification rules among g different populations where each individual has κth-order (κ ≥ 2) measurements. We also provide computational algorithms for the maximum likelihood estimates of the model parameters and, from them, the sample classification rules.
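The separable (Kronecker) covariance assumption is what keeps higher-order rules tractable: a p×q matrix observation is vectorized and its covariance modeled as a Kronecker product of small factors. A sketch of the resulting quadratic discriminant score, with hypothetical class parameters for the second-order (κ = 2) case:

```python
import numpy as np

p, q = 3, 4
rng = np.random.default_rng(3)

def random_spd(k):
    m = rng.standard_normal((k, k))
    return m @ m.T + k * np.eye(k)

# Hypothetical per-class parameters: mean matrix M, row factor A, column factor B.
M, A, B = rng.standard_normal((p, q)), random_spd(p), random_spd(q)

def quad_score(X):
    """Quadratic discriminant score for matrix data with cov(vec X) = B kron A."""
    r = (X - M).reshape(-1, order="F")     # column-major vec(X - M)
    sigma = np.kron(B, A)                  # separable covariance
    _, logdet = np.linalg.slogdet(sigma)
    return -0.5 * (logdet + r @ np.linalg.solve(sigma, r))

# Assign a new observation to the class with the largest score (one class shown).
print(quad_score(rng.standard_normal((p, q))))
```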

12.
Fitting stochastic kinetic models represented by Markov jump processes within the Bayesian paradigm is complicated by the intractability of the observed-data likelihood. There has therefore been considerable attention given to the design of pseudo-marginal Markov chain Monte Carlo algorithms for such models. However, these methods are typically computationally intensive, often require careful tuning and must be restarted from scratch upon receipt of new observations. Sequential Monte Carlo (SMC) methods, on the other hand, aim to efficiently reuse posterior samples at each time point. Despite their appeal, applying SMC schemes in scenarios with both dynamic states and static parameters is made difficult by the problem of particle degeneracy. A principled approach for overcoming this problem is to move each parameter particle through a Metropolis–Hastings kernel that leaves the target invariant. This rejuvenation step is key to a recently proposed SMC² algorithm, which can be seen as the pseudo-marginal analogue of an idealised scheme known as iterated batch importance sampling. Computing the parameter weights in SMC² requires running a particle filter over the dynamic states to unbiasedly estimate the intractable observed-data likelihood up to the current time point. In this paper, we propose to use an auxiliary particle filter inside the SMC² scheme. Our method uses two recently proposed constructs for sampling conditioned jump processes, and we find that the resulting inference schemes typically require fewer state particles than when using a simple bootstrap filter. Using two applications, we compare the performance of the proposed approach with various competing methods, including two global MCMC schemes.
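The building block inside SMC² is a particle-filter estimate of the observed-data likelihood. A minimal bootstrap filter for a linear-Gaussian toy model — the blind propagation step here is what the paper's auxiliary particle filter would replace:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)

# Toy state-space model: x_t = 0.9 x_{t-1} + N(0,1), y_t = x_t + N(0, 0.5^2).
T = 50
x = np.zeros(T)
x[0] = rng.standard_normal()
for s in range(1, T):
    x[s] = 0.9 * x[s - 1] + rng.standard_normal()
y = x + 0.5 * rng.standard_normal(T)

def bootstrap_loglik(y, n_particles=500):
    """Log of the particle estimate of the likelihood (the estimate itself
    is unbiased; its log is not)."""
    particles = rng.standard_normal(n_particles)   # draws from the x_0 prior
    loglik = 0.0
    for i, obs in enumerate(y):
        if i > 0:                                  # blind (bootstrap) propagation
            particles = 0.9 * particles + rng.standard_normal(n_particles)
        w = norm.pdf(obs, loc=particles, scale=0.5)
        loglik += np.log(w.mean())
        # Multinomial resampling to fight weight degeneracy.
        idx = rng.choice(n_particles, size=n_particles, p=w / w.sum())
        particles = particles[idx]
    return loglik

print("estimated log-likelihood:", bootstrap_loglik(y))
```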

13.
In this paper, we consider the family of skew generalized t (SGT) distributions originally introduced by Theodossiou [P. Theodossiou, Financial data and the skewed generalized t distribution, Manage. Sci. 44(12) (1998), pp. 1650–1661] as a skew extension of the generalized t (GT) distribution. The SGT family warrants special attention because it encompasses distributions having both heavy tails and skewness, and many widely used distributions such as the Student's t, normal, Hansen's skew t, exponential power, and skew exponential power (SEP) distributions are included as limiting or special cases. We show that the SGT distribution can be obtained as a scale mixture of the SEP and generalized gamma distributions. We investigate several properties of the SGT distribution and consider maximum likelihood estimation of the location, scale, and skewness parameters under the assumption that the shape parameters are known. We show that if the shape parameters are estimated along with the location, scale, and skewness parameters, the influence function of the maximum likelihood estimators becomes unbounded. We obtain the necessary conditions to ensure the uniqueness of the maximum likelihood estimators for the location, scale, and skewness parameters with known shape parameters. We provide a simple iterative re-weighting algorithm to compute the maximum likelihood estimates of these parameters and show that it can be identified as an EM-type algorithm. We conclude with two applications of the SGT distributions to robust estimation.
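The flavour of such a re-weighting algorithm can be seen in the Student's t special case (one of the SGT family's limiting members), where the location and scale updates are weighted means whose weights automatically downweight outlying points; the data below are synthetic.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(0, 1, 95), rng.normal(8, 1, 5)])  # outliers
nu = 3.0                                  # shape parameter treated as known

mu, sigma2 = np.median(x), np.var(x)
for _ in range(100):
    w = (nu + 1) / (nu + (x - mu) ** 2 / sigma2)    # small weights for outliers
    mu_new = np.sum(w * x) / np.sum(w)              # weighted location update
    sigma2_new = np.sum(w * (x - mu_new) ** 2) / len(x)
    converged = abs(mu_new - mu) < 1e-10
    mu, sigma2 = mu_new, sigma2_new
    if converged:
        break

print("robust location and scale:", mu, np.sqrt(sigma2))
```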

14.
A finite mixture model using the Student's t distribution has been recognized as a robust extension of normal mixtures. Recently, mixtures of skew normal distributions have been found effective in the treatment of heterogeneous data involving asymmetric behaviors across subclasses. In this article, we propose a robust mixture framework based on the skew t distribution to deal efficiently with heavy tails, extra skewness, and multimodality in a wide range of settings. Statistical mixture modeling based on the normal, Student's t, and skew normal distributions can be viewed as a special case of the skew t mixture model. We present analytically simple EM-type algorithms for iteratively computing the maximum likelihood estimates. The proposed methodology is illustrated by analyzing a real data example.
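An EM-type loop of this kind, specialized to the univariate two-component Student's t mixture with known degrees of freedom (the skew t version adds skewness parameters to the same scheme), can be written compactly; the data and starting values are illustrative.

```python
import numpy as np
from scipy.stats import t as t_dist

rng = np.random.default_rng(4)
x = np.concatenate([rng.standard_t(4, 150) - 3, rng.standard_t(4, 150) + 3])
nu = 4.0                                        # known degrees of freedom

# Starting values for weights, locations, scales.
pi, mu, sig = np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])

for _ in range(200):
    # E-step: component responsibilities and latent scale weights.
    dens = np.array([pi[k] * t_dist.pdf(x, nu, mu[k], sig[k]) for k in range(2)])
    resp = dens / dens.sum(axis=0)
    u = np.array([(nu + 1) / (nu + ((x - mu[k]) / sig[k]) ** 2) for k in range(2)])
    # M-step: weighted updates.
    for k in range(2):
        rw = resp[k] * u[k]
        mu[k] = np.sum(rw * x) / np.sum(rw)
        sig[k] = np.sqrt(np.sum(rw * (x - mu[k]) ** 2) / np.sum(resp[k]))
    pi = resp.sum(axis=1) / len(x)

print("weights:", pi, "locations:", mu, "scales:", sig)
```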

15.
In this article, we show that the log empirical likelihood ratio statistic for the population mean converges in distribution to χ²(1) as n → ∞ when the population is in the domain of attraction of the normal law but has infinite variance. Simulation results show that the empirical likelihood ratio method remains applicable under this infinite second moment condition.
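Owen's log empirical likelihood ratio for a hypothesized mean μ₀ reduces to a one-dimensional root-finding problem for a Lagrange multiplier λ; a sketch of the statistic whose χ²(1) calibration the article extends to the infinite-variance case (t with 2 degrees of freedom has infinite variance but lies in the normal domain of attraction):

```python
import numpy as np
from scipy.optimize import brentq

def el_ratio_stat(x, mu0):
    """-2 log empirical likelihood ratio for the mean (Owen's construction)."""
    d = x - mu0
    if d.min() >= 0 or d.max() <= 0:
        return np.inf                     # mu0 outside the convex hull
    # lambda must keep all implied weights positive: 1 + lambda * d_i > 0.
    lo = -1.0 / d.max() + 1e-8
    hi = -1.0 / d.min() - 1e-8
    score = lambda lam: np.sum(d / (1.0 + lam * d))   # monotone in lambda
    lam = brentq(score, lo, hi)
    return 2.0 * np.sum(np.log1p(lam * d))

rng = np.random.default_rng(8)
x = rng.standard_t(2, 200)                # infinite variance
print("-2 log ELR at true mean:", el_ratio_stat(x, 0.0))  # approx chi2(1)
```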

16.
This paper focuses on some recent developments in nonparametric mixture distributions. It discusses nonparametric maximum likelihood estimation of the mixing distribution and emphasizes gradient-type results, especially global results and the global convergence of algorithms such as the vertex direction and vertex exchange methods. However, the NPMLE (or the algorithms constructing it) also provides an estimate of the number of components of the mixing distribution, which may be undesirable for theoretical reasons or disallowed by the physical interpretation of the mixture model. When the number of components is fixed in advance, the aforementioned algorithms cannot be used, and no globally convergent algorithms exist to date. Instead, the EM algorithm is often used to find maximum likelihood estimates, but in this case multiple maxima often occur. An example from a meta-analysis of vitamin A and childhood mortality is used to illustrate the considerable inferential importance of identifying the correct global likelihood. To improve the behavior of the EM algorithm we suggest a combination of gradient function steps and EM steps designed to achieve global convergence, leading to the EM algorithm with gradient function update (EMGFU). This algorithm retains exactly k components and typically converges to the global maximum. Its behavior is highlighted by means of several examples.
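The gradient function driving both the vertex-direction checks and the EMGFU update has a simple form, D(θ) = Σᵢ f(xᵢ; θ)/ĝ(xᵢ) − n, with the NPMLE characterized by D(θ) ≤ 0 everywhere. A sketch of evaluating it for a candidate normal mixture with unit variance (candidate and data are illustrative):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(6)
x = np.concatenate([rng.normal(0, 1, 120), rng.normal(4, 1, 80)])

# Candidate mixing distribution: support points and weights.
support = np.array([0.1, 3.9])
weights = np.array([0.6, 0.4])

g_hat = sum(w * norm.pdf(x, m, 1.0) for w, m in zip(weights, support))

def gradient(theta):
    """D(theta) = sum_i f(x_i; theta) / g_hat(x_i) - n; <= 0 at the NPMLE."""
    return np.sum(norm.pdf(x, theta, 1.0) / g_hat) - len(x)

grid = np.linspace(-3, 7, 200)
d = np.array([gradient(th) for th in grid])
print("max gradient over grid:", d.max())   # > 0 means not yet optimal
```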

17.
In this article, we extend the classic Box–Cox transformation to spatial linear models. For a comparative study, the proposed models were applied to a real data set on Chinese population growth and economic development under three different structures: no spatial correlation, conditional autoregressive, and simultaneous autoregressive. The maximum likelihood method was used to estimate the Box–Cox parameter λ and the other parameters in the models. The residuals of the models were analyzed through Moran's I and Geary's c.
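In the non-spatial special case the Box–Cox λ can be profiled out with scipy, and Moran's I computed directly from a chosen spatial weight matrix; a sketch with synthetic data and a hypothetical weight matrix W:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
y = rng.lognormal(mean=1.0, sigma=0.6, size=50)   # positive, skewed response

# Profile-likelihood estimate of the Box-Cox parameter.
y_transformed, lam = stats.boxcox(y)
print("lambda-hat:", lam)

# Moran's I of the centered values under a hypothetical weight matrix W.
W = rng.random((50, 50))
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)
z = y_transformed - y_transformed.mean()
I = (len(z) / W.sum()) * (z @ W @ z) / (z @ z)
print("Moran's I:", I)
```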

18.
In this article, we consider efficient estimation of the semiparametric transformation model with doubly truncated data. We propose a two-step approach for obtaining the pseudo maximum likelihood estimators (PMLE) of the regression parameters. In the first step, the truncation time distribution is estimated by the nonparametric maximum likelihood estimator (Shen, 2010a) when the distribution function K of the truncation time is unspecified, or by the conditional maximum likelihood estimator (Bilker and Wang, 1996) when K is parameterized. In the second step, using the pseudo complete-data likelihood function with the estimated distribution of the truncation time, we propose expectation–maximization algorithms for obtaining the PMLE. We establish the consistency of the PMLE. A simulation study indicates that the PMLE performs well in finite samples. The proposed method is illustrated using an AIDS data set.

19.
In this paper we present two methods of estimating a linear regression equation with Cauchy disturbances. The first method uses the maximum likelihood principle, so the resulting estimators are consistent; the asymptotic covariance is derived, which provides the statistics necessary for inference in large samples. The second method is the method of least lines, which minimizes the sum of absolute errors (MSAE) from the fitted regression. The two methods are then compared through a Monte Carlo study. The maximum likelihood method emerges as superior to the MSAE method; however, the MSAE procedure, which does not depend on the distribution of the error term, is a close competitor to the maximum likelihood estimator.
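Both estimators are easy to prototype with a general-purpose optimizer: the Cauchy MLE minimizes Σ log(1 + (rᵢ/σ)²) plus a scale term, while the MSAE fit minimizes Σ|rᵢ|. A toy comparison on simulated data:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(10)
x = rng.uniform(0, 10, 200)
y = 2.0 + 0.5 * x + rng.standard_cauchy(200)     # Cauchy disturbances

def cauchy_nll(theta):
    """Negative log-likelihood, up to a constant, of Cauchy errors."""
    b0, b1, log_s = theta
    r = (y - b0 - b1 * x) / np.exp(log_s)
    return np.sum(np.log1p(r**2)) + len(y) * log_s

def sum_abs_err(beta):
    """MSAE (least absolute deviations) criterion."""
    return np.sum(np.abs(y - beta[0] - beta[1] * x))

mle = minimize(cauchy_nll, x0=[0.0, 0.0, 0.0], method="Nelder-Mead",
               options={"maxiter": 5000})
msae = minimize(sum_abs_err, x0=[0.0, 0.0], method="Nelder-Mead")
print("Cauchy MLE (b0, b1):", mle.x[:2])
print("MSAE      (b0, b1):", msae.x)
```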

20.
The glmnet package by Friedman et al. [Regularization paths for generalized linear models via coordinate descent, J. Statist. Softw. 33 (2010), pp. 1–22] is an extremely fast implementation of the standard coordinate descent algorithm for solving ℓ1 penalized learning problems. In this paper, we consider a family of coordinate majorization descent algorithms for solving ℓ1 penalized learning problems, obtained by replacing each coordinate descent step with a coordinate-wise majorization descent operation. Numerical experiments show that this simple modification can lead to substantial improvements in speed when the predictors have moderate or high correlations.
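The step being modified is the classic soft-thresholding coordinate update; a compact sketch of plain coordinate descent for the lasso (the majorization variant replaces this exact coordinate-wise minimization with a majorizing surrogate):

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y - X @ b
    for _ in range(n_iter):
        for j in range(p):
            r = r + X[:, j] * b[j]                 # remove j's contribution
            rho = X[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r = r - X[:, j] * b[j]                 # add it back
    return b

rng = np.random.default_rng(11)
X = rng.standard_normal((100, 10))
beta_true = np.array([3.0, -2.0] + [0.0] * 8)     # sparse truth
y = X @ beta_true + 0.5 * rng.standard_normal(100)
print(np.round(lasso_cd(X, y, lam=0.1), 2))
```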
