Similar Articles
20 similar articles found.
1.
ABSTRACT

This article discusses two asymmetrization methods, Azzalini's representation and beta generation, to generate asymmetric bimodal models, including two novel beta-generated models. The practical utility of these models is assessed with nine data sets from different fields of applied sciences. Besides this tutorial assessment, some methodological contributions are made: a random number generator for the asymmetric Rathie–Swamee model is developed (generators for the other models are already known and briefly described), and a new likelihood ratio test of unimodality is compared via simulations with other available tests. Several tools have been used to quantify and test for bimodality and to assess goodness of fit, including the Bayesian information criterion, measures of agreement with the empirical distribution and the Kolmogorov–Smirnov test. In the nine case studies, the results favoured models derived from Azzalini's asymmetrization, but no single model provided the best fit across the applications considered. In only two cases was the normal mixture selected as the best model. Parameter estimation has been done by likelihood maximization. Numerical optimization must be performed with care, since local optima are often present. We concluded that the models considered are flexible enough to fit different bimodal shapes and that the tools studied should be used with care and attention to detail.
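
As a rough illustration of Azzalini-type asymmetrization, the sketch below uses the standard selection representation: if Z has a density f symmetric about 0 and W has a cdf G of a distribution symmetric about 0, then X = Z when W ≤ αZ and X = −Z otherwise has density 2 f(x) G(αx). The bimodal base density (a symmetric two-component normal mixture) and the standard-normal skewing distribution are illustrative assumptions, not the specific models of the article; the Rathie–Swamee generator is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_symmetric_bimodal(n, mu=2.0, sigma=1.0):
    """Symmetric bimodal base density: 0.5*N(-mu, sigma^2) + 0.5*N(mu, sigma^2)."""
    signs = rng.choice([-1.0, 1.0], size=n)
    return signs * mu + rng.normal(0.0, sigma, size=n)

def azzalini_asymmetrize(n, alpha, base_sampler, skew_sampler):
    """Draw from g(x) = 2 f(x) G(alpha x) via the selection representation:
    Z ~ f and W ~ G independent; return Z if W <= alpha*Z, else -Z."""
    z = base_sampler(n)
    w = skew_sampler(n)
    return np.where(w <= alpha * z, z, -z)

# Skewing distribution: standard normal, so G is the normal cdf Phi.
x = azzalini_asymmetrize(
    n=10_000,
    alpha=1.5,
    base_sampler=sample_symmetric_bimodal,
    skew_sampler=lambda n: rng.normal(size=n),
)
print(x.mean(), x.std())
```

The same two-line selection step works for any symmetric base density, which is why the construction adapts easily to bimodal bases.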

2.
Variable selection in cluster analysis is important yet challenging. It can be achieved by regularization methods, which realize a trade-off between the clustering accuracy and the number of selected variables by using a lasso-type penalty. However, the calibration of the penalty term is open to criticism. Model selection methods are an efficient alternative, yet they require a difficult optimization of an information criterion that involves combinatorial problems. First, most of these optimization algorithms are based on a suboptimal procedure (e.g. a stepwise method). Second, the algorithms are often computationally expensive because they need multiple calls of EM algorithms. Here we propose to use a new information criterion based on the integrated complete-data likelihood. It does not require the maximum likelihood estimate, and its maximization appears to be simple and computationally efficient. The original contribution of our approach is to perform the model selection without requiring any parameter estimation. Parameter inference is then needed only for the unique selected model. This approach is used for the variable selection of a Gaussian mixture model with conditional independence assumed. The numerical experiments on simulated and benchmark datasets show that the proposed method often outperforms two classical approaches for variable selection. The proposed approach is implemented in the R package VarSelLCM, available on CRAN.

3.
This article proposes a mixture double autoregressive model by introducing the flexibility of mixture models to the double autoregressive model, a novel conditional heteroscedastic model recently proposed in the literature. To make it more flexible, the mixing proportions are further assumed to be time varying, and probabilistic properties including strict stationarity and higher order moments are derived. Inference tools, including maximum likelihood estimation, an expectation–maximization (EM) algorithm for computing the estimator and an information criterion for model selection, are carefully studied for the logistic mixture double autoregressive model, which has two components and is encountered more frequently in practice. Monte Carlo experiments give further support to the new models, and the analysis of an empirical example is also reported.
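
To make the model concrete, the sketch below simulates a two-component mixture double autoregressive process with a logistic time-varying mixing proportion. The parameter values and the exact form of the logistic link (a function of the lagged observation) are illustrative assumptions, not the article's specification.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_mixture_dar(T, phi, omega, alpha, gamma, y0=0.0):
    """Simulate a two-component mixture DAR(1) process (illustrative parameterization):
      component k: y_t = phi[k]*y_{t-1} + eta_t * sqrt(omega[k] + alpha[k]*y_{t-1}^2)
      P(component 0 at time t) = logistic(gamma[0] + gamma[1]*y_{t-1})
    with eta_t iid standard normal."""
    y = np.empty(T)
    y_prev = y0
    for t in range(T):
        p0 = 1.0 / (1.0 + np.exp(-(gamma[0] + gamma[1] * y_prev)))
        k = 0 if rng.uniform() < p0 else 1
        scale = np.sqrt(omega[k] + alpha[k] * y_prev ** 2)
        y[t] = phi[k] * y_prev + rng.normal() * scale
        y_prev = y[t]
    return y

y = simulate_mixture_dar(
    T=1000,
    phi=(0.3, -0.2),
    omega=(0.5, 1.0),
    alpha=(0.2, 0.4),
    gamma=(0.0, 0.5),
)
print(y[:5])
```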

4.
For testing the effectiveness of a treatment on a binary outcome, a bewildering range of methods has been proposed. How similar are all these tests? What are their theoretical strengths and weaknesses? Which are to be recommended, and what is a coherent basis for deciding? In this paper, we take seven standard but imperfect tests and apply three different methods of adjustment to ensure size control: maximization (M), restricted maximization (B) and bootstrap/estimation (E). Across a wide range of conditions, we compute the exact size and power of the 7 basic and 21 adjusted tests. We devise two new measures of size bias and intrinsic power, and employ novel graphical tools to summarise a huge set of results. Amongst the 7 basic tests, Liebermeister's test best controls size but can still be conservative. Amongst the adjusted tests, E-tests clearly have the best power, and results are very stable across different conditions.
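
The computation underlying the maximization (M) adjustment can be illustrated as follows: the exact size of a two-sample binomial test is the rejection probability under H0: p1 = p2 = p, maximized over the nuisance parameter p. The basic test used here (a pooled two-proportion z-test) and the grid over p are illustrative assumptions; the paper's seven tests and its adjustment machinery are not reproduced.

```python
import numpy as np
from scipy.stats import binom, norm

def reject_pooled_z(x1, x2, n1, n2, alpha=0.05):
    """Two-sided pooled two-proportion z-test (one possible 'basic' test)."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = np.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    if se == 0:
        return False
    z = (x1 / n1 - x2 / n2) / se
    return abs(z) > norm.ppf(1 - alpha / 2)

def exact_size(n1, n2, alpha=0.05, grid=np.linspace(0.01, 0.99, 99)):
    """Exact size under H0: p1 = p2 = p, maximized over a grid for p.
    The M-adjustment recalibrates the critical value using this maximum."""
    x1s, x2s = np.arange(n1 + 1), np.arange(n2 + 1)
    rej = np.array([[reject_pooled_z(x1, x2, n1, n2, alpha) for x2 in x2s] for x1 in x1s])
    sizes = []
    for p in grid:
        prob = np.outer(binom.pmf(x1s, n1, p), binom.pmf(x2s, n2, p))
        sizes.append(prob[rej].sum())
    return max(sizes)

print(exact_size(n1=15, n2=15))  # may exceed the nominal 0.05, motivating adjustment
```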

5.
This article proposes an exact estimation of demand functions under block-rate pricing, focusing on increasing block-rate pricing. This is the first study that explicitly considers the separability condition, which has been ignored in the previous literature. Under this pricing structure, the price changes when consumption exceeds a certain threshold, and the consumer faces a utility maximization problem subject to a piecewise-linear budget constraint. Solving this maximization problem leads to a statistical model in which the model parameters are strongly restricted by the separability condition. In this article, by taking a hierarchical Bayesian approach, we implement a Markov chain Monte Carlo simulation to properly estimate the demand function. We find, however, that the convergence of the distribution of simulated samples to the posterior distribution is slow, requiring the addition of a scale transformation step for the parameters to the Gibbs sampler. These proposed methods are then applied to estimate the Japanese residential water demand function.

6.
Mixture modeling in general, and expectation–maximization in particular, are too cumbersome and confusing for applied health researchers; consequently, the full potential of mixture modeling is not realized. This tutorial article is prepared to remedy that deficiency. It addresses important applied problems in survival analysis and handles them in greater generality than existing work, especially by taking covariates into account. Specifically, the article demonstrates the concepts, tools, and inferential procedures of mixture modeling using head-and-neck cancer data and data on survival time after heart transplant surgery.

7.
This paper develops an algorithm for uniform random generation over a constrained simplex, which is the intersection of a standard simplex and a given set. Uniform sampling from constrained simplexes has numerous applications in different fields, such as portfolio optimization, stochastic multi-criteria decision analysis, experimental design with mixtures and decision problems involving discrete joint distributions with imprecise probabilities. The proposed algorithm is developed by combining the acceptance–rejection and conditional methods along with the use of optimization tools. The acceptance rate of the algorithm is analytically compared to that of a crude acceptance–rejection algorithm, which generates points over the simplex and then rejects any points falling outside the intersecting set. Finally, using convex optimization, the setup phase of the algorithm is detailed for the special cases where the intersecting set is a general convex set, a convex set defined by a finite number of convex constraints, or a polyhedron.
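
The crude acceptance–rejection baseline mentioned in the abstract can be sketched directly: draw uniform points on the standard simplex (Dirichlet(1, ..., 1)) and keep only those falling inside the intersecting set. The constraint used below is an arbitrary illustrative example; the paper's improved sampler and its optimization-based setup phase are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(2)

def crude_rejection_sampler(d, in_set, n_samples, max_draws=1_000_000):
    """Uniform sampling on the intersection of the standard (d-1)-simplex with a
    user-supplied set, via crude acceptance-rejection.
    in_set: vectorized predicate taking an (m, d) array and returning a boolean mask."""
    accepted = []
    drawn = 0
    while sum(len(a) for a in accepted) < n_samples and drawn < max_draws:
        batch = rng.dirichlet(np.ones(d), size=4096)  # uniform on the simplex
        drawn += len(batch)
        accepted.append(batch[in_set(batch)])
    return np.concatenate(accepted)[:n_samples], drawn

# Example constraint: the convex set {x : x_1 <= 0.5 and x_2 >= 0.1}
samples, drawn = crude_rejection_sampler(
    d=3,
    in_set=lambda x: (x[:, 0] <= 0.5) & (x[:, 1] >= 0.1),
    n_samples=1000,
)
print(len(samples), "accepted out of", drawn, "draws")
```

The acceptance rate degrades quickly as the intersecting set shrinks, which is the inefficiency the paper's algorithm is designed to avoid.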

8.
The increasing amount of data stored in the form of dynamic interactions between actors necessitates the use of methodologies to automatically extract relevant information. The interactions can be represented by dynamic networks, in which most existing methods look for clusters of vertices to summarize the data. In this paper, a new framework is proposed in order to cluster the vertices while detecting change points in the intensities of the interactions. These change points are key to understanding the temporal interactions. The model used involves non-homogeneous Poisson point processes with cluster-dependent piecewise constant intensity functions and common discontinuity points. A variational expectation–maximization algorithm is derived for inference. We show that the pruned exact linear time method, originally developed for change point detection in univariate time series, can be considered for the maximization step. This allows the detection of both the number of change points and their locations. Experiments on artificial and real datasets are carried out, and the proposed approach is compared with related methods.

9.
Abstract

Measuring the accuracy of diagnostic tests is crucial in many application areas, including medicine, machine learning and credit scoring. The receiver operating characteristic (ROC) curve and surface are useful tools to assess the ability of diagnostic tests to discriminate between ordered classes or groups. To define these diagnostic tests, selecting the optimal thresholds that maximize the accuracy of these tests is required. One procedure commonly used to find the optimal thresholds is maximizing what is known as Youden's index. This article presents nonparametric predictive inference (NPI) for selecting the optimal thresholds of a diagnostic test. NPI is a frequentist statistical method that is explicitly aimed at using few modeling assumptions, enabled through the use of lower and upper probabilities to quantify uncertainty. Based on multiple future observations, the NPI approach is presented for selecting the optimal thresholds for two-group and three-group scenarios. In addition, a pairwise approach is also presented for the three-group scenario. The article ends with an example to illustrate the proposed methods and a simulation study of the predictive performance of the proposed methods along with some classical methods such as Youden's index. The NPI-based methods show some interesting results that overcome some of the issues concerning the predictive performance of Youden's index.
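
For reference, the classical two-group comparator used in the article, threshold selection by Youden's index J = sensitivity + specificity − 1, can be sketched in a few lines; the NPI approach with lower and upper probabilities is not reproduced here, and the simulated scores below are illustrative.

```python
import numpy as np

def youden_threshold(scores, labels):
    """Return the cut-off maximizing Youden's J = sensitivity + specificity - 1.
    labels: 1 = diseased, 0 = healthy; higher scores indicate disease."""
    best_j, best_t = -np.inf, None
    for t in np.unique(scores):
        pred = scores >= t
        sens = np.mean(pred[labels == 1])
        spec = np.mean(~pred[labels == 0])
        j = sens + spec - 1
        if j > best_j:
            best_j, best_t = j, t
    return best_t, best_j

rng = np.random.default_rng(3)
scores = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(1.5, 1.0, 200)])
labels = np.concatenate([np.zeros(200, dtype=int), np.ones(200, dtype=int)])
print(youden_threshold(scores, labels))
```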

10.
Heng Lian, Statistics, 2013, 47(6): 777–785
Improving the efficiency of the importance sampler is at the centre of research on Monte Carlo methods. While the adaptive approach is usually not so straightforward within the Markov chain Monte Carlo framework, its counterpart in importance sampling can be justified and validated easily. We propose an iterative adaptation method for learning the proposal distribution of an importance sampler based on stochastic approximation. The stochastic approximation method can recruit general iterative optimization techniques such as the minorization–maximization algorithm. The effectiveness of the approach in optimizing the Kullback divergence between the proposal distribution and the target is demonstrated using several examples.
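
A minimal sketch of the general idea, not the paper's exact scheme: adapt a Gaussian proposal by a stochastic-approximation (Robbins–Monro) update that pushes its mean and covariance toward the importance-weighted moments of the target, which are the Kullback-optimal Gaussian parameters. The banana-shaped target, the step-size schedule and the Gaussian family are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

def log_target(x):
    """Unnormalized log target: a banana-shaped 2-D density (illustrative)."""
    return -0.5 * (x[:, 0] ** 2 / 4.0 + (x[:, 1] - 0.5 * x[:, 0] ** 2) ** 2)

def adaptive_is(log_target, d=2, n_iter=50, n_per_iter=2000):
    mu, cov = np.zeros(d), np.eye(d) * 4.0
    for k in range(1, n_iter + 1):
        L = np.linalg.cholesky(cov)
        x = mu + rng.standard_normal((n_per_iter, d)) @ L.T
        # log proposal density (additive constants cancel in self-normalized weights)
        diff = x - mu
        sol = np.linalg.solve(cov, diff.T).T
        log_q = -0.5 * np.sum(diff * sol, axis=1) - 0.5 * np.log(np.linalg.det(cov))
        logw = log_target(x) - log_q
        w = np.exp(logw - logw.max())
        w /= w.sum()
        # importance-weighted moments = KL-optimal Gaussian parameters
        mu_hat = w @ x
        cov_hat = (x - mu_hat).T @ ((x - mu_hat) * w[:, None])
        gamma = 1.0 / k  # Robbins-Monro step size
        mu = mu + gamma * (mu_hat - mu)
        cov = cov + gamma * (cov_hat - cov)
    return mu, cov

mu, cov = adaptive_is(log_target)
print(mu)
```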

11.
The analysis of human perceptions is often carried out by resorting to surveys and questionnaires, where respondents are asked to express ratings about the objects being evaluated. A class of mixture models, called CUB (Combination of Uniform and shifted Binomial), has recently been proposed in this context. This article focuses on a model of this class, the Nonlinear CUB, and investigates some computational issues concerning parameter estimation, which is performed by maximum likelihood. More specifically, we consider two main approaches to optimizing the log-likelihood: classical numerical optimization methods and the EM algorithm. The classical numerical methods comprise the widely used Nelder–Mead, Newton–Raphson, Broyden–Fletcher–Goldfarb–Shanno (BFGS), Berndt–Hall–Hall–Hausman (BHHH), Simulated Annealing and Conjugate Gradients algorithms, and usually have the advantage of fast convergence. On the other hand, the EM algorithm deserves consideration for some optimality properties in the case of mixture models, but it is slower. This article has a twofold aim: first, we show how to obtain explicit formulas for the implementation of the EM algorithm in nonlinear CUB models and we formally derive the asymptotic variance–covariance matrix of the maximum likelihood estimator; second, we discuss and compare the performance of the two above-mentioned approaches to log-likelihood maximization.
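
As a simple point of reference, the sketch below fits the standard (not nonlinear) CUB model by direct numerical maximization of the log-likelihood: P(R = r) = π·shifted-Binomial(r) + (1 − π)/m. The rating frequencies and starting values are illustrative assumptions; the EM formulas and the nonlinear CUB extension of the article are not reproduced. The inverse of the Hessian of the negative log-likelihood at the optimum can serve as a numerical approximation to the asymptotic variance–covariance matrix.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import comb

def cub_pmf(m, pi, xi):
    """Standard CUB pmf over ratings r = 1..m:
    pi * shifted Binomial + (1 - pi) * Uniform{1..m}."""
    r = np.arange(1, m + 1)
    shifted_binom = comb(m - 1, r - 1) * (1 - xi) ** (r - 1) * xi ** (m - r)
    return pi * shifted_binom + (1 - pi) / m

def neg_loglik(params, counts):
    pi, xi = params
    p = cub_pmf(len(counts), pi, xi)
    return -np.sum(counts * np.log(p))

# Illustrative observed frequencies for a 7-point rating scale
counts = np.array([10, 14, 30, 62, 90, 70, 24])
res = minimize(
    neg_loglik, x0=[0.5, 0.5], args=(counts,),
    method="L-BFGS-B", bounds=[(1e-4, 1 - 1e-4), (1e-4, 1 - 1e-4)],
)
print(res.x)  # estimated (pi, xi)
```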

12.
Immuno-oncology has emerged as an exciting new approach to cancer treatment. Common immunotherapy approaches include cancer vaccines, effector cell therapy, and T-cell-stimulating antibodies. Checkpoint inhibitors such as cytotoxic T lymphocyte-associated antigen 4 and programmed death-1/L1 antagonists have shown promising results in multiple indications in solid tumors and hematology. However, the mechanisms of action of these novel drugs pose unique statistical challenges in the accurate evaluation of clinical safety and efficacy, including late-onset toxicity, dose optimization, evaluation of combination agents, pseudoprogression, and delayed and lasting clinical activity. Traditional statistical methods may not be the most accurate or efficient. It is highly desirable to develop the most suitable statistical methodologies and tools to efficiently investigate cancer immunotherapies. In this paper, we summarize these issues and discuss alternative methods to meet the challenges in the clinical development of these novel agents. For safety evaluation and dose-finding trials, we recommend the use of a time-to-event model-based design to handle late toxicities, a simple 3-step procedure for dose optimization, and flexible rule-based or model-based designs for combination agents. For efficacy evaluation, we discuss alternative endpoints/designs/tests including the time-specific probability endpoint, the restricted mean survival time, the generalized pairwise comparison method, the immune-related response criteria, and the weighted log-rank or weighted Kaplan-Meier test. The benefits and limitations of these methods are discussed, and some recommendations are provided for applied researchers to implement these methods in clinical practice.

13.
This paper examines the asymptotic properties of a binary response model estimator based on maximization of the area under the receiver operating characteristic curve (AUC). Given certain assumptions, AUC maximization is a consistent method of binary response model estimation up to normalizations. As the AUC is equivalent to the Mann-Whitney U statistic and the Wilcoxon rank test, maximization of the area under the ROC curve is equivalent to the maximization of the corresponding statistics. Compared to parametric methods, such as logit and probit, AUC maximization relaxes assumptions about the error distribution, but imposes some restrictions on the distribution of the explanatory variables, which can be easily checked, since this information is observable.
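
A rough sketch of the idea, not the paper's procedure: the empirical AUC of the index Xβ is the Mann-Whitney fraction of correctly ordered (positive, negative) pairs, and it is invariant to positive rescaling of β, so β is normalized to unit length. Because the empirical AUC is a step function, the sketch maximizes a sigmoid-smoothed surrogate; the smoothing bandwidth, the simulated data and the t-distributed errors are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(5)

def empirical_auc(score, y):
    """Mann-Whitney form of the AUC: share of (positive, negative) pairs ranked correctly."""
    s1, s0 = score[y == 1], score[y == 0]
    return np.mean(s1[:, None] > s0[None, :])

def smoothed_neg_auc(beta, X, y, h=0.1):
    """Sigmoid-smoothed surrogate of 1 - AUC(X @ beta), with beta normalized
    because the AUC is invariant to positive rescaling of the index."""
    beta = beta / np.linalg.norm(beta)
    s = X @ beta
    s1, s0 = s[y == 1], s[y == 0]
    return 1.0 - np.mean(expit((s1[:, None] - s0[None, :]) / h))

# Simulated data with heavy-tailed errors (logit/probit would be misspecified here)
n, beta_true = 500, np.array([1.0, -0.5])
X = rng.normal(size=(n, 2))
y = (X @ beta_true + rng.standard_t(df=3, size=n) > 0).astype(int)

res = minimize(smoothed_neg_auc, x0=np.array([1.0, 0.0]), args=(X, y), method="Nelder-Mead")
beta_hat = res.x / np.linalg.norm(res.x)
print(beta_hat, empirical_auc(X @ beta_hat, y))
```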

14.
Abstract. In this article we consider a problem from bone marrow transplant (BMT) studies where there is interest in assessing the effect of haplotype match for donor and patient on the overall survival. The BMT study we consider is based on donors and patients that are genotype matched, and this therefore leads to a missing data problem. We show how Aalen's additive risk model can be applied in this setting, with the benefit that the time-varying haplomatch effect can be easily studied. This problem has not been considered before, and the standard approach where one would use the expectation–maximization (EM) algorithm cannot be applied for this model because the likelihood is hard to evaluate without additional assumptions. We suggest an approach based on multivariate estimating equations that are solved using a recursive structure. This approach leads to an estimator whose large sample properties can be developed using product-integration theory. Small sample properties are investigated using simulations in a setting that mimics the motivating haplomatch problem.

15.
The traditional mixture model assumes that a dataset is composed of several populations of Gaussian distributions. In real life, however, data often do not fit the restrictions of normality very well. It is likely that data from a single population exhibiting either asymmetrical or heavy-tail behavior could be erroneously modeled as two populations, resulting in suboptimal decisions. To avoid these pitfalls, we generalize the mixture model using adaptive kernel density estimators. Because kernel density estimators enforce no functional form, we can adapt to non-normal asymmetric, kurtotic, and tail characteristics in each population independently. This, in effect, robustifies mixture modeling. We adapt two computational algorithms, a genetic algorithm with regularized Mahalanobis distance and a genetic expectation–maximization algorithm, to optimize the kernel mixture model (KMM), and use results from robust estimation theory to data-adaptively regularize both. Finally, we likewise extend the information criterion ICOMP to score the KMM. We use these tools to simultaneously select the best mixture model and classify all observations without making any subjective decisions. The performance of the KMM is demonstrated on two medical datasets; in both cases, we recover the clinically determined group structure and substantially improve patient classification rates over the Gaussian mixture model.

16.
While at least some standard graphical tools exist for cardinal time series analysis, little research effort has been directed towards the visualization of categorical time series. The repertoire of such visual methods is almost exclusively restricted to a few isolated proposals from computer science and biology. This article aims at presenting a toolbox of known and newly developed approaches for visually analysing categorical time series data. Among these tools, the rate evolution graph, the circle transformation, pattern histograms and control charts are especially promising.
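
One of the simplest tools in such a toolbox, the rate evolution graph, plots the cumulative count (equivalently, cumulative relative frequency) of each category against time, so that drifts in the category rates appear as changes in slope. The sketch below is a minimal illustrative implementation on simulated data, not the article's code.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(6)

def rate_evolution_graph(series, categories=None, ax=None):
    """Rate evolution graph: cumulative count of each category over time."""
    series = np.asarray(series)
    if categories is None:
        categories = np.unique(series)
    if ax is None:
        ax = plt.gca()
    t = np.arange(1, len(series) + 1)
    for c in categories:
        ax.plot(t, np.cumsum(series == c), label=str(c))
    ax.set_xlabel("time")
    ax.set_ylabel("cumulative count")
    ax.legend(title="category")
    return ax

# Illustrative categorical series with a gradual drift towards category 'c'
probs = np.linspace([0.6, 0.3, 0.1], [0.2, 0.3, 0.5], 300)
series = [rng.choice(["a", "b", "c"], p=p / p.sum()) for p in probs]
rate_evolution_graph(series)
plt.show()
```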

17.
This paper presents an investigation of a method for minimizing functions of several parameters where the function need not be computed precisely. Motivated by problems requiring the optimization of negative log-likelihoods, we also want to estimate the (inverse) Hessian at the point of minimum. The imprecision of the function values impedes the application of conventional optimization methods, and the goal of Hessian estimation adds considerably to the difficulty of developing an algorithm. The present class of methods is based on statistical approximation of the functional surface by a quadratic model, so it is similar in motivation to many conventional techniques. The present work attempts to classify both problems and algorithmic tools in an effort to prescribe suitable techniques in a variety of situations. The codes are available from the authors' web site http://macnash.admin.uottawa.ca/~rsmin/.
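
The core idea of quadratic response-surface approximation can be illustrated in a single step: evaluate the noisy function at points around the current iterate, fit a quadratic model by least squares, move to its minimizer, and read the Hessian estimate off the quadratic coefficients. This is a simplified sketch, not the authors' algorithm; the test function, sampling radius and number of design points are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)

def noisy_f(x, sigma=0.05):
    """Noisy test function (a stand-in for an imprecise negative log-likelihood)."""
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2 + rng.normal(0, sigma)

def quadratic_fit_step(f, x0, radius=0.5, n_pts=60):
    """Fit f(x) ~ c + b.x + 0.5 x'Hx by least squares on points around x0;
    return the stationary point of the fitted quadratic and the estimated Hessian H."""
    d = len(x0)
    X = x0 + rng.uniform(-radius, radius, size=(n_pts, d))
    y = np.array([f(x) for x in X])
    # Design matrix: intercept, linear terms, squares and cross products
    cols = [np.ones(n_pts)] + [X[:, i] for i in range(d)]
    quad_idx = [(i, j) for i in range(d) for j in range(i, d)]
    cols += [X[:, i] * X[:, j] for i, j in quad_idx]
    D = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(D, y, rcond=None)
    b = coef[1:1 + d]
    H = np.zeros((d, d))
    for k, (i, j) in enumerate(quad_idx):
        c = coef[1 + d + k]
        H[i, j] = H[j, i] = 2 * c if i == j else c
    x_min = np.linalg.solve(H, -b)  # stationary point of the fitted quadratic
    return x_min, H

x_min, H = quadratic_fit_step(noisy_f, x0=np.zeros(2))
print(x_min)             # should land near the true minimum (1, -0.5) despite noise
print(np.linalg.inv(H))  # estimated inverse Hessian
```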

18.
Abstract. We consider a bidimensional Ornstein–Uhlenbeck process to describe the tissue microvascularization in anti-cancer therapy. Data are discrete, partial and noisy observations of this stochastic differential equation (SDE). Our aim is to estimate the SDE parameters. We use the main advantage of a one-dimensional observation to obtain an easy way to compute the exact likelihood using the Kalman filter recursion, which allows us to implement an easy numerical maximization of the likelihood. Furthermore, we establish the link between the observations and an ARMA process, and we deduce the asymptotic properties of the maximum likelihood estimator. We show that this ARMA property can be generalized to a higher dimensional underlying Ornstein–Uhlenbeck diffusion. We compare this estimator with the one obtained by the well-known expectation–maximization algorithm on simulated data. Our estimation methods can be directly applied to other biological contexts such as drug pharmacokinetics or hormone secretions.

19.
This article is concerned with the effect of methods for handling missing values in multivariate control charts. We discuss the complete-case, mean substitution, regression, stochastic regression, and expectation–maximization algorithm methods for handling missing values. Estimates of the mean vector and variance–covariance matrix from the treated data set are used to build the multivariate exponentially weighted moving average (MEWMA) control chart. Based on a Monte Carlo simulation study, the performance of each of the five methods is investigated in terms of its ability to attain the nominal in-control and out-of-control average run length (ARL). We consider three sample sizes, five levels of the percentage of missing values, and three numbers of variables. Our simulation results show that imputation methods perform better than case deletion methods. The regression-based imputation methods have the best overall performance among all the competing methods.
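
A minimal sketch of one of the five treatments (mean substitution) followed by the MEWMA statistic: Z_i = λ(X_i − μ0) + (1 − λ)Z_{i−1} and T_i² = Z_i' Σ_Z⁻¹ Z_i, using the asymptotic covariance Σ_Z = λ/(2 − λ) Σ. The simulated in-control data, missingness rate and smoothing constant are illustrative assumptions; the ARL simulation of the article is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(8)

def mean_impute(X):
    """Mean substitution: replace NaNs by the column (variable) means."""
    X = X.copy()
    col_means = np.nanmean(X, axis=0)
    nan_r, nan_c = np.where(np.isnan(X))
    X[nan_r, nan_c] = col_means[nan_c]
    return X

def mewma_statistics(X, mu0, sigma, lam=0.2):
    """MEWMA statistics T_i^2 = Z_i' Sigma_Z^{-1} Z_i with
    Z_i = lam*(X_i - mu0) + (1 - lam)*Z_{i-1} and Sigma_Z = lam/(2-lam)*Sigma."""
    sigma_z_inv = np.linalg.inv(lam / (2.0 - lam) * sigma)
    z = np.zeros(X.shape[1])
    t2 = np.empty(len(X))
    for i, x in enumerate(X):
        z = lam * (x - mu0) + (1.0 - lam) * z
        t2[i] = z @ sigma_z_inv @ z
    return t2

# In-control data with roughly 10% of entries missing completely at random
p, n = 3, 200
sigma = np.array([[1.0, 0.5, 0.2], [0.5, 1.0, 0.3], [0.2, 0.3, 1.0]])
X = rng.multivariate_normal(np.zeros(p), sigma, size=n)
X[rng.uniform(size=X.shape) < 0.10] = np.nan

X_imp = mean_impute(X)
mu_hat, sigma_hat = X_imp.mean(axis=0), np.cov(X_imp, rowvar=False)
t2 = mewma_statistics(X_imp, mu_hat, sigma_hat, lam=0.2)
print(t2[:5])  # compare against a control limit chosen to give the nominal in-control ARL
```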

20.
After reading a few articles in the nonlinear econometric literature one begins to notice that each discussion follows roughly the same lines as the classical treatment of maximum likelihood estimation. There are some technical problems having to do with simultaneously conditioning on the exogenous variables and subjecting the true parameter to a Pitman drift which prevent the use of the classical methods of proof, but the basic impression of similarity is correct. An estimator – be it nonlinear least squares, three-stage nonlinear least squares, or whatever – is the solution of an optimization problem. And the objective function of the optimization problem can be treated as if it were the likelihood to derive the Wald test statistic, the likelihood ratio test statistic, and Rao's efficient score statistic. Their asymptotic null and non-null distributions can be found using arguments fairly similar to the classical maximum likelihood arguments. In this article we exploit these observations and unify much of the nonlinear econometric literature. That which escapes this unification is that which has an objective function which is not twice continuously differentiable with respect to the parameters – minimum absolute deviations regression, for example.

The model which generates the data need not be the same as the model which was presumed to define the optimization problem. Thus, these results can be used to obtain the asymptotic behavior of inference procedures under specification error. We think that this will prove to be the most useful feature of the paper. For example, it is not necessary to resort to Monte Carlo simulation to determine if a Translog estimate of an elasticity of substitution obtained by nonlinear three-stage least squares is robust against a CES true state of nature. The asymptotic approximations we give here will provide an analytic answer to the question, sufficiently accurate for most purposes.
