
1.

A non-inferiority test for diagnostic accuracy in the absence of the golden standard test based on the paired partial areas under receiver operating characteristic curves

Shu-Man Shih Hsin-Neng Hsieh 《Journal of applied statistics》2016,43(3):550-562

Non-inferiority tests are often used to assess diagnostic accuracy in medical research. The area under the receiver operating characteristic (ROC) curve is a familiar summary measure of overall diagnostic accuracy. Nevertheless, since it may not differentiate ROC curves of diverse shapes with different diagnostic significance, another summary measure, the partial area under the ROC (PAUROC) curve, has emerged for diagnostic processes that require the false-positive rate to lie in a clinically relevant range. Traditionally, estimating the PAUROC requires the golden standard (GS) test of the true disease status. Nevertheless, the GS test may sometimes be infeasible. Besides, in many research fields such as epidemiology, the true disease status of the patients may not be known or available. Under the normality assumption on diagnostic test results, and based on the expectation-maximization algorithm in combination with the bootstrap method, we propose a heuristic method to construct a non-inferiority test for the difference in the paired PAUROCs without the GS test. In the simulation study, although the proposed method might provide a liberal test, on the whole its empirical size is sufficiently controlled at the significance level, and its empirical power in the absence of the GS is as good as that of the non-inferiority test in the presence of the GS. The proposed method is illustrated with published data.
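As a rough illustration of the quantity being compared, the sketch below evaluates the partial area under a binormal ROC curve by numerical integration. It assumes the group means and standard deviations are already known and that larger test values indicate disease; the EM and bootstrap steps of the abstract are not reproduced, and the function names and example parameter values are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for Phi and its inverse


def binormal_roc(fpr, mu0, sd0, mu1, sd1):
    """True-positive rate at a given false-positive rate under the binormal model.

    Non-diseased results are N(mu0, sd0^2), diseased results are N(mu1, sd1^2).
    """
    a = (mu1 - mu0) / sd1  # standardized separation of the two groups
    b = sd0 / sd1          # ratio of the standard deviations
    return STD_NORMAL.cdf(a + b * STD_NORMAL.inv_cdf(fpr))


def partial_auc(mu0, sd0, mu1, sd1, fpr_max, steps=2000):
    """Midpoint-rule approximation of the ROC area for FPR in (0, fpr_max)."""
    h = fpr_max / steps
    return h * sum(
        binormal_roc((i + 0.5) * h, mu0, sd0, mu1, sd1) for i in range(steps)
    )


if __name__ == "__main__":
    # Illustrative parameter values only (not taken from the paper).
    new_test = partial_auc(0.0, 1.0, 1.2, 1.0, fpr_max=0.2)
    reference = partial_auc(0.0, 1.0, 1.0, 1.0, fpr_max=0.2)
    print(f"difference in partial AUCs: {new_test - reference:.4f}")
```

The midpoint rule is used so that the inverse normal CDF is never evaluated at exactly 0 or 1, where it is undefined.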

2.

A lack-of-fit test for parametric zero-inflated Poisson models

《Journal of Statistical Computation and Simulation》2012,82(9):1081-1098

Count data often contain many zeros. In parametric regression analysis of zero-inflated count data, the effect of a covariate of interest is typically modelled via a linear predictor. This approach imposes a restrictive, and potentially questionable, functional form on the relation between the independent and dependent variables. To address the noted restrictions, a flexible parametric procedure is employed to model the covariate effect as a linear combination of fixed-knot cubic basis splines or B-splines. The semiparametric zero-inflated Poisson regression model is fitted by maximizing the likelihood function through an expectation–maximization algorithm. The smooth estimate of the functional form of the covariate effect can enhance modelling flexibility. Within this modelling framework, a log-likelihood ratio test is used to assess the adequacy of the covariate function. Simulation results show that the proposed test has excellent power in detecting the lack of fit of a linear predictor. A real-life data set is used to illustrate the practicality of the methodology.
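To make the EM idea concrete, here is a minimal sketch for the simplest zero-inflated Poisson model, with no covariates and no splines. It is not the semiparametric procedure of the abstract; the function names, starting values, and toy data are assumptions chosen for illustration.

```python
import math


def zip_em(counts, pi=0.5, lam=1.0, iterations=200):
    """EM for an intercept-only zero-inflated Poisson model.

    pi  : probability of a structural (excess) zero
    lam : mean of the Poisson component
    """
    n = len(counts)
    total = sum(counts)
    for _ in range(iterations):
        # E-step: posterior probability that an observed zero is structural.
        p_zero = pi / (pi + (1.0 - pi) * math.exp(-lam))
        z_sum = sum(p_zero for y in counts if y == 0)
        # M-step: closed-form updates given the expected memberships.
        pi = z_sum / n
        lam = total / (n - z_sum)
    return pi, lam


def zip_loglik(counts, pi, lam):
    """Observed-data log-likelihood of the zero-inflated Poisson model."""
    ll = 0.0
    for y in counts:
        pois = math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))
        ll += math.log(pi + (1.0 - pi) * pois) if y == 0 else math.log((1.0 - pi) * pois)
    return ll


if __name__ == "__main__":
    # Toy data with many zeros (hypothetical, for illustration only).
    data = [0] * 60 + [1] * 10 + [2] * 15 + [3] * 10 + [4] * 5
    pi_hat, lam_hat = zip_em(data)
    print(f"pi = {pi_hat:.3f}, lambda = {lam_hat:.3f}")
```

A regression version would replace the closed-form M-step with weighted fits of the two linear predictors, which is where a spline basis would enter.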

3.

A new three-parameter lifetime distribution

Sadegh Rezaei Nahid Tahghighnia 《Statistics》2013,47(4):835-860

Many if not most lifetime distributions are motivated only by mathematical interest. Here, a new three-parameter distribution motivated mainly by lifetime issues is introduced. Some properties of the new distribution including estimation procedures are derived. Three real-data applications are described to show superior performance versus at least five of the known lifetime models.

4.

A bivariate Sarmanov regression model for count data with generalised Poisson marginals

Vera Hofer Johannes Leitner 《Journal of applied statistics》2012,39(12):2599-2617

We present a bivariate regression model for count data that allows for positive as well as negative correlation of the response variables. The covariance structure is based on the Sarmanov distribution and consists of a product of generalised Poisson marginals and a factor that depends on particular functions of the response variables. The closed form of the probability function is derived by means of the moment-generating function. The model is applied to a large real dataset on health care demand. Its performance is compared with alternative models presented in the literature. We find that our model is significantly better than or at least equivalent to the benchmark models. It gives insights into influences on the variance of the response variables.

5.

Bivariate Kumaraswamy distribution: properties and a new method to generate bivariate classes

Wagner Barreto-Souza Artur J. Lemonte 《Statistics》2013,47(6):1321-1342

In this paper, we introduce a bivariate Kumaraswamy (BVK) distribution whose marginals are Kumaraswamy distributions. The cumulative distribution function of this bivariate model has absolutely continuous and singular parts. Representations for the cumulative and density functions are presented and properties such as marginal and conditional distributions, product moments and conditional moments are obtained. We show that the BVK model can be obtained from the Marshall and Olkin survival copula and obtain a tail dependence measure. The estimation of the parameters by maximum likelihood is discussed and the Fisher information matrix is determined. We propose an EM algorithm to estimate the parameters. Some simulations are presented to verify the performance of the direct maximum-likelihood estimation and the proposed EM algorithm. We also present a method to generate bivariate distributions from our proposed BVK distribution. Furthermore, we introduce a BVK distribution which has only an absolutely continuous part and discuss some of its properties. Finally, a real data set is analysed for illustrative purposes.

6.

Interval estimation for misclassification rate parameters in a complementary Poisson model

《Journal of Statistical Computation and Simulation》2012,82(9):1145-1156

We investigate three interval estimators for binomial misclassification rates in a complementary Poisson model where the data are possibly misclassified: a Wald-based interval, a score-based interval, and an interval based on the profile log-likelihood statistic. We investigate the coverage and average width properties of these intervals via a simulation study. For small Poisson counts and small misclassification rates, the intervals can perform poorly in terms of coverage. The profile log-likelihood confidence interval (CI) often outperforms the other intervals, with good coverage and width properties. Lastly, we apply the CIs to a real data set involving traffic accident data that contain misclassified counts.

7.

A mixture model with Poisson and zero-truncated Poisson components to analyze road traffic accidents in Turkey

Hande Konuk Ünlü Derek S. Young Ayten Yiğiter L. Hilal Özcebe 《Journal of applied statistics》2022,49(4):1003

The analysis of traffic accident data is crucial to address numerous concerns, such as understanding contributing factors in an accident's chain-of-events, identifying hotspots, and informing policy decisions about road safety management. The majority of statistical models employed for analyzing traffic accident data are logically count regression models (commonly Poisson regression) since a count – like the number of accidents – is used as the response. However, features of the observed data frequently do not make the Poisson distribution a tenable assumption. For example, observed data rarely demonstrate an equal mean and variance and often possess excess zeros. Sometimes, data may have a heterogeneous structure consisting of a mixture of populations, rather than a single population. In such data analyses, mixtures-of-Poisson-regression models can be used. In this study, the number of injuries resulting from casualties of traffic accidents registered by the General Directorate of Security (Turkey, 2005–2014) is modeled using a novel mixture distribution with two components: a Poisson and a zero-truncated Poisson distribution. Such a model differs from existing mixture models in the literature, where the components are either all Poisson distributions or all zero-truncated Poisson distributions. The proposed model is compared with the Poisson regression model via simulation and in the analysis of the traffic data.
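The two-component distribution described above can be written down directly. The following sketch gives its probability mass function without covariates; the parameter names `weight`, `lam_pois`, and `lam_ztp` and the example values are hypothetical, and the regression structure of the paper is not reproduced.

```python
import math


def poisson_pmf(y, lam):
    """P(Y = y) for a Poisson distribution with mean lam."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))


def zero_truncated_poisson_pmf(y, lam):
    """Poisson distribution conditioned on y >= 1."""
    if y == 0:
        return 0.0
    return poisson_pmf(y, lam) / (1.0 - math.exp(-lam))


def mixture_pmf(y, weight, lam_pois, lam_ztp):
    """Mixture of a Poisson (with probability weight) and a zero-truncated Poisson."""
    return (weight * poisson_pmf(y, lam_pois)
            + (1.0 - weight) * zero_truncated_poisson_pmf(y, lam_ztp))


if __name__ == "__main__":
    # Illustrative values only: all zeros come from the Poisson component.
    for y in range(5):
        print(y, round(mixture_pmf(y, weight=0.4, lam_pois=0.8, lam_ztp=3.5), 4))
```

In this form every observed zero is attributed to the Poisson component, since the zero-truncated component places no mass at zero.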

8.

A semi-analytical solution to the maximum-likelihood fit of Poisson data to a linear model using the Cash statistic

Massimiliano Bonamente David Spence 《Journal of applied statistics》2022,49(3):522

The Cash statistic, also known as the C statistic, is commonly used for the analysis of low-count Poisson data, including data with null counts for certain values of the independent variable. The use of this statistic is especially attractive for low-count data that cannot be combined, or re-binned, without loss of resolution. This paper presents a new maximum-likelihood solution for the best-fit parameters of a linear model using the Poisson-based Cash statistic. The solution presented in this paper provides a new and simple method to measure the best-fit parameters of a linear model for any Poisson-based data, including data with null counts. In particular, the method enforces the requirement that the best-fit linear model be non-negative throughout the support of the independent variable. The method is summarized in a simple algorithm to fit Poisson counting data of any size and counting rate with a linear model, by-passing entirely the use of the traditional χ² statistic.
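For orientation, the sketch below evaluates the Cash statistic in its commonly used form, C = 2 Σ [m_i − n_i + n_i ln(n_i / m_i)], for a linear model m(x) = a + b·x. It only computes the statistic for given parameters; the semi-analytical solution for the best-fit parameters described in the abstract is not reproduced, and the function names are assumptions.

```python
import math


def cash_statistic(counts, model):
    """Cash statistic for observed counts n_i and model predictions m_i.

    Bins with n_i = 0 contribute only m_i, because n * ln(n / m) -> 0 as n -> 0.
    """
    total = 0.0
    for n, m in zip(counts, model):
        if m <= 0:
            raise ValueError("model must be positive in every bin")
        total += m - n
        if n > 0:
            total += n * math.log(n / m)
    return 2.0 * total


def linear_model(xs, a, b):
    """Predicted counts m(x) = a + b * x at each value of the independent variable."""
    return [a + b * x for x in xs]


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0]
    counts = [1, 0, 2, 3]  # hypothetical low-count data including a null count
    print(cash_statistic(counts, linear_model(xs, a=0.5, b=0.7)))
```

The check on `m <= 0` reflects the non-negativity requirement mentioned in the abstract: the statistic is undefined wherever the model predicts a non-positive rate.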

9.

Bayesian predictive distribution for a Poisson model with a parametric restriction

Yasuyuki Hamura Tatsuya Kubokawa 《Communications in Statistics - Theory and Methods》2020,49(13):3257-3266

Abstract

Predictive probability estimation for a Poisson distribution is addressed when the parameter space is restricted. The Bayesian predictive probability against the prior on the restricted space is compared with the non-restricted Bayes predictive probability. It is shown that the former predictive probability dominates the latter under some conditions when the predictive probabilities are evaluated by the risk function relative to the Kullback-Leibler divergence. This result is proved by first showing the corresponding dominance result for estimating the restricted parameter and then translating it into the framework of predictive probability estimation.

10.

A comparative study of tests for paired lifetime data

Wang Z Ng HK 《Lifetime data analysis》2006,12(4):505-522

In this paper, we investigate different procedures for testing the equality of two mean survival times in paired lifetime studies. We consider Owen’s M-test and Q-test, a likelihood ratio test, the paired t-test, the Wilcoxon signed rank test and a permutation test based on log-transformed survival times in the comparative study. We also consider the paired t-test, the Wilcoxon signed rank test and a permutation test based on original survival times for the sake of comparison. The size and power characteristics of these tests are studied by means of Monte Carlo simulations under a frailty Weibull model. For less skewed marginal distributions, the Wilcoxon signed rank test based on original survival times is found to be desirable. Otherwise, the M-test and the likelihood ratio test are the best choices in terms of power. In general, one can choose a test procedure based on information about the correlation between the two survival times and the skewness of the marginal survival distributions.
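One of the procedures listed above, a permutation test on log-transformed paired survival times, can be sketched as an exact sign-flip test. This is a generic textbook version under the assumption of uncensored, strictly positive times; it is not necessarily the exact variant studied in the paper, and the function name is hypothetical.

```python
import math
from itertools import product


def paired_permutation_pvalue(times_a, times_b):
    """Exact two-sided sign-flip permutation test on log-transformed paired times.

    Under the null hypothesis each within-pair difference is equally likely to
    have either sign, so all 2^n sign assignments are enumerated.
    """
    diffs = [math.log(a) - math.log(b) for a, b in zip(times_a, times_b)]
    observed = abs(sum(diffs))
    hits = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        # Small tolerance guards against floating-point ties.
        if stat >= observed - 1e-12:
            hits += 1
        total += 1
    return hits / total


if __name__ == "__main__":
    # Hypothetical paired survival times for illustration.
    a = [12.0, 30.5, 8.2, 44.0, 19.3, 27.1]
    b = [10.1, 22.0, 9.0, 31.5, 15.2, 20.4]
    print(f"p-value: {paired_permutation_pvalue(a, b):.4f}")
```

Full enumeration is only practical for small samples; for larger n a random subset of sign assignments would normally be drawn instead.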

11.

Likelihood Ratio Testing for Hidden Markov Models Under Non-standard Conditions

JÖRN DANNEMANN HAJO HOLZMANN 《Scandinavian Journal of Statistics》2008,35(2):309-321

Abstract. In practical applications, when testing parametric restrictions for hidden Markov models (HMMs), one frequently encounters non-standard situations such as testing for zero entries in the transition matrix, one-sided tests for the parameters of the transition matrix or for the components of the stationary distribution of the underlying Markov chain, or testing boundary restrictions on the parameters of the state-dependent distributions. In this paper, we briefly discuss how the relevant asymptotic distribution theory for the likelihood ratio test (LRT) when the true parameter is on the boundary extends from the independent and identically distributed situation to HMMs. Then we concentrate on discussing a number of relevant examples. The finite-sample performance of the LRT in such situations is investigated in a simulation study. An application to series of epileptic seizure counts concludes the paper.

12.

Sparse inverse covariance estimation for high-throughput microRNA sequencing data in the Poisson log-normal graphical model

David Sinclair 《Journal of Statistical Computation and Simulation》2019,89(16):3105-3117

We introduce a one-step EM algorithm to estimate the graphical structure in a Poisson-Log-Normal graphical model. This procedure is equivalent to a normality transformation that makes the problem of identifying relationships in high-throughput microRNA (miRNA) sequence data feasible. The Poisson-log-normal model moreover allows us to directly account for known overdispersion relationships present in this data set. We show that our EM algorithm provides a provable increase in performance in determining the network structure. The model is shown to provide an increase in performance in simulation settings over a range of network structures. The model is applied to high-throughput miRNA sequencing data from patients with breast cancer from The Cancer Genome Atlas (TCGA). By selecting the most highly connected miRNA molecules in the fitted network we find that nearly all of them are known to be involved in the regulation of breast cancer.

13.

Maximum-likelihood estimation of the random-clumped multinomial model as a prototype problem for large-scale statistical computing

Andrew M. Raim Matthias K. Gobbert Nagaraj K. Neerchal Jorge G. Morel 《Journal of Statistical Computation and Simulation》2013,83(12):2178-2194

Numerical methods are needed to obtain maximum-likelihood estimates (MLEs) in many problems. Computation time can be an issue for some likelihoods even with modern computing power. We consider one such problem where the assumed model is a random-clumped multinomial distribution. We compute MLEs for this model in parallel using the Toolkit for Advanced Optimization software library. The computations are performed on a distributed-memory cluster with low latency interconnect. We demonstrate that for larger problems, scaling the number of processes improves wall clock time significantly. An illustrative example shows how parallel MLE computation can be useful in a large data analysis. Our experience with a direct numerical approach indicates that more substantial gains may be obtained by making use of the specific structure of the random-clumped model.

14.

Inference procedures for the variance gamma model and applications

K. Fragiadakis D. Karlis 《Journal of Statistical Computation and Simulation》2013,83(3):555-567

Goodness-of-fit tests for the family of the four-parameter normal–variance gamma distribution are constructed. The tests are based on a weighted integral incorporating the empirical characteristic function of suitably standardized data. Non-standard algorithms are employed for the computation of the maximum-likelihood estimators of the parameters involved in the test statistic, while Monte Carlo results are used in order to compare the new test with some classical goodness-of-fit methods. A real-data application is also included.

15.

A general location model with zero-inflated counts and skew normal outcomes

Sayed Jamal Mirkamali 《Journal of applied statistics》2017,44(15):2716-2728

This paper proposes an extension of the general location model using a joint model for analyzing inflated counting outcomes and skew continuous outcomes. A zero-inflated binomial with batches of binomials or a zero-inflated Poisson with batches of Poissons is proposed for the counting outcome and a skew normal distribution is assumed for the continuous outcome. The EM algorithm is developed for estimation of parameters. The accuracy of the estimates is evaluated using a simulation study. An application of our models to the joint analysis of the number of cigarettes smoked per day and the weights of respondents in the Americans' Changing Lives study is included.

16.

Learning-based EM algorithm for normal-inverse Gaussian mixture model with application to extrasolar planets

Wen-Liang Hung Shou-Jen Chang-Chien 《Journal of applied statistics》2017,44(6):978-999

Karlis and Santourian [14] proposed a model-based clustering algorithm, the expectation–maximization (EM) algorithm, to fit the mixture of multivariate normal-inverse Gaussian (NIG) distribution. However, the EM algorithm for the mixture of multivariate NIG requires a set of initial values to begin the iterative process, and the number of components has to be given a priori. In this paper, we present a learning-based EM algorithm: its aim is to overcome the aforementioned weaknesses of Karlis and Santourian's EM algorithm [14]. The proposed learning-based EM algorithm was first inspired by Yang et al. [24]: the process of how they perform self-clustering was then simulated. Numerical experiments showed promising results compared to Karlis and Santourian's EM algorithm. Moreover, the methodology is applicable to the analysis of extrasolar planets. Our analysis provides an understanding of the clustering results in the ln P–ln M and ln P–e spaces, where M is the planetary mass, P is the orbital period and e is orbital eccentricity. Our identified groups interpret two phenomena: (1) the characteristics of two clusters in ln P–ln M space might be related to the tidal and disc interactions (see [9]); and (2) there are two clusters in ln P–e space.

17.

A flexible cure rate model based on the polylogarithm distribution

Diego I. Gallardo Yolanda M. Gómez Mário de Castro 《Journal of Statistical Computation and Simulation》2018,88(11):2137-2149

Models for dealing with survival data in the presence of a cured fraction of individuals have attracted the attention of many researchers and practitioners in recent years. In this paper, we propose a cure rate model under the competing risks scenario. For the number of causes that can lead to the event of interest, we assume the polylogarithm distribution. The model is flexible in the sense it encompasses some well-known models, which can be tested using large sample test statistics applied to nested models. Maximum-likelihood estimation based on the EM algorithm and hypothesis testing are investigated. Results of simulation studies designed to gauge the performance of the estimation method and of two test statistics are reported. The methodology is applied in the analysis of a data set.

18.

A simplified estimation procedure based on the EM algorithm for the power series cure rate model

Diego I. Gallardo Jose S. Romeo Renate Meyer 《Communications in Statistics - Simulation and Computation》2017,46(8):6342-6359

The family of power series cure rate models provides a flexible modeling framework for survival data of populations with a cure fraction. In this work, we present a simplified estimation procedure for the maximum likelihood (ML) approach. ML estimates are obtained via the expectation-maximization (EM) algorithm where the expectation step involves computation of the expected number of concurrent causes for each individual. It has the big advantage that the maximization step can be decomposed into separate maximizations of two lower-dimensional functions of the regression and survival distribution parameters, respectively. Two simulation studies are performed: the first to investigate the accuracy of the estimation procedure for different numbers of covariates and the second to compare our proposal with the direct maximization of the observed log-likelihood function. Finally, we illustrate the technique for parameter estimation on a dataset of survival times for patients with malignant melanoma.

19.

Inference for mixed generalized exponential distribution under progressively type-II censored samples

Yuzhu Tian Qianqian Zhu 《Journal of applied statistics》2014,41(3):660-676

In industrial life tests, reliability analysis and clinical trials, the type-II progressive censoring methodology, which allows for random removals of the remaining survival units at each failure time, has become quite popular for analyzing lifetime data. Parameter estimation under progressively type-II censored samples for many common lifetime distributions has been investigated extensively. However, how to estimate unknown parameters of the mixed distribution models under progressive type-II censoring schemes is still a challenging and interesting problem. Based on progressively type-II censored samples, this paper addresses the estimation problem of mixed generalized exponential distributions. In addition, it is observed that the maximum-likelihood estimates (MLEs) cannot be easily obtained in closed form due to the complexity of the likelihood function. Thus, we make good use of the expectation-maximization algorithm to obtain the MLEs. Finally, some simulations are implemented in order to show the performance of the proposed method under finite samples and a case analysis is illustrated.

20.

Stochastic EM algorithm of a finite mixture model from hurdle Poisson distribution with missing responses

Ying-zi Fu 《Communications in Statistics - Theory and Methods》2013,42(20):5918-5932

ABSTRACT

In this article, a finite mixture model of hurdle Poisson distribution with missing outcomes is proposed, and a stochastic EM algorithm is developed for obtaining the maximum likelihood estimates of model parameters and mixing proportions. Specifically, missing data are assumed to be missing not at random (MNAR)/non-ignorable missing (NINR) and the corresponding missingness mechanism is modeled through probit regression. To improve the algorithm efficiency, a stochastic step is incorporated into the E-step based on data augmentation, whereas the M-step is solved by the method of conditional maximization. A variation on the Bayesian information criterion (BIC) is also proposed to compare models with different numbers of components in the presence of missing values. The considered model is a general model framework and it captures the important characteristics of count data analysis such as zero inflation/deflation, heterogeneity as well as missingness, providing us with more insight into the data features and allowing for dispersion to be investigated more fully and correctly. Since the stochastic step only involves simulating samples from some standard distributions, the computational burden is alleviated. Once missing responses and latent variables are imputed to replace the conditional expectation, our approach works as part of a multiple imputation procedure. A simulation study and a real example illustrate the usefulness and effectiveness of our methodology.