
1.

Nonparametric predictive inference for diagnostic test thresholds

《Communications in Statistics - Theory and Methods》2012,41(3):697-725

Abstract

Measuring the accuracy of diagnostic tests is crucial in many application areas including medicine, machine learning and credit scoring. The receiver operating characteristic (ROC) curve and surface are useful tools to assess the ability of diagnostic tests to discriminate between ordered classes or groups. To define these diagnostic tests, one must select the optimal thresholds that maximize their accuracy. One commonly used procedure for finding the optimal thresholds is to maximize Youden’s index. This article presents nonparametric predictive inference (NPI) for selecting the optimal thresholds of a diagnostic test. NPI is a frequentist statistical method that is explicitly aimed at using few modeling assumptions, enabled through the use of lower and upper probabilities to quantify uncertainty. Based on multiple future observations, the NPI approach is presented for selecting the optimal thresholds for two-group and three-group scenarios. In addition, a pairwise approach is presented for the three-group scenario. The article ends with an example to illustrate the proposed methods and a simulation study of the predictive performance of the proposed methods along with some classical methods such as Youden’s index. The NPI-based methods show some interesting results that overcome some of the issues concerning the predictive performance of Youden’s index.
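
For orientation, the classical baseline mentioned in this abstract, Youden's index J(c) = sensitivity(c) + specificity(c) - 1, can be sketched for the two-group case as follows. This is only an illustrative sketch of the empirical Youden criterion, not the NPI method of the article; the function name `youden_threshold`, the sample data, and the convention that scores above the threshold are classified as diseased are all assumptions made here.

```python
def youden_threshold(healthy, diseased):
    """Return (threshold, J) maximizing the empirical Youden index.

    Assumed convention: a score strictly greater than the threshold
    is classified as diseased; candidate thresholds are the observed scores.
    """
    candidates = sorted(set(healthy) | set(diseased))
    best_c, best_j = None, -1.0
    for c in candidates:
        # Sensitivity: fraction of diseased scores above the threshold.
        sens = sum(x > c for x in diseased) / len(diseased)
        # Specificity: fraction of healthy scores at or below the threshold.
        spec = sum(x <= c for x in healthy) / len(healthy)
        j = sens + spec - 1.0
        if j > best_j:  # keep the first maximizer if there are ties
            best_c, best_j = c, j
    return best_c, best_j


# Hypothetical test scores for illustration only.
healthy = [1.0, 1.5, 2.0, 2.5, 3.6]
diseased = [2.8, 3.5, 4.0, 4.5, 5.0]
threshold, j = youden_threshold(healthy, diseased)
print(threshold, round(j, 3))
```

With these hypothetical scores, the threshold 2.5 classifies all diseased scores and four of the five healthy scores correctly.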

2.

Testing the equality of quantiles for several normal populations

Kamel Abdollahnezhad Ali Akbar Jafari 《Communications in Statistics - Simulation and Computation》2018,47(7):1890-1898

Many procedures exist for testing equality of means or medians to compare several independent distributions. However, the mean or median does not determine the entire distribution. In this article, we propose a new small-sample modification of the likelihood ratio test for testing the equality of the quantiles of several normal distributions. The merits of the proposed test are numerically compared with the existing tests (a generalized p-value method and the likelihood ratio test) with respect to their sizes and powers. The simulation results demonstrate that the proposed method is satisfactory; its actual size is very close to the nominal level. We illustrate these approaches using two real examples.

3.

New improvements in the use of dependence measures for sensitivity analysis and screening

《Journal of Statistical Computation and Simulation》2012,82(15):3038-3058

ABSTRACT

Physical phenomena are commonly modelled by time-consuming numerical simulators that are functions of many uncertain parameters, whose influences can be measured via a global sensitivity analysis. The usual variance-based indices require too many simulations, especially when the inputs are numerous. To address this limitation, we consider recent advances in dependence measures, focusing on the distance correlation and the Hilbert–Schmidt independence criterion. We study and use these indices for screening purposes. Numerical tests reveal differences between variance-based indices and dependence measures. Then, two approaches are proposed to use the latter for screening. The first approach uses independence tests, with existing asymptotic versions and spectral extensions; bootstrap versions are also proposed. The second considers a linear model with dependence measures, coupled to a bootstrap selection method or a Lasso penalization. Numerical experiments show their potential in the presence of many non-influential inputs and give successful results for a nuclear reliability application.
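
As a rough illustration of one of the dependence measures named in this abstract, the sample distance correlation between two univariate samples can be computed from double-centered pairwise distance matrices. The sketch below uses the standard (biased, V-statistic) sample estimator and is not the screening procedure of the article; the function names and the example data are assumptions introduced here.

```python
import math


def _centered_distances(values):
    """Double-centered matrix of pairwise absolute differences."""
    n = len(values)
    d = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length univariate samples."""
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    pairs = [(i, j) for i in range(n) for j in range(n)]
    dcov2 = sum(a[i][j] * b[i][j] for i, j in pairs) / n**2
    dvar_x = sum(a[i][j] ** 2 for i, j in pairs) / n**2
    dvar_y = sum(b[i][j] ** 2 for i, j in pairs) / n**2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0  # a constant sample carries no dependence information
    return math.sqrt(max(dcov2, 0.0) / denom)


# Hypothetical input/output pair: an exactly linear relation.
x = [0.1, 0.4, 0.5, 0.9, 1.3]
y = [2.0 * v + 1.0 for v in x]
dcor_linear = distance_correlation(x, y)
print(round(dcor_linear, 6))
```

For an exactly linear relation the centered distance matrices are proportional, so the sample distance correlation equals 1 up to rounding.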

4.

The modified permutation entropy-based independence test of time series

Emad Ashtari Nezhad G. R. Mohtashami Borzadaran H. R. Nilli Sani Hadi Alizadeh Noughabi 《Communications in Statistics - Simulation and Computation》2013,42(10):2877-2897

Abstract

In time series analysis, it is essential to check the independence of the data with a proper method or an appropriate statistical test before any further analysis. Among the available independence tests, a powerful test was introduced by Matilla-García and Marín via an m-dimensional vectorial process, in which the value of the process at time t includes the m-histories of the primary process. However, this method induces dependence among the vectors even when the random variables themselves are assumed independent. Taking this dependence into account, a modified test is obtained in this article by deriving a new asymptotic distribution based on weighted chi-square random variables. Some further alterations to the test are made via the bootstrap method and by controlling the overlap. Compared with the primary test, the modified test is found to be not only more accurate but also more powerful.
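
To make the m-histories concrete, the sketch below maps each overlapping window of length m to its ordinal pattern and computes the permutation entropy of the resulting pattern frequencies. It is a minimal sketch of the underlying quantity only, assuming the usual ordinal-pattern definition; it does not implement the test statistic or the modified asymptotic distribution of the article, and the function names and sample series are assumptions introduced here. Because consecutive windows share m - 1 observations, the patterns are dependent even for an independent series, which is the issue the abstract describes.

```python
import math
from collections import Counter


def ordinal_pattern(window):
    """Indices of the window sorted by value (ties broken by position)."""
    return tuple(sorted(range(len(window)), key=lambda k: window[k]))


def permutation_entropy(series, m):
    """Shannon entropy (natural log) of ordinal patterns of length m."""
    # Overlapping windows: consecutive m-histories share m - 1 observations.
    patterns = Counter(
        ordinal_pattern(series[t:t + m]) for t in range(len(series) - m + 1)
    )
    total = sum(patterns.values())
    return -sum((c / total) * math.log(c / total) for c in patterns.values())


# Hypothetical series for illustration only.
series = [4, 7, 9, 10, 6, 11, 3]
h = permutation_entropy(series, 2)
print(round(h, 4))
```

Here four of the six windows of length 2 are increasing and two are decreasing, so the entropy is that of a (2/3, 1/3) distribution; a strictly monotone series would give entropy 0.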

5.

Goodness-of-fit-based outlier detection for Phase I monitoring

Xiaona Yang Xuemin Zi 《Communications in Statistics - Simulation and Computation》2013,42(10):2979-2991

ABSTRACT

Nonparametric charts are useful in statistical process control when there is little or no knowledge about the underlying process distribution. Most existing approaches in the Phase I monitoring literature assume that outliers have the same distribution as the in-control sample and differ only in location or scale parameters, so they may not be effective against distributional changes. This article develops a new procedure based on the integration of the classical Anderson–Darling goodness-of-fit test and the stepwise isolation method. Our proposed procedure is efficient in detecting potential shifts in location, scale, or shape, and thus it offers robust protection against variation in various underlying distributions. The finite sample performance of our method is evaluated through simulations and is compared with that of available outlier detection methods for Phase I monitoring.

6.

Application of the Bootstrap Approach to the Choice of Dimension and the α Parameter in the SIRα Method

Benoît Liquet 《Communications in Statistics - Simulation and Computation》2013,42(6):1198-1218

To reduce the dimensionality of regression problems, sliced inverse regression approaches make it possible to determine linear combinations of a set of explanatory variables X related to the response variable Y in a general semiparametric regression context. From a practical point of view, the determination of a suitable dimension (the number of linear combinations of X) is important. In the literature, statistical tests based on the nullity of some eigenvalues have been proposed. Another approach is to consider the quality of the estimation of the effective dimension reduction (EDR) space. The square trace correlation between the true EDR space and its estimate can be used as a measure of goodness of estimation. In this article, we focus on the SIR_α method and propose a naïve bootstrap estimation of the square trace correlation criterion. Moreover, this criterion can also be used to select the α parameter in the SIR_α method. We indicate how it can be used in practice. A simulation study is performed to illustrate the behavior of this approach.

7.

New Simple Tests for Panel Cointegration

Joakim Westerlund 《Econometric Reviews》2013,32(3):297-316

ABSTRACT

In this paper, two new simple residual-based panel data tests are proposed for the null of no cointegration. The tests are simple because they do not require any correction for the temporal dependencies of the data. Yet they are able to accommodate individual specific short-run dynamics, individual specific intercept and trend terms, and individual specific slope parameters. The limiting distributions of the tests are derived and are shown to be free of nuisance parameters. The Monte Carlo results in this paper suggest that the asymptotic results are borne out well even in very small samples.

8.

Computing Critical Values of Exact Tests by Incorporating Monte Carlo Simulations Combined with Statistical Tables (total citations: 1; self-citations: 0; citations by others: 1)

Albert Vexler Young Min Kim Jihnhee Yu Nicole A. Lazar Alan D. Hutson 《Scandinavian Journal of Statistics》2014,41(4):1013-1030

Various exact tests for statistical inference are available for powerful and accurate decision rules provided that corresponding critical values are tabulated or evaluated via Monte Carlo methods. This article introduces a novel hybrid method for computing p‐values of exact tests by combining Monte Carlo simulations and statistical tables generated a priori. To use the data from Monte Carlo generations and tabulated critical values jointly, we employ kernel density estimation within Bayesian‐type procedures. The p‐values are linked to the posterior means of quantiles. In this framework, we present relevant information from the Monte Carlo experiments via likelihood‐type functions, whereas tabulated critical values are used to reflect prior distributions. The local maximum likelihood technique is employed to compute functional forms of prior distributions from statistical tables. Empirical likelihood functions are proposed to replace parametric likelihood functions within the structure of the posterior mean calculations to provide a Bayesian‐type procedure with a distribution‐free set of assumptions. We derive the asymptotic properties of the proposed nonparametric posterior means of quantiles process. Using the theoretical propositions, we calculate the minimum number of Monte Carlo resamples needed for a desired level of accuracy on the basis of distances between actual data characteristics (e.g. sample sizes) and characteristics of data used to present corresponding critical values in a table. The proposed approach makes practical applications of exact tests simple and rapid. Implementations of the proposed technique are easily carried out via the recently developed STATA and R statistical packages.

9.

Model Averaging for Prediction With Fragmentary Data

Fang Fang Wei Lan Jingjing Tong Jun Shao 《Journal of Business & Economic Statistics》2013,31(3):517-527

ABSTRACT

One main challenge for statistical prediction with data from multiple sources is that not all the associated covariate data are available for many sampled subjects. Consequently, we need new statistical methodology to handle this type of “fragmentary data”, which has become increasingly common in recent years. In this article, we propose a novel method based on frequentist model averaging that fits some candidate models using all available covariate data. The weights in model averaging are selected by delete-one cross-validation based on the data from complete cases. The optimality of the selected weights is rigorously proved under some conditions. The finite sample performance of the proposed method is confirmed by simulation studies. An example of personal income prediction based on real data from a leading e-community of wealth management in China is also presented for illustration.

10.

Using logistic regression for semiparametric comparison of population means and variances

Shuwen Wan Binrong Xu Biao Zhang 《Communications in Statistics - Theory and Methods》2013,42(9):2485-2503

Abstract

We propose to compare population means and variances under a semiparametric density ratio model. The proposed method is easy to implement using the logistic regression procedures available in many statistical software packages, and it often works very well when data are not normal. In this paper, we construct semiparametric estimators of the differences of two population means and variances, and derive their asymptotic distributions. We prove that the proposed semiparametric estimators are asymptotically more efficient than the corresponding nonparametric ones. In addition, a simulation study and the analysis of two real data sets are presented. Finally, a short discussion is provided.

11.

Simple and flexible Bayesian inferences for standardized regression coefficients

Yonggang Lu Peter Westfall 《Journal of applied statistics》2019,46(12):2254-2288

ABSTRACT

In statistical practice, inferences on standardized regression coefficients are often required, but complicated by the fact that they are nonlinear functions of the parameters, and thus standard textbook results are simply wrong. Within the frequentist domain, asymptotic delta methods can be used to construct confidence intervals of the standardized coefficients with proper coverage probabilities. Alternatively, Bayesian methods solve similar and other inferential problems by simulating data from the posterior distribution of the coefficients. In this paper, we present Bayesian procedures that provide comprehensive solutions for inferences on the standardized coefficients. Simple computing algorithms are developed to generate posterior samples with no autocorrelation, based on both noninformative improper and informative proper prior distributions. Simulation studies show that Bayesian credible intervals constructed by our approaches have comparable or even better statistical properties than their frequentist counterparts, particularly in the presence of collinearity. In addition, our approaches solve some meaningful inferential problems that are difficult if not impossible from the frequentist standpoint, including identifying joint rankings of multiple standardized coefficients and making optimal decisions concerning their sizes and comparisons. We illustrate applications of our approaches through examples and make sample R functions available for implementing our proposed methods.

12.

Parallel tempering for dynamic generalized linear models

Guangbao Guo Wei Shao Lu Lin Xuehu Zhu 《Communications in Statistics - Theory and Methods》2013,42(21):6299-6310

ABSTRACT

Markov chain Monte Carlo (MCMC) methods can be used for statistical inference. These methods are time-consuming for time-varying models. To address this problem, parallel tempering (PT), a parallel MCMC method, is applied to dynamic generalized linear models (DGLMs), and several optimal properties of the proposed method are presented. In PT, two or more samples are drawn at the same time, and the samples can exchange information with each other. We also present some simulations of the DGLMs and provide two applications of Poisson-type DGLMs in financial research.

13.

Testing the Martingale Difference Hypothesis

《Econometric Reviews》2013,32(4):351-377

Abstract

In this paper we consider testing that an economic time series follows a martingale difference process. The martingale difference hypothesis has typically been tested using information contained in the second moments of a process, that is, using test statistics based on the sample autocovariances or periodograms. Tests based on these statistics are inconsistent since they cannot detect nonlinear alternatives. In this paper we consider tests that detect linear and nonlinear alternatives. Given that the asymptotic distributions of the considered test statistics depend on the data generating process, we propose to implement the tests using a modified wild bootstrap procedure. The paper theoretically justifies the proposed tests and examines their finite sample behavior by means of Monte Carlo experiments.

14.

Nonparametric particle filtering approaches for identification and inference in nonlinear state-space dynamic systems

Jean-Pierre Gauchi Jean-Pierre Vila 《Statistics and Computing》2013,23(4):523-533

Most system identification approaches and statistical inference methods rely on the availability of the analytic knowledge of the probability distribution function of the system output variables. In the case of dynamic systems modelled by hidden Markov chains or stochastic nonlinear state-space models, these distributions, as well as those of the state variables themselves, can be unknown or intractable. In that situation, the usual particle Monte Carlo filters for system identification or likelihood-based inference and model selection methods have to rely, whenever possible, on some hazardous approximations and are often at risk. This review shows how a recent nonparametric particle filtering approach can be efficiently used in that context, not only for consistent filtering of these systems but also to restore these statistical inference methods, allowing, for example, consistent particle estimation of Bayes factors or the generalisation of sequential tests for detecting model parameter changes. Real-life applications of these particle approaches to a microbiological growth model are proposed as illustrations.

15.

Testing the difference between two independent regression models

Mohammad Reza Mahmoudi Marziyeh Mahmoudi Elaheh Nahavandi 《Communications in Statistics - Theory and Methods》2013,42(21):6284-6289

ABSTRACT

In some situations, for example, in biology or psychology studies, we wish to determine whether the linear relationship between the response variable and the predictor variables differs in two populations. Analysis of covariance (ANCOVA) or, equivalently, the partial F-test is the commonly used method. In this study, the asymptotic distribution of the difference between two independent regression coefficients was established. The proposed method was used to derive an asymptotic confidence set for the difference between coefficients and a hypothesis test for the equality of the two regression models. Then a simulation study was conducted to compare the proposed method with the partial F method. The performance of the new method was comparable with that of the partial F method.

16.

Clustering and classification problems in genetics through U-statistics

Gabriela B. Cybis Marcio Valk Sílvia R. C. Lopes 《Journal of Statistical Computation and Simulation》2018,88(10):1882-1902

ABSTRACT

Genetic data are frequently categorical and have complex dependence structures that are not always well understood. For this reason, clustering and classification based on genetic data, while highly relevant, are challenging statistical problems. Here we consider a versatile U-statistics-based approach for non-parametric clustering that allows for an unconventional way of solving these problems. In this paper we propose a statistical test to assess group homogeneity, taking into account multiple testing issues, and a clustering algorithm based on dissimilarities within and between groups that greatly speeds up the homogeneity test. We also propose a test to verify the classification significance of a sample in one of two groups. We present Monte Carlo simulations that evaluate the size and power of the proposed tests under different scenarios. Finally, the methodology is applied to three different genetic data sets: global human genetic diversity, breast tumour gene expression and Dengue virus serotypes. These applications showcase this statistical framework's ability to answer diverse biological questions in the high-dimension, low-sample-size scenario while adapting to the specificities of the different datatypes.

17.

Testing Chaos Based on Empirical Distribution Function: A Simulation Study

《Journal of Statistical Computation and Simulation》2012,82(1):77-85

It is well known that many classical statistical tests of randomness generally fail to distinguish chaos generated by some lower-dimensional deterministic dynamical systems from independent and identically distributed (i.i.d.) random series. In this paper, we suggest a powerful statistical testing method based on the empirical distribution function that can reliably distinguish chaos from i.i.d. random series.

18.

Limitations of P-Values and R-squared for Stepwise Regression Building: A Fairness Demonstration in Health Policy Risk Adjustment

Sherri Rose Thomas G. McGuire 《The American statistician》2019,73(1):152-156

ABSTRACT

Stepwise regression building procedures are commonly used applied statistical tools, despite their well-known drawbacks. While many of their limitations have been widely discussed in the literature, other aspects of the use of individual statistical fit measures, especially in high-dimensional stepwise regression settings, have not. Giving primacy to individual fit, as is done with p-values and R², when group fit may be the larger concern, can lead to misguided decision making. One of the most consequential uses of stepwise regression is in health care, where these tools allocate hundreds of billions of dollars to health plans enrolling individuals with different predicted health care costs. The main goal of this “risk adjustment” system is to convey incentives to health plans such that they provide health care services fairly, a component of which is not to discriminate in access or care for persons or groups likely to be expensive. We address some specific limitations of p-values and R² for high-dimensional stepwise regression in this policy problem through an illustrated example by additionally considering a group-level fairness metric.

19.

A Conway Maxwell Poisson type generalization of the negative hypergeometric distribution

Sudip Roy Ram C. Tripathi Narayanaswamy Balakrishnan 《Communications in Statistics - Theory and Methods》2020,49(10):2410-2428

Abstract

The negative hypergeometric distribution arises as a waiting time distribution when we sample without replacement from a finite population. It has applications in many areas such as inspection sampling and estimation of wildlife populations. However, as is well known, the negative hypergeometric distribution is over-dispersed in the sense that its variance is greater than the mean. To make it more flexible and versatile, we propose a modified version of the negative hypergeometric distribution called the COM-Negative Hypergeometric distribution (COM-NH) by introducing a shape parameter as in the COM-Poisson and COMP-Binomial distributions. It is shown that under some limiting conditions, COM-NH approaches a distribution that we call the COM-Negative binomial (COMP-NB), which in turn approaches the COM Poisson distribution. For the proposed model, we investigate the dispersion characteristics and shape of the probability mass function for different combinations of parameters. We also develop statistical inference for this model, including parameter estimation and hypothesis tests. In particular, we investigate by Monte Carlo simulation some properties of the maximum likelihood estimators of its parameters, such as bias, MSE, and coverage probabilities, as well as a likelihood ratio test to assess the shape parameter of the underlying model. We present illustrative data for discussion.

20.

Detecting change in a hazard regression model with right-censoring

Jean-François Dupuy 《Journal of statistical planning and inference》2009

The hazard function plays an important role in survival analysis and reliability, since it quantifies the instantaneous failure rate of an individual at a given time point t, given that this individual has not failed before t. In some applications, abrupt changes in the hazard function are observed, and it is of interest to detect the location of such a change. In this paper, we consider testing for the existence of a change in the parameters of an exponential regression model, based on a sample of right-censored survival times and the corresponding covariates. Likelihood ratio type tests are proposed and non-asymptotic bounds for the type II error probability are obtained. When the tests lead to acceptance of a change, estimators for the location of the change are proposed. Non-asymptotic upper bounds of the underestimation and overestimation probabilities are obtained. A short simulation study illustrates these results.