Similar Documents (20 records)
1.
A contaminated beta model $(1-\gamma) B(1,1) + \gamma B(\alpha,\beta)$ is often used to describe the distribution of $P$‐values arising from a microarray experiment. The authors propose and examine a different approach: namely, using a contaminated normal model $(1-\gamma) N(0,\sigma^2) + \gamma N(\mu,\sigma^2)$ to describe the distribution of $Z$ statistics or suitably transformed $T$ statistics. The authors then address whether a researcher who has $Z$ statistics should analyze them using the contaminated normal model or whether the $Z$ statistics should be converted to $P$‐values to be analyzed using the contaminated beta model. The authors also provide a decision‐theoretic perspective on the analysis of $Z$ statistics. The Canadian Journal of Statistics 38: 315–332; 2010 © 2010 Statistical Society of Canada
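A minimal simulation sketch (Python; the values of $\gamma$, $\mu$, and $\sigma$ are illustrative, not taken from the paper) of how $Z$ statistics drawn from the contaminated normal model translate into $P$‐values for the contaminated beta analysis:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
m, gamma, mu, sigma = 10_000, 0.05, 3.0, 1.0   # hypothetical settings

# Contaminated normal model for the Z statistics: with probability 1-gamma
# a "null" gene, with probability gamma a "non-null" gene.
is_nonnull = rng.random(m) < gamma
z = np.where(is_nonnull,
             rng.normal(mu, sigma, m),
             rng.normal(0.0, sigma, m))

# One-sided conversion to P-values; under the null component (sigma = 1 here)
# these are Uniform(0,1), i.e. the B(1,1) part of the contaminated beta model.
p = stats.norm.sf(z)
print(f"proportion of P-values below 0.05: {np.mean(p < 0.05):.3f}")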

2.
This paper deals with a bias correction of Akaike's information criterion (AIC) for selecting variables in multivariate normal linear regression models when the true distribution of the observations is an unknown non‐normal distribution. It is well known that the bias of AIC is $O(1)$, and there are a number of first‐order bias‐corrected AICs that improve the bias to $O(n^{-1})$, where $n$ is the sample size. A new information criterion is proposed by slightly adjusting the first‐order bias‐corrected AIC. Although the adjustment is achieved by merely using constant coefficients, the bias of the new criterion is reduced to $O(n^{-2})$. The variance of the new criterion is also improved. Through numerical experiments, we verify that our criterion is superior to others. The Canadian Journal of Statistics 39: 126–146; 2011 © 2011 Statistical Society of Canada
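For reference (this is the classical, uncorrected criterion, not the one proposed in the paper): with maximized likelihood $L(\hat{\theta})$ and $k$ estimated parameters,

$$ \mathrm{AIC} = -2\log L(\hat{\theta}) + 2k, $$

and the bias‐corrected criteria discussed above adjust this penalty, via constant coefficients, so that the bias drops from $O(1)$ to $O(n^{-1})$ and, for the new criterion, to $O(n^{-2})$.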

3.
This paper considers estimators of survivor functions subject to a stochastic ordering constraint based on right censored data. We present the constrained nonparametric maximum likelihood estimator (C‐NPMLE) of the survivor functions in one‐ and two‐sample settings where the survivor distributions could be discrete or continuous, and discuss the non‐uniqueness of the estimators. We also present a computationally efficient algorithm to obtain the C‐NPMLE. To address the possibility of non‐uniqueness of the C‐NPMLE of $S_1(t)$ when $S_1(t)\le S_2(t)$, we consider the maximum C‐NPMLE (MC‐NPMLE) of $S_1(t)$. In the one‐sample case with arbitrary upper bound survivor function $S_2(t)$, we present a novel and efficient algorithm for finding the MC‐NPMLE of $S_1(t)$. Dykstra (1982) also considered constrained nonparametric maximum likelihood estimation for such problems; however, as we show, Dykstra's method has an error and does not always give the C‐NPMLE. We correct this error, and simulations show improved efficiency compared to Dykstra's estimator. Confidence intervals based on bootstrap methods are proposed and consistency of the estimators is proved. Data from a study on larynx cancer are analysed to illustrate the method. The Canadian Journal of Statistics 40: 22–39; 2012 © 2012 Statistical Society of Canada

4.
We study estimation and feature selection problems in mixture‐of‐experts models. An $l_2$‐penalized maximum likelihood estimator is proposed as an alternative to the ordinary maximum likelihood estimator. The estimator is particularly advantageous when fitting a mixture‐of‐experts model to data with many correlated features. It is shown that the proposed estimator is root‐$n$ consistent, and simulations show its superior finite sample behaviour compared to that of the maximum likelihood estimator. For feature selection, two extra penalty functions are applied to the $l_2$‐penalized log‐likelihood function. The proposed feature selection method is computationally much more efficient than the popular all‐subset selection methods. Theoretically it is shown that the method is consistent in feature selection, and simulations support our theoretical results. A real‐data example is presented to demonstrate the method. The Canadian Journal of Statistics 38: 519–539; 2010 © 2010 Statistical Society of Canada
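Schematically (notation ours, not the paper's): writing $\ell_n(\theta)$ for the mixture‐of‐experts log‐likelihood with $\theta$ collecting the gating and expert coefficients, the estimator maximizes the penalized objective

$$ \ell_n(\theta) - \lambda\,\|\theta\|_2^2, \qquad \lambda > 0, $$

and the feature‐selection procedure adds two further penalty functions to this $l_2$‐penalized log‐likelihood.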

5.
We are interested in estimating prediction error for a classification model built on high dimensional genomic data when the number of genes ($p$) greatly exceeds the number of subjects ($n$). We examine a distance argument supporting the conventional 0.632+ bootstrap proposed for the $n > p$ scenario, modify it for the $n < p$ situation and develop learning curves to describe how the true prediction error varies with the number of subjects in the training set. The curves are then applied to define adjusted resampling estimates for the prediction error in order to achieve a balance in terms of bias and variability. The adjusted resampling methods are proposed as counterparts of the 0.632+ bootstrap when $n < p$, and are found to improve on the 0.632+ bootstrap and other existing methods in the microarray study scenario when the sample size is small and there is some level of differential expression. The Canadian Journal of Statistics 41: 133–150; 2013 © 2012 Statistical Society of Canada
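A rough sketch of the conventional 0.632+ combination that the adjusted resampling estimates are designed to replace when $n < p$ (this follows the usual Efron–Tibshirani form; the function name and the numbers in the example are ours):

import numpy as np

def err632_plus(err_app, err_loo_boot, gamma):
    """Standard 0.632+ prediction-error estimate (Efron & Tibshirani, 1997).

    err_app      -- apparent (resubstitution) error on the training data
    err_loo_boot -- leave-one-out bootstrap error
    gamma        -- no-information error rate
    """
    err1 = min(err_loo_boot, gamma)                      # cap at no-information rate
    denom = gamma - err_app
    r = (err1 - err_app) / denom if denom > 0 else 0.0   # relative overfitting
    r = min(max(r, 0.0), 1.0)
    w = 0.632 / (1.0 - 0.368 * r)                        # weight between the two errors
    return (1.0 - w) * err_app + w * err1

# Hypothetical numbers, for illustration only:
print(err632_plus(err_app=0.05, err_loo_boot=0.35, gamma=0.50))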

6.
A new test is proposed for the hypothesis of uniformity on bi‐dimensional supports. The procedure is an adaptation of the “distance to boundary test” (DB test) proposed by Berrendero, Cuevas, & Vázquez‐Grande (2006). This new version of the DB test, called the DBU test, allows us (as a novel, interesting feature) to deal with the case where the support $S$ of the underlying distribution is unknown. This means that $S$ is not specified in the null hypothesis so that, in fact, we test the null hypothesis that the underlying distribution is uniform on some support $S$ belonging to a given class ${\cal C}$. We pay special attention to the case where ${\cal C}$ is either the class of compact convex supports or the (broader) class of compact λ‐convex supports (also called r‐convex or α‐convex in the literature). The basic idea is to apply the DB test in a sort of plug‐in version, where the support $S$ is approximated by using methods of set estimation. The DBU method is analysed from both theoretical and practical points of view, via some asymptotic results and a simulation study, respectively. The Canadian Journal of Statistics 40: 378–395; 2012 © 2012 Statistical Society of Canada

7.
The class $G^{\rho,\lambda}$ of weighted log‐rank tests proposed by Fleming & Harrington [Fleming & Harrington (1991) Counting Processes and Survival Analysis, Wiley, New York] has been widely used in survival analysis and is nowadays, unquestionably, the established method to compare, nonparametrically, $k$ different survival functions based on right‐censored survival data. This paper extends the $G^{\rho,\lambda}$ class to interval‐censored data. First we introduce a new general class of rank‐based tests; then we show the analogy to the above proposal of Fleming & Harrington. The asymptotic behaviour of the proposed tests is derived using an observed Fisher information approach and a permutation approach. Aiming to make this family of tests interpretable and useful for practitioners, we explain how to interpret different choices of weights and apply the tests to data from a cohort of intravenous drug users at risk for HIV infection. The Canadian Journal of Statistics 40: 501–516; 2012 © 2012 Statistical Society of Canada
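For reference, in the right‐censored setting the $G^{\rho,\lambda}$ tests weight the log‐rank increments by

$$ w_{\rho,\lambda}(t) = \hat{S}(t-)^{\rho}\,\bigl(1-\hat{S}(t-)\bigr)^{\lambda}, \qquad \rho,\lambda \ge 0, $$

where $\hat{S}$ is the pooled Kaplan–Meier estimate; $\rho=\lambda=0$ gives the ordinary log‐rank test, larger $\rho$ emphasizes early differences and larger $\lambda$ emphasizes late differences between the survival curves. The paper carries this weighting idea over to interval‐censored data.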

8.
Statistical procedures for the detection of a change in the dependence structure of a series of multivariate observations are studied in this work. The test statistics that are proposed are $L_1$, $L_2$, and $L_{\infty}$ distances computed from vectors of differences of Kendall's tau; two multivariate extensions of Kendall's measure of association are used. Since the distributions of these statistics under the null hypothesis of no change depend on the unknown underlying copula of the vectors, a procedure based on the multiplier central limit theorem is used for the computation of p‐values; the method is shown to be valid both asymptotically and for moderate sample sizes. Alternative versions of the tests that take into account possible breakpoints in the marginal distributions are also investigated. Monte Carlo simulations show that the tests are powerful under many change‐point scenarios. In addition, two estimators of the time of change are proposed and their efficiency is carefully studied. The methodologies are illustrated on simulated series from the Canadian Regional Climate Model. The Canadian Journal of Statistics 41: 65–82; 2013 © 2012 Statistical Society of Canada
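A bare‐bones illustration (ordinary bivariate Kendall's tau from SciPy, one common CUSUM‐type weighting, and no multiplier p‐values; this is a simplification, not the paper's multivariate procedure) of an $L_\infty$‐type scan over candidate breakpoints:

import numpy as np
from scipy.stats import kendalltau

def linf_changepoint_stat(x, y, trim=10):
    """Scan candidate breakpoints k and return the largest weighted difference
    between Kendall's tau computed before and after k, plus the maximizing k."""
    n = len(x)
    best_stat, best_k = -np.inf, None
    for k in range(trim, n - trim):
        tau1, _ = kendalltau(x[:k], y[:k])
        tau2, _ = kendalltau(x[k:], y[k:])
        stat = (k * (n - k) / n**1.5) * abs(tau1 - tau2)  # CUSUM-type weighting
        if stat > best_stat:
            best_stat, best_k = stat, k
    return best_stat, best_k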

9.
In a missing data setting, we have a sample in which a vector of explanatory variables ${\bf x}_i$ is observed for every subject $i$, while scalar responses $y_i$ are missing by happenstance on some individuals. In this work we propose robust estimators of the distribution of the responses assuming missing at random (MAR) data, under a semiparametric regression model. Our approach allows the consistent estimation of any weakly continuous functional of the response's distribution. In particular, strongly consistent estimators of any continuous location functional, such as the median, L‐functionals and M‐functionals, are proposed. A robust fit for the regression model combined with the robust properties of the location functional gives rise to a robust recipe for estimating the location parameter. Robustness is quantified through the breakdown point of the proposed procedure. The asymptotic distribution of the location estimators is also derived. The proofs of the theorems are presented in Supplementary Material available online. The Canadian Journal of Statistics 41: 111–132; 2013 © 2012 Statistical Society of Canada

10.
In this paper, we extend the general minimum lower‐order confounding (GMC) criterion to the case of three‐level designs. First, we review the relationship between GMC and other criteria. Then we introduce an aliased component‐number pattern (ACNP) and a three‐level GMC criterion via the consideration of component effects, and obtain some results on the new criterion. All the 27‐run GMC designs, 81‐run GMC designs with factor numbers $n=5,\ldots,20$ and 243‐run GMC designs with resolution IV or higher are tabulated. The Canadian Journal of Statistics 41: 192–210; 2013 © 2012 Statistical Society of Canada

11.
Consider a linear regression model with an n‐dimensional response vector, regression parameter $\beta$ and independent and identically distributed errors. Suppose that the parameter of interest is $\theta = a^{\top}\beta$, where $a$ is a specified vector. Define the parameter $\tau = c^{\top}\beta - t$, where $c$ and $t$ are specified. Also suppose that we have uncertain prior information that $\tau = 0$. Part of our evaluation of a frequentist confidence interval for $\theta$ is the ratio (expected length of this confidence interval)/(expected length of the standard confidence interval), which we call the scaled expected length of this interval. We say that a confidence interval for $\theta$ utilizes this uncertain prior information if: (i) the scaled expected length of this interval is substantially less than 1 when the prior information is correct; (ii) the maximum value of the scaled expected length is not too much larger than 1; and (iii) this confidence interval reverts to the standard confidence interval when the data happen to strongly contradict the prior information. Kabaila and Giri (2009) present a new method for finding such a confidence interval. Let $\hat{\theta}$ denote the least squares estimator of $\theta$. Also let and . Using computations and new theoretical results, we show that the performance of this confidence interval improves as increases and decreases.

12.
The strong Rayleigh property is a new and robust negative dependence property that implies negative association; in fact it implies conditional negative association closed under external fields (CNA+). Suppose that and are two families of 0‐1 random variables that satisfy the strong Rayleigh property and let . We show that $\{Z_i\}$ conditioned on is also strongly Rayleigh; this turns out to be an easy consequence of the results on preservation of stability of polynomials of Borcea & Brändén (Invent. Math., 177, 2009, 521–569). This entails that a number of important $\pi$ps sampling algorithms, including Sampford sampling and Pareto sampling, are CNA+. As a consequence, statistics based on such samples automatically satisfy a version of the central limit theorem for triangular arrays.
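As one concrete illustration of a $\pi$ps design covered by this result, here is a minimal sketch of Pareto sampling in the form usually attributed to Rosén; the function name and interface are ours, and the target inclusion probabilities are assumed to lie strictly in (0,1) and sum to the integer sample size:

import numpy as np

def pareto_pips_sample(p, rng=None):
    """Draw a fixed-size Pareto pi-ps sample.

    p -- target inclusion probabilities, 0 < p_i < 1, with sum(p) = sample size n.
    Returns the indices of the n selected units.
    """
    rng = np.random.default_rng() if rng is None else rng
    p = np.asarray(p, dtype=float)
    n = int(round(p.sum()))
    u = rng.random(len(p))
    q = (u / (1.0 - u)) / (p / (1.0 - p))   # Pareto ranking variables
    return np.argsort(q)[:n]                # the n units with the smallest q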

13.
A reduced $U$‐statistic is a $U$‐statistic with its summands drawn from a restricted but balanced set of pairs. In this article, central limit theorems are derived for reduced $U$‐statistics under mixing conditions, which significantly extends the work of Brown & Kildea in several respects. It is shown and illustrated that reduced $U$‐statistics are quite useful for deriving test statistics in various nonparametric testing problems.
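In symbols (notation ours), whereas an ordinary degree‐two $U$‐statistic averages a kernel $h$ over all $\binom{n}{2}$ pairs, its reduced counterpart averages over a restricted but balanced index set $D_n$:

$$ U_n = \binom{n}{2}^{-1}\sum_{1\le i<j\le n} h(X_i,X_j), \qquad U_n^{\mathrm{red}} = \frac{1}{|D_n|}\sum_{(i,j)\in D_n} h(X_i,X_j), $$

which can greatly reduce the number of kernel evaluations while still admitting a central limit theorem under suitable conditions.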

14.
We extend the empirical likelihood beyond its domain by expanding its contours nested inside the domain with a similarity transformation. The extended empirical likelihood achieves two objectives at the same time: escaping the “convex hull constraint” on the empirical likelihood and improving the coverage accuracy of the empirical likelihood ratio confidence region to $O(n^{-2})$. The latter is accomplished through a special transformation which matches the extended empirical likelihood with the Bartlett corrected empirical likelihood. The extended empirical likelihood ratio confidence region retains the shape of the original empirical likelihood ratio confidence region. It also accommodates adjustments for dimension and small sample size, giving it good coverage accuracy in large and small sample situations. The Canadian Journal of Statistics 41: 257–274; 2013 © 2013 Statistical Society of Canada
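A minimal sketch (scalar data, mean parameter; the helper name and notation are ours) of the ordinary, unextended empirical likelihood ratio, which makes the “convex hull constraint” concrete: the profile is undefined once the hypothesized mean falls outside the convex hull (here simply the range) of the data, which is exactly the situation the extended empirical likelihood is designed to escape:

import numpy as np
from scipy.optimize import brentq

def neg2_log_elr(x, mu):
    """-2 log empirical likelihood ratio for the mean of a scalar sample."""
    d = np.asarray(x, dtype=float) - mu
    if d.min() >= 0 or d.max() <= 0:             # mu outside the convex hull of
        return np.inf                             # the data: EL ratio undefined
    eps = 1e-10
    lo = -1.0 / d.max() + eps                     # keep all 1 + lam*d_i > 0
    hi = -1.0 / d.min() - eps
    g = lambda lam: np.sum(d / (1.0 + lam * d))   # estimating equation for lam
    lam = brentq(g, lo, hi)
    return 2.0 * np.sum(np.log1p(lam * d))

x = np.array([0.8, 1.3, 2.1, 2.9, 3.4])
print(neg2_log_elr(x, 2.0))   # finite: 2.0 lies inside the hull
print(neg2_log_elr(x, 5.0))   # infinite: 5.0 lies outside the hull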

15.
Canada's 41st national general election saw the Conservative Party increase its seat count from 143 to 166, thus giving it a majority of the national parliament's 308 seats. By contrast, nearly all of the pre‐election seat count forecasts predicted only a Conservative minority. We examine the extent to which simple statistical models could or could not have predicted the Conservative majority prior to the election. We conclude that, by using data from the previous (2008) election appropriately, the Conservative majority should have been anticipated as the most likely outcome. The Canadian Journal of Statistics 39: 721–733; 2011. © 2011 Statistical Society of Canada

16.
The Dantzig selector (Candès & Tao, 2007) is a popular $\ell^{1}$‐regularization method for variable selection and estimation in linear regression. We present a very weak geometric condition on the observed predictors which is related to parallelism and, when satisfied, ensures the uniqueness of Dantzig selector estimators. The condition holds with probability 1 if the predictors are drawn from a continuous distribution. We discuss the necessity of this condition for uniqueness and also provide a closely related condition which ensures the uniqueness of lasso estimators (Tibshirani, 1996). Large sample asymptotics for the Dantzig selector, that is, almost sure convergence and the asymptotic distribution, follow directly from our uniqueness results and a continuity argument. The limiting distribution of the Dantzig selector is generally non‐normal. Though our asymptotic results require that the number of predictors is fixed (similar to Knight & Fu, 2000), our uniqueness results are valid for an arbitrary number of predictors and observations. The Canadian Journal of Statistics 41: 23–35; 2013 © 2012 Statistical Society of Canada
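For reference, the Dantzig selector is defined by the linear program

$$ \hat{\beta}_{\mathrm{DS}} = \mathop{\arg\min}_{\beta \in \mathbb{R}^{p}} \|\beta\|_{1} \quad \text{subject to} \quad \|X^{\top}(y - X\beta)\|_{\infty} \le \lambda, $$

with tuning constant $\lambda \ge 0$; the geometric condition on the predictors discussed here guarantees that this program has a unique solution.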

17.
The problem of estimating the effects in a balanced two‐way classification with interaction ($i = 1, \ldots, I$; $j = 1, \ldots, J$; $k = 1, \ldots, K$) using a random effects model is considered from a Bayesian viewpoint. Posterior distributions of $r_i$, $c_j$ and $t_{ij}$ are obtained under the assumptions that $r_i$, $c_j$, $t_{ij}$ and $e_{ijk}$ are all independently drawn from normal distributions with zero means and variances $\sigma_r^2$, $\sigma_c^2$, $\sigma_t^2$, $\sigma_e^2$ respectively. A noninformative reference prior is adopted for $\mu$, $\sigma_r^2$, $\sigma_c^2$, $\sigma_t^2$, $\sigma_e^2$. Various features of this posterior distribution are obtained. The same features of the posterior distribution for a fixed effects model are also obtained. A numerical example is given.
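Written out (the additive form is the standard one for this design and is implied rather than displayed in the abstract), the balanced two‐way random effects model is

$$ y_{ijk} = \mu + r_i + c_j + t_{ij} + e_{ijk}, \qquad i=1,\ldots,I;\ j=1,\ldots,J;\ k=1,\ldots,K, $$

with $r_i \sim N(0,\sigma_r^2)$, $c_j \sim N(0,\sigma_c^2)$, $t_{ij} \sim N(0,\sigma_t^2)$ and $e_{ijk} \sim N(0,\sigma_e^2)$ mutually independent, and a noninformative reference prior on $(\mu, \sigma_r^2, \sigma_c^2, \sigma_t^2, \sigma_e^2)$.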

18.
In this article, we define and investigate a novel class of non‐parametric prior distributions, termed the class . This class of priors is dense with respect to the homogeneous normalized random measures with independent increments and is characterized by a richer predictive structure than those arising from other widely used priors. Our interest in the class is mainly motivated by Bayesian non‐parametric analysis of some species sampling problems concerning the evaluation of the species' relative abundances in a population. We study both the probability distribution of the number of species present in a sample and the probability of discovering a new species conditional on an observed sample. Finally, by using the coupling‐from‐the‐past method, we provide an exact sampling scheme for the system of predictive distributions characterizing the class .

19.
Let $M$ be an isotonic real‐valued function on a compact subset of and let be an unconstrained estimator of $M$. A feasible monotonizing technique is to take the largest (smallest) monotone function that lies below (above) the estimator, or any convex combination of these two envelope estimators. When the process is asymptotically equicontinuous for some sequence $r_n \to \infty$, we show that these projection‐type estimators are $r_n$‐equivalent in probability to the original unrestricted estimator. Our first motivating application involves a monotone estimator of the conditional distribution function that has the distributional properties of the local linear regression estimator. Applications also include the estimation of econometric (probability‐weighted moment, quantile) and biometric (mean remaining lifetime) functions.
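A minimal numerical sketch (assuming a nondecreasing target $M$ and an unconstrained estimate evaluated on a sorted grid; the function name is ours) of the two envelope estimators and a convex combination of them:

import numpy as np

def monotone_envelopes(m_hat, alpha=0.5):
    """Largest nondecreasing function below m_hat, smallest one above it,
    and their convex combination, all evaluated on the same sorted grid."""
    m_hat = np.asarray(m_hat, dtype=float)
    lower = np.minimum.accumulate(m_hat[::-1])[::-1]  # running min from the right
    upper = np.maximum.accumulate(m_hat)              # running max from the left
    return lower, upper, alpha * lower + (1.0 - alpha) * upper

m_hat = np.array([0.1, 0.3, 0.25, 0.5, 0.45, 0.7])
print(monotone_envelopes(m_hat))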

20.
Testing goodness‐of‐fit of commonly used genetic models is of critical importance in many applications, including association studies and testing for departure from Hardy–Weinberg equilibrium. Case–control design has become widely used in population genetics and genetic epidemiology; thus it is of interest to develop powerful goodness‐of‐fit tests for genetic models using case–control data. This paper develops a likelihood ratio test (LRT) for testing recessive and dominant models for case–control studies. The LRT statistic has a closed‐form formula with a simple $\chi^{2}(1)$ null asymptotic distribution, so its implementation is easy even for genome‐wide association studies. Moreover, it has the same power and optimality as when the disease prevalence is known in the population. The Canadian Journal of Statistics 41: 341–352; 2013 © 2013 Statistical Society of Canada
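In generic form (the paper's closed‐form expressions for the recessive and dominant cases are not reproduced here), the test compares the maximized log‐likelihoods of the constrained and unconstrained genotype models:

$$ \mathrm{LRT} = 2\bigl\{\ell(\hat{\theta}) - \ell(\hat{\theta}_0)\bigr\}, \qquad \text{p-value} = P\{\chi^{2}(1) > \mathrm{LRT}\}, $$

where $\ell(\hat{\theta}_0)$ is maximized under the recessive or dominant model being tested and $\ell(\hat{\theta})$ under the general model.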

