1.

Mixture model on the variance for the differential analysis of gene expression data (total citations: 1; self-citations: 0; citations by others: 1)

Paul Delmar Stéphane Robin Diana Tronik-Le Roux Jean Jacques Daudin 《Journal of the Royal Statistical Society. Series C, Applied statistics》2005,54(1):31-50

Summary. In microarray experiments, accurate estimation of the gene variance is a key step in the identification of differentially expressed genes. Variance models go from the too stringent homoscedastic assumption to the overparameterized model assuming a specific variance for each gene. Between these two extremes there is some room for intermediate models. We propose a method that identifies clusters of genes with equal variance. We use a mixture model on the gene variance distribution. A test statistic for ranking and detecting differentially expressed genes is proposed. The method is illustrated with publicly available complementary deoxyribonucleic acid microarray experiments, an unpublished data set and further simulation studies.

2.

A nonparametric test for interaction in two‐way layouts

Xin Gao Mayer Alvo 《Revue canadienne de statistique》2005,33(4):529-543

The authors present a new nonparametric approach to test for interaction in two‐way layouts. Based on the concept of composite linear rank statistics, they combine the correlated row and column ranking information to construct the test statistic. They determine the limiting distributions of the proposed test statistic under the null hypothesis and Pitman alternatives. They also propose consistent estimators for the limiting covariance matrices associated with the test. They illustrate the application of their test in practical settings using a microarray data set.

3.

A Jonckheere-Terpstra-type test for perfect ranking in balanced ranked set sampling

Michael Vock N. Balakrishnan 《Journal of statistical planning and inference》2011,141(2):624-630

Many methods based on ranked set sampling (RSS) assume perfect ranking of the samples. Here, by using the data measured by a balanced RSS scheme, we propose a nonparametric test for the assumption of perfect ranking. The test statistic that we use formally corresponds to the Jonckheere-Terpstra-type test statistic. We show formal relations of the proposed test for perfect ranking to other methods proposed recently in the literature. Through an empirical power study, we demonstrate that the proposed method performs favorably compared to many of its competitors.
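As an illustration of the kind of statistic this abstract refers to, here is a minimal Python sketch of the classical Jonckheere-Terpstra statistic. It assumes the measured values are grouped by their judgment rank (group 1 holds the units judged smallest in their set, and so on). The function name `jonckheere_terpstra` and the half-weight for ties are choices made here for illustration. The abstract does not give the exact form or normalization of the authors' statistic, so this should not be read as their procedure.

```python
from itertools import combinations
from typing import Sequence


def jonckheere_terpstra(groups: Sequence[Sequence[float]]) -> float:
    """Classical Jonckheere-Terpstra statistic for ordered groups.

    `groups` must be listed in the hypothesised increasing order. For every
    pair of groups (i < j) we count how often a value from the lower group is
    smaller than a value from the higher group; ties count one half.
    """
    stat = 0.0
    # combinations() preserves input order, so `lower` always precedes `upper`.
    for lower, upper in combinations(groups, 2):
        for x in lower:
            for y in upper:
                if x < y:
                    stat += 1.0
                elif x == y:
                    stat += 0.5
    return stat


if __name__ == "__main__":
    # Hypothetical measurements, grouped by judgment rank 1, 2, 3.
    by_rank = [[1.2, 0.8], [1.9, 2.4], [3.1, 2.8]]
    print(jonckheere_terpstra(by_rank))
```

Under perfect ranking the groups are stochastically increasing, so large values of the statistic are consistent with the assumption and small values speak against it. A reference distribution (for example from simulation) would still be needed to turn the statistic into a test.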

4.

Hadamard matrix methods in identifying differentially expressed genes from microarray experiments

Yu Ding Damaraju Raghavarao 《Journal of statistical planning and inference》2008

Identifying differentially expressed genes is a basic objective in microarray experiments. Many statistical methods for detecting differentially expressed genes in multiple-slide experiments have been proposed. However, with limited experimental resources, sometimes only a single cDNA array or two oligonucleotide arrays can be made, or too few replicated arrays can be run. Many current statistical models cannot be used because replicated data are not available. Simply using fold changes is also unreliable and inefficient [Chen et al. 1997. Ratio-based decisions and the quantitative analysis of cDNA microarray images. J. Biomed. Optics 2, 364–374; Newton et al. 2001. On differential variability of expression ratios: improving statistical inference about gene expression changes from microarray data. J. Comput. Biol. 8, 37–52; Pan et al. 2002. How many replicates of arrays are required to detect gene expression changes in microarray experiments? a mixture model approach. Genome Biol. 3, research0022.1-0022.10]. We propose a new method. If the log-transformed ratios for the expressed genes and the unexpressed genes have equal variance, we use a Hadamard matrix to construct a t-test from single-array data. Basically, we test whether each doubtful gene has significantly differential expression compared to the unexpressed genes. We form new random variables corresponding to the rows of a Hadamard matrix using the algebraic sum of gene expressions. A one-sample t-test is constructed and the p-value is calculated for each doubtful gene based on these random variables. By using any method for multiple testing, adjusted p-values can be obtained from the original p-values and the significance of doubtful genes can be determined. When the variance of expressed genes differs from the variance of unexpressed genes, we construct a z-statistic based on the result of applying the Hadamard matrix and find the confidence interval to retain the null hypothesis. Using the interval, we determine differentially expressed genes. This method is also useful for multiple microarrays, especially when sufficient replicated data are not available for a traditional t-test. We apply our methodology to ApoAI data. The results appear to be promising. They not only confirm the previously known differentially expressed genes, but also indicate more genes to be differentially expressed.
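For readers unfamiliar with Hadamard matrices, the following Python sketch shows the standard Sylvester construction and how each row turns a vector of values into one signed ("algebraic") sum. It is background only. The abstract does not say which genes enter the sums or how the t-test is assembled, so the helper names `sylvester_hadamard`, `is_hadamard` and `signed_sums` and the example values are assumptions made for illustration, not the authors' procedure.

```python
from typing import List, Sequence


def sylvester_hadamard(k: int) -> List[List[int]]:
    """Return the 2**k x 2**k Hadamard matrix from Sylvester's doubling rule."""
    h = [[1]]
    for _ in range(k):
        # H_next = [[H, H], [H, -H]]
        top = [row + row for row in h]
        bottom = [row + [-x for x in row] for row in h]
        h = top + bottom
    return h


def is_hadamard(h: Sequence[Sequence[int]]) -> bool:
    """Check that all entries are +/-1 and that distinct rows are orthogonal."""
    n = len(h)
    if any(len(row) != n or any(x not in (1, -1) for x in row) for row in h):
        return False
    return all(
        sum(a * b for a, b in zip(h[i], h[j])) == (n if i == j else 0)
        for i in range(n)
        for j in range(n)
    )


def signed_sums(h: Sequence[Sequence[int]], values: Sequence[float]) -> List[float]:
    """One signed sum of `values` per row of the Hadamard matrix."""
    if len(values) != len(h):
        raise ValueError("need exactly one value per matrix column")
    return [sum(s * v for s, v in zip(row, values)) for row in h]


if __name__ == "__main__":
    h4 = sylvester_hadamard(2)
    # Hypothetical log-ratios for four genes on a single array.
    print(signed_sums(h4, [0.1, -0.3, 0.2, 0.05]))
```

Because the rows are mutually orthogonal, the signed sums of uncorrelated equal-variance values are themselves uncorrelated with equal variance, which is the property that makes such sums usable as pseudo-replicates.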

5.

Robust and nonparametric subset selection procedures

Jason C. Hsu 《Communications in Statistics - Theory and Methods》2013,42(14):1439-1459

Subset selection procedures based on ranks have been investigated by a number of authors previously. Their methods are based on ranking the samples from all the populations jointly. However, as was pointed out by Rizvi and Woodworth (1970), the procedures they proposed cannot control the probability of a correct selection over the entire parameter space. In this paper, we propose a subset selection procedure based on pairwise rather than joint ranking of the samples. It is shown that this procedure controls the probability of a correct selection over the entire parameter space. It is also shown that the Pitman efficiency of this nonparametric procedure relative to the multivariate t procedure of Gupta (1956, 1965) is the same as the Pitman efficiency of the Mann-Whitney-Wilcoxon test relative to the t-test.

6.

A Bayesian mixture model for differential gene expression (total citations: 3; self-citations: 0; citations by others: 3)

Kim-Anh Do Peter Müller Feng Tang 《Journal of the Royal Statistical Society. Series C, Applied statistics》2005,54(3):627-644

Summary. We propose model-based inference for differential gene expression, using a nonparametric Bayesian probability model for the distribution of gene intensities under various conditions. The probability model is a mixture of normal distributions. The resulting inference is similar to a popular empirical Bayes approach that is used for the same inference problem. The use of fully model-based inference mitigates some of the necessary limitations of the empirical Bayes method. We argue that inference is no more difficult than posterior simulation in traditional nonparametric mixture-of-normal models. The approach proposed is motivated by a microarray experiment that was carried out to identify genes that are differentially expressed between normal tissue and colon cancer tissue samples. Additionally, we carried out a small simulation study to verify the methods proposed. In the motivating case-studies we show how the nonparametric Bayes approach facilitates the evaluation of posterior expected false discovery rates. We also show how inference can proceed even in the absence of a null sample of known non-differentially expressed scores. This highlights the difference from alternative empirical Bayes approaches that are based on plug-in estimates.

7.

Empirical Likelihood Inferences for Semiparametric Varying-Coefficient Partially Linear Models with Longitudinal Data

Peixin Zhao Liugen Xue 《Communications in Statistics - Theory and Methods》2013,42(11):1898-1914

In this article, empirical likelihood inferences for semiparametric varying-coefficient partially linear models with longitudinal data are investigated. We propose a groupwise empirical likelihood procedure to handle the inter-series dependence of the longitudinal data. By using residual-adjustment, an empirical likelihood ratio function for the nonparametric component is constructed, and a nonparametric version of Wilks' phenomenon is proved. Compared with methods based on normal approximations, the empirical likelihood does not require consistent estimators for the asymptotic variance and bias. A simulation study is undertaken to assess the finite sample performance of the proposed confidence regions.

8.

NONPARAMETRIC ESTIMATION OF GENEWISE VARIANCE FOR MICROARRAY DATA

Fan J Feng Y Niu YS 《Annals of statistics》2010,38(5):2723-2750

Estimation of genewise variance arises from two important applications in microarray data analysis: selecting significantly differentially expressed genes and validation tests for normalization of microarray data. We approach the problem by introducing a two-way nonparametric model, which is an extension of the famous Neyman-Scott model and is applicable beyond microarray data. The problem itself poses interesting challenges because the number of nuisance parameters is proportional to the sample size and it is not obvious how the variance function can be estimated when measurements are correlated. In such a high-dimensional nonparametric problem, we propose two novel nonparametric estimators for the genewise variance function and semiparametric estimators for measurement correlation, via solving a system of nonlinear equations. Their asymptotic normality is established. The finite sample property is demonstrated by simulation studies. The estimators also improve the power of the tests for detecting statistically differentially expressed genes. The methodology is illustrated by the data from the MicroArray Quality Control (MAQC) project.

9.

Dimension reduction for the conditional mean and variance functions in time series

Jin-Hong Park S. Yaser Samadi 《Scandinavian Journal of Statistics》2020,47(1):134-155

This paper deals with the nonparametric estimation of the mean and variance functions of univariate time series data. We propose a nonparametric dimension reduction technique for both mean and variance functions of time series. This method does not require any model specification and instead we seek directions in both the mean and variance functions such that the conditional distribution of the current observation given the vector of past observations is the same as that of the current observation given a few linear combinations of the past observations without loss of inferential information. The directions of the mean and variance functions are estimated by maximizing the Kullback–Leibler distance function. The consistency of the proposed estimators is established. A computational procedure is introduced to detect lags of the conditional mean and variance functions in practice. Numerical examples and simulation studies are performed to illustrate and evaluate the performance of the proposed estimators.

10.

Empirical null distribution-based modeling of multi-class differential gene expression detection

Xiting Cao Marshall I. Hertz 《Journal of applied statistics》2013,40(2):347-357

In this paper, we study the multi-class differential gene expression detection for microarray data. We propose a likelihood-based approach to estimating an empirical null distribution to incorporate gene interactions and provide a more accurate false-positive control than the commonly used permutation or theoretical null distribution-based approach. We propose to rank important genes by p-values or local false discovery rate based on the estimated empirical null distribution. Through simulations and application to lung transplant microarray data, we illustrate the competitive performance of the proposed method.

11.

Nonparametric rank-based test procedures for non-additive models in the two-way layout I. no replications

Douglas A. Wolfe Angela M. Dean Bradley A. Hartlaub 《Communications in Statistics - Theory and Methods》2013,42(11):4355-4382

One of the major unresolved problems in the area of nonparametric statistics is the need for satisfactory rank-based test procedures for non-additive models in the two-way layout, especially when there is only one observation on each combination of the levels of the experimental factors. In this paper we consider an arbitrary non-additive model for the two-way layout with n levels of each factor. We utilize both alignment and ranking of the data together with basic properties of Latin squares to develop rank tests for interaction (non-additivity). Our technique involves first aligning within one of the main effects, ranking within the other main effects (columns and rows) and then adding the resulting ranks within “interaction bands” corresponding to orthogonal partitions of the interaction for the model, as denoted by the letters of an n × n Latin square. A Friedman-type statistic is then computed on the resulting sums. This is repeated for each of (n−1) mutually orthogonal Latin squares (thus accounting for all the interaction degrees of freedom). The resulting (n−1) Friedman-type statistics are finally combined to obtain an overall test statistic. The necessary null distribution tables for applying the proposed test for non-additivity are presented and we discuss the results of a Monte Carlo simulation study of the relative powers of this new procedure and other (parametric and nonparametric) procedures designed to detect interaction in a two-way layout with one observation per cell.

12.

Semiparametric statistical inferences for longitudinal data with nonparametric covariance modelling

Qunfang Xu 《Statistics》2017,51(6):1280-1303

In this paper, semiparametric modelling for longitudinal data with an unstructured error process is considered. We propose a partially linear additive regression model for longitudinal data in which within-subject variances and covariances of the error process are described by unknown univariate and bivariate functions, respectively. We provide an estimating approach in which polynomial splines are used to approximate the additive nonparametric components and the within-subject variance and covariance functions are estimated nonparametrically. Both the asymptotic normality of the resulting parametric component estimators and optimal convergence rate of the resulting nonparametric component estimators are established. In addition, we develop a variable selection procedure to identify significant parametric and nonparametric components simultaneously. We show that the proposed SCAD penalty-based estimators of non-zero components have an oracle property. Some simulation studies are conducted to examine the finite-sample performance of the proposed estimation and variable selection procedures. A real data set is also analysed to demonstrate the usefulness of the proposed method.

13.

An adaptive empirical Bayesian thresholding procedure for analysing microarray experiments with replication

Rebecca E. Walls Stuart Barber John T. Kent Mark S. Gilthorpe 《Journal of the Royal Statistical Society. Series C, Applied statistics》2007,56(3):271-291

Summary. A typical microarray experiment attempts to ascertain which genes display differential expression in different samples. We model the data by using a two-component mixture model and develop an empirical Bayesian thresholding procedure, which was originally introduced for thresholding wavelet coefficients, as an alternative to the existing methods for determining differential expression across thousands of genes. The method is built on sound theoretical properties and has easy computer implementation in the R statistical package. Furthermore, we consider improvements to the standard empirical Bayesian procedure when replication is present, to increase the robustness and reliability of the method. We provide an introduction to microarrays for those who are unfamiliar with the field and the proposed procedure is demonstrated with applications to two-channel complementary DNA microarray experiments.

14.

Evaluation of false discovery rate and power via sample size in microarray studies

Jie Song Herman W. Raadsma Peter C. Thomson 《Journal of applied statistics》2012,39(3):489-500

Microarray studies are now common for human, agricultural plant and animal studies. False discovery rate (FDR) is widely used in the analysis of large-scale microarray data to account for problems associated with multiple testing. A well-designed microarray study should have adequate statistical power to detect the differentially expressed (DE) genes, while keeping the FDR acceptably low. In this paper, we use a mixture model of expression responses involving DE genes and non-DE genes to analyse theoretical FDR and power for simple scenarios where it is assumed that each gene has equal error variance and the gene effects are independent. A simulation study is used to evaluate the empirical FDR and power for more complex scenarios with unequal error variance and gene dependence. Based on this approach, we present a general guide for sample size requirement at the experimental design stage for prospective microarray studies. This paper presents an approach to explicitly connect the sample size with FDR and power. While the methods have been developed in the context of one-sample microarray studies, they are readily applicable to two-sample studies, and could be adapted to multiple-sample studies.
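The link between sample size, power and FDR described in this abstract can be illustrated with a small Python sketch. It rests on simplifying assumptions that are not taken from the paper: a two-component mixture with a known proportion `pi0` of non-DE genes, a two-sided one-sample z-test with known unit variance, a common standardized effect size for all DE genes, and FDR approximated as expected false positives divided by expected positives. All function and parameter names are hypothetical.

```python
from statistics import NormalDist
from typing import Optional

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def one_sample_power(effect_size: float, n: int, alpha: float) -> float:
    """Power of a two-sided one-sample z-test with unit variance."""
    crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect_size * n ** 0.5  # mean of the z-statistic under the alternative
    return (1 - _STD_NORMAL.cdf(crit - shift)) + _STD_NORMAL.cdf(-crit - shift)


def theoretical_fdr(pi0: float, alpha: float, power: float) -> float:
    """Expected false positives over expected positives in the mixture."""
    false_pos = pi0 * alpha
    true_pos = (1 - pi0) * power
    return false_pos / (false_pos + true_pos)


def smallest_n(pi0: float, effect_size: float, alpha: float,
               max_fdr: float, max_n: int = 1000) -> Optional[int]:
    """Smallest sample size whose approximate FDR does not exceed `max_fdr`."""
    for n in range(2, max_n + 1):
        power = one_sample_power(effect_size, n, alpha)
        if theoretical_fdr(pi0, alpha, power) <= max_fdr:
            return n
    return None  # target not reachable at this alpha within max_n


if __name__ == "__main__":
    for n in (4, 8, 16):
        p = one_sample_power(1.0, n, 0.001)
        print(n, round(p, 3), round(theoretical_fdr(0.95, 0.001, p), 3))
```

For a fixed per-gene significance level, the approximate FDR falls as power rises with `n`, but it is bounded below by `pi0 * alpha / (pi0 * alpha + 1 - pi0)`, so a very low FDR target may also require a smaller `alpha`.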

15.

Wavelet Analysis of Change Points in Nonparametric Hazard Rate Models Under Random Censorship

Jingle Wang Ming Zheng Wen Yu 《Communications in Statistics - Theory and Methods》2014,43(9):1956-1978

In this article, we consider detection and estimation of change points in nonparametric hazard rate models. Wavelet methods are utilized to develop a testing procedure for change-point detection. The asymptotic properties of the test statistic are explored. When there exist change points in the hazard function, we also propose estimators for the number, the locations, and the jump sizes of the change points. The asymptotic properties of these estimators are systematically derived. Some simulation examples are conducted to assess the finite sample performance of the proposed approach and to make comparisons with some existing methods. A real data analysis is provided to illustrate the new approach.

16.

Quantile-adaptive variable screening in ultra-high dimensional varying coefficient models

Junying Zhang Zhiping Lu 《Journal of applied statistics》2016,43(4):643-654

The varying-coefficient model is an important nonparametric statistical model since it allows appreciable flexibility in the structure of the fitted model. For ultra-high dimensional heterogeneous data, it is necessary to examine how the effects of covariates vary with exposure variables at different quantile levels of interest. In this paper, we extend the marginal screening methods to examine and select variables by ranking a measure of nonparametric marginal contributions of each covariate given the exposure variable. Spline approximations are employed to model marginal effects and select the set of active variables in a quantile-adaptive framework. This ensures the sure screening property in the quantile-adaptive varying-coefficient model. Numerical studies demonstrate that the proposed procedure works well for heteroscedastic data.

17.

Nonparametric estimation of the dynamic range of music signals

Pietro Coretto Francesco Giordano 《Australian & New Zealand Journal of Statistics》2017,59(4):389-412

18.

NEW EFFICIENT ESTIMATION AND VARIABLE SELECTION METHODS FOR SEMIPARAMETRIC VARYING-COEFFICIENT PARTIALLY LINEAR MODELS (total citations: 1; self-citations: 0; citations by others: 1)

Kai B Li R Zou H 《Annals of statistics》2011,39(1):305-332

The complexity of semiparametric models poses new challenges to statistical inference and model selection that frequently arise from real applications. In this work, we propose new estimation and variable selection procedures for the semiparametric varying-coefficient partially linear model. We first study quantile regression estimates for the nonparametric varying-coefficient functions and the parametric regression coefficients. To achieve nice efficiency properties, we further develop a semiparametric composite quantile regression procedure. We establish the asymptotic normality of proposed estimators for both the parametric and nonparametric parts and show that the estimators achieve the best convergence rate. Moreover, we show that the proposed method is much more efficient than the least-squares-based method for many non-normal errors and that it only loses a small amount of efficiency for normal errors. In addition, it is shown that the loss in efficiency is at most 11.1% for estimating varying coefficient functions and is no greater than 13.6% for estimating parametric components. To achieve sparsity with high-dimensional covariates, we propose adaptive penalization methods for variable selection in the semiparametric varying-coefficient partially linear model and prove that the methods possess the oracle property. Extensive Monte Carlo simulation studies are conducted to examine the finite-sample performance of the proposed procedures. Finally, we apply the new methods to analyze the plasma beta-carotene level data.

19.

A Discussion of Permutation Tests Conditional to Observed Responses in Unreplicated 2^M Full Factorial Designs

D. Basso 《Communications in Statistics - Theory and Methods》2013,42(1):83-97

ABSTRACT

In this article we present a new solution to test for effects in unreplicated two-level factorial designs. The proposed test statistic, in case the error components are normally distributed, follows an F random variable, though our attention is on its nonparametric permutation version. The proposed procedure does not require any transformation of data such as residualization and it is exact for each effect and distribution-free. Our main aim is to discuss a permutation solution conditional to the original vector of responses. We give two versions of the same nonparametric testing procedure in order to control both the individual error rate and the experiment-wise error rate. A power comparison with Loughin and Noble's test is provided in the case of an unreplicated 2⁴ full factorial design.

20.

A nonparametric test for diagnosis of the proportionality assumption

Jong-Hoo Choi Hyo-Il Park 《Statistical Papers》2007,48(3):467-477

We propose a nonparametric test for diagnosing the proportionality assumption between hazard functions, based on a functional equation. Because the censoring distribution is involved, we consider the test procedure asymptotically and obtain the asymptotic normality of the proposed test statistic. We then discuss the rationale for using the functional equation for the initial effect model. Finally, we compare our test with others using an example.