Similar literature
 20 similar documents found (search time: 31 ms)
1.
Summary. Smoothing spline analysis of variance decomposes a multivariate function into additive components. This decomposition not only provides an efficient way to model a multivariate function but also leads to meaningful inference by testing whether a certain component equals 0. No formal procedure is yet available to test such a hypothesis. We propose an asymptotic method based on the likelihood ratio to test whether a functional component is 0. This test allows us to choose an optimal model and to compare groups of curves. We first develop the general theory by exploiting the connection between mixed effects models and smoothing splines. We then apply this to compare two groups of curves and to select an optimal model in a two-dimensional problem. A small simulation is used to assess the finite sample performance of the likelihood ratio test.

2.
Given the very large amount of data obtained every day through population surveys, much new research could reuse this information instead of collecting new samples. Unfortunately, relevant data are often scattered across different files obtained through different sampling designs. Data fusion is a set of methods used to combine information from different sources into a single dataset. In this article, we are interested in a specific problem: the fusion of two data files, one of which is quite small. We propose a model-based procedure combining a logistic regression with an Expectation-Maximization algorithm. Results show that despite the lack of data, this procedure can perform better than standard matching procedures.

3.
Growth curve analysis is beneficial in longitudinal studies, where the pattern of response variables measured repeatedly over time is of interest, yet unknown. In this article, we propose generalized growth curve models under a polynomial regression framework and offer a complete process that identifies parsimonious growth curves for different groups of interest and compares the curves. A higher polynomial degree generally provides a more flexible regression, yet in practice it may yield an overly complicated, overfitted model. Therefore, we employ a model selection procedure that consistently chooses the optimal polynomial degree. We estimate the regression parameters with a quadratic inference function (Qu et al., 2000), and estimation efficiency is improved by incorporating the within-subject correlation commonly present in longitudinal data. In biomedical studies, it is of particular interest to compare multiple treatments and identify an effective one. We further conduct a hypothesis test that assesses the equality of the growth curves through an asymptotic chi-square test statistic. The proposed methodology is applied to a randomized controlled longitudinal dataset on depression. The effectiveness of our procedure is also confirmed with simulation studies.

4.
Given two independent samples of size n and m drawn from univariate distributions with unknown densities f and g, respectively, we are interested in identifying subintervals where the two empirical densities deviate significantly from each other. The solution is built by turning the nonparametric density comparison problem into a comparison of two regression curves. Each regression curve is created by binning the original observations into many small bins, followed by a suitable form of root transformation of the binned data counts. Once the problem is recast as a regression comparison, several nonparametric regression procedures for detection of sparse signals can be applied. Both multiple testing and model selection methods are explored. Furthermore, an approach for estimating larger connected regions where the two empirical densities are significantly different is also derived, based on a scale-space representation. The proposed methods are applied to simulated examples as well as real-life data from biology.
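
As a rough illustration of the binning step described in the abstract above, the following sketch bins two samples on a common grid and applies a root transformation to the counts. The specific transform sqrt(count + 1/4), the equal sample sizes, the bin range, and the function names are assumptions made here for illustration; the abstract only says "a suitable form of root transformation" and does not give the paper's exact choice.

```python
import math
import random

def root_transformed_counts(sample, lo, hi, n_bins):
    """Bin observations into equal-width bins and return sqrt(count + 1/4) per bin.

    The +1/4 offset is an assumed variance-stabilizing choice, not taken from the paper.
    """
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in sample:
        if lo <= x <= hi:
            # Clamp the right edge (x == hi) into the last bin.
            idx = min(int((x - lo) / width), n_bins - 1)
            counts[idx] += 1
    return [math.sqrt(c + 0.25) for c in counts]

def bin_differences(sample_f, sample_g, lo, hi, n_bins):
    """Per-bin difference between the two root-transformed count curves."""
    yf = root_transformed_counts(sample_f, lo, hi, n_bins)
    yg = root_transformed_counts(sample_g, lo, hi, n_bins)
    return [a - b for a, b in zip(yf, yg)]

# Toy example with equal sample sizes (assumed, so no rescaling is needed).
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(2000)]
ys = [rng.gauss(0.5, 1.0) for _ in range(2000)]
diffs = bin_differences(xs, ys, -4.0, 4.0, 32)
```

The resulting `diffs` sequence plays the role of a regression curve to which a sparse-signal detection procedure could then be applied; with unequal sample sizes the counts would need to be rescaled before differencing.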

5.
Artur J. Lemonte 《Statistics》2013,47(6):1249-1265
The class of generalized linear models with dispersion covariates, which allows us to jointly model the mean and dispersion parameters, is a natural extension to the classical generalized linear models. In this paper, we derive the asymptotic expansions under a sequence of Pitman alternatives (up to order $n^{-1/2}$) for the nonnull distribution functions of the likelihood ratio, Wald, Rao score and gradient statistics in this class of models. The asymptotic distributions of these statistics are obtained for testing a subset of regression parameters and for testing a subset of dispersion parameters. Based on these nonnull asymptotic expansions, the powers of all four tests, which are equivalent to first order, are compared. Furthermore, we consider Monte Carlo simulations in order to compare the finite-sample performance of these tests in this class of models. We present two empirical applications to real data sets for illustrative purposes.

6.
Summary.  We establish asymptotic theory for both the maximum likelihood and the maximum modified likelihood estimators in mixture regression models. Moreover, under specific and reasonable conditions, we show that the optimal convergence rate of $n^{-1/4}$ for estimating the mixing distribution is achievable for both the maximum likelihood and the maximum modified likelihood estimators. We also derive the asymptotic distributions of two log-likelihood ratio test statistics for testing homogeneity and we propose a resampling procedure for approximating the p-value. Simulation studies are conducted to investigate the empirical performance of the two test statistics. Finally, two real data sets are analysed to illustrate the application of our theoretical results.

7.
We briefly review and discuss design issues for population growth and decline models. We then use a flexible growth and decline model as an illustrative example and apply optimal design theory to find optimal sampling times for estimating all model parameters, specific parameters, and functions of interest of the model parameters, with two real applications. Robustness properties of the optimal designs are investigated when the nominal values or the model are mis-specified, and also under a different optimality criterion. To facilitate use of optimal design ideas in practice, we also introduce a website for generating a variety of optimal designs for popular models from different disciplines.

8.
The study of count data time series has been active in the past decade, mainly in theory and model construction. There are different ways to construct time series models with a geometric autocorrelation function, and a given univariate margin such as negative binomial. In this paper, we investigate negative binomial time series models based on the binomial thinning and two other expectation thinning operators, and show how they differ in conditional variance or heteroscedasticity. Since the model construction is in terms of probability generating functions, typically, the relevant conditional probability mass functions do not have explicit forms. In order to do simulations, likelihood inference, graphical diagnostics and prediction, we use a numerical method for inversion of characteristic functions. We illustrate the numerical methods and compare the various negative binomial time series models for a real data example.
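
To make the binomial thinning operator mentioned in the abstract above concrete, here is a minimal sketch of the standard definition: alpha ∘ X is the number of successes in X independent Bernoulli(alpha) trials. The recursion X_t = alpha ∘ X_{t-1} + e_t and the geometric innovation sampler are assumptions chosen for illustration only; they do not reproduce the negative binomial margin or the other expectation thinning operators studied in the paper.

```python
import random

def binomial_thinning(alpha, x, rng):
    """Return alpha ∘ x: the number of successes in x Bernoulli(alpha) trials."""
    return sum(1 for _ in range(x) if rng.random() < alpha)

def geometric_innovation(rng, p=0.5):
    """Hypothetical innovation: number of failures before the first success."""
    k = 0
    while rng.random() >= p:
        k += 1
    return k

def simulate_thinning_ar1(alpha, innovation, n_steps, x0, rng):
    """Simulate X_t = alpha ∘ X_{t-1} + e_t with a user-supplied innovation sampler."""
    path = [x0]
    for _ in range(n_steps):
        path.append(binomial_thinning(alpha, path[-1], rng) + innovation(rng))
    return path

rng = random.Random(1)
path = simulate_thinning_ar1(0.6, geometric_innovation, 200, 5, rng)
```

Because thinning keeps each of the previous counts with probability alpha, the simulated series stays integer-valued and non-negative, which is the property that motivates thinning-based constructions for count data.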

9.
The aim of this paper is to describe a simulation procedure to compare parametric regression against a non-parametric regression method, for different functions and sets of information. The proposed methodology improves lack of fit at the edges of the regression curves, and an acceptable result is obtained for the non-parametric estimation in all studied cases. Larger differences appear at the edges of the estimation. The results are applied to the study of dasometric variables, which do not fulfil the normality hypothesis needed for parametric estimation. The kernel regression shows the relationship between the studied variables, which would not be detected with more rigid parametric models.

10.
We propose a method of comparing two functional linear models in which explanatory variables are functions (curves) and responses can be either scalars or functions. In such models, the role of parameter vectors (or matrices) is played by integral operators acting on a function space. We test the null hypothesis that these operators are the same in two independent samples. The complexity of the test statistics increases as we move from scalar to functional responses and relax assumptions on the covariance structure of the regressors. They all, however, have an asymptotic chi‐squared distribution with the number of degrees of freedom which depends on a specific setting. The test statistics are readily computable using the R package fda, and have good finite sample properties. The test is applied to egg‐laying curves of Mediterranean flies and to data from terrestrial magnetic observatories. The Canadian Journal of Statistics © 2009 Statistical Society of Canada

11.
Summary.  Cellular signalling pathways, mediating receptor activity to nuclear gene activation, are generally regarded as feed-forward cascades. We analyse measured data of a partially observed signalling pathway and address the question of possible feed-back cycling of involved biochemical components between the nucleus and cytoplasm. First we address the question of cycling in general, starting from basic assumptions about the system. We reformulate the problem as a statistical test leading to likelihood ratio tests under non-standard conditions. We find that the modelling approach without cycling is rejected. Afterwards, to differentiate two different transport mechanisms within the nucleus, we derive the appropriate dynamical models which lead to two systems of ordinary differential equations. To compare both models we apply a statistical testing procedure that is based on bootstrap distributions. We find that one of the two transport mechanisms leads to a dynamical model which is rejected whereas the other model is satisfactory.

12.
We consider varying coefficient models, which are an extension of the classical linear regression models in the sense that the regression coefficients are replaced by functions in certain variables (for example, time); the covariates are also allowed to depend on other variables. Varying coefficient models are popular in longitudinal data and panel data studies, and have been applied in fields such as finance and health sciences. We consider longitudinal data and estimate the coefficient functions by the flexible B-spline technique. An important question in a varying coefficient model is whether an estimated coefficient function is statistically different from a constant (or zero). We develop testing procedures based on the estimated B-spline coefficients by making use of nice properties of a B-spline basis. Our method allows longitudinal data where repeated measurements for an individual can be correlated. We obtain the asymptotic null distribution of the test statistic. The power of the proposed testing procedures is illustrated on simulated data, where we highlight the importance of including the correlation structure of the response variable, and on real data.

13.
Regression analysis is one of the most commonly used techniques in statistics. When the dimension of independent variables is high, it is difficult to conduct efficient non-parametric analysis straightforwardly from the data. As an important alternative to the additive and other non-parametric models, varying-coefficient models can reduce the modelling bias and avoid the "curse of dimensionality" significantly. In addition, the coefficient functions can easily be estimated via a simple local regression. Based on local polynomial techniques, we provide the asymptotic distribution for the maximum of the normalized deviations of the estimated coefficient functions away from the true coefficient functions. Using this result and the pre-asymptotic substitution idea for estimating biases and variances, simultaneous confidence bands for the underlying coefficient functions are constructed. An important question in the varying coefficient models is whether an estimated coefficient function is statistically significantly different from zero or a constant. Based on newly derived asymptotic theory, a formal procedure is proposed for testing whether a particular parametric form fits a given data set. Simulated and real-data examples are used to illustrate our techniques.

14.
Goodness-of-fit evaluation of a parametric regression model is often done through hypothesis testing, where the fit of the model of interest is compared statistically to that obtained under a broader class of models. Nonparametric regression models are frequently used as the latter type of model, because of their flexibility and wide applicability. To date, this type of test has generally been performed globally, by comparing the parametric and nonparametric fits over the whole range of the data. However, in some instances it might be of interest to test for deviations from the parametric model that are localized to a subset of the data. In this case, a global test will have low power and hence can miss important local deviations. Alternatively, a naive testing approach that discards all observations outside the local interval will suffer from reduced sample size and potential overfitting. We therefore propose a new local goodness-of-fit test for parametric regression models that can be applied to a subset of the data but relies on global model fits, and propose a bootstrap-based approach for obtaining the distribution of the test statistic. We compare the new approach with the global and the naive tests, both theoretically and through simulations, and illustrate its practical behavior in an application. We find that the local test has a better ability to detect local deviations than the other two tests.

15.
In this paper, we study the properties of a special class of frailty models when the frailty is common to several failure times. The models are closely linked to Archimedean copula models. We establish a useful formula for cumulative baseline hazard functions and develop a new estimator for cumulative baseline hazard functions in bivariate frailty regression models. Based on our proposed estimator, we present a graphical model checking procedure. We fit a leukemia data set using our model and end our paper with some discussion.

16.
In the present study we have evaluated two competing methods for estimation of the impulse response weights used in the identification of transfer function models: a time domain method involving biased regression techniques and a frequency domain method utilizing a discrete Fourier transform of the cross-covariance system of the transfer function model. The algorithms were implemented on a VAX-8800 computer at the Computing Center at Åbo Akademi. The evaluation of the competing methods was carried out by simulations of different transfer function noise model structures. The models are essentially the same as those of Edlund, but we have used a far greater number of replications in the cases tested. Furthermore, we have used actually identified and estimated autoregressive integrated moving-average models of the residuals in the identification procedure of impulse response weights, in contrast with Edlund who only used theoretical noise models in filtering the input and output series. After a short discussion of the underlying theory, we present the procedures and results of the empirical testing.

17.
In this article, we develop a formal goodness-of-fit testing procedure for one-shot device testing data, in which each observation in the sample is either left censored or right censored. Such data are also called current status data. We provide an algorithm for calculating the nonparametric maximum likelihood estimate (NPMLE) of the unknown lifetime distribution based on such data. Then, we consider four different test statistics that can be used for testing the goodness-of-fit of accelerated failure time (AFT) model by the use of samples of residuals: a chi-square-type statistic based on the difference between the empirical and expected numbers of failures at each inspection time; two other statistics based on the difference between the NPMLE of the lifetime distribution obtained from one-shot device testing data and the distribution specified under the null hypothesis; as a final statistic, we use White's idea of comparing two estimators of the Fisher Information (FI) to propose a test statistic. We then compare these tests in terms of power, and draw some conclusions. Finally, we present an example to illustrate the proposed tests.

18.
Often, the response variables on sampling units are observed repeatedly over time. The sampling units may come from different populations, such as treatment groups. This setting is routinely modeled by a random coefficients growth curve model, and the techniques of general linear mixed models are applied to address the primary research aim. An alternative approach is to reduce each subject’s data to summary measures, such as within-subject averages or regression coefficients. One may then test for equality of means of the summary measures (or functions of them) among treatment groups. Here, we compare by simulation the performance characteristics of three approximate tests based on summary measures and one based on the full data, focusing mainly on accuracy of p-values. We find that performances of these procedures can be quite different for small samples in several different configurations of parameter values. The summary-measures approach performed at least as well as the full-data mixed models approach.
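
The summary-measures idea in the abstract above can be sketched as follows: each subject's repeated measurements are reduced to a single least-squares slope, and the slopes are then compared between groups. The Welch-type t statistic, the function names, and the toy measurements are assumptions made for illustration; the abstract does not say which three approximate tests the authors evaluated.

```python
import math
from statistics import mean, variance

def ols_slope(times, values):
    """Least-squares slope of one subject's measurements against time."""
    t_bar, y_bar = mean(times), mean(values)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    return sxy / sxx

def welch_t(a, b):
    """Two-sample t statistic that does not assume equal variances."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

# Made-up toy data: three subjects per group, four time points each.
times = [0, 1, 2, 3]
group_a = [[1.0, 3.1, 4.9, 7.2], [0.8, 2.9, 5.2, 6.9], [1.1, 3.0, 5.1, 7.0]]
group_b = [[1.0, 2.1, 2.9, 4.2], [0.9, 1.8, 3.1, 3.9], [1.2, 2.0, 3.0, 4.1]]

# Step 1: one summary measure (the slope) per subject.
slopes_a = [ols_slope(times, y) for y in group_a]
slopes_b = [ols_slope(times, y) for y in group_b]

# Step 2: compare the summary measures between groups.
t_stat = welch_t(slopes_a, slopes_b)
```

Reducing each subject to one number sidesteps modelling the within-subject correlation directly, which is the trade-off against the full-data mixed model that the abstract examines.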

19.
In some situations, for example in agriculture, biology, hydrology, and psychology, researchers wish to determine whether the relationship between response variable and predictor variables differs in two populations. In other words, we are interested in comparing two regression models for two independent datasets. In this work, we use parametric and nonparametric methods to construct hypothesis tests for the equality of two independent regression models. A simulation study is then provided to investigate the performance of the proposed method.
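
One classical parametric way to test whether two independent linear regressions coincide is the Chow test, sketched below for simple linear regression. This is a swapped-in textbook technique, not necessarily the method proposed in the abstract above; it assumes normal errors with a common variance in both datasets, and the function names and toy data are hypothetical.

```python
def fit_line_rss(x, y):
    """Residual sum of squares of the least-squares line through (x, y)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))

def chow_f(x1, y1, x2, y2):
    """Chow F statistic for equality of intercept and slope in two samples."""
    k = 2  # parameters per regression: intercept and slope
    rss_separate = fit_line_rss(x1, y1) + fit_line_rss(x2, y2)
    rss_pooled = fit_line_rss(x1 + x2, y1 + y2)
    df = len(x1) + len(x2) - 2 * k
    return ((rss_pooled - rss_separate) / k) / (rss_separate / df)

# Made-up toy data: the second dataset has roughly twice the slope of the first.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y1 = [0.1, 0.9, 2.1, 2.9, 4.2, 4.8]
y2 = [0.0, 2.1, 3.9, 6.2, 7.8, 10.1]
f_stat = chow_f(x, y1, x, y2)
```

Under the null hypothesis of equal coefficients, the statistic is compared with an F distribution with k and n1 + n2 - 2k degrees of freedom; a large value suggests that the two regressions differ.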

20.
This article addresses the problem of testing whether the vectors of regression coefficients are equal for two independent normal regression models when the error variances are unknown. This problem poses severe difficulties both to the frequentist and Bayesian approaches to statistical inference. In the former approach, normal hypothesis testing theory does not apply because of the unrelated variances. In the latter, the prior distributions typically used for the parameters are improper and hence the Bayes factor-based solution cannot be used. We propose a Bayesian solution to this problem in which no subjective input is considered. We first generate “objective” proper prior distributions (intrinsic priors) for which the Bayes factor and model posterior probabilities are well defined. The posterior probability of each model is used as a model selection tool. This consistent procedure of testing hypotheses is compared with some of the frequentist approximate tests proposed in the literature.
