More flexible semiparametric linear-index regression models are proposed to describe the conditional distribution. Such a model formulation captures varying effects of covariates over the support of a response variable in distribution, offers an alternative perspective on dimension reduction, and covers many widely used parametric and semiparametric regression models. A feasible pseudo likelihood approach, accompanied by a simple and easily implemented algorithm, is further developed for the mixed case with both varying and invariant coefficients. By justifying some theoretical properties on Banach spaces, the uniform consistency and asymptotic Gaussian process limit of the proposed estimator are also established in this article. In addition, under the monotonicity of distribution in linear-index, we develop an alternative approach based on maximizing a varying accuracy measure. By virtue of the asymptotic recursion relation for the estimators, the achievements in this direction include showing the convergence of the iterative computation procedure and establishing the large sample properties of the resulting estimator. Notably, our theoretical framework is very helpful in constructing confidence bands for the parameters of interest and tests for the hypotheses of various qualitative structures in distribution. Generally, the developed estimation and inference procedures perform quite satisfactorily in the conducted simulations and are demonstrated to be useful in reanalysing data from the Boston house price study and the World Values Survey.

We extend four tests common in classical regression – Wald, score, likelihood ratio and F tests – to functional linear regression, for testing the null hypothesis that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications.

We examine the asymptotic and small sample properties of model-based and robust tests of the null hypothesis of no randomized treatment effect based on the partial likelihood arising from an arbitrarily misspecified Cox proportional hazards model. When the distribution of the censoring variable is either conditionally independent of the treatment group given covariates or conditionally independent of covariates given the treatment group, the numerators of the partial likelihood treatment score and Wald tests have asymptotic mean equal to 0 under the null hypothesis, regardless of whether or how the Cox model is misspecified. We show that the model-based variance estimators used in the calculation of the model-based tests are not, in general, consistent under model misspecification, yet using analytic considerations and simulations we show that their true sizes can be as close to the nominal value as tests calculated with robust variance estimators. As a special case, we show that the model-based log-rank test is asymptotically valid. When the Cox model is misspecified and the distribution of censoring depends on both treatment group and covariates, the asymptotic distributions of the resulting partial likelihood treatment score statistic and maximum partial likelihood estimator do not, in general, have a zero mean under the null hypothesis. Here neither the fully model-based tests, including the log-rank test, nor the robust tests will be asymptotically valid, and we show through simulations that the distortion to test size can be substantial.
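
For readers unfamiliar with the log-rank test mentioned in the abstract above, the following is a minimal sketch of the textbook two-sample log-rank chi-square statistic. It is not the authors' code and does not implement their robust variance estimators; the function name `logrank_statistic` and its argument layout are assumptions made for illustration.

```python
def logrank_statistic(times, events, groups):
    """Textbook two-sample log-rank chi-square statistic (illustrative sketch).

    times  : follow-up times
    events : 1 if the event was observed, 0 if censored
    groups : 0 or 1 (e.g. control / treatment)
    """
    # Distinct times at which at least one event was observed.
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Numbers at risk just before t, overall and in group 1.
        n = sum(1 for u in times if u >= t)
        n1 = sum(1 for u, g in zip(times, groups) if u >= t and g == 1)
        # Numbers of events exactly at t, overall and in group 1.
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        d1 = sum(1 for u, e, g in zip(times, events, groups)
                 if u == t and e == 1 and g == 1)
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            # Hypergeometric variance of d1 given the margins.
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("variance is zero; the statistic is undefined")
    return observed_minus_expected ** 2 / variance
```

Under the null hypothesis of no group difference the statistic is conventionally referred to a chi-square distribution with one degree of freedom; the abstract concerns when that reference remains valid under misspecification and covariate-dependent censoring.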

In this paper, we consider a model checking problem for general linear models with randomly missing covariates. Two score-type tests with inverse probability weights, estimated by parametric and nonparametric methods respectively, are proposed for this goodness-of-fit problem. The asymptotic properties of the test statistics are developed under the null and local alternative hypotheses. A simulation study is carried out to assess the sizes and powers of the tests. We illustrate the proposed method with a data set on monozygotic twins.

Exact k-sample permutation tests for binary data are presented for three commonly encountered hypothesis tests. The tests are derived under both the population and randomization models. The generating function for the number of cases in the null distribution is obtained. The asymptotic distributions of the test statistics are derived. Actual significance levels are computed for the asymptotic test versions. Random sampling of the null distribution is suggested as a superior alternative to the asymptotics, and an efficient computer technique for implementing the random sampling is described. Finally, some numerical examples are presented and sample size guidelines given for computer implementation of the exact tests.
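
The idea of random sampling from the permutation null distribution can be sketched as follows. This is a generic Monte Carlo permutation test for homogeneity of proportions across k groups of 0/1 data, not the efficient technique described in the abstract; the statistic, the function names, and the defaults `n_resamples=2000` and `seed=0` are all assumptions chosen for illustration.

```python
import random

def homogeneity_statistic(groups):
    """Sum over groups of n_i * (p_i - p_pooled)^2 for 0/1 observations."""
    pooled = [x for g in groups for x in g]
    p = sum(pooled) / len(pooled)
    return sum(len(g) * (sum(g) / len(g) - p) ** 2 for g in groups)

def monte_carlo_permutation_pvalue(groups, n_resamples=2000, seed=0):
    """Approximate the permutation p-value by random relabelling."""
    rng = random.Random(seed)
    sizes = [len(g) for g in groups]
    pooled = [x for g in groups for x in g]
    observed = homogeneity_statistic(groups)
    exceed = 0
    for _ in range(n_resamples):
        # Shuffle the pooled responses and cut them back into groups
        # of the original sizes.
        rng.shuffle(pooled)
        resampled, start = [], 0
        for n in sizes:
            resampled.append(pooled[start:start + n])
            start += n
        # Small tolerance guards against floating-point ties.
        if homogeneity_statistic(resampled) >= observed - 1e-12:
            exceed += 1
    # Counting the observed arrangement keeps the p-value strictly positive.
    return (exceed + 1) / (n_resamples + 1)
```

Calling `monte_carlo_permutation_pvalue([[1] * 10, [0] * 10, [1] * 10])` gives a very small p-value, because hardly any random relabelling separates the groups as cleanly as the observed data.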

Varying-coefficient models have been widely used to investigate the possible time-dependent effects of covariates when the response variable comes from a normal distribution. Much progress has been made for inference and variable selection in the framework of such models. However, the identification of model structure, that is, how to identify which covariates have time-varying effects and which have fixed effects, remains a challenging and unsolved problem, especially when the dimension of covariates is much larger than the sample size. In this article, we consider the structural identification and variable selection problems in varying-coefficient models for high-dimensional data. Using a modified basis expansion approach and group variable selection methods, we propose a unified procedure to simultaneously identify the model structure, select important variables and estimate the coefficient curves. The unique feature of the proposed approach is that we do not have to specify the model structure in advance; it is therefore more realistic and appropriate for real data analysis. Asymptotic properties of the proposed estimators have been derived under regularity conditions. Furthermore, we evaluate the finite sample performance of the proposed methods with Monte Carlo simulation studies and a real data analysis.

Several testing procedures are proposed that can detect change-points in the error distribution of non-parametric regression models. Different settings are considered where the change-point either occurs at some time point or at some value of the covariate. Fixed as well as random covariates are considered. Weak convergence of the suggested difference of sequential empirical processes based on non-parametrically estimated residuals to a Gaussian process is proved under the null hypothesis of no change-point. In the case of testing for a change in the error distribution that occurs with increasing time in a model with random covariates the test statistic is asymptotically distribution free and the asymptotic quantiles can be used for the test. This special test statistic can also detect a change in the regression function. In all other cases the asymptotic distribution depends on unknown features of the data-generating process and a bootstrap procedure is proposed in these cases. The small sample performances of the proposed tests are investigated by means of a simulation study and the tests are applied to a data example.

A class of test statistics is introduced which is sensitive against the alternative of stochastic ordering in the two-sample censored data problem. The test statistics for evaluating a cumulative weighted difference in survival distributions are developed while taking into account the imbalances in base-line covariates between two groups. This procedure can be used to test the null hypothesis of no treatment effect, especially when base-line hazards cross and prognostic covariates need to be adjusted. The statistics are semiparametric, not rank based, and can be written as integrated weighted differences in estimated survival functions, where these survival estimates are adjusted for covariate imbalances. The asymptotic distribution theory of the tests is developed, yielding test procedures that are shown to be consistent under a fixed alternative. The choice of weight function is discussed and relies on stability and interpretability considerations. An example taken from a clinical trial for acquired immune deficiency syndrome is presented.

Herein, we propose a data-driven test that assesses the lack of fit of nonlinear regression models. The comparison of local linear kernel and parametric fits is the basis of this test, and specific boundary-corrected kernels are not needed at the boundary when local linear fitting is used. Under the parametric null model, the asymptotically optimal bandwidth can be used for bandwidth selection. This selection method leads to the data-driven test that has a limiting normal distribution under the null hypothesis and is consistent against any fixed alternative. The finite-sample property of the proposed data-driven test is illustrated, and the power of the test is compared with that of some existing tests via simulation studies. We illustrate the practicality of the proposed test by using two data sets.

We consider statistical inference for partial linear additive models (PLAMs) when the linear covariates are measured with errors and distorted by unknown functions of commonly observable confounding variables. A semiparametric profile least squares estimation procedure is proposed to estimate the unknown parameters under unrestricted and restricted conditions. Asymptotic properties for the estimators are established. To test a hypothesis on the parametric components, a test statistic based on the difference between the residual sums of squares under the null and alternative hypotheses is proposed, and we further show that its limiting distribution is a weighted sum of independent standard chi-squared distributions. A bootstrap procedure is further proposed to calculate critical values. Simulation studies are conducted to demonstrate the performance of the proposed procedure and a real example is analyzed for illustration.

We investigate here small sample properties of approximate F-tests about fixed effects parameters in nonlinear mixed models. For estimation of population fixed effects parameters as well as variance components, we apply the two-stage approach. This method is useful and popular when the number of observations per sampling unit is large enough. The approximate F-test is constructed based on large-sample approximation to the distribution of nonlinear least-squares estimates of subject-specific parameters. We recommend a modified test statistic that takes into consideration approximation to the large-sample Fisher information matrix (See [Volaufova J, Burton JH. Note on hypothesis testing in mixed models. Oral presentation at: LINSTAT 2012/21st IWMS; 2012; Bedlewo, Poland]). Our main focus is on comparing finite sample properties of broadly used approximate tests (Wald test and likelihood ratio test) and the modified F-test under the null hypothesis, especially accuracy of p-values (See [Volaufova J, LaMotte L. Comparison of approximate tests of fixed effects in linear repeated measures design models with covariates. Tatra Mountains. 2008;39:17–25]). For that purpose two extensive simulation studies are conducted based on pharmacokinetic models (See [Hartford A, Davidian M. Consequences of misspecifying assumptions in nonlinear mixed effects models. Comput Stat and Data Anal. 2000;34:139–164; Pinheiro J, Bates D. Approximations to the log-likelihood function in the non-linear mixed-effects model. J Comput Graph Stat. 1995;4(1):12–35]).

There is an increasing number of goodness-of-fit tests whose test statistics measure deviations between the empirical characteristic function and an estimated characteristic function of the distribution in the null hypothesis. With the aim of overcoming certain computational difficulties with the calculation of some of these test statistics, a transformation of the data is considered. To apply such a transformation, the data are assumed to be continuous with arbitrary dimension, but we also provide a modification for discrete random vectors. Practical considerations leading to analytic formulas for the test statistics are studied, as well as theoretical properties such as the asymptotic null distribution, validity of the corresponding bootstrap approximation, and consistency of the test against fixed alternatives. Five applications are provided in order to illustrate the theory. These applications also include numerical comparison with other existing techniques for testing goodness-of-fit.
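
As a reminder of the basic object involved, here is a minimal univariate sketch of an empirical characteristic function and a crude deviation from a null characteristic function. The standard normal null, the grid-based maximum deviation, and all function names are assumptions for illustration; the tests discussed in the abstract typically use weighted integral statistics and a data transformation that this sketch does not reproduce.

```python
import cmath
import math

def empirical_cf(sample, t):
    """Empirical characteristic function: mean of exp(i * t * x) over the sample."""
    return sum(cmath.exp(1j * t * x) for x in sample) / len(sample)

def standard_normal_cf(t):
    """Characteristic function of the standard normal distribution."""
    return math.exp(-0.5 * t * t)

def max_cf_deviation(sample, grid):
    """Largest |ECF(t) - null CF(t)| over a grid of t values.

    Illustrative measure only; it is not a calibrated test statistic.
    """
    return max(abs(empirical_cf(sample, t) - standard_normal_cf(t)) for t in grid)
```

For example, `empirical_cf([-1.0, 1.0], t)` equals `cos(t)` up to rounding, since the two complex exponentials are conjugates of each other.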

In this article, we consider nonparametric test procedures based on a group of quantile test statistics. We consider the quadratic form for the two-sided test and the maximal and summing types of statistics for the one-sided alternatives. Then we derive the null limiting distributions of the proposed test statistics using the large sample approximation theory. Also, we consider applying the permutation principle to obtain the null distribution. In this vein, we may consider the supremum type, which should use the permutation principle for obtaining the null distribution. Then we illustrate our procedure with an example and compare the proposed tests with other existing tests, including the individual quantile tests, by obtaining empirical powers through a simulation study. Also, we comment on discussions related to this testing procedure as concluding remarks. Finally, we prove the lemmas and theorems in the appendices.

It is generally assumed that the likelihood ratio statistic for testing the null hypothesis that data arise from a homoscedastic normal mixture distribution versus the alternative hypothesis that data arise from a heteroscedastic normal mixture distribution has an asymptotic χ² reference distribution with degrees of freedom equal to the difference in the number of parameters being estimated under the alternative and null models under some regularity conditions. Simulations show that the χ² reference distribution will give a reasonable approximation for the likelihood ratio test only when the sample size is 2000 or more and the mixture components are well separated when the restrictions suggested by Hathaway (Ann. Stat. 13:795–800, 1985) are imposed on the component variances to ensure that the likelihood is bounded under the alternative distribution. For small and medium sample sizes, parametric bootstrap tests appear to work well for determining whether data arise from a normal mixture with equal variances or a normal mixture with unequal variances.

Using simulation techniques, the null distribution properties of seven hypothesis testing procedures and a comparison of their powers are investigated for incomplete-data small-sample growth curve situations. The testing procedures are a combination of two growth curve models (the Potthoff and Roy model for complete data and Kleinbaum's extension to incomplete data) and three estimation techniques (two involving means of existing observations and the other using the EM algorithm) plus an analysis of a subset of complete data. All of the seven tests use the Kleinbaum Wald statistic, but different tests use different information. The hypotheses of identical and parallel growth curves are tested under the assumptions of multivariate normality and a linear polynomial mean growth curve for each of two groups. Good approximate null distributions are found for all procedures and one procedure is identified as empirically most powerful for the situations investigated.

Two new implementations of the EM algorithm are proposed for maximum likelihood fitting of generalized linear mixed models. Both methods use random (independent and identically distributed) sampling to construct Monte Carlo approximations at the E-step. One approach involves generating random samples from the exact conditional distribution of the random effects (given the data) by rejection sampling, using the marginal distribution as a candidate. The second method uses a multivariate t importance sampling approximation. In many applications the two methods are complementary. Rejection sampling is more efficient when sample sizes are small, whereas importance sampling is better with larger sample sizes. Monte Carlo approximation using random samples allows the Monte Carlo error at each iteration to be assessed by using standard central limit theory combined with Taylor series methods. Specifically, we construct a sandwich variance estimate for the maximizer at each approximate E-step. This suggests a rule for automatically increasing the Monte Carlo sample size after iterations in which the true EM step is swamped by Monte Carlo error. In contrast, techniques for assessing Monte Carlo error have not been developed for use with alternative implementations of Monte Carlo EM algorithms utilizing Markov chain Monte Carlo E-step approximations. Three different data sets, including the infamous salamander data of McCullagh and Nelder, are used to illustrate the techniques and to compare them with the alternatives. The results show that the methods proposed can be considerably more efficient than those based on Markov chain Monte Carlo algorithms. However, the methods proposed may break down when the intractable integrals in the likelihood function are of high dimension.

In this paper, a moving monitoring procedure is proposed to detect a potential variance change in the location model with dependent errors. The procedure is motivated by the problem that the existing square CUSUM test is insensitive to a late variance change of the location model. The asymptotic distribution of the statistics under the null hypothesis and the consistency under the alternative hypothesis are derived. Simulations show that our monitoring procedure offers better power than the square CUSUM test and detects changes more quickly. Moreover, the effectiveness of our procedure is illustrated by applying it to two data sets.

For a multivariate linear model, Wilks' likelihood ratio test (LRT) constitutes one of the cornerstone tools. However, the computation of its quantiles under the null or the alternative hypothesis requires complex analytic approximations, and more importantly, these distributional approximations are feasible only for moderate dimension of the dependent variable, say p≤20. On the other hand, assuming that the data dimension p as well as the number q of regression variables are fixed while the sample size n grows, several asymptotic approximations are proposed in the literature for Wilks' Λ, including the widely used chi-square approximation. In this paper, we consider necessary modifications to Wilks' test in a high-dimensional context, specifically assuming a high data dimension p and a large sample size n. Based on recent random matrix theory, the correction we propose to Wilks' test is asymptotically Gaussian under the null hypothesis, and simulations demonstrate that the corrected LRT has very satisfactory size and power, certainly in the large p and large n context, but also for moderately large data dimensions such as p=30 or p=50. As a byproduct, we give a reason explaining why the standard chi-square approximation fails for high-dimensional data. We also introduce a new procedure for the classical multiple sample significance test in multivariate analysis of variance which is valid for high-dimensional data.

In a longitudinal study, an individual is followed up over a period of time. Repeated measurements on the response and some time-dependent covariates are taken at a series of sampling times. The sampling times are often irregular and depend on covariates. In this paper, we propose a sampling adjusted procedure for the estimation of the proportional mean model without having to specify a sampling model. Unlike existing procedures, the proposed method is robust to model misspecification of the sampling times. Large sample properties are investigated for the estimators of both regression coefficients and the baseline function. We show that the proposed estimation procedure is more efficient than the existing procedures. Large sample confidence intervals for the baseline function are also constructed by perturbing the estimating equations. A simulation study is conducted to examine the finite sample properties of the proposed estimators and to compare them with some of the existing procedures. The method is illustrated with a data set from a recurrent bladder cancer study.

Quantile-based reliability analysis has received much attention recently. We propose new quantile-based tests for exponentiality against decreasing mean residual quantile function (DMRQ) and new better than used in expectation (NBUE) classes of alternatives. The exact null distribution of the test statistic is derived when the alternative class is DMRQ. The asymptotic properties of both test statistics are studied. The performance of the proposed tests is compared with that of other existing tests in the literature through a simulation study. Finally, we illustrate our test procedure using real data sets.

