Similar articles
1.
Under an assumption that missing values occur randomly in a matrix, formulae are developed for the expected value and variance of six statistics that summarize the number and location of the missing values. For a seventh statistic, a regression model based on simulated data yields an estimate of the expected value. The results can be used in the development of methods to control the Type I error and approximate power and sample size for multilevel and longitudinal studies with missing data.

2.
We consider statistical inference of unknown parameters in estimating equations (EEs) when some covariates have nonignorably missing values, which is quite common in practice but has rarely been discussed in the literature. When an instrument, a fully observed covariate vector that helps identify parameters under nonignorable missingness, is available, the conditional distribution of the missing covariates given other covariates can be estimated by the pseudolikelihood method of Zhao and Shao [(2015), ‘Semiparametric pseudo likelihoods in generalised linear models with nonignorable missing data’, Journal of the American Statistical Association, 110, 1577–1590] and be used to construct unbiased EEs. These modified EEs then constitute a basis for valid inference by empirical likelihood. Our method is applicable to a wide range of EEs used in practice. It is semiparametric since no parametric model for the propensity of missing covariate data is assumed. Asymptotic properties of the proposed estimator and the empirical likelihood ratio test statistic are derived. Some simulation results and a real data analysis are presented for illustration.

3.
The problem of testing hypotheses on the mean vector of a multivariate normal distribution with unknown and positive definite covariance matrix is considered when a sample with a special, though not unusual, pattern of missing observations from that population is available. The approximate percentage points of the test statistic are obtained and their accuracy has been checked by comparing them with some exact percentage points which are calculated for complete samples and some special incomplete samples. The approximate percentage points are in good agreement with exact percentage points. The above work is extended to the problem of testing the hypothesis of equality of two mean vectors of two multivariate normal distributions with the same, unknown covariance matrix.

4.
The assumption that all random errors in the linear regression model share the same variance (homoskedasticity) is often violated in practice. The ordinary least squares estimator of the vector of regression parameters remains unbiased, consistent and asymptotically normal under unequal error variances. Many practitioners then choose to base their inferences on such an estimator. The usual practice is to couple it with an asymptotically valid estimation of its covariance matrix, and then carry out hypothesis tests that are valid under heteroskedasticity of unknown form. We use numerical integration methods to compute the exact null distributions of some quasi-t test statistics, and propose a new covariance matrix estimator. The numerical results favor testing inference based on the estimator we propose.

5.
ABSTRACT. This paper considers a general class of random coefficient regression (RCR) models to represent pooled cross-sectional and time series data. A new method is given to estimate the covariance matrix of the error component in these RCR models. Also, the asymptotic and small sample properties of the estimated generalized least squares estimator of the regression coefficient vector are established. Procedures for testing a linear restriction on the mean vector of the random coefficients are derived. Finally, a test for non-randomness in the RCR model is devised, and the asymptotic distribution of the test statistic is obtained.

6.
Linear increments (LI) are used to analyse repeated outcome data with missing values. Previously, two LI methods have been proposed, one allowing non‐monotone missingness but not independent measurement error and one allowing independent measurement error but only monotone missingness. In both, it was suggested that the expected increment could depend on current outcome. We show that LI can allow non‐monotone missingness and either independent measurement error of unknown variance or dependence of expected increment on current outcome but not both. A popular alternative to LI is a multivariate normal model ignoring the missingness pattern. This gives consistent estimation when data are normally distributed and missing at random (MAR). We clarify the relation between MAR and the assumptions of LI and show that for continuous outcomes multivariate normal estimators are also consistent under (non‐MAR and non‐normal) assumptions not much stronger than those of LI. Moreover, when missingness is non‐monotone, they are typically more efficient.

7.
This paper considers the problem of testing the randomness of Gaussian and non–Gaussian time series. A general class of parametric portmanteau statistics, which includes the Box–Pierce and the Ljung–Box statistics, is introduced. Using the exact first and second moments of the sample autocorrelations when the observations are i.i.d. normal with unknown mean, the exact expected value of any portmanteau statistic is obtained for this case. Two new portmanteau statistics, which exploit the exact moments of the sample autocorrelations, are studied. For the nonparametric case, a rank portmanteau statistic is introduced. The latter has the same distribution for any series of exchangeable random variables and uses the exact moments of the rank autocorrelations. We show that its asymptotic distribution is chi–square. Simulation results indicate that the new portmanteau statistics are better approximated by the chi–square asymptotic distribution than the Ljung–Box statistics. Several analytical results presented in the paper were derived by using a symbolic manipulation program.
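As a point of reference for the two classical statistics named in this abstract, the following is a minimal sketch of the textbook Box–Pierce and Ljung–Box portmanteau statistics. It is not the new exact-moment or rank-based statistics proposed in the paper; the function names are illustrative assumptions.

```python
def sample_autocorrelations(x, max_lag):
    """Sample autocorrelations r_1..r_max_lag of a sequence x (mean-corrected)."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    denom = sum(v * v for v in d)
    return [
        sum(d[t] * d[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def box_pierce(x, max_lag):
    """Box-Pierce statistic: n * sum_k r_k^2."""
    n = len(x)
    r = sample_autocorrelations(x, max_lag)
    return n * sum(rk * rk for rk in r)


def ljung_box(x, max_lag):
    """Ljung-Box statistic: n (n + 2) * sum_k r_k^2 / (n - k)."""
    n = len(x)
    r = sample_autocorrelations(x, max_lag)
    return n * (n + 2) * sum(
        rk * rk / (n - k) for k, rk in enumerate(r, start=1)
    )
```

Under the null hypothesis of randomness, both statistics are conventionally compared with a chi-square distribution whose degrees of freedom equal `max_lag`; the Ljung–Box weighting is the usual small-sample adjustment of Box–Pierce.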

8.

There has been increasing interest in using semi-supervised learning to form a classifier. As is well known, the (Fisher) information in an unclassified feature with unknown class label is less (considerably less for weakly separated classes) than that of a classified feature which has known class label. Hence in the case where the absence of class labels does not depend on the data, the expected error rate of a classifier formed from the classified and unclassified features in a partially classified sample is greater than that if the sample were completely classified. We propose to treat the labels of the unclassified features as missing data and to introduce a framework for their missingness as in the pioneering work of Rubin (Biometrika 63:581–592, 1976) for missingness in incomplete data analysis. An examination of several partially classified data sets in the literature suggests that the unclassified features are not occurring at random in the feature space, but rather tend to be concentrated in regions of relatively high entropy. It suggests that the missingness of the labels of the features can be modelled by representing the conditional probability of a missing label for a feature via the logistic model with covariate depending on the entropy of the feature or an appropriate proxy for it. We consider here the case of two normal classes with a common covariance matrix where for computational convenience the square of the discriminant function is used as the covariate in the logistic model in place of the negative log entropy. Rather paradoxically, we show that the classifier so formed from the partially classified sample may have smaller expected error rate than that if the sample were completely classified.


9.
Most multivariate statistical techniques rely on the assumption of multivariate normality. The effects of nonnormality on multivariate tests are assumed to be negligible when variance–covariance matrices and sample sizes are equal. Therefore, in practice, investigators usually do not attempt to assess multivariate normality. In this simulation study, the effects of skewed and leptokurtic multivariate data on the Type I error and power of Hotelling's T² were examined by manipulating distribution, sample size, and variance–covariance matrix. The empirical Type I error rate and power of Hotelling's T² were calculated before and after the application of generalized Box–Cox transformation. The findings demonstrated that even when variance–covariance matrices and sample sizes are equal, small to moderate changes in power still can be observed.
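For readers unfamiliar with the statistic studied here, the sketch below computes the standard two-sample Hotelling's T² with a pooled covariance matrix, together with its usual F transformation. It is restricted to bivariate data so that the 2x2 inverse can be written out by hand, and it does not reproduce the simulation design or the Box–Cox step of the study; all names are illustrative assumptions.

```python
def mean_vector(rows):
    """Component-wise mean of a list of (x, y) pairs."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter_matrix(rows, mean):
    """2x2 sum of outer products of deviations from the mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def hotelling_t2_two_sample(a, b):
    """Return (T2, F) for two bivariate samples, assuming a common covariance."""
    n1, n2 = len(a), len(b)
    m1, m2 = mean_vector(a), mean_vector(b)
    s1, s2 = scatter_matrix(a, m1), scatter_matrix(b, m2)
    dof = n1 + n2 - 2
    # Pooled covariance matrix.
    sp = [[(s1[i][j] + s2[i][j]) / dof for j in range(2)] for i in range(2)]
    det = sp[0][0] * sp[1][1] - sp[0][1] * sp[1][0]
    inv = [[sp[1][1] / det, -sp[0][1] / det],
           [-sp[1][0] / det, sp[0][0] / det]]
    d = [m1[0] - m2[0], m1[1] - m2[1]]
    quad = sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))
    t2 = n1 * n2 / (n1 + n2) * quad
    p = 2  # dimension of each observation
    f = (n1 + n2 - p - 1) / (p * dof) * t2
    return t2, f
```

Under multivariate normality with equal covariance matrices, the returned F value is referred to an F distribution with p and n1 + n2 - p - 1 degrees of freedom; the abstract concerns how that reference distribution behaves when normality fails.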

10.
Three procedures for testing the adequacy of a proposed linear multiresponse regression model against unspecified general alternatives are considered. The model has an error structure with a matrix normal distribution which allows the vector of responses for a particular run to have an unknown covariance matrix while the responses for different runs are uncorrelated. Furthermore, each response variable may be modeled by a separate design matrix. Multivariate statistics corresponding to the classical univariate lack of fit and pure error sums of squares are defined and used to determine the multivariate lack of fit tests. A simulation study was performed to compare the power functions of the test procedures in the case of replication. Generalizations of the tests for the case in which there are no independent replicates on all responses are also presented.

11.
This work provides a set of macros performed with SAS (Statistical Analysis System) for Windows, which can be used to fit conditional models under intermittent missingness in longitudinal data. A formalized transition model, including random effects for individuals and measurement error, is presented. Model fitting is based on the missing completely at random or missing at random assumptions, and the separability condition. The problem translates to maximization of the marginal observed data density only, which for Gaussian data is again Gaussian, meaning that the likelihood can be expressed in terms of the mean and covariance matrix of the observed data vector. A simulation study is presented and misspecification issues are considered. A practical application is also given, where conditional models are fitted to the data from a clinical trial that assessed the effect of a Cuban medicine on a disease of the respiratory system.

12.
This article considers the problem of testing the null hypothesis of stochastic stationarity in time series characterized by variance shifts at some (known or unknown) point in the sample. It is shown that existing stationarity tests can be severely biased in the presence of such shifts, either oversized or undersized, with associated spurious power gains or losses, depending on the values of the breakpoint parameter and on the ratio of the prebreak to postbreak variance. Under the assumption of a serially independent Gaussian error term with known break date and known variance ratio, a locally best invariant (LBI) test of the null hypothesis of stationarity in the presence of variance shifts is then derived. Both the test statistic and its asymptotic null distribution depend on the breakpoint parameter and also, in general, on the variance ratio. Modifications of the LBI test statistic are proposed for which the limiting distribution is independent of such nuisance parameters and belongs to the family of Cramér–von Mises distributions. One such modification is particularly appealing in that it is simultaneously exact invariant to variance shifts and to structural breaks in the slope and/or level of the series. Monte Carlo simulations demonstrate that the power loss from using our modified statistics in place of the LBI statistic is not large, even in the neighborhood of the null hypothesis, and particularly for series with shifts in the slope and/or level. The tests are extended to cover the cases of weakly dependent error processes and unknown breakpoints. The implementation of the tests is illustrated using output, inflation, and exchange rate data series.

13.
In this study we discuss group sequential procedures for comparing two treatments based on multivariate observations in clinical trials. We suppose that the response vector on each of the two treatments has a multivariate normal distribution with unknown covariance matrix. We then propose a group sequential χ² statistic in order to carry out repeated significance tests of the hypothesis of no difference between the two population mean vectors. In order to realize a group sequential test in which the average sample number is reduced, we propose another modified group sequential χ² statistic by extension of Jennison and Turnbull (1991). After construction of repeated confidence boundaries for making the repeated significance test, we compare the two group sequential procedures based on the two statistics with regard to the average sample number and the power of the test in simulations.

14.
The authors present a new nonparametric approach to test for interaction in two‐way layouts. Based on the concept of composite linear rank statistics, they combine the correlated row and column ranking information to construct the test statistic. They determine the limiting distributions of the proposed test statistic under the null hypothesis and Pitman alternatives. They also propose consistent estimators for the limiting covariance matrices associated with the test. They illustrate the application of their test in practical settings using a microarray data set.

15.
In this paper, we consider the problem of estimating the parameters of a matrix normal dynamic linear model when the variance and covariance matrices of its error terms are unknown and can be changing over time. Given that the analysis is not conjugate, we use simulation methods based on Monte Carlo Markov chains to estimate the parameters of the model. This analysis allows us to carry out a dynamic principal components analysis in a set of multivariate time series. Furthermore, it permits the treatment of series with different lengths and with missing data. The methodology is illustrated with two empirical examples: the value added distribution of the firms operating in the manufacturing sector of the countries participating in the BACH project, and the joint evolution of a set of international stock-market indices.

16.
Consider a vector valued response variable related to a vector valued explanatory variable through a normal multivariate linear model. The multivariate calibration problem deals with statistical inference on unknown values of the explanatory variable. The problem addressed is the construction of joint confidence regions for several unknown values of the explanatory variable. The problem is investigated when the variance covariance matrix is a scalar multiple of the identity matrix and also when it is a completely unknown positive definite matrix. The problem is solved in only two cases: (i) the response and explanatory variables have the same dimensions, and (ii) the explanatory variable is a scalar. In the former case, exact joint confidence regions are derived based on a natural pivot statistic. In the latter case, the joint confidence regions are only conservative. Computational aspects and the practical implementation of the confidence regions are discussed and illustrated using an example.

17.
We propose a Bayesian computation and inference method for the Pearson-type chi-squared goodness-of-fit test with right-censored survival data. Our test statistic is derived from the classical Pearson chi-squared test using the differences between the observed and expected counts in the partitioned bins. In the Bayesian paradigm, we generate posterior samples of the model parameter using the Markov chain Monte Carlo procedure. By replacing the maximum likelihood estimator in the quadratic form with a random observation from the posterior distribution of the model parameter, we can easily construct a chi-squared test statistic. The degrees of freedom of the test equal the number of bins and thus are independent of the dimensionality of the underlying parameter vector. The test statistic recovers the conventional Pearson-type chi-squared structure. Moreover, the proposed algorithm circumvents the burden of evaluating the Fisher information matrix, its inverse and the rank of the variance–covariance matrix. We examine the proposed model diagnostic method using simulation studies and illustrate it with a real data set from a prostate cancer study.
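The classical statistic that this abstract takes as its starting point can be sketched as follows. This is only the textbook Pearson form, the sum over bins of (observed - expected)² / expected; it does not handle right censoring or posterior sampling, which are the paper's contributions. The function name is an illustrative assumption.

```python
def pearson_chi_squared(observed, expected):
    """Classical Pearson statistic: sum over bins of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of bins")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```

For example, `pearson_chi_squared([10, 20, 30], [20, 20, 20])` evaluates the statistic for three bins; in the classical setting the reference degrees of freedom depend on how many parameters were estimated, whereas the abstract states that in the proposed Bayesian version they equal the number of bins.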

18.
A class of Kolmogorov-Smirnov and Cramér-von Mises type statistics for testing symmetry about an unknown value is described. These statistics are not distribution-free, however, and, indeed, are not readily amenable to calculation. A linear rank statistic analog of the first component of the Cramér-von Mises type statistic is investigated. Asymptotic non-null properties of these procedures in the normal case are studied, and an efficiency comparison of the Cramér-von Mises statistic, the linear rank statistic analog, the modified Wilcoxon statistic, and the likelihood ratio test is reported.

19.
When prediction intervals are constructed using unobserved component models (UCM), problems can arise due to the possible existence of components that may or may not be conditionally heteroscedastic. Accurate coverage depends on correctly identifying the source of the heteroscedasticity. Different proposals for testing heteroscedasticity have been applied to UCM; however, in most cases, these procedures are unable to identify the heteroscedastic component correctly. The main issue is that test statistics are affected by the presence of serial correlation, causing the distribution of the statistic under conditional homoscedasticity to remain unknown. We propose a nonparametric statistic for testing heteroscedasticity based on the well-known Wilcoxon's rank statistic. We study the asymptotic validation of the statistic and examine bootstrap procedures for approximating its finite sample distribution. Simulation results show an improvement in the size of the homoscedasticity tests and a power that is clearly comparable with the best alternative in the literature. We also apply the test on real inflation data. Looking for the presence of a conditionally heteroscedastic effect on the error terms, we arrive at conclusions that, in almost all cases, differ from those given by the alternative test statistics presented in the literature.

20.
In this study, we consider a stochastic one-way analysis of covariance model when the distribution of the error terms is long-tailed symmetric. Estimators of the unknown model parameters are obtained by using the maximum likelihood (ML) methodology. An iteratively reweighting algorithm is used to compute the ML estimates of the parameters. We also propose a new test statistic based on ML estimators for testing the linear contrasts of the treatment effects. In the simulation study, we compare the efficiencies of the traditional least-squares (LS) estimators of the model parameters with the corresponding ML estimators. We also compare the power of the test statistics based on LS and ML estimators, respectively. A real-life example is given at the end of the study.
