Similar Articles
20 similar articles found.
1.
The plug-in Anderson's covariate classification statistic is constructed on the basis of an initially unclassified training sample by means of post-stratification. Its asymptotic efficiency relative to the discriminant based on an initially classified training sample is evaluated for the case where a covariate is present, and the effect of post-stratification is examined.

2.
The performance of Anderson's classification statistic based on a post-stratified random sample is examined. It is assumed that the training sample is a random sample from a stratified population consisting of two strata with unknown stratum weights. The sample is first segregated into the two strata by post-stratification. The unknown parameters for each of the two populations are then estimated and used in the construction of the plug-in discriminant. Under this procedure, it is shown that additional estimation of the stratum weight will not seriously affect the performance of Anderson's classification statistic. Furthermore, our discriminant enjoys a much higher efficiency than the procedure based on an unclassified sample from a mixture of normals investigated by Ganesalingam and McLachlan (1978).
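As a concrete reference point, the following Python sketch constructs the plug-in version of Anderson's classification statistic W from two classified training samples. This is a minimal illustration under the equal-covariance normal model; the post-stratification step and stratum-weight estimation discussed above are not shown, and all data here are synthetic.

```python
import numpy as np

def anderson_w(x, X1, X2):
    """Plug-in Anderson classification statistic W for a new observation x.

    X1, X2: training samples (rows = observations) from populations 1 and 2,
    assumed multivariate normal with a common covariance matrix.
    Classify x to population 1 if W > 0, else to population 2.
    """
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    n1, n2 = len(X1), len(X2)
    # Pooled (plug-in) estimate of the common covariance matrix.
    S = ((n1 - 1) * np.cov(X1, rowvar=False)
         + (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
    a = np.linalg.solve(S, m1 - m2)           # discriminant direction
    return float((x - 0.5 * (m1 + m2)) @ a)   # W statistic

rng = np.random.default_rng(0)
X1 = rng.multivariate_normal([0, 0], np.eye(2), size=50)
X2 = rng.multivariate_normal([1, 1], np.eye(2), size=50)
print(anderson_w(np.array([0.2, 0.1]), X1, X2))  # > 0 -> assign to population 1
```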

3.
Linear maps of a single unclassified observation are used to estimate the mixing proportion in a mixture of two populations with homogeneous variances in the presence of covariates. With complete knowledge of the parameters of the individual populations, the linear map for which the estimator is unbiased and has minimum variance among all similar estimators can be determined. A plug-in estimator based on independent training samples from the component populations can be constructed and is asymptotically equivalent to Cochran's classification statistic V* for covariate classification; see Memon and Okamoto (1970). Under normality assumptions, an asymptotic expansion of the distribution of the plug-in estimator is available. In the absence of covariates, our estimator reduces to that suggested by Walker (1980), who investigated the problem based on large unclassified samples from a mixture of two populations with heterogeneous variances. In contrast, the distribution of Walker's estimator seems intractable for moderate sample sizes even under the normality assumption.

4.
A technique for deriving asymptotic expansions for the variances of the errors of misclassification of the linear discriminant function (Anderson's classification statistic) is developed. These expansions are shown to be in reasonable agreement with the sample values of the variances of the errors obtained from some sampling experiments.

5.
Bartlett's (1937) test for equality of variances is based on a χ2 approximation. This approximation deteriorates either when the sample sizes are small (particularly below 4) or when the number of populations is large. A simulation investigation reveals a similar trend in the mean differences between the empirical distributions of Bartlett's statistic and their χ2 approximations. Using these mean differences to represent the distributional departure, a simple adjustment of Bartlett's statistic is proposed on the basis of an equal-mean principle. The performance before and after adjustment is extensively investigated under equal and unequal sample sizes, with the number of populations varying from 3 to 100. Compared with the traditional Bartlett statistic, the adjusted statistic is distributed more closely to the χ2 distribution for homogeneous samples from normal populations. The type I error is well controlled and the power is slightly higher after adjustment. In conclusion, the adjustment gives good control of the type I error and higher power, and is therefore recommended for small samples and a large number of populations when the underlying distribution is normal.
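For readers who want to reproduce the baseline, here is a minimal Python sketch of the classical (unadjusted) Bartlett statistic with its χ2(k-1) reference distribution, cross-checked against SciPy; the proposed mean-difference adjustment itself is not reproduced here, and the groups below are simulated.

```python
import numpy as np
from scipy import stats

def bartlett_stat(samples):
    """Classical Bartlett statistic, referred to the chi2(k-1) distribution."""
    k = len(samples)
    n = np.array([len(s) for s in samples])
    v = np.array([np.var(s, ddof=1) for s in samples])
    N = n.sum()
    sp2 = np.sum((n - 1) * v) / (N - k)  # pooled variance
    num = (N - k) * np.log(sp2) - np.sum((n - 1) * np.log(v))
    C = 1 + (np.sum(1 / (n - 1)) - 1 / (N - k)) / (3 * (k - 1))
    T = num / C
    return T, stats.chi2.sf(T, k - 1)

rng = np.random.default_rng(1)
groups = [rng.normal(0, 1, size=5) for _ in range(10)]  # small n, many groups
print(bartlett_stat(groups))
print(stats.bartlett(*groups))  # SciPy's implementation for cross-checking
```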

6.
The current method of determining sample size for confidence intervals does not accommodate multiple covariate adjustment. Under the normality assumption, the effect of multiple covariate adjustment on the standard error of the mean comparison is related to a Hotelling T2 statistic. Sample size can be calculated to obtain a desired probability of achieving a predetermined width in the confidence interval of the mean comparison with multiple covariate adjustment, given that the confidence interval includes the population parameter.

7.
In classical discriminant analysis, when two multivariate normal distributions with equal variance-covariance matrices are assumed for two groups, the classical linear discriminant function is optimal with respect to maximizing the standardized difference between the means of the two groups. However, for a typical case-control study, the distributional assumption for the case group often needs to be relaxed in practice. Komori et al. (Generalized t-statistic for two-group classification. Biometrics 2015, 71: 404-416) proposed the generalized t-statistic to obtain a linear discriminant function that allows for heterogeneity of the case group. Their procedure has an optimality property within the class under consideration. We perform a further study of the problem and show that additional improvement is achievable. The approach we propose does not require a parametric distributional assumption on the case group. We further show that the new estimator is efficient, in the sense that the linear discriminant function cannot be constructed more efficiently. We conduct simulation studies and real data examples to illustrate the finite-sample performance and the gain the new method produces in comparison with existing methods.

8.
Borrowing data from an external control has been an appealing strategy for evidence synthesis when conducting randomized controlled trials (RCTs). Often named hybrid control trials, such designs leverage existing control data from clinical trials or potentially real-world data (RWD), enable trial designs to allocate more patients to the novel intervention arm, and improve the efficiency or lower the cost of the primary RCT. Several methods have been established and developed to borrow external control data, among which propensity score methods and the Bayesian dynamic borrowing framework play essential roles. Noticing the unique strengths of propensity score methods and Bayesian hierarchical models, we utilize both in a complementary manner to analyze hybrid control studies. In this article, we review methods including covariate adjustment, propensity score matching, and weighting in combination with dynamic borrowing, and compare the performance of these methods through comprehensive simulations. Different degrees of covariate imbalance and confounding are examined. Our findings suggest that conventional covariate adjustment in combination with the Bayesian commensurate prior model provides the highest power with good type I error control under the investigated settings, and performs as desired especially across scenarios with different degrees of confounding. To estimate efficacy signals in the exploratory setting, the covariate adjustment method in combination with the Bayesian commensurate prior is recommended.

9.
Several mathematical programming approaches to the classification problem in discriminant analysis have recently been introduced. This paper empirically compares these newly introduced classification techniques with Fisher's linear discriminant analysis (FLDA), quadratic discriminant analysis (QDA), logit analysis, and several rank-based procedures for a variety of symmetric and skewed distributions. The percentage of observations correctly classified by each procedure in a holdout sample indicates that, while the linear programming approaches compete well with the classical procedures under some experimental conditions, their overall performance lags behind that of the classical procedures.

10.
Two procedures for testing equality of two proportions are compared in terms of asymptotic efficiency. The comparison favors use of a statistic equivalent to Goodman's Y2 over the usual X2 statistic in some cases, including that of equal sample sizes. Numerical comparisons indicate that the asymptotic results have some relevance for moderate sample sizes.
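The usual X2 comparator can be sketched directly (Goodman's Y2 is not reproduced here). A minimal Python illustration with hypothetical counts:

```python
from scipy import stats

def pearson_x2(x1, n1, x2, n2):
    """Usual X^2 statistic for H0: p1 = p2, referred to chi2(1)."""
    p = (x1 + x2) / (n1 + n2)  # pooled proportion under H0
    x2_stat = (x1 / n1 - x2 / n2) ** 2 / (p * (1 - p) * (1 / n1 + 1 / n2))
    return x2_stat, stats.chi2.sf(x2_stat, 1)

# Hypothetical data: 40/100 successes vs 28/100 successes.
print(pearson_x2(40, 100, 28, 100))
```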

11.
Covariate adjustment for estimating the treatment effect in randomized controlled trials (RCTs) is a simple approach with a long history, and hence its pros and cons have been well investigated and published in the literature. It is worthwhile to revisit this topic because there has recently been significant investigation and development concerning model assumptions and robustness to model mis-specification, in particular regarding the Neyman-Rubin model and the average treatment effect estimand. This paper discusses key results of this work and their practical implications for pharmaceutical statistics. Accordingly, we recommend that appropriate covariate adjustment be more widely used in RCTs for both hypothesis testing and estimation.
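A minimal synthetic illustration of the point: under randomization, adjusting for a prognostic baseline covariate targets the same treatment-effect estimand while shrinking its standard error. The data-generating coefficients below are hypothetical.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 200
x = rng.normal(size=n)                      # prognostic baseline covariate
t = rng.integers(0, 2, size=n)              # randomized treatment assignment
y = 0.5 * t + 1.0 * x + rng.normal(size=n)  # continuous outcome

# Unadjusted analysis: difference in means via OLS on treatment only.
unadj = sm.OLS(y, sm.add_constant(t)).fit()
# Covariate-adjusted analysis: regress on treatment plus baseline covariate.
adj = sm.OLS(y, sm.add_constant(np.column_stack([t, x]))).fit()
print(unadj.params[1], unadj.bse[1])  # treatment effect, larger SE
print(adj.params[1], adj.bse[1])      # similar estimate, smaller SE
```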

12.
We consider the problem of testing the equality of two population means when the population variances are not necessarily equal. We propose a Welch-type statistic, say T*c, based on Tiku's (1967, 1980) modified maximum likelihood estimators, and show that this statistic is robust to symmetric and moderately skewed distributions. We investigate the power properties of T*c; it clearly seems to be more powerful than Yuen's (1974) Welch-type robust statistic based on trimmed sample means and the matching sample variances. We show that analogous statistics based on 'adaptive' robust estimators give misleading type I errors. We generalize the results to testing linear contrasts among k population means.
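As a baseline for comparison, classical Welch's test (the unmodified relative of such a Welch-type statistic) is a one-liner in SciPy; the paper's T*c, which substitutes Tiku's modified maximum likelihood estimators for the sample means and variances, is not implemented in this sketch.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
a = rng.normal(0.0, 1.0, size=20)
b = rng.normal(0.5, 2.0, size=35)  # unequal variances, unequal sample sizes

# equal_var=False requests Welch's t-test rather than the pooled t-test.
print(stats.ttest_ind(a, b, equal_var=False))
```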

13.
In this paper we consider the problem of testing the means of k multivariate normal populations with additional data from an unknown subset of the k populations. The purpose of this research is to offer test procedures utilizing all the available data for the multivariate analysis of variance problem, because the additional data may contain valuable information about the parameters of the k populations. The standard procedure uses only the data from identified populations. We provide a test using all available data based upon Hotelling's generalized T2 statistic. The power of this test is computed using Betz's approximation of Hotelling's generalized T2 statistic by an F-distribution. A comparison of the power of this test and the standard procedure is also given.

14.
Various methods have been suggested in the literature to handle a missing covariate in the presence of surrogate covariates. These methods belong to one of two paradigms. In the imputation paradigm, Pepe and Fleming (1991) and Reilly and Pepe (1995) suggested filling in missing covariates using the empirical distribution of the covariate obtained from the observed data. We can proceed one step further by imputing the missing covariate using nonparametric maximum likelihood estimates (NPMLE) of the density of the covariate. Recently, Murphy and van der Vaart (1998a) showed that such an approach yields a consistent, asymptotically normal, and semiparametric efficient estimate of the logistic regression coefficient. In the weighting paradigm, Zhao and Lipsitz (1992) suggested an estimating function using completely observed records weighted inversely by the probability of observation. An extension of this weighting approach designed to achieve the semiparametric efficiency bound was considered by Robins, Hsieh and Newey (RHN) (1995). The two ends of each paradigm (NPMLE and RHN) attain the efficiency bound and are asymptotically equivalent. However, both require a substantial amount of computation, and a question arises whether and when, in practical situations, this extensive computation is worthwhile. In this paper we investigate the performance of single and multiple imputation estimates, weighting estimates, semiparametric efficient estimates, and two new imputation estimates. Simulation studies suggest that the sample size should be substantially large (e.g. n = 2000) for NPMLE and RHN to be more efficient than simpler imputation estimates. When the sample size is moderately large (n ≤ 1500), simpler imputation estimates have as small a variance as semiparametric efficient estimates.
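To make the imputation paradigm concrete, here is a toy Python sketch of single hot-deck imputation from the empirical distribution of the covariate within surrogate strata, followed by a logistic fit. It illustrates only the simplest end of the spectrum discussed above (single imputation understates standard errors); the NPMLE and RHN estimators are not implemented, and all data are simulated.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 1000
z = rng.integers(0, 2, size=n)                     # surrogate (always observed)
x = z + rng.normal(size=n)                         # true covariate
y = rng.binomial(1, 1 / (1 + np.exp(-(x - 0.5))))  # binary outcome
obs = rng.random(n) < 0.6                          # covariate observed on ~60%

# Single hot-deck imputation: draw each missing x from the empirical
# distribution of observed x within the same surrogate stratum.
x_imp = x.copy()
for s in (0, 1):
    miss = (~obs) & (z == s)
    pool = x[obs & (z == s)]
    x_imp[miss] = rng.choice(pool, size=miss.sum(), replace=True)

fit = sm.Logit(y, sm.add_constant(x_imp)).fit(disp=0)
print(fit.params)  # intercept and slope from the imputed data
```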

15.
A consistent test for a difference in locations between two bivariate populations is proposed. The test is similar to the Mann-Whitney test and depends on exceedances of slopes between the two samples, where the slope for each observation is computed as the ratio of its two observed coordinates. In terms of the slopes, the problem reduces to a univariate one. The power of the test is compared by simulation with Mardia's (1967) test statistic, the Peters-Randles (1991) test statistic, Wilcoxon's rank-sum statistic, and Hotelling's T2 statistic, using the Monte Carlo technique. For sample sizes 15 and 18, the proposed test performs better than the other statistics for small location differences when the underlying population is population 7 (a light-tailed population); when the underlying population is population 6 (a heavy-tailed population), it performs better than all compared statistics except Wilcoxon's rank-sum statistic. It performs better than Mardia's (1967) statistic for large location differences when the underlying population is a bivariate normal mixture with mixing probability p = 0.5, population 6, a Pearson type II population, or a Pearson type VII population, for sample sizes 15 and 18. Under a bivariate normal population it performs as well as Mardia's (1967) statistic for small location differences with sample sizes 15 and 18, and for sample sizes 25 and 28 it performs better than Mardia's (1967) statistic when the underlying population is population 6, a Pearson type II population, or a Pearson type VII population.
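Following the abstract's description, a minimal sketch: reduce each bivariate observation to the slope formed by the ratio of its coordinates, then compare the two slope samples with a standard Mann-Whitney test. The paper's exact exceedance-based statistic may differ in detail, and the data below are synthetic.

```python
import numpy as np
from scipy import stats

def slope_test(sample1, sample2):
    """Reduce each bivariate observation (x, y) to the slope y/x, then
    apply the Mann-Whitney test to the two univariate slope samples."""
    s1 = sample1[:, 1] / sample1[:, 0]
    s2 = sample2[:, 1] / sample2[:, 0]
    return stats.mannwhitneyu(s1, s2, alternative="two-sided")

rng = np.random.default_rng(5)
A = rng.multivariate_normal([3, 3], np.eye(2), size=15)
B = rng.multivariate_normal([3, 4], np.eye(2), size=18)
print(slope_test(A, B))
```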

16.
Sample size calculation is a critical issue in clinical trials because a small sample size leads to biased inference and a large sample size increases cost. With the development of advanced medical technology, some patients can be cured of certain chronic diseases, and the proportional hazards mixture cure model has been developed to handle survival data with potential cure information. Given the needs of survival trials with potential cure proportions, a corresponding sample size formula based on the log-rank test statistic for binary covariates was proposed by Wang et al. [25]. However, a sample size formula based on continuous covariates had not been developed. Herein, we present sample size and power calculations for the mixture cure model with continuous covariates based on the log-rank method, further modified by Ewell's method. The proposed approaches were evaluated using simulation studies with synthetic data from exponential and Weibull distributions. A program for calculating the necessary sample size for continuous covariates in a mixture cure model was implemented in R.

17.
In this paper, the asymptotic relative efficiency (ARE) of Wald tests for the Tweedie class of models with log-linear mean is considered when the auxiliary variable is measured with error. Wald test statistics are defined based on the naive maximum likelihood estimator and on a consistent estimator obtained using Nakamura's (1990) corrected score function approach. As shown analytically, the Wald statistics based on the naive and corrected score function estimators are asymptotically equivalent in terms of ARE. On the other hand, the asymptotic relative efficiency of the naive and corrected Wald statistics with respect to the Wald statistic based on the true covariate equals the square of the correlation between the unobserved and the observed covariate. A small-scale Monte Carlo study and an example illustrate the small-sample situation.

18.
The study of a linear regression model with an interval-censored covariate, which was motivated by an acquired immunodeficiency syndrome (AIDS) clinical trial, was first proposed by Gómez et al. They developed a likelihood approach, together with a two-step conditional algorithm, to estimate the regression coefficients in the model. However, their method is inapplicable when the interval-censored covariate is continuous. In this article, we propose a novel and fast method to treat the continuous interval-censored covariate. By using logspline density estimation, we impute the interval-censored covariate with a conditional expectation. Then, the ordinary least-squares method is applied to the linear regression model with the imputed covariate. To assess the performance of the proposed method, we compare our imputation with the midpoint imputation and the semiparametric hierarchical method via simulations. Furthermore, an application to the AIDS clinical trial is presented.
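The midpoint-imputation comparator mentioned above is easy to sketch in Python; the proposed logspline conditional-expectation imputation is not reproduced here, and the censoring mechanism below is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 300
x = rng.uniform(0, 10, size=n)          # true covariate (never observed)
y = 1.0 + 0.8 * x + rng.normal(size=n)  # linear regression model

# Interval censoring: only the interval [left, right) containing x is recorded.
width = rng.uniform(1, 3, size=n)
left = np.floor(x / width) * width
right = left + width

# Midpoint imputation: replace each interval by its midpoint, then fit OLS.
x_mid = (left + right) / 2
X = np.column_stack([np.ones(n), x_mid])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
print(beta)  # estimated intercept and slope
```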

19.
In the medical literature, there has been increased interest in evaluating the association between exposure and outcomes using nonrandomized observational studies. However, because assignment to exposure is not random in observational studies, comparisons of outcomes between exposed and nonexposed subjects must account for the effect of confounders. Propensity score methods have been widely used to control for confounding when estimating exposure effects. Previous studies have shown that conditioning on the propensity score results in biased estimation of the conditional odds ratio and hazard ratio. However, research is lacking on the performance of propensity score methods for covariate adjustment when estimating the area under the ROC curve (AUC). In this paper, the AUC is proposed as a measure of effect when outcomes are continuous; it is interpreted as the probability that a randomly selected nonexposed subject has a better response than a randomly selected exposed subject. A series of simulations examines the performance of propensity score methods when the association between exposure and outcomes is quantified by the AUC, including determining the optimal choice of variables for the propensity score models. Additionally, the propensity score approach is compared with the conventional regression approach to covariate adjustment with the AUC. The choice of the best estimator depends on bias, relative bias, and root mean squared error. Finally, an example examining the relationship of depression/anxiety and pain intensity in people with sickle cell disease illustrates estimation of the adjusted AUC using the proposed approaches.
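The AUC interpretation given above coincides with the normalized Mann-Whitney U statistic, which suggests a direct nonparametric estimate. A sketch with simulated data, before any covariate adjustment:

```python
import numpy as np
from scipy import stats

def auc_two_groups(y_nonexposed, y_exposed):
    """AUC = P(randomly chosen nonexposed response > exposed response),
    estimated from the Mann-Whitney U statistic."""
    res = stats.mannwhitneyu(y_nonexposed, y_exposed, alternative="two-sided")
    return res.statistic / (len(y_nonexposed) * len(y_exposed))

rng = np.random.default_rng(7)
print(auc_two_groups(rng.normal(1, 1, 80), rng.normal(0, 1, 60)))
```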

20.
In this paper, testing procedures based on double sampling are proposed that yield gains in power for tests of general linear hypotheses. The distribution of a test statistic involving both the measurements of the outcome on the smaller sample and those of the covariates on the wider sample is first derived. Approximations are then provided to allow a formal comparison between the powers of the double-sampling and single-sampling strategies. Furthermore, it is shown how to allocate the measurements of the outcome and the covariates in order to maximize the power of the tests for a given experimental cost.
