Similar Documents
20 similar documents found.
1.
We discuss the standard linear regression model with nonspherical disturbances, where some regressors are annihilated by considering only the residuals from an auxiliary regression, and where, analogous to the Frisch-Waugh procedure, the original GLS procedure is applied to the transformed data. We call this procedure pseudo-GLS and give conditions for pseudo-GLS to be equal to genuine GLS. We also show via examples that these conditions are often violated in empirical applications, and that the Frisch-Waugh Theorem still “works” with nonspherical disturbances if efficient estimation is applied to both the original and the transformed data.
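As a rough numerical illustration of the distinction (simulated data, not from the paper), the sketch below assumes a heteroskedastic Ω: partialling out the auxiliary regressors by OLS and then running GLS on the residuals (pseudo-GLS) generally differs from genuine GLS, whereas partialling out in the Ω⁻¹ metric, i.e. applying efficient estimation to the auxiliary regression as well, reproduces the GLS coefficient exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = np.column_stack([np.ones(n), rng.normal(size=n)])  # auxiliary regressors to partial out
x2 = rng.normal(size=(n, 1)) + 0.5 * x1[:, [1]]         # regressor of interest

# Nonspherical disturbances: an illustrative heteroskedastic covariance matrix
omega = np.diag(np.exp(0.8 * x1[:, 1]))
omega_inv = np.linalg.inv(omega)
y = x1 @ np.array([1.0, 2.0]) + 3.0 * x2[:, 0] + np.sqrt(np.diag(omega)) * rng.normal(size=n)

def gls(X, z, W):
    """GLS coefficients (X' W X)^(-1) X' W z with W = Omega^(-1)."""
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ z)

beta_full = gls(np.column_stack([x1, x2]), y, omega_inv)[-1]   # genuine GLS coefficient on x2

# Pseudo-GLS: OLS residuals from the auxiliary regression, then GLS on those residuals
m_ols = np.eye(n) - x1 @ np.linalg.solve(x1.T @ x1, x1.T)
beta_pseudo = gls(m_ols @ x2, m_ols @ y, omega_inv)[0]

# Efficient (Omega-inverse metric) partialling: the Frisch-Waugh result still holds
m_gls = np.eye(n) - x1 @ np.linalg.solve(x1.T @ omega_inv @ x1, x1.T @ omega_inv)
beta_efficient = gls(m_gls @ x2, m_gls @ y, omega_inv)[0]

print(beta_full, beta_efficient, beta_pseudo)  # first two agree; pseudo-GLS need not
```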

2.
The analysis of high-dimensional data often begins with the identification of lower dimensional subspaces. Principal component analysis is a dimension reduction technique that identifies linear combinations of variables along which most variation occurs or which best “reconstruct” the original variables. For example, many temperature readings may be taken in a production process when in fact there are just a few underlying variables driving the process. A problem with principal components is that the linear combinations can seem quite arbitrary. To make them more interpretable, we introduce two classes of constraints. In the first, coefficients are constrained to equal a small number of values (homogeneity constraint). The second constraint attempts to set as many coefficients to zero as possible (sparsity constraint). The resultant interpretable directions are either calculated to be close to the original principal component directions, or calculated in a stepwise manner that may make the components more orthogonal. A small dataset on characteristics of cars is used to introduce the techniques. A more substantial data mining application is also given, illustrating the ability of the procedure to scale to a very large number of variables.
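The following sketch (simulated data, arbitrary thresholds) is only meant to convey the flavour of the two constraint classes, not the authors' fitting algorithm: take an ordinary first principal component, zero out small loadings (sparsity), and snap the survivors to a single common magnitude (homogeneity).

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy data: six correlated, standardized variables (think of the car characteristics)
latent = rng.normal(size=(100, 2))
X = latent @ rng.normal(size=(2, 6)) + 0.3 * rng.normal(size=(100, 6))
X = (X - X.mean(axis=0)) / X.std(axis=0)

# Ordinary first principal component direction
_, _, vt = np.linalg.svd(X, full_matrices=False)
v = vt[0]

# Sparsity constraint: set small loadings to zero (0.25 is an arbitrary cutoff)
sparse = np.where(np.abs(v) < 0.25, 0.0, v)

# Homogeneity constraint: restrict the loadings to the values {-c, 0, +c}
signs = np.sign(sparse)
homogeneous = signs / np.sqrt(np.count_nonzero(signs))

for name, w in [("PCA", v), ("sparse", sparse), ("homogeneous", homogeneous)]:
    w = w / np.linalg.norm(w)
    share = float(np.var(X @ w) / np.var(X @ v))   # variance captured relative to PC1
    print(f"{name:12s}", np.round(w, 2), round(share, 3))
```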

3.
In this paper, we propose the application of group screening methods for analyzing data using E(fNOD)-optimal mixed-level supersaturated designs possessing the equal occurrence property. Supersaturated designs are a large class of factorial designs which can be used for screening out the important factors from a large set of potentially active variables. The huge advantage of these designs is that they reduce the experimental cost drastically, but their critical disadvantage is the high degree of confounding among factorial effects. Based on the idea of group screening methods, the f factors are sub-divided into g “group-factors”. The “group-factors” are then studied using penalized likelihood methods in a factorial design with orthogonal or near-orthogonal columns. All factors in groups found to have a large effect are then studied in a second stage of experiments. A comparison of the Type I and Type II error rates of various estimation methods via simulation experiments is performed, and the results are presented in tables with accompanying discussion.
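A stripped-down sketch of the two-stage group-screening idea follows; for simplicity it uses small random two-level designs and ordinary least squares with an ad hoc cutoff rather than an E(fNOD)-optimal supersaturated design with penalized likelihood, and the active factors and effect sizes are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
f, g = 20, 5                                   # factors and "group-factors"
active = {3: 2.0, 11: 2.0}                     # hypothetical active factors and their effects

def response(design):
    """Simulated response for a +/-1 design over all f factors."""
    y = rng.normal(size=design.shape[0])
    for j, b in active.items():
        y = y + b * design[:, j]
    return y

groups = np.array_split(np.arange(f), g)
group_of = np.concatenate([[i] * len(idx) for i, idx in enumerate(groups)])

# Stage 1: all factors in a group are varied together, at the level of their group-factor
G1 = rng.choice([-1.0, 1.0], size=(12, g))
y1 = response(G1[:, group_of])
coef1, *_ = np.linalg.lstsq(np.column_stack([np.ones(12), G1]), y1, rcond=None)
flagged = [i for i in range(g) if abs(coef1[1 + i]) > 1.0]   # ad hoc cutoff

# Stage 2: a follow-up experiment varying only the factors in the flagged groups
candidates = np.concatenate([groups[i] for i in flagged])
X2 = np.zeros((16, f))                          # screened-out factors held at a fixed level
X2[:, candidates] = rng.choice([-1.0, 1.0], size=(16, len(candidates)))
coef2, *_ = np.linalg.lstsq(np.column_stack([np.ones(16), X2[:, candidates]]), response(X2), rcond=None)

print("flagged groups:", flagged)
print({int(j): round(float(b), 2) for j, b in zip(candidates, coef2[1:])})
```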

4.
When comparing the central values of two independent groups, should a t-test be performed, or should the observations be transformed into their ranks and a Wilcoxon-Mann-Whitney test performed? This paper argues that neither should automatically be chosen. Instead, provided that software for conducting randomisation tests is available, the chief concern should be with obtaining data values that are a good reflection of scientific reality and appropriate to the objective of the research; if necessary, the data values should be transformed so that this is so. The subsequent use of a randomisation (permutation) test will mean that failure of the transformed data values to satisfy assumptions such as normality and equality of variances will not be of concern.
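A minimal sketch of such a two-sample randomisation test; the lognormal data and the log transform are placeholders for whatever transformation makes the values scientifically meaningful.

```python
import numpy as np

rng = np.random.default_rng(3)

def randomisation_test(a, b, n_perm=10_000, rng=rng):
    """Two-sided randomisation (permutation) test for a difference in means."""
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    more_extreme = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        more_extreme += abs(perm[:len(a)].mean() - perm[len(a):].mean()) >= abs(observed)
    return observed, (more_extreme + 1) / (n_perm + 1)

# Skewed toy data: transform first so the values reflect the scientific question,
# then let the randomisation test take care of non-normality and unequal variances
group1 = rng.lognormal(mean=0.0, sigma=1.0, size=15)
group2 = rng.lognormal(mean=0.6, sigma=1.0, size=15)
print(randomisation_test(np.log(group1), np.log(group2)))
```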

5.
Long-run relations and common trends are discussed in terms of the multivariate cointegration model given in the autoregressive and the moving average form. The basic results needed for the analysis of I(1) and I(2) processes are reviewed and the results applied to Danish monetary data. The test procedures reveal that nominal money stock is essentially I(2). Long-run price homogeneity is supported by the data and imposed on the system. It is found that the bond rate is weakly exogenous for the long-run parameters and therefore acts as a driving trend. Using the nonstationarity property of the data, “excess money” is estimated and its effect on the other determinants of the system is investigated. In particular, it is found that “excess money” has no effect on price inflation.

6.
Comparison of treatment effects in an experiment is usually done through analysis of variance under the assumption that the errors are normally and independently distributed with zero mean and constant variance. The traditional approach to dealing with non-constant variance is to apply a variance stabilizing transformation and then run the analysis on the transformed data. In this approach, the conclusions of analysis of variance apply only to the transformed population. In this paper, the asymptotic quasi-likelihood method is introduced to the analysis of experimental designs. The weak assumptions of the asymptotic quasi-likelihood method make it possible to draw conclusions on heterogeneous populations without transforming them. This paper demonstrates how to apply the asymptotic quasi-likelihood technique to three commonly used models. This provides a possible way to analyse data from complex experimental designs.

7.
Clinical studies in overactive bladder have traditionally used analysis of covariance or nonparametric methods to analyse the number of incontinence episodes and other count data. It is known that if the underlying distributional assumptions of a particular parametric method do not hold, an alternative parametric method may be more efficient than a nonparametric one, which makes no assumptions regarding the underlying distribution of the data. Therefore, there are advantages in using methods based on the Poisson distribution or extensions of that method, which incorporate specific features that provide a modelling framework for count data. One challenge with count data is overdispersion, but methods are available that can account for this through the introduction of random effect terms in the modelling, and it is this modelling framework that leads to the negative binomial distribution. These models can also provide clinicians with a clearer and more appropriate interpretation of treatment effects in terms of rate ratios. In this paper, the previously used parametric and nonparametric approaches are contrasted with those based on Poisson regression and various extensions in trials evaluating solifenacin and mirabegron in patients with overactive bladder. In these applications, negative binomial models are seen to fit the data well.
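The sketch below illustrates the modelling idea on simulated data (not the solifenacin or mirabegron trials), assuming statsmodels is available: a gamma-distributed subject effect turns Poisson counts into negative binomial counts, and the treatment effect is reported as a rate ratio. The dispersion parameter alpha is treated as known here purely to keep the example short; in practice it would be estimated.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 400
treat = rng.integers(0, 2, size=n)        # 0 = control, 1 = active treatment
true_rate_ratio = 0.7

# A gamma-distributed subject effect (mean 1) makes the Poisson counts overdispersed,
# i.e. marginally negative binomial
frailty = rng.gamma(shape=2.0, scale=0.5, size=n)
episodes = rng.poisson(5.0 * true_rate_ratio ** treat * frailty)

X = sm.add_constant(treat.astype(float))
nb = sm.GLM(episodes, X, family=sm.families.NegativeBinomial(alpha=0.5)).fit()
print("estimated rate ratio:", float(np.exp(nb.params[1])))
print("95% CI for the rate ratio:", np.exp(nb.conf_int()[1]))
```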

8.
In practice, survival data are often collected over geographical regions. Shared spatial frailty models, which are often implemented using the Bayesian Markov chain Monte Carlo method, have been used to model spatial variation in survival times. However, this method comes at the price of slow mixing rates and heavy computational cost, which may render it impractical for data-intensive applications. Alternatively, a frailty model assuming an independent and identically distributed (iid) random effect can be easily and efficiently implemented. Therefore, we used simulations to assess the bias and efficiency loss in the estimated parameters when residual spatial correlation is present but an iid random effect is used. Our simulations indicate that a shared frailty model with an iid random effect can estimate the regression coefficients reasonably well, even with residual spatial correlation present, when the percentage of censoring is not too high and the number of clusters and cluster size are not too low. Therefore, if the primary goal is to assess the covariate effects, one may choose the frailty model with an iid random effect, whereas if the goal is to predict the hazard, additional care needs to be given due to the efficiency loss in the parameter(s) for the baseline hazard.

9.
This paper considers computation of fitted values and marginal effects in the Box-Cox regression model. Two methods, (1) the “smearing” technique suggested by Duan (see Ref. [10]) and (2) direct numerical integration, are examined and compared with the “naive” method often used in econometrics.
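For the log case (λ = 0 in the Box-Cox family) Duan's smearing estimate can be written in a few lines; the simulated comparison below contrasts it with the naive retransformation exp(x'β̂), and the "exact" value is available only because the error is generated as normal.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 500
x = rng.uniform(0, 2, size=n)
log_y = 1.0 + 0.8 * x + rng.normal(scale=0.6, size=n)   # Box-Cox model with lambda = 0

# Fit on the transformed scale
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, log_y, rcond=None)
resid = log_y - X @ beta

x0 = np.array([1.0, 1.0])                       # fitted value at x = 1
naive = np.exp(x0 @ beta)                       # ignores E[exp(error)] > 1
smearing = np.exp(x0 @ beta) * np.mean(np.exp(resid))   # Duan's smearing estimate
exact = np.exp(x0 @ beta + 0.6 ** 2 / 2)        # known here only because the error is normal

print(round(float(naive), 2), round(float(smearing), 2), round(float(exact), 2))
```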

10.
We examine the risk of a pre-test estimator for regression coefficients after a pre-test for homoskedasticity under the Balanced Loss Function (BLF). We show analytically that the two stage Aitken estimator is dominated by the pre-test estimator with the critical value of unity, even if the BLF is used. We also show numerically that both the two stage Aitken estimator and the pre-test estimator can be dominated by the ordinary least squares estimator when “goodness of fit” is regarded as more important than precision of estimation.

11.
A generalized self-consistency approach to maximum likelihood estimation (MLE) and model building was developed in Tsodikov [2003. Semiparametric models: a generalized self-consistency approach. J. Roy. Statist. Soc. Ser. B Statist. Methodology 65(3), 759–774] and applied to a survival analysis problem. We extend the framework to obtain second-order results such as the information matrix and properties of the variance. The multinomial model motivates the paper and is used throughout as an example. Computational challenges with the multinomial likelihood motivated Baker [1994. The Multinomial–Poisson transformation. The Statist. 43, 495–504] to develop the Multinomial–Poisson (MP) transformation for a large variety of regression models with multinomial likelihood kernel. Multinomial regression is transformed into a Poisson regression at the cost of augmenting model parameters and restricting the problem to discrete covariates. Imposing normalization restrictions by means of Lagrange multipliers [Lang, J., 1996. On the comparison of multinomial and Poisson log-linear models. J. Roy. Statist. Soc. Ser. B Statist. Methodology 58, 253–266] justifies the approach. Using the self-consistency framework we develop an alternative solution to multinomial model fitting that does not require augmenting parameters while allowing for a Poisson likelihood and arbitrary covariate structures. Normalization restrictions are imposed by averaging over artificial “missing data” (fake mixture). Lack of probabilistic interpretation at the “complete-data” level makes the use of the generalized self-consistency machinery essential.

12.
In clinical studies, researchers measure patients' responses longitudinally. In recent studies, mixed models have been used to determine effects at the individual level. On the other hand, Henderson et al. [3,4] developed a joint likelihood function which combines the likelihood functions of longitudinal biomarkers and survival times. They put random effects in the longitudinal component to determine whether a longitudinal biomarker is associated with time to an event. In this paper, we treat a longitudinal biomarker as a growth curve and extend Henderson's method to determine whether a longitudinal biomarker is associated with time to an event for multivariate survival data.

13.
In July 2004, Cindy Hepfer asked friends and colleagues: “What question would you like to ask Clifford Lynch if you had the chance?” As a result, Clifford Lynch discusses a wide variety of topics and issues impacting the serials community, ranging from Open Access, institutional repositories, what we can learn from Google and Amazon, and Shibboleth, to his favorite places to travel and how he prepares for presentations.

14.
Several statistics based on the empirical characteristic function have been proposed for testing the simple goodness-of-fit hypothesis that the data come from a population with a completely specified characteristic function which cannot be inverted in closed form, the typical example being the class of stable characteristic functions. As an alternative approach, it is pointed out here that the inversion formula of Gil-Pelaez and Rosén, as applied to the data and the hypothetical characteristic function via numerical integration, is the natural replacement of the probability integral transformation in the given situation. The transformed sample is from the uniform (0, 1) distribution if and only if the null hypothesis is true, and for testing uniformity on (0, 1) the whole arsenal of methods that statistics has so far produced can be used.
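A sketch of the suggested transformation, using the standard normal as a stand-in null so the numerically inverted CDF can be checked against the known one; for a stable law one would simply plug its characteristic function into cf below.

```python
import numpy as np
from scipy import stats
from scipy.integrate import trapezoid

def cdf_from_cf(x, cf, t_max=60.0, n_grid=8000):
    """Gil-Pelaez inversion: F(x) = 1/2 - (1/pi) * integral_0^inf Im(exp(-itx) cf(t)) / t dt."""
    t = np.linspace(1e-6, t_max, n_grid)
    integrand = np.imag(np.exp(-1j * t * x) * cf(t)) / t
    return 0.5 - trapezoid(integrand, t) / np.pi

def cf_normal(t):
    return np.exp(-0.5 * t ** 2)   # stand-in null; replace with a stable CF in the real problem

rng = np.random.default_rng(6)
sample = rng.normal(size=200)

# Probability-integral-transform each observation through the numerically inverted CDF
u = np.array([cdf_from_cf(x, cf_normal) for x in sample])

# Under the null hypothesis the transformed sample is uniform on (0, 1)
print(stats.kstest(u, "uniform"))
print(np.allclose(u, stats.norm.cdf(sample), atol=1e-3))   # sanity check against the known CDF
```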

15.
The Levene test is a widely used test for detecting differences in dispersion. The modified Levene transformation using sample medians is considered in this article. After Levene's transformation the data are not normally distributed; hence, nonparametric tests may be useful. As the Wilcoxon rank sum test applied to the transformed data cannot control the type I error rate for asymmetric distributions, a permutation test based on reallocations of the original observations rather than the absolute deviations was investigated. Levene's transformation is then only an intermediate step to compute the test statistic. Such a Levene test, however, cannot control the type I error rate when the Wilcoxon statistic is used; with the Fisher–Pitman permutation test it can be extremely conservative. The Fisher–Pitman test based on reallocations of the transformed data seems to be the only acceptable nonparametric test. Simulation results indicate that this test is on average more powerful than applying the t test after Levene's transformation, even when the t test is improved by the deletion of structural zeros.
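A sketch of the variant the article recommends, as read here: transform each observation to its absolute deviation from its group median, then run a Fisher–Pitman-type permutation test by reallocating the transformed values.

```python
import numpy as np

rng = np.random.default_rng(7)

def levene_permutation_test(a, b, n_perm=10_000, rng=rng):
    """Fisher-Pitman-type permutation test on Levene-transformed data (deviations from medians)."""
    ta = np.abs(a - np.median(a))          # modified Levene transformation with sample medians
    tb = np.abs(b - np.median(b))
    observed = ta.mean() - tb.mean()
    pooled = np.concatenate([ta, tb])
    more_extreme = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        more_extreme += abs(perm[:len(ta)].mean() - perm[len(ta):].mean()) >= abs(observed)
    return observed, (more_extreme + 1) / (n_perm + 1)

# Asymmetric data with different spread
x = rng.exponential(scale=1.0, size=30)
y = rng.exponential(scale=2.0, size=30)
print(levene_permutation_test(x, y))
```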

16.
The maximum likelihood estimator (MLE) in nonlinear panel data models with fixed effects is widely understood (with a few exceptions) to be biased and inconsistent when T, the length of the panel, is small and fixed. However, there is surprisingly little theoretical or empirical evidence on the behavior of the estimator on which to base this conclusion. The received studies have focused almost exclusively on coefficient estimation in two binary choice models, the probit and logit models. In this note, we use Monte Carlo methods to examine the behavior of the MLE of the fixed effects tobit model. We find that the estimator's behavior is quite unlike that of the estimators of the binary choice models. Among our findings are that the location coefficients in the tobit model, unlike those in the probit and logit models, are unaffected by the “incidental parameters problem.” But a surprising result related to the disturbance variance emerges instead: the finite sample bias appears here rather than in the slopes. This has implications for estimation of marginal effects and asymptotic standard errors, which are also examined in this paper. The effects are also examined for the probit and truncated regression models, extending the range of received results in the first of these beyond the widely cited biases in the coefficient estimators.

17.
Potency bioassays are used to measure biological activity. Consequently, potency is considered a critical quality attribute in manufacturing. Relative potency is measured by comparing the concentration‐response curves of a manufactured test batch with that of a reference standard. If the curve shapes are deemed similar, the test batch is said to exhibit constant relative potency with the reference standard, a critical requirement for calibrating the potency of the final drug product. Outliers in bioassay potency data may result in the false acceptance/rejection of a bad/good sample and, if accepted, may yield a biased relative potency estimate. To avoid these issues, the USP<1032> recommends the screening of bioassay data for outliers prior to performing a relative potency analysis. In a recently published work, the effects of one or more outliers, outlier size, and outlier type on similarity testing and estimation of relative potency were thoroughly examined, confirming the USP<1032> outlier guidance. As a follow‐up, several outlier detection methods, including those proposed by the USP<1010>, are evaluated and compared in this work through computer simulation. Two novel outlier detection methods are also proposed. The effects of outlier removal on similarity testing and estimation of relative potency were evaluated, resulting in recommendations for best practice.

18.
For attribute data with (very) small failure rates, control charts are often used which decide whether to stop or to continue each time r failures have occurred, for some r ≥ 1. Because of the small probabilities involved, such charts are very sensitive to estimation effects. This is true in particular if the underlying failure rate varies and hence the distributions involved are not geometric. Such a situation calls for a nonparametric approach, but this may require far more Phase I observations than are typically available in practice. In the present paper it is shown how this obstacle can be effectively overcome by looking not at the sum but rather at the maximum of each group of size r.
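A rough sketch of the max-of-group idea under invented assumptions (varying failure probabilities, an arbitrary 5% empirical limit, and a simplified signalling rule), not the paper's exact chart design:

```python
import numpy as np

rng = np.random.default_rng(8)
r = 5                                         # a decision is taken after every r failures

# Phase I: numbers of items between successive failures, with a varying failure rate,
# so the gap distribution is a geometric mixture rather than a single geometric
phase1 = rng.geometric(rng.uniform(0.001, 0.003, size=200))

# Group the Phase I gaps in blocks of r and keep the MAXIMUM of each block
maxima = phase1[: len(phase1) // r * r].reshape(-1, r).max(axis=1)
lcl = np.quantile(maxima, 0.05)               # empirical lower control limit (5% is arbitrary)

def check_group(gaps, lcl=lcl):
    """Signal a possibly increased failure rate when even the largest gap is small."""
    return "signal" if np.max(gaps) < lcl else "in control"

in_control = rng.geometric(rng.uniform(0.001, 0.003, size=r))
deteriorated = rng.geometric(np.full(r, 0.02))    # failure rate has gone up sharply
print(round(float(lcl)), check_group(in_control), check_group(deteriorated))
```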

19.
We discuss the standard linear regression model with nonspherical disturbances, where some regressors are annihilated by considering only the residuals from an auxiliary regression, and where, analogous to the Frisch-Waugh procedure, the original GLS procedure is applied to the transformed data. We call this procedure pseudo-GLS and give conditions for pseudo-GLS to be equal to genuine GLS. We also show via examples that these conditions are often violated in empirical applications, and that the Frisch-Waugh Theorem still “works” with nonspherical disturbances if efficient estimation is applied to both the original and the transformed data.

20.
Mixed effects models or random effects models are popular for the analysis of longitudinal data. In practice, longitudinal data are often complex since there may be outliers in both the response and the covariates and there may be measurement errors. The likelihood method is a common approach for these problems but it can be computationally very intensive and sometimes may even be computationally infeasible. In this article, we consider approximate robust methods for nonlinear mixed effects models to simultaneously address outliers and measurement errors. The approximate methods are computationally very efficient. We show the consistency and asymptotic normality of the approximate estimates. The methods can also be extended to missing data problems. An example is used to illustrate the methods, and a simulation is conducted to evaluate them.
