Found 20 similar articles; search took 15 ms
1.
Least squares estimation of regression parameters in mixed effects models with unmeasured covariates
We consider mixed effects models for longitudinal, repeated measures or clustered data. Unmeasured or omitted covariates in such models may be correlated with the included covariates, and create model violations when not taken into account. Previous research and experience with longitudinal data sets suggest a general form of model which should be considered when omitted covariates are likely, such as in observational studies. We derive the marginal model between the response variable and included covariates, and consider model fitting using the ordinary and weighted least squares methods, which require simple non-iterative computation and no assumptions on the distribution of random covariates or error terms. Asymptotic properties of the least squares estimators are also discussed. The results shed light on the structure of least squares estimators in mixed effects models, and provide large sample procedures for statistical inference and prediction based on the marginal model. We present an example of the relationship between fluid intake and output in very low birth weight infants, where the model is found to have the assumed structure.
2.
Mark Von Tress 《Communications in Statistics - Theory and Methods》2013,42(12):3523-3536
Recent work by Miller and Landis (1991) discusses generalized variance component models for polytomous responses. This work is adapted to longitudinal models for repeated measures of individuals having polytomous responses. In this setting, individuals are considered to be “clusters”. The resulting simplifications are discussed. First, each response has a multinomial distribution with N=1. Second, observed cluster proportions in the variance component estimates must be replaced by their expectations. This technique accommodates patients with missing data in a sequence of repeated observations.
3.
Neil C. Schwertman Todd Stewart Katherine L. Schenk Kathy S. Stone 《Communications in Statistics - Theory and Methods》2013,42(10):3539-3555
Repeated measures data collected at random observation times are quite common in clinical studies and are often difficult to analyze. A Monte Carlo comparison of four analysis procedures with respect to significance level and power is presented. The basic procedures compared are successive difference analyses and three procedures using the data as summarized in the estimated quadratic polynomial regression coefficients for each subject. These three procedures are (1) Hotelling's T-square, (2) Multivariate Multisample Rank Sum Test (MMRST) and (3) Multivariate Multisample Median Test (MMMT). For the variety of dispersion structures, sample sizes and treatment groups simulated, the MMRST and successive difference analysis were the most satisfactory.
4.
Ante-dependence models can be used to model the covariance structure in problems involving repeated measures through time. They are conditional regression models which generalize Gabriel’s constant-order ante-dependence model. Likelihood-based procedures are presented, together with simple expressions for likelihood ratio test statistics in terms of sums of squares from an appropriate analysis of covariance. The estimation of the orders is approached as a model selection problem, and penalized likelihood criteria are suggested. Extensions of all procedures discussed here to situations with a monotone pattern of missing data are presented.
5.
Bias from the use of generalized estimating equations to analyze incomplete longitudinal binary data
Patient dropout is a common problem in studies that collect repeated binary measurements. Generalized estimating equations (GEE) are often used to analyze such data. The dropout mechanism may be plausibly missing at random (MAR), i.e. unrelated to future measurements given covariates and past measurements. In this case, various authors have recommended weighted GEE with weights based on an assumed dropout model, or an imputation approach, or a doubly robust approach based on weighting and imputation. These approaches provide asymptotically unbiased inference, provided the dropout or imputation model (as appropriate) is correctly specified. Other authors have suggested that, provided the working correlation structure is correctly specified, GEE using an improved estimator of the correlation parameters (‘modified GEE’) show minimal bias. These modified GEE have not been thoroughly examined. In this paper, we study the asymptotic bias under MAR dropout of these modified GEE, the standard GEE, and also GEE using the true correlation. We demonstrate that all three methods are biased in general. The modified GEE may be preferred to the standard GEE and are subject to only minimal bias in many MAR scenarios but in others are substantially biased. Hence, we recommend the modified GEE be used with caution.
6.
《Journal of Statistical Computation and Simulation》2012,82(6):807-824
This paper develops a test for comparing treatment effects when observations are missing at random for repeated measures data on independent subjects. It is assumed that missingness at any occasion follows a Bernoulli distribution. It is shown that the distribution of the vector of linear rank statistics depends on the unknown parameters of the probability law that governs missingness, which is absent in the existing conditional methods employing rank statistics. This dependence is through the variance–covariance matrix of the vector of linear ranks. The test statistic is a quadratic form in the linear rank statistics when the variance–covariance matrix is estimated. The limiting distribution of the test statistic is derived under the null hypothesis. Several methods of estimating the unknown components of the variance–covariance matrix are considered. The estimate that produces stable empirical Type I error rate while maintaining the highest power among the competing tests is recommended for implementation in practice. Simulation studies are also presented to show the advantage of the proposed test over other rank-based tests that do not account for the randomness in the missing data pattern. Our method is shown to have the highest power while also maintaining near-nominal Type I error rates. Our results clearly illustrate that even for an ignorable missingness mechanism, the randomness in the pattern of missingness cannot be ignored. A real data example is presented to highlight the effectiveness of the proposed method.
7.
STEFAN FREMDT JOSEF G. STEINEBACH LAJOS HORVÁTH PIOTR KOKOSZKA 《Scandinavian Journal of Statistics》2013,40(1):138-152
Abstract. We propose a non‐parametric test for the equality of the covariance structures in two functional samples. The test statistic has a chi‐square asymptotic distribution with a known number of degrees of freedom, which depends on the level of dimension reduction needed to represent the data. Detailed analysis of the asymptotic properties is developed. Finite sample performance is examined by a simulation study and an application to egg‐laying curves of fruit flies.
8.
We explore the performance accuracy of the linear and quadratic classifiers for high-dimensional higher-order data, assuming that the class conditional distributions are multivariate normal with locally doubly exchangeable covariance structure. We derive a two-stage procedure for estimating the covariance matrix: at the first stage, Lasso-based structure learning is applied to sparsify the block components within the covariance matrix. At the second stage, the maximum-likelihood estimators of all block-wise parameters are derived assuming the doubly exchangeable within-block covariance structure and a Kronecker product structured mean vector. We also study the effect of the block size on the classification performance in the high-dimensional setting and derive a class of asymptotically equivalent block structure approximations, in the sense that the choice of the block size is asymptotically negligible.
9.
《Journal of Statistical Computation and Simulation》2012,82(2):344-359
Clustered or correlated samples of categorical response data arise frequently in many fields of application. The method of generalized estimating equations (GEEs) introduced in Liang and Zeger [Longitudinal data analysis using generalized linear models, Biometrika 73 (1986), pp. 13–22] is often used to analyse this type of data. GEEs give consistent estimates of the regression parameters and their variance based upon the Pearson residuals. Park et al. [Alternative GEE estimation procedures for discrete longitudinal data, Comput. Stat. Data Anal. 28 (1998), pp. 243–256] considered a modification of the GEE approach using the Anscombe residual and the deviance residual. In this work, we propose to extend this idea to a family of generalized residuals. A wide simulation study is conducted for binary and Poisson correlated outcomes and also two numerical illustrations are presented.
10.
Fangyao Li Christopher M. Triggs Ciprian Doru Giurcăneanu 《Australian & New Zealand Journal of Statistics》2023,65(2):77-100
We discuss the use of the following greedy algorithms in the prediction of multivariate time series: Matching Pursuit Algorithm (MPA), Orthogonal Matching Pursuit (OMP), Relaxed Matching Pursuit (RMP), Frank–Wolfe Algorithm (FWA) and Constrained Matching Pursuit (CMP). The last two are known to be solvers for the lasso problem. Some of the algorithms are well-known (e.g. OMP), while others are less popular (e.g. RMP). We provide a unified presentation of all the algorithms, and evaluate their computational complexity for the high-dimensional case and for the big data case. We show how 12 information theoretic (IT) criteria can be used jointly with the greedy algorithms. As part of this effort, we derive new theoretical results that allow the IT criteria to be modified so that they are compatible with RMP. The prediction capabilities are tested in experiments with two data sets. The first one involves air pollution data measured in Auckland (New Zealand) and the second one concerns the House Price Index in England (the United Kingdom).
11.
We consider a generalized leverage matrix useful for the identification of influential units and observations in linear mixed models and show how a decomposition of this matrix may be employed to identify high leverage points for both the marginal fitted values and the random effect component of the conditional fitted values. We illustrate the different uses of the two components of the decomposition with a simulated example as well as with a real data set.
12.
João Victor B. de Freitas Juvêncio S. Nobre Marcelo Bourguignon Manoel Santos-Neto 《Journal of applied statistics》2022,49(15):3784
In many situations, it is common to have more than one observation per experimental unit, which gives rise to experiments with repeated measures. In the modeling of such experiments, it is necessary to consider and model the intra-unit dependency structure. In the literature, there are several proposals to model positive continuous data with repeated measures. In this paper, we propose one more, based on a generalization of the beta prime regression model. We consider the possibility of dependence between observations of the same unit. Residuals and diagnostic tools are also discussed. To evaluate the finite-sample performance of the estimators, using different correlation matrices and distributions, we conducted a Monte Carlo simulation study. The methodology proposed is illustrated with an analysis of a real data set. Finally, we created a package to make the methodology described in this paper publicly available.
13.
M. Richter 《Statistics》2013,47(2):177-194
Summary: Methods are given for determining the distribution function F(x) of quadratic forms of normally distributed random variables. These methods are compared with results known from the literature. In Theorem 2, a quadratic convolution equation is derived from which F(x) can be computed. Upper and lower bounds for F(x) are also derived. The results for finite quadratic forms are carried over […]. Application examples are given from the statistics of random processes (identification and confidence intervals for the mean function of a Gaussian process) and from nonparametric statistics. Since computing F(x) requires solving a quadratic convolution equation, the last part of the paper discusses ways of solving this equation and gives numerical results.
14.
R. Pincus 《Statistics》2013,47(2):251-255
A procedure for finding exact tests for some hypotheses on variance components in unbalanced models is proposed. It is based on F-distributed statistics obtained by an orthogonal decomposition of the sample space.
15.
Among the diverse frameworks that have been proposed for regression analysis of angular data, the projected multivariate linear model provides a particularly appealing and tractable methodology. In this model, the observed directional responses are assumed to correspond to the angles formed by latent bivariate normal random vectors that are assumed to depend upon covariates through a linear model. This implies an angular normal distribution for the observed angles, and incorporates a regression structure through a familiar and convenient relationship. In this paper we extend this methodology to accommodate clustered data (e.g., longitudinal or repeated measures data) by formulating a marginal version of the model and basing estimation on an EM‐like algorithm in which correlation among within‐cluster responses is taken into account by incorporating a working correlation matrix into the M step. A sandwich estimator is used for the parameter estimates’ covariance matrix. The methodology is motivated and illustrated using an example involving clustered measurements of microfibril angle on loblolly pine (Pinus taeda L.). Simulation studies are presented that evaluate the finite sample properties of the proposed fitting method. In addition, the relationship between within‐cluster correlation on the latent Euclidean vectors and the corresponding correlation structure for the observed angles is explored.
16.
Antonella Plaia 《Journal of applied statistics》2015,42(12):2639-2653
In a long-term experiment, the experimenter usually needs to know whether the effect of a treatment varies over time. But time usually has both a fixed and a random effect on the output, and the difficulty of the analysis depends on the particular design considered and the availability of covariates. As shown in the paper, the presence of covariates can be very useful for modeling the random effect of time. In this paper a model to analyze data from a long-term strip plot design with covariates is proposed. Its effectiveness is tested using both simulated and real data from a crop rotation experiment.
17.
Bruno Delafont Kevin Carroll Claire Vilain Emmanuel Pham 《Pharmaceutical statistics》2018,17(5):515-526
The longitudinal data from 2 published clinical trials in adult subjects with upper limb spasticity (a randomized placebo‐controlled study [NCT01313299] and its long‐term open‐label extension [NCT01313312]) were combined. Their study designs involved repeat intramuscular injections of abobotulinumtoxinA (Dysport®), and efficacy endpoints were collected accordingly. With the objective of characterizing the pattern of response across cycles, Mixed Model Repeated Measures analyses and Non‐Linear Random Coefficient (NLRC) analyses were performed and their results compared. The Mixed Model Repeated Measures analyses, commonly used in the context of repeated measures with missing dependent data, did not involve any parametric shape for the curve of changes over time. Based on clinical expectations, the NLRC included a negative exponential function of the number of treatment cycles, with its asymptote and rate included as random coefficients in the model. Our analysis focused on 2 specific efficacy parameters reflecting complementary aspects of efficacy in the study population. A simulation study based on a similar study design was also performed to further assess the performance of each method under different patterns of response over time. This highlighted a gain of precision with the NLRC model, and most importantly the need for its assumptions to be verified to avoid potentially biased estimates. These analyses describe a typical situation and the conditions under which non‐linear mixed modeling can provide additional insights on the behavior of efficacy parameters over time. Indeed, the resulting estimates from the negative exponential NLRC can help determine the expected maximal effect and the treatment duration required to reach it.
18.
Kai Xu 《Journal of Statistical Computation and Simulation》2017,87(16):3208-3224
Under non-normality, this article is concerned with testing diagonality of a high-dimensional covariance matrix, which is more practical than testing sphericity and identity in the high-dimensional setting. The existing testing procedure for diagonality is not robust against either the data dimension or the data distribution, producing tests with distorted type I error rates much larger than nominal levels. This is mainly due to bias from estimating some functions of the high-dimensional covariance matrix under non-normality. Compared to the sphericity and identity hypotheses, the asymptotic properties under the diagonality hypothesis are more involved, and the bias must be handled with more care. We develop a correction that makes the existing test statistic robust against both the data dimension and the data distribution. We show that the proposed test statistic is asymptotically normal without the normality assumption and without specifying an explicit relationship between the dimension p and the sample size n. Simulations show that it has good size and power for a wide range of settings.
19.
Stuart Lee Dianne Cook Natalia da Silva Ursula Laa Nicholas Spyrison Earo Wang H. Sherry Zhang 《Wiley Interdisciplinary Reviews: Computational Statistics》2022,14(4):e1573
This article discusses a high-dimensional visualization technique called the tour, which can be used to view data in more than three dimensions. We review the theory and history behind the technique, as well as modern software developments and applications of the tour that are being found across the sciences and machine learning. This article is categorized under:
- Statistical and Graphical Methods of Data Analysis > Analysis of High Dimensional Data
- Statistical and Graphical Methods of Data Analysis > Statistical Graphics and Visualization
- Statistical Learning and Exploratory Methods of the Data Sciences > Exploratory Data Analysis
20.