Similar articles
 20 similar articles found (search time: 31 ms)
1.
Linear mixed models based on the normality assumption are widely used in health-related studies. Although the normality assumption leads to simple, mathematically tractable, and powerful tests, violation of the assumption may easily invalidate the statistical inference. Transformation of variables is sometimes used to make normality approximately true. In this paper we consider another approach by replacing the normal distributions in linear mixed models by skew-t distributions, which account for skewness and heavy tails for both the random effects and the errors. The full likelihood-based estimator is often difficult to use, but a 3-step estimation procedure is proposed, followed by an application to the analysis of deglutition apnea duration in normal swallows. The example shows that skew-t models often entail more reliable inference than Gaussian models for the skewed data.

2.
In this paper, we consider a multivariate linear model with complete/incomplete data, where the regression coefficients are subject to a set of linear inequality restrictions. We first develop an expectation/conditional maximization (ECM) algorithm for calculating restricted maximum likelihood estimates of parameters of interest. We then establish the corresponding convergence properties for the proposed ECM algorithm. Applications to growth curve models and linear mixed models are presented. Confidence interval construction via the double-bootstrap method is provided. Some simulation studies are performed and a real example is used to illustrate the proposed methods.

3.
For a general mixed model with two variance components θ1 and θ2, a criterion for a function q1θ1+q2θ2 to admit an unbiased nonnegative definite quadratic estimator is established in a form that allows answering the question of existence of such an estimator more explicitly than with the use of the criteria known hitherto. An application of this result to the case of a random one-way model shows that for many unbalanced models the estimability criterion is expressible directly by the largest of the numbers of observations within levels, thus extending the criterion established by LaMotte (1973) for balanced models.

4.
We introduce scaled density models for binary response data, which can be much more reasonable than the traditional binary response models for particular types of binary response data. We present the maximum-likelihood estimates for the new models, and the models appear to work well with some data sets. We also consider optimum designs for parameter estimation for the models and find that the D- and Ds-optimum designs are independent of the parameters corresponding to the linear function of dose level, but are simple functions of a scale parameter only.

5.
Growth curve models are used to analyze repeated measures data (longitudinal data), which are functions of time. In this paper, some necessary and sufficient conditions are established for a linear function B1YB2 to be the best linear unbiased estimator (BLUE) of estimable functions X1ΘX2 (or K1ΘK2) under the general growth curve model. In addition, the representations of BLUE(K1ΘK2) (or BLUE(X1ΘX2)) are derived when the conditions are satisfied. Two special notions of linear sufficiency with respect to the general growth curve model are given at the end. The findings of this paper enrich some known results in the literature.

6.
Abstract. We propose an ℓ1-penalized estimation procedure for high-dimensional linear mixed-effects models. The models are useful whenever there is a grouping structure among high-dimensional observations, that is, for clustered data. We prove a consistency and an oracle optimality result and we develop an algorithm with provable numerical convergence. Furthermore, we demonstrate the performance of the method on simulated data and on a real high-dimensional data set.

7.
The conceptual predictive statistic, Cp, is a widely used criterion for model selection in linear regression. Cp serves as an estimator of a discrepancy, a measure that reflects the disparity between the generating model and a fitted candidate model. This discrepancy, based on scaled squared error loss, is asymmetric: an alternate measure is obtained by reversing the roles of the two models in the definition of the measure. We propose a variant of the Cp statistic based on estimating a symmetrized version of the discrepancy targeted by Cp. We claim that the resulting criterion provides better protection against overfitting than Cp, since the symmetric discrepancy is more sensitive towards detecting overspecification than its asymmetric counterpart. We illustrate our claim by presenting simulation results. Finally, we demonstrate the practical utility of the new criterion by discussing a modeling application based on data collected in a cardiac rehabilitation program at University of Iowa Hospitals and Clinics.
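For orientation, the classical (asymmetric) Cp statistic referred to in this abstract can be sketched as follows. This is only an illustration of the standard definition Cp = SSE_p / s² − n + 2p, where s² is the residual mean square of the full model; it is not the symmetrized variant the authors propose. The data and all function names are hypothetical.

```python
def sse_intercept_only(y):
    """Residual sum of squares of the intercept-only model."""
    mean_y = sum(y) / len(y)
    return sum((v - mean_y) ** 2 for v in y)


def sse_simple_regression(x, y):
    """Residual sum of squares of the least-squares line y = a + b*x."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((u - mean_x) ** 2 for u in x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((v - intercept - slope * u) ** 2 for u, v in zip(x, y))


def mallows_cp(sse_candidate, p, sse_full, k_full, n):
    """Classical Cp: SSE_p / s^2 - n + 2p, with s^2 = SSE_full / (n - k_full)."""
    s2 = sse_full / (n - k_full)
    return sse_candidate / s2 - n + 2 * p


# Made-up data: a linear trend plus small fixed perturbations.
x = [float(i) for i in range(10)]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.0]
y = [1.0 + 2.0 * u + e for u, e in zip(x, noise)]

n = len(y)
sse_full = sse_simple_regression(x, y)   # full model: intercept + slope (k = 2)
sse_null = sse_intercept_only(y)         # candidate: intercept only (p = 1)

cp_full = mallows_cp(sse_full, 2, sse_full, 2, n)
cp_null = mallows_cp(sse_null, 1, sse_full, 2, n)
```

By construction the full model always has Cp equal to its number of coefficients, so only the candidate's value carries information; here the intercept-only model, which omits a real trend, receives a much larger Cp.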

8.
This paper discusses the contribution of Cerioli et al. (Stat Methods Appl, 2018), where robust monitoring based on high breakdown point estimators is proposed for multivariate data. The results follow years of development in robust diagnostic techniques. We discuss the issues of extending data monitoring to other models with complex structure, e.g. factor analysis, mixed linear models for which S and MM-estimators exist or deviating data cells. We emphasise the importance of robust testing that is often overlooked despite robust tests being readily available once S and MM-estimators have been defined. We mention open questions like out-of-sample inference or big data issues that would benefit from monitoring.

9.
Clustering gene expression time course data is an important problem in bioinformatics because understanding which genes behave similarly can lead to the discovery of important biological information. Statistically, the problem of clustering time course data is a special case of the more general problem of clustering longitudinal data. In this paper, a very general and flexible model-based technique is used to cluster longitudinal data. Mixtures of multivariate t-distributions are utilized, with a linear model for the mean and a modified Cholesky-decomposed covariance structure. Constraints are placed upon the covariance structure, leading to a novel family of mixture models, including parsimonious models. In addition to model-based clustering, these models are also used for model-based classification, i.e., semi-supervised clustering. Parameters, including the component degrees of freedom, are estimated using an expectation-maximization algorithm and two different approaches to model selection are considered. The models are applied to simulated data to illustrate their efficacy; this includes a comparison with their Gaussian analogues (the use of these Gaussian analogues with a linear model for the mean is novel in itself). Our family of multivariate t mixture models is then applied to two real gene expression time course data sets and the results are discussed. We conclude with a summary, suggestions for future work, and a discussion about constraining the degrees of freedom parameter.

10.
11.
Previous work has been carried out on the use of double-sampling schemes for inference from categorical data subject to misclassification. The double-sampling schemes utilize a sample of n units classified by both a fallible and true device and another sample of n2 units classified only by a fallible device. In actual applications, one often has available a third sample of n1 units, which is classified only by the true device. In this article we develop techniques of fitting log-linear models under various misclassification structures for a general triple-sampling scheme. The estimation is by maximum likelihood and the fitted models are hierarchical. The methodology is illustrated by applying it to data in traffic safety research from a study on the effectiveness of belts in reducing injuries.

12.
Although prediction in mixed effects models usually concerns the random effects, in this paper we deal with the problem of prediction of a future, or yet unobserved, response random variable, belonging to a given cluster. In particular, the aim is to define computationally tractable prediction intervals, with conditional and unconditional coverage probability close to the target nominal value. This solution involves the conditional density of the future response random variable given the observed data, or a suitable high-order approximation based on the Laplace method. We prove that, unless the amount of data is very limited, the estimative or naive predictive procedure gives a relatively simple, feasible solution for response prediction. An application to generalized linear mixed models is presented.

13.
The linear chirp process is an important class of time series for which the instantaneous frequency changes linearly in time. Linear chirps have been used extensively to model a variety of physical signals such as radar, sonar, and whale clicks (see 1, 5 and 6). We introduce the stochastic linear chirp model and then define the generalized linear chirp (GLC) process as a special case of the G-stationary process studied by Jiang et al. (2006) to model data with time-varying frequencies. We then define GLC(p,q) processes and show that the relationship between stochastic linear chirp processes and GLC(p,q) processes is analogous to that between harmonic and ARMA models. The new methods are then applied to both simulated and actual data sets.
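To make "instantaneous frequency changes linearly in time" concrete, here is a minimal sketch of a deterministic linear chirp, cos(2π(f0·t + k·t²/2)), whose instantaneous frequency is f0 + k·t. It does not implement the stochastic linear chirp or GLC(p,q) processes of the abstract; the parameter names `f0` and `k` and the numerical values are assumptions chosen for illustration.

```python
import math


def chirp_phase(t, f0, k):
    """Phase in radians of a linear chirp: 2*pi*(f0*t + 0.5*k*t^2)."""
    return 2.0 * math.pi * (f0 * t + 0.5 * k * t * t)


def chirp_sample(t, f0, k):
    """Value of the chirp signal at time t."""
    return math.cos(chirp_phase(t, f0, k))


def instantaneous_frequency(t, f0, k):
    """Derivative of the phase divided by 2*pi: a straight line in t."""
    return f0 + k * t


# Hypothetical parameters: start at 5 Hz, sweep upward at 2 Hz per second.
f0, k = 5.0, 2.0
t, dt = 1.5, 1e-6

# Check the closed form against a central-difference derivative of the phase.
numeric_freq = (chirp_phase(t + dt, f0, k) - chirp_phase(t - dt, f0, k)) / (2 * dt)
numeric_freq /= 2.0 * math.pi

# A short sampled segment of the signal (1 kHz sampling, one second).
samples = [chirp_sample(i / 1000.0, f0, k) for i in range(1000)]
```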

14.
In multiple linear regression analysis each lower-dimensional subspace L of a known linear subspace M of ℝ^n corresponds to a nonempty subset of the columns of the regressor matrix. For a fixed subspace L, the Cp statistic is an unbiased estimator of the mean square error if the projection of the response vector onto L is used to estimate the expected response. In this article, we consider two truncated versions of the Cp statistic that can also be used to estimate this mean square error. The Cp statistic and its truncated versions are compared in two example data sets, illustrating that use of the truncated versions may result in models different from those selected by standard Cp.

15.
In this paper, a multivariate form of the truncated generalized Cauchy distribution (TGCD), denoted MVTGCD, is introduced. The joint density function, conditional density function, moment generating function and mixed moments of order ${b=\sum_{i=1}^{k}b_{i}}$ are obtained. Making use of the mixed moments formula, skewness and kurtosis in the bivariate case are obtained. Also, all parameters of the distribution are estimated using the maximum likelihood and Bayes methods. A real data set is introduced and analyzed using three models. The first model is the bivariate Cauchy distribution, the second is the truncated bivariate Cauchy distribution and the third is the bivariate truncated generalized Cauchy distribution. A comparison is carried out between the mentioned models based on the corresponding Kolmogorov–Smirnov (K–S) test statistic to emphasize that the bivariate truncated generalized Cauchy model fits the data better than the other models.

16.
Linear mixed models were developed to handle clustered data and have been a topic of increasing interest in statistics for the past 50 years. Generally, the normality (or symmetry) of the random effects is a common assumption in linear mixed models but it may, sometimes, be unrealistic, obscuring important features of among-subjects variation. In this article, we utilize skew-normal/independent distributions as a tool for robust modeling of linear mixed models under a Bayesian paradigm. The skew-normal/independent distributions form an attractive class of asymmetric heavy-tailed distributions that includes the skew-normal, skew-t, skew-slash and skew-contaminated normal distributions as special cases, providing an appealing robust alternative to the routine use of symmetric distributions in this type of model. The methods developed are illustrated using a real data set from the Framingham cholesterol study.

17.
We consider the Gauss-Markoff model (Y, X₀β, σ²V) and provide solutions to the following problem: What is the class of all models (Y, Xβ, σ²V) such that a specific linear representation/some linear representation/every linear representation of the BLUE of every estimable parametric functional p'β under (Y, X₀β, σ²V) is (a) an unbiased estimator, (b) a BLUE, (c) a linear minimum bias estimator and (d) best linear minimum bias estimator of p'β under (Y, Xβ, σ²V)? We also analyse the above problems, when attention is restricted to a subclass of estimable parametric functionals.

18.
Qingguo Tang, Statistics, 2013, 47(5): 389-404
The varying coefficient model is a useful extension of linear models and has many advantages in practical use. To estimate the unknown functions in the model, kernel-type local linear least-squares (L2) estimation methods have been proposed by several authors. When the data contain outliers or come from a population with a heavy-tailed distribution, L1-estimation should yield better estimators. In this article, we present the local linear L1-estimation method and derive the asymptotic distributions of the L1-estimators. The simulation results for two examples, with outliers and heavy-tailed distribution, respectively, show that the L1-estimators outperform the L2-estimators.

19.
The use of parametric linear mixed models and generalized linear mixed models to analyze longitudinal data collected during randomized controlled trials (RCTs) is conventional. The application of these methods, however, is restricted due to various assumptions required by these models. When the number of observations per subject is sufficiently large, and individual trajectories are noisy, functional data analysis (FDA) methods serve as an alternative to parametric longitudinal data analysis techniques. However, the use of FDA in RCTs is rare. In this paper, the effectiveness of FDA and linear mixed models (LMMs) was compared by analyzing data from rural persons living with HIV and comorbid depression enrolled in a depression treatment randomized clinical trial. Interactive voice response systems were used for weekly administrations of the 10-item Self-Administered Depression Scale (SADS) over 41 weeks. Functional principal component analysis and functional regression analysis methods detected a statistically significant difference in SADS between telephone-administered interpersonal psychotherapy (tele-IPT) and controls but linear mixed effects model results did not. Additional simulation studies were conducted to compare FDA and LMMs under a different nonlinear trajectory assumption. In this clinical trial with sufficient per subject measured outcomes and individual trajectories that are noisy and nonlinear, we found FDA methods to be a better alternative to LMMs.

20.
The Cash statistic, also known as the C statistic, is commonly used for the analysis of low-count Poisson data, including data with null counts for certain values of the independent variable. The use of this statistic is especially attractive for low-count data that cannot be combined, or re-binned, without loss of resolution. This paper presents a new maximum-likelihood solution for the best-fit parameters of a linear model using the Poisson-based Cash statistic. The solution presented in this paper provides a new and simple method to measure the best-fit parameters of a linear model for any Poisson-based data, including data with null counts. In particular, the method enforces the requirement that the best-fit linear model be non-negative throughout the support of the independent variable. The method is summarized in a simple algorithm to fit Poisson counting data of any size and counting rate with a linear model, by-passing entirely the use of the traditional χ² statistic.
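As a rough illustration of the quantity this abstract is built around, the sketch below evaluates the Cash statistic in its commonly used form, C = 2 Σ [m_i − n_i + n_i ln(n_i / m_i)], for observed counts n_i and model values m_i, taking the logarithmic term as zero when n_i = 0. It only evaluates C for given line parameters; it is not the maximum-likelihood solution described in the paper, and the data, parameter values and function names are hypothetical.

```python
import math


def cash_statistic(counts, model):
    """C = 2 * sum(m - n + n*ln(n/m)); the log term is 0 for null counts."""
    total = 0.0
    for n, m in zip(counts, model):
        # The model must be non-negative, and strictly positive where n > 0.
        if m < 0 or (m == 0 and n > 0):
            raise ValueError("model value incompatible with Poisson counts")
        total += m - n
        if n > 0:
            total += n * math.log(n / m)
    return 2.0 * total


def linear_model(xs, a, b):
    """Expected counts a + b*x at each value of the independent variable."""
    return [a + b * x for x in xs]


# Made-up low-count data that include null counts.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
counts = [0, 1, 0, 2, 3]

# Compare a rising line with a flat line; the smaller C is the better fit.
c_rising = cash_statistic(counts, linear_model(xs, 0.2, 0.6))
c_flat = cash_statistic(counts, linear_model(xs, 3.0, 0.0))
```

Null counts need no special binning here: each contributes 2·m_i to C, which is why the statistic remains usable for sparse data as long as the model stays non-negative.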


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号