
1.

Three-step estimation in linear mixed models with skew-t distributions

Tianyue Zhou Xuming He 《Journal of statistical planning and inference》2008

Linear mixed models based on the normality assumption are widely used in health related studies. Although the normality assumption leads to simple, mathematically tractable, and powerful tests, violation of the assumption may easily invalidate the statistical inference. Transformation of variables is sometimes used to make normality approximately true. In this paper we consider another approach by replacing the normal distributions in linear mixed models by skew-t distributions, which account for skewness and heavy tails for both the random effects and the errors. The full likelihood-based estimator is often difficult to use, but a 3-step estimation procedure is proposed, followed by an application to the analysis of deglutition apnea duration in normal swallows. The example shows that skew-t models often entail more reliable inference than Gaussian models for the skewed data.

2.

Likelihood-based approaches for multivariate linear models under inequality constraints for incomplete data

Shurong Zheng Jianhua Guo Ning-Zhong Shi Guo-Liang Tian 《Journal of statistical planning and inference》2012

In this paper, we consider a multivariate linear model with complete/incomplete data, where the regression coefficients are subject to a set of linear inequality restrictions. We first develop an expectation/conditional maximization (ECM) algorithm for calculating restricted maximum likelihood estimates of parameters of interest. We then establish the corresponding convergence properties for the proposed ECM algorithm. Applications to growth curve models and linear mixed models are presented. Confidence interval construction via the double-bootstrap method is provided. Some simulation studies are performed and a real example is used to illustrate the proposed methods.

3.

Nonnegative unbiased estimability of linear combinations of two variance components

Jerzy K. Baksalary Anna Molińska 《Journal of statistical planning and inference》1984,10(1):1-8

For a general mixed model with two variance components θ₁ and θ₂, a criterion for a function q₁θ₁+q₂θ₂ to admit an unbiased nonnegative definite quadratic estimator is established in a form that allows answering the question of existence of such an estimator more explicitly than with the use of the criteria known hitherto. An application of this result to the case of a random one-way model shows that for many unbalanced models the estimability criterion is expressible directly by the largest of the numbers of observations within levels, thus extending the criterion established by LaMotte (1973) for balanced models.

4.

Scaled density models for binary response data and their D- and Ds-optimal designs

《Journal of statistical planning and inference》2005,128(2):649-660

We introduce scaled density models for binary response data, which can be much more reasonable than the traditional binary response models for particular types of binary response data. We show the maximum-likelihood estimates for the new models, and the models appear to work well with some sets of data. We also consider optimum designs for parameter estimation for the models and find that the D- and D_s-optimum designs are independent of the parameters corresponding to the linear function of dose level; the optimum designs are simple functions of a scale parameter only.

5.

On the best linear unbiased estimator and the linear sufficiency of a general growth curve model

Guang-Jing Song 《Journal of statistical planning and inference》2011,141(8):2700-2710

Growth curve models are used to analyze repeated measures data (longitudinal data), which are functions of time. In this paper, some necessary and sufficient conditions for a linear function B₁YB₂ to be the best linear unbiased estimator (BLUE) of estimable functions X₁ΘX₂ (or K₁ΘK₂) under the general growth curve model are established. In addition, the representations of BLUE(K₁ΘK₂) (or BLUE(X₁ΘX₂)) are derived when the conditions are satisfied. Two special notions of linear sufficiency with respect to the general growth curve model are given at the end. The findings of this paper enrich some known results in the literature.

6.

Estimation for High‐Dimensional Linear Mixed‐Effects Models Using ℓ1‐Penalization

JÜRG SCHELLDORFER PETER BÜHLMANN SARA VAN DE GEER 《Scandinavian Journal of Statistics》2011,38(2):197-214

Abstract. We propose an ℓ₁‐penalized estimation procedure for high‐dimensional linear mixed‐effects models. The models are useful whenever there is a grouping structure among high‐dimensional observations, that is, for clustered data. We prove a consistency and an oracle optimality result and we develop an algorithm with provable numerical convergence. Furthermore, we demonstrate the performance of the method on simulated and a real high‐dimensional data set.

7.

An alternate version of the conceptual predictive statistic based on a symmetrized discrepancy measure

Joseph E. Cavanaugh Andrew A. Neath Simon L. Davies 《Journal of statistical planning and inference》2010

The conceptual predictive statistic, C_p, is a widely used criterion for model selection in linear regression. C_p serves as an estimator of a discrepancy, a measure that reflects the disparity between the generating model and a fitted candidate model. This discrepancy, based on scaled squared error loss, is asymmetric: an alternate measure is obtained by reversing the roles of the two models in the definition of the measure. We propose a variant of the C_p statistic based on estimating a symmetrized version of the discrepancy targeted by C_p. We claim that the resulting criterion provides better protection against overfitting than C_p, since the symmetric discrepancy is more sensitive towards detecting overspecification than its asymmetric counterpart. We illustrate our claim by presenting simulation results. Finally, we demonstrate the practical utility of the new criterion by discussing a modeling application based on data collected in a cardiac rehabilitation program at University of Iowa Hospitals and Clinics.
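
For orientation, the classical Mallows form of the statistic can be sketched as C_p = SSE_p / σ̂² − n + 2p, where σ̂² is the error-variance estimate from the largest model. The snippet below is only a minimal illustration of that standard form, not the symmetrized variant proposed in the paper; the function name `mallows_cp` and all numbers are hypothetical.

```python
def mallows_cp(sse_candidate, sigma2_hat, n_obs, n_params):
    """Classical Mallows C_p = SSE_p / sigma2_hat - n + 2p.

    sse_candidate: residual sum of squares of the candidate model
    sigma2_hat:    error-variance estimate, usually from the full model
    n_obs:         number of observations n
    n_params:      number of regression coefficients p in the candidate
    """
    if sigma2_hat <= 0:
        raise ValueError("sigma2_hat must be positive")
    return sse_candidate / sigma2_hat - n_obs + 2 * n_params


# Hypothetical numbers, chosen only to show the arithmetic.
n_obs = 50
sse_full, p_full = 90.0, 5
sigma2_hat = sse_full / (n_obs - p_full)  # full-model mean squared error

cp_full = mallows_cp(sse_full, sigma2_hat, n_obs, p_full)
cp_small = mallows_cp(120.0, sigma2_hat, n_obs, 3)
print(cp_full, cp_small)
```

By construction the full model always scores C_p equal to its own parameter count, so candidates are usually judged by how close C_p is to p.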

8.

Discussion of “The power of monitoring: how to make the most of a contaminated multivariate sample” by Andrea Cerioli, Marco Riani, Anthony C. Atkinson and Aldo Corbellini

Valentin Todorov 《Statistical Methods and Applications》2018,27(4):595-602

This paper discusses the contribution of Cerioli et al. (Stat Methods Appl, 2018), where robust monitoring based on high breakdown point estimators is proposed for multivariate data. The results follow years of development in robust diagnostic techniques. We discuss the issues of extending data monitoring to other models with complex structure, e.g. factor analysis, mixed linear models for which S and MM-estimators exist or deviating data cells. We emphasise the importance of robust testing that is often overlooked despite robust tests being readily available once S and MM-estimators have been defined. We mention open questions like out-of-sample inference or big data issues that would benefit from monitoring.

9.

Clustering gene expression time course data using mixtures of multivariate t-distributions

Paul D. McNicholas Sanjeena Subedi 《Journal of statistical planning and inference》2012,142(5):1114-1127

Clustering gene expression time course data is an important problem in bioinformatics because understanding which genes behave similarly can lead to the discovery of important biological information. Statistically, the problem of clustering time course data is a special case of the more general problem of clustering longitudinal data. In this paper, a very general and flexible model-based technique is used to cluster longitudinal data. Mixtures of multivariate t-distributions are utilized, with a linear model for the mean and a modified Cholesky-decomposed covariance structure. Constraints are placed upon the covariance structure, leading to a novel family of mixture models, including parsimonious models. In addition to model-based clustering, these models are also used for model-based classification, i.e., semi-supervised clustering. Parameters, including the component degrees of freedom, are estimated using an expectation-maximization algorithm and two different approaches to model selection are considered. The models are applied to simulated data to illustrate their efficacy; this includes a comparison with their Gaussian analogues—the use of these Gaussian analogues with a linear model for the mean is novel in itself. Our family of multivariate t mixture models is then applied to two real gene expression time course data sets and the results are discussed. We conclude with a summary, suggestions for future work, and a discussion about constraining the degrees of freedom parameter.

10.

Equality of BLUEs or BLUPs under two linear models using stochastic restrictions

Stephen J. Haslett Simo Puntanen 《Statistical Papers》2010,51(2):465-475


11.

Analysis of multivariate categorical data with misclassification errors by triple sampling schemes

T. Timothy Chen Yosef Hochberg Aaron Tenenbein 《Journal of statistical planning and inference》1984,9(2):177-184

Previous work has been carried out on the use of double-sampling schemes for inference from categorical data subject to misclassification. The double-sampling schemes utilize a sample of n units classified by both a fallible and true device and another sample of n₂ units classified only by a fallible device. In actual applications, one often has available a third sample of n₁ units, which is classified only by the true device. In this article we develop techniques of fitting log-linear models under various misclassification structures for a general triple-sampling scheme. The estimation is by maximum likelihood and the fitted models are hierarchical. The methodology is illustrated by applying it to data in traffic safety research from a study on the effectiveness of belts in reducing injuries.

12.

Response prediction in mixed effects models

《Journal of statistical planning and inference》2006,136(11):3948-3966

Although prediction in mixed effects models usually concerns the random effects, in this paper we deal with the problem of prediction of a future, or yet unobserved, response random variable, belonging to a given cluster. In particular, the aim is to define computationally tractable prediction intervals, with conditional and unconditional coverage probability close to the target nominal value. This solution involves the conditional density of the future response random variable given the observed data, or a suitable high-order approximation based on the Laplace method. We prove that, unless the amount of data is very limited, the estimative or naive predictive procedure gives a relatively simple, feasible solution for response prediction. An application to generalized linear mixed models is presented.

13.

The generalized linear chirp process

Stephen D. Robertson Henry L. Gray Wayne A. Woodward 《Journal of statistical planning and inference》2010

The linear chirp process is an important class of time series for which the instantaneous frequency changes linearly in time. Linear chirps have been used extensively to model a variety of physical signals such as radar, sonar, and whale clicks (see 1, 5 and 6). We introduce the stochastic linear chirp model and then define the generalized linear chirp (GLC) process as a special case of the G-stationary process studied by Jiang et al. (2006) to model data with time-varying frequencies. We then define GLC(p,q) processes and show that the relationship between stochastic linear chirp processes and GLC(p,q) processes is analogous to that between harmonic and ARMA models. The new methods are then applied to both simulated and actual data sets.

14.

Using a Truncated C_p Statistic for Variable Selection in Multiple Linear Regression

D. W. Uys S. J. Steel 《Communications in Statistics - Simulation and Computation》2013,42(2):420-432

In multiple linear regression analysis each lower-dimensional subspace L of a known linear subspace M of ℝⁿ corresponds to a nonempty subset of the columns of the regressor matrix. For a fixed subspace L, the C_p statistic is an unbiased estimator of the mean square error if the projection of the response vector onto L is used to estimate the expected response. In this article, we consider two truncated versions of the C_p statistic that can also be used to estimate this mean square error. The C_p statistic and its truncated versions are compared in two example data sets, illustrating that use of the truncated versions may result in models different from those selected by standard C_p.

15.

On multivariate truncated generalized Cauchy distribution

Saieed F. Ateya Elham A. Madhagi 《Statistical Papers》2013,54(3):879-897

In this paper, a multivariate form of the truncated generalized Cauchy distribution (TGCD), denoted by MVTGCD, is introduced. The joint density function, conditional density function, moment generating function and mixed moments of order $b=\sum_{i=1}^{k}b_{i}$ are obtained. Making use of the mixed moments formula, skewness and kurtosis in the bivariate case are obtained. Also, all parameters of the distribution are estimated using the maximum likelihood and Bayes methods. A real data set is introduced and analyzed using three models. The first model is the bivariate Cauchy distribution, the second is the truncated bivariate Cauchy distribution and the third is the bivariate truncated generalized Cauchy distribution. A comparison is carried out between the mentioned models based on the corresponding Kolmogorov–Smirnov (K–S) test statistic to emphasize that the bivariate truncated generalized Cauchy model fits the data better than the other models.

16.

Robust linear mixed models with skew-normal independent distributions from a Bayesian perspective

Victor H. Lachos Dipak K. Dey Vicente G. Cancho 《Journal of statistical planning and inference》2009,139(12):4098-4110

Linear mixed models were developed to handle clustered data and have been a topic of increasing interest in statistics for the past 50 years. Generally, the normality (or symmetry) of the random effects is a common assumption in linear mixed models but it may, sometimes, be unrealistic, obscuring important features of among-subjects variation. In this article, we utilize skew-normal/independent distributions as a tool for robust modeling of linear mixed models under a Bayesian paradigm. The skew-normal/independent distributions are an attractive class of asymmetric heavy-tailed distributions that includes the skew-normal, skew-t, skew-slash and skew-contaminated normal distributions as special cases, providing an appealing robust alternative to the routine use of symmetric distributions in this type of model. The methods developed are illustrated using a real data set from the Framingham cholesterol study.

17.

Optimality of BLUE's in a general linear model with incorrect design matrix

Thomas Mathew P. Bhimasankaram 《Journal of statistical planning and inference》1983,8(3):315-329

We consider the Gauss-Markoff model (Y,X₀β,σ²V) and provide solutions to the following problem: What is the class of all models (Y,Xβ,σ²V) such that a specific linear representation/some linear representation/every linear representation of the BLUE of every estimable parametric functional p'β under (Y,X₀β,σ²V) is (a) an unbiased estimator, (b) a BLUE, (c) a linear minimum bias estimator and (d) best linear minimum bias estimator of p'β under (Y,Xβ,σ²V)? We also analyse the above problems, when attention is restricted to a subclass of estimable parametric functionals.

18.

L₁-estimation for varying coefficient models

Qingguo Tang 《Statistics》2013,47(5):389-404

The varying coefficient model is a useful extension of linear models and has many advantages in practical use. To estimate the unknown functions in the model, the kernel type with local linear least-squares (L₂) estimation methods has been proposed by several authors. When the data contain outliers or come from a population with heavy-tailed distributions, L₁-estimation should yield better estimators. In this article, we present the local linear L₁-estimation method and derive the asymptotic distributions of the L₁-estimators. The simulation results for two examples, with outliers and heavy-tailed distribution, respectively, show that the L₁-estimators outperform the L₂-estimators.

19.

Applying functional data analysis to assess tele-interpersonal psychotherapy's efficacy to reduce depression

Henok Woldu Timothy G. Heckman Andreas Handel 《Journal of applied statistics》2019,46(2):203-216

The use of parametric linear mixed models and generalized linear mixed models to analyze longitudinal data collected during randomized control trials (RCT) is conventional. The application of these methods, however, is restricted due to various assumptions required by these models. When the number of observations per subject is sufficiently large, and individual trajectories are noisy, functional data analysis (FDA) methods serve as an alternative to parametric longitudinal data analysis techniques. However, the use of FDA in RCTs is rare. In this paper, the effectiveness of FDA and linear mixed models (LMMs) was compared by analyzing data from rural persons living with HIV and comorbid depression enrolled in a depression treatment randomized clinical trial. Interactive voice response systems were used for weekly administrations of the 10-item Self-Administered Depression Scale (SADS) over 41 weeks. Functional principal component analysis and functional regression analysis methods detected a statistically significant difference in SADS between telephone-administered interpersonal psychotherapy (tele-IPT) and controls but linear mixed effects model results did not. Additional simulation studies were conducted to compare FDA and LMMs under a different nonlinear trajectory assumption. In this clinical trial with sufficient per subject measured outcomes and individual trajectories that are noisy and nonlinear, we found FDA methods to be a better alternative to LMMs.

20.

A semi-analytical solution to the maximum-likelihood fit of Poisson data to a linear model using the Cash statistic

Massimiliano Bonamente David Spence 《Journal of applied statistics》2022,49(3):522

The Cash statistic, also known as the C statistic, is commonly used for the analysis of low-count Poisson data, including data with null counts for certain values of the independent variable. The use of this statistic is especially attractive for low-count data that cannot be combined, or re-binned, without loss of resolution. This paper presents a new maximum-likelihood solution for the best-fit parameters of a linear model using the Poisson-based Cash statistic. The solution presented in this paper provides a new and simple method to measure the best-fit parameters of a linear model for any Poisson-based data, including data with null counts. In particular, the method enforces the requirement that the best-fit linear model be non-negative throughout the support of the independent variable. The method is summarized in a simple algorithm to fit Poisson counting data of any size and counting rate with a linear model, by-passing entirely the use of the traditional χ² statistic.
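
As a rough illustration of the quantity being minimized, the Cash statistic in its commonly used form is C = 2 Σᵢ [mᵢ − nᵢ + nᵢ ln(nᵢ/mᵢ)], where nᵢ are the observed counts and mᵢ the model-predicted counts, with the logarithmic term taken as zero when nᵢ = 0. The sketch below only evaluates this statistic for a linear model a + b·x; it does not reproduce the paper's semi-analytical solution, and the names `cash_statistic` and `linear_model` as well as the data are hypothetical.

```python
import math


def cash_statistic(counts, model):
    """Evaluate C = 2 * sum(m_i - n_i + n_i * ln(n_i / m_i)).

    Bins with zero observed counts contribute only m_i, which is why
    the statistic remains usable for data with null counts.
    """
    total = 0.0
    for n, m in zip(counts, model):
        if m <= 0:
            # The model must stay positive wherever it is evaluated.
            raise ValueError("model must be positive in every bin")
        total += m - n
        if n > 0:
            total += n * math.log(n / m)
    return 2.0 * total


def linear_model(xs, a, b):
    """Predicted counts for the linear model m(x) = a + b * x."""
    return [a + b * x for x in xs]


# Hypothetical low-count data, including a bin with zero counts.
xs = [0.0, 1.0, 2.0, 3.0]
counts = [0, 1, 3, 2]
c_value = cash_statistic(counts, linear_model(xs, a=0.5, b=0.7))
print(c_value)
```

The explicit positivity check mirrors the abstract's requirement that the fitted linear model be non-negative over the support of the independent variable; a fitting routine would minimize `cash_statistic` over `a` and `b` subject to that constraint.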