
1.

A spatial structural equation model for health outcomes

Peter Congdon 《Journal of statistical planning and inference》2008

Population level risk factors in spatial epidemiology (e.g. socioeconomic deprivation) are often not directly available but indirectly measured through census or other sources. This paper considers multiple health outcomes (e.g. mortality, hospital admissions) in relation to unmeasured latent constructs of population morbidity, established as relevant to explaining spatial contrasts in such health outcomes. The constructs are derived using a factor analytic approach in which observed area social indicators are measures of a smaller set of latent constructs. The constructs are allowed to be spatially correlated as well as correlated with one another. The possibility of nonlinear construct effects is considered using a spline regression. A case study considers suicide mortality and self-harm contrasts in 32 London boroughs, in relation to two latent constructs: area deprivation and social fragmentation.

2.

A spatial structural equation model with an application to area health needs

《Journal of Statistical Computation and Simulation》2012,82(4):401-412

Indices of population ‘health need’ are often used to distribute health resources or assess equity in service provision. This article describes a spatial structural equation model incorporating multiple indicators of need and multiple population health risks that affect need (analogous to multiple indicators–multiple causes models). More specifically, the multiple indicator component of the model involves health outcomes such as hospital admissions or mortality, whereas the multiple risk component models the impact on need of area social and demographic indicators, which proxy population-level risk factors for different diseases. The latent need construct is allowed (under a Bayesian approach) to be spatially correlated, though the prior assumed for need allows a mix of spatially structured and unstructured influences. A case study considers variations in need for coronary heart disease (CHD) care over 625 small areas in London, using recent mortality and hospitalization data (the ‘indicators’) and measures of general ill-health, income and unemployment, which proxy variations in population risk for CHD.

3.

Structural inference of the parameters of the heteroscedastic simultaneous equation model

M. Safiul Haq Shahjahan Khan 《Communications in Statistics - Theory and Methods》2013,42(12):4713-4729

The structural approach of inference for the parameters of a simultaneous equation model with heteroscedastic error variance is investigated in this paper. The joint and the marginal structural distributions for the coefficients of the exogenous variables and the scale parameters of the error variables, and the marginal likelihood function of the coefficients of the endogenous variables, have been derived. The estimates are directly obtainable from the structural distribution and the marginal likelihood function of the parameters. The marginal distribution of a subset of coefficients of exogenous variables provides the basis for making inference for a particular subset of parameters of interest.

4.

Structural identification and variable selection in high-dimensional varying-coefficient models

Yuping Chen Wingkam Fung 《Journal of nonparametric statistics》2017,29(2):258-279

Varying-coefficient models have been widely used to investigate the possible time-dependent effects of covariates when the response variable follows a normal distribution. Much progress has been made for inference and variable selection in the framework of such models. However, the identification of model structure, that is, how to identify which covariates have time-varying effects and which have fixed effects, remains a challenging and unsolved problem, especially when the dimension of covariates is much larger than the sample size. In this article, we consider the structural identification and variable selection problems in varying-coefficient models for high-dimensional data. Using a modified basis expansion approach and group variable selection methods, we propose a unified procedure to simultaneously identify the model structure, select important variables and estimate the coefficient curves. The unique feature of the proposed approach is that we do not have to specify the model structure in advance; it is therefore more realistic and appropriate for real data analysis. Asymptotic properties of the proposed estimators have been derived under regular conditions. Furthermore, we evaluate the finite sample performance of the proposed methods with Monte Carlo simulation studies and a real data analysis.

5.

An example of a two-part latent growth curve model for semicontinuous outcomes in the health sciences

Sterling McPherson Celestina Barbosa-Leiker 《Journal of applied statistics》2012,39(10):2113-2128

A new method of modeling coronary artery calcium (CAC) is needed in order to properly understand the probability of onset and growth of CAC. CAC remains a controversial indicator of cardiovascular disease (CVD) risk, but this may be due to ill-equipped methods of specifying CAC during the analysis phase of studies in which CAC is the primary outcome. The modern method of two-part latent growth modeling may represent a strong alternative to the myriad of existing methods for modeling CAC. We provide a brief overview of existing methods of analysis used for CAC before introducing the general latent growth curve model, how it extends into a two-part (semicontinuous) growth model, and how the ubiquitous problem of missing data can be effectively handled. We then present an example of how to model CAC using this framework. We demonstrate that utilizing this type of modeling strategy can result in traditional predictors of CAC (e.g. age, gender, and high-density lipoprotein cholesterol) exerting a different impact on the two different, yet simultaneous, operationalizations of CAC. This method of analyzing CAC could inform future analyses of CAC and inform subsequent discussions about the nature of its potential to inform long-term CVD risk and heart events.

6.

Threshold knot selection for large-scale spatial models with applications to the Deepwater Horizon disaster

Casey M. Jelsema Richard K. Kwok Shyamal D. Peddada 《Journal of Statistical Computation and Simulation》2019,89(11):2121-2137

Large spatial datasets are typically modelled through a small set of knot locations; often these locations are specified by the investigator by arbitrary criteria. Existing methods of estimating the locations of knots assume their number is known a priori, or are otherwise computationally intensive. We develop a computationally efficient method of estimating both the location and number of knots for spatial mixed effects models. Our proposed algorithm, Threshold Knot Selection (TKS), estimates knot locations by identifying clusters of large residuals and placing a knot at the centroid of each cluster. We conduct a simulation study comparing TKS with several comparable methods of estimating knot locations. Our case study utilizes data of particulate matter concentrations collected during the course of the response and clean-up effort from the 2010 Deepwater Horizon oil spill in the Gulf of Mexico.
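
The idea stated in this abstract (threshold the residuals, cluster the large ones, put a knot at each cluster centroid) can be illustrated with a minimal Python sketch. It is not the authors' TKS algorithm: the function name `threshold_knots`, the fixed `threshold`, and the distance-based linking rule with a `radius` parameter are all assumptions made for illustration only.

```python
from math import dist  # Euclidean distance, Python 3.8+


def threshold_knots(coords, residuals, threshold, radius):
    """Illustrative only: return centroids of clusters of large residuals.

    coords    -- list of (x, y) locations
    residuals -- residual at each location from some preliminary fit
    threshold -- keep points with |residual| > threshold (assumed rule)
    radius    -- points closer than this are linked into one cluster
                 (assumed clustering rule, not taken from the paper)
    """
    big = [c for c, r in zip(coords, residuals) if abs(r) > threshold]

    # Group the retained points into connected components under `radius`.
    unvisited = set(range(len(big)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        members, frontier = [seed], [seed]
        while frontier:
            k = frontier.pop()
            near = [m for m in unvisited if dist(big[k], big[m]) <= radius]
            for m in near:
                unvisited.remove(m)
            members.extend(near)
            frontier.extend(near)
        clusters.append(members)

    # One knot per cluster, placed at the cluster centroid.
    knots = []
    for members in clusters:
        xs = [big[m][0] for m in members]
        ys = [big[m][1] for m in members]
        knots.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return sorted(knots)


coords = [(0, 0), (0, 1), (10, 10), (10, 11), (5, 5)]
residuals = [3.0, -3.0, 4.0, 4.0, 0.1]
knots = threshold_knots(coords, residuals, threshold=2.0, radius=1.5)
```

In this toy input the number of knots is not fixed in advance; it falls out of how many clusters of large residuals are found.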

7.

Spatial Mallows model averaging for geostatistical models

Jun Liao Guohua Zou Yan Gao 《Revue canadienne de statistique》2019,47(3):336-351

Important progress has been made with model averaging methods over the past decades. For spatial data, however, the idea of model averaging has not been applied well. This article studies model averaging methods for the spatial geostatistical linear model. A spatial Mallows criterion is developed to choose weights for the model averaging estimator. The resulting estimator can achieve asymptotic optimality in terms of L₂ loss. Simulation experiments reveal that our proposed estimator is superior to the model averaging estimator by the Mallows criterion developed for ordinary linear models [Hansen, 2007] and the model selection estimator using the corrected Akaike's information criterion, developed for geostatistical linear models [Hoeting et al., 2006]. The Canadian Journal of Statistics 47: 336–351; 2019 © 2019 Statistical Society of Canada

8.

Spatial generalized linear mixed models in small area estimation

Mahmoud Torabi 《Revue canadienne de statistique》2019,47(3):426-437

In survey sampling, policy decisions regarding the allocation of resources to sub‐groups of a population depend on reliable predictors of their underlying parameters. However, in some sub‐groups, called small areas due to small sample sizes relative to the population, the information needed for reliable estimation is typically not available. Consequently, data on a coarser scale are used to predict the characteristics of small areas. Mixed models are the primary tools in small area estimation (SAE) and also borrow information from alternative sources (e.g., previous surveys and administrative and census data sets). In many circumstances, small area predictors are associated with location. For instance, in the case of chronic disease or cancer, it is important for policy makers to understand spatial patterns of disease in order to determine small areas with high risk of disease and establish prevention strategies. The literature considering SAE with spatial random effects is sparse and mostly in the context of spatial linear mixed models. In this article, small area models are proposed for the class of spatial generalized linear mixed models to obtain small area predictors and corresponding second‐order unbiased mean squared prediction errors via Taylor expansion and a parametric bootstrap approach. The performance of the proposed approach is evaluated through simulation studies and application of the models to a real esophageal cancer data set from Minnesota, U.S.A. The Canadian Journal of Statistics 47: 426–437; 2019 © 2019 Statistical Society of Canada

9.

Model selection information criteria in latent class models with missing data and contingency question

《Journal of Statistical Computation and Simulation》2012,82(1):159-170

Latent class analysis (LCA) has been found to have important applications in social and behavioural sciences for modelling categorical response variables, and non-response is typical when collecting data. In this study, the non-response mainly included ‘contingency questions’ and real ‘missing data’. The primary objective of this study was to evaluate the effects of some potential factors on model selection indices in LCA with non-response data. We simulated missing data with contingency question and evaluated the accuracy rates of eight information criteria for selecting the correct models. The results showed that the main factors are latent class proportions, conditional probabilities, sample size, the number of items, the missing data rate and the contingency data rate. Interactions of the conditional probabilities with class proportions, sample size and the number of items are also significant. From our simulation results, the impact of missing data and contingency questions can be amended by increasing the sample size or the number of items.

10.

Markov Chain Monte Carlo model selection for DAG models

Eva-Maria Fronk Paolo Giudici 《Statistical Methods and Applications》2004,13(3):259-273

We present a methodology for Bayesian model choice and averaging in Gaussian directed acyclic graphs (DAGs). The dimension-changing move involves adding or dropping a (directed) edge from the graph. The methodology employs the results in Geiger and Heckerman and searches directly in the space of all DAGs. Model determination is carried out by implementing a reversible jump Markov Chain Monte Carlo sampler. To achieve this aim we rely on the concept of adjacency matrices, which provides a relatively inexpensive check for acyclicity. The performance of our procedure is illustrated by means of two simulated datasets, as well as one real dataset.
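
As a rough illustration of the adjacency-matrix acyclicity check mentioned in this abstract, the Python sketch below tests whether a proposed "add edge" move keeps the graph acyclic. It uses a standard topological peeling (Kahn's algorithm) on a 0/1 adjacency matrix; this is an assumed stand-in, not necessarily the check used by the authors, and the names `is_acyclic` and `can_add_edge` are hypothetical.

```python
def is_acyclic(adj):
    """Return True if the directed graph with 0/1 adjacency matrix `adj`
    (adj[i][j] == 1 means an edge i -> j) contains no directed cycle."""
    n = len(adj)
    # In-degree of node j is the sum of column j.
    indeg = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    stack = [j for j in range(n) if indeg[j] == 0]
    removed = 0
    # Repeatedly remove nodes with no incoming edges (Kahn's algorithm).
    while stack:
        i = stack.pop()
        removed += 1
        for j in range(n):
            if adj[i][j]:
                indeg[j] -= 1
                if indeg[j] == 0:
                    stack.append(j)
    # Every node can be removed exactly when there is no cycle.
    return removed == n


def can_add_edge(adj, i, j):
    """Return True if adding the edge i -> j yields a valid DAG."""
    if i == j or adj[i][j]:
        return False
    trial = [row[:] for row in adj]  # copy so the current graph is untouched
    trial[i][j] = 1
    return is_acyclic(trial)


# Chain 0 -> 1 -> 2
chain = [[0, 1, 0],
         [0, 0, 1],
         [0, 0, 0]]
allowed = can_add_edge(chain, 0, 2)   # shortcut edge, still acyclic
rejected = can_add_edge(chain, 2, 0)  # would close the cycle 0 -> 1 -> 2 -> 0
```

In a sampler of the kind described, a check like this would be run on each proposed edge addition; dropping an edge never creates a cycle, so only additions need it.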

11.

Modern variable selection for longitudinal semi-parametric models with missing data

J. Kowalski S. Hao T. Chen Y. Liang J. Liu L. Ge 《Journal of applied statistics》2018,45(14):2548-2562

Penalized methods for variable selection such as the Smoothly Clipped Absolute Deviation penalty have been increasingly applied to aid variable selection in regression analysis. Much of the literature has focused on parametric models, while a few recent studies have shifted the focus and developed their applications for the popular semi-parametric, or distribution-free, generalized estimating equations (GEEs) and weighted GEE (WGEE). However, although the WGEE is composed of one main and one missing-data module, available methods only focus on the main module, with no variable selection for the missing-data module. In this paper, we develop a new approach to further extend the existing methods to enable variable selection for both modules. The approach is illustrated by both real and simulated study data.

12.

Model selection and model averaging for semiparametric partially linear models with missing data

Jie Zeng Weihu Cheng Guozhi Hu Yaohua Rong 《Communications in Statistics - Theory and Methods》2019,48(2):381-395

We study model selection and model averaging in semiparametric partially linear models with missing responses. An imputation method is used to estimate the linear regression coefficients and the nonparametric function. We show that the corresponding estimators of the linear regression coefficients are asymptotically normal. Then a focused information criterion and frequentist model average estimators are proposed and their theoretical properties are established. Simulation studies are performed to demonstrate the superiority of the proposed methods over the existing strategies in terms of mean squared error and coverage probability. Finally, the approach is applied to a real data case.

13.

Effectiveness of combinations of Gaussian graphical models for model building

M. Sofia Massa Monica Chiogna 《Journal of Statistical Computation and Simulation》2013,83(9):1602-1612

Combining statistical models is a useful approach in all research areas where a global picture of the problem needs to be constructed by binding together evidence from different sources [M.S. Massa and S.L. Lauritzen Combining Statistical Models, M. Viana and H. Wynn, eds., American Mathematical Society, Providence, RI, 2010, pp. 239–259]. In this paper, we investigate the effectiveness of combining a fixed number of Gaussian graphical models respecting some consistency assumptions in problems of model building. In particular, we use the meta-Markov combination of Gaussian graphical models as detailed in Massa and Lauritzen and compare model selection results obtained by combining selections over smaller sets of variables with selection results over all variables of interest. In order to do so, we carry out some simulation studies in which different criteria are considered for the selection procedures. We conclude that the combination generally performs better than global estimation, is computationally simpler by virtue of having fewer and simpler models to work on, and has an intuitive appeal in a wide variety of contexts.

14.

Bain: a program for Bayesian testing of order constrained hypotheses in structural equation models

Xin Gu Herbert Hoijtink Joris Mulder Yves Rosseel 《Journal of Statistical Computation and Simulation》2019,89(8):1526-1553

This paper presents a new statistical method and accompanying software for the evaluation of order constrained hypotheses in structural equation models (SEM). The method is based on a large sample approximation of the Bayes factor using a prior with a data-based correlational structure. An efficient algorithm is written into an R package to ensure fast computation. The package, referred to as Bain, is easy to use for applied researchers. Two classical examples from the SEM literature are used to illustrate the methodology and software.

15.

A new model selection procedure for finite mixture regression models

Conglian Yu 《Communications in Statistics - Theory and Methods》2020,49(18):4347-4366

Abstract

In this article, we propose a new penalized-likelihood method to conduct model selection for finite mixture of regression models. The penalties are imposed on mixing proportions and regression coefficients, and hence order selection of the mixture and the variable selection in each component can be simultaneously conducted. The consistency of order selection and the consistency of variable selection are investigated. A modified EM algorithm is proposed to maximize the penalized log-likelihood function. Numerical simulations are conducted to demonstrate the finite sample performance of the estimation procedure. The proposed methodology is further illustrated via real data analysis.

16.

A graphical model selection tool for mixed models

M. Sciandra A. Plaia 《Communications in Statistics - Simulation and Computation》2013,42(9):2624-2638

ABSTRACT

Model selection can be defined as the task of estimating the performance of different models in order to choose the most parsimonious one among a potentially very large set of candidate statistical models. We propose a graphical representation that extends to the class of mixed models the deviance plot proposed in the literature within the framework of classical and generalized linear models. Once a reduced number of models has been selected, this graphical representation makes it possible to identify important covariates by focusing only on the fixed-effects component, assuming the random part is properly specified. Nevertheless, we also suggest a standalone figure representing the residual random variance ratio: a cross-evaluation of the two graphical representations allows one to draw conclusions about the random-part specification of the model and to make a more accurate selection of the final model.

17.

Exact dimensionality selection for Bayesian PCA

Charles Bouveyron Pierre Latouche Pierre-Alexandre Mattei 《Scandinavian Journal of Statistics》2020,47(1):196-211

We present a Bayesian model selection approach to estimate the intrinsic dimensionality of a high-dimensional dataset. To this end, we introduce a novel formulation of the probabilistic principal component analysis model based on a normal-gamma prior distribution. In this context, we exhibit a closed-form expression of the marginal likelihood which allows us to infer an optimal number of components. We also propose a heuristic based on the expected shape of the marginal likelihood curve in order to choose the hyperparameters. In nonasymptotic frameworks, we show on simulated data that this exact dimensionality selection approach is competitive with both Bayesian and frequentist state-of-the-art methods.

18.

Variable selection and estimation for high-dimensional spatial autoregressive models

Liqian Cai Tapabrata Maiti 《Scandinavian Journal of Statistics》2020,47(2):587-607

Spatial regression models are important tools for many scientific disciplines including economics, business, and social science. In this article, we investigate postmodel selection estimators that apply least squares estimation to the model selected by penalized estimation in high-dimensional regression models with spatial autoregressive errors. We show that by separating the model selection and estimation process, the postmodel selection estimator performs at least as well as the simultaneous variable selection and estimation method in terms of the rate of convergence. Moreover, under perfect model selection, the ℓ₂ rate of convergence is the oracle rate of $\sqrt{s / n}$, compared with the convergence rate of $\sqrt{s \log p / n}$ in the general case. Here, n is the sample size and p, s are the model dimension and number of significant covariates, respectively. We further provide the convergence rate of the estimation error in the form of the $\sup$ norm, and ideally the rate can reach as fast as $\sqrt{\log s / n}$.

19.

Multiple predicting K-fold cross-validation for model selection

Yoonsuh Jung 《Journal of nonparametric statistics》2018,30(1):197-215


20.

Dynamic latent trait models with mixed hidden Markov structure for mixed longitudinal outcomes

Yue Zhang Kiros Berhane 《Journal of applied statistics》2016,43(4):704-720

We propose a general Bayesian joint modeling approach to model mixed longitudinal outcomes from the exponential family for taking into account any differential misclassification that may exist among categorical outcomes. Under this framework, outcomes observed without measurement error are related to latent trait variables through generalized linear mixed effect models. The misclassified outcomes are related to the latent class variables, which represent unobserved real states, using mixed hidden Markov models (MHMMs). In addition to enabling the estimation of parameters in prevalence, transition and misclassification probabilities, MHMMs capture cluster level heterogeneity. A transition modeling structure allows the latent trait and latent class variables to depend on observed predictors at the same time period and also on latent trait and latent class variables at previous time periods for each individual. Simulation studies are conducted to make comparisons with traditional models in order to illustrate the gains from the proposed approach. The new approach is applied to data from the Southern California Children Health Study to jointly model questionnaire-based asthma state and multiple lung function measurements in order to gain better insight about the underlying biological mechanism that governs the inter-relationship between asthma state and lung function development.