1.

Joint modelling of longitudinal binary data and survival data

Yi-Ting Hwang Chia-Hui Huang Chun-Chao Wang Tzu-Yin Lin Yi-Kuan Tseng 《Journal of applied statistics》2019,46(13):2357-2371

The medical costs in an ageing society substantially increase when the incidences of chronic diseases, disabilities and inability to live independently are high. Healthy lifestyles not only affect elderly individuals but also influence the entire community. When assessing treatment efficacy, survival and quality of life should be considered simultaneously. This paper proposes a joint likelihood approach for modelling survival and longitudinal binary covariates simultaneously. Because some unobservable information is present in the model, the Monte Carlo EM algorithm and the Metropolis-Hastings algorithm are used to find the estimators. Monte Carlo simulations are performed to evaluate the performance of the proposed model based on the accuracy and precision of the estimates. Real data are used to demonstrate the feasibility of the proposed model.

2.

Multilevel modelling of complex survey data

Sophia Rabe-Hesketh Anders Skrondal 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2006,169(4):805-827

Summary. Multilevel modelling is sometimes used for data from complex surveys involving multistage sampling, unequal sampling probabilities and stratification. We consider generalized linear mixed models and particularly the case of dichotomous responses. A pseudolikelihood approach for accommodating inverse probability weights in multilevel models with an arbitrary number of levels is implemented by using adaptive quadrature. A sandwich estimator is used to obtain standard errors that account for stratification and clustering. When level 1 weights are used that vary between elementary units in clusters, the scaling of the weights becomes important. We point out that not only variance components but also regression coefficients can be severely biased when the response is dichotomous. The pseudolikelihood methodology is applied to complex survey data on reading proficiency from the American sample of the 'Program for international student assessment' 2000 study, using the Stata program gllamm which can estimate a wide range of multilevel and latent variable models. Performance of pseudo-maximum-likelihood with different methods for handling level 1 weights is investigated in a Monte Carlo experiment. Pseudo-maximum-likelihood estimators of (conditional) regression coefficients perform well for large cluster sizes but are biased for small cluster sizes. In contrast, estimators of marginal effects perform well in both situations. We conclude that caution must be exercised in pseudo-maximum-likelihood estimation for small cluster sizes when level 1 weights are used.

3.

Bayesian modelling of spatial compositional data (total citations: 1; self-citations: 0; citations by others: 1)

Håkon Tjelmeland Kjetill Vassmo Lund 《Journal of applied statistics》2003,30(1):87-100

Compositional data are vectors of proportions, specifying fractions of a whole. Aitchison (1986) defines logistic normal distributions for compositional data by applying a logistic transformation and assuming the transformed data to be multi-normal distributed. In this paper we generalize this idea to spatially varying logistic data and thereby define logistic Gaussian fields. We consider the model in a Bayesian framework and discuss appropriate prior distributions. We consider both complete observations and observations of subcompositions or individual proportions, and discuss the resulting posterior distributions. In general, the posterior cannot be analytically handled, but the Gaussian base of the model allows us to define efficient Markov chain Monte Carlo algorithms. We use the model to analyse a data set of sediments in an Arctic lake. These data have previously been considered, but then without taking the spatial aspect into account.
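
As a rough illustration of the transformation idea mentioned in the abstract above, the following sketch maps a composition to unconstrained real values with an additive log-ratio and back again with the additive logistic transformation. The function names and the three-part example composition are hypothetical and are not taken from the paper; the spatial and Bayesian parts of the model are not shown.

```python
import math

def alr(composition):
    """Additive log-ratio: D strictly positive parts -> D-1 unconstrained reals.

    The last part is used as the reference component (an arbitrary choice here).
    """
    if any(p <= 0 for p in composition):
        raise ValueError("all parts must be strictly positive")
    total = sum(composition)
    parts = [p / total for p in composition]  # re-close so the parts sum to 1
    ref = parts[-1]
    return [math.log(p / ref) for p in parts[:-1]]

def alr_inverse(y):
    """Additive logistic transformation: D-1 reals -> D proportions summing to 1."""
    exps = [math.exp(v) for v in y] + [1.0]  # reference component contributes exp(0)
    denom = sum(exps)
    return [e / denom for e in exps]

# Hypothetical three-part composition (illustrative values only).
composition = [0.2, 0.3, 0.5]
y = alr(composition)      # two unconstrained coordinates
back = alr_inverse(y)     # recovers the original proportions
print(y, back)
```

A normal model can then be placed on the transformed coordinates, which is the sense in which the transformed data are assumed to be multi-normal.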

4.

Time series modelling of neuroscience data

Anouar Ben Mabrouk 《Journal of applied statistics》2013,40(4):918-919

5.

Comparison of data analysis procedures for real-time nanoparticle sampling data using classical regression and ARIMA models

Seunghon Ham Sunju Kim Naroo Lee Pilje Kim Igchun Eom Byoungcheun Lee 《Journal of applied statistics》2017,44(4):685-699

Real-time monitoring is necessary for nanoparticle exposure assessment to characterize the exposure profile, but the data produced are autocorrelated. This study was conducted to compare three statistical methods used to analyze data, which constitute autocorrelated time series, and to investigate the effect of averaging time on the reduction of the autocorrelation using field data. First-order autoregressive (AR(1)) and autoregressive-integrated moving average (ARIMA) models are alternative methods that remove autocorrelation. The classical regression method was compared with AR(1) and ARIMA. Three data sets of scanning mobility particle sizer data were used. We compared the results of regression, AR(1), and ARIMA with averaging times of 1, 5, and 10 min. AR(1) and ARIMA models had similar capacities to adjust autocorrelation of real-time data. Because of the non-stationarity of real-time monitoring data, the ARIMA was more appropriate. When using the AR(1), transformation into stationary data was necessary. There was no difference with a longer averaging time. This study suggests that the ARIMA model could be used to process real-time monitoring data, especially for non-stationary data, and that the averaging time setting is flexible depending on the data interval required to capture the effects of processes for occupational and environmental nano measurements.
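
The autocorrelation issue described in the abstract above can be sketched with simulated data. The example below generates a stationary AR(1) series, estimates its lag-1 autocorrelation, and repeats the estimate after block averaging. The coefficient 0.8, the series length, the block width of 5 and all function names are assumptions chosen for illustration; this is not the study's data or analysis procedure.

```python
import random

def simulate_ar1(n, phi, sigma=1.0, seed=0):
    """Simulate x_t = phi * x_{t-1} + e_t with Gaussian noise (requires |phi| < 1)."""
    rng = random.Random(seed)
    # Draw the first value from the stationary distribution of the process.
    x = [rng.gauss(0.0, sigma / (1 - phi ** 2) ** 0.5)]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, sigma))
    return x

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation, a simple estimate of the AR(1) coefficient."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

def block_average(x, width):
    """Average consecutive non-overlapping blocks (a longer 'averaging time')."""
    return [sum(x[i:i + width]) / width
            for i in range(0, len(x) - width + 1, width)]

series = simulate_ar1(n=5000, phi=0.8)
r_raw = lag1_autocorr(series)
r_avg5 = lag1_autocorr(block_average(series, 5))
print(f"lag-1 autocorrelation: raw={r_raw:.3f}, 5-point averages={r_avg5:.3f}")
```

Comparing the two printed values gives a quick check of how much autocorrelation remains after averaging for a given series.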

6.

Incremental modelling for compositional data streams

Yuan Wei Huiwen Wang Gilbert Saporta 《Communications in Statistics - Simulation and Computation》2013,42(8):2229-2243

Incremental modelling of data streams is of great practical importance, as shown by its applications in advertising and financial data analysis. We propose two incremental covariance matrix decomposition methods for a compositional data type. The first method, exact incremental covariance decomposition of compositional data (C-EICD), gives an exact decomposition result. The second method, covariance-free incremental covariance decomposition of compositional data (C-CICD), is an approximate algorithm that can efficiently compute high-dimensional cases. Based on these two methods, many frequently used compositional statistical models can be incrementally calculated. We take multiple linear regression and principal component analysis as examples to illustrate the utility of the proposed methods via extensive simulation studies.

7.

Multiple-bias modelling for analysis of observational data (total citations: 3; self-citations: 3; citations by others: 0)

Sander Greenland 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2005,168(2):267-306

Summary. Conventional analytic results do not reflect any source of uncertainty other than random error, and as a result readers must rely on informal judgments regarding the effect of possible biases. When standard errors are small these judgments often fail to capture sources of uncertainty and their interactions adequately. Multiple-bias models provide alternatives that allow one systematically to integrate major sources of uncertainty, and thus to provide better input to research planning and policy analysis. Typically, the bias parameters in the model are not identified by the analysis data and so the results depend completely on priors for those parameters. A Bayesian analysis is then natural, but several alternatives based on sensitivity analysis have appeared in the risk assessment and epidemiologic literature. Under some circumstances these methods approximate a Bayesian analysis and can be modified to do so even better. These points are illustrated with a pooled analysis of case–control studies of residential magnetic field exposure and childhood leukaemia, which highlights the diminishing value of conventional studies conducted after the early 1990s. It is argued that multiple-bias modelling should become part of the core training of anyone who will be entrusted with the analysis of observational data, and should become standard procedure when random error is not the only important source of uncertainty (as in meta-analysis and pooled analysis).

8.

Frailty modelling approaches for semi-competing risks data

Ha Il Do Xiang Liming Peng Mengjiao Jeong Jong-Hyeon Lee Youngjo 《Lifetime data analysis》2020,26(1):109-133

In the semi-competing risks situation where only a terminal event censors a non-terminal event, observed event times can be correlated. Recently, frailty models with an...

9.

A generalized framework for modelling ordinal data

Maria Iannario Domenico Piccolo 《Statistical Methods and Applications》2016,25(2):163-189

In several applied disciplines, such as Economics, Marketing, Business, Sociology, Psychology, Political science, Environmental research and Medicine, it is common to collect data in the form of ordered categorical observations. In this paper, we introduce a class of models based on mixtures of discrete random variables in order to specify a general framework for the statistical analysis of this kind of data. The structure of these models allows the interpretation of the final response as related to feeling, uncertainty and a possible shelter option and the expression of the relationship among these components and subjects’ covariates. Such a model may be effectively estimated by maximum likelihood methods leading to asymptotically efficient inference. We present a simulation experiment and discuss a real case study to check the consistency and the usefulness of the approach. Some final considerations conclude the paper.

10.

Log-linear modelling of data from matched case-control studies

G. Lovison 《Journal of applied statistics》1994,21(3):125-141

A log-linear modelling approach is proposed for dealing with polytomous, unordered exposure variables in case-control epidemiological studies with matched pairs. Hypotheses concerning epidemiological parameters are shown to be expressible in terms of log-linear models for the expected frequencies of the case-by-control square concordance table representation of the matched data; relevant maximum likelihood estimates and goodness-of-fit statistics are presented. Possible extensions to account for ordered categorical risk factors and multiple controls are illustrated, and comparisons with previous work are discussed. Finally, the possibility of implementing the proposed method with GLIM is illustrated within the context of a data set already analyzed by other authors.

11.

Two ways of modelling overdispersion in non-normal data (total citations: 2; self-citations: 0; citations by others: 2)

Y. Lee & J. A. Nelder 《Journal of the Royal Statistical Society. Series C, Applied statistics》2000,49(4):591-598

For non-normal data assumed to have distributions, such as the Poisson distribution, which have an a priori dispersion parameter, there are two ways of modelling overdispersion: by a quasi-likelihood approach or with a random-effect model. The two approaches yield different variance functions for the response, which may be distinguishable if adequate data are available. The epilepsy data of Thall and Vail and the fabric data of Bissell are used to exemplify the ideas.
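
For the Poisson case, the contrast between the two variance functions can be written out as follows. This is a standard textbook form given here as an illustration, not a transcription of the paper's notation: the symbols \phi (dispersion parameter) and \lambda (variance of a mean-one gamma random effect u, with y given u Poisson with mean \mu u) are assumptions.

```latex
\[
\text{quasi-likelihood:}\quad \operatorname{Var}(y) = \phi\,\mu ,
\qquad\qquad
\text{gamma random effect:}\quad
\operatorname{Var}(y)
  = \operatorname{E}\{\operatorname{Var}(y \mid u)\}
  + \operatorname{Var}\{\operatorname{E}(y \mid u)\}
  = \mu + \lambda\,\mu^{2} .
\]
```

The first variance grows linearly in \mu and the second quadratically, which is why the two may be distinguishable when the fitted means span a wide enough range.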

12.

Extended Poisson process modelling of dilution series data

M. J. Faddy D. M. Smith 《Journal of the Royal Statistical Society. Series C, Applied statistics》2008,57(4):461-471

Summary. Data comprising colony counts, or a binary variable representing fertile (or sterile) samples, as a dilution series of the containing medium are analysed by using extended Poisson process modelling. These models form a class of flexible probability distributions that are widely applicable to count and grouped binary data. Standard distributions such as Poisson and binomial, and those representing overdispersion and underdispersion relative to these distributions can be expressed within this class. For all the models in the class, likelihoods can be obtained. These models have not been widely used because of the perceived difficulty of performing the calculations and the lack of associated software. Exact calculation of the probabilities that are involved can be time consuming although accurate approximations that use considerably less computational time are available. Although dilution series data are the focus here, the models are applicable to any count or binary data. A benefit of the approach is the ability to draw likelihood-based inferences from the data.

13.

Disaggregated spatial modelling for areal unit categorical data

Tassone EC Miranda ML Gelfand AE 《Journal of the Royal Statistical Society. Series C, Applied statistics》2010,59(1):175-190

Summary. We consider joint spatial modelling of areal multivariate categorical data assuming a multiway contingency table for the variables, modelled by using a log-linear model, and connected across units by using spatial random effects. With no distinction regarding whether variables are response or explanatory, we do not limit inference to conditional probabilities, as in customary spatial logistic regression. With joint probabilities we can calculate arbitrary marginal and conditional probabilities without having to refit models to investigate different hypotheses. Flexible aggregation allows us to investigate subgroups of interest; flexible conditioning enables not only the study of outcomes given risk factors but also retrospective study of risk factors given outcomes. A benefit of joint spatial modelling is the opportunity to reveal disparities in health in a richer fashion, e.g. across space for any particular group of cells, across groups of cells at a particular location, and, hence, potential space–group interaction. We illustrate with an analysis of birth records for the state of North Carolina and compare with spatial logistic regression.

14.

Application of Markov chain Monte Carlo methods to modelling birth prevalence of Down syndrome

I. Bray & D. E. Wright 《Journal of the Royal Statistical Society. Series C, Applied statistics》1998,47(4):589-602

Data collected before the routine application of prenatal screening are of unique value in estimating the natural live-birth prevalence of Down syndrome. However, much of these data are from births from over 20 years ago and they are of uncertain quality. In particular, they are subject to varying degrees of underascertainment. Published approaches have used ad hoc corrections to deal with this problem or have been restricted to data sets in which ascertainment is assumed to be complete. In this paper we adopt a Bayesian approach to modelling ascertainment and live-birth prevalence. We consider three prior specifications concerning ascertainment and compare predicted maternal-age-specific prevalence under these three different prior specifications. The computations are carried out by using Markov chain Monte Carlo methods in which model parameters and missing data are sampled.

15.

Truncation effect in closed and open birth interval data

Sheps MC Menken JA Ridley JC Lingner JW 《Journal of the American Statistical Association》1970,65(330):678-693

This is a numerical investigation that uses a flexible simulation model to establish interval analysis as an index for changing natality patterns. Such an index should reflect parity distribution, the age at which women start reproduction, and the spacing of their births. The simulated statistical results illustrate the truncation effect that reflects a negative correlation between parity and the length of closed and open intervals in a birth or marriage cohort. Truncation is related to the duration of marriage at survey, but this duration interacts with other assumptions. Holding duration constant does not ensure that the data on intervals will reflect postulated changes in the distributions. For complete birth orders, this analysis does reflect patterns of child spacing. However, it ignores changes in the parity distribution, whether produced by deliberate limitation of family size or by the onset of secondary sterility. This difficulty is not overcome by life table analysis except under highly restrictive assumptions. It is doubtful whether the current emphasis on securing such data is justified. Further investigation is needed to provide a better basis for the definition and analysis of interval data if they are to be used.

16.

Assessing the stability of classification trees using Florida birth data

Panagiota Kitsantas Myles Hollander Lei M. Li 《Journal of statistical planning and inference》2007

Using 1998 and 1999 singleton birth data of the State of Florida, we study the stability of classification trees. Tree stability depends on both the learning algorithm and the specific data set. In this study, test samples are used in statistical learning to evaluate both stability and predictive performance. We also use the bootstrap resampling technique, which can be regarded as data self-perturbation, to evaluate the sensitivity of the modeling algorithm with respect to the specific data set. We demonstrate that the selection of the cost function plays an important role in stability. In particular, classifiers with equal misclassification costs and equal priors are less stable compared to those with unequal misclassification costs and equal priors.

17.

On modelling data from degradation sample paths over time

Tsung I. Lin Jack C. Lee 《Australian & New Zealand Journal of Statistics》2003,45(3):257-270

This paper is mainly concerned with modelling data from degradation sample paths over time. It uses a general growth curve model with Box‐Cox transformation, random effects and ARMA(p, q) dependence to analyse a set of such data. A maximum likelihood estimation procedure for the proposed model is derived and future values are predicted, based on the best linear unbiased prediction. The paper compares the proposed model with a nonlinear degradation model from a prediction point of view. Forecasts of failure times with various data lengths in the sample are also compared.

18.

Local polynomial modelling of the conditional quantile for functional data

Fatiha Messaci Nahima Nemouchi Idir Ouassou Mustapha Rachdi 《Statistical Methods and Applications》2015,24(4):597-622

19.

Multi-scale process modelling and distributed computation for spatial data

Zammit-Mangion Andrew Rougier Jonathan 《Statistics and Computing》2020,30(6):1609-1627

Recent years have seen a huge development in spatial modelling and prediction methodology, driven by the increased availability of remote-sensing data and the reduced...

20.

A flexible marginal modelling strategy for non-monotone missing data

Ivy Jansen Geert Molenberghs 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2008,171(2):347-373

Summary. Much research has been devoted to modelling strategies for longitudinal data with missingness, recently especially within the missingness not at random context. In this paper, the relatively unexplored but practically highly relevant domain of non-monotone missingness with multivariate ordinal responses is broached. For this, a dedicated version of the multivariate Dale model is formulated. Furthermore, we also assess the sensitivity of these models to their assumptions, by using the technique of global influence.