1.

Random change point models: investigating cognitive decline in the presence of missing data

G. Muniz Terrera, A. van den Hout, F. E. Matthews 《Journal of applied statistics》2011,38(4):705-716

With the aim of identifying the age of onset of change in the rate of cognitive decline while accounting for the missing observations, we considered a selection modelling framework. A random change point model was fitted to data from a population-based longitudinal study of ageing (the Cambridge City over 75 Cohort Study) to model the longitudinal process. A missing at random mechanism was modelled using logistic regression. Random effects such as initial cognitive status, rate of decline before and after the change point, and the age of onset of change in rate of decline were estimated after adjustment for risk factors for cognitive decline. Among other possible predictors, the last observed cognitive score was used to adjust the probability of death and dropout. Individuals who experienced less variability in their cognitive scores experienced a change in their rate of decline at older ages than individuals whose cognitive scores varied more.
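
As a rough illustration of the kind of trajectory a change point model describes, the sketch below computes the expected score for one individual under a two-slope ("broken-stick") curve. The function name, the parameter names and the baseline age of 75 are assumptions made for this example; the abstract does not state the exact parameterization used in the paper, and the random effects, covariate adjustment and missing-data model are not shown.

```python
def change_point_mean(age, initial_score, slope_before, slope_after,
                      change_age, baseline_age=75.0):
    """Expected score at `age` for one individual (hypothetical parameterization).

    The score starts at `initial_score` at `baseline_age`, changes by
    `slope_before` per year until `change_age`, and by `slope_after`
    per year afterwards. The curve is continuous at the change point.
    """
    years_before = min(age, change_age) - baseline_age  # time spent on the first slope
    years_after = max(age - change_age, 0.0)            # time spent on the second slope
    return initial_score + slope_before * years_before + slope_after * years_after
```

In a random change point model each individual would have their own values of the four trajectory parameters, drawn from a population distribution.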

2.

Empirical bayes estimation of the log odds ratio in 2×2 contingency tables

J. S. Maritz 《Communications in Statistics - Theory and Methods》2013,42(9):3215-3233

We consider a sequence of contingency tables whose cell probabilities may vary randomly. The distribution of cell probabilities is modelled by a Dirichlet distribution. Bayes and empirical Bayes estimates of the log odds ratio are obtained. Emphasis is placed on estimating the risks associated with the Bayes, empirical Bayes and maximum likelihood estimates of the log odds ratio.
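
A minimal sketch of the Bayes step for a single table, under the assumption of multinomial sampling with known Dirichlet parameters and squared-error loss: the posterior is then Dirichlet with parameters (prior + counts), and because E[log p_i] = ψ(a_i) − ψ(Σa) for a Dirichlet(a) vector, the posterior mean of the log odds ratio is ψ(a11) + ψ(a22) − ψ(a12) − ψ(a21). The function names are illustrative only, and the empirical Bayes step (estimating the Dirichlet parameters from the sequence of tables) is not shown.

```python
import math


def digamma(x):
    """Digamma function for x > 0 via upward recurrence plus an asymptotic series."""
    result = 0.0
    while x < 6.0:              # psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 / x - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result


def bayes_log_odds_ratio(counts, prior):
    """Posterior mean of log(p11*p22 / (p12*p21)) under a Dirichlet prior.

    `counts` and `prior` are (n11, n12, n21, n22)-style 4-tuples; the prior
    entries are the Dirichlet parameters, assumed known here.
    """
    a11, a12, a21, a22 = (n + a for n, a in zip(counts, prior))
    return digamma(a11) + digamma(a22) - digamma(a12) - digamma(a21)


def ml_log_odds_ratio(counts):
    """Maximum likelihood estimate; undefined if any cell count is zero."""
    n11, n12, n21, n22 = counts
    return math.log(n11 * n22 / (n12 * n21))
```

With a uniform prior such as (1, 1, 1, 1), the Bayes estimate is pulled towards zero relative to the maximum likelihood estimate, and unlike the latter it stays finite when a cell count is zero.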

3.

Bayesian inference in joint modelling of location and scale parameters of the t distribution for longitudinal data

Tsung-I Lin, Wan-Lun Wang 《Journal of statistical planning and inference》2011,141(4):1543-1553

This paper presents a fully Bayesian approach to multivariate t regression models whose mean vector and scale covariance matrix are modelled jointly for analyzing longitudinal data. The scale covariance structure is factorized in terms of unconstrained autoregressive and scale innovation parameters through a modified Cholesky decomposition. A computationally flexible data augmentation sampler coupled with the Metropolis-within-Gibbs scheme is developed for computing the posterior distributions of parameters. The Bayesian predictive inference for the future response vector is also investigated. The proposed methodologies are illustrated through a real example from a sleep dose–response study.

4.

Prior distribution elicitation for generalized linear and piecewise-linear models

Paul H. Garthwaite, Shafeeqah A. Al-Awadhi, Fadlalla G. Elfadaly, David J. Jenkinson 《Journal of applied statistics》2013,40(1):59-75

An elicitation method is proposed for quantifying subjective opinion about the regression coefficients of a generalized linear model. Opinion about the relationship between a continuous predictor variable and the dependent variable is modelled by a piecewise-linear function, giving a flexible model that can represent a wide variety of opinion. To quantify his or her opinions, the expert uses an interactive computer program, performing assessment tasks that involve drawing graphs and bar-charts to specify medians and other quantiles. Opinion about the regression coefficients is represented by a multivariate normal distribution whose parameters are determined from the assessments. It is practical to use the procedure with models containing a large number of parameters. This is illustrated through practical examples and the benefit from using prior knowledge is examined through cross-validation.

5.

An apparent paradox: a classifier based on a partially classified sample may have smaller expected error rate than that if the sample were completely classified

Daniel Ahfock, Geoffrey J. McLachlan 《Statistics and Computing》2020,30(6):1779-1790

There has been increasing interest in using semi-supervised learning to form a classifier. As is well known, the (Fisher) information in an unclassified feature with unknown class label is less (considerably less for weakly separated classes) than that of a classified feature which has known class label. Hence in the case where the absence of class labels does not depend on the data, the expected error rate of a classifier formed from the classified and unclassified features in a partially classified sample is greater than that if the sample were completely classified. We propose to treat the labels of the unclassified features as missing data and to introduce a framework for their missingness as in the pioneering work of Rubin (Biometrika 63:581–592, 1976) for missingness in incomplete data analysis. An examination of several partially classified data sets in the literature suggests that the unclassified features are not occurring at random in the feature space, but rather tend to be concentrated in regions of relatively high entropy. It suggests that the missingness of the labels of the features can be modelled by representing the conditional probability of a missing label for a feature via the logistic model with covariate depending on the entropy of the feature or an appropriate proxy for it. We consider here the case of two normal classes with a common covariance matrix where for computational convenience the square of the discriminant function is used as the covariate in the logistic model in place of the negative log entropy. Rather paradoxically, we show that the classifier so formed from the partially classified sample may have smaller expected error rate than that if the sample were completely classified.

6.

Simplifying Regression Models Using Dimensional Analysis (cited by: 1; self-citations: 0; citations by others: 1)

V.A. Vignaux & J.L. Scott 《Australian & New Zealand Journal of Statistics》1999,41(1):31-41

Dimensional analysis can make a contribution to model formulation when some of the measurements in the problem are of physical factors. The analysis constructs a set of independent dimensionless factors that should be used as the variables of the regression in place of the original measurements. There are fewer of these than the originals and they often have a more appropriate interpretation. The technique is described briefly and its proposed role in regression discussed and illustrated with examples. We conclude that dimensional analysis can be effective in the preliminary stages of regression analysis when developing formulations involving continuous variables with several dimensions.

7.

Bayesian modelling of catch in a north-west Atlantic fishery

Carmen Fernández, Eduardo Ley, Mark F. J. Steel 《Journal of the Royal Statistical Society. Series C, Applied statistics》2002,51(3):257-280

Summary. We model daily catches of fishing boats in the Grand Bank fishing grounds. We use data on catches per species for a number of vessels collected by the European Union in the context of the Northwest Atlantic Fisheries Organization. Many variables can be thought to influence the amount caught: a number of ship characteristics (such as the size of the ship, the fishing technique used and the mesh size of the nets) are obvious candidates, but one can also consider the season or the actual location of the catch. Our database leads to 28 possible regressors (arising from six continuous variables and four categorical variables, whose 22 levels are treated separately), resulting in a set of 177 million possible linear regression models for the log-catch. Zero observations are modelled separately through a probit model. Inference is based on Bayesian model averaging, using a Markov chain Monte Carlo approach. Particular attention is paid to the prediction of catches for single and aggregated ships.

8.

A Bounded Derivative Model for Prior Ignorance about a Real-valued Parameter (cited by: 1; self-citations: 0; citations by others: 1)

Peter Walley 《Scandinavian Journal of Statistics》1997,24(4):463-483

A new method is proposed for drawing coherent statistical inferences about a real-valued parameter in problems where there is little or no prior information. Prior ignorance about the parameter is modelled by the set of all continuous probability density functions for which the derivative of the log-density is bounded by a positive constant. This set is translation-invariant, it contains density functions with a wide variety of shapes and tail behaviour, and it generates prior probabilities that are highly imprecise. Statistical inferences can be calculated by solving a simple type of optimal control problem whose general solution is characterized. Detailed results are given for the problems of calculating posterior upper and lower means, variances, distribution functions and probabilities of intervals. In general, posterior upper and lower expectations are achieved by prior density functions that are piecewise exponential. The results are illustrated by normal and binomial examples.

9.

Numerical evaluation of observed sojourn time distributions for a single ion channel incorporating time interval omission

F. G. Ball, G. F. Yeo 《Statistics and Computing》1994,4(1):1-12

The dynamical aspects of single ion channel gating can be modelled by a semi-Markov process. There is aggregation of states, corresponding to the receptor channel being open or closed, and there is time interval omission, brief sojourns in either the open or closed classes of states not being detected. This paper is concerned with the computation of the probability density functions of observed open (closed) sojourn-times incorporating time interval omission. A system of Volterra integral equations is derived, whose solution governs the required density function. Numerical procedures, using iterative and multistep methods, are described for solving these equations. Examples are given, and in the special case of Markov models results are compared with those obtained by alternative methods. Probabilistic interpretations are given for the iterative methods, which also give lower bounds for the solutions.

10.

A Nonparametric Frailty Model for Clustered Survival Data

Samuel O. M. Manda 《Communications in Statistics - Theory and Methods》2013,42(5):863-875

Clayton-type counting process formulations for survival data and parametric gamma models for cluster-specific frailty quantities are now routinely applied in analyses of clustered survival data. On the other hand, although nonparametric frailty models have been studied, they are not used much in practice. In this article, the distribution of the frailty terms is assumed to be an unknown random variable. The unknown frailty distribution is then modelled completely with a Dirichlet process prior. This prior assigns cluster units into sub-classes whose members have the same random frailty effect. The Gibbs sampler algorithm is used for computing posterior parameter estimates of the fixed effect hazards regression and the frailty distribution. The methodology is used to analyze community-clustered child survival in sub-Saharan Africa. The results show that the communities could be separated into fewer distinct classes of risk of childhood mortality; the fewer classes could be studied easily in order to provide useful guidance on the more effective use of resources for child health intervention programmes.

11.

Gender and performance of world-class athletes (cited by: 1; self-citations: 1; citations by others: 0)

Sangit Chatterjee, Matthew Laudato 《Journal of applied statistics》1997,24(1):3-10

SUMMARY The athletic performances of men and women are compared based on world-record times for various distance events in swimming, running and skating. The ratio of the times of women to those of men against years is modelled through a modified exponential distribution. The rate of improvement is found to be higher for women in the three sports. Law-like relationships are observed for world-record times against distance. Although men's absolute performance is generally superior, the disparity diminishes with increasing distance.

12.

Order-restricted hypothesis testing in a variation of the normal mixture model

Dan Nettleton 《Revue canadienne de statistique》1999,27(2):383-394

This paper examines likelihood-ratio tests concerning the relationships among a fixed number of univariate normal means given a sample of normal observations whose population membership is uncertain. The asymptotic null distributions of likelihood-ratio test statistics are derived for a class of tests including hypotheses which place linear inequality constraints on the normal means. The use of such tests in the interval mapping of quantitative trait loci is addressed.

13.

The L₂ Rate of Convergence for Event History Regression with Time-dependent Covariates

Jianhua Z. Huang & Charles J. Stone 《Scandinavian Journal of Statistics》1998,25(4):603-620

Consider repeated events of multiple kinds that occur according to a right-continuous semi-Markov process whose transition rates are influenced by one or more time-dependent covariates. The logarithms of the intensities of the transitions from one state to another are modelled as members of a linear function space, which may be finite- or infinite-dimensional. Maximum likelihood estimates are used, where the maximizations are taken over suitably chosen finite-dimensional approximating spaces. It is shown that the L₂ rates of convergence of the maximum likelihood estimates are determined by the approximation power and dimension of the approximating spaces. The theory is applied to a functional ANOVA model, where the logarithms of the intensities are approximated by functions having the form of a specified sum of a constant term, main effects (functions of one variable), and interaction terms (functions of two or more variables). It is shown that the curse of dimensionality can be ameliorated if only main effects and low-order interactions are considered in functional ANOVA models.

14.

Modelling heterogeneity of survival in band-recovery data using mixtures

Shirley Pledger, Carl J. Schwarz 《Journal of applied statistics》2002,29(1):315-327

Finite mixture methods are applied to bird band-recovery studies to allow for heterogeneity of survival. Birds are assumed to belong to one of finitely many groups, each of which has its own survival rate (or set of survival rates varying by time and/or age). The group to which a specific animal belongs is not known, so its survival probability is a random variable from a finite mixture. Heterogeneity is thus modelled as a latent effect. This gives a wide selection of likelihood-based models, which may be compared using likelihood ratio tests. These models are discussed with reference to real and simulated data, and compared with previous models.

15.

Quantile regression in functional linear semiparametric model

Qingguo Tang, Linglong Kong 《Statistics》2017,51(6):1342-1358

This paper proposes nonparametric estimation methods for functional linear semiparametric quantile regression, where the conditional quantile of the scalar responses is modelled by both scalar and functional covariates and an additional unknown nonparametric function term. The slope function is estimated using the functional principal component basis and the nonparametric function is approximated by a piecewise polynomial function. The asymptotic distribution of the estimators of slope parameters is derived and the global convergence rate of the quantile estimator of the unknown slope function is established under a suitable norm. The asymptotic distribution of the estimator of the unknown nonparametric function is also established. Simulation studies are conducted to investigate the finite-sample performance of the proposed estimators. The proposed methodology is demonstrated by analysing a real data set from the ADHD-200 sample.

16.

Modelling heterogeneity of survival in band-recovery data using mixtures

Shirley Pledger, Carl J. Schwarz 《Journal of applied statistics》2002,29(1-4):315-327

Finite mixture methods are applied to bird band-recovery studies to allow for heterogeneity of survival. Birds are assumed to belong to one of finitely many groups, each of which has its own survival rate (or set of survival rates varying by time and/or age). The group to which a specific animal belongs is not known, so its survival probability is a random variable from a finite mixture. Heterogeneity is thus modelled as a latent effect. This gives a wide selection of likelihood-based models, which may be compared using likelihood ratio tests. These models are discussed with reference to real and simulated data, and compared with previous models.

17.

A copula-based approach for estimating the survival functions of two alternating recurrent events

Moumita Chatterjee, Sugata Sen Roy 《Journal of Statistical Computation and Simulation》2018,88(16):3098-3115

In this paper, we study the survival times of alternately occurring events. The dependence between the times to the two events is modelled through the Archimedean copula, while the dependence over the recurring cycles is modelled through a functional relationship of the distribution parameters. Taking account of appropriate censoring that may be present in the data, the model parameters are estimated using the maximum likelihood method. The standard errors of the estimators are then derived and confidence belts for the survival functions constructed. Methods for choosing the appropriate copula are also discussed. The results are illustrated through clinical trial data on patients suffering from cystic fibrosis. A simulation study is also done to corroborate the results.

18.

The table auto-regressive moving-average model for (categorical) stationary series: statistical properties (causality; from the all random to the conditional random)

Chrysoula Dimitriou-Fakalou 《Journal of nonparametric statistics》2019,31(1):31-63

A strictly stationary time series is modelled directly, once the variables' realizations fit into a table: no knowledge of a distribution is required other than the prior discretization. A multiplicative model with combined random ‘Auto-Regressive’ and ‘Moving-Average’ parts is considered for the serial dependence. Based on a multi-sequence of unobserved series that serve as differences and differences of differences from the main building block, a causal version is obtained; a condition that secures an exponential rate of convergence for its expected random coefficients is presented. For the remainder, writing the conditional probability as a function of past conditional probabilities, is within reach: subject to the presence of the moving-average segment in the original equation, what could be a long process of elimination with mathematical arguments concludes with a new derivation that does not support a simplistic linear dependence on the lagged probability values.

19.

New improvements in the use of dependence measures for sensitivity analysis and screening

《Journal of Statistical Computation and Simulation》2012,82(15):3038-3058

Physical phenomena are commonly modelled by time-consuming numerical simulators, functions of many uncertain parameters whose influences can be measured via a global sensitivity analysis. The usual variance-based indices require too many simulations, especially when the inputs are numerous. To address this limitation, we consider recent advances in dependence measures, focusing on the distance correlation and the Hilbert–Schmidt independence criterion. We study and use these indices for a screening purpose. Numerical tests reveal differences between variance-based indices and dependence measures. Then, two approaches are proposed to use the latter for a screening purpose. The first approach uses independence tests, with existing asymptotic versions and spectral extensions; bootstrap versions are also proposed. The second considers a linear model with dependence measures, coupled to a bootstrap selection method or a Lasso penalization. Numerical experiments show their potential in the presence of many non-influential inputs and give successful results for a nuclear reliability application.

20.

Modeling and simulation of a nonhomogeneous poisson process having cyclic behavior

Sanghoon Lee, James R. Wilson, Melba M. Crawford 《Communications in Statistics - Simulation and Computation》2013,42(2-3):777-809

In this paper we develop a unified approach to modeling and simulation of a nonhomogeneous Poisson process whose rate function exhibits cyclic behavior as well as a long-term evolutionary trend. The approach can be applied whether the oscillation frequency of the cyclic behavior is known or unknown. To model such a process, we use an exponential rate function whose exponent includes both a polynomial and a trigonometric component. Maximum likelihood estimates of the unknown continuous parameters of this function are obtained numerically, and the degree of the polynomial component is determined by a likelihood ratio test. If the oscillation frequency is unknown, then an initial estimate of this parameter is obtained via spectral analysis of the observed series of events; initial estimates of the remaining trigonometric (respectively, polynomial) parameters are computed from a standard maximum likelihood (respectively, moment-matching) procedure for an exponential-trigonometric (respectively, exponential-polynomial) rate function. To simulate the fitted process by the method of thinning, we present (a) a procedure for constructing an optimal piecewise linear majorizing rate function; and (b) a "piecewise thinning" simulation procedure based on the inverse transform method for generating events from a piecewise linear rate function. These procedures are applied to the storm-arrival process observed at an off-shore drilling site.
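
To make the thinning idea concrete, here is a minimal sketch that simulates a process with rate exp(a0 + a1·t + gamma·sin(omega·t + phi)) on a finite horizon. It is not the paper's procedure: it uses a simple constant majorizing rate rather than the optimal piecewise linear one described in the abstract, the polynomial part is restricted to degree one, and all function and parameter names are assumptions made for the example.

```python
import math
import random


def rate(t, a0, a1, gamma, omega, phi):
    """Exponential rate with a linear trend and one trigonometric term."""
    return math.exp(a0 + a1 * t + gamma * math.sin(omega * t + phi))


def simulate_by_thinning(horizon, a0, a1, gamma, omega, phi, rng):
    """Event times on (0, horizon] generated by thinning a homogeneous process.

    Candidates arrive at the constant rate `lam_max`, which bounds the
    target rate on [0, horizon]; each candidate at time t is kept with
    probability rate(t) / lam_max.
    """
    lam_max = math.exp(a0 + max(0.0, a1 * horizon) + abs(gamma))
    events = []
    t = 0.0
    while True:
        t += rng.expovariate(lam_max)          # next candidate arrival
        if t > horizon:
            break
        if rng.random() * lam_max <= rate(t, a0, a1, gamma, omega, phi):
            events.append(t)                   # candidate accepted
    return events
```

A call such as `simulate_by_thinning(10.0, 0.5, 0.02, 0.8, 2 * math.pi, 0.0, random.Random(1))` returns a sorted list of event times. A constant bound is easy to write but wasteful when the rate varies strongly, since many candidates are rejected; a tighter majorizing function, such as the piecewise linear one the abstract mentions, reduces the rejections.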