1.

Gender composition of friendship networks and age at first intercourse: a life-course data analysis

Francesco C. Billari Letizia Mencarini 《Statistical Methods and Applications》2004,12(3):377-390

We investigate the impact of some characteristics of friendship networks on the timing of the first sexual intercourse. We assume that the gender-segregated composition of such networks explains part of the particularly late age at first intercourse in Italy. We use new data from a survey on sexual behavior and reproductive health of Italian first and second-year university students. The survey was carried out in 15 different universities in 2000-2001 and it includes retrospective data on age at first intercourse, as well as retrospectively collected time-varying measures for the gender composition of the friendship network at different ages, for almost 5,000 cases. After describing the data as transition frequencies, we use a Cox proportional hazards model with time-varying covariates. Results are in accordance with the hypothesis that having friendship networks that include more members of the other gender and talking about sex with friends increase the relative risk of first sexual intercourse.

2.

Estimating catch at age from market sampling data by using a Bayesian hierarchical model

David Hirst Sondre Aanes Geir Storvik Ragnar Bang Huseby Ingunn Fride Tvete 《Journal of the Royal Statistical Society. Series C, Applied statistics》2004,53(1):1-14

Summary. The paper develops a Bayesian hierarchical model for estimating the catch at age of cod landed in Norway. The model includes covariate effects such as season and gear, and can also account for the within-boat correlation. The hierarchical structure allows us to account properly for the uncertainty in the estimates.

3.

Multivariate logistic regression for familial aggregation in age at disease onset

Matthews AG Finkelstein DM Betensky RA 《Lifetime data analysis》2007,13(2):191-209

Familial aggregation studies seek to identify diseases that cluster in families. These studies are often carried out as a first step in the search for hereditary factors affecting the risk of disease. It is necessary to account for age at disease onset to avoid potential misclassification of family members who are disease-free at the time of study participation or who die before developing disease. This is especially true for late-onset diseases, such as prostate cancer or Alzheimer's disease. We propose a discrete time model that accounts for the age at disease onset and allows the familial association to vary with age and to be modified by covariates, such as pedigree relationship. The parameters of the model have interpretations as conditional log-odds and log-odds ratios, which can be viewed as discrete time conditional cross hazard ratios. These interpretations are appealing for cancer risk assessment. Properties of this model are explored in simulation studies, and the method is applied to a large family study of cancer conducted by the National Cancer Institute-sponsored Cancer Genetics Network (CGN).

4.

Tests for outliers in the inverse Gaussian distribution, with application to first hitting time models

《Journal of Statistical Computation and Simulation》2012,82(1):73-80

The inverse Gaussian (IG) distribution is often applied in statistical modelling, especially with lifetime data. We present tests for outlying values of the parameters (μ, λ) of this distribution when data are available from a sample of independent units and possibly with more than one event per unit. Outlier tests are constructed from likelihood ratio tests for equality of parameters. The test for an outlying value of λ is based on an F-distributed statistic that is transformed to an approximate normal statistic when there are unequal numbers of events per unit. Simulation studies are used to confirm that Bonferroni tests have accurate size and to examine the powers of the tests. The application to first hitting time models, where the IG distribution is derived from an underlying Wiener process, is described. The tests are illustrated on data concerning the strength of different lots of insulating material.

5.

Mean estimation with data missing at random for functional covariables

Frédéric Ferraty Philippe Vieu 《Statistics》2013,47(4):688-706

In a missing-data setting, we want to estimate the mean of a scalar outcome, based on a sample in which an explanatory variable is observed for every subject while responses are missing by happenstance for some of them. We consider two kinds of estimates of the mean response when the explanatory variable is functional. One is based on the average of the predicted values and the second one is a functional adaptation of the Horvitz–Thompson estimator. We show that the infinite dimensionality of the problem does not affect the rates of convergence by stating that the estimates are root-n consistent, under the missing at random (MAR) assumption. These asymptotic features are complemented by simulated experiments illustrating the ease of implementation and the good behaviour of the method on finite sample sizes. This is the first paper emphasizing that the insensitivity of averaged estimates, well known in multivariate non-parametric statistics, remains true for an infinite-dimensional covariable. In this sense, this work opens the way for various other results of this kind in functional data analysis.
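To make the Horvitz–Thompson idea concrete, here is a minimal sketch of an inverse-probability-weighted mean for a scalar response. It is not the functional estimator of the paper: the function name `ipw_mean` is hypothetical, and the observation probabilities are assumed to be known and passed in, whereas under MAR they would normally have to be estimated from the explanatory variable.

```python
from typing import Optional, Sequence


def ipw_mean(responses: Sequence[Optional[float]],
             obs_prob: Sequence[float]) -> float:
    """Horvitz-Thompson-type estimate of the mean response.

    responses[i] is None when the response of subject i is missing.
    obs_prob[i] is the (assumed known) probability that the response
    of subject i is observed; it must be strictly positive.
    """
    if len(responses) != len(obs_prob):
        raise ValueError("responses and obs_prob must have the same length")
    if not responses:
        raise ValueError("at least one subject is required")

    total = 0.0
    for y, p in zip(responses, obs_prob):
        if y is not None:
            # Each observed response is weighted by 1 / P(observed).
            total += y / p
    # Divide by the full sample size, not by the number of observed responses.
    return total / len(responses)
```

Dividing by the full sample size is what distinguishes this estimator from a plain average over the complete cases, which can be biased when missingness depends on the explanatory variable.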

6.

Estimation of HIV seroconversion and effects of age in the San Francisco homosexual population

Wai-Yuan Tan Si Chin Tang Sho Rong Lee 《Journal of applied statistics》1998,25(1):85-102

Summary. Using San Francisco city clinic cohort data, we estimate the HIV seroconversion distribution by both non-parametric and parametric methods, and illustrate the effects of age on this distribution. The non-parametric methods include the Turnbull method, the Bacchetti method, the expectation, maximization and smoothing (EMS) method and the penalized spline method. The seroconversion density curves estimated by these non-parametric methods are bimodal, with obvious effects of age. As a result of the bimodal nature of the seroconversion curves, the parametric models considered are mixtures of two distributions taken from the generalized log-logistic distribution with three parameters, the Weibull distribution and the log-normal distribution. In terms of the logarithm of the likelihood values, it appears that the non-parametric methods with smoothing as well as without smoothing (i.e. the Turnbull method) provided much better fits than did the parametric models. Among the non-parametric methods, the EMS and the spline estimates are more appealing, because the unsmoothed Turnbull estimates are very unstable and because the Bacchetti estimates have a longer tail. Among the parametric models, the mixture of a generalized log-logistic distribution with three parameters and a Weibull distribution or a log-normal distribution provided better fits than did other mixtures of parametric models.

7.

Comparisons of methods of estimation for a pareto distribution of the first kind

Anwar M. Hossain William J. Zimmer 《Communications in Statistics - Theory and Methods》2013,42(4):859-878

This paper compares methods of estimation for the parameters of a Pareto distribution of the first kind to determine which method provides the better estimates when the observations are censored. The unweighted least squares (LS) and the maximum likelihood estimates (MLE) are presented for both censored and uncensored data. The MLEs are obtained using two methods. In the first, called the ML method, it is shown that the log-likelihood is maximized when the scale parameter is the minimum sample value. In the second method, called the modified ML (MML) method, the estimates are found by utilizing the maximum likelihood value of the shape parameter in terms of the scale parameter and the equation for the mean of the first order statistic as a function of both parameters. Since censored data often occur in applications, we study two types of censoring for their effects on the methods of estimation: Type II censoring and multiple random censoring. In this study we consider different sample sizes and several values of the true shape and scale parameters.

Comparisons are made in terms of bias and the mean squared error of the estimates. We propose that the LS method be generally preferred over the ML and MML methods for estimating the Pareto parameter γ for all sample sizes, all values of the parameter and for both complete and censored samples. In many cases, however, the ML estimates are comparable in their efficiency, so that either estimator can effectively be used. For estimating the parameter α, the LS method is also generally preferred for smaller values of the parameter (α ≤ 4). For the larger values of the parameter, and for censored samples, the MML method appears superior to the other methods, with a slight advantage over the LS method. For larger values of the parameter α, for censored samples and all methods, underestimation can be a problem.
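As a rough illustration of the ML method described above, the sketch below computes the standard closed-form maximum likelihood estimates for a complete (uncensored) Pareto sample: the scale estimate is the minimum sample value, and the shape estimate follows from it. The function name `pareto_mle` is hypothetical, the censored-data, LS and MML variants studied in the paper are not covered, and the mapping of "scale" and "shape" onto the abstract's symbols γ and α is not stated in the abstract, so it is left open here.

```python
import math
from typing import Sequence, Tuple


def pareto_mle(sample: Sequence[float]) -> Tuple[float, float]:
    """Closed-form ML estimates (scale, shape) for an uncensored sample
    from a Pareto distribution of the first kind.

    The log-likelihood is maximized in the scale parameter at the
    smallest observation; given that value, the shape estimate is
    n / sum(log(x_i / scale)).
    """
    if len(sample) < 2:
        raise ValueError("need at least two observations")
    if min(sample) <= 0:
        raise ValueError("Pareto observations must be positive")

    scale = min(sample)
    log_ratio_sum = sum(math.log(x / scale) for x in sample)
    if log_ratio_sum == 0:
        raise ValueError("all observations are equal; shape is not estimable")
    shape = len(sample) / log_ratio_sum
    return scale, shape
```

For example, `pareto_mle([1.0, math.e, math.e ** 2])` gives a scale of 1.0 and a shape of about 1.0, because the log ratios are 0, 1 and 2.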

8.

Not the First Digit! Using Benford's Law to Detect Fraudulent Scientific Data

Andreas Diekmann 《Journal of applied statistics》2007,34(3):321-329

Digits in statistical data produced by natural or social processes are often distributed in a manner described by ‘Benford's law’. Recently, a test against this distribution was used to identify fraudulent accounting data. This test is based on the supposition that first, second, third, and other digits in real data follow the Benford distribution while the digits in fabricated data do not. Is it possible to apply Benford tests to detect fabricated or falsified scientific data as well as fraudulent financial data? We approached this question in two ways. First, we examined the use of the Benford distribution as a standard by checking the frequencies of the nine possible first and ten possible second digits in published statistical estimates. Second, we conducted experiments in which subjects were asked to fabricate statistical estimates (regression coefficients). The digits in these experimental data were scrutinized for possible deviations from the Benford distribution. There were two main findings. First, both digits of the published regression coefficients were approximately Benford distributed or at least followed a pattern of monotonic decline. Second, the experimental results yielded new insights into the strengths and weaknesses of Benford tests. Surprisingly, first digits of faked data also exhibited a pattern of monotonic decline, while second, third, and fourth digits were distributed less in accordance with Benford's law. At least in the case of regression coefficients, there were indications that checks for digit-preference anomalies should focus less on the first (i.e. leftmost) and more on later digits.
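For readers unfamiliar with the Benford distribution, the sketch below computes the expected frequencies of the nine first digits and ten second digits, and a Pearson chi-square statistic comparing observed first digits with those frequencies. The abstract does not say which test statistic the author used, so the chi-square comparison is an assumption, and all function names are hypothetical.

```python
import math
from typing import Iterable


def benford_first_digit(d: int) -> float:
    """Benford probability that the first (leftmost) digit equals d, d in 1..9."""
    return math.log10(1 + 1 / d)


def benford_second_digit(d: int) -> float:
    """Benford probability that the second digit equals d, d in 0..9.

    Obtained by summing over the nine possible first digits k.
    """
    return sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))


def first_digit(x: float) -> int:
    """Leftmost nonzero digit of a nonzero number."""
    # Scientific notation puts the leading digit first, e.g. '3.4...e-02'.
    return int(f"{abs(x):.15e}"[0])


def chi_square_first_digit(values: Iterable[float]) -> float:
    """Pearson chi-square statistic of observed first digits against Benford."""
    nonzero = [v for v in values if v != 0]
    n = len(nonzero)
    if n == 0:
        raise ValueError("need at least one nonzero value")
    counts = [0] * 9
    for v in nonzero:
        counts[first_digit(v) - 1] += 1
    stat = 0.0
    for d in range(1, 10):
        expected = n * benford_first_digit(d)
        stat += (counts[d - 1] - expected) ** 2 / expected
    return stat
```

The same comparison can be run on second digits by swapping in `benford_second_digit` and a second-digit extractor, which matches the abstract's suggestion to look beyond the leftmost digit.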

9.

Assessing the use of sample selection models in the estimation of fertility postponement effects

F. C. Billari R. Borgoni 《Statistical Methods and Applications》2005,14(3):389-402

Several studies have shown that at the individual level there exists a negative relationship between age at first birth and completed fertility. Using twin data in order to control for unobserved heterogeneity as a possible source of bias, Kohler et al. (2001) showed the significant presence of such a "postponement effect" at the micro level. In this paper, we apply sample selection models, where selection is based on having or not having had a first birth at all, to estimate the impact of postponing first births on subsequent fertility for four European nations, three of which now have lowest-low fertility levels. We use data from a set of comparative surveys (Fertility and Family Surveys), and we apply sample selection models on the logarithm of total fertility and on the progression to the second birth. Our results show that postponement effects are only very slightly affected by sample selection biases, so that sample selection models do not significantly improve the results of standard regression techniques on selected samples. Our results confirm that the postponement effect is higher in countries with lowest-low fertility levels.

10.

A marginalized diffusion model for estimating age at first lower endoscopy use from current status data

Diana L. Miglioretti Elizabeth R. Brown 《Journal of the Royal Statistical Society. Series C, Applied statistics》2008,57(1):61-74

Summary. We propose an approach for estimating the age at first lower endoscopy examination from current status data that were collected via two series of cross-sectional surveys. To model the national probability of ever having a lower endoscopy examination, we incorporate birth cohort effects into a mixed influence diffusion model. We link a state-specific model to the national level diffusion model by using a marginalized modelling approach. In future research, results from our model will be used as microsimulation model inputs to estimate the contribution of endoscopy examinations to observed changes in colorectal cancer incidence and mortality.

11.

Estimation in Discretely Observed Diffusions Killed at a Threshold

ENRICO BIBBONA SUSANNE DITLEVSEN 《Scandinavian Journal of Statistics》2013,40(2):274-293

Abstract. Parameter estimation in diffusion processes from discrete observations up to a first‐passage time is clearly of practical relevance, but does not seem to have been studied so far. In neuroscience, many models for the membrane potential evolution involve the presence of an upper threshold. Data are modelled as discretely observed diffusions which are killed when the threshold is reached. Statistical inference is often based on a misspecified likelihood that ignores the presence of the threshold, causing severe bias; e.g. the bias incurred in the drift parameters of the Ornstein–Uhlenbeck model for biologically relevant parameters can be up to 25–100 per cent. We compute or approximate the likelihood function of the killed process. When estimating from a single trajectory, considerable bias may still be present, and the distribution of the estimates can be heavily skewed, with a huge variance. Parametric bootstrap is effective in correcting the bias. Standard asymptotic results do not apply, but consistency and asymptotic normality may be recovered when multiple trajectories are observed, if the mean first‐passage time through the threshold is finite. Numerical examples illustrate the results and an experimental data set of intracellular recordings of the membrane potential of a motoneuron is analysed.

12.

Parametric Estimation of Menarcheal Age Distribution Based on Recall Data

Sedigheh Mirzaei Salehabadi Debasis Sengupta Rituparna Das 《Scandinavian Journal of Statistics》2015,42(1):290-305

Menarche, the onset of menstruation, is an important maturational event of female childhood. Most of the studies of age at menarche make use of dichotomous (status quo) data. More information can be harnessed from recall data, but such data are often censored in an informative way. We show that the usual maximum likelihood estimator based on interval censored data, which ignores the informative nature of censoring, can be biased and inconsistent. We propose a parametric estimator of the menarcheal age distribution on the basis of a realistic model of the recall phenomenon. We identify the additional information contained in the recall data and demonstrate theoretically as well as through simulations the advantage of the maximum likelihood estimator based on recall data over that based on status quo data.

13.

Forecasting mortality rates via density ratio modeling

Benjamin Kedem Guanhua Lu Rong Wei Paul D. Williams 《Revue canadienne de statistique》2008,36(2):193-206

The authors propose a semiparametric approach to modeling and forecasting age‐specific mortality in the United States. Their method is based on an extension of a class of semiparametric models to time series. It combines information from several time series and estimates the predictive distribution conditional on past data. The conditional expectation, which is the most commonly used predictor in practice, is the first moment of this distribution. The authors compare their method to that of Lee and Carter.

14.

The long-term pattern of adult mortality and the highest attained age

A. R. Thatcher 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》1999,162(1):5-43

Recent new data on old age mortality point to a particular model for the way in which the probability of dying increases with age. The model is found to fit not only modern data but also some widely spaced historical data for the 19th and 17th centuries, and even some estimates for the early mediaeval period. The results show a pattern which calls for explanation. The model can also be used to predict a probability distribution for the highest age which will be attained in given circumstances. The results are relevant to the current debate about whether there is a fixed upper limit to the length of human life.

15.

Models for potentially biased evidence in meta-analysis using empirically based priors

N. J. Welton A. E. Ades J. B. Carlin D. G. Altman J. A. C. Sterne 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2009,172(1):119-136

Summary. We present models for the combined analysis of evidence from randomized controlled trials categorized as being at either low or high risk of bias due to a flaw in their conduct. We formulate a bias model that incorporates between-study and between-meta-analysis heterogeneity in bias, and uncertainty in overall mean bias. We obtain algebraic expressions for the posterior distribution of the bias-adjusted treatment effect, which provide limiting values for the information that can be obtained from studies at high risk of bias. The parameters of the bias model can be estimated from collections of previously published meta-analyses. We explore alternative models for such data, and alternative methods for introducing prior information on the bias parameters into a new meta-analysis. Results from an illustrative example show that the bias-adjusted treatment effect estimates are sensitive to the way in which the meta-epidemiological data are modelled, but that using point estimates for bias parameters provides an adequate approximation to using a full joint prior distribution. A sensitivity analysis shows that the gain in precision from including studies at high risk of bias is likely to be low, however numerous or large their size, and that little is gained by incorporating such studies, unless the information from studies at low risk of bias is limited. We discuss approaches that might increase the value of including studies at high risk of bias, and the acceptability of the methods in the evaluation of health care interventions.

16.

The distribution by age of the frequency of first marriage in a female cohort

Coale AJ Mcneil DR 《Journal of the American Statistical Association》1972,67(340):743-749

The authors present 2 methods for the approximation of a representative schedule recording first marriage frequencies by age. Both treatments are mathematically complex. One method achieves a very close approximation with a simple closed form frequency function, which is the limiting distribution of the convolution of an infinite number of exponentially distributed components. The other method achieves an equal approximation by the convolution of a normal distribution of age of entry into a marriageable state and as few as 3 exponentially distributed delays. This latter convolution provides a feasible model of nuptiality, a model receiving surprising empirical support.

17.

Efficient inverse probability weighting method for quantile regression with nonignorable missing data

Pu-Ying Zhao De-Peng Jiang 《Statistics》2017,51(2):363-386

Quantile regression (QR) is a popular approach to estimate functional relations between variables for all portions of a probability distribution. Parameter estimation in QR with missing data is one of the most challenging issues in statistics. Regression quantiles can be substantially biased when observations are subject to missingness. We study several inverse probability weighting (IPW) estimators for parameters in QR when covariates or responses are subject to missing not at random. Maximum likelihood and semiparametric likelihood methods are employed to estimate the respondent probability function. To achieve nice efficiency properties, we develop an empirical likelihood (EL) approach to QR with the auxiliary information from the calibration constraints. The proposed methods are less sensitive to misspecified missing mechanisms. Asymptotic properties of the proposed IPW estimators are shown under general settings. The efficiency gain of the EL-based IPW estimator is quantified theoretically. Simulation studies and a data set on the work limitation of injured workers from Canada are used to illustrate our proposed methodologies.

18.

Joint generalized estimating equations for multivariate longitudinal binary outcomes with missing data: an application to acquired immune deficiency syndrome data

Stuart R. Lipsitz Garrett M. Fitzmaurice Joseph G. Ibrahim Debajyoti Sinha Michael Parzen Steven Lipshultz 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2009,172(1):3-20

Summary. In a large, prospective longitudinal study designed to monitor cardiac abnormalities in children born to women who are infected with the human immunodeficiency virus, instead of a single outcome variable, there are multiple binary outcomes (e.g. abnormal heart rate, abnormal blood pressure and abnormal heart wall thickness) considered as joint measures of heart function over time. In the presence of missing responses at some time points, longitudinal marginal models for these multiple outcomes can be estimated by using generalized estimating equations (GEEs), and consistent estimates can be obtained under the assumption of a missingness completely at random mechanism. When the missing data mechanism is missingness at random, i.e. the probability of missing a particular outcome at a time point depends on observed values of that outcome and the remaining outcomes at other time points, we propose joint estimation of the marginal models by using a single modified GEE based on an EM-type algorithm. The method proposed is motivated by the longitudinal study of cardiac abnormalities in children who were born to women infected with the human immunodeficiency virus, and analyses of these data are presented to illustrate the application of the method. Further, in an asymptotic study of bias, we show that, under a missingness at random mechanism in which missingness depends on all observed outcome variables, our joint estimation via the modified GEE produces almost unbiased estimates, provided that the correlation model has been correctly specified, whereas estimates from standard GEEs can lead to substantial bias.

19.

Search of Good Rotation Patterns to Improve the Precision of Estimates at Current Occasion

G. N. Singh Kumari Priyanka 《Communications in Statistics - Theory and Methods》2013,42(3):337-348

In rotation (successive) sampling, it is common practice to use the information collected on a previous occasion to improve the precision of the estimates at the current occasion. The previous information may be in the form of an auxiliary character, the character under study itself, or both. In the present work, information on an auxiliary character, which is readily available on all the occasions, has been used along with the information on the study character from the previous and current occasions. Consequently, chain type difference and regression estimators have been proposed for estimating the population mean at the second (current) occasion in two-occasion rotation (successive) sampling. The proposed estimators have been compared with the sample mean estimator when there is no matching and with the optimum estimator, which is the combination of the means of the matched and unmatched portions of the sample at the second occasion. Optimum replacement policy is also discussed. Theoretical results have been justified through empirical interpretation.

20.

Using continuation-ratio logits to analyze the variation of the age composition of fish catches

Trine Kvist Henrik Gislason Poul Thyregod 《Journal of applied statistics》2000,27(3):303-319

Major sources of information for the estimation of the size of the fish stocks and the rate of their exploitation are samples from which the age composition of catches may be determined. However, the age composition in the catches often varies as a result of several factors. Stratification of the sampling is desirable, because it leads to better estimates of the age composition, and the corresponding variances and covariances. The analysis is impeded by the fact that the response is ordered categorical. This paper introduces an easily applicable method to analyze such data. The method combines continuation-ratio logits and the theory for generalized linear mixed models. Continuation-ratio logits are designed for ordered multinomial response and have the feature that the associated log-likelihood splits into separate terms for each category level. Thus, generalized linear mixed models can be applied separately to each level of the logits. The method is illustrated by the analysis of age-composition data collected from the Danish sandeel fishery in the North Sea in 1993. The significance of possible sources of variation is evaluated, and formulae for estimating the proportions of each age group and their variance-covariance matrix are derived.