Similar Documents
20 similar documents retrieved (search time: 486 ms)
1.
We describe a semiparametric mixture model for human fertility studies. The probability of conception is a product of two components. The mixing distribution, the component that introduces the heterogeneity among menstrual cycles that come from different couples, is characterized nonparametrically by a finite number of moments. The second component, the intercourse-related probability, is modeled parametrically to assess possible exposure effects. We discuss an EM algorithm-based estimation procedure that incorporates the natural order in the moments.

2.
Grouped data are commonly encountered in applications. All data from a continuous population are grouped due to rounding of the individual observations. In this paper, the Bernstein polynomial model is proposed as an approximate model for estimating a univariate density function based on grouped data. The coefficients of the Bernstein polynomial, as the mixture proportions of beta distributions, can be estimated using an EM algorithm. The optimal degree of the Bernstein polynomial can be determined using a change-point estimation method. The rate of convergence of the proposed density estimate to the true density is proved to be almost parametric, via an acceptance-rejection argument of the kind used for generating random numbers. The proposed method is compared with some existing methods in a simulation study and is applied to the Chicken Embryo Data.
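As a concrete illustration (not from the paper), the estimator is simply a beta mixture: with degree m and weights w_0, ..., w_m summing to one, the density estimate is f(x) = sum_j w_j Beta(j+1, m-j+1)(x) on data rescaled to [0, 1]. A minimal sketch, with illustrative weights standing in for the EM output:

```python
import numpy as np
from scipy.stats import beta

def bernstein_density(x, w):
    """Bernstein polynomial density on [0, 1]: a mixture of
    Beta(j+1, m-j+1) components, j = 0..m, with weights w."""
    m = len(w) - 1
    comps = np.array([beta.pdf(x, j + 1, m - j + 1) for j in range(m + 1)])
    return np.dot(w, comps)

# Illustrative weights (in practice, estimated by EM from grouped counts).
w = np.array([0.05, 0.15, 0.40, 0.30, 0.10])  # degree m = 4, sums to 1
xs = np.linspace(0.0, 1.0, 5)
print(bernstein_density(xs, w))
```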

3.
The number of sterile couples in a retrospective study of the number of cycles to conception is necessarily zero; this is not so for a prospective study. The paper puts forward a modification of Weinberg and Gladen's beta-geometric model for cycles to conception that is suitable for both types of investigation. The probability that a couple achieves conception at the xth cycle, but not earlier, is assumed to take the form R_x = (1 − ρ)/(1 − m^(x−1) ρ/u), instead of μ/(1 − θ + θx). The set of parameter restraints (0 < m < 1, 0 < ρ < 1, 1 < u) is appropriate for retrospective data, whilst the alternative set of restraints (1 < m, 1 < ρ, 0 < u < 1) is appropriate for prospective data. The decrease in R_x over time can be interpreted not only as a time effect, but also as a heterogeneity effect, by replacing Weinberg and Gladen's beta mixture of geometric distributions by a q-beta mixture.
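Treating R_x as the per-cycle conditional (hazard) probability, as the comparison with the beta-geometric form μ/(1 − θ + θx) suggests, the cycle-to-conception distribution follows directly. A minimal sketch with illustrative retrospective-type parameter values:

```python
import numpy as np

def hazard(x, m, rho, u):
    """Per-cycle conception probability R_x = (1 - rho) / (1 - m**(x-1) * rho / u)."""
    return (1.0 - rho) / (1.0 - m ** (x - 1) * rho / u)

def cycle_pmf(n_cycles, m, rho, u):
    """P(conception at cycle x, not earlier) = R_x * prod_{k < x} (1 - R_k)."""
    R = hazard(np.arange(1, n_cycles + 1), m, rho, u)
    surv = np.concatenate(([1.0], np.cumprod(1.0 - R)[:-1]))
    return R * surv

# Illustrative retrospective-type parameters (0 < m < 1, 0 < rho < 1, u > 1).
print(cycle_pmf(6, m=0.8, rho=0.5, u=1.5))  # R_x decreases across cycles
```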

4.
The beta modified Weibull distribution
A five-parameter distribution, the so-called beta modified Weibull distribution, is defined and studied. The new distribution contains, as special submodels, several important distributions discussed in the literature, such as the generalized modified Weibull, beta Weibull, exponentiated Weibull, beta exponential, modified Weibull and Weibull distributions, among others. The new distribution can be used effectively in the analysis of survival data since it accommodates monotone, unimodal and bathtub-shaped hazard functions. We derive the moments and examine the order statistics and their moments. We propose the method of maximum likelihood for estimating the model parameters and obtain the observed information matrix. A real data set is used to illustrate the importance and flexibility of the new distribution.
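For concreteness, a hedged sketch of the density under the usual beta-generated construction, assuming the modified Weibull baseline G(x) = 1 − exp(−α x^γ e^{λx}); the notation is the common one and may differ from the paper's:

```python
import numpy as np
from scipy.special import beta as beta_fn

def bmw_pdf(x, a, b, alpha, gamma, lam):
    """Beta modified Weibull density via the beta-G construction:
    f(x) = g(x) * G(x)**(a-1) * (1 - G(x))**(b-1) / B(a, b),
    with baseline CDF G(x) = 1 - exp(-alpha * x**gamma * exp(lam * x))."""
    H = alpha * x ** gamma * np.exp(lam * x)   # baseline cumulative hazard
    G = 1.0 - np.exp(-H)
    g = alpha * x ** (gamma - 1) * (gamma + lam * x) * np.exp(lam * x) * np.exp(-H)
    return g * G ** (a - 1) * (1.0 - G) ** (b - 1) / beta_fn(a, b)

# Illustrative parameter values only.
x = np.linspace(0.1, 3.0, 5)
print(bmw_pdf(x, a=2.0, b=1.5, alpha=0.5, gamma=1.2, lam=0.1))
```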

5.

Ordinal data are often modeled using a continuous latent response distribution, which is partially observed through windows of adjacent intervals defined by cutpoints. In this paper we propose the beta distribution as a model for the latent response. The beta distribution has several advantages over the other commonly used distributions, e.g., the normal and logistic. In particular, it enables separate modeling of location and dispersion effects, which is essential in the Taguchi method of robust design. First, we study the problem of estimating the location and dispersion parameters of a single beta distribution (representing a single treatment) from ordinal data, assuming known equispaced cutpoints. Two methods of estimation are compared: the maximum likelihood method and the method of moments. Two ways of treating the data are considered: in raw discrete form and in smoothed, continuousized form. A large-scale simulation study is carried out to compare the different methods. The mean square errors of the estimates are obtained under a variety of parameter configurations. Comparisons are made based on the ratios of the mean square errors (called the relative efficiencies). No method is universally best, but the maximum likelihood method using continuousized data is found to perform generally well, especially for estimating the dispersion parameter. This method is also computationally much faster than the other methods and does not experience convergence difficulties in the case of sparse or empty cells. Next, the problem of estimating unknown cutpoints is addressed. Here the multiple-treatment setup is considered, since in an actual application cutpoints are common to all treatments and must be estimated from all the data. A two-step iterative algorithm is proposed for estimating the location and dispersion parameters of the treatments, and the cutpoints. The proposed beta model and McCullagh's (1980) proportional odds model are compared by fitting them to two real data sets.

6.
Time-to-pregnancy (TTP) is the duration from the time a couple starts trying to become pregnant until they succeed. It is considered one of the most direct methods to measure natural fecundity in humans. Statistical tools for designing and analysing time-to-pregnancy studies belong to survival analysis, but several features require special attention. Prospective designs are difficult to carry out, and retrospective (pregnancy-based) designs, though widely used in this area, do not allow couples who remain childless to be included efficiently. A third possible design starts from a cross-sectional sample of couples currently trying to become pregnant, using the current duration (backward recurrence time) as the basis for the estimation of TTP. Regression analysis is then most conveniently carried out in the accelerated failure time model. This paper surveys some practical and technical-statistical issues in implementing this approach in a large telephone-based survey, the Epidemiological Observatory of Fecundity in France (Obseff).
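The identity underlying the current-duration design (under stationarity) is that the backward recurrence time Y has density g(y) = S(y)/μ, where S is the TTP survivor function and μ its mean, so that S(y) = g(y)/g(0). A minimal numeric sketch, with an assumed Weibull TTP standing in for the unknown truth:

```python
import numpy as np
from scipy.stats import weibull_min
from scipy.integrate import quad

# Assumed (illustrative) Weibull TTP: shape k, scale lam in months.
k, lam = 0.9, 6.0
S = lambda t: weibull_min.sf(t, k, scale=lam)   # TTP survivor function
mu = quad(S, 0, np.inf)[0]                      # E[T] = integral of S
g = lambda y: S(y) / mu                         # current-duration density

for y in (0.0, 3.0, 6.0, 12.0):
    print(y, S(y), g(y) / g(0))                 # the last two columns agree
```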

7.
In this paper we propose an alternative procedure for estimating the parameters of the beta regression model, based on the EM algorithm. For this, we take advantage of the stochastic representation of the beta random variable as a ratio involving independent gamma random variables. We present a complete EM-algorithm approach, including point and interval estimation and diagnostic tools for detecting outlying observations. As illustrated in this paper, the EM-algorithm approach provides better estimates of the precision parameter than the direct maximum likelihood (ML) approach. We present the results of Monte Carlo simulations comparing the EM algorithm and direct ML. Finally, two empirical examples illustrate the full EM-algorithm approach for the beta regression model. This paper is accompanied by supplementary material.
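The stochastic representation in question is standard: if Y1 ~ Gamma(μφ, 1) and Y2 ~ Gamma((1 − μ)φ, 1) are independent, then Y1/(Y1 + Y2) ~ Beta(μφ, (1 − μ)φ), the mean-precision parameterization used in beta regression. A quick sanity-check sketch (values illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

mu, phi = 0.3, 5.0                        # illustrative mean / precision
y1 = rng.gamma(mu * phi, 1.0, size=100_000)
y2 = rng.gamma((1.0 - mu) * phi, 1.0, size=100_000)
w = y1 / (y1 + y2)                        # ~ Beta(mu*phi, (1-mu)*phi)

# Beta mean is mu; Beta variance is mu*(1-mu)/(1+phi).
print(w.mean(), w.var(), mu * (1 - mu) / (1 + phi))
```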

8.
In longitudinal studies, observation times are often irregular and subject-specific. Frequently they are related to the outcome measure or to other variables that are associated with the outcome measure but undesirable to condition upon in the model for the outcome. Regression analyses that are unadjusted for outcome-dependent follow-up then yield biased estimates. The authors propose a class of inverse-intensity rate-ratio weighted estimators in generalized linear models that adjust for outcome-dependent follow-up. The estimators, based on estimating equations, are very simple and easily computed; they can be used under mixtures of continuous and discrete observation times. The predictors of observation times can be past observed outcomes, cumulative values of outcome-model covariates and other factors associated with the outcome. The authors validate their approach through simulations and illustrate it using data from a supported housing program from the US federal government.

9.
One solution to the dimensionality problem raised by projection of individual age-specific fertility rates is the use of parametric curves to approximate the annual age-specific rates and a multivariate time series model to forecast the curve parameters. Such a method reduces the number of time series to be modeled for women 14-45 years of age from 32 (one per single year of age) to the small number of curve parameters. In addition, the curves force even long-term fertility projections to exhibit the same smooth distribution across age as historical data. The database used to illustrate this approach was age-specific fertility rates for US white women in 1921-84. An important advantage of this model is that it permits investigation of the interactions among the total fertility rate, the mean age of childbearing, and the standard deviation of age at childbearing. In the analysis of this particular database, the contemporaneous relationship between the mean and standard deviation of age at childbearing was the only significant relationship. The addition of bias forecasts to the forecast gamma curve improves forecast accuracy, especially 1-2 years ahead. The most recent US Census Bureau projections have combined a time series model with long-term projections based on demographic judgment. These official projections yielded a slightly higher ultimate mean age and slightly lower standard deviation than those resulting from the model described in this paper.

10.
Longitudinal data analysis requires a proper estimation of the within-cluster correlation structure in order to achieve efficient estimates of the regression parameters. When applying likelihood-based methods one may select an optimal correlation structure by the AIC or BIC. However, such information criteria are not applicable to estimating-equation-based approaches. In this paper we develop a model averaging approach that estimates the correlation matrix by a weighted sum of a group of patterned correlation matrices under the GEE framework. The optimal weight is determined by minimizing the difference between the weighted sum and a consistent yet inefficient estimator of the correlation structure. The computation of our proposed approach involves only a standard quadratic program on top of the standard GEE procedure and can be easily implemented in practice. We provide theoretical justifications and extensive numerical simulations to support the application of the proposed estimator. Two well-known longitudinal data sets are revisited to implement and illustrate our methodology.
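A schematic stand-in for the paper's quadratic program: choose nonnegative weights summing to one so that the weighted sum of candidate patterned matrices is closest (in Frobenius norm) to a consistent moment-based estimate. The candidate matrices and target below are illustrative:

```python
import numpy as np
from scipy.optimize import minimize

def average_weights(candidates, target):
    """Weights w >= 0 with sum(w) = 1 minimizing ||sum_k w_k R_k - R_hat||_F^2."""
    K = len(candidates)
    def objective(w):
        diff = sum(wk * Rk for wk, Rk in zip(w, candidates)) - target
        return np.sum(diff ** 2)
    cons = ({"type": "eq", "fun": lambda w: w.sum() - 1.0},)
    res = minimize(objective, np.full(K, 1.0 / K),
                   bounds=[(0.0, 1.0)] * K, constraints=cons)
    return res.x

# Illustrative 3x3 candidates: independence, exchangeable, AR(1).
I = np.eye(3)
EX = np.full((3, 3), 0.5); np.fill_diagonal(EX, 1.0)
AR = np.array([[1.0, 0.5, 0.25], [0.5, 1.0, 0.5], [0.25, 0.5, 1.0]])
R_hat = np.array([[1.0, 0.45, 0.3], [0.45, 1.0, 0.45], [0.3, 0.45, 1.0]])
print(average_weights([I, EX, AR], R_hat))
```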

11.
The McDonald extended distribution: properties and applications
We study a five-parameter lifetime distribution, called the McDonald extended exponential model, that generalizes the exponential, generalized exponential, Kumaraswamy exponential and beta exponential distributions, among others. We obtain explicit expressions for the moments and incomplete moments, quantile and generating functions, mean deviations, Bonferroni and Lorenz curves and the Gini concentration index. The method of maximum likelihood and a Bayesian procedure are adopted for estimating the model parameters. The applicability of the new model is illustrated by means of a real data set.

12.
We introduce a new methodology for estimating the parameters of a two-sided jump model, which aims at decomposing the daily stock return evolution into (unobservable) positive and negative jumps as well as Brownian noise. The parameters of interest are the jump beta coefficients, which measure the influence of the market jumps on the stock returns and are latent components. To this end, we first use the Variance Gamma (VG) distribution, which is frequently used in modeling financial time series and leads to the revelation of the hidden market jumps' distributions. Our method then estimates the parameters of the model from the central moments of the stock returns. It is proved that the proposed method always provides a solution in terms of the jump beta coefficients. We thus achieve a semi-parametric fit to the empirical data. The methodology itself serves as a criterion to test the fit of any set of parameters to the empirical returns. The analysis is applied to NASDAQ and Google returns during the 2006-2008 period.

13.
Two problems need to be solved before proper advice can be given to couples undergoing in vitro fertilization therapy. Firstly, does the long-run success rate really converge to 100%? Secondly, what success rate can be expected within a reasonable, finite number of cycles? We propose a model based on a Weibull distribution. Data on 23,520 couples were used to calculate the cumulative pregnancy rate.
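One hedged reading of such a model is a Weibull cure-rate form, P(n) = p_inf (1 − exp(−(n/b)^c)), in which the plateau p_inf < 1 speaks to the first question and P(n) for small n to the second; the paper's exact parameterization may differ, and all values below are illustrative:

```python
import numpy as np

def cum_pregnancy_rate(n, p_inf, b, c):
    """Cumulative pregnancy rate after n cycles under an assumed
    Weibull cure-rate form with long-run plateau p_inf < 1."""
    return p_inf * (1.0 - np.exp(-(n / b) ** c))

for n in (1, 3, 6, 10):
    print(n, cum_pregnancy_rate(n, p_inf=0.85, b=3.0, c=1.1))
```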

14.
Monte Carlo simulation methods are increasingly being used to evaluate the properties of statistical estimators in a variety of settings. The utility of these methods depends upon the existence of an appropriate data-generating process. Observational studies are increasingly being used to estimate the effects of exposures and interventions on outcomes. Conventional regression models allow for the estimation of conditional or adjusted estimates of treatment effects. There is an increasing interest in statistical methods for estimating marginal or average treatment effects. However, in many settings, conditional treatment effects can differ from marginal treatment effects. Therefore, existing data-generating processes for conditional treatment effects are of little use in assessing the performance of methods for estimating marginal treatment effects. In the current study, we describe and evaluate the performance of two different data-generating processes for generating data with a specified marginal odds ratio. The first process is based upon computing Taylor series expansions of the probabilities of success for treated and untreated subjects. The expansions are then integrated over the distribution of the random variables to determine the marginal probabilities of success for treated and untreated subjects. The second process is based upon an iterative process of evaluating marginal odds ratios using Monte Carlo integration. The second method was found to be computationally simpler and to have superior performance compared to the first method.
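A minimal sketch of the second (iterative Monte Carlo) approach: for an assumed conditional logistic model, compute the marginal odds ratio by Monte Carlo integration over the covariate distribution and search (here by bisection) for the conditional log-odds ratio that hits the target. The intercept, covariate effect, covariate law and target are illustrative assumptions:

```python
import numpy as np
from scipy.special import expit

rng = np.random.default_rng(1)
x = rng.normal(size=200_000)            # illustrative covariate distribution

def marginal_or(beta_t, beta0=-1.0, beta_x=1.0):
    """Marginal OR (treated vs untreated) by Monte Carlo integration over x."""
    p1 = expit(beta0 + beta_t + beta_x * x).mean()   # everyone treated
    p0 = expit(beta0 + beta_x * x).mean()            # no one treated
    return (p1 / (1 - p1)) / (p0 / (1 - p0))

# Bisection: find the conditional log-OR whose induced marginal OR is 2.0.
target, lo, hi = 2.0, 0.0, 3.0
for _ in range(40):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if marginal_or(mid) < target else (lo, mid)
print("conditional log-OR:", 0.5 * (lo + hi), "marginal OR:", marginal_or(lo))
```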

15.
Modeling clustered categorical data based on extensions of generalized linear model theory has received much attention in recent years. The rapidly increasing number of approaches suitable for categorical data in which clusters are uncorrelated, but correlations exist within a cluster, has caused uncertainty among applied scientists as to their respective merits and demerits. When estimation is centered on solving an unbiased estimating function for the mean parameters, together with estimation of covariance parameters describing within-cluster or among-cluster heterogeneity, many approaches can easily be related. This contribution describes a series of algorithms and their implementation in detail, based on a classification of inferential procedures for clustered data.

16.
In the analysis of recurrent events where the primary interest lies in studying covariate effects on the expected number of events occurring over a period of time, it is appealing to base models on the cumulative mean function (CMF) of the processes (Lawless & Nadeau 1995). In many chronic diseases, however, more than one type of event is manifested. Here we develop a robust inference procedure for joint regression models for the CMFs arising from a bivariate point process. Consistent parameter estimates with robust variance estimates are obtained via unbiased estimating functions for the CMFs. In most situations, the covariance structure of the bivariate point processes is difficult to specify correctly, but when it is known, an optimal estimating function for the CMFs can be obtained. As a convenient model for more general settings, we suggest the use of the estimating functions arising from bivariate mixed Poisson processes. Simulation studies demonstrate that the estimators based on this working model are practically unbiased with robust variance estimates. Furthermore, hypothesis tests may be based on the generalized Wald or generalized score tests. Data from a trial of patients with bronchial asthma are analyzed to illustrate the estimation and inference procedures.

17.
Dichotomization is the transformation of a continuous outcome (response) to a binary outcome. This approach, while somewhat common, is harmful from the viewpoint of statistical estimation and hypothesis testing. We show that this leads to a loss of information, which can be large. For normally distributed data, this loss in terms of Fisher information is at least 1 − 2/π (about 36%). In other words, 100 continuous observations are statistically equivalent to 158 dichotomized observations. The amount of information lost depends greatly on the prior choice of cut points, with the optimal cut point depending upon the unknown parameters. The loss of information leads to loss of power or, conversely, a sample size increase to maintain power. Only in certain cases, for instance, in estimating a value of the cumulative distribution function and when the assumed model is very different from the true model, can the use of dichotomized outcomes be considered a reasonable approach.
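The 2/π figure is easy to reproduce: for X ~ N(μ, 1), the indicator 1{X > c} carries Fisher information φ(c − μ)²/{p(1 − p)} about μ, versus 1 for X itself, and this is maximized at the cut point c = μ. A minimal sketch:

```python
import numpy as np
from scipy.stats import norm

def info_ratio(c, mu=0.0):
    """Fisher information about mu carried by 1{X > c}, X ~ N(mu, 1),
    relative to the information (= 1) in the continuous observation."""
    z = c - mu
    p = norm.sf(z)
    return norm.pdf(z) ** 2 / (p * (1.0 - p))

print(info_ratio(0.0))          # 2/pi ~ 0.6366 at the optimal cut point c = mu
print(1.0 - info_ratio(0.0))    # minimal loss 1 - 2/pi ~ 0.3634 (about 36%)
print(100 / info_ratio(0.0))    # ~157 dichotomized obs per 100 continuous ones
print(info_ratio(1.0))          # a poorly placed cut point loses much more
```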

18.
We consider a statistical model for directed network formation that features both node-specific parameters that capture degree heterogeneity and common parameters that reflect homophily among nodes. The goal is to perform statistical inference on the homophily parameters while treating the node-specific parameters as fixed effects. Jointly estimating all parameters leads to incidental-parameter bias and incorrect inference. As an alternative, we develop an approach based on a sufficient statistic that separates inference on the homophily parameters from estimation of the fixed effects. The estimator is easy to compute and can be applied to both dense and sparse networks, and is shown to have desirable asymptotic properties under sequences of growing networks. We illustrate the improvements of this estimator over maximum likelihood and bias-corrected estimation in a series of numerical experiments. The technique is applied to explain the import and export patterns in a dense network of countries and to estimate a more sparse advice network among attorneys in a corporate law firm.

19.
New robust estimates for variance components are introduced. Two simple models are considered: the balanced one-way classification model with a random factor and the balanced mixed model with one random factor and one fixed factor. However, the method of estimation proposed can be extended to more complex models. The new method of estimation we propose is based on the relationship between the variance components and the coefficients of the least-mean-squared-error predictor between two observations of the same group. This relationship enables us to transform the problem of estimating the variance components into the problem of estimating the coefficients of a simple linear regression model. The variance-component estimators derived from the least-squares regression estimates are shown to coincide with the maximum-likelihood estimates. Robust estimates of the variance components can be obtained by replacing the least-squares estimates with robust regression estimates. In particular, a Monte Carlo study shows that, for outlier-contaminated normal samples, the variance-component estimates derived from GM regression estimates, and the associated test, outperform other robust procedures.
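A hedged least-squares sketch of the underlying idea for the balanced one-way model (the paper substitutes robust, e.g. GM, regression for least squares): the best linear predictor of one observation from another in the same group has slope equal to the intraclass correlation σ_a²/(σ_a² + σ_e²), so a within-group pairwise regression plus the total variance recovers both components. All values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

# Balanced one-way random model: y_ij = mu + a_i + e_ij.
sigma_a, sigma_e, n_groups, n_per = 1.0, 2.0, 500, 4
a = rng.normal(0.0, sigma_a, size=(n_groups, 1))
y = 5.0 + a + rng.normal(0.0, sigma_e, size=(n_groups, n_per))

# All ordered within-group pairs (j != k); regress one member on the other.
pairs = [(y[:, j], y[:, k]) for j in range(n_per) for k in range(n_per) if j != k]
u = np.concatenate([p[0] for p in pairs])
v = np.concatenate([p[1] for p in pairs])
cm = np.cov(u, v)
slope = cm[0, 1] / cm[1, 1]          # estimates rho = sigma_a^2 / (sigma_a^2 + sigma_e^2)
total = y.var()                      # estimates sigma_a^2 + sigma_e^2
print("sigma_a^2 ~", slope * total, " sigma_e^2 ~", (1 - slope) * total)
```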

20.
Inference for Observations of Integrated Diffusion Processes
Estimation of parameters in diffusion models is investigated when the observations are integrals over intervals of the process with respect to some weight function. This type of observation can, for example, be obtained when the process is observed after passage through an electronic filter. Another example is provided by the ice-core data on oxygen isotopes used to investigate paleo-temperatures. Finally, such data play a role in connection with the stochastic volatility models of finance. The integrated process is not a Markov process. Therefore, prediction-based estimating functions are applied to estimate parameters in the underlying diffusion model. The estimators are shown to be consistent and asymptotically normal. The theory developed in the paper also applies to integrals of processes other than diffusions. The method is applied to inference based on integrated data from Ornstein–Uhlenbeck processes and from the Cox–Ingersoll–Ross model, for both of which an explicit optimal estimating function is found.
