Similar articles
 20 similar articles found (search time: 31 ms)
1.
Abstract

In this paper, we propose a hybrid method to estimate the baseline hazard for the Cox proportional hazards model. In the proposed method, the nonparametric estimate of the survival function by Kaplan–Meier and the parametric estimate of the logistic function in the Cox proportional hazards model by the partial likelihood method are combined to estimate a parametric baseline hazard function. We compare the baseline hazard estimated by the proposed method with that estimated by the Cox model. The performance of each method is measured by the estimated parameters of the baseline distribution as well as by the goodness of fit of the model. We use real data as well as Monte Carlo simulation studies to compare the two methods. The results show that the proposed hybrid method provides a better estimate of the baseline hazard than the Cox model.

2.
Clusters of galaxies are a useful proxy to trace the distribution of mass in the universe. By measuring the mass of clusters of galaxies on different scales, one can follow the evolution of the mass distribution (Martínez and Saar, Statistics of the Galaxy Distribution, 2002). It can be shown that finding galaxy clusters is equivalent to finding density contour clusters (Hartigan, Clustering Algorithms, 1975): connected components of the level set S_c ≡ {f > c}, where f is a probability density function. Cuevas et al. (Can. J. Stat. 28, 367–382, 2000; Comput. Stat. Data Anal. 36, 441–459, 2001) proposed a nonparametric method that attempts to find density contour clusters by the minimal spanning tree. While their algorithm is conceptually simple, it requires intensive computations for large datasets. We propose a more efficient clustering method based on their algorithm with the Fast Fourier Transform (FFT). The method is applied to a study of galaxy clustering on large astronomical sky survey data.
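
To make the definition of a density contour cluster concrete, the following is a minimal one-dimensional sketch. It is not the minimal spanning tree or FFT algorithm of the abstract: it simply evaluates a Gaussian kernel density estimate on a grid and returns the maximal grid intervals on which the estimate exceeds the level c. The function name, the Gaussian kernel, the fixed bandwidth, and the grid are all assumptions made for illustration.

```python
import math

def density_level_clusters(data, bandwidth, level, lo, hi, step):
    """Return (left, right) grid intervals where a Gaussian KDE exceeds `level`.

    Hypothetical helper: a brute-force 1-D illustration of the level set
    {f > c}, not the method proposed in the abstract.
    """
    n_grid = int(round((hi - lo) / step)) + 1
    grid = [lo + i * step for i in range(n_grid)]
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2 * math.pi))
    dens = [
        norm * sum(math.exp(-0.5 * ((g - x) / bandwidth) ** 2) for x in data)
        for g in grid
    ]
    clusters = []
    start = None   # left end of the interval currently being traced
    prev = None    # previous grid point
    for g, f in zip(grid, dens):
        if f > level and start is None:
            start = g                        # entering the level set
        elif f <= level and start is not None:
            clusters.append((start, prev))   # leaving the level set
            start = None
        prev = g
    if start is not None:                    # level set reaches the grid edge
        clusters.append((start, grid[-1]))
    return clusters

# Two well-separated groups of points give two connected components.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
clusters = density_level_clusters(points, bandwidth=0.3, level=0.1,
                                  lo=-1.0, hi=6.0, step=0.05)
```

In one dimension connected components are just intervals; in higher dimensions deciding connectivity is the hard part, which is where graph-based tools such as the minimal spanning tree come in.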

3.
Abstract

In this article, we propose a penalized local log-likelihood method to locally select the number of components in nonparametric finite mixtures of regression models via a proportion shrinkage method. Mean functions and variance functions are estimated simultaneously. We show that the number of components can be estimated consistently, and further establish asymptotic normality of the functional estimates. We use a modified EM algorithm to estimate the unknown functions. Simulations are conducted to demonstrate the performance of the proposed method. We illustrate our method via an empirical analysis of housing price index data from the United States.

4.
Abstract

Based on the Gamma kernel density estimation procedure, this article constructs a nonparametric kernel estimate for regression functions when the covariates are nonnegative. Asymptotic normality and uniform almost sure convergence results for the new estimator are systematically studied, and the finite-sample performance of the proposed estimate is discussed via a simulation study and a comparison with an existing method. Finally, the proposed estimation procedure is applied to the Geyser data set.

5.
ABSTRACT

A four-parameter extended bimodal lifetime model called the exponentiated log-sinh Cauchy distribution is proposed. It extends the log-sinh Cauchy and folded Cauchy distributions. We derive some of its mathematical properties including explicit expressions for the ordinary moments and generating and quantile functions. The method of maximum likelihood is used to estimate the model parameters. We implement the fit of the model in the GAMLSS package and provide the codes. The flexibility of the model is illustrated by means of three real data sets.

6.
ABSTRACT

A new method is proposed for identifying clusters in continuous data indexed by time or by space. The scan statistic we introduce is derived from the well-known Mann–Whitney statistic. It is completely nonparametric as it relies only on the ranks of the marks. This scan test seems to be very powerful against any clustering alternative. These results have applications in various fields, such as the study of climate data or socioeconomic data.
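
The idea of a rank-based scan can be sketched as follows. This is only an illustration under stated assumptions, not the statistic defined in the article: for time-indexed marks without ties, it slides windows of every admissible length over the series, computes the rank sum of the marks inside each window, standardizes it with the usual null mean k(n+1)/2 and variance k(n-k)(n+1)/12 of the Wilcoxon/Mann–Whitney rank sum, and reports the window with the largest absolute standardized value. The function name and the window-length limits are hypothetical.

```python
def rank_scan(values, min_len=2, max_len=None):
    """Return (z, (start, stop)) for the window with the largest |z|.

    Illustrative sketch only; assumes no ties among `values`.
    Windows are half-open index ranges [start, stop).
    """
    n = len(values)
    if max_len is None:
        max_len = n // 2
    # Ranks 1..n of the marks (ties are not handled in this sketch).
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    # Prefix sums make every window rank sum an O(1) lookup.
    prefix = [0]
    for r in ranks:
        prefix.append(prefix[-1] + r)
    best_z, best_window = 0.0, None
    for k in range(min_len, max_len + 1):
        mean = k * (n + 1) / 2
        sd = (k * (n - k) * (n + 1) / 12) ** 0.5
        for start in range(n - k + 1):
            w = prefix[start + k] - prefix[start]
            z = (w - mean) / sd
            if abs(z) > abs(best_z):
                best_z, best_window = z, (start, start + k)
    return best_z, best_window

series = [3, 1, 2, 10, 11, 12, 0, 1.5, 2.5, 0.5]
z, window = rank_scan(series)
```

Because the maximum is taken over many overlapping windows, the standardized value cannot be compared with a normal quantile directly; in practice its significance would have to be assessed by some other means, for example by permuting the marks.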

7.
ABSTRACT

The varying-coefficient single-index model (VCSIM) is a very general and flexible tool for exploring the relationship between a response variable and a set of predictors. Popular special cases include single-index models and varying-coefficient models. In order to estimate the index-coefficient and the nonparametric varying-coefficients in the VCSIM, we propose a two-stage composite quantile regression estimation procedure, which integrates the local linear smoothing method and the information of quantile regressions at a number of conditional quantiles of the response variable. We establish the asymptotic properties of the proposed estimators for the index-coefficient and varying-coefficients when the error is heterogeneous. Our simulation results indicate that, compared with the existing mean-regression-based estimation method, the proposed method has comparable performance for normal errors and is more robust for errors with outliers or heavy tails. We illustrate our methodologies with a real example.

8.
Abstract

A nonparametric procedure is proposed to estimate multiple change-points of location changes in a univariate data sequence by using ranks instead of the raw data. While existing rank-based multiple change-point detection methods are mostly based on sequential tests, we treat it as a model selection problem. We derive the corresponding Schwarz’s information criterion for rank statistics, theoretically prove the consistency of the change-point estimator, and use a pruned dynamic programming algorithm to compute the change-point estimator. Simulation studies show our method’s robustness, effectiveness, and efficiency in detecting mean changes. We also apply the method to a gene dataset as an illustration.

9.
ABSTRACT

Researchers are often required to reuse data that have been collected and analyzed for other purposes. Issues may arise if the outcome of this secondary study is related to the outcome of the first study, and traditional methods may fail to deliver a consistent estimate. Here we propose a semiparametric approach that takes this correlation into account and produces asymptotically consistent and normally distributed estimates. We discuss its performance through simulations and apply the proposed method to a real dataset.

10.
ABSTRACT

A variable selection procedure based on least absolute deviation (LAD) estimation and adaptive lasso (LAD-Lasso for short) is proposed for median regression models with doubly censored data. The proposed procedure can select significant variables and estimate the parameters simultaneously, and the resulting estimators enjoy the oracle property. Simulation results show that the proposed method works well.

11.
In this paper, we propose a new procedure to estimate the distribution of a variable y when there are missing data. To compensate for the presence of missing responses, it is assumed that a covariate vector x is observed and that y and x are related by means of a semi-parametric regression model. Observed residuals are combined with predicted values to estimate the missing response distribution. Once the response distribution is consistently estimated, we can estimate any parameter defined through a continuous functional T using a plug-in procedure. We prove that the proposed estimators have a high breakdown point.

12.
Abstract

A method is proposed for the estimation of missing data in analysis of covariance models. It is based on obtaining an estimate of the missing observation that minimizes the error sum of squares. Specific derivation of this estimate is carried out for the one-factor analysis of covariance, and numerical examples are given to show the nature of the estimates produced. Parameter estimates from the imputed data are then compared with those from the incomplete data.

13.
ABSTRACT

For many years, detection of clusters has been of great public health interest and widely studied. Several methods have been developed to detect clusters and their performance has been evaluated in various contexts. Spatial scan statistics are widely used for geographical cluster detection and inference. Different types of discrete or continuous data can be analyzed using spatial scan statistics for Bernoulli, Poisson, ordinal, exponential, and normal models. In this paper, we propose a scan statistic for survival data which is based on a generalized life distribution model that covers three important life distributions, viz. Weibull, exponential, and Rayleigh. The proposed method is applied to the survival data of tuberculosis patients in the Nainital district of Uttarakhand, India, for the year 2004–05. Monte Carlo simulation studies reveal that the proposed method performs well for different survival distributions.

14.
Abstract

This paper presents a new method to estimate the quantiles of generic statistics by combining the concept of random weighting with importance resampling. This method converts the problem of quantile estimation to a dual problem of tail probability estimation. Random weighting theories are established to calculate the optimal resampling weights for estimation of tail probabilities via sequential variance minimization. Subsequently, the quantile estimate is constructed by using the obtained optimal resampling weights. Experimental results on real and simulated data sets demonstrate that the proposed random weighting method can effectively estimate the quantiles of generic statistics.

15.
Abstract

Cluster analysis is the distribution of objects into different groups or, more precisely, the partitioning of a data set into subsets (clusters) so that the data in each subset share some common trait according to some distance measure. Unlike classification, in clustering one has to first decide the optimum number of clusters and then assign the objects to different clusters. Solving such problems for a large number of high-dimensional data points is quite complicated, and most of the existing algorithms will not perform properly. In the present work, a new clustering technique applicable to large data sets has been used to cluster the spectra of 702,248 galaxies and quasars having 1,540 points in the wavelength range imposed by the instrument. The proposed technique has successfully discovered five clusters from this 702,248 × 1,540 data matrix.

16.
Estimation of the correlation coefficient between two variates (ρ) in the presence of correlated observations from a bivariate normal population is considered. The estimated maximum likelihood estimator (EMLE), an estimate based on the maximum likelihood estimator (MLE), is proposed and studied for the estimation of ρ. For the large-sample case, approximate expressions for the variance and the bias of the Pearson estimate of the correlation coefficient are derived. These expressions suggest that the Pearson estimator possesses high mean square error (MSE) in estimating ρ in comparison to the MLE. The MSE is particularly high when the observations within clusters are highly correlated. The Pearson estimate, the MLE, and the EMLE are evaluated in a simulation study. This study shows that the proposed EMLE performs better than the Pearson correlation coefficient except when the number of clusters is small.

17.
ABSTRACT

The Greenwood estimate (GE) is commonly employed for estimating the variance of the Kaplan–Meier estimate (KME) even though it underestimates the variance. To reduce the bias of the GE, Zhao (1996) proposed an alternative, called the homogenetic estimate (HE). In this note, we point out that the HE actually estimates the variance of the reduced sample estimate (RE) and can seriously overestimate that of the KME. We also derive the explicit relationship between the HE and the GE and discuss the use of the HE.
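
For readers unfamiliar with the two quantities being compared, here is a small sketch of the Kaplan–Meier estimate together with the standard Greenwood variance formula, Var[S(t)] ≈ S(t)² Σ d_i / (n_i (n_i − d_i)), where d_i is the number of events and n_i the number at risk at each event time up to t. It does not implement the homogenetic estimate discussed in the note. The function name and the choice to report a variance of 0 once the survival estimate reaches 0 are assumptions made for this illustration.

```python
from collections import Counter

def kaplan_meier_greenwood(times, events):
    """Return [(t, S(t), Greenwood variance)] at each distinct event time.

    `events[i]` is 1 if the i-th time is an observed event, 0 if censored.
    Illustrative helper, not taken from the article.
    """
    at_risk = len(times)
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)            # events and censorings leaving the risk set
    surv, gw_sum = 1.0, 0.0
    out = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            surv *= 1 - d / at_risk
            if at_risk > d:
                gw_sum += d / (at_risk * (at_risk - d))
                var = surv ** 2 * gw_sum
            else:
                var = 0.0             # assumption: S(t) = 0, variance reported as 0
            out.append((t, surv, var))
        at_risk -= exits[t]
    return out

# Five subjects; the ones observed at times 2 and 5 are censored.
km = kaplan_meier_greenwood([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
```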

18.
ABSTRACT

In this article we study the approximately unbiased multi-level pseudo maximum likelihood (MPML) estimation method for general multi-level modeling with sampling weights. We conduct a simulation study to determine the effect various factors have on the estimation method. The factors we included in this study are scaling method, size of clusters, invariance of selection, informativeness of selection, intraclass correlation, and variability of standardized weights. The scaling method is an indicator of how the weights are normalized on each level. The invariance of the selection is an indicator of whether or not the same selection mechanism is applied across clusters. The informativeness of the selection is an indicator of how biased the selection is. We summarize our findings and recommend a multi-stage procedure based on the MPML method that can be used in practical applications.

19.
ABSTRACT

In panel data models and other regressions with unobserved effects, fixed effects estimation is often paired with cluster-robust variance estimation (CRVE) to account for heteroscedasticity and un-modeled dependence among the errors. Although asymptotically consistent, CRVE can be biased downward when the number of clusters is small, leading to hypothesis tests with rejection rates that are too high. More accurate tests can be constructed using bias-reduced linearization (BRL), which corrects the CRVE based on a working model, in conjunction with a Satterthwaite approximation for t-tests. We propose a generalization of BRL that can be applied in models with arbitrary sets of fixed effects, where the original BRL method is undefined, and describe how to apply the method when the regression is estimated after absorbing the fixed effects. We also propose a small-sample test for multiple-parameter hypotheses, which generalizes the Satterthwaite approximation for t-tests. In simulations covering a wide range of scenarios, we find that the conventional cluster-robust Wald test can severely over-reject while the proposed small-sample test maintains Type I error close to nominal levels. The proposed methods are implemented in an R package called clubSandwich. This article has online supplementary materials.
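
As a point of reference, the uncorrected cluster-robust ("sandwich") variance that the abstract describes as biased downward with few clusters can be sketched for the simplest possible case: a single regressor with no intercept. This is the basic estimator often labelled CR0, not the bias-reduced linearization of the article and not the clubSandwich implementation; the function name and the one-regressor setting are assumptions chosen to keep the example short.

```python
from collections import defaultdict

def ols_cr0(x, y, cluster):
    """OLS slope for y = beta * x + e and its CR0 cluster-robust variance.

    Illustrative sketch: one regressor, no intercept, no small-sample correction.
    """
    sxx = sum(xi * xi for xi in x)                      # "bread": X'X
    beta = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    # "Meat": sum the scores x_i * e_i within each cluster, then square.
    scores = defaultdict(float)
    for xi, yi, g in zip(x, y, cluster):
        scores[g] += xi * (yi - beta * xi)
    var = sum(s * s for s in scores.values()) / sxx ** 2
    return beta, var

beta, var = ols_cr0(x=[1, 2, 1, 2], y=[1, 3, 2, 3],
                    cluster=["a", "a", "b", "b"])
```

With only two clusters, as in this toy call, the estimate rests on just two squared cluster scores, which is the small-sample situation in which the abstract reports that conventional tests over-reject.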

20.
Relative potency estimations in both multiple parallel-line and slope-ratio assays involve construction of simultaneous confidence intervals for ratios of linear combinations of general linear model parameters. The key problem here is that of determining multiplicity-adjusted percentage points of a multivariate t-distribution, the correlation matrix R of which depends on the unknown relative potency parameters. Several methods have been proposed in the literature on how to deal with R. In this article, we introduce a method based on an estimate of R (also called the plug-in approach) and compare it with various methods including conservative procedures based on probability inequalities. Attention is restricted to parallel-line assays though the theory is applicable for any ratios of coefficients in the general linear model. Extension of the plug-in method to linear mixed effect models is also discussed. The methods are compared with respect to their simultaneous coverage probabilities via Monte Carlo simulations. We also evaluate the methods in terms of confidence interval width through application to data from a multiple parallel-line assay.
