Similar Documents
20 similar documents found.
1.
A new general method of combining estimators is proposed in order to obtain an estimator with “improved” small-sample properties. It is based on a specification test statistic and incorporates some well-known methods such as preliminary testing. It is used to derive an alternative estimator for the slope in the simple errors-in-variables model, combining OLS and the modified instrumental variable estimator of Fuller. Small-sample properties of the new estimator are investigated by means of a Monte Carlo study.
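A minimal sketch of the pre-test idea behind such combinations: choose between the OLS slope and an instrumental-variable slope according to a Hausman-type specification statistic. The instrument `z`, the rough variance approximations, and the function name are illustrative assumptions; this is not Fuller's modified IV estimator or the paper's combination rule.

```python
import numpy as np
from scipy import stats

def pretest_slope(y, x, z, alpha=0.05):
    """Pre-test estimator for the slope in a simple errors-in-variables model.

    Picks the OLS slope unless a Hausman-type statistic rejects it in favour
    of a simple IV slope based on the instrument z (illustrative only).
    """
    x_c, y_c, z_c = x - x.mean(), y - y.mean(), z - z.mean()
    b_ols = (x_c @ y_c) / (x_c @ x_c)
    b_iv = (z_c @ y_c) / (z_c @ x_c)

    n = len(y)
    # Rough variance approximations for the two slope estimators.
    resid_iv = y_c - b_iv * x_c
    s2 = resid_iv @ resid_iv / (n - 1)
    var_ols = s2 / (x_c @ x_c)
    var_iv = s2 * (z_c @ z_c) / (z_c @ x_c) ** 2

    # Hausman-type statistic: large values favour the consistent IV estimator.
    h = (b_iv - b_ols) ** 2 / max(var_iv - var_ols, 1e-12)
    crit = stats.chi2.ppf(1 - alpha, df=1)
    return b_iv if h > crit else b_ols
```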

2.
When the data set contains outliers, the efficiency of traditional methods decreases. To address this problem, Kadilar et al. (2007) adapted the Huber-M method, one of several robust regression methods, to ratio-type estimators and thereby reduced the effect of outliers. In this study, new ratio-type estimators are proposed in the framework of Kadilar et al. (2007) using the Tukey-M, Hampel-M, Huber-MM, LTS, LMS and LAD robust methods. We derive the mean squared error (MSE) of these estimators theoretically and compare it with the MSE of the estimators based on the Huber-M and OLS methods. These comparisons show that the proposed estimators are more efficient than both the Huber-M approach of Kadilar et al. (2007) and the OLS approach. Moreover, under all conditions, all of the proposed estimators except the LAD-based one are more efficient than the robust estimators of Kadilar et al. (2007). These theoretical results are supported by a numerical example and a simulation based on data containing an outlier.
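A rough sketch of a ratio-type estimator of a population mean whose regression slope is fitted by an M-type robust regression, assuming the commonly used form (ȳ + b(X̄ − x̄))·X̄/x̄ with known auxiliary mean X̄. The form, data, and the choice of norms are assumptions for illustration; the MSE expressions of the cited papers are not reproduced.

```python
import numpy as np
import statsmodels.api as sm

def robust_ratio_estimate(y, x, X_bar, norm=None):
    """Ratio-type estimate of the mean of y with a robust regression slope."""
    norm = norm or sm.robust.norms.HuberT()
    slope = sm.RLM(y, sm.add_constant(x), M=norm).fit().params[1]
    ybar, xbar = y.mean(), x.mean()
    return (ybar + slope * (X_bar - xbar)) * X_bar / xbar

# Example with a Tukey biweight norm instead of Huber's:
rng = np.random.default_rng(0)
x = rng.gamma(5.0, 2.0, size=200)
y = 3.0 * x + rng.normal(0, 2, size=200)
y[0] = 500.0                      # a single gross outlier
est = robust_ratio_estimate(y, x, X_bar=10.0,
                            norm=sm.robust.norms.TukeyBiweight())
```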

3.
The zero-inflated Poisson (ZIP) distribution is widely used for modeling count data when the frequency of zeros is higher than expected under the Poisson distribution. There are many methods for making inferences about the inflation parameter in ZIP models, e.g. methods for testing the Poisson distribution (the inflation parameter is zero) against the ZIP distribution (the inflation parameter is positive). Most of these methods are based on maximum likelihood estimators, which do not have an explicit expression. The estimators obtained by the method of moments, however, are sufficiently powerful and are easy to obtain and implement. In this paper, we propose an approach based on the method of moments for making inferences about the inflation parameter in the ZIP distribution. Our method is compared to some recent methods in a simulation study and is illustrated by an example.
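For context, the classical moment estimators for the ZIP inflation parameter π and rate λ have closed forms: with E[X] = (1−π)λ and E[X(X−1)] = (1−π)λ², one gets λ̂ = mean(x(x−1))/mean(x) and π̂ = 1 − mean(x)/λ̂. The sketch below implements these textbook estimators, not the specific inference procedure of the cited paper.

```python
import numpy as np

def zip_moment_estimates(x):
    """Method-of-moments estimates (pi_hat, lambda_hat) for a ZIP sample."""
    x = np.asarray(x, dtype=float)
    m1 = x.mean()
    m2f = np.mean(x * (x - 1.0))          # second factorial moment
    lam_hat = m2f / m1
    pi_hat = 1.0 - m1 / lam_hat
    return pi_hat, lam_hat

# Quick check on simulated ZIP data with pi = 0.3, lambda = 2.5:
rng = np.random.default_rng(1)
n = 5000
z = rng.random(n) < 0.3
x = np.where(z, 0, rng.poisson(2.5, size=n))
print(zip_moment_estimates(x))            # roughly (0.3, 2.5)
```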

4.
In this article, we employ the variational Bayesian method to study parameter estimation in a linear regression model in which some regressors follow a Gaussian distribution with nonzero prior means. We obtain an analytical expression of the posterior parameter distribution and then propose an iterative algorithm for the model. Simulations are carried out to test the performance of the proposed algorithm, and the results confirm both its effectiveness and its reliability.

5.
Classification of high-dimensional data sets is a major challenge for statistical learning and data mining algorithms. To apply classification methods to high-dimensional data sets effectively, feature selection is an indispensable pre-processing step of the learning process. In this study, we consider the problem of constructing an effective feature selection and classification scheme for data sets that have a small sample size and a large number of features. A novel feature selection approach, named Four-Staged Feature Selection, is proposed to address the high-dimensional classification problem by selecting informative features. The proposed method first selects candidate features with a number of filtering methods based on different metrics, and then applies semi-wrapper, union and voting stages, respectively, to obtain the final feature subsets. Several statistical learning and data mining methods are used to verify the efficiency of the selected features. To test the adequacy of the proposed method, 10 different microarray data sets are employed because of their large number of features and small sample sizes.
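A loose sketch of the general filter-then-combine idea on a wide, small-sample data set: run several filter metrics, take the union of their candidate sets, and check the combined subset with a simple classifier. The stage structure, the value of k, and the final check are illustrative assumptions, not the paper's exact four-stage pipeline.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Small-sample, high-dimensional toy data (60 samples, 500 features).
X, y = make_classification(n_samples=60, n_features=500, n_informative=10,
                           random_state=0)

k = 20
filters = [f_classif, mutual_info_classif]        # stage 1: several filter metrics
candidate_sets = [set(np.flatnonzero(SelectKBest(f, k=k).fit(X, y).get_support()))
                  for f in filters]

union_idx = sorted(set.union(*candidate_sets))    # union of candidate features

# final check: does the combined subset support a simple classifier well?
scores = cross_val_score(LogisticRegression(max_iter=1000),
                         X[:, union_idx], y, cv=5)
print(len(union_idx), scores.mean())
```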

6.
Recently, the empirical best linear unbiased predictor has been widely used as a practical approach to small area inference. It is also of interest to construct empirical prediction intervals, but it is not clear which of the several existing prediction intervals should be used. In this article, we first obtain an empirical prediction interval by using the residual maximum likelihood method to estimate the unknown model variance parameters. We then compare this interval with other existing intervals based on the residual maximum likelihood method. In addition, several parametric bootstrap methods for constructing empirical prediction intervals are compared in a simulation study.

7.
Missing observations due to non‐response are commonly encountered in data collected from sample surveys. The focus of this article is on item non‐response which is often handled by filling in (or imputing) missing values using the observed responses (donors). Random imputation (single or fractional) is used within homogeneous imputation classes that are formed on the basis of categorical auxiliary variables observed on all the sampled units. A uniform response rate within classes is assumed, but that rate is allowed to vary across classes. We construct confidence intervals (CIs) for a population parameter that is defined as the solution to a smooth estimating equation with data collected using stratified simple random sampling. The imputation classes are assumed to be formed across strata. Fractional imputation with a fixed number of random draws is used to obtain an imputed estimating function. An empirical likelihood inference method under the fractional imputation is proposed and its asymptotic properties are derived. Two asymptotically correct bootstrap methods are developed for constructing the desired CIs. In a simulation study, the proposed bootstrap methods are shown to outperform traditional bootstrap methods and some non‐bootstrap competitors under various simulation settings. The Canadian Journal of Statistics 47: 281–301; 2019 © 2019 Statistical Society of Canada

8.
In this article, we propose a novel approach to fit a functional linear regression in which both the response and the predictor are functions. We consider the case where the response and the predictor processes are both sparsely sampled at random time points and are contaminated with random errors. In addition, the random times are allowed to be different for the measurements of the predictor and the response functions. The aforementioned situation often occurs in longitudinal data settings. To estimate the covariance and the cross‐covariance functions, we use a regularization method over a reproducing kernel Hilbert space. The estimate of the cross‐covariance function is used to obtain estimates of the regression coefficient function and of the functional singular components. We derive the convergence rates of the proposed cross‐covariance, the regression coefficient, and the singular component function estimators. Furthermore, we show that, under some regularity conditions, the estimator of the coefficient function has a minimax optimal rate. We conduct a simulation study and demonstrate merits of the proposed method by comparing it to some other existing methods in the literature. We illustrate the method by an example of an application to a real‐world air quality dataset. The Canadian Journal of Statistics 47: 524–559; 2019 © 2019 Statistical Society of Canada

9.
The problem of choosing the bandwidth h for kernel density estimation is considered. All the plug-in-type bandwidth selection methods require the use of a pilot bandwidth g. The usual way to make an h-dependent choice of g is by obtaining their asymptotic expressions separately and solving the two equations. In contrast, we obtain the asymptotically optimal value of g for every fixed h, thus making our selection 'less asymptotic'. Exact error expressions show that some usually assumed hypotheses have to be discarded in the asymptotic study in this case. Two versions of a new bandwidth selector based on this idea are proposed, and their properties are analysed through theoretical results and a simulation study.
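To make the role of the pilot bandwidth g concrete, the sketch below implements a textbook one-stage direct plug-in selector for a Gaussian kernel: estimate R(f'') with a pilot g and plug it into the AMISE-optimal formula h = [R(K)/(μ₂(K)² R(f'') n)]^{1/5}. The normal-reference pilot is an assumption for illustration; this is not the 'less asymptotic' h-dependent pilot choice proposed in the paper.

```python
import numpy as np

def direct_plugin_bandwidth(x, g=None):
    """Textbook direct plug-in bandwidth for a Gaussian kernel."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    sd = x.std(ddof=1)
    if g is None:
        g = 1.06 * sd * n ** (-1 / 7)          # rough normal-reference pilot

    # psi4 = integral of f''(x)^2 dx, estimated with the Gaussian kernel's
    # fourth derivative phi''''(u) = (u^4 - 6u^2 + 3) * phi(u).
    u = (x[:, None] - x[None, :]) / g
    phi = np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)
    psi4 = np.mean((u ** 4 - 6 * u ** 2 + 3) * phi) / g ** 5

    RK, mu2 = 1 / (2 * np.sqrt(np.pi)), 1.0     # Gaussian kernel constants
    return (RK / (mu2 ** 2 * psi4 * n)) ** (1 / 5)

h = direct_plugin_bandwidth(np.random.default_rng(2).normal(size=300))
```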

10.
We consider estimation of the unknown parameters of the Chen distribution [Chen Z. A new two-parameter lifetime distribution with bathtub shape or increasing failure rate function. Statist Probab Lett. 2000;49:155–161] with bathtub shape using progressively censored samples. We obtain maximum likelihood estimates by making use of an expectation–maximization algorithm. Different Bayes estimates are derived under squared error and balanced squared error loss functions. Since the associated posterior distribution is intractable, an approximation method is used to compute these estimates. A Metropolis–Hastings algorithm is also proposed, and some further approximate Bayes estimates are obtained. Asymptotic confidence intervals are constructed using the observed Fisher information matrix, and bootstrap intervals are proposed as well. Samples generated from the MH algorithm are further used to construct HPD intervals. We also obtain prediction intervals and predictors for future observations in one- and two-sample situations. A numerical study is conducted to compare the performance of the proposed methods using simulations. Finally, real data sets are analysed for illustration purposes.
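As a point of reference for the Chen model, the sketch below writes down its log-likelihood for a complete (uncensored) sample, f(t) = λβ t^{β−1} e^{t^β} exp{λ(1 − e^{t^β})}, and maximizes it numerically. The EM step for progressively censored data and the Bayes/MH estimators of the paper are not sketched; the simulated sample and optimizer choice are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def chen_negloglik(params, t):
    """Negative log-likelihood of the Chen (2000) distribution, complete sample."""
    log_beta, log_lam = params                    # optimise on the log scale
    beta, lam = np.exp(log_beta), np.exp(log_lam)
    tb = t ** beta
    ll = (np.log(lam) + np.log(beta) + (beta - 1) * np.log(t)
          + tb + lam * (1.0 - np.exp(tb)))
    return -np.sum(ll)

# Simulate by inverting F(t) = 1 - exp(lam * (1 - exp(t**beta))), then fit.
rng = np.random.default_rng(3)
u = rng.random(500)
beta0, lam0 = 0.7, 0.9
t = (np.log(1.0 - np.log(1.0 - u) / lam0)) ** (1.0 / beta0)
res = minimize(chen_negloglik, x0=np.log([1.0, 1.0]), args=(t,),
               method="Nelder-Mead")
beta_hat, lam_hat = np.exp(res.x)
```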

11.
Fang Y, Wu H, Zhu LX. Statistica Sinica. 2011;21(3):1145–1170.
We propose a two-stage estimation method for random coefficient ordinary differential equation (ODE) models. A maximum pseudo-likelihood estimator (MPLE) is derived based on a mixed-effects modeling approach, and its asymptotic properties for the population parameters are established. The proposed method does not require repeatedly solving ODEs and is computationally efficient, although this comes at the price of some loss of estimation efficiency. The method nevertheless offers an alternative when the exact likelihood approach fails because of model complexity and a high-dimensional parameter space, and it can also serve to obtain starting values for more accurate estimation methods. In addition, the proposed method does not require specifying the initial values of the state variables and preserves all the advantages of the mixed-effects modeling approach. The finite sample properties of the proposed estimator are studied via Monte Carlo simulations, and the methodology is illustrated with an application to an AIDS clinical data set.

12.
The maximum likelihood (ML) method is used to estimate the unknown Gamma regression (GR) coefficients. In the presence of multicollinearity, the variance of the ML estimator is inflated and inference based on the ML method may not be trustworthy. To combat multicollinearity, the Liu estimator has been used; in this estimator, estimation of the Liu parameter d is an important problem. Several methods for estimating this parameter are available in the literature. This study considers some of these methods and also proposes new methods for estimating d. A Monte Carlo simulation study is conducted to assess the performance of the proposed methods, with the mean squared error (MSE) as the performance criterion. Based on the simulation and application results, the Liu estimator is shown to be consistently superior to ML, and a recommendation is given on which Liu parameter estimator should be used in the Liu estimator for the GR model.
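A minimal sketch of how a Liu-type estimate is typically formed for Gamma regression once d is chosen, assuming the commonly used form β_d = (X'WX + I)⁻¹(X'WX + dI)β_ML with W the IRLS weight matrix at the MLE. With the log link and the Gamma variance function V(μ) = μ², the working weights are all 1, so X'WX reduces to X'X. The choice of d, which is the point of the paper, is left to the caller; the function name and link choice are assumptions.

```python
import numpy as np
import statsmodels.api as sm

def gamma_liu_estimator(y, X, d):
    """Liu-type coefficient estimate for Gamma regression with log link."""
    fam = sm.families.Gamma(link=sm.families.links.Log())
    beta_ml = sm.GLM(y, X, family=fam).fit().params
    XtWX = X.T @ X                 # log link + Gamma variance => unit weights
    p = X.shape[1]
    return np.linalg.solve(XtWX + np.eye(p), (XtWX + d * np.eye(p)) @ beta_ml)
```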

13.
Based on various improved robust covariance estimators in the literature, several modified versions of the well-known correlated information criterion (CIC) for selecting the working intra-cluster correlation structure (ICS) are proposed. The performance of these modified criteria is examined and compared with the CIC via simulations. When the response is Gaussian, binary, or Poisson, the modified criteria are shown to have higher detection rates when the true ICS is exchangeable, whereas the CIC performs better when the true ICS is AR(1). The criteria are also applied to a real data set.

14.
The authors study the empirical likelihood method for linear regression models. They show that when missing responses are imputed using least squares predictors, the empirical log‐likelihood ratio is asymptotically a weighted sum of chi‐square variables with unknown weights. They obtain an adjusted empirical log‐likelihood ratio which is asymptotically standard chi‐square and hence can be used to construct confidence regions. They also obtain a bootstrap empirical log‐likelihood ratio and use its distribution to approximate that of the empirical log‐likelihood ratio. A simulation study indicates that the proposed methods are comparable in terms of coverage probabilities and average lengths of confidence intervals, and perform better than a method based on the normal approximation.

15.
Unit-level regression models are commonly used in small area estimation (SAE) to obtain an empirical best linear unbiased prediction of small area characteristics. The underlying assumptions of these models, however, may be unrealistic in some applications. Previous work developed a copula-based SAE model where the empirical Kendall's tau was used to estimate the dependence between two units from the same area. In this article, we propose a likelihood framework to estimate the intra-class dependence of the multivariate exchangeable copula for the empirical best unbiased prediction (EBUP) of small area means. One appeal of the proposed approach lies in its accommodation of both parametric and semi-parametric estimation approaches. Under each estimation method, we further propose a bootstrap approach to obtain a nearly unbiased estimator of the mean squared prediction error of the EBUP of small area means. The performance of the proposed methods is evaluated through simulation studies and also by a real data application.

16.
Regression analysis is one of the methods most widely used in prediction problems. Although many methods are available for parameter estimation in regression analysis, the ordinary least squares (OLS) technique is the most common. However, OLS is highly sensitive to outliers, so robust techniques are suggested in the literature when the data set contains outlying observations. In prediction problems, techniques that reduce the influence of outliers and that use a median rather than a mean error as the target function are more successful in modeling such data. In this study, a new parameter estimation method based on particle swarm optimization is proposed; it minimizes the median of the absolute relative errors, i.e. the absolute differences between the observed and predicted values divided by the observed values. The performance of the proposed method is evaluated in a simulation study by comparing it with OLS and several other robust methods from the literature.
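A small sketch of the objective described here, minimizing the median of |y − ŷ|/y rather than a squared-error mean. A stochastic global optimizer from SciPy stands in for the particle swarm optimizer used in the paper; the simulated data and bounds are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(4)
x = rng.uniform(1, 10, size=100)
y = 2.0 + 1.5 * x + rng.normal(0, 1, size=100)
y[:5] += 40.0                                   # a few outliers

def median_abs_relative_error(beta):
    # median of |y - yhat| / |y| for a simple linear model
    yhat = beta[0] + beta[1] * x
    return np.median(np.abs(y - yhat) / np.abs(y))

# differential evolution used here as a stand-in for PSO
res = differential_evolution(median_abs_relative_error,
                             bounds=[(-20, 20), (-20, 20)], seed=0)
b0_hat, b1_hat = res.x
```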

17.
It is of interest to estimate the size of a crowd in a demonstration. We propose a practical method to obtain an estimate of the size of the crowd and its standard error. This method has been implemented in practice and, compared with other counting methods, is found to be more efficient, more timely, and to have less scope for bias. The method described in this paper was motivated by the annual 1 July demonstrations in Hong Kong, and data from the 2006 demonstration are used as an example of the proposed method.

18.
Grouped data are commonly encountered in applications. All data from a continuous population are grouped due to rounding of the individual observations. In this paper, the Bernstein polynomial model is proposed as an approximate model for estimating a univariate density function from grouped data. The coefficients of the Bernstein polynomial, as the mixture proportions of beta distributions, can be estimated using an EM algorithm. The optimal degree of the Bernstein polynomial can be determined using a change-point estimation method. The rate of convergence of the proposed density estimate to the true density is proved to be almost parametric by an acceptance–rejection argument used for generating random numbers. The proposed method is compared with some existing methods in a simulation study and is applied to the chicken embryo data.
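To illustrate the beta-mixture representation, the sketch below runs the standard EM update for the mixture weights of f(x) = Σ_j w_j Beta(x; j, m−j+1) on data already scaled to [0, 1]. The grouped-data likelihood and the change-point degree selection of the paper are not implemented; the fixed degree m and the simulated sample are assumptions.

```python
import numpy as np
from scipy.stats import beta

def bernstein_density_em(x, m, n_iter=200):
    """EM estimate of Bernstein-polynomial (beta-mixture) weights on [0, 1]."""
    comps = np.array([beta.pdf(x, j, m - j + 1) for j in range(1, m + 1)])  # m x n
    w = np.full(m, 1.0 / m)
    for _ in range(n_iter):
        resp = w[:, None] * comps                  # E-step: responsibilities
        resp /= resp.sum(axis=0, keepdims=True)
        w = resp.mean(axis=1)                      # M-step: new mixture weights
    density = lambda t: w @ np.array(
        [beta.pdf(t, j, m - j + 1) for j in range(1, m + 1)])
    return w, density

rng = np.random.default_rng(5)
w_hat, f_hat = bernstein_density_em(rng.beta(2, 5, size=400), m=8)
```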

19.
Post-marketing data offer rich information and cost-effective resources for physicians and policy-makers to address critical scientific questions in clinical practice. However, the complex confounding structures (e.g., nonlinear and nonadditive interactions) embedded in these observational data often pose major analytical challenges for drawing valid conclusions. Furthermore, these data, often made available as electronic health records (EHRs), are usually massive, with hundreds of thousands of observational records, which introduces additional computational challenges. In this paper, for comparative effectiveness analysis, we propose a statistically robust yet computationally efficient propensity score (PS) approach to adjust for the complex confounding structures. Specifically, we propose a kernel-based machine learning method for flexible and robust PS modeling, which yields valid PS estimates from observational data with complex confounding structures. The estimated propensity score is then used in a second-stage analysis to obtain a consistent estimate of the average treatment effect. An empirical variance estimator based on the bootstrap is adopted. A split-and-merge algorithm is further developed to reduce the computational workload of the proposed method for big data and to obtain a valid variance estimator of the average treatment effect estimate as a by-product. As shown by extensive numerical studies and an application to comparative effectiveness analysis of postoperative pain EHR data, the proposed approach consistently outperforms other competing methods, demonstrating its practical utility.
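A generic sketch of the two-stage idea: fit a kernelised propensity-score model (here an RBF feature map plus logistic regression, as a stand-in for the paper's kernel machine-learning PS estimator), then form an inverse-probability-weighted estimate of the average treatment effect. The split-and-merge step and the bootstrap variance estimator are omitted; the model choice and trimming threshold are assumptions.

```python
import numpy as np
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def kernel_ps_ate(X, treat, y):
    """IPW estimate of the ATE using a kernelised propensity-score model.

    X: covariate matrix; treat: 0/1 treatment indicator; y: outcome.
    """
    ps_model = make_pipeline(
        Nystroem(kernel="rbf", n_components=100, random_state=0),
        LogisticRegression(max_iter=1000))
    ps = ps_model.fit(X, treat).predict_proba(X)[:, 1]
    ps = np.clip(ps, 0.01, 0.99)                  # trim extreme scores
    w1, w0 = treat / ps, (1 - treat) / (1 - ps)
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)
```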

20.
In statistical data analysis it is often important to compare, classify, and cluster different time series. Various methods have been proposed in the literature for these purposes, but they usually assume time series of the same length. In this article, we propose a spectral-domain method for handling time series of unequal length. The method makes the spectral estimates comparable by producing statistics at the same frequencies. The procedure is compared with other methods proposed in the literature in a Monte Carlo simulation study. As an illustrative example, the proposed spectral method is applied to cluster industrial production series from several developed countries.
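A simple sketch of the underlying idea: estimate each series' spectrum, evaluate all estimates on one common frequency grid so that series of different lengths become comparable, and cluster the resulting feature vectors. Welch's estimator, linear interpolation, and hierarchical clustering are stand-ins chosen for illustration, not the paper's specific spectral procedure.

```python
import numpy as np
from scipy.signal import welch
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

def spectral_features(series_list, freqs=np.linspace(0.01, 0.5, 50)):
    """Log spectral estimates interpolated onto one common frequency grid."""
    feats = []
    for x in series_list:
        f, pxx = welch(x, nperseg=min(len(x), 64))
        feats.append(np.interp(freqs, f, np.log(pxx)))
    return np.vstack(feats)

rng = np.random.default_rng(6)
series = [rng.normal(size=n).cumsum() for n in (120, 200, 150, 300)]  # unequal lengths
Z = linkage(pdist(spectral_features(series)), method="average")
labels = fcluster(Z, t=2, criterion="maxclust")
```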
