1.
Junyong Park, Jayson D. Wilbur, Jayanta K. Ghosh, Cindy H. Nakatsu, Corinne Ackerman. Communications in Statistics - Simulation and Computation, 2013, 42(4): 855-869
We adopt boosting for classification and selection of high-dimensional binary variables, for which classical methods based on normality and a nonsingular sample dispersion matrix are inapplicable. Boosting seems particularly well suited for binary variables. We present three methods, two of which combine boosting with the relatively classical variable selection methods developed in Wilbur et al. (2002). Our primary interest is variable selection in classification, with a small misclassification error used to validate the proposed variable selection method. Two of the new methods perform uniformly better than Wilbur et al. (2002) in one set of simulated and three real-life examples.
2.
In this article, we consider two different shared frailty regression models under the assumption of a Gompertz baseline distribution. The gamma distribution is the most common choice for the frailty distribution; for comparison with the gamma frailty model, we also consider the inverse Gaussian shared frailty model. We apply both models to a real-life bivariate survival data set of acute leukemia remission times (Freireich et al., 1963). The analysis is performed using Markov chain Monte Carlo methods. Model comparison is made using a Bayesian model selection criterion, and a well-fitting model is suggested for the acute leukemia data.
3.
When a sufficient correlation between the study variable and the auxiliary variable exists, the ranks of the auxiliary variable are also correlated with the study variable, and these ranks can therefore be used as an effective tool for increasing the precision of an estimator. In this paper, we propose a new improved estimator of the finite population mean that incorporates supplementary information in the form of: (i) the auxiliary variable and (ii) the ranks of the auxiliary variable. Mathematical expressions for the bias and the mean-squared error of the proposed estimator are derived under the first order of approximation. The theoretical and empirical studies reveal that the proposed estimator always performs better than the usual mean, ratio, product, exponential-ratio and -product, and classical regression estimators, as well as the estimators of Rao (1991), Singh et al. (2009), Shabbir and Gupta (2010), and Grover and Kaur (2011, 2014).
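As background for the estimator discussed above, the classical ratio estimator of the finite population mean can be sketched as follows. This is a minimal illustration on simulated data; all variable names are hypothetical, and the paper's rank-based extension is not implemented here.

```python
import numpy as np

def ratio_estimator(y_sample, x_sample, X_bar):
    """Classical ratio estimator of the population mean of y.

    y_sample : study-variable values in the sample
    x_sample : auxiliary-variable values in the sample
    X_bar    : known population mean of the auxiliary variable
    """
    return np.mean(y_sample) * X_bar / np.mean(x_sample)

# Toy population: y is roughly proportional to x, the setting in which
# the ratio estimator is known to gain precision over the sample mean.
rng = np.random.default_rng(0)
x_pop = rng.uniform(10, 20, size=1000)
y_pop = 2.0 * x_pop + rng.normal(0, 1, size=1000)

idx = rng.choice(1000, size=50, replace=False)   # simple random sample
est = ratio_estimator(y_pop[idx], x_pop[idx], x_pop.mean())
```

Because the residuals of y about the ratio line are small, the estimate tracks the true population mean of y closely.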
4.
Tony Vangeneugden, Geert Molenberghs, Geert Verbeke, Clarice G.B. Demétrio. Communications in Statistics - Theory and Methods, 2014, 43(19): 4164-4178
In hierarchical data settings, be it of a longitudinal, spatial, multi-level, clustered, or otherwise repeated nature, the association between repeated measurements often attracts at least part of the scientific interest. Quantifying the association frequently takes the form of a correlation function, including but not limited to the intraclass correlation. Vangeneugden et al. (2010) derived approximate correlation functions for longitudinal sequences of general data type, Gaussian and non-Gaussian, based on generalized linear mixed-effects models. Here, we consider the extended model family proposed by Molenberghs et al. (2010). This family flexibly accommodates data hierarchies, intra-sequence correlation, and overdispersion, and allows closed-form means, variance functions, and correlation functions for a variety of outcome types and link functions. Unfortunately, for binary data with the logit link, closed forms cannot be obtained. This is in contrast with the probit link, for which such closed forms can be derived. We therefore concentrate on the probit case. It is of interest not only in its own right, but also as an instrument to approximate the logit case, thanks to the well-known probit-logit 'conversion.' Next to the general situation, some important special cases, such as exchangeable clustered outcomes, receive attention because they produce insightful expressions. The closed-form expressions are contrasted with the generic approximate expressions of Vangeneugden et al. (2010) and with approximations derived for the so-called logistic-beta-normal combined model. A simulation study explores the performance of the proposed method. Data from a schizophrenia trial are analyzed and correlation functions derived.
5.
Simard et al. [16, 17] proposed a transformation distance called the "tangent distance" (TD), which makes pattern recognition efficient. The key idea is to construct a distance measure that is invariant with respect to some chosen transformations. In this research, we provide a method using an adaptive TD based on an idea inspired by the "discriminant adaptive nearest neighbor" method [7]. This method is relatively simple compared with many more complicated ones. A real handwriting recognition data set is used to illustrate our new method. Our results demonstrate that the proposed method gives lower classification error rates than standard implementations of neural networks and support vector machines, and is as good as several other, more complicated approaches.
6.
The Significance Analysis of Microarrays (SAM; Tusher et al., 2001) method is widely used for analyzing gene expression data while controlling the FDR via a resampling-based procedure in the microarray setting. One of the main components of the SAM procedure is the adjustment of the test statistic: the fudge factor added to the test statistic aims to deflate test statistics that are large only because of a small gene-expression standard error. Lin et al. (2008) pointed out that, in the presence of small-variance genes, the fudge factor does not effectively improve the power or the control of the FDR compared with the SAM procedure without the fudge factor. Motivated by the simulation results in Lin et al. (2008), in this article we extend that study to compare several methods for choosing the fudge factor in modified t-type test statistics, and we use simulation studies to investigate the power and FDR control of the considered methods.
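The fudge-factor adjustment described above can be illustrated with a SAM-style moderated statistic d_i = (difference in means) / (s_i + s0), where s0 is chosen as a percentile of the gene-wise standard errors. This is a simplified sketch on simulated data, not the exact SAM implementation; the percentile rule for s0 is only one of the choices the article compares.

```python
import numpy as np

def sam_statistics(group1, group2, s0_percentile=50):
    """SAM-style moderated t statistics for a genes-by-samples matrix.

    A fudge factor s0 (a percentile of the gene-wise standard errors)
    is added to each denominator, deflating statistics that are large
    only because a gene's standard error is tiny.
    """
    n1, n2 = group1.shape[1], group2.shape[1]
    diff = group1.mean(axis=1) - group2.mean(axis=1)
    pooled_var = ((group1.var(axis=1, ddof=1) * (n1 - 1)
                   + group2.var(axis=1, ddof=1) * (n2 - 1))
                  / (n1 + n2 - 2))
    se = np.sqrt(pooled_var * (1 / n1 + 1 / n2))
    s0 = np.percentile(se, s0_percentile)
    return diff / (se + s0)

rng = np.random.default_rng(1)
g1 = rng.normal(0, 1, size=(100, 5))   # 100 genes, 5 samples per group
g2 = rng.normal(0, 1, size=(100, 5))
g2[:10] += 3.0                          # first 10 genes truly shifted
d = sam_statistics(g1, g2)
```

The shifted genes stand out with large |d|, while small-variance null genes no longer dominate the ranking.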
7.
Recently, Abbasnejad et al. (2010) proposed a measure of uncertainty based on the survival function, called the survival entropy of order α, and also proposed a dynamic form of it. In this paper, we derive the weighted forms of these measures and discuss the properties of the new measures.
8.
Hsiaw-Chan Yeh. Communications in Statistics - Theory and Methods, 2013, 42(1): 76-87
For studying and modeling the time to failure of a system or component, many reliability practitioners use the hazard rate and its monotone behavior. Nowadays, however, there are two problems: first, modern components have high reliability, and second, their distributions often have nonmonotone hazard rates, as with the truncated normal, Burr XII, and inverse Gaussian distributions. Modeling such data with hazard rate models therefore seems too stringent. Zimmer et al. (1998) and Wang et al. (2003, 2008) introduced and studied a new time-to-failure model for continuous distributions based on the log-odds rate (LOR), which is comparable to the model based on the hazard rate. Many components and devices in industry have discrete distributions with nonmonotone hazard rates, so in this article we introduce the discrete log-odds rate, which differs from its continuous analog. An alternative discrete reversed hazard rate, which we call the second reversed rate of failure in discrete time, is also defined here. It is shown that failure time distributions can be characterized by the discrete LOR. Moreover, we show that the discrete logistic and log-logistic distributions have a constant discrete LOR with respect to t and ln t, respectively. Furthermore, properties of some distributions with monotone discrete LOR, such as the discrete Burr XII, discrete Weibull, and discrete truncated normal, are obtained.
9.
In this article, we evaluate the performance of different forecasters and test the association between their performances for different pairs of variables. We use three data sets of track records of professional U.S. economic forecasters participating in the Blue Chip consensus forecasting service (the data sets contain the root mean square errors (RMSE) of different forecasters for different years). To evaluate the performance of the forecasters, we consider three well-known tests, namely the usual F test (cf. Fisher (1923)), the Kruskal-Wallis test (cf. Kruskal and Wallis (1952)), and the extension of the median test (cf. Daniel (1990)). To test the association between the forecasters' performances for different pairs of variables, we consider the Gini mean correlation coefficient rg1 (cf. Yitzhaki and Olkin (1991) and Yitzhaki (2003)), a modified rank correlation coefficient (cf. Zimmerman (1994)), and three modifications of the Spearman rank correlation coefficient. We observe that different forecasters do not necessarily offer the same average performance. Moreover, evidence of association between two criteria does not always lead to the same decision. The outcomes of the study may help practitioners select the best forecaster(s) for policymaking purposes.
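As an illustration of the kind of comparison described above, the Kruskal-Wallis test can be applied to forecasters' RMSE records with `scipy.stats.kruskal`. The data below are simulated, not the Blue Chip records; the forecaster names and error levels are hypothetical.

```python
import numpy as np
from scipy.stats import kruskal

# Hypothetical RMSE records of three forecasters over the same 15 years.
rng = np.random.default_rng(2)
forecaster_a = rng.normal(1.0, 0.1, size=15)   # consistently small errors
forecaster_b = rng.normal(1.0, 0.1, size=15)
forecaster_c = rng.normal(2.0, 0.1, size=15)   # consistently larger errors

# Kruskal-Wallis H test: do the groups share the same distribution of errors?
stat, p_value = kruskal(forecaster_a, forecaster_b, forecaster_c)
```

A small p-value suggests the forecasters do not offer the same average performance, which is the conclusion the abstract reports for the real data.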
10.
Mi-Hwa Ko. Communications in Statistics - Theory and Methods, 2018, 47(3): 671-680
In this article, we study complete convergence for sequences of coordinatewise asymptotically negatively associated random vectors in Hilbert spaces. We also show that some related results for coordinatewise negatively associated random vectors in Huan, Quang, and Thuan (2014) still hold under this concept.
11.
This article proposes various Searls-type ratio imputation methods (STRIM) along the lines of Ahmed et al. (2006). It is well known that the optimal ratio-type estimator attains the MSE of the regression estimator (or optimal difference estimator), but this may not always happen under the Searls-type transformation (STT) of Searls (1964). The STRIM are shown to perform better than the imputation procedures of Ahmed et al. (2006), and may even outperform the Searls-type difference imputation methods (STDIM) proposed in our earlier work (Bhushan and Pandey, 2016). The study concludes with a numerical study alongside the theoretical comparison.
12.
Viswanathan Ramakrishnan. Communications in Statistics - Simulation and Computation, 2013, 42(3): 405-418
In many genetic analyses of dichotomous twin data, odds ratios have been used to test hypotheses on heritability and shared common environment effects for a given disease (Lichtenstein et al., 2000; Ahlbom et al., 1997; Ramakrishnan et al., 1992). However, estimates of these two effects have not been dealt with in the literature. In epidemiology, the attributable fraction (AF), a function of the odds ratio and the prevalence of the risk factor, has been used to describe the contribution of a risk factor to a disease in a given population (Leviton, 1973). In this article, we adapt the AF to quantify heritability and the shared common environment. Twin data on cancer, gallstone disease, and phobia are used to illustrate the applicability of the AF estimate as a measure of heritability.
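A minimal sketch of the attributable fraction in its classic Levin-type form, using the odds ratio as an approximation to the relative risk (reasonable for rare outcomes). This is a generic epidemiological illustration, not the twin-data adaptation proposed in the article.

```python
def attributable_fraction(prevalence, odds_ratio):
    """Population attributable fraction via Levin's classic formula,

        AF = p * (OR - 1) / (1 + p * (OR - 1)),

    where p is the prevalence of the risk factor and the odds ratio OR
    stands in for the relative risk (a rare-disease approximation).
    """
    excess = prevalence * (odds_ratio - 1.0)
    return excess / (1.0 + excess)

# Example: risk factor present in 30% of the population, OR = 2.0.
af = attributable_fraction(prevalence=0.3, odds_ratio=2.0)
```

With no excess risk (OR = 1) the AF is zero, and it grows toward 1 as either the prevalence or the odds ratio increases.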
13.
Abouzar Bazyari. Communications in Statistics - Simulation and Computation, 2017, 46(9): 7194-7209
Testing the homogeneity of multivariate normal mean vectors under an order restriction is considered when the covariance matrices are unknown, arbitrary positive definite, and unequal. This testing problem has been studied to some extent, for example by Kulatunga and Sasabuchi (1984) when the covariance matrices are known, and by Sasabuchi et al. (2003) and Sasabuchi (2007) when the covariance matrices are unknown but common. In this paper, a test statistic is proposed. Because the main advantage of a bootstrap test is that it avoids deriving the complex null distribution analytically, a bootstrap test statistic is derived; since the proposed statistic is location invariant, the bootstrap p-value is well defined, and steps are presented to estimate it. Our numerical studies via Monte Carlo simulation show that the proposed bootstrap test correctly controls the type I error rate. The power of the test for some p-dimensional normal distributions is computed by Monte Carlo simulation, and the null distribution of the test statistic is estimated using kernel density estimation. Finally, the bootstrap test is illustrated on a real data set.
14.
We propose a new ratio-type estimator for estimating the finite population mean using two auxiliary variables in stratified two-phase sampling. Expressions for the bias and mean squared error of the proposed estimator are derived up to the first order of approximation. The proposed estimator is more efficient than the usual stratified sample mean estimator, the traditional stratified ratio estimator, and some other stratified estimators, including those of Bahl and Tuteja (1991), Chami et al. (2012), Chand (1975), Choudhury and Singh (2012), Hamad et al. (2013), Vishwakarma and Gangele (2014), Sanaullah et al. (2014), and Chanu and Singh (2014).
15.
There have been many alternative strategies for implementing sample surveys on quantitative characteristics of sensitive issues using the randomized response (RR) technique. The efficiency of most of these strategies has been improved by choosing suitable design parameters of the model. However, two different procedures with pre-assigned design parameter values cannot be assumed to offer the same degree of protection to the respondents. Some earlier comparisons of these strategies are inadequate (as in Eichhorn and Hayre, 1983; Gupta et al., 2002). Some literature contains more comprehensive comparisons, based on efficiency and respondent protection, among the qualitative-characteristic RR techniques (see Bhargava and Singh, 2002; Nayak, 1994; Zaizai and Zankan, 2004). For comparisons based on efficiency and respondent protection among the quantitative-characteristic RR techniques, very few related studies have been found so far. The purpose of this article is to give a more adequate comparison among these earlier quantitative-characteristic RR strategies. Several important differences are found between the results obtained in this article and some known results; these earlier RR strategies should therefore be reevaluated.
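The multiplicative scrambling idea behind quantitative RR strategies such as that of Eichhorn and Hayre (1983) can be sketched as follows. This is a simulation with hypothetical data; the specific scrambling distribution and its known mean are assumptions of the illustration, not prescriptions from the article.

```python
import numpy as np

rng = np.random.default_rng(3)

# True (unobserved) sensitive quantitative responses, e.g. amounts spent.
true_x = rng.gamma(shape=2.0, scale=500.0, size=5000)

# Multiplicative scrambling: each respondent privately draws s from a
# scrambling distribution with known mean and reports only z = x * s,
# so no individual answer reveals x.
s = rng.uniform(0.5, 1.5, size=true_x.size)   # E[S] = 1.0 by design
reported_z = true_x * s

# The interviewer never sees true_x; since E[Z] = E[X] * E[S],
# the population mean is recovered as z-bar / E[S].
estimated_mean = reported_z.mean() / 1.0
```

The design-parameter trade-off the abstract discusses shows up here directly: a more dispersed scrambling distribution protects respondents better but inflates the variance of the estimator.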
16.
Housila P. Singh. Communications in Statistics - Theory and Methods, 2013, 42(23): 4222-4238
This article considers some classes of estimators of the population median of the study variable that use information on an auxiliary variable, together with their properties under large-sample approximation. The asymptotic optimum estimator (AOE) in each class of estimators is investigated, along with approximate mean square error formulae. It is shown that the proposed classes of estimators are better than those considered by Gross (1980), Kuk and Mak (1989), Singh et al. (2003a), and Al and Cingi (2009). An empirical study is carried out to judge the merits of the suggested classes of estimators over other existing estimators.
17.
A Bottom-Up Dynamic Model of Portfolio Credit Risk with Stochastic Intensities and Random Recoveries
Tomasz R. Bielecki, Areski Cousin, Stéphane Crépey, Alexander Herbertsson. Communications in Statistics - Theory and Methods, 2014, 43(7): 1362-1389
In Bielecki et al. (2014a), the authors introduced a Markov copula model of portfolio credit risk in which pricing and hedging can be done in a theoretically and practically sound way. Further theoretical background and practical details are developed in Bielecki et al. (2014b,c), where the numerical illustrations assumed deterministic intensities and constant recoveries. In the present paper, we show how to incorporate stochastic default intensities and random recoveries into the bottom-up modeling framework of Bielecki et al. (2014a) while preserving numerical tractability. These two features are of primary importance for applications such as CVA computations on credit derivatives (Assefa et al., 2011; Bielecki et al., 2012), as CVA is sensitive to the stochastic nature of credit spreads, and random recoveries make it possible to achieve satisfactory calibration even for "badly behaved" data sets. This article is thus a complement to Bielecki et al. (2014a,b,c).
18.
In an earlier article (Bai et al., 1999), the problem of simultaneously estimating the number of signals and the frequencies of multiple sinusoids was considered in the case where some observations are missing. The number of signals was estimated with an information-theoretic criterion and the frequencies with eigenvariation linear prediction. Asymptotic properties of the procedure were investigated, but no Monte Carlo simulation was performed. In this article, a slightly different but scale-invariant criterion for detection is proposed, while the estimation of frequencies remains the same. Asymptotic properties of this new procedure are provided, Monte Carlo simulations for both procedures are carried out, and a comparison on real signals is also given.
19.
Vikas Kumar. Communications in Statistics - Theory and Methods, 2017, 46(17): 8343-8354
In this article, the concept of cumulative residual entropy (CRE) given by Rao et al. (2004) is extended to the Tsallis entropy function and to its dynamic versions, both residual and past. We study some properties and characterization results for these generalized measures. In addition, we provide some characterization results for the first-order statistic based on the Tsallis survival entropy.
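The base quantity being generalized above, the cumulative residual entropy of Rao et al. (2004), is CRE(X) = -∫ S(x) log S(x) dx, where S is the survival function. It can be checked numerically against a closed-form case; this is a minimal sketch of the base measure only, and the Tsallis extension itself is not implemented here.

```python
import numpy as np

def cre_numeric(survival, grid):
    """Cumulative residual entropy of Rao et al. (2004),

        CRE(X) = -integral of S(x) * log(S(x)) dx over x >= 0,

    evaluated with the trapezoid rule on a grid, where `survival`
    returns S(x) for a non-negative random variable X.
    """
    s = survival(grid)
    # Guard against log(0): the integrand -s*log(s) tends to 0 as s -> 0.
    integrand = np.where(s > 0, -s * np.log(np.where(s > 0, s, 1.0)), 0.0)
    return float(np.sum((integrand[:-1] + integrand[1:]) * np.diff(grid)) / 2.0)

# For an exponential distribution with rate lam, S(x) = exp(-lam * x)
# and the integral works out analytically to 1 / lam.
lam = 2.0
grid = np.linspace(0.0, 50.0, 200_001)
cre = cre_numeric(lambda x: np.exp(-lam * x), grid)
```

The numerical value agrees with the analytic answer 1/lam, which makes this a convenient sanity check before implementing generalized or dynamic versions.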
20.
Shesh N. Rai, Jianmin Pan, Xiaobin Yuan, Jianguo Sun, Melissa M. Hudson, Deo K. Srivastava. Communications in Statistics - Theory and Methods, 2013, 42(17): 3117-3133
New drug discovery in pediatrics has dramatically improved survival, but with long-term adverse events. This motivates the examination of adverse outcomes, such as long-term toxicity, in a phase IV trial. An ideal approach to monitoring long-term toxicity is to systematically follow the survivors, which is generally not feasible. Instead, cross-sectional surveys were conducted in Hudson et al. (2007), with one objective being to estimate the cumulative incidence rates, with specific interest in fixed-term (5- or 10-year) rates. We present inference procedures based on current status data and apply them to our motivating example, with very interesting findings.