Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
Background: Building on statistical methods for the product index S (S = SEN × SPE), we develop a new weighted way (the weighted product index Sw) of combining sensitivity and specificity with user-defined weights. Methods: The new weighted product index Sw is defined as Sw = SEN^(2w) × SPE^(2(1−w)). Results: For large samples, the test statistic Z for comparing the weighted product indices of two independent samples can be either a monotonically increasing/decreasing function or a non-monotonic function of the weight w. In simulation, the Type I error of this statistic stays close to the nominal level of 5%, and the test is more conservative than the one based on the weighted Youden index (Youden 1950).
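
As a small illustration of the definition above, the following Python sketch computes the weighted product index. The function name and the restriction of w to the interval [0, 1] are assumptions made for this example; the abstract only says that the weights are user-defined.

```python
def weighted_product_index(sen: float, spe: float, w: float) -> float:
    """Weighted product index Sw = SEN^(2w) * SPE^(2(1-w)).

    Assumption for this sketch: the weight w lies in [0, 1].
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return sen ** (2 * w) * spe ** (2 * (1 - w))


# With w = 0.5 both exponents equal 1, so Sw reduces to S = SEN * SPE.
s_equal = weighted_product_index(0.9, 0.8, 0.5)
# A weight above 0.5 puts more emphasis on sensitivity.
s_sens = weighted_product_index(0.9, 0.8, 0.75)
```

Setting w = 0.5 recovers the unweighted index S, which is a quick sanity check on the reconstructed exponents.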

2.
Model selection strategies play an important, if not explicit, role in quantitative research. The inferential properties of these strategies are largely unknown; therefore, there is little basis for recommending (or avoiding) any particular set of strategies. In this paper, we evaluate several commonly used model selection procedures [Bayesian information criterion (BIC), adjusted R², Mallows' Cp, Akaike information criterion (AIC), AICc, and stepwise regression] using Monte Carlo simulation of model selection when the true data generating processes (DGP) are known.

We find that the ability of these selection procedures to include important variables and exclude irrelevant variables increases with the size of the sample and decreases with the amount of noise in the model. None of the model selection procedures does well in small samples, even when the true DGP is largely deterministic; thus, data mining in small samples should be avoided entirely. Instead, the implicit uncertainty in model specification should be explicitly discussed. In large samples, BIC is better than the other procedures at correctly identifying most of the generating processes we simulated, and stepwise does almost as well. In the absence of strong theory, both BIC and stepwise appear to be reasonable model selection strategies in large samples. Under the conditions simulated, adjusted R², Mallows' Cp, AIC, and AICc are clearly inferior and should be avoided.


3.
We consider wavelet-based nonlinear estimators, which are constructed by thresholding the empirical wavelet coefficients, for mean regression functions with strong mixing errors, and investigate their asymptotic rates of convergence. We show that these estimators achieve nearly optimal convergence rates, within a logarithmic term, over a large range of Besov function classes B^s_{p,q}. The theory is illustrated with some numerical examples.

A new ingredient in our development is a Bernstein-type exponential inequality for a sequence of random variables that have a certain mixing structure and are not necessarily bounded or sub-Gaussian. This moderate deviation inequality may be of independent interest.


4.
5.
We consider the problem of estimating and testing a general linear hypothesis in a general multivariate linear model, the so-called Growth Curve model, when the p × N observation matrix is normally distributed.

The maximum likelihood estimator (MLE) for the mean is a weighted estimator with the inverse of the sample covariance matrix, which is unstable for large p close to N and singular for p larger than N. We modify the MLE to an unweighted estimator and propose new tests, which we compare with the previous likelihood ratio test (LRT) based on the weighted estimator, i.e., the MLE. We show that the performance of these new tests based on the unweighted estimator is better than that of the LRT based on the MLE.


6.
Efficient, accurate, and fast Markov Chain Monte Carlo estimation methods based on the Implicit approach are proposed. In this article, we introduce the notion of the Implicit method for the estimation of parameters in Stochastic Volatility models.

Implicit estimation offers a substantial computational advantage for learning from observations without prior knowledge and thus provides a good alternative to classical inference in the Bayesian method when priors are missing.

Both the Implicit and Bayesian approaches are illustrated using simulated data and are applied to analyze daily stock return data on the CAC40 index.


7.
In this study an attempt is made to assess statistically the validity of two theories as to the origin of comets. This subject still leads to great controversy amongst astronomers but recently two main schools of thought have developed.

These are that comets are of

(i) planetary origin,

(ii) interstellar origin.

Many theories have been expanded within each school of thought, but at the present time one theory in each is generally accepted. This paper sets out to identify the statistical implications of each theory and to evaluate each theory in terms of its implications.


8.
This article provides a procedure for the detection and identification of outliers in the spectral domain where the Whittle maximum likelihood estimator of the panel data model proposed by Chen [W.D. Chen, Testing for spurious regression in a panel data model with the individual number and time length growing, J. Appl. Stat. 33(88) (2006b), pp. 759–772] is implemented. We extend the approach of Chang and co-workers [I. Chang, G.C. Tiao, and C. Chen, Estimation of time series parameters in the presence of outliers, Technometrics 30 (2) (1988), pp. 193–204] to the spectral domain and through the Whittle approach we can quickly detect and identify the type of outliers. A fixed effects panel data model is used, in which the remainder disturbance is assumed to be a fractional autoregressive integrated moving-average (ARFIMA) process and the likelihood ratio criterion is obtained directly through the modified inverse Fourier transform. This saves much time, especially when the estimated model implements a huge data-set.

Through Monte Carlo experiments, the consistency of the estimator is examined by growing the individual number N and time length T, in which the long memory remainder disturbances are contaminated with two types of outliers: additive outlier and innovation outlier. From the power tests, we see that the estimators are quite successful and powerful.

In the empirical study, we apply the model to Taiwan's computer motherboard industry. Weekly data from 1 January 2000 to 31 October 2006 for nine familiar companies are used. The proposed model has a smaller mean square error and shows more distinctive aggressive properties than the raw data model does.


9.
The 1978 European Community Typology for Agricultural Holdings is described in this paper and contrasted with a data based, polythetic-multivariate classification based on cluster analysis.

The requirement to reduce the size of the variable set employed in an optimisation-partition method of clustering suggested the value of principal components and factor analysis for the identification of major ‘source’ dimensions against which to measure farm differences and similarities.

The Euclidean cluster analysis incorporating the reduced dimensions quickly converged to a stable solution and was little influenced by the initial number or nature of ‘seeding’ partitions of the data.

The assignment of non-sampled observations from the population to cluster classes was completed using classification functions.

The final scheme, based on a sample of over 2,000 observations, was found to be both interpretable and meaningful in terms of agricultural structure and practice, and much superior in explanatory power to a version of the principal activity typology.


10.
Four procedures are suggested for estimating the parameter ‘a’ in the Pauling equation:

e^(−X/a) + e^(−Y/a) = 1.

The procedures are: using the mean of individual solutions, least squares with Y the subject of the equation, least squares with X the subject of the equation, and maximum likelihood using a statistical model. In order to compare these estimates, we use Efron's bootstrap technique (1979), since distributional results are not available. This example also illustrates the role of the bootstrap in statistical inference.
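
The first of the four procedures, combined with the bootstrap comparison, can be sketched in Python as follows. This is only an illustration under stated assumptions: the abstract does not say how the individual solutions are computed, so bisection is used here; the function names, the number of bootstrap replicates, and the data pairs are all hypothetical.

```python
import math
import random
import statistics


def solve_a(x, y, tol=1e-12):
    """Solve exp(-x/a) + exp(-y/a) = 1 for a > 0 by bisection (needs x, y > 0)."""
    if x <= 0 or y <= 0:
        raise ValueError("x and y must be positive")

    def f(a):
        return math.exp(-x / a) + math.exp(-y / a) - 1.0

    # f rises from -1 (a near 0) towards +1 (a large), so the root is unique.
    lo, hi = 0.0, max(x, y)
    while f(hi) < 0.0:  # grow the upper bracket until it lies above the root
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def mean_of_solutions(pairs):
    """Procedure 1: average the per-observation solutions for a."""
    return statistics.mean(solve_a(x, y) for x, y in pairs)


def bootstrap_se(pairs, n_boot=500, seed=0):
    """Bootstrap standard error: resample (X, Y) pairs with replacement."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resample = [rng.choice(pairs) for _ in pairs]
        estimates.append(mean_of_solutions(resample))
    return statistics.stdev(estimates)


# Hypothetical (X, Y) observations, invented for illustration only.
pairs = [(1.0, 2.1), (1.5, 1.4), (2.2, 0.9), (0.8, 2.6), (3.0, 0.6)]
a_hat = mean_of_solutions(pairs)
se_hat = bootstrap_se(pairs)
```

The same `bootstrap_se` helper could be pointed at any of the other three estimators by swapping out `mean_of_solutions`, which is what makes the bootstrap convenient when no distributional results are available.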


11.
In this article, we consider the problem of testing (a) sphericity and (b) intraclass covariance structure under a growth curve model. The maximum likelihood estimator (MLE) for the mean in a growth curve model is a weighted estimator with the inverse of the sample covariance matrix which is unstable for large p close to N and singular for p larger than N. The MLE for the covariance matrix is based on the MLE for the mean, which can be very poor for p close to N. For both structures (a) and (b), we modify the MLE for the mean to an unweighted estimator and based on this estimator we propose a new estimator for the covariance matrix. This new estimator leads to new tests for (a) and (b). We also propose two other tests for each structure, which are just based on the sample covariance matrix.

To compare the performance of all four tests we compute for each structure (a) and (b) the attained significance level and the empirical power. We show that one of the tests based on the sample covariance matrix is better than the likelihood ratio test based on the MLE.


12.
We define a new family of stochastic processes called Markov modulated Brownian motions with a sticky boundary at zero. Intuitively, each process is a regulated Markov-modulated Brownian motion whose boundary behavior is modified to slow down at level zero.

To determine the stationary distribution of a sticky MMBM, we follow a Markov-regenerative approach similar to the one developed with great success in the context of quasi-birth-and-death processes and fluid queues. Our analysis also relies on recent work showing that Markov-modulated Brownian motions arise as limits of a parametrized family of fluid queues.


13.
14.
The C statistic, also known as the Cash statistic, is often used in astronomy for the analysis of low-count Poisson data. The main advantage of this statistic, compared to the more commonly used χ² statistic, is its applicability without the need to combine data points. This feature has made the C statistic a very useful method to analyze Poisson data that have small (or even null) counts in each resolution element. One of the challenges of the C statistic is that its probability distribution, under the null hypothesis that the data follow a parent model, is not known exactly. This paper presents an effort towards improving our understanding of the C statistic by studying (a) the distribution of the C statistic for a fully specified model, (b) the distribution of Cmin resulting from a maximum-likelihood fit to a simple one-parameter constant model, i.e. a model that represents the sample mean of N Poisson measurements, and (c) the distribution of the associated ΔC statistic that is used for parameter estimation. The results confirm the expectation that, in the high-count limit, both the C statistic and Cmin have the same mean and variance as a χ² statistic with the same number of degrees of freedom. It is also found that, in the low-count regime, the expectation of the C statistic and Cmin can be substantially lower than for a χ² distribution. The paper makes use of recent X-ray observations of the astronomical source PG 1116+215 to illustrate the application of the C statistic to Poisson data.
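
To make the quantities C, Cmin and ΔC concrete, here is a minimal Python sketch. It assumes the commonly used form C = 2 Σ [m_i − n_i + n_i ln(n_i / m_i)], where n_i are the observed counts and m_i the model predictions (the abstract itself does not write out the formula), and it uses the fact that the maximum-likelihood constant model for Poisson counts is the sample mean. The function names and the counts are hypothetical.

```python
import math


def cash_c(counts, model):
    """C = 2 * sum(m - n + n * ln(n / m)); the log term is taken as 0 when n = 0."""
    total = 0.0
    for n, m in zip(counts, model):
        if m <= 0:
            raise ValueError("model predictions must be positive")
        total += m - n
        if n > 0:
            total += n * math.log(n / m)
    return 2.0 * total


def cmin_constant(counts):
    """Fit a constant model: the ML estimate is the sample mean of the counts."""
    mean = sum(counts) / len(counts)
    return mean, cash_c(counts, [mean] * len(counts))


# Hypothetical low-count data, including bins with zero counts.
counts = [0, 1, 2, 0, 3, 1]
mu_hat, c_min = cmin_constant(counts)

# Delta C for a trial constant value, relative to the best fit.
trial = 2.0
delta_c = cash_c(counts, [trial] * len(counts)) - c_min
```

Because no bins are combined, zero-count bins enter the sum directly through the m − n term, which is the practical advantage the abstract points to. Note that this sketch raises an error if every count is zero, since the fitted mean is then not a positive model value.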

15.
The multivariate extremal index function is a measure of the clustering among the extreme values of a multivariate stationary sequence. In this article, we introduce a measure of the degree of clustering of upcrossings in a multivariate stationary sequence, called multivariate upcrossings index, which is a multivariate generalization of the concept of upcrossings index. We derive the main properties of this function, namely the relations with the multivariate extremal index and the clustering of upcrossings.

Imposing general local and asymptotic dependence restrictions on the sequence or on its marginals, we compute the multivariate upcrossings index from the marginal upcrossings indices and from the joint distribution of a finite number of variables. A couple of illustrative examples are explored.


16.
17.
18.
In this paper, we study, by means of randomized sampling, the long-run stability of an open Markov population fed with time-dependent Poisson inputs. We show that state probabilities within transient states converge, even when the overall expected population dimension increases without bound, under general conditions on the transition matrix and input intensities.

Following the convergence results, we obtain ML estimators for a particular sequence of input intensities, where the sequence of new arrivals is modeled by a sigmoidal function. These estimators allow for the forecast, by confidence intervals, of the evolution of the relative population structure in the transient states.

Applying these results to the study of a consumption credit portfolio, we estimate the implicit default rate.


19.
We study high-dimensional covariance/precision matrix estimation under the assumption that the covariance/precision matrix can be decomposed into a low-rank component L and a diagonal component D. The rank of L can either be chosen to be small or controlled by a penalty function. Under moderate conditions on the population covariance/precision matrix itself and on the penalty function, we prove some consistency results for our estimators. A block-wise coordinate descent algorithm, which iteratively updates L and D, is then proposed to obtain the estimator in practice. Finally, various numerical experiments are presented; using simulated data, we show that our estimator performs quite well in terms of the Kullback–Leibler loss; using stock return data, we show that our method can be applied to obtain enhanced solutions to the Markowitz portfolio selection problem. The Canadian Journal of Statistics 48: 308–337; 2020 © 2019 Statistical Society of Canada

20.
This paper analyses direct and indirect forms of dependence in the probability of scoring in a handball match, taking into account the mutual influence of both playing teams. Non-identical distribution (i.d.) and non-stationarity, which are commonly observed in sport games, are studied through the specification of time-varying parameters.

The model accounts for the binary character of the dependent variable, and for unobserved heterogeneity. The parameter dynamics is specified by a first-order auto-regressive process.

Data from the Handball World Championships 2001–2005 show that the dynamics of handball violate both independence and i.d., in some cases having a non-stationary behaviour.

