Similar Literature
20 similar documents retrieved.
1.
Time series are often affected by interventions such as strikes, earthquakes, or policy changes. In the current paper, we build a practical nonparametric intervention model using the central mean subspace in time series. We estimate the central mean subspace for time series, taking known interventions into account, by using the Nadaraya–Watson kernel estimator, and we use the modified Bayesian information criterion to estimate the unknown lag and dimension. Finally, we demonstrate that this nonparametric approach for intervened time series performs well both in simulations and in a real data analysis of monthly average oxidant concentrations.
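To make the kernel smoothing step concrete, the sketch below implements a basic Nadaraya–Watson regression estimator applied to lagged values of a series. It is a minimal illustration, not the authors' intervention-adjusted central mean subspace estimator; the bandwidth h, the lag p, and the simulated series are illustrative assumptions.

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_query, h=0.5):
    """Nadaraya-Watson kernel regression with a Gaussian kernel.

    x_train : (n, p) array of predictors (e.g., lagged values of the series)
    y_train : (n,) array of responses
    x_query : (m, p) array of points at which to estimate E[Y | X = x]
    h       : bandwidth (illustrative default)
    """
    # Pairwise squared distances between query and training points
    d2 = ((x_query[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * h ** 2))          # Gaussian kernel weights
    return (w @ y_train) / w.sum(axis=1)      # weighted average of responses

# Example: smoothing a simulated series using p = 2 lags as predictors
rng = np.random.default_rng(0)
y = np.sin(np.linspace(0, 20, 300)) + 0.1 * rng.standard_normal(300)
p = 2
X = np.column_stack([y[i:len(y) - p + i] for i in range(p)])   # lag matrix
target = y[p:]
fitted = nadaraya_watson(X, target, X, h=0.3)
```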

2.
The odds ratio (OR) has been recommended elsewhere to measure the relative treatment efficacy in a randomized clinical trial (RCT), because it possesses a few desirable statistical properties. In practice, it is not uncommon to encounter an RCT in which some patients do not comply with their assigned treatments and some patients' outcomes are missing. Under the compound exclusion restriction, latent ignorability and monotonicity assumptions, we derive the maximum likelihood estimator (MLE) of the OR and apply Monte Carlo simulation to compare its performance with those of two other commonly used estimators: one assuming outcomes are missing completely at random (MCAR) and one for the intention-to-treat (ITT) analysis based on patients with known outcomes. We note that both the MCAR and ITT estimators may produce a misleading inference of the OR even when the relative treatment effect is equal. We further derive three asymptotic interval estimators for the OR: one using Wald's statistic, one using the logarithmic transformation, and one using an ad hoc procedure that combines the two. On the basis of a Monte Carlo simulation, we evaluate the finite-sample performance of these interval estimators in a variety of situations. Finally, we use data taken from a randomized encouragement design studying the effect of flu shots on the flu-related hospitalization rate to illustrate the use of the MLE and the asymptotic interval estimators for the OR developed here.
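For reference, the sketch below computes the familiar complete-data sample odds ratio and its Wald confidence interval on the logarithmic scale. It is a standard textbook calculation, not the compound-exclusion-restriction MLE derived in the paper, and the 2x2 counts are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def odds_ratio_ci(a, b, c, d, level=0.95):
    """Sample odds ratio and Wald CI on the log scale for a 2x2 table.

    a, b : events / non-events in the treatment arm
    c, d : events / non-events in the control arm
    """
    or_hat = (a * d) / (b * c)
    se_log = np.sqrt(1/a + 1/b + 1/c + 1/d)      # SE of log(OR)
    z = norm.ppf(0.5 + level / 2)
    lo, hi = np.exp(np.log(or_hat) + np.array([-z, z]) * se_log)
    return or_hat, (lo, hi)

# Hypothetical counts from a two-arm trial with fully observed outcomes
print(odds_ratio_ci(30, 70, 45, 55))
```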

3.
The anonymous mixing of Fisherian (p-values) and Neyman–Pearsonian (α levels) ideas about testing, distilled in the customary but misleading p < α criterion of statistical significance, has led researchers in the social and management sciences (and elsewhere) to commonly misinterpret the p-value as a ‘data-adjusted’ Type I error rate. Evidence substantiating this claim is provided from a number of fronts, including comments by statisticians, articles judging the value of significance testing, textbooks, surveys of scholars, and the statistical reporting behaviours of applied researchers. That many investigators do not know the difference between p’s and α’s indicates much bewilderment over what those most ardently sought research outcomes—statistically significant results—mean. Statisticians can play a leading role in clearing up this confusion. A good starting point would be to abolish the p < α criterion of statistical significance.

4.
This paper derives the Akaike information criterion (AIC), the corrected AIC, the Bayesian information criterion (BIC) and Hannan and Quinn’s information criterion for approximate factor models assuming a large number of cross-sectional observations, and studies the consistency properties of these information criteria. It also reports extensive simulation results comparing the performance of the extant and new procedures for selecting the number of factors. The simulation results show the difficulty of determining which criterion performs best. In practice, it is advisable to consider several criteria at the same time, especially Hannan and Quinn’s information criterion, Bai and Ng’s ICp2 and BIC3, and Onatski’s and Ahn and Horenstein’s eigenvalue-based criteria. The model-selection criteria considered in this paper are also applied to Stock and Watson’s two macroeconomic data sets. The results differ considerably depending on the model-selection criterion in use, but there is evidence suggesting five factors for the first data set and five to seven factors for the second.
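As an illustration of this class of criteria, the sketch below computes a Bai–Ng ICp2-type criterion from a principal components decomposition, assuming the standard formulation log V(k) + k((N+T)/(NT)) log(min(N, T)). It is a simplified sketch, not the exact set of criteria derived in the paper, and the simulated data are illustrative.

```python
import numpy as np

def icp2(X, kmax=10):
    """Select the number of factors with a Bai-Ng ICp2-type criterion.

    X : (T, N) matrix of standardized data (T periods, N series).
    Returns the k in 1..kmax minimizing
        log(V(k)) + k * (N + T) / (N * T) * log(min(N, T)),
    where V(k) is the mean squared residual after removing k principal
    components.
    """
    T, N = X.shape
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    penalty = (N + T) / (N * T) * np.log(min(N, T))
    crit = []
    for k in range(1, kmax + 1):
        Xk = (U[:, :k] * s[:k]) @ Vt[:k]      # rank-k approximation
        V = np.mean((X - Xk) ** 2)            # residual variance V(k)
        crit.append(np.log(V) + k * penalty)
    return int(np.argmin(crit)) + 1

# Example with simulated data containing 3 common factors
rng = np.random.default_rng(8)
F = rng.normal(size=(200, 3))
L = rng.normal(size=(100, 3))
X = F @ L.T + rng.normal(size=(200, 100))
print(icp2(X))     # typically selects 3 for data like this
```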

5.
We present a practical way to find matching priors via the use of saddlepoint approximations and to obtain p-values of tests of an interest parameter in the presence of nuisance parameters. The advantages of our procedure are the flexibility in choosing different initial conditions, so that one may adjust the performance of a test, and the less intensive computation compared with a Markov chain Monte Carlo method.

6.
In some organizations, the hiring lead time is often long because of human resource requirements associated with technical and security constraints. The human resource departments in these organizations are therefore keenly interested in forecasting employee turnover, since a good prediction of employee turnover can help the organization minimize the costs and the impact of turnover on operational capabilities and the budget. This study aims to enhance the ability to forecast employee turnover with or without considering the impact of economic indicators. Various time series modelling techniques were used to identify optimal models for effective employee turnover prediction. More than 11 years of monthly turnover data were used to build and validate the proposed models. Compared with other models, a dynamic regression model with an additive trend, seasonality, interventions, and a very important economic indicator effectively predicted the turnover, with a training R2 of 0.77 and a holdout R2 of 0.59. The forecasting performance of the optimal models confirms that the time series modelling approach is able to predict employee turnover for the specific scenario observed in our analysis.
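As a rough illustration of a regression of this kind, the sketch below fits an ordinary least squares model with a linear trend, monthly dummies, a step intervention, and one economic indicator to simulated monthly data. The variable names, the unemployment-rate indicator, and the intervention month are hypothetical assumptions; this is not the authors' dynamic regression model or their data.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical monthly data: turnover counts, an unemployment-rate indicator,
# and a known intervention (e.g., a policy change) starting at month 60.
n = 132                                   # roughly 11 years of monthly data
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "turnover": rng.poisson(50, n).astype(float),
    "unemployment": rng.normal(5.0, 0.5, n),
})
df["trend"] = np.arange(n)                              # additive trend
df["intervention"] = (np.arange(n) >= 60).astype(int)   # step intervention
month = pd.get_dummies(pd.Series(np.arange(n) % 12),
                       prefix="m", drop_first=True).astype(float)

X = sm.add_constant(pd.concat([df[["trend", "intervention", "unemployment"]],
                               month], axis=1))
fit = sm.OLS(df["turnover"], X).fit()
print(fit.rsquared)       # in-sample R^2, analogous to the training R^2 above
```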

7.
To reduce the dimensionality of regression problems, sliced inverse regression approaches make it possible to determine linear combinations of a set of explanatory variables X related to the response variable Y in a general semiparametric regression context. From a practical point of view, the determination of a suitable dimension (the number of linear combinations of X) is important. In the literature, statistical tests based on the nullity of some eigenvalues have been proposed. Another approach is to consider the quality of the estimation of the effective dimension reduction (EDR) space. The square trace correlation between the true EDR space and its estimate can be used as a measure of goodness of estimation. In this article, we focus on the SIRα method and propose a naïve bootstrap estimate of the square trace correlation criterion. Moreover, this criterion can also be used to select the α parameter in the SIRα method. We indicate how it can be used in practice. A simulation study is performed to illustrate the behavior of this approach.
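The square trace correlation has a simple matrix form; the sketch below computes it for two subspaces spanned by the columns of B and B_hat, assuming the usual definition trace(P_B P_Bhat)/d with orthogonal projectors. It is a generic illustration of the criterion, not the bootstrap procedure proposed in the article.

```python
import numpy as np

def square_trace_correlation(B, B_hat):
    """Square trace correlation between two EDR spaces.

    B, B_hat : (p, d) matrices whose columns span the true and estimated
    spaces. Returns trace(P_B @ P_Bhat) / d, which lies in [0, 1] and
    equals 1 when the two subspaces coincide.
    """
    def proj(M):
        Q, _ = np.linalg.qr(M)             # orthonormal basis of span(M)
        return Q @ Q.T                     # orthogonal projector
    d = B.shape[1]
    return np.trace(proj(B) @ proj(B_hat)) / d

# Example: two nearly identical one-dimensional directions in R^5
b = np.array([[1.0, 0.5, 0.0, 0.0, 0.0]]).T
b_noisy = b + 0.05 * np.random.default_rng(2).standard_normal(b.shape)
print(square_trace_correlation(b, b_noisy))   # close to 1
```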

8.
The k nearest neighbors (k-NN) classifier is one of the most popular methods for statistical pattern recognition and machine learning. In practice, the size k, the number of neighbors used for classification, is usually set arbitrarily to one or some other small number, or chosen by a cross-validation procedure. In this study, we propose a novel alternative approach to deciding the size k. Based on a k-NN-based multivariate multi-sample test, we assign each k a permutation-test-based Z-score, and the number of neighbors is set to the k with the highest Z-score. This approach is computationally efficient because we have derived formulas for the mean and variance of the test statistic under the permutation distribution for multiple sample groups. Several simulated and real-world data sets are analyzed to investigate the performance of our approach. The usefulness of our approach is demonstrated through the evaluation of prediction accuracies using the Z-score as a criterion to select the size k. We also compare our approach to the widely used cross-validation approaches. The results show that the size k selected by our approach yields high prediction accuracies when informative features are used for classification, whereas the cross-validation approach may fail in some cases.
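The permutation Z-score itself is specific to the paper and is not reproduced here; for contrast, the sketch below shows the conventional cross-validation comparator for choosing k with scikit-learn, using the Iris data purely as a stand-in example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Score each candidate k and keep the one with the best mean CV accuracy.
candidates = range(1, 31)
scores = [cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
          for k in candidates]
best_k = candidates[int(np.argmax(scores))]
print(best_k, max(scores))
```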

9.
Regulatory agencies typically evaluate the efficacy and safety of new interventions and grant commercial approval based on randomized controlled trials (RCTs). Other major healthcare stakeholders, such as insurance companies and health technology assessment agencies, while basing initial access and reimbursement decisions on RCT results, are also keenly interested in whether results observed in idealized trial settings will translate into comparable outcomes in real world settings—that is, into so-called “real world” effectiveness. Unfortunately, evidence of real world effectiveness for new interventions is not available at the time of initial approval. To bridge this gap, statistical methods are available to extend the estimated treatment effect observed in an RCT to a target population. The generalization is done by weighting the subjects who participated in the RCT so that the weighted trial population resembles the target population. We evaluate a variety of alternative estimation and weight-construction procedures using both simulations and a real world example based on two clinical trials of an investigational intervention for Alzheimer's disease. Our results suggest that the optimal approach to estimation depends on the characteristics of the source and target populations, including the degree of selection bias and treatment effect heterogeneity.
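One common way to construct such weights is from the inverse odds of trial participation estimated with a logistic regression; the sketch below illustrates that generic scheme. It is only one member of this family of procedures and is not necessarily the estimator the authors recommend; the covariates, membership model and data are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def generalization_weights(covariates, in_trial):
    """Weight trial subjects so the weighted trial sample resembles the target.

    covariates : (n, p) array for the pooled trial + target samples
    in_trial   : (n,) 0/1 indicator of trial membership
    Returns inverse-odds-of-participation weights for the trial subjects.
    """
    model = LogisticRegression(max_iter=1000).fit(covariates, in_trial)
    p = model.predict_proba(covariates)[:, 1]     # P(in trial | covariates)
    odds = p / (1.0 - p)
    return 1.0 / odds[in_trial == 1]              # inverse odds weights

# Hypothetical pooled data: 3 covariates, membership depends on the first one
rng = np.random.default_rng(7)
Z = rng.normal(size=(500, 3))
in_trial = rng.binomial(1, 1.0 / (1.0 + np.exp(-Z[:, 0])))
w = generalization_weights(Z, in_trial)
# A weighted outcome analysis of the trial arms would then use these weights,
# e.g. np.average(y_trial, weights=w) within each treatment group.
print(w.mean(), w.min(), w.max())
```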

10.
Capability indices that quantify process potential and process performance are practical tools for successful quality improvement activities and quality program implementation. Most existing methods to assess process capability were derived from the traditional frequentist point of view. This paper considers the problem of estimating and testing process capability based on the third-generation capability index Cpmk from the Bayesian point of view. We first derive the posterior probability p that the process under investigation is capable. A one-sided credible interval, the Bayesian analog of the classical lower confidence interval, can then be obtained to assess process performance. To investigate the effectiveness of the derived results, a series of simulations was undertaken. The results indicate that the performance of the proposed Bayesian approach depends strongly on the value of ξ = (μ − T)/σ. It performs very well, with accurate coverage rates, when μ is sufficiently far from T; in those cases the performance remains acceptable even when the sample size n is as small as 25.
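For reference, the sketch below computes the frequentist plug-in estimate of Cpmk under its standard definition, min(USL − μ, μ − LSL) / (3·sqrt(σ² + (μ − T)²)). The Bayesian posterior probability and credible interval derived in the paper are not reproduced, and the specification limits and data are hypothetical.

```python
import numpy as np

def cpmk_hat(x, lsl, usl, target):
    """Plug-in estimate of the third-generation capability index Cpmk.

    Cpmk = min(USL - mu, mu - LSL) / (3 * sqrt(sigma^2 + (mu - T)^2)),
    with mu and sigma replaced by the sample mean and standard deviation.
    """
    mu, sigma = np.mean(x), np.std(x, ddof=1)
    denom = 3.0 * np.sqrt(sigma**2 + (mu - target)**2)
    return min(usl - mu, mu - lsl) / denom

# Hypothetical sample of n = 25 measurements with specs [4, 16] and target 10
rng = np.random.default_rng(3)
x = rng.normal(loc=11.0, scale=1.2, size=25)
print(cpmk_hat(x, lsl=4.0, usl=16.0, target=10.0))
```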

11.
This paper is concerned with the application of artificial neural networks (ANNs) to a practical, difficult and high-dimensional classification problem: discrimination between selected underwater sounds. The application provides a particular comparison of the relative performance of time-delay as opposed to fully connected network architectures in the analysis of temporal data. More originally, suggestions are given for adapting the conventional backpropagation algorithm to give greater robustness to misclassification errors in the training examples—a particular problem with underwater sound data and one which may arise in other realistic applications of ANNs. An informal comparison is made between the generalisation performance of various architectures in classifying real dolphin sounds when networks are trained using the conventional least-squares norm, L2, the least-absolute-deviation norm, L1, and the Huber criterion, which involves a mixture of both L1 and L2. The results suggest that L1 and Huber may provide performance gains. In order to evaluate these robust adjustments more formally under controlled conditions, an experiment is then conducted using simulated dolphin sounds with known levels of random noise and misclassification error. Here, the results are more ambiguous, and significant interactions are indicated which raise issues for future research.
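To make the three training criteria concrete, the sketch below evaluates the squared-error (L2), absolute-deviation (L1) and Huber losses on a few residuals; the Huber threshold delta = 1 is an illustrative choice. Large residuals, such as those produced by mislabelled training examples, are penalized far less by L1 and Huber than by L2, which is the intuition behind the robustness gains discussed above.

```python
import numpy as np

def l2_loss(r):
    return 0.5 * r**2

def l1_loss(r):
    return np.abs(r)

def huber_loss(r, delta=1.0):
    """Quadratic near zero, linear in the tails: robust to large residuals."""
    small = np.abs(r) <= delta
    return np.where(small, 0.5 * r**2, delta * (np.abs(r) - 0.5 * delta))

# Compare the penalties assigned to small, moderate and large residuals
r = np.array([0.1, 1.0, 5.0])
print(l2_loss(r), l1_loss(r), huber_loss(r))
```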

12.
This paper proposes a variable selection method for detecting abnormal items based on the T2 test when observations on abnormal items are available. Based on unbiased estimates of the powers for all subsets of variables, the method selects the subset of variables that maximizes the power estimate. Since more than one subset of variables frequently maximizes the power estimate, the averaged p-value of the rejected items is used as a second criterion. Although the performance of the method depends on the sample size for the abnormal items and the true power values for all subsets of variables, numerical experiments show the effectiveness of the proposed method. Since normal and abnormal items are simulated using one-factor and two-factor models, basic properties of the power functions for these models are also investigated.

13.
To reduce systematic variation and increase the precision of effect estimation, a practical design strategy is to partition the experimental units into homogeneous groups, known as blocks. How to block the experimental units optimally is therefore an important issue. Blocked general minimum lower order confounding (B1-GMC) is a new criterion for selecting optimal block designs. This paper considers the construction of optimal two-level block designs with respect to the B1-GMC criterion. By utilizing doubling theory and the MaxC2 design, some optimal block designs with respect to the B1-GMC criterion are obtained.

14.
Statistical methods have the potential to be used effectively by industrial practitioners if they satisfy two criteria: functionality and usability. Statistical methods are usually the product of the statistical research activities of universities and other research organizations. Some already satisfy these criteria; however, many do not. The effect is that potentially relevant methods are not used in practice as often as they could be. In this paper we will present an approach to ‘statistics development,’ in which the end-user is given a central position, so that the results of statistical research aim to meet the needs and requirements of the practitioner. Examples of known and new methods will be presented, and we will discuss issues such as education in statistics, the link with statistical consultancy, and the publication of methods through various channels.

15.
The introduction of software to calculate maximum likelihood estimates for mixed linear models has made likelihood estimation a practical alternative to methods based on sums of squares. Likelihood-based tests and confidence intervals, however, may be misleading in problems with small sample sizes. This paper discusses an adjusted version of the directed log-likelihood statistic for mixed models that is highly accurate for testing one-parameter hypotheses. Introduced by Skovgaard (1996, Journal of the Bernoulli Society, 2, 145–165), the statistic is shown to have, in mixed models, a simple compact form that may be obtained from standard software. Simulation studies indicate that this statistic is more accurate than many of the specialized procedures that have been advocated.

16.
‘Success’ in drug development is bringing to patients a new medicine that has an acceptable benefit–risk profile and that is also cost-effective. Cost-effectiveness means that the incremental clinical benefit is deemed worth paying for by a healthcare system, and it plays an important role in enabling manufacturers to get new medicines to patients as soon as possible following regulatory approval. Subgroup analyses are increasingly being utilised by decision-makers when determining the cost-effectiveness of new medicines and making recommendations. This paper highlights the statistical considerations when using subgroup analyses to support cost-effectiveness for a health technology assessment. The key principles recommended for subgroup analyses supporting clinical effectiveness published by Paget et al. are evaluated with respect to subgroup analyses supporting cost-effectiveness. A health technology assessment case study is included to highlight the importance of subgroup analyses when incorporated into cost-effectiveness analyses. In summary, we recommend planning subgroup analyses for cost-effectiveness early in the drug development process and adhering to good statistical principles when using subgroup analyses in this context. In particular, we consider it important to be transparent about how subgroups are defined, to demonstrate the robustness of the subgroup results, and to quantify the uncertainty in the subgroup analyses of cost-effectiveness.

17.
In some industrial applications, the quality of a process or product is characterized by a relationship between the response variable and one or more independent variables, which is called a profile. There are many approaches in the literature for monitoring different types of profiles. Most researchers assume that the response variable follows a normal distribution; however, this assumption may be violated in many cases. The most likely situation is when the response variable follows a distribution from the family of generalized linear models (GLMs). For example, when the response variable is the number of defects in a certain area of a product, the observations follow a Poisson distribution, and ignoring this fact will produce misleading results. In this paper, three methods, including a T2-based method, a likelihood ratio test (LRT) method and an F method, are developed and modified for monitoring GLM regression profiles in Phase I. The performance of the proposed methods is analysed and compared for the special case in which the response variable follows a Poisson distribution. A simulation study is conducted with respect to the probability-of-signal criterion. Results show that the LRT method performs better than the other two methods, and the F method performs better than the T2-based method, in detecting both small and large step shifts as well as drifts. Moreover, the F method performs better than the other two methods, and the LRT method performs poorly in comparison with the F and T2-based methods, in detecting outliers. A real case, in which the size and number of agglomerates ejected from a volcano on successive days form the GLM profile, is illustrated, and the proposed methods are applied to determine whether the number of agglomerates of each size is under statistical control. Results show that the proposed methods can handle this situation and distinguish the out-of-control conditions.
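The sketch below shows, in a simplified form, how a Poisson GLM profile can be fitted and a deviance-based likelihood-ratio statistic computed against a known in-control profile using statsmodels. The in-control coefficients, design points and shift are hypothetical, and the Phase I control limits and exact statistics from the paper are not reproduced.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x = np.linspace(0, 1, 20)                   # profile's independent variable
X = sm.add_constant(x)

beta_ref = np.array([1.0, 2.0])             # assumed in-control coefficients

def lrt_statistic(y):
    """Deviance-based LRT comparing a fitted profile with the reference."""
    fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()
    mu_ref = np.exp(X @ beta_ref)                    # in-control means (log link)
    ll_ref = np.sum(y * np.log(mu_ref) - mu_ref)     # Poisson log-lik, up to a constant
    ll_hat = np.sum(y * np.log(fit.mu) - fit.mu)
    return 2.0 * (ll_hat - ll_ref)

# One in-control profile sample and one with a shifted slope
y_in = rng.poisson(np.exp(X @ beta_ref))
y_out = rng.poisson(np.exp(X @ (beta_ref + np.array([0.0, 0.5]))))
print(lrt_statistic(y_in), lrt_statistic(y_out))
```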

18.
Bayesian analysis often requires the researcher to employ Markov chain Monte Carlo (MCMC) techniques to draw samples from a posterior distribution, which in turn are used to make inferences. Several approaches have been developed to determine convergence of the chain as well as the sensitivity of the resulting inferences. This work develops a Hellinger distance approach to MCMC diagnostics. An approximation to the Hellinger distance between two distributions f and g based on sampling is introduced, and its accuracy is studied via simulation. A criterion for using this Hellinger distance to determine chain convergence is proposed, as well as a criterion for sensitivity studies. These criteria are illustrated using a data set concerning Anguilla australis, an eel native to New Zealand.
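As a simple illustration of the idea, the sketch below approximates the Hellinger distance between two univariate samples by binning them on a common grid, using the identity H²(f, g) = 1 − Σ√(p_i q_i) for the binned probabilities. The paper's sampling-based approximation and its convergence and sensitivity criteria may differ from this naive version.

```python
import numpy as np

def hellinger_from_samples(x, y, bins=50):
    """Approximate the Hellinger distance between the distributions that
    generated samples x and y, using histograms on a common grid.

    H^2(f, g) = 1 - sum_i sqrt(p_i * q_i) for binned probabilities p, q.
    """
    lo, hi = min(x.min(), y.min()), max(x.max(), y.max())
    edges = np.linspace(lo, hi, bins + 1)
    p, _ = np.histogram(x, bins=edges)
    q, _ = np.histogram(y, bins=edges)
    p = p / p.sum()
    q = q / q.sum()
    return np.sqrt(max(0.0, 1.0 - np.sum(np.sqrt(p * q))))

# Two samples from the same distribution should be close in Hellinger
# distance; samples from clearly different distributions should not.
rng = np.random.default_rng(5)
a, b = rng.normal(0, 1, 5000), rng.normal(0, 1, 5000)
c = rng.normal(2, 1, 5000)
print(hellinger_from_samples(a, b), hellinger_from_samples(a, c))
```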

19.
An example is given of a uniformly most accurate unbiased confidence belt which yields absurd confidence statements with 100% occurrence. In several known examples, as well as in the 100%-occurrence counterexample, an optimal confidence belt provides absurd statements because it is inclusion-inconsistent with either a null or an all-inclusive belt, or both. It is concluded that confidence-theory optimality criteria alone are inadequate for practice, and that a consistency criterion is required. An approach based upon inclusion consistency of belts [Cγ(x) ⊆ Cγ′(x), for some x, implies γ ≤ γ′ for the confidence coefficients] is suggested for exact interval estimation in continuous parametric models. Belt inclusion consistency, the existence of a proper-pivotal vector [a pivotal vector T(X, θ) such that the effective range of T(x, ·) is independent of x], and the existence of a confidence distribution are proven mutually equivalent. This consistent approach being restrictive, it is shown, using Neyman's anomalous 1954 example, how to determine whether any given parametric function can be estimated consistently and exactly or whether a consistent nonexact solution must be attempted.

20.
Typically, an optimal smoothing parameter in penalized spline regression is determined by minimizing an information criterion such as the Cp, CV or GCV criterion. Since an explicit solution to the minimization problem for an information criterion cannot be obtained, an iterative search for the optimal smoothing parameter is necessary. In order to avoid this extra computation, a non-iterative method for optimizing smoothness in penalized spline regression is proposed using the formulation of generalized ridge regression. Numerical simulations verify that our method performs better than other methods that optimize the number of basis functions and a single smoothing parameter by means of the CV or GCV criteria.
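For context, the sketch below fits a penalized spline written as a ridge regression and selects the smoothing parameter by the conventional grid-search GCV that a non-iterative method would avoid. The truncated-line basis, knot placement and simulated data are illustrative assumptions, not the formulation proposed in the paper.

```python
import numpy as np

def pspline_gcv(x, y, knots, lambdas):
    """Penalized spline (truncated-line basis) fitted as a ridge regression,
    choosing the smoothing parameter by grid-search GCV."""
    # Design matrix: intercept, linear term, truncated-line basis functions
    B = np.column_stack([np.ones_like(x), x] +
                        [np.maximum(x - k, 0.0) for k in knots])
    # Penalize only the truncated-basis coefficients
    D = np.diag([0.0, 0.0] + [1.0] * len(knots))
    best = None
    for lam in lambdas:
        S = B @ np.linalg.solve(B.T @ B + lam * D, B.T)     # smoother matrix
        resid = y - S @ y
        gcv = len(y) * (resid @ resid) / (len(y) - np.trace(S)) ** 2
        if best is None or gcv < best[0]:
            best = (gcv, lam)
    return best      # (GCV value, selected smoothing parameter)

rng = np.random.default_rng(6)
x = np.linspace(0, 1, 100)
y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(100)
knots = np.linspace(0.05, 0.95, 20)
print(pspline_gcv(x, y, knots, lambdas=np.logspace(-4, 2, 20)))
```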
