
1.

Penalized angular regression for personalized predictions

Kristoffer H. Hellton 《Scandinavian Journal of Statistics》2023,50(1):184-212

Personalization is becoming an important aspect of many predictive applications. We introduce a penalized regression method which inherently implements personalization. Personalized angle (PAN) regression constructs regression coefficients that are specific to the covariate vector for which one is producing a prediction, thus personalizing the regression model itself. This is achieved by penalizing the normalized prediction for a given covariate vector. The method therefore penalizes the normalized regression coefficients, or the angles of the regression coefficients in a hyperspherical parametrization, introducing a new angle-based class of penalties. PAN hence combines two novel concepts: penalizing the normalized coefficients and personalization. For an orthogonal design matrix, we show that the PAN estimator is the solution to a low-dimensional eigenvector equation. Based on the hyperspherical parametrization, we construct an efficient algorithm to calculate the PAN estimator. We propose a parametric bootstrap procedure for selecting the tuning parameter, and simulations show that PAN regression can outperform ordinary least squares, ridge regression and other penalized regression methods in terms of prediction error. Finally, we demonstrate the method in a medical application.

2.

Modified weighted squared error estimation procedures with special emphasis on the stable laws

A. S. Paulson T. A. Delehanty 《Communications in Statistics - Simulation and Computation》2013,42(4):927-972

Two families of parameter estimation procedures for the stable laws based on a variant of the characteristic function are provided. The methodology which produces viable computational procedures for the stable laws is generally applicable to other families of distributions across a variety of settings. Both families of procedures may be described as a modified weighted chi-squared minimization procedure, and both explicitly take account of constraints on the parameter space. Influence functions for and efficiencies of the estimators are given. If x₁, x₂, …, x_n is a random sample from an unknown distribution F, a method for determining the stable law to which F is attracted is developed. Procedures for regression and autoregression with stable error structure are provided. A number of examples are given.

3.

Tempered stable Ornstein–Uhlenbeck processes: A practical view

Michele Leonardo Bianchi Svetlozar T. Rachev Frank J. Fabozzi 《Communications in Statistics - Simulation and Computation》2017,46(1):423-445

We study the one-dimensional Ornstein–Uhlenbeck (OU) processes with marginal law given by tempered stable and tempered infinitely divisible distributions. We investigate the transition law between consecutive observations of these processes and evaluate the characteristic function of integrated tempered OU processes with a view toward practical applications. We then analyze how to draw a random sample from this class of processes by considering both the classical inverse transform algorithm and an acceptance–rejection method based on simulating a stable random sample. Using a maximum likelihood estimation method based on the fast Fourier transform, we empirically assess the simulation algorithm performance.

4.

Not the First Digit! Using Benford's Law to Detect Fraudulent Scientific Data

Andreas Diekmann 《Journal of applied statistics》2007,34(3):321-329

Digits in statistical data produced by natural or social processes are often distributed in a manner described by ‘Benford's law’. Recently, a test against this distribution was used to identify fraudulent accounting data. This test is based on the supposition that first, second, third, and other digits in real data follow the Benford distribution while the digits in fabricated data do not. Is it possible to apply Benford tests to detect fabricated or falsified scientific data as well as fraudulent financial data? We approached this question in two ways. First, we examined the use of the Benford distribution as a standard by checking the frequencies of the nine possible first and ten possible second digits in published statistical estimates. Second, we conducted experiments in which subjects were asked to fabricate statistical estimates (regression coefficients). The digits in these experimental data were scrutinized for possible deviations from the Benford distribution. There were two main findings. First, both digits of the published regression coefficients were approximately Benford distributed or at least followed a pattern of monotonic decline. Second, the experimental results yielded new insights into the strengths and weaknesses of Benford tests. Surprisingly, first digits of faked data also exhibited a pattern of monotonic decline, while second, third, and fourth digits were distributed less in accordance with Benford's law. At least in the case of regression coefficients, there were indications that checks for digit-preference anomalies should focus less on the first (i.e. leftmost) and more on later digits.
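
For readers unfamiliar with the reference distribution, the following is a minimal sketch of the standard Benford probabilities for the nine first digits and ten second digits, together with a simple chi-squared discrepancy for first digits. The function names are hypothetical, and the discrepancy statistic is only an illustration of a digit check; it is not claimed to be the test used in the article.

```python
import math


def benford_first_digit(d: int) -> float:
    """Benford probability that the first significant digit equals d (1..9)."""
    return math.log10(1 + 1 / d)


def benford_second_digit(d: int) -> float:
    """Benford probability that the second digit equals d (0..9).

    Obtained by summing over the nine possible first digits k.
    """
    return sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))


def first_digit(x: float) -> int:
    """Leading significant digit of a nonzero number."""
    # Scientific notation puts the leading digit first, e.g. '3.21...e-02'.
    return int(f"{abs(x):.15e}"[0])


def chi_square_first_digit(values) -> float:
    """Chi-squared discrepancy between observed first digits and Benford."""
    counts = [0] * 9
    for v in values:
        if v != 0:  # zero has no leading significant digit
            counts[first_digit(v) - 1] += 1
    n = sum(counts)
    if n == 0:
        return 0.0
    stat = 0.0
    for d in range(1, 10):
        expected = n * benford_first_digit(d)
        stat += (counts[d - 1] - expected) ** 2 / expected
    return stat
```

Under this distribution a leading 1 has probability of about 0.301, while the second-digit probabilities are much flatter (about 0.120 for a 0), which is one reason deviations in later digits can be harder to see by eye.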

5.

The Absolute Difference Law For Expectations

Liang Hong 《The American statistician》2015,69(1):8-10

We revisit the addition law for expectations and present a sibling law: the absolute law for expectations. We show that these two laws and their corresponding laws for probabilities can be reconciled under a single framework. As an application, we use the absolute law for expectations to calculate the mean absolute deviation. Finally, we remark on a hidden point in a related article previously published on these pages; this will help readers to avoid a potential pitfall.
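
The abstract does not spell out the two laws. As a hedged illustration only, assume the addition law is read as E[max(X,Y)] = E[X] + E[Y] − E[min(X,Y)], by analogy with P(A ∪ B) = P(A) + P(B) − P(A ∩ B). Under that assumption, a sibling identity for the absolute difference follows from pointwise identities, for integrable X and Y:

```latex
\begin{align*}
\max(x,y) + \min(x,y) &= x + y, \\
|x - y| &= \max(x,y) - \min(x,y) = x + y - 2\min(x,y), \\
\mathbb{E}\,|X - Y| &= \mathbb{E}[X] + \mathbb{E}[Y] - 2\,\mathbb{E}[\min(X,Y)].
\end{align*}
```

For example, if X and Y are independent and uniform on (0, 1), then E[min(X,Y)] = 1/3, so E|X − Y| = 1/2 + 1/2 − 2/3 = 1/3. Taking Y to be the constant μ = E[X] gives the mean absolute deviation as E|X − μ| = 2μ − 2E[min(X, μ)].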

6.

Bias corrected estimation for generalized probit regression with covariate measurement error and censored responses

Yueqin Wu Minggao Gu 《Journal of statistical planning and inference》2012,142(1):221-231

In this paper, we propose a bias corrected estimate of the regression coefficient for the generalized probit regression model when the covariates are subject to measurement error and the responses are subject to interval censoring. The main improvement of our method is that it reduces most of the bias that the naive estimates have. The great advantage of our method is that it is baseline and censoring distribution free, in the sense that the investigator does not need to calculate the baseline or the censoring distribution to obtain the estimator of the regression coefficient, an important property of the Cox regression model. A sandwich estimator for the variance is also proposed. Our procedure can be generalized to a general measurement error distribution as long as the first four moments of the measurement error are known. The results of extensive simulations show that our approach is very effective in eliminating the bias when the measurement error is not too large relative to the error term of the regression model.

7.

Some Simple Method of Estimation for the Parameters of the Discrete Stable Distribution with the Probability Generating Function

Louis G. Doray Shu Mei Jiang Andrew Luong 《Communications in Statistics - Simulation and Computation》2013,42(9):2004-2017

In this article, we develop a method to estimate the two parameters of the discrete stable distribution. By minimizing the quadratic distance between transforms of the empirical and theoretical probability generating functions, we obtain estimators that are simple to calculate, asymptotically unbiased, and normally distributed. We also derive the expression for their variance–covariance matrix. We simulate several samples of discrete stable distributed datasets with different parameters, to analyze the effect of truncation on the right tail of the distribution.

8.

General partially linear varying-coefficient transformation models for ranking data

Jianbo Li Minggao Gu Tao Hu 《Journal of applied statistics》2012,39(7):1475-1488

In this paper, we propose a class of general partially linear varying-coefficient transformation models for ranking data. In the models, the functional coefficients are viewed as nuisance parameters and approximated by a B-spline smoothing approximation technique. The B-spline coefficients and regression parameters are estimated by the rank-based maximum marginal likelihood method. The three-stage Monte Carlo Markov Chain stochastic approximation algorithm based on ranking data is used to compute estimates and the corresponding variances for all the B-spline coefficients and regression parameters. Through three simulation studies and a Hong Kong horse racing data application, the proposed procedure is illustrated to be accurate, stable and practical.

9.

Alternative Estimators for Randomized Response Techniques in Multi-Character Surveys

Raghunath Arnab 《Communications in Statistics - Theory and Methods》2013,42(10):1839-1848

Tail estimates are developed for power law probability distributions with exponential tempering, using a conditional maximum likelihood approach based on the upper-order statistics. Tempered power law distributions are intermediate between heavy power-law tails and Laplace or exponential tails, and are sometimes called “semi-heavy” tailed distributions. The estimation method is demonstrated on simulated data from a tempered stable distribution, and for several data sets from geophysics and finance that show a power law probability tail with some tempering.

10.

Partially linear models and their applications to change point detection of chemical process data

Clécio S. Ferreira Camila B. Zeller Aparecida M. S. Mimura Júlio C. J. Silva 《Journal of applied statistics》2017,44(12):2125-2141

In many chemical data sets, the amount of radiation absorbed (absorbance) is related to the concentration of the element in the sample by Lambert–Beer's law. However, this relation changes abruptly when the variable concentration reaches an unknown threshold level, the so-called change point. In the context of analytical chemistry, there are many methods that describe the relationship between absorbance and concentration, but none of them provide inferential procedures to detect change points. In this paper, we propose partially linear models with a change point separating the parametric and nonparametric components. The Schwarz information criterion is used to locate a change point. A back-fitting algorithm is presented to obtain parameter estimates and the penalized Fisher information matrix is obtained to calculate the standard errors of the parameter estimates. To examine the proposed method, we present a simulation study. Finally, we apply the method to data sets from the chemistry area. The partially linear models with a change point developed in this paper are useful supplements to other methods of absorbance–concentration analysis in chemical studies, for example, and in many other practical applications.

11.

An approximate maximum likelihood procedure for parameter estimation in multivariate discrete data regression models

Andrew W. Roddam 《Journal of applied statistics》2001,28(2):273-279

This paper considers an alternative to iterative procedures used to calculate maximum likelihood estimates of regression coefficients in a general class of discrete data regression models. These models can include both marginal and conditional models and also local regression models. The classical estimation procedure is generally via a Fisher-scoring algorithm and can be computationally intensive for high-dimensional problems. The alternative method proposed here is non-iterative and is likely to be more efficient in high-dimensional problems. The method is demonstrated on two different classes of regression models.

12.

Likelihood estimation for longitudinal zero-inflated power series regression models

E. Bahrami Samani Y. Amirian M. Ganjali 《Journal of applied statistics》2012,39(9):1965-1974

In this paper, a zero-inflated power series regression model for longitudinal count data with excess zeros is presented. We demonstrate how to calculate the likelihood for such data when it is assumed that the increment in the cumulative total follows a discrete distribution with a location parameter that depends on a linear function of explanatory variables. Simulation studies indicate that this method can provide improvements in obtaining standard errors of the estimates. We also calculate the dispersion index for this model. The influence of a small perturbation of the dispersion index of the zero-inflated model on likelihood displacement is also studied. The zero-inflated negative binomial regression model is illustrated on data regarding joint damage in psoriatic arthritis.

13.

Efficient posterior integration in stable paretian models

Efthymios G. Tsionas 《Statistical Papers》2000,41(3):305-325

The paper proposes a Markov Chain Monte Carlo method for Bayesian analysis of general regression models with disturbances from the family of stable distributions with arbitrary characteristic exponent and skewness parameter. The method does not require data augmentation and is based on combining fast Fourier transforms of the characteristic function to get the likelihood function and a Metropolis random walk chain to perform posterior analysis. Both a validation nonlinear regression and a nonlinear model for the Standard and Poor’s composite price index illustrate the method.

14.

Estimating the Risk of a Down's Syndrome Term Pregnancy Using Age and Serum Markers: Comparison of Various Methods

Sándor Baran Kinga Sikolya Lajos Veress 《Communications in Statistics - Simulation and Computation》2013,42(7):1654-1672

The risk of an individual woman having a pregnancy associated with Down's syndrome is estimated given her age, α-fetoprotein, human chorionic gonadotropin, and pregnancy-specific β1-glycoprotein levels. The classical estimation method is based on discriminant analysis under the assumption of lognormality of the marker values, but logistic regression is also applied for data classification. In the present work, we compare the performance of the two methods using a dataset containing the data of almost 89,000 unaffected and 333 affected pregnancies. Assuming lognormality of the marker values, we also calculate the theoretical detection and false positive rates for both methods.

15.

Parameter estimation for multivariate linear regression models with linear constraints

Xiaosheng Li Shenling Wang 《Statistical Research》2016,33(11):85-92

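This paper first constructs the sample likelihood function of the multivariate linear regression model under linear constraints and uses the Lagrange method to establish its validity. Second, from the perspective of the likelihood function, it discusses the effect of the linear constraints on the model parameters and develops Bayesian and empirical Bayesian improvements to the parameter estimates obtained from the traditional theory. For the Bayesian improvement, the matrix normal-Wishart distribution is taken as the joint conjugate prior for the model parameters and the precision matrix; combined with the constructed likelihood function, this yields the posterior distribution of the parameters, from which the Bayesian estimates are computed. For the empirical Bayesian improvement, the sample is split into groups, the effect of the subsample-based parameter estimates on the full-sample parameter estimates is discussed from the perspective of variance, and the empirical Bayes estimates are computed. Finally, simulations are carried out on random matrices generated with Matlab. The results show that both improved estimates are more accurate than those obtained from the traditional theory, with a smaller error ratio in the fitted results and higher reliability, and that the computation is faster for large data sets.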

16.

Invariance of linearity of regression and related characterizations of some classical discrete distributions

G. P. Patil M. V. Ratnaparkhi 《Communications in Statistics - Theory and Methods》2013,42(2):167-174

Rao (1963) introduced what we call an additive damage model. In this model, the original observation is subjected to damage according to a specified probability law by the survival distribution. In this paper, we consider a bivariate observation with the second component subjected to damage. Using the invariance of linearity of regression of the first component on the second under the transition of the second component from the original to the damaged state, we obtain the characterizations of the Poisson, binomial and negative binomial distributions within the framework of the additive damage model.

17.

Laws of the Iterated Logarithm and a Moderate Deviation of MLE for the Proportional Hazards Model with Incomplete Information

Yurong Chen Luqin Liu 《Communications in Statistics - Theory and Methods》2013,42(22):4696-4708

In this paper, we obtain a law of iterated logarithm, a Chung-type law of iterated logarithm, and a moderate deviation result of the maximum likelihood estimator (MLE) for the unknown regression parameter vector in a proportional hazards model with incomplete information.

18.

Empirical Likelihood Based Synthetic Data Method for Censored Regression Analysis

Ming Zheng Yunting Sun 《Communications in Statistics - Theory and Methods》2013,42(24):4365-4383

In this article, we propose a new empirical likelihood method for linear regression analysis with a right censored response variable. The method is based on the synthetic data approach for censored linear regression analysis. A log-empirical likelihood ratio test statistic for the entire regression coefficients vector is developed and we show that it converges to a standard chi-squared distribution. The proposed method can also be used to make inferences about linear combinations of the regression coefficients. Moreover, the proposed empirical likelihood ratio provides a way to combine different normal equations derived from various synthetic response variables. Maximizing this empirical likelihood ratio yields a maximum empirical likelihood estimator which is asymptotically equivalent to the solution of the estimating equation that is an optimal linear combination of the original normal equations. It improves the estimation efficiency. The method is illustrated by some Monte Carlo simulation studies as well as a real example.

19.

Laplace based approximate posterior inference for differential equation models

Sarat C. Dass Jaeyong Lee Kyoungjae Lee Jonghun Park 《Statistics and Computing》2017,27(3):679-698

Ordinary differential equations are arguably the most popular and useful mathematical tool for describing physical and biological processes in the real world. Often, these physical and biological processes are observed with errors, in which case the most natural way to model such data is via regression where the mean function is defined by an ordinary differential equation believed to provide an understanding of the underlying process. These regression based dynamical models are called differential equation models. Parameter inference from differential equation models poses computational challenges mainly due to the fact that analytic solutions to most differential equations are not available. In this paper, we propose an approximation method for obtaining the posterior distribution of parameters in differential equation models. The approximation is done in two steps. In the first step, the solution of a differential equation is approximated by the general one-step method, which is a class of numerical methods for ordinary differential equations including the Euler and the Runge-Kutta procedures; in the second step, nuisance parameters are marginalized using Laplace approximation. The proposed Laplace approximated posterior gives a computationally fast alternative to the full Bayesian computational scheme (such as Markov Chain Monte Carlo) and produces more accurate and stable estimators than the popular smoothing methods (called collocation methods) based on frequentist procedures. For a theoretical support of the proposed method, we prove that the Laplace approximated posterior converges to the actual posterior under certain conditions and analyze the relation between the order of numerical error and its Laplace approximation. The proposed method is tested on simulated data sets and compared with the other existing methods.
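
As a rough illustration of the first step only, the sketch below shows two textbook one-step methods for a scalar equation x'(t) = f(t, x): the Euler step and the classical fourth-order Runge-Kutta step. The function names and the driver `solve` are hypothetical and are not taken from the article; the Laplace approximation step is not shown.

```python
import math


def euler_step(f, t, x, h):
    """One Euler step: x_{i+1} = x_i + h * f(t_i, x_i)."""
    return x + h * f(t, x)


def rk4_step(f, t, x, h):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(t, x)
    k2 = f(t + h / 2, x + h / 2 * k1)
    k3 = f(t + h / 2, x + h / 2 * k2)
    k4 = f(t + h, x + h * k3)
    return x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def solve(step, f, t0, x0, t_end, n_steps):
    """Apply a one-step method n_steps times on [t0, t_end]; return x(t_end)."""
    h = (t_end - t0) / n_steps
    t, x = t0, x0
    for _ in range(n_steps):
        x = step(f, t, x, h)
        t += h
    return x


# Example: x' = -x with x(0) = 1, whose exact solution is exp(-t).
decay = lambda t, x: -x
exact = math.exp(-1.0)
euler_error = abs(solve(euler_step, decay, 0.0, 1.0, 1.0, 100) - exact)
rk4_error = abs(solve(rk4_step, decay, 0.0, 1.0, 1.0, 100) - exact)
```

With the same step size the Runge-Kutta error is several orders of magnitude smaller than the Euler error, which is the kind of numerical-error order that the abstract relates to the accuracy of the approximated posterior.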

20.

Robust plug-in bandwidth estimators in nonparametric regression

《Journal of statistical planning and inference》1997,57(1):109-142

In this paper, we propose a robust bandwidth selection method for local M-estimates used in nonparametric regression. We study the asymptotic behavior of the resulting estimates. We use the results of a Monte Carlo study to compare the performance of various competitors for moderate sample sizes. It appears that the robust plug-in bandwidth selector we propose compares favorably to its competitors, despite the need to select a pilot bandwidth. The Monte Carlo study shows that the robust plug-in bandwidth selector is very stable and relatively insensitive to the choice of the pilot.