Similar Documents
20 similar documents found.
1.
Consider a two-dimensional discrete random variable (X, Y) with possible values 1, 2, …, I for X and 1, 2, …, J for Y. For specifying the distribution of (X, Y), suppose both conditional distributions, of X given Y and of Y given X, are provided. Under this setting, we present different ways of measuring the discrepancy between incompatible conditional distributions in the finite discrete case. In the process, we also suggest different ways of defining the most nearly compatible distributions in incompatible cases. Many new divergence measures are discussed, along with those already known, for determining the most nearly compatible joint distribution P. Finally, a comparative study of all these divergence measures is carried out through some examples.
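To make the compatibility question concrete, the sketch below finds a "most nearly compatible" joint for two incompatible 2×2 conditionals by minimizing one possible discrepancy, the summed Kullback–Leibler divergence from the given conditionals to those induced by the joint. The matrices and the divergence choice are illustrative assumptions, not the paper's specific measures.

```python
import numpy as np
from scipy.optimize import minimize

# Two incompatible conditionals: A = P(X|Y) (columns sum to 1),
# B = P(Y|X) (rows sum to 1). Values are made up for illustration.
A = np.array([[0.8, 0.3],
              [0.2, 0.7]])
B = np.array([[0.6, 0.4],
              [0.2, 0.8]])

def kl(p, q):
    return np.sum(p * np.log(p / q))

def discrepancy(theta):
    P = np.exp(theta).reshape(2, 2)
    P /= P.sum()                                # softmax keeps P a valid joint pmf
    cond_x = P / P.sum(axis=0)                  # induced P(X|Y), column-normalized
    cond_y = P / P.sum(axis=1, keepdims=True)   # induced P(Y|X), row-normalized
    return kl(A, cond_x) + kl(B, cond_y)        # one of many possible divergence choices

res = minimize(discrepancy, np.zeros(4), method="Nelder-Mead")
P_hat = np.exp(res.x).reshape(2, 2)
P_hat /= P_hat.sum()
print("most nearly compatible joint:\n", np.round(P_hat, 3))
print("residual discrepancy:", round(res.fun, 4))   # > 0 signals incompatibility
```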

2.
Parameter estimation based on truncated data is dealt with; the data are assumed to obey truncated exponential distributions with a variety of truncation times: a_1 observations are obtained with truncation time b_1, a_2 observations with truncation time b_2, and so on, while the underlying distribution is the same exponential one. The purpose of the present paper is to give existence conditions for the maximum likelihood estimators (MLEs) and to show some properties of the MLEs in two cases: 1) grouped and truncated data are given (that is, each datum expresses the number of data values falling in a corresponding subinterval); 2) continuous and truncated data are given.
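A minimal numerical sketch of case 2), continuous truncated data with a common exponential rate across groups, is shown below; the group sizes, truncation times and the use of scipy's bounded scalar minimizer are illustrative assumptions, not the paper's existence analysis.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Groups share one exponential rate lam; group k is right-truncated at b[k].
rng = np.random.default_rng(1)
lam_true, b = 0.5, [2.0, 4.0]
samples = [rng.exponential(1 / lam_true, 200) for _ in b]
x = [s[s <= bk] for s, bk in zip(samples, b)]   # keep draws below each truncation time

def neg_loglik(lam):
    ll = 0.0
    for xk, bk in zip(x, b):
        # density of Exp(lam) conditioned on X <= bk: lam*exp(-lam*x)/(1 - exp(-lam*bk))
        ll += len(xk) * np.log(lam) - lam * xk.sum() - len(xk) * np.log1p(-np.exp(-lam * bk))
    return -ll

res = minimize_scalar(neg_loglik, bounds=(1e-6, 10.0), method="bounded")
print("MLE of the common rate:", round(res.x, 4))
```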

3.
Functional data analysis involves the extension of familiar statistical procedures such as principal-components analysis, linear modelling and canonical correlation analysis to data where the raw observation is a function x_i(t). An essential preliminary to a functional data analysis is often the registration or alignment of salient curve features by suitable monotone transformations h_i(t). In effect, this conceptualizes variation among functions as being composed of two aspects: phase and amplitude. Registration aims to remove phase variation as a preliminary to statistical analyses of amplitude variation. A local nonlinear regression technique is described for identifying the smooth monotone transformations h_i, and is illustrated by analyses of simulated and actual data.

4.
Given a collection of n curves that are independent realizations of a functional variable, we are interested in finding patterns in the curve data by exploring low-dimensional approximations to the curves. It is assumed that the data curves are noisy samples from the vector space span{f_1, …, f_m}, where f_1, …, f_m are unknown functions on the real interval (0, T) with square-integrable derivatives of all orders m or less, and m < n. Ramsay [Principal differential analysis: Data reduction by differential operators, J. R. Statist. Soc. Ser. B 58 (1996), pp. 495–508] first proposed the method of regularized principal differential analysis (PDA) as an alternative to principal component analysis for finding low-dimensional approximations to curves. PDA is based on the following theorem: there exists an annihilating linear differential operator (LDO) L of order m such that Lf_i = 0, i = 1, …, m [E.A. Coddington and N. Levinson, Theory of Ordinary Differential Equations, McGraw-Hill, New York, 1955, Theorem 6.2]. PDA specifies m, then uses the data to estimate an annihilating LDO. Smooth estimates of the coefficients of the LDO are obtained by minimizing a penalized sum of the squared norms of the residuals. In this context, the residual is that part of the data curve that is not annihilated by the LDO. PDA obtains the smooth low-dimensional approximation to the data curves by projecting onto the null space of the estimated annihilating LDO; PDA is thus useful for obtaining low-dimensional approximations to the data curves whether or not the interpretation of the annihilating LDO is intuitive or obvious from the context of the data. This paper extends PDA to allow the coefficients of the LDO to depend smoothly on a single continuous covariate. The estimating equations for the coefficients allowing for a continuous covariate are derived; the penalty of Eilers and Marx [Flexible smoothing with B-splines and penalties, Statist. Sci. 11(2) (1996), pp. 89–121] is used to impose smoothness. The results of a small computer simulation study investigating the bias and variance properties of the estimator are reported.

5.
The magnitude of Z(n), the Z score associated with the largest value of X in a data set of size n, is shown to be bounded above by (n − 1)/√n. As a result, outliers defined as values exceeding four standard deviations from the mean cannot exist for small data sets.
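A quick numerical check of the bound (a sketch; the Z score here uses the sample standard deviation with divisor n − 1, which is the form to which Samuelson's inequality applies):

```python
import numpy as np

# Samuelson's inequality: with s the sample SD (divisor n - 1), no Z score
# can exceed (n - 1)/sqrt(n). Since (n - 1)/sqrt(n) < 4 whenever n <= 17,
# a "four standard deviations" outlier is impossible in such small samples.
rng = np.random.default_rng(0)
for n in (5, 10, 17):
    bound = (n - 1) / np.sqrt(n)
    worst = 0.0
    for _ in range(10_000):
        x = rng.normal(size=n)
        z = (x - x.mean()) / x.std(ddof=1)
        worst = max(worst, z.max())
    print(f"n={n:2d}: largest simulated Z = {worst:.3f} <= bound {bound:.3f}")
```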

6.
The Buckley–James estimator (BJE) [J. Buckley and I. James, Linear regression with censored data, Biometrika 66 (1979), pp. 429–436] has been extended from right-censored (RC) data to interval-censored (IC) data by Rabinowitz et al. [D. Rabinowitz, A. Tsiatis, and J. Aragon, Regression with interval-censored data, Biometrika 82 (1995), pp. 501–513]. The BJE is defined to be a zero-crossing of a modified score function H(b), a point at which H(·) changes its sign. We discuss several approaches for finding a BJE with IC data that are extensions of the existing algorithms for RC data. However, these extensions may not be appropriate for some data; in particular, they are not appropriate for a cancer data set that we are analysing. In this note, we present a feasible iterative algorithm for obtaining a BJE and apply the method to our data.
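The zero-crossing definition can be made concrete with a generic grid scan (a sketch: the toy step function below merely stands in for the modified score H(b), which in the real method comes from the Buckley–James estimating equation).

```python
import numpy as np

# Toy stand-in for the modified score H(b): a step function, as H typically
# is with censored data, so an exact root may not exist, only sign changes.
def H(b):
    return np.sign(1.3 - b) * np.ceil(abs(1.3 - b))

def zero_crossings(score, grid):
    vals = np.array([score(b) for b in grid])
    idx = np.nonzero(np.diff(np.sign(vals)) != 0)[0]   # sign change between grid points
    return [(grid[i] + grid[i + 1]) / 2 for i in idx]

print(zero_crossings(H, np.linspace(0.0, 3.0, 301)))   # crossings near b = 1.3
```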

7.
The paper gives a review of a number of data models for aggregate statistical data which have appeared in the computer science literature in the last ten years. After a brief introduction to the data model in general, the fundamental concepts of statistical data are introduced. These are called statistical objects because they are complex data structures (vectors, matrices, relations, time series, etc.) which may have different possible representations (e.g. tables, relations, vectors, pie charts, bar charts, graphs). For this reason a statistical object is defined by two different types of attribute: a summary attribute, with its own summary type and its own instances (called summary data), and the set of category attributes, which describe the summary attribute. Some conceptual models of statistical data (CSM, SDM4S), some semantic models of statistical data (SCM, SAM*, OSAM*), and some graphical models of statistical data (SUBJECT, GRASS, STORM) are also discussed.

8.
The adjusted r² algorithm is a popular automated method for selecting the start time of the terminal disposition phase (t_z) when conducting a noncompartmental pharmacokinetic data analysis. Using simulated data, the performance of the algorithm was assessed in relation to the ratio of the slopes of the preterminal and terminal disposition phases, the point of intercept of the terminal disposition phase with the preterminal disposition phase, the length of the terminal disposition phase captured in the concentration-time profile, the number of data points present in the terminal disposition phase, and the level of variability in concentration measurement. The adjusted r² algorithm was unable to identify t_z accurately when there were more than three data points present in a profile's terminal disposition phase. The terminal disposition phase rate constant (λ_z) calculated based on the value of t_z selected by the algorithm had a positive bias in all simulation data conditions. Tolerable levels of bias (median bias less than 5%) were achieved under conditions of low measurement variability. When measurement variability was high, tolerable levels of bias were attained only when the terminal phase time span was four multiples of t_1/2 or longer. A comparison of the performance of the adjusted r² algorithm, a simple r² algorithm, and t_z selection by visual inspection was conducted using a subset of the simulation data. In the comparison, the simple r² algorithm performed as well as the adjusted r² algorithm, and the visual inspection method outperformed both algorithms. Recommendations concerning the use of the various t_z selection methods are presented.
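The flavour of such an automated rule can be sketched as follows; this is a hypothetical implementation (exact search, tie-breaking and tolerance rules differ across PK software packages, and the example concentration profile is made up):

```python
import numpy as np

def select_tz(t, c, min_pts=3):
    """Pick the terminal-phase start by maximizing the adjusted r^2 of a
    log-linear fit over the last k points. Real packages add tie-breaking
    rules (e.g. prefer more points within a tolerance); omitted here."""
    t, logc = np.asarray(t, float), np.log(np.asarray(c, float))
    best = None
    for k in range(min_pts, len(t) + 1):
        ts, ys = t[-k:], logc[-k:]
        slope, intercept = np.polyfit(ts, ys, 1)
        ss_res = ((ys - (slope * ts + intercept)) ** 2).sum()
        ss_tot = ((ys - ys.mean()) ** 2).sum()
        r2 = 1 - ss_res / ss_tot
        adj_r2 = 1 - (1 - r2) * (k - 1) / (k - 2)   # one predictor, k points
        if best is None or adj_r2 > best[0]:
            best = (adj_r2, t[-k], -slope)          # lambda_z = minus the log-linear slope
    return best                                     # (adjusted r2, selected t_z, lambda_z)

t = [0.5, 1, 2, 4, 8, 12, 24]                       # hours (illustrative profile)
c = [12.0, 9.5, 6.1, 3.0, 1.4, 0.75, 0.11]          # concentrations
print(select_tz(t, c))
```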

9.
When process data follow a particular curve in quality control, profile monitoring is suitable and appropriate for assessing process stability. Previous research in profile monitoring focusing on nonlinear parametric (P) modelling, involving both fixed and random effects, was conducted under the assumption of an accurate nonlinear model specification. Lately, nonparametric (NP) methods have been used in the profile-monitoring context in the absence of an obvious linear P model. This study introduces a novel technique in profile monitoring for any nonlinear and auto-correlated data. Referred to as the nonlinear mixed robust profile monitoring (NMRPM) method, it proposes a semiparametric (SP) approach that combines nonlinear P and NP profile fits for scenarios in which a nonlinear P model is adequate over part of the data but inadequate over the rest. These three methods (P, NP, and NMRPM) account for the auto-correlation within profiles and treat the collection of profiles as a random sample from a common population. During Phase I analysis, a version of Hotelling's T² statistic is proposed for each approach to identify abnormal profiles based on the estimated random effects and obtain the corresponding control limits. The performance of the NMRPM method is then evaluated using a real data set. Results reveal that the NMRPM method is robust to model misspecification and performs adequately against a correctly specified nonlinear P model. Control charts with the NMRPM method have an excellent capability of detecting changes in Phase I data, with control limits that are easily computable.
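Generically, the Phase I screening step computes a Hotelling-type T² for each profile's estimated random effects and flags exceedances. The sketch below illustrates only that step; the random-effect matrix, the shift and the empirical control limit are simulated placeholders, not the paper's P, NP or NMRPM estimators.

```python
import numpy as np

rng = np.random.default_rng(7)
B = rng.normal(size=(30, 3))       # 30 profiles, 3 estimated random effects each
B[4] += 3.0                        # one deliberately shifted (abnormal) profile

d = B - B.mean(axis=0)
S_inv = np.linalg.inv(np.cov(B, rowvar=False))
T2 = np.einsum('ij,jk,ik->i', d, S_inv, d)   # quadratic form per profile

limit = np.percentile(T2, 95)                # placeholder; a real chart derives
print("flagged profiles:", np.where(T2 > limit)[0])  # limits from a reference distribution
```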

10.
Little work has been published on the analysis of censored data for the Birnbaum–Saunders distribution (BISA). In this article, we implement the EM algorithm to fit a regression model with censored data when the failure times follow the BISA. Three approaches to implement the E-Step of the EM algorithm are considered. In two of these implementations, the M-Step is attained by an iterative least-squares procedure. The algorithm is exemplified with a single explanatory variable in the model.

11.
In the framework of null hypothesis significance testing for functional data, we propose a procedure able to select the intervals of the domain imputable for the rejection of the null hypothesis. An unadjusted p-value function and an adjusted one are the output of the procedure, named interval-wise testing. Depending on the sort and level α of type-I error control, significant intervals can be selected by thresholding the two p-value functions at level α. We prove that the unadjusted (adjusted) p-value function controls the probability of type-I error point-wise (interval-wise) and is point-wise (interval-wise) consistent. To highlight the gain in terms of interpretation of the phenomenon under study, we apply interval-wise testing to the analysis of a benchmark functional data set, the Canadian daily temperatures. The new procedure provides insights that current state-of-the-art procedures do not, supporting similar advantages in the analysis of functional data with less prior knowledge.

12.
In survey sampling and in stereology, it is often desirable to estimate the ratio of means θ = E(Y)/E(X) from bivariate count data (X, Y) with unknown joint distribution. We review methods that are available for this problem, with particular reference to stereological applications. We also develop new methods based on explicit statistical models for the data, and associated model diagnostics. The methods are tested on a stereological dataset. For point-count data, binomial regression and bivariate binomial models are generally adequate. Intercept-count data are often overdispersed relative to Poisson regression models, but adequately fitted by negative binomial regression.

13.
A double-censoring scheme occurs when the lifetimes T being measured, from a well-known time origin, are exactly observed within a window [L, R] of observational time and are otherwise censored either from above (right-censored observations) or from below (left-censored observations). The sample data consist of the pairs (U, δ), where U = min{R, max{T, L}} and δ indicates whether T is exactly observed (δ = 0), right-censored (δ = 1) or left-censored (δ = −1). We are interested in the estimation of the marginal behaviour of the three random variables T, L and R based on the observed pairs (U, δ). We propose new nonparametric simultaneous marginal estimators Ŝ_T, Ŝ_L and Ŝ_R for the survival functions of T, L and R, respectively, by means of an inverse-probability-of-censoring approach. The proposed estimators Ŝ_T, Ŝ_L and Ŝ_R are not computationally intensive, generalize the empirical survival estimator and reduce to the Kaplan–Meier estimator in the absence of left-censored data. Furthermore, Ŝ_T is equivalent to a self-consistent estimator, is uniformly strongly consistent and asymptotically normal. The method is illustrated with data from a cohort of drug users recruited in a detoxification program in Badalona (Spain). For these data we estimate the survival function for the elapsed time from starting IV drugs to AIDS diagnosis, as well as the potential follow-up time. A simulation study is discussed to assess the performance of the three survival estimators for moderate sample sizes and different censoring levels.
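The observation scheme is easy to misread, so here is a minimal simulation of the pairs (U, δ); the distributions chosen for T, L and R are illustrative assumptions only.

```python
import numpy as np

# Lifetime T is seen exactly only inside the observational window [L, R].
rng = np.random.default_rng(2)
n = 8
T = rng.exponential(2.0, n)            # lifetimes
L = rng.uniform(0.0, 1.0, n)           # left window ends
R = L + rng.uniform(1.0, 3.0, n)       # right window ends, so L <= R always

U = np.minimum(R, np.maximum(T, L))    # U = min{R, max{T, L}}
delta = np.where(T > R, 1, np.where(T < L, -1, 0))  # 1 right-, -1 left-censored, 0 exact
print(np.column_stack([np.round(U, 2), delta]))
```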

14.
Making predictions of future realized values of random variables based on currently available data is a frequent task in statistical applications. In some applications, the interest is in obtaining a two-sided simultaneous prediction interval (SPI) that contains at least k out of m future observations with a certain confidence level, based on n previous observations from the same distribution. A closely related problem is to obtain a one-sided upper (or lower) simultaneous prediction bound (SPB) to exceed (or be exceeded by) at least k out of m future observations. In this paper, we provide a general approach for computing SPIs and SPBs based on complete or right-censored data from a particular member of the (log-)location-scale family of distributions. The proposed simulation-based procedure provides exact coverage probability for complete and Type II censored data. For Type I censored data, our simulation results show that the procedure gives satisfactory results in small samples. We use three applications to illustrate the proposed simultaneous prediction intervals and bounds.
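For a concrete instance of the simulation idea, the sketch below calibrates a two-sided normal-sample SPI of the form x̄ ± r·s to contain at least k of m future observations with 95% confidence. The normal case, complete data and the bisection tolerance are assumptions made for the sketch; the paper treats the general (log-)location-scale family with censoring.

```python
import numpy as np

rng = np.random.default_rng(5)
n, m, k, B = 20, 10, 8, 20_000

# Pivotal simulation: for location-scale families, the coverage of
# xbar +/- r*s is parameter-free, so N(0, 1) draws suffice.
past = rng.normal(size=(B, n))
fut = rng.normal(size=(B, m))
xbar, s = past.mean(axis=1), past.std(axis=1, ddof=1)

def coverage(r):
    lo, hi = xbar - r * s, xbar + r * s
    inside = (fut >= lo[:, None]) & (fut <= hi[:, None])
    return (inside.sum(axis=1) >= k).mean()

r_lo, r_hi = 0.5, 6.0                  # bisection: coverage(r) increases in r
for _ in range(30):
    mid = (r_lo + r_hi) / 2
    r_lo, r_hi = (mid, r_hi) if coverage(mid) < 0.95 else (r_lo, mid)
print("calibrated factor r:", round((r_lo + r_hi) / 2, 3))
```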

15.
The number of sterile couples in a retrospective study of the number of cycles to conception is necessarily zero; this is not so for a prospective study. The paper puts forward a modification of Weinberg and Gladen's beta-geometric model for cycles to conception that is suitable for both types of investigation. The probability that a couple achieves conception at the xth cycle, but not earlier, is assumed to take the form R_x = (1 − ρ)/(1 − m^(x−1)ρ/u), instead of μ/(1 − θ + θx). The set of parameter restraints (0 < m < 1, 0 < ρ < 1, 1 < u) is appropriate for retrospective data, whilst the alternative set of restraints (1 < m, 1 < ρ, 0 < u < 1) is appropriate for prospective data. The decrease in R_x over time can be interpreted not only as a time effect, but also as a heterogeneity effect, by replacing Weinberg and Gladen's beta mixture of geometric distributions by a q-beta mixture.
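Under the retrospective restraints, the implied cycle-to-conception distribution follows directly from the per-cycle hazard R_x; the parameter values in the sketch below are purely illustrative.

```python
import numpy as np

# Per-cycle conception hazard R(x); retrospective-type restraints
# (0 < m < 1, 0 < rho < 1, u > 1), so R decreases toward 1 - rho.
def R(x, m=0.6, rho=0.4, u=2.0):
    return (1 - rho) / (1 - m ** (x - 1) * rho / u)

x = np.arange(1, 13)
hazard = R(x)
surv = np.concatenate([[1.0], np.cumprod(1 - hazard)[:-1]])  # P(no conception before cycle x)
pmf = hazard * surv                                          # P(conception exactly at cycle x)
print(np.round(pmf, 4))
```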

16.
In the study of the stochastic behaviour of the lifetime of an element as a function of its length, it is often observed that the failure time (or lifetime) decreases as the length increases. In probabilistic terms, this idea can be expressed as follows. Let T be the lifetime of a specimen of length x; then the survival function, which denotes the probability that an element of length x survives past time t, is given by S_T(t, x) = P(T > t/α(x)), where α(x) is a monotonically decreasing function. In particular, it is often assumed that T has a Weibull distribution. In this paper, we propose a generalization of this Weibull model by assuming that the distribution of T is generalized gamma (GG). Since the GG model contains the Weibull, gamma and lognormal models as special and limiting cases, a GG regression model is an appropriate tool for describing the size effect on the lifetime and for selecting among the embedded models. Maximum likelihood estimates are obtained for the GG regression model with α(x) = cx^b. As a special case, this provides an alternative to the usual approach to estimation for the GG distribution, which involves reparametrization. Related parametric inference issues are addressed and illustrated using two experimental data sets. Some discussion of censored data is also provided.
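A sketch of maximum likelihood for a generalized-gamma size-effect model is given below, using scipy's gengamma; its two shape parameters stand in for the paper's GG parametrization, and the data, starting values and Nelder–Mead choice are assumptions.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import gengamma

rng = np.random.default_rng(3)
x = rng.uniform(1.0, 5.0, 300)                 # specimen lengths
# Simulated lifetimes with scale alpha(x) = c * x**b (here c = 3, b = -0.5),
# so longer specimens tend to fail sooner, as in the size-effect setting.
T = gengamma.rvs(a=2.0, c=1.5, scale=3.0 * x ** -0.5, random_state=rng)

def neg_loglik(theta):
    a, p, c, b = theta                         # shapes (a, p), scale coef c, size exponent b
    if min(a, p, c) <= 0:
        return np.inf
    return -gengamma.logpdf(T, a, p, scale=c * x ** b).sum()

res = minimize(neg_loglik, x0=[1.0, 1.0, 1.0, 0.0], method="Nelder-Mead",
               options={"maxiter": 5000})
print("MLE of (a, p, c, b):", np.round(res.x, 3))   # b < 0: lifetime falls with length
```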

17.
RATES OF CONVERGENCE IN SEMI-PARAMETRIC MODELLING OF LONGITUDINAL DATA
We consider the problem of semi-parametric regression modelling when the data consist of a collection of short time series for which measurements within series are correlated. The objective is to estimate a regression function of the form E[Y(t) | x] = x'β + μ(t), where μ(·) is an arbitrary, smooth function of time t, and x is a vector of explanatory variables which may or may not vary with t. For the non-parametric part of the estimation we use a kernel estimator with fixed bandwidth h. When h is chosen without reference to the data, we give exact expressions for the bias and variance of the estimators of β and μ(t), and an asymptotic analysis of the case in which the number of series tends to infinity whilst the number of measurements per series is held fixed. We also report the results of a small-scale simulation study to indicate the extent to which the theoretical results continue to hold when h is chosen by a data-based cross-validation method.
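The partial-residual device behind such semi-parametric fits can be sketched as follows: kernel-smooth both X and Y on t with a fixed bandwidth h, regress the residuals to get β, then smooth what is left to get μ. This is a Speckman-type sketch under independent errors with a toy μ(t); the paper's setting additionally handles correlation within series.

```python
import numpy as np

rng = np.random.default_rng(4)
n, h = 200, 0.15
t = rng.uniform(0.0, 1.0, n)
X = rng.normal(size=(n, 2))
beta = np.array([1.0, -2.0])
y = X @ beta + np.sin(2 * np.pi * t) + rng.normal(0.0, 0.3, n)

def nw_smooth(t, z, h):
    """Nadaraya-Watson smoother in t with a Gaussian kernel; z is (n,) or (n, p)."""
    w = np.exp(-0.5 * ((t[:, None] - t[None, :]) / h) ** 2)
    w /= w.sum(axis=1, keepdims=True)
    return w @ z

X_res = X - nw_smooth(t, X, h)                  # remove the t-dependence of X
y_res = y - nw_smooth(t, y, h)                  # and of Y
beta_hat = np.linalg.lstsq(X_res, y_res, rcond=None)[0]
mu_hat = nw_smooth(t, y - X @ beta_hat, h)      # estimate of mu at the observed t
print("beta_hat:", np.round(beta_hat, 3))
```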

18.
In healthcare studies, count data sets measured with covariates often exhibit heterogeneity and contain extreme values. To analyse such count data sets, we use a finite mixture of regression models framework and investigate a robust estimation approach, called the L2E [D.W. Scott, On fitting and adapting of density estimates, Comput. Sci. Stat. 30 (1998), pp. 124–133], to estimate the parameters. The L2E is based on an integrated L2 distance between the parametric conditional and the true conditional mass functions. In addition to studying the theoretical properties of the L2E estimator, we compare the performance of the L2E with the maximum likelihood (ML) estimator and a minimum Hellinger distance (MHD) estimator via Monte Carlo simulations for correctly specified and gross-error-contaminated mixtures of Poisson regression models. These show that the L2E is a viable robust alternative to the ML and MHD estimators. More importantly, we use the L2E to perform a comprehensive analysis of Western Australia hospital inpatient obstetrical length-of-stay (LOS, in days) data that contain extreme values. It is shown that the L2E provides a two-component Poisson mixture regression fit to the LOS data that is better than those based on the ML and MHD estimators. The L2E fit identifies admission type as a significant covariate that profiles the predominant subpopulation of normal-stayers as planned patients and the small subpopulation of long-stayers as emergency patients.
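The L2E criterion is easiest to state in the simplest, covariate-free Poisson case, sketched below; the paper's actual setting is a mixture of Poisson regressions, and the contamination pattern here is illustrative.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import poisson

# Empirical L2 criterion for a discrete model f_lam:
#   sum_y f_lam(y)^2  -  (2/n) * sum_i f_lam(x_i),
# the data-based version of the integrated L2 distance. An extreme x_i
# enters only through one bounded term, which is the source of robustness.
rng = np.random.default_rng(6)
x = np.concatenate([rng.poisson(3.0, 95), np.array([40, 45, 50, 55, 60])])  # 5% gross errors

def l2e(lam):
    grid = np.arange(0, 200)                   # support large enough to cover the mass
    f = poisson.pmf(grid, lam)
    return (f ** 2).sum() - 2 * poisson.pmf(x, lam).mean()

res = minimize_scalar(l2e, bounds=(0.1, 20.0), method="bounded")
print("L2E estimate:", round(res.x, 3), "vs ML (sample mean):", round(x.mean(), 3))
```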

19.
In this work, we develop a method of adaptive non-parametric estimation based on 'warped' kernels. The aim is to estimate a real-valued function s from a sample of random couples (X, Y). We deal with transformed data (Φ(X), Y), with Φ a one-to-one function, to build a collection of kernel estimators. The data-driven bandwidth selection is performed with a method inspired by Goldenshluger and Lepski (Ann. Statist., 39, 2011, 1608). The method makes it possible to handle various problems such as additive and multiplicative regression, conditional density estimation, hazard rate estimation based on randomly right-censored data, and cumulative distribution function estimation from current-status data. The interest is threefold. First, the squared-bias/variance trade-off is automatically realized. Next, non-asymptotic risk bounds are derived. Lastly, the estimator is easily computed, thanks to its simple expression; a short simulation study is presented.

20.
A lifetime capability index, L_tp, has been proposed to measure business lifetime performance, wherein the output lifetime measurements are assumed to be precise and to come from a Pareto model with censored information. In the present study, we consider a more realistic situation in which the lifetime output data are imprecise. The approach developed by Buckley [Fuzzy system, Soft Comput. 9 (2005), pp. 757–760; Fuzzy statistics: Regression and prediction, Soft Comput. 9 (2005), pp. 769–775], incorporating some extensions (a set of confidence intervals, one on top of the other), is used to construct the triangular-shaped fuzzy number for the fuzzy estimate of L_tp. With the sampling distribution of the unbiased estimator of L_tp, two useful fuzzy inference criteria, the critical value and the fuzzy p-value, are obtained to assess the lifetime performance. The presented methodology can handle lifetime performance assessment when the sample lifetime data involve imprecise information, classifying the lifetime performance with the three-decision rule. With different preset requirements and a certain degree of imprecise data, we also develop a four-quadrant decision-making plot in which managers can simultaneously visualize several important features of lifetime performance when making a decision. An example of business lifetime data is given to illustrate the applicability of the proposed method.
