Similar Documents
20 similar documents found.
1.
Consider a two-dimensional discrete random variable (X, Y) with possible values 1, 2, …, I for X and 1, 2, …, J for Y. For specifying the distribution of (X, Y), suppose both conditional distributions, of X given Y and of Y given X, are provided. Under this setting, we present different ways of measuring the discrepancy between incompatible conditional distributions in the finite discrete case. In the process, we also suggest different ways of defining the most nearly compatible distributions in incompatible cases. Many new divergence measures are discussed, along with those already known, for determining the most nearly compatible joint distribution P. Finally, a comparative study of all these divergence measures is carried out through some examples.
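To make the compatibility question concrete, the sketch below finds a "most nearly compatible" joint for two incompatible 2×2 conditionals by minimizing one possible discrepancy, the summed Kullback–Leibler divergence from the given conditionals to those induced by the joint. The matrices and the divergence choice are illustrative assumptions, not the paper's specific measures.

```python
import numpy as np
from scipy.optimize import minimize

# Two incompatible conditionals: A = P(X|Y) (columns sum to 1),
# B = P(Y|X) (rows sum to 1). Values are made up for illustration.
A = np.array([[0.8, 0.3],
              [0.2, 0.7]])
B = np.array([[0.6, 0.4],
              [0.2, 0.8]])

def kl(p, q):
    return np.sum(p * np.log(p / q))

def discrepancy(theta):
    P = np.exp(theta).reshape(2, 2)
    P /= P.sum()                                # softmax keeps P a valid joint pmf
    cond_x = P / P.sum(axis=0)                  # induced P(X|Y), column-normalized
    cond_y = P / P.sum(axis=1, keepdims=True)   # induced P(Y|X), row-normalized
    return kl(A, cond_x) + kl(B, cond_y)        # one of many possible divergence choices

res = minimize(discrepancy, np.zeros(4), method="Nelder-Mead")
P_hat = np.exp(res.x).reshape(2, 2)
P_hat /= P_hat.sum()
print("most nearly compatible joint:\n", np.round(P_hat, 3))
print("residual discrepancy:", round(res.fun, 4))   # > 0 signals incompatibility
```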

2.
Parameter estimation based on truncated data is dealt with; the data are assumed to obey truncated exponential distributions with a variety of truncation times: a_1 observations are obtained with truncation time b_1, a_2 observations with truncation time b_2, and so on, while the underlying distribution is the same exponential one. The purpose of the present paper is to give existence conditions for the maximum likelihood estimators (MLEs) and to show some properties of the MLEs in two cases: 1) grouped and truncated data are given (that is, each datum expresses the number of data values falling in a corresponding subinterval); 2) continuous and truncated data are given.
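A minimal numerical sketch of case 2), continuous truncated data with a common exponential rate across groups, is shown below; the group sizes, truncation times and the use of scipy's bounded scalar minimizer are illustrative assumptions, not the paper's existence analysis.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Groups share one exponential rate lam; group k is right-truncated at b[k].
rng = np.random.default_rng(1)
lam_true, b = 0.5, [2.0, 4.0]
samples = [rng.exponential(1 / lam_true, 200) for _ in b]
x = [s[s <= bk] for s, bk in zip(samples, b)]   # keep draws below each truncation time

def neg_loglik(lam):
    ll = 0.0
    for xk, bk in zip(x, b):
        # density of Exp(lam) conditioned on X <= bk: lam*exp(-lam*x)/(1 - exp(-lam*bk))
        ll += len(xk) * np.log(lam) - lam * xk.sum() - len(xk) * np.log1p(-np.exp(-lam * bk))
    return -ll

res = minimize_scalar(neg_loglik, bounds=(1e-6, 10.0), method="bounded")
print("MLE of the common rate:", round(res.x, 4))
```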

3.
Functional data analysis involves the extension of familiar statistical procedures such as principal-components analysis, linear modelling and canonical correlation analysis to data where the raw observation is a function x_i(t). An essential preliminary to a functional data analysis is often the registration or alignment of salient curve features by suitable monotone transformations h_i(t). In effect, this conceptualizes variation among functions as being composed of two aspects: phase and amplitude. Registration aims to remove phase variation as a preliminary to statistical analyses of amplitude variation. A local nonlinear regression technique is described for identifying the smooth monotone transformations h_i, and is illustrated by analyses of simulated and actual data.

4.
Given a collection of n curves that are independent realizations of a functional variable, we are interested in finding patterns in the curve data by exploring low-dimensional approximations to the curves. It is assumed that the data curves are noisy samples from the vector space span{f_1, …, f_m}, where f_1, …, f_m are unknown functions on the real interval (0, T) with square-integrable derivatives of all orders m or less, and m < n. Ramsay [Principal differential analysis: Data reduction by differential operators, J. R. Statist. Soc. Ser. B 58 (1996), pp. 495–508] first proposed the method of regularized principal differential analysis (PDA) as an alternative to principal component analysis for finding low-dimensional approximations to curves. PDA is based on the following theorem: there exists an annihilating linear differential operator (LDO) L of order m such that Lf_i = 0, i = 1, …, m [E.A. Coddington and N. Levinson, Theory of Ordinary Differential Equations, McGraw-Hill, New York, 1955, Theorem 6.2]. PDA specifies m, then uses the data to estimate an annihilating LDO. Smooth estimates of the coefficients of the LDO are obtained by minimizing a penalized sum of the squared norms of the residuals. In this context, the residual is that part of the data curve that is not annihilated by the LDO. PDA obtains the smooth low-dimensional approximation to the data curves by projecting onto the null space of the estimated annihilating LDO; PDA is thus useful for obtaining low-dimensional approximations to the data curves whether or not the interpretation of the annihilating LDO is intuitive or obvious from the context of the data. This paper extends PDA to allow the coefficients of the LDO to depend smoothly on a single continuous covariate. The estimating equations for the coefficients allowing for a continuous covariate are derived; the penalty of Eilers and Marx [Flexible smoothing with B-splines and penalties, Statist. Sci. 11(2) (1996), pp. 89–121] is used to impose smoothness. The results of a small computer simulation study investigating the bias and variance properties of the estimator are reported.

5.
The magnitude of Z(n), the Z score associated with the largest value of X in a data set of size n, is shown to be bounded above by (n − 1)/√n. As a result, outliers defined as values exceeding four standard deviations from the mean cannot exist for small data sets.
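A quick numerical check of the bound (a sketch; the Z score here uses the sample standard deviation with divisor n − 1, which is the form to which Samuelson's inequality applies):

```python
import numpy as np

# Samuelson's inequality: with s the sample SD (divisor n - 1), no Z score
# can exceed (n - 1)/sqrt(n). Since (n - 1)/sqrt(n) < 4 whenever n <= 17,
# a "four standard deviations" outlier is impossible in such small samples.
rng = np.random.default_rng(0)
for n in (5, 10, 17):
    bound = (n - 1) / np.sqrt(n)
    worst = 0.0
    for _ in range(10_000):
        x = rng.normal(size=n)
        z = (x - x.mean()) / x.std(ddof=1)
        worst = max(worst, z.max())
    print(f"n={n:2d}: largest simulated Z = {worst:.3f} <= bound {bound:.3f}")
```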

6.
The Buckley–James estimator (BJE) [J. Buckley and I. James, Linear regression with censored data, Biometrika 66 (1979), pp. 429–436] has been extended from right-censored (RC) data to interval-censored (IC) data by Rabinowitz et al. [D. Rabinowitz, A. Tsiatis, and J. Aragon, Regression with interval-censored data, Biometrika 82 (1995), pp. 501–513]. The BJE is defined to be a zero-crossing of a modified score function H(b), a point at which H(·) changes its sign. We discuss several approaches for finding a BJE with IC data that are extensions of the existing algorithms for RC data. However, these extensions may not be appropriate for some data; in particular, they are not appropriate for a cancer data set that we are analysing. In this note, we present a feasible iterative algorithm for obtaining a BJE and apply the method to our data.
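The zero-crossing definition can be made concrete with a generic grid scan (a sketch: the toy step function below merely stands in for the modified score H(b), which in the real method comes from the Buckley–James estimating equation).

```python
import numpy as np

# Toy stand-in for the modified score H(b): a step function, as H typically
# is with censored data, so an exact root may not exist, only sign changes.
def H(b):
    return np.sign(1.3 - b) * np.ceil(abs(1.3 - b))

def zero_crossings(score, grid):
    vals = np.array([score(b) for b in grid])
    idx = np.nonzero(np.diff(np.sign(vals)) != 0)[0]   # sign change between grid points
    return [(grid[i] + grid[i + 1]) / 2 for i in idx]

print(zero_crossings(H, np.linspace(0.0, 3.0, 301)))   # crossings near b = 1.3
```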

7.
The paper gives a review of a number of data models for aggregate statistical data which have appeared in the computer science literature in the last ten years. After a brief introduction to the data model in general, the fundamental concepts of statistical data are introduced. These are called statistical objects because they are complex data structures (vectors, matrices, relations, time series, etc.) which may have different possible representations (e.g. tables, relations, vectors, pie charts, bar charts, graphs). For this reason a statistical object is defined by two different types of attribute: a summary attribute, with its own summary type and its own instances (called summary data), and the set of category attributes, which describe the summary attribute. Some conceptual models of statistical data (CSM, SDM4S), some semantic models of statistical data (SCM, SAM*, OSAM*), and some graphical models of statistical data (SUBJECT, GRASS, STORM) are also discussed.

8.
The adjusted r² algorithm is a popular automated method for selecting the start time of the terminal disposition phase (t_z) when conducting a noncompartmental pharmacokinetic data analysis. Using simulated data, the performance of the algorithm was assessed in relation to the ratio of the slopes of the preterminal and terminal disposition phases, the point of intercept of the terminal disposition phase with the preterminal disposition phase, the length of the terminal disposition phase captured in the concentration-time profile, the number of data points present in the terminal disposition phase, and the level of variability in concentration measurement. The adjusted r² algorithm was unable to identify t_z accurately when there were more than three data points present in a profile's terminal disposition phase. The terminal disposition phase rate constant (λ_z) calculated based on the value of t_z selected by the algorithm had a positive bias in all simulation data conditions. Tolerable levels of bias (median bias less than 5%) were achieved under conditions of low measurement variability. When measurement variability was high, tolerable levels of bias were attained only when the terminal phase time span was four multiples of t_1/2 or longer. A comparison of the performance of the adjusted r² algorithm, a simple r² algorithm, and t_z selection by visual inspection was conducted using a subset of the simulation data. In the comparison, the simple r² algorithm performed as well as the adjusted r² algorithm, and the visual inspection method outperformed both algorithms. Recommendations concerning the use of the various t_z selection methods are presented.
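The flavour of such an automated rule can be sketched as follows; this is a hypothetical implementation (exact search, tie-breaking and tolerance rules differ across PK software packages, and the example concentration profile is made up):

```python
import numpy as np

def select_tz(t, c, min_pts=3):
    """Pick the terminal-phase start by maximizing the adjusted r^2 of a
    log-linear fit over the last k points. Real packages add tie-breaking
    rules (e.g. prefer more points within a tolerance); omitted here."""
    t, logc = np.asarray(t, float), np.log(np.asarray(c, float))
    best = None
    for k in range(min_pts, len(t) + 1):
        ts, ys = t[-k:], logc[-k:]
        slope, intercept = np.polyfit(ts, ys, 1)
        ss_res = ((ys - (slope * ts + intercept)) ** 2).sum()
        ss_tot = ((ys - ys.mean()) ** 2).sum()
        r2 = 1 - ss_res / ss_tot
        adj_r2 = 1 - (1 - r2) * (k - 1) / (k - 2)   # one predictor, k points
        if best is None or adj_r2 > best[0]:
            best = (adj_r2, t[-k], -slope)          # lambda_z = minus the log-linear slope
    return best                                     # (adjusted r2, selected t_z, lambda_z)

t = [0.5, 1, 2, 4, 8, 12, 24]                       # hours (illustrative profile)
c = [12.0, 9.5, 6.1, 3.0, 1.4, 0.75, 0.11]          # concentrations
print(select_tz(t, c))
```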

9.
When process data follow a particular curve in quality control, profile monitoring is suitable and appropriate for assessing process stability. Previous research in profile monitoring focusing on nonlinear parametric (P) modelling, involving both fixed and random effects, was conducted under the assumption of an accurate nonlinear model specification. Lately, nonparametric (NP) methods have been used in the profile-monitoring context in the absence of an obvious linear P model. This study introduces a novel technique in profile monitoring for any nonlinear and auto-correlated data. Referred to as the nonlinear mixed robust profile monitoring (NMRPM) method, it proposes a semiparametric (SP) approach that combines nonlinear P and NP profile fits for scenarios in which a nonlinear P model is adequate over part of the data but inadequate over the rest. These three methods (P, NP, and NMRPM) account for the auto-correlation within profiles and treat the collection of profiles as a random sample from a common population. During Phase I analysis, a version of Hotelling's T² statistic is proposed for each approach to identify abnormal profiles based on the estimated random effects and obtain the corresponding control limits. The performance of the NMRPM method is then evaluated using a real data set. Results reveal that the NMRPM method is robust to model misspecification and performs adequately against a correctly specified nonlinear P model. Control charts with the NMRPM method have an excellent capability of detecting changes in Phase I data, with control limits that are easily computable.
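Generically, the Phase I screening step computes a Hotelling-type T² for each profile's estimated random effects and flags exceedances. The sketch below illustrates only that step; the random-effect matrix, the shift and the empirical control limit are simulated placeholders, not the paper's P, NP or NMRPM estimators.

```python
import numpy as np

rng = np.random.default_rng(7)
B = rng.normal(size=(30, 3))       # 30 profiles, 3 estimated random effects each
B[4] += 3.0                        # one deliberately shifted (abnormal) profile

d = B - B.mean(axis=0)
S_inv = np.linalg.inv(np.cov(B, rowvar=False))
T2 = np.einsum('ij,jk,ik->i', d, S_inv, d)   # quadratic form per profile

limit = np.percentile(T2, 95)                # placeholder; a real chart derives
print("flagged profiles:", np.where(T2 > limit)[0])  # limits from a reference distribution
```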

10.
Little work has been published on the analysis of censored data for the Birnbaum–Saunders distribution (BISA). In this article, we implement the EM algorithm to fit a regression model with censored data when the failure times follow the BISA. Three approaches to implement the E-Step of the EM algorithm are considered. In two of these implementations, the M-Step is attained by an iterative least-squares procedure. The algorithm is exemplified with a single explanatory variable in the model.

11.
In the framework of null hypothesis significance testing for functional data, we propose a procedure able to select the intervals of the domain imputable for the rejection of the null hypothesis. An unadjusted p-value function and an adjusted one are the output of the procedure, named interval-wise testing. Depending on the sort and level α of type-I error control, significant intervals can be selected by thresholding the two p-value functions at level α. We prove that the unadjusted (adjusted) p-value function controls the probability of type-I error point-wise (interval-wise) and is point-wise (interval-wise) consistent. To highlight the gain in terms of interpretation of the phenomenon under study, we apply interval-wise testing to the analysis of a benchmark functional data set, the Canadian daily temperatures. The new procedure provides insights that current state-of-the-art procedures do not, supporting similar advantages in the analysis of functional data with less prior knowledge.

12.
In survey sampling and in stereology, it is often desirable to estimate the ratio of means θ = E(Y)/E(X) from bivariate count data (X, Y) with unknown joint distribution. We review methods that are available for this problem, with particular reference to stereological applications. We also develop new methods based on explicit statistical models for the data, and associated model diagnostics. The methods are tested on a stereological dataset. For point-count data, binomial regression and bivariate binomial models are generally adequate. Intercept-count data are often overdispersed relative to Poisson regression models, but adequately fitted by negative binomial regression.

13.
A double-censoring scheme occurs when the lifetimes T being measured, from a well-known time origin, are exactly observed within a window [L, R] of observational time and are otherwise censored either from above (right-censored observations) or from below (left-censored observations). The sample data consist of the pairs (U, δ), where U = min{R, max{T, L}} and δ indicates whether T is exactly observed (δ = 0), right-censored (δ = 1) or left-censored (δ = −1). We are interested in the estimation of the marginal behaviour of the three random variables T, L and R based on the observed pairs (U, δ). We propose new nonparametric simultaneous marginal estimators Ŝ_T, Ŝ_L and Ŝ_R for the survival functions of T, L and R, respectively, by means of an inverse-probability-of-censoring approach. The proposed estimators Ŝ_T, Ŝ_L and Ŝ_R are not computationally intensive, generalize the empirical survival estimator and reduce to the Kaplan–Meier estimator in the absence of left-censored data. Furthermore, Ŝ_T is equivalent to a self-consistent estimator, is uniformly strongly consistent and asymptotically normal. The method is illustrated with data from a cohort of drug users recruited in a detoxification program in Badalona (Spain). For these data we estimate the survival function for the elapsed time from starting IV drugs to AIDS diagnosis, as well as the potential follow-up time. A simulation study is discussed to assess the performance of the three survival estimators for moderate sample sizes and different censoring levels.
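The observation scheme is easy to misread, so here is a minimal simulation of the pairs (U, δ); the distributions chosen for T, L and R are illustrative assumptions only.

```python
import numpy as np

# Lifetime T is seen exactly only inside the observational window [L, R].
rng = np.random.default_rng(2)
n = 8
T = rng.exponential(2.0, n)            # lifetimes
L = rng.uniform(0.0, 1.0, n)           # left window ends
R = L + rng.uniform(1.0, 3.0, n)       # right window ends, so L <= R always

U = np.minimum(R, np.maximum(T, L))    # U = min{R, max{T, L}}
delta = np.where(T > R, 1, np.where(T < L, -1, 0))  # 1 right-, -1 left-censored, 0 exact
print(np.column_stack([np.round(U, 2), delta]))
```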

14.
Making predictions of future realized values of random variables based on currently available data is a frequent task in statistical applications. In some applications, the interest is in obtaining a two-sided simultaneous prediction interval (SPI) that contains at least k out of m future observations with a certain confidence level, based on n previous observations from the same distribution. A closely related problem is to obtain a one-sided upper (or lower) simultaneous prediction bound (SPB) to exceed (or be exceeded by) at least k out of m future observations. In this paper, we provide a general approach for computing SPIs and SPBs based on complete or right-censored data from a particular member of the (log-)location-scale family of distributions. The proposed simulation-based procedure provides exact coverage probability for complete and Type II censored data. For Type I censored data, our simulation results show that the procedure gives satisfactory results in small samples. We use three applications to illustrate the proposed simultaneous prediction intervals and bounds.
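For a concrete instance of the simulation idea, the sketch below calibrates a two-sided normal-sample SPI of the form x̄ ± r·s to contain at least k of m future observations with 95% confidence. The normal case, complete data and the bisection tolerance are assumptions made for the sketch; the paper treats the general (log-)location-scale family with censoring.

```python
import numpy as np

rng = np.random.default_rng(5)
n, m, k, B = 20, 10, 8, 20_000

# Pivotal simulation: for location-scale families, the coverage of
# xbar +/- r*s is parameter-free, so N(0, 1) draws suffice.
past = rng.normal(size=(B, n))
fut = rng.normal(size=(B, m))
xbar, s = past.mean(axis=1), past.std(axis=1, ddof=1)

def coverage(r):
    lo, hi = xbar - r * s, xbar + r * s
    inside = (fut >= lo[:, None]) & (fut <= hi[:, None])
    return (inside.sum(axis=1) >= k).mean()

r_lo, r_hi = 0.5, 6.0                  # bisection: coverage(r) increases in r
for _ in range(30):
    mid = (r_lo + r_hi) / 2
    r_lo, r_hi = (mid, r_hi) if coverage(mid) < 0.95 else (r_lo, mid)
print("calibrated factor r:", round((r_lo + r_hi) / 2, 3))
```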

15.
The number of sterile couples in a retrospective study of the number of cycles to conception is necessarily zero; this is not so for a prospective study. The paper puts forward a modification of Weinberg and Gladen's beta-geometric model for cycles to conception that is suitable for both types of investigation. The probability that a couple achieves conception at the xth cycle, but not earlier, is assumed to take the form R_x = (1 − ρ)/(1 − m^(x−1)ρ/u), instead of μ/(1 − θ + θx). The set of parameter restraints (0 < m < 1, 0 < ρ < 1, 1 < u) is appropriate for retrospective data, whilst the alternative set of restraints (1 < m, 1 < ρ, 0 < u < 1) is appropriate for prospective data. The decrease in R_x over time can be interpreted not only as a time effect, but also as a heterogeneity effect, by replacing Weinberg and Gladen's beta mixture of geometric distributions by a q-beta mixture.
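Under the retrospective restraints, the implied cycle-to-conception distribution follows directly from the per-cycle hazard R_x; the parameter values in the sketch below are purely illustrative.

```python
import numpy as np

# Per-cycle conception hazard R(x); retrospective-type restraints
# (0 < m < 1, 0 < rho < 1, u > 1), so R decreases toward 1 - rho.
def R(x, m=0.6, rho=0.4, u=2.0):
    return (1 - rho) / (1 - m ** (x - 1) * rho / u)

x = np.arange(1, 13)
hazard = R(x)
surv = np.concatenate([[1.0], np.cumprod(1 - hazard)[:-1]])  # P(no conception before cycle x)
pmf = hazard * surv                                          # P(conception exactly at cycle x)
print(np.round(pmf, 4))
```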

16.
In the study of the stochastic behaviour of the lifetime of an element as a function of its length, it is often observed that the failure time (or lifetime) decreases as the length increases. In probabilistic terms, this idea can be expressed as follows. Let T be the lifetime of a specimen of length x; then the survival function, which denotes the probability that an element of length x survives past time t, is given by S_T(t, x) = P(T > t/α(x)), where α(x) is a monotonically decreasing function. In particular, it is often assumed that T has a Weibull distribution. In this paper, we propose a generalization of this Weibull model by assuming that the distribution of T is generalized gamma (GG). Since the GG model contains the Weibull, gamma and lognormal models as special and limiting cases, a GG regression model is an appropriate tool for describing the size effect on the lifetime and for selecting among the embedded models. Maximum likelihood estimates are obtained for the GG regression model with α(x) = cx^b. As a special case, this provides an alternative to the usual approach to estimation for the GG distribution, which involves reparametrization. Related parametric inference issues are addressed and illustrated using two experimental data sets. Some discussion of censored data is also provided.
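A sketch of maximum likelihood for a generalized-gamma size-effect model is given below, using scipy's gengamma; its two shape parameters stand in for the paper's GG parametrization, and the data, starting values and Nelder–Mead choice are assumptions.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import gengamma

rng = np.random.default_rng(3)
x = rng.uniform(1.0, 5.0, 300)                 # specimen lengths
# Simulated lifetimes with scale alpha(x) = c * x**b (here c = 3, b = -0.5),
# so longer specimens tend to fail sooner, as in the size-effect setting.
T = gengamma.rvs(a=2.0, c=1.5, scale=3.0 * x ** -0.5, random_state=rng)

def neg_loglik(theta):
    a, p, c, b = theta                         # shapes (a, p), scale coef c, size exponent b
    if min(a, p, c) <= 0:
        return np.inf
    return -gengamma.logpdf(T, a, p, scale=c * x ** b).sum()

res = minimize(neg_loglik, x0=[1.0, 1.0, 1.0, 0.0], method="Nelder-Mead",
               options={"maxiter": 5000})
print("MLE of (a, p, c, b):", np.round(res.x, 3))   # b < 0: lifetime falls with length
```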

17.
RATES OF CONVERGENCE IN SEMI-PARAMETRIC MODELLING OF LONGITUDINAL DATA
We consider the problem of semi-parametric regression modelling when the data consist of a collection of short time series for which measurements within series are correlated. The objective is to estimate a regression function of the form E[Y(t) | x] = x'β + μ(t), where μ(·) is an arbitrary, smooth function of time t, and x is a vector of explanatory variables which may or may not vary with t. For the non-parametric part of the estimation we use a kernel estimator with fixed bandwidth h. When h is chosen without reference to the data, we give exact expressions for the bias and variance of the estimators of β and μ(t), and an asymptotic analysis of the case in which the number of series tends to infinity whilst the number of measurements per series is held fixed. We also report the results of a small-scale simulation study to indicate the extent to which the theoretical results continue to hold when h is chosen by a data-based cross-validation method.
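The partial-residual device behind such semi-parametric fits can be sketched as follows: kernel-smooth both X and Y on t with a fixed bandwidth h, regress the residuals to get β, then smooth what is left to get μ. This is a Speckman-type sketch under independent errors with a toy μ(t); the paper's setting additionally handles correlation within series.

```python
import numpy as np

rng = np.random.default_rng(4)
n, h = 200, 0.15
t = rng.uniform(0.0, 1.0, n)
X = rng.normal(size=(n, 2))
beta = np.array([1.0, -2.0])
y = X @ beta + np.sin(2 * np.pi * t) + rng.normal(0.0, 0.3, n)

def nw_smooth(t, z, h):
    """Nadaraya-Watson smoother in t with a Gaussian kernel; z is (n,) or (n, p)."""
    w = np.exp(-0.5 * ((t[:, None] - t[None, :]) / h) ** 2)
    w /= w.sum(axis=1, keepdims=True)
    return w @ z

X_res = X - nw_smooth(t, X, h)                  # remove the t-dependence of X
y_res = y - nw_smooth(t, y, h)                  # and of Y
beta_hat = np.linalg.lstsq(X_res, y_res, rcond=None)[0]
mu_hat = nw_smooth(t, y - X @ beta_hat, h)      # estimate of mu at the observed t
print("beta_hat:", np.round(beta_hat, 3))
```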

18.
In healthcare studies, count data sets measured with covariates often exhibit heterogeneity and contain extreme values. To analyse such count data sets, we use a finite mixture of regression models framework and investigate a robust estimation approach, called the L2E [D.W. Scott, On fitting and adapting of density estimates, Comput. Sci. Stat. 30 (1998), pp. 124–133], to estimate the parameters. The L2E is based on an integrated L2 distance between the parametric conditional and the true conditional mass functions. In addition to studying the theoretical properties of the L2E estimator, we compare the performance of the L2E with the maximum likelihood (ML) estimator and a minimum Hellinger distance (MHD) estimator via Monte Carlo simulations for correctly specified and gross-error-contaminated mixtures of Poisson regression models. These show that the L2E is a viable robust alternative to the ML and MHD estimators. More importantly, we use the L2E to perform a comprehensive analysis of Western Australia hospital inpatient obstetrical length-of-stay (LOS, in days) data that contain extreme values. It is shown that the L2E provides a two-component Poisson mixture regression fit to the LOS data that is better than those based on the ML and MHD estimators. The L2E fit identifies admission type as a significant covariate that profiles the predominant subpopulation of normal-stayers as planned patients and the small subpopulation of long-stayers as emergency patients.
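The L2E criterion is easiest to state in the simplest, covariate-free Poisson case, sketched below; the paper's actual setting is a mixture of Poisson regressions, and the contamination pattern here is illustrative.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import poisson

# Empirical L2 criterion for a discrete model f_lam:
#   sum_y f_lam(y)^2  -  (2/n) * sum_i f_lam(x_i),
# the data-based version of the integrated L2 distance. An extreme x_i
# enters only through one bounded term, which is the source of robustness.
rng = np.random.default_rng(6)
x = np.concatenate([rng.poisson(3.0, 95), np.array([40, 45, 50, 55, 60])])  # 5% gross errors

def l2e(lam):
    grid = np.arange(0, 200)                   # support large enough to cover the mass
    f = poisson.pmf(grid, lam)
    return (f ** 2).sum() - 2 * poisson.pmf(x, lam).mean()

res = minimize_scalar(l2e, bounds=(0.1, 20.0), method="bounded")
print("L2E estimate:", round(res.x, 3), "vs ML (sample mean):", round(x.mean(), 3))
```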

19.
In this work, we develop a method of adaptive non-parametric estimation based on 'warped' kernels. The aim is to estimate a real-valued function s from a sample of random couples (X, Y). We deal with transformed data (Φ(X), Y), with Φ a one-to-one function, to build a collection of kernel estimators. The data-driven bandwidth selection is performed with a method inspired by Goldenshluger and Lepski (Ann. Statist., 39, 2011, 1608). The method makes it possible to handle various problems such as additive and multiplicative regression, conditional density estimation, hazard rate estimation based on randomly right-censored data, and cumulative distribution function estimation from current-status data. The interest is threefold. First, the squared-bias/variance trade-off is automatically realized. Next, non-asymptotic risk bounds are derived. Lastly, the estimator is easily computed, thanks to its simple expression; a short simulation study is presented.

20.
A lifetime capability index, L_tp, has been proposed to measure business lifetime performance, wherein the output lifetime measurements are assumed to be precise and to come from a Pareto model with censored information. In the present study, we consider a more realistic situation in which the lifetime output data are imprecise. The approach developed by Buckley [Fuzzy system, Soft Comput. 9 (2005), pp. 757–760; Fuzzy statistics: Regression and prediction, Soft Comput. 9 (2005), pp. 769–775], incorporating some extensions (a set of confidence intervals, one on top of the other), is used to construct the triangular-shaped fuzzy number for the fuzzy estimate of L_tp. With the sampling distribution of the unbiased estimator of L_tp, two useful fuzzy inference criteria, the critical value and the fuzzy p-value, are obtained to assess the lifetime performance. The presented methodology can handle lifetime performance assessment when the sample lifetime data involve imprecise information, classifying the lifetime performance with the three-decision rule. With different preset requirements and a certain degree of imprecise data, we also develop a four-quadrant decision-making plot in which managers can simultaneously visualize several important features of lifetime performance when making a decision. An example of business lifetime data is given to illustrate the applicability of the proposed method.
