Similar literature
Found 20 similar documents (search time: 15 ms)
1.
Inference for the state occupation probabilities, given a set of baseline covariates, is an important problem in survival analysis and time-to-event multistate data. We introduce an inverse-censoring-probability re-weighted semi-parametric single-index model to estimate the conditional state occupation probabilities of a given individual in a multistate model under right-censoring. Besides obtaining a temporal regression function, we also test for a potential time-varying effect of a baseline covariate on future state occupation. We show that the proposed technique has desirable finite-sample performance and is competitive with three other existing approaches. We illustrate the proposed methodology using two different data sets. First, we re-examine a well-known data set on leukemia patients undergoing bone marrow transplant with various state transitions. Our second illustration is based on data from a study of the functional status of spinal cord injured patients undergoing a rehabilitation program.
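To make the re-weighting ingredient concrete, here is a minimal numpy sketch of inverse-probability-of-censoring weights built from a Kaplan-Meier estimate of the censoring distribution. The function name and the crude tie handling are our own; this is an illustration of the general IPCW idea, not the paper's estimator.

```python
import numpy as np

def ipcw_weights(time, event):
    """Inverse-probability-of-censoring weights: an uncensored subject
    gets weight 1/G(T), a censored one gets 0, where G is the
    Kaplan-Meier estimate of the censoring survival function.
    (Crude handling of ties; illustration only.)"""
    time, event = np.asarray(time, float), np.asarray(event, int)
    order = np.argsort(time, kind="stable")
    e = event[order]
    at_risk = len(time) - np.arange(len(time))
    G = np.cumprod(1.0 - (1 - e) / at_risk)   # G drops only at censoring times
    w = np.where(e == 1, 1.0 / np.maximum(G, 1e-12), 0.0)
    out = np.empty_like(w)
    out[order] = w                            # back to the original ordering
    return out

# toy usage: times with event indicator 1 = observed, 0 = censored
print(ipcw_weights([2.0, 3.5, 1.0, 4.2, 2.8], [1, 0, 1, 1, 0]))
```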

2.
The GARCH and stochastic volatility (SV) models are two competing, widely used models for the volatility of financial series. In this paper, we consider a closed-form estimator for a stochastic volatility model and derive its asymptotic properties. We confirm our theoretical results by a simulation study. In addition, we propose a set of simple, strongly consistent decision rules to compare the ability of the GARCH and SV models to fit the characteristic features observed in high-frequency financial data, such as high kurtosis and a slowly decaying autocorrelation function of the squared observations. These rules are based on a number of moment conditions that is allowed to increase with sample size. We show that our selection procedure chooses the best-fitting model, or the simpler model under equivalence, with probability one as the sample size increases. The finite-sample behavior of our procedure is analyzed via simulations. Finally, we provide an application to the stocks in the Dow Jones Industrial Average index.
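To make the moment conditions concrete, here is a small numpy sketch (not the paper's estimator or decision rule) computing two of the features the selection rules target: sample kurtosis and the autocorrelation function of the squared observations.

```python
import numpy as np

def moment_diagnostics(x, max_lag=20):
    """Sample kurtosis and ACF of the squared series -- the kind of
    moment conditions GARCH/SV comparisons are built on."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    kurtosis = np.mean(x**4) / np.mean(x**2)**2   # exceeds 3 for heavy tails
    x2 = x**2 - np.mean(x**2)
    denom = np.dot(x2, x2)
    acf = np.array([np.dot(x2[:-k], x2[k:]) / denom
                    for k in range(1, max_lag + 1)])
    return kurtosis, acf

# toy usage with white noise (real data: e.g. daily stock returns)
rng = np.random.default_rng(0)
kurt, acf = moment_diagnostics(rng.standard_normal(2000))
print(f"kurtosis: {kurt:.2f}; ACF(1) of squares: {acf[0]:.3f}")
```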

3.
In this work we propose an autoregressive model with time-varying parameters for irregularly spaced non-stationary time series. We expand all the functional parameters in a wavelet basis and estimate the coefficients by least squares after truncation at a suitable resolution level. We also present simulations to evaluate both the estimation method and the behavior of the model in finite samples. Applications to irregularly observed silicate and nitrite data are provided as well.
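A rough sketch of the estimation idea under strong simplifications -- a Haar basis, an AR(1) structure, and no treatment of the irregular spacing beyond rescaling the observation times; the paper's procedure is more general than this:

```python
import numpy as np

def haar_design(t, J):
    """Evaluate a truncated Haar basis (father wavelet plus mother
    wavelets up to level J) at rescaled times t in [0, 1]."""
    cols = [np.ones_like(t)]
    for j in range(J + 1):
        for k in range(2**j):
            u = 2**j * t - k
            psi = np.where((u >= 0) & (u < 0.5), 1.0,
                  np.where((u >= 0.5) & (u < 1.0), -1.0, 0.0))
            cols.append(2**(j / 2) * psi)
    return np.column_stack(cols)

def fit_tvar1(x, times, J=2):
    """Least-squares fit of x_t = a(t) x_{t-1} + e_t with the
    coefficient function a(t) expanded in the truncated Haar basis."""
    x, times = np.asarray(x, float), np.asarray(times, float)
    t = (times - times.min()) / (times.max() - times.min())
    B = haar_design(t[1:], J)        # basis evaluated at response times
    X = B * x[:-1, None]             # regressors: basis times lagged value
    coef, *_ = np.linalg.lstsq(X, x[1:], rcond=None)
    return coef, B @ coef            # basis coefficients and fitted a(t)

# toy usage: irregular times, slowly varying AR coefficient
rng = np.random.default_rng(3)
times = np.sort(rng.uniform(0, 100, 400))
x = np.zeros(400)
for i in range(1, 400):
    x[i] = 0.5 * np.cos(0.02 * times[i]) * x[i - 1] + rng.standard_normal()
coef, a_hat = fit_tvar1(x, times)
```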

4.
Time series with more than one time-dependent variable require an appropriate model in which the variables not only have relationships with each other but also depend on their own previous values. Based on developments in sufficient dimension reduction, we investigate a new class of multiple time series models without parametric assumptions. First, for the dependent and independent time series, we use a univariate time series central subspace to estimate the autoregressive lags of the series. Second, we extract successive directions to estimate the time series central subspace for the regressors, which include past lags of the dependent and independent series, in a mutual-information multiple-index time series. Last, we estimate a multiple time series model for the reduced directions. We propose a unified method for estimating the minimal dimension using an Akaike information criterion, for situations in which the dimension of the multiple regressors is unknown. We present an analysis using real data from the housing price index showing that our approach is an alternative for multiple time series modeling. In addition, we check the accuracy of the multiple time series central subspace method using three simulated data sets.

5.
Environmental variables have an important effect on the reliability of many products such as coatings and polymeric composites. Long-term prediction of the performance or service life of such products must take into account the probabilistic/stochastic nature of outdoor weather. In this article, we propose a time series modeling procedure for daily accumulated degradation data. Daily accumulated degradation is the total amount of degradation accrued within one day and can be obtained from a degradation rate model for the product together with the weather data. The fitted time series model can then be used to estimate the future distribution of cumulative degradation over a period of time and to compute reliability measures such as the probability of failure. The modeling technique and estimation method are illustrated using the degradation of a solar reflector material. We also provide a method for constructing approximate confidence intervals for the probability of failure.
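A minimal Monte Carlo sketch of the final step: once a time series model for daily accumulated degradation is fitted, the probability of failure is the chance that cumulative degradation crosses a threshold within the horizon. The lognormal stand-in for the fitted model below is our assumption, not the paper's model.

```python
import numpy as np

def failure_probability(simulate_daily_degradation, threshold,
                        horizon_days, n_paths=2000, rng=None):
    """Monte Carlo estimate of P(cumulative degradation crosses the
    failure threshold within the horizon), given a simulator for the
    fitted daily-degradation time series model."""
    rng = np.random.default_rng() if rng is None else rng
    failures = 0
    for _ in range(n_paths):
        daily = simulate_daily_degradation(horizon_days, rng)
        if np.cumsum(daily).max() >= threshold:
            failures += 1
    return failures / n_paths

# toy stand-in for the fitted model: i.i.d. lognormal daily degradation
# (a real model would be a seasonal series driven by weather covariates)
p_fail = failure_probability(
    lambda n, rng: rng.lognormal(mean=-6.0, sigma=0.5, size=n),
    threshold=1.0, horizon_days=365)
print(f"estimated P(failure within one year): {p_fail:.3f}")
```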

6.
In this article we discuss methodology for analyzing nonstationary time series whose periodic nature changes approximately linearly with time. We use the M-stationary process to describe such data sets, and in particular the discrete Euler(p) model to obtain forecasts and estimate spectral characteristics. We discuss the use of the M-spectrum for displaying linear time-varying periodic content in a time series realization, in much the same way that the spectrum shows periodic content within a realization of a stationary series. We also introduce the instantaneous frequency and spectrum of an M-stationary process for describing how frequency changes with time. To illustrate our techniques we use one simulated data set and two bat echolocation signals that show time-varying frequency behavior. Our results indicate that for data whose periodic content changes approximately linearly in time, the Euler model serves very well for spectral analysis, filtering, and forecasting. Additionally, the instantaneous spectrum is shown to provide a better representation of the time-varying frequency content in the data than window-based techniques such as the Gabor and wavelet transforms. Finally, we note that the results of this article can be extended to processes whose frequencies change like at^α, with a > 0 and −∞ < α < ∞.

7.
We study the problem of classification with multiple q-variate observations, with and without a time effect on each individual. We develop new classification rules for populations with certain structured and unstructured mean vectors under certain covariance structures. The new classification rules are effective when the number of observations is not large enough to estimate the variance–covariance matrix. Computational schemes for maximum likelihood estimation of the required population parameters are given. We apply our findings to two real data sets as well as to a simulated data set.

8.
Several procedures of sequential pattern analysis are designed to detect frequently occurring patterns in a single categorical time series (episode mining). Based on these frequent patterns, rules are generated and evaluated, for example, in terms of their confidence. The confidence value is commonly interpreted as an estimate of a conditional probability, so some kind of stochastic model has to be assumed. We identify this model as a variable length Markov model; under this assumption, the usual confidences are maximum likelihood estimates of the transition probabilities of the Markov model. We discuss how to efficiently fit an appropriate model to the data, and rules are formulated based on this model. It is demonstrated that this new approach generates noticeably fewer, and more reliable, rules.
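To illustrate the link between rule confidences and transition probabilities, here is a sketch for a fixed-order Markov chain; the paper uses variable-length contexts, which this toy does not implement.

```python
from collections import Counter

def transition_mles(series, order=1):
    """Maximum likelihood estimates of the transition probabilities of a
    fixed-order Markov chain over a categorical series.  The confidence
    of a rule 'context -> symbol' is exactly such an estimate."""
    context_counts, pair_counts = Counter(), Counter()
    for i in range(order, len(series)):
        ctx, nxt = tuple(series[i - order:i]), series[i]
        context_counts[ctx] += 1
        pair_counts[(ctx, nxt)] += 1
    return {pair: n / context_counts[pair[0]]
            for pair, n in pair_counts.items()}

# toy usage: estimated confidence of the rule a -> b
probs = transition_mles("abaabbabab")
print(probs[(('a',), 'b')])   # 0.8 on this toy series
```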

9.
Absolute risk is the probability that a cause-specific event occurs in a given time interval in the presence of competing events. We present methods, accommodating multiple exposure-specific competing risks, to estimate population-based absolute risk from a complex survey cohort. The hazard function for each event type consists of an individualized relative risk multiplied by a baseline hazard function, which is modeled nonparametrically or parametrically with a piecewise exponential model. An influence method is used to derive a Taylor-linearized variance estimate for the absolute risk estimates. We introduce novel measures of the cause-specific influences that can guide modeling choices for the competing event components of the model. To illustrate our methodology, we build and validate cause-specific absolute risk models for cardiovascular and cancer deaths using data from the National Health and Nutrition Examination Survey. Our applications demonstrate the usefulness of survey-based risk prediction models for predicting health outcomes and quantifying the potential impact of disease prevention programs at the population level.
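A bare-bones sketch of the absolute risk computation with one competing event, using per-interval cause-specific hazards in a piecewise-constant (discrete-time) approximation; survey weighting and the influence-based variance are omitted here.

```python
import numpy as np

def absolute_risk(h_event, h_competing):
    """Cumulative incidence of the event of interest in the presence of
    one competing event, from per-interval cause-specific hazards
    (piecewise-constant, discrete-time approximation)."""
    h_event, h_competing = np.asarray(h_event), np.asarray(h_competing)
    # probability of surviving both event types to the start of each interval
    surv = np.cumprod(np.r_[1.0, (1 - h_event - h_competing)[:-1]])
    return np.sum(surv * h_event)

# toy usage: constant yearly hazards over a 10-year horizon
print(absolute_risk([0.01] * 10, [0.02] * 10))   # about 0.087
```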

10.
In functional data analysis, curves or surfaces are observed, up to measurement error, at a finite set of locations for, say, a sample of n individuals. Often the curves are homogeneous, except perhaps for individual-specific regions that provide heterogeneous behaviour (e.g. 'damaged' areas of irregular shape on an otherwise smooth surface). Motivated by applications with functional data of this nature, we propose a Bayesian mixture model, with the aim of dimension reduction, by representing the sample of n curves through a smaller set of canonical curves. We propose a novel prior on the space of probability measures for a random curve which extends the popular Dirichlet priors by allowing local clustering: non-homogeneous portions of a curve can be allocated to different clusters and the n individual curves can be represented as recombinations (hybrids) of a few canonical curves. More precisely, the proposed prior envisions a conceptual hidden factor with k levels that acts locally on each curve. We discuss several models incorporating this prior and illustrate its performance with simulated and real data sets. We examine theoretical properties of the proposed finite hybrid Dirichlet mixtures, specifically their behaviour as the number of mixture components goes to ∞ and their connection with Dirichlet process mixtures.

11.
We introduce and study general mathematical properties of a new generator of continuous distributions with three extra parameters, called the new generalized odd log-logistic family of distributions. The proposed family contains several important classes discussed in the literature as submodels, such as the proportional reversed hazard rate and odd log-logistic classes. Its density function can be expressed as a mixture of exponentiated densities based on the same baseline distribution. Some of its mathematical properties, including ordinary moments, quantile and generating functions, entropy measures, and order statistics, which hold for any baseline model, are presented. We also present a characterization of the proposed distribution and derive a power series for the quantile function. We discuss the method of maximum likelihood for estimating the model parameters and study the behavior of the maximum likelihood estimator via simulation. The importance of the new family is illustrated by means of two real data sets. These applications indicate that the new family can provide better fits than other well-known classes of distributions. The beauty and importance of the new family lie in its ability to model real data.

12.
An affiliation network is a two-mode social network with two different sets of nodes (namely, a set of actors and a set of social events) and edges representing the affiliation of the actors with the social events. In many affiliation networks, the connections between actors and social events are only binary, and thus cannot reveal the strength of affiliation. Although a number of statistical models have been proposed to analyze binary-weighted affiliation networks, the asymptotic behavior of the maximum likelihood estimator (MLE) is still unknown or has not been properly explored in weighted affiliation networks. In this paper, we study an affiliation model in which the degree sequence is the exclusive natural sufficient statistic in the exponential family of distributions. We derive the consistency and asymptotic normality of the maximum likelihood estimator in affiliation finite discrete weighted networks when the numbers of actors and events both go to infinity. Simulation studies and a real data example demonstrate our theoretical results.

13.
We develop an entropy-based test for randomness of binary time series of finite length. The test uses the frequencies of contiguous blocks of different lengths. A simple condition on the block lengths and the length of the time series enables one to estimate the entropy rate of the data, and this information is used to develop a statistic for testing the hypothesis of randomness. This statistic measures the deviation of the estimated entropy of the observed data from the theoretical maximum under the randomness hypothesis. The test offers a real alternative to the conventional runs test. Critical percentage points, based on simulations, are provided for testing the hypothesis of randomness. Power calculations using dependent data show that the proposed test has higher power than the runs test for short series and is similar to the runs test for long series. The test is applied to two published data sets that were investigated by others with respect to their randomness.
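A minimal sketch of the statistic's main ingredient: the entropy of block frequencies and its deviation from the maximum attained under randomness. Critical values would come from simulation, as in the paper; the overlapping-block choice here is our own.

```python
import numpy as np
from collections import Counter

def block_entropy_statistic(bits, block_len):
    """Estimated entropy (in bits) of overlapping blocks of a binary
    series, and its deviation from the maximum value block_len attained
    under randomness -- the quantity the test statistic is built on."""
    n = len(bits) - block_len + 1
    counts = Counter(tuple(bits[i:i + block_len]) for i in range(n))
    p = np.array(list(counts.values())) / n
    entropy = -np.sum(p * np.log2(p))
    return entropy, block_len - entropy   # large deviation: evidence against randomness

# toy usage; critical values would come from simulation, as in the paper
rng = np.random.default_rng(1)
print(block_entropy_statistic(rng.integers(0, 2, 500).tolist(), block_len=4))
```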

14.
In this paper we consider the statistical analysis of multivariate multiple nonlinear regression models with correlated errors, using finite Fourier transforms. Consistency and asymptotic normality of the weighted least squares estimates are established under various conditions on the regressor variables. These conditions involve different types of scalings, and the scaling factors are obtained explicitly for various types of nonlinear regression models, including an interesting model that requires the estimation of unknown frequencies. The estimation of frequencies is a classical problem occurring in many areas such as signal processing, environmental time series, astronomy, and other physical sciences. We illustrate our methodology using two real data sets taken from geophysics and environmental sciences. The geophysical data concern polar motion (now widely known as the "Chandler wobble"), where one has to estimate the drift parameters, the offset parameters, and the two periodicities associated with elliptical motion. The data were first analyzed by Arato, Kolmogorov and Sinai, who treated them as a bivariate time series satisfying a finite-order time series model and estimated the periodicities from the coefficients of the fitted models. Our analysis shows that the two dominant frequencies are 12 h and 410 days. The second example concerns the minimum/maximum monthly temperatures observed at the Antarctic Peninsula (Faraday/Vernadsky station). It is now widely believed that this region has warmed steadily over the past 50 years; if true, the warming has serious consequences for ecology, marine life, etc., as it can result in the melting of ice shelves and glaciers. Our objective here is to estimate any existing temperature trend in the data, and we use the nonlinear regression methodology developed here to achieve that goal.
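Frequency estimation of this kind typically starts from periodogram peaks, which weighted least squares then refines. A small numpy sketch of that starting step (our illustration, not the paper's estimator):

```python
import numpy as np

def dominant_frequencies(x, dt=1.0, k=2):
    """Locate the k largest periodogram peaks -- a classical starting
    point for estimating unknown frequencies in sinusoidal regression."""
    x = np.asarray(x, float) - np.mean(x)
    freqs = np.fft.rfftfreq(len(x), dt)
    power = np.abs(np.fft.rfft(x))**2
    order = np.argsort(power[1:])[::-1] + 1   # skip the zero frequency
    return freqs[order[:k]]

# toy usage: two sinusoids plus noise
rng = np.random.default_rng(4)
t = np.arange(1000)
y = (np.sin(2 * np.pi * 0.05 * t) + 0.5 * np.sin(2 * np.pi * 0.13 * t)
     + 0.3 * rng.standard_normal(1000))
print(dominant_frequencies(y, k=2))   # near 0.05 and 0.13
```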

15.
Suppose that we have m repeated measures on each subject, and we model the observation vectors with a finite mixture model. We further assume that the repeated measures are conditionally independent. We present methods to estimate the shape of the component distributions along with various features of the component distributions such as the medians, means and variances. We make no distributional assumptions on the components; indeed, we allow different shapes for different components.

16.
Acute respiratory diseases are transmitted over networks of social contacts. Large-scale simulation models are used to predict epidemic dynamics and evaluate the impact of various interventions, but the contact behavior in these models is based on simplistic and strong assumptions that are not informed by survey data. These assumptions are also used for estimating transmission measures such as the basic reproductive number and secondary attack rates. Developing methodology to infer contact networks from survey data could improve these models and estimation methods. We contribute to this area by developing a model of within-household social contacts and using it to analyze the Belgian POLYMOD data set, which contains detailed diaries of social contacts in a 24-hour period. We model dependency in contact behavior through a latent variable indicating which household members are at home. We estimate age-specific probabilities of being at home and age-specific probabilities of contact conditional on two members being at home. Our results differ from the standard random-mixing assumption. In addition, we find that the probability that all members contact each other on a given day is fairly low: 0.49 for households with two 0-5 year olds and two 19-35 year olds, and 0.36 for households with two 12-18 year olds and two 36+ year olds. We find higher contact rates in households with 2-3 members, helping to explain the higher influenza secondary attack rates found in households of this size.
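Under the stated latent at-home structure, the probability that all household members contact each other factors into everyone being at home times the pairwise contact probabilities. The conditional independence of pairs assumed in this sketch, and all the numbers in the toy usage, are ours, not the paper's.

```python
from itertools import combinations

def p_all_contact(p_home, p_contact):
    """P(every pair contacts) under the latent at-home model: all pairs
    contacting requires everyone home; given that, pairwise contacts
    are treated as independent (a simplifying assumption)."""
    p = 1.0
    for ph in p_home:                      # everyone must be at home
        p *= ph
    for pair in combinations(range(len(p_home)), 2):
        p *= p_contact[pair]               # contact given both at home
    return p

# toy usage for a 3-person household with hypothetical probabilities
print(p_all_contact([0.9, 0.8, 0.7],
                    {(0, 1): 0.9, (0, 2): 0.85, (1, 2): 0.8}))
```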

17.
We study the nonparametric maximum likelihood estimator (NPMLE) of the cdf or sub-distribution functions of the failure time for the failure causes in a series system. The study is motivated by a cancer research data set (from the Memorial Sloan-Kettering Cancer Center) with interval-censored times and masked failure causes. The NPMLE based on this data set suggests that the existing masking models are not appropriate. We propose a new model, the random partition masking (RPM) model, which does not rely on the commonly used symmetry assumption (namely, that given the failure cause, the probability of observing the masked failure causes is independent of the failure time; see Flehinger et al., Inference about defects in the presence of masking, Technometrics 38 (1996), pp. 247–255). The RPM model is easier to implement in simulation studies than the existing models. We discuss algorithms for computing the NPMLE and study its asymptotic properties. Our simulations and data analysis indicate that the NPMLE is feasible for moderate sample sizes.

18.
An extension to the class of conventional numerical probability models for nondeterministic phenomena has been identified by Dempster and Shafer in the class of belief functions. We were originally stimulated by this work, but have since come to believe that the bewildering diversity of uncertainty and chance phenomena cannot be encompassed within either the conventional theory of probability, its relatively minor modifications (e.g., not requiring countable additivity), or the theory of belief functions. In consequence, we have been examining the properties of, and prospects for, the generalization of belief functions that is known as upper and lower, or interval-valued, probability. After commenting on what we deem to be problematic elements of common personalist/subjectivist/Bayesian positions that employ either finitely or countably additive probability to represent strength of belief and that are intended to be normative for rational behavior, we sketch some of the ways in which the set of lower envelopes, a subset of the set of lower probabilities that contains the belief functions, enables us to preserve the core of Bayesian reasoning while admitting a more realistic (e.g., in its reduced insistence upon an underlying precision in our beliefs) class of probability-like models. Particular advantages of lower envelopes are identified in the area of the aggregation of beliefs.

The focus of our own research is objective probabilistic reasoning about time series generated by physical or other empirical (e.g., societal) processes. As it is not the province of a general mathematical methodology such as probability theory to rule empirical phenomena out of existence a priori, we are concerned by the constraint imposed by conventional probability theory that an empirical process of bounded random variables believed to have a time-invariant generating mechanism must then exhibit long-run stable time averages. We have shown that lower probability models that allow for unstable time averages can only lie in the class of undominated lower probabilities, a subset of lower probability models disjoint from the lower envelopes and having the weakest relationship to conventional probability measures. Our research has been devoted to exploring and developing the theory of undominated lower probabilities so that it can be applied to model and understand nondeterministic phenomena, and we have also been interested in identifying actual physical processes (e.g., flicker noises) that exhibit behavior requiring such novel models.


19.
When identifying the best model for representing the distribution of rainfall based on sequences of dry (wet) days, focus is usually given to the fitted model with the fewest estimated parameters. If a model with fewer parameters is found inadequate for describing a particular data distribution, a model with more parameters is recommended. Building on several probability models developed by previous researchers in this field, we propose five types of mixed probability models as alternatives for describing the distribution of dry (wet) spells in daily rainfall events. The mixed probability models combine the log series distribution with three other models: the Poisson distribution (MLPD), the truncated Poisson distribution (MLTPD), and the geometric distribution (MLGD). In addition, the mixture of two log series distributions (MLSD) and the mixed geometric-truncated Poisson distribution (MGTPD) are introduced as alternative models. Daily rainfall data from 14 selected rainfall stations in Peninsular Malaysia for the period 1975 to 2004 were used in this study. The Akaike information criterion (AIC) was used to select the best probability model for the observed distribution of dry (wet) spells. The results revealed that the MLGD was the best probability model for representing the distribution of dry spells over the Peninsula.
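AIC-based selection of this kind is easy to sketch. Below, a plain geometric model stands in for the mixed log-series candidates (our simplification); each candidate's AIC would be computed the same way and the smallest value selected.

```python
import numpy as np
from scipy import stats

def aic_geometric(spells):
    """AIC of a geometric model for dry-spell lengths (support 1, 2, ...);
    the MLE of the success probability is 1 / mean."""
    spells = np.asarray(spells)
    p_hat = 1.0 / spells.mean()
    loglik = np.sum(stats.geom.logpmf(spells, p_hat))
    return 2 * 1 - 2 * loglik          # one estimated parameter

# toy usage: the candidate with the smallest AIC would be selected
print(aic_geometric([1, 2, 1, 3, 5, 2, 1, 1, 4, 2]))
```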

20.
We present a test for detecting 'multivariate structure' in data sets. The procedure consists of transforming the data to remove the correlations, discretizing the data, and then studying the cell counts in the resulting contingency table. A formal test can be performed using the usual chi-squared test statistic. We give the limiting distribution of the chi-squared statistic and also present simulation results examining the accuracy of this limiting distribution in finite samples. Several examples show that our procedure can detect a variety of types of structure. Our examples include data with clustering, digitized speech data, and residuals from a fitted time series model. The chi-squared statistic can also be used as a test for multivariate normality.
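A simplified sketch of the procedure: whiten to remove correlations, discretize each coordinate at its empirical quantiles, and compute a chi-squared statistic on the cell counts. The nominal chi-squared p-value below ignores the estimation effects that the paper's limiting distribution accounts for.

```python
import numpy as np
from collections import Counter
from scipy import stats

def structure_chi2(X, bins=3):
    """Whiten the data to remove correlations, discretize each whitened
    coordinate at its empirical quantiles, and compute a chi-squared
    statistic on the multivariate cell counts."""
    X = np.asarray(X, float)
    n, d = X.shape
    L = np.linalg.cholesky(np.cov(X, rowvar=False))
    Z = np.linalg.solve(L, (X - X.mean(axis=0)).T).T     # decorrelated data
    edges = [np.quantile(Z[:, j], np.linspace(0, 1, bins + 1)[1:-1])
             for j in range(d)]                          # equiprobable bin edges
    cells = Counter(tuple(np.searchsorted(edges[j], Z[i, j]) for j in range(d))
                    for i in range(n))
    expected = n / bins**d
    chi2 = sum((cells.get(c, 0) - expected)**2
               for c in np.ndindex(*([bins] * d))) / expected
    # nominal reference; the paper derives the exact limiting distribution
    return chi2, stats.chi2.sf(chi2, bins**d - 1)

# toy usage: independent Gaussian data should show no structure
rng = np.random.default_rng(2)
print(structure_chi2(rng.standard_normal((600, 2))))
```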
