The Double Chain Markov Model (DCMM) is used to model an observable process $Y = \{Y_{t}\}_{t=1}^{T}$ as a Markov chain whose transition matrix, $P_{x_{t}}$, depends on the value of an unobservable (hidden) Markov chain $\{X_{t}\}_{t=1}^{T}$. We present and justify an efficient algorithm for sampling from the posterior distribution associated with the DCMM when the observable process $Y$ consists of independent vectors of (possibly) different lengths. Convergence of the Gibbs sampler, used to simulate the posterior density, is improved by adding a random permutation step. Simulation studies are included to illustrate the method. The problem that motivated our model is presented at the end: an application to real data consisting of the credit rating dynamics of a portfolio of financial companies, where the (unobserved) hidden process is the state of the broader economy.
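
To make the model structure concrete, the following is a minimal forward-simulation sketch of a DCMM, not the posterior sampler described in the abstract. The function name `simulate_dcmm`, the argument layout, and the choice of fixed initial states are illustrative assumptions; the only property taken from the text is that $Y_t$ moves according to the transition matrix $P_{x_t}$ selected by the hidden state.

```python
import random


def simulate_dcmm(hidden_P, obs_P, T, x0=0, y0=0, seed=None):
    """Simulate T steps of a double chain Markov model (illustrative sketch).

    hidden_P[i][j]  : P(X_t = j | X_{t-1} = i), the hidden chain.
    obs_P[x][i][j]  : P(Y_t = j | Y_{t-1} = i, X_t = x), i.e. the matrix P_x.
    x0, y0          : assumed fixed initial states (hypothetical convention).
    """
    rng = random.Random(seed)

    def draw(probs):
        # Draw one state index according to a row of probabilities.
        return rng.choices(range(len(probs)), weights=probs)[0]

    xs, ys = [x0], [y0]
    for _ in range(T - 1):
        x = draw(hidden_P[xs[-1]])      # hidden state moves first
        y = draw(obs_P[x][ys[-1]])      # observed chain uses the matrix P_x
        xs.append(x)
        ys.append(y)
    return xs, ys
```

With two hidden regimes, `obs_P[0]` and `obs_P[1]` could stand for, say, rating-migration matrices in two states of the economy; those labels are only an interpretation of the application mentioned in the abstract.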

We consider equalities between the ordinary least squares estimator ( $\mathrm {OLSE} $ ), the best linear unbiased estimator ( $\mathrm {BLUE} $ ) and the best linear unbiased predictor ( $\mathrm {BLUP} $ ) in the general linear model $\{ \mathbf y , \mathbf X \varvec{\beta }, \mathbf V \}$ extended with the new unobservable future value $ \mathbf y _{*}$ of the response whose expectation is $ \mathbf X _{*}\varvec{\beta }$ . Our aim is to provide some new insight and new proofs for the equalities under consideration. We also collect together various expressions, without rank assumptions, for the $\mathrm {BLUP} $ and provide new results giving upper bounds for the Euclidean norm of the difference between the $\mathrm {BLUP} ( \mathbf y _{*})$ and $\mathrm {BLUE} ( \mathbf X _{*}\varvec{\beta })$ and between the $\mathrm {BLUP} ( \mathbf y _{*})$ and $\mathrm {OLSE} ( \mathbf X _{*}\varvec{\beta })$ . A remark is made on the application to small area estimation.

Krämer (Sankhyā 42:130–131, 1980) posed the following problem: “Which are the $\mathbf{y}$, given $\mathbf{X}$ and $\mathbf{V}$, such that OLS and Gauss–Markov are equal?”. In other words, the problem aimed at identifying those vectors $\mathbf{y}$ for which the ordinary least squares (OLS) and Gauss–Markov estimates of the parameter vector $\varvec{\beta }$ coincide under the general Gauss–Markov model $\mathbf{y} = \mathbf{X} \varvec{\beta } + \mathbf{u}$. The problem was later called a “twist” to Kruskal’s Theorem, which provides necessary and sufficient conditions for the OLS and Gauss–Markov estimates of $\varvec{\beta }$ to be equal. The present paper focuses on a problem similar to the one posed by Krämer. However, instead of the estimation of $\varvec{\beta }$, we consider the estimation of the systematic part $\mathbf{X} \varvec{\beta }$, which is a natural consequence of relaxing the assumption, made by Krämer, that $\mathbf{X}$ and $\mathbf{V}$ are of full (column) rank. Further results, dealing with the Euclidean distance between the best linear unbiased estimator (BLUE) and the ordinary least squares estimator (OLSE) of $\mathbf{X} \varvec{\beta }$, as well as with an equality between BLUE and OLSE, are also provided. The calculations are mostly based on a joint partitioned representation of a pair of orthogonal projectors.

Let \(\mathbb{N } = \{1, 2, 3, \ldots \}\). Let \(\{X, X_{n}; n \in \mathbb N \}\) be a sequence of i.i.d. random variables, and let \(S_{n} = \sum _{i=1}^{n}X_{i}, n \in \mathbb N \). Then \( S_{n}/\sqrt{n} \Rightarrow N(0, \sigma ^{2})\) for some \(\sigma ^{2} < \infty \) whenever, for a subsequence \(\{n_{k}; k \in \mathbb N \}\) of \(\mathbb N \), \( S_{n_{k}}/\sqrt{n_{k}} \Rightarrow N(0, \sigma ^{2})\). Motivated by this result, we study the central limit theorem along subsequences of sums of i.i.d. random variables when \(\{\sqrt{n}; n \in \mathbb N \}\) is replaced by \(\{\sqrt{na_{n}};n \in \mathbb N \}\) with \(\lim _{n \rightarrow \infty } a_{n} = \infty \). We show that, for a given positive nondecreasing sequence \(\{a_{n}; n \in \mathbb N \}\) with \(\lim _{n \rightarrow \infty } a_{n} = \infty \) and \(\lim _{n \rightarrow \infty } a_{n+1}/a_{n} = 1\) and a given nondecreasing function \(h(\cdot ): (0, \infty ) \rightarrow (0, \infty )\) with \(\lim _{x \rightarrow \infty } h(x) = \infty \), there exists a sequence \(\{X, X_{n}; n \in \mathbb N \}\) of symmetric i.i.d. random variables such that \(\mathbb E h(|X|) = \infty \) and, for some subsequence \(\{n_{k}; k \in \mathbb N \}\) of \(\mathbb N \), \( S_{n_{k}}/\sqrt{n_{k}a_{n_{k}}} \Rightarrow N(0, 1)\). In particular, for a given \(0 < p < 2\) and a given nondecreasing function \(h(\cdot ): (0, \infty ) \rightarrow (0, \infty )\) with \(\lim _{x \rightarrow \infty } h(x) = \infty \), there exists a sequence \(\{X, X_{n}; n \in \mathbb N \}\) of symmetric i.i.d. random variables such that \(\mathbb E h(|X|) = \infty \) and, for some subsequence \(\{n_{k}; k \in \mathbb N \}\) of \(\mathbb N \), \( S_{n_{k}}/n_{k}^{1/p} \Rightarrow N(0, 1)\).

R. Göb, Statistical Papers, 1992, 33(1): 273-277
In elementary probability theory, as a result of a limiting process, the probabilities of a Bi(n, p) binomial distribution are approximated by the probabilities of a Po(np) Poisson distribution. Accordingly, in statistical quality control the binomial operating characteristic function \(\mathcal{L}_{n,c} (p)\) is approximated by the Poisson operating characteristic function \(\mathcal{F}_{n,c} (p)\). The inequality \(\mathcal{L}_{n + 1,c + 1} (p) > \mathcal{L}_{n,c} (p)\) for \(p \in (0;1)\) is evident from the interpretation of \(\mathcal{L}_{n + 1,c + 1} (p)\) and \(\mathcal{L}_{n,c} (p)\) as probabilities of accepting a lot. It is shown that the Poisson approximation \(\mathcal{F}_{n,c} (p)\) preserves this essential feature of the binomial operating characteristic function, i.e. that an analogous inequality holds for the Poisson operating characteristic function, too.
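
The two operating characteristic functions can be evaluated numerically as a quick check of the stated inequality. The sketch below assumes the usual acceptance-sampling reading, which the abstract does not spell out: a lot is accepted when at most `c` defectives appear among `n` inspected items, so each function is a cumulative probability up to `c`. The function names are illustrative.

```python
from math import comb, exp, factorial


def binomial_oc(n, c, p):
    """Binomial OC function: P(at most c defectives) under Bi(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))


def poisson_oc(n, c, p):
    """Poisson OC function: P(at most c defectives) under Po(n * p)."""
    lam = n * p
    return sum(exp(-lam) * lam**k / factorial(k) for k in range(c + 1))


if __name__ == "__main__":
    n, c = 10, 1
    for p in (0.05, 0.1, 0.3, 0.5, 0.9):
        # Both differences should be positive if the inequality holds.
        print(p,
              binomial_oc(n + 1, c + 1, p) - binomial_oc(n, c, p),
              poisson_oc(n + 1, c + 1, p) - poisson_oc(n, c, p))
```

A grid check like this illustrates the inequality for particular values only; it does not replace the proof given in the paper.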

Let \(X_1 ,X_2 ,\ldots ,X_n \) be a sequence of Markov Bernoulli trials (MBT) and \(\underline{X}_n =( {X_{n,k_1 } ,X_{n,k_2 } ,\ldots ,X_{n,k_r } })\) be a random vector where \(X_{n,k_i } \) represents the number of occurrences of success runs of length \(k_i \,( {i=1,2,\ldots ,r})\). In this paper the joint distribution of \(\underline{X}_n \) in the sequence of \(n\) MBT is studied using the method of conditional probability generating functions. Five different counting schemes of runs, namely non-overlapping runs, runs of length at least \(k\), overlapping runs, runs of exact length \(k\) and \(\ell \)-overlapping runs (i.e. the \(\ell \)-overlapping counting scheme, \(0\le \ell < k\)), are considered. The pgf of the joint distribution of \(\underline{X}_n \) is obtained in terms of a matrix polynomial, and an algorithm is developed to get the exact probability distribution. Numerical results are included to demonstrate the computational flexibility of the developed results. Various applications of the joint distribution of \(\underline{X}_n \), such as the evaluation of the reliability of \(( {n,f,k})\!:\!G\) and \(\langle n,f,k\rangle \!:\!G\) systems, the evaluation of quantities related to start-up demonstration tests, and acceptance sampling plans, are also discussed.
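
The five counting schemes differ only in how a maximal block of consecutive successes is credited, which a short sketch can make explicit. The code below counts runs in one observed 0/1 sequence; it does not compute the joint distribution studied in the paper. The function names are illustrative, and the per-block formulas are the standard textbook definitions of these schemes, stated here as an assumption since the abstract does not define them.

```python
def maximal_run_lengths(trials):
    """Lengths of the maximal blocks of consecutive successes (1s)."""
    runs, current = [], 0
    for x in trials:
        if x == 1:
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def count_runs(trials, k, ell=0):
    """Count success runs of length k under five schemes (0 <= ell < k)."""
    if not 0 <= ell < k:
        raise ValueError("need 0 <= ell < k")
    runs = maximal_run_lengths(trials)
    return {
        # restart counting after each completed run of length k
        "non_overlapping": sum(L // k for L in runs),
        # one count per maximal block of length >= k
        "at_least_k": sum(1 for L in runs if L >= k),
        # every window of k consecutive successes counts
        "overlapping": sum(max(0, L - k + 1) for L in runs),
        # only maximal blocks of length exactly k count
        "exactly_k": sum(1 for L in runs if L == k),
        # consecutive counted runs may share at most ell trials
        "ell_overlapping": sum((L - ell) // (k - ell) for L in runs if L >= k),
    }
```

Setting `ell=0` reproduces the non-overlapping count and `ell=k-1` reproduces the overlapping count, which is a convenient sanity check on the last formula.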

Given a random sample of size \(n\) with mean \(\overline{X} \) and standard deviation \(s\) from a symmetric distribution \(F(x; \mu , \sigma ) = F_{0} (( x- \mu ) / \sigma ) \) with \(F_0\) known, and \(X \sim F(x;\; \mu , \sigma )\) independent of the sample, we show how to construct an expansion \( a_n^{\prime } = \sum _{i=0}^\infty \ c_i \ n^{-i} \) such that \(\overline{X} - s a_n^{\prime } < X < \overline{X} + s a_n^{\prime } \) with a given probability \(\beta \) . The practical value of this result is illustrated by simulation and using a real data set.

Given a stationary multidimensional spatial process $\left\{ Z_{\mathbf{i}}=\left( X_{\mathbf{i}}, Y_{\mathbf{i}}\right) \in \mathbb{R}^d \times \mathbb{R},\ \mathbf{i}\in \mathbb{Z}^{N}\right\}$, we investigate a kernel estimate of the spatial conditional mode function of the response variable $Y_{\mathbf{i}}$ given the explanatory variable $X_{\mathbf{i}}$. Consistency in $L^p$ norm and strong convergence of the kernel estimate are obtained when the sample considered is an $\alpha$-mixing sequence. An application to real data is given in order to illustrate the behavior of our methodology.

Hotelling’s \(T^{2}\) control chart with variable parameters (VP \(T^{2}\)) has been shown to have better statistical performance than other adaptive control schemes in detecting small to moderate process mean shifts. In this paper, we investigate the statistical performance of the VP \(T^{2}\) control chart coupled with run rules. We consider two well-known run rules schemes. Statistical performance is evaluated by using a Markov chain modeling the random shock mechanism of the monitored process. The in-control time interval of the process is assumed to follow an exponential distribution. A genetic algorithm has been designed to select the optimal chart design parameters. We provide an extensive numerical analysis indicating that the VP \(T^{2}\) control chart with run rules outperforms other charts for small sizes of the mean shift expressed through the Mahalanobis distance.

For the counting process \(N=\{N(t), t\ge 0\}\) and the probability \(\bar P_k \) that a device survives the first \(k\) shocks, the probability that the device survives beyond \(t\), that is, \(\bar H(t) = \sum\limits_{k = 0}^\infty {P(N(t) = k)} \bar P_k \), is considered. The survival function \(\bar H(t)\) is proved to have the new better (worse) than used renewal failure rate and the new better (worse) than average failure rate properties under some conditions on \(N\) and \((\bar P_k )_{k = 0}^\infty \). In particular, we study the survival probability when \(N\) is a nonhomogeneous Poisson process or a birth process. A cumulative damage model and a Laplace transform characterization for the properties are investigated. Further, the generating functions for these renewal failure rate properties are given.
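
As a small worked illustration of the survival formula, the sketch below evaluates the mixture sum when the number of shocks by time t is Poisson distributed, which covers the nonhomogeneous Poisson case once the mean number of shocks by time t is supplied. It assumes the sum runs over all k ≥ 0 and truncates it at a hypothetical cutoff `k_max`; the function and parameter names are illustrative and not taken from the paper.

```python
from math import exp


def shock_survival(mean_shocks, survive_k, k_max=200):
    """Approximate H(t) = sum_k P(N(t) = k) * Pbar_k for Poisson N(t).

    mean_shocks : E[N(t)], the mean number of shocks by time t.
    survive_k   : callable k -> Pbar_k, probability of surviving k shocks.
    k_max       : truncation point of the series (an assumption of this sketch).
    """
    pmf = exp(-mean_shocks)            # P(N(t) = 0)
    total = 0.0
    for k in range(k_max + 1):
        total += pmf * survive_k(k)
        pmf *= mean_shocks / (k + 1)   # Poisson recursion to P(N(t) = k + 1)
    return total


if __name__ == "__main__":
    q = 0.8  # hypothetical per-shock survival probability, Pbar_k = q**k
    print(shock_survival(3.0, lambda k: q**k))
```

For the geometric choice `Pbar_k = q**k` the series has the closed form `exp(-mean_shocks * (1 - q))`, which gives a direct check on the truncation.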

Suppose one has a sample of high-frequency intraday discrete observations of a continuous time random process, such as foreign exchange rates and stock prices, and wants to test for the presence of jumps in the process. We show that the power of any test of this hypothesis depends on the frequency of observation. In particular, if the process is observed at intervals of length $1/n$ and the instantaneous volatility of the process is given by $\sigma _{t}$, we show that at best one can detect jumps of height no smaller than $\sigma _{t}\sqrt{2\log (n)/n}$. We present a new test which achieves this rate for diffusion-type processes, and examine its finite-sample properties using simulations.
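
The detection bound is easy to evaluate for concrete sampling frequencies. The helper below simply computes $\sigma_t\sqrt{2\log(n)/n}$ from the abstract; the function name and the example volatility value are illustrative assumptions, and the code is not the test proposed in the paper.

```python
from math import log, sqrt


def min_detectable_jump(sigma_t, n):
    """Smallest detectable jump height, sigma_t * sqrt(2 * log(n) / n).

    sigma_t : instantaneous volatility at time t.
    n       : number of observations per unit time (spacing 1/n).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return sigma_t * sqrt(2.0 * log(n) / n)


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        # The bound shrinks as the sampling frequency grows.
        print(n, min_detectable_jump(0.2, n))
```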

The general Gauss–Markov model, $Y = X\beta + e$, $E(e) = 0$, $Cov(e) = \sigma^2 V$, has been intensively studied and widely used. Most studies consider covariance matrices $V$ that are nonsingular, but we focus on the most difficult case, wherein $C(X)$, the column space of $X$, is not contained in $C(V)$. This forces $V$ to be singular. Under this condition there exist nontrivial linear functions of $Q'Y$ that are known with probability 1 (perfectly), where ${C(Q)=C(V)^\perp}$. To treat ${C(X) \not \subset C(V)}$, much of the existing literature obtains estimates and tests by replacing $V$ with a pseudo-covariance matrix $T = V + XUX'$ for some nonnegative definite $U$ such that ${C(X) \subset C(T)}$; see Christensen (Plane answers to complex questions: the theory of linear models, 2002, Chap. 10). We find it more intuitive to first eliminate what is known about $X\beta$ and then to adjust $X$ while keeping $V$ unchanged. We show that we can decompose $\beta$ into the sum of two orthogonal parts, $\beta = \beta_0 + \beta_1$, where $\beta_0$ is known. We also show that the unknown component of $X\beta$ is ${X\beta_1 \equiv \tilde{X} \gamma}$, where ${C(\tilde{X})=C(X)\cap C(V)}$. We replace the original model with ${Y-X\beta_0=\tilde{X}\gamma+e}$, $E(e) = 0$, ${Cov(e)=\sigma^2V}$ and perform estimation and tests under this new model, for which the simplifying assumption ${C(\tilde{X}) \subset C(V)}$ holds. This allows us to focus on the part of the parameters that is not known perfectly. We show that this method provides the usual estimates and tests.

Let \(\mathbf {X} = (X_1,\ldots ,X_p)\) be a stochastic vector having joint density function \(f_{\mathbf {X}}(\mathbf {x})\) with partitions \(\mathbf {X}_1 = (X_1,\ldots ,X_k)\) and \(\mathbf {X}_2 = (X_{k+1},\ldots ,X_p)\). A new method for estimating the conditional density function of \(\mathbf {X}_1\) given \(\mathbf {X}_2\) is presented. It is based on locally Gaussian approximations, but simplified in order to tackle the curse of dimensionality in multivariate applications, where both response and explanatory variables can be vectors. We compare our method to some available competitors, and the error of approximation is shown to be small in a series of examples using real and simulated data, and the estimator is shown to be particularly robust against noise caused by independent variables. We also present examples of practical applications of our conditional density estimator in the analysis of time series. Typical values for k in our examples are 1 and 2, and we include simulation experiments with values of p up to 6. Large sample theory is established under a strong mixing condition.

Let $x_{i(1)} \le x_{i(2)} \le \dots \le x_{i(r_i)}$ be the right-censored samples of sizes $n_i$ from the $i$th exponential distributions $\sigma _i^{ - 1} \exp \{ - (x - \mu _i )\sigma _i^{ - 1} \}$, $i = 1,2$, where $\mu_i$ and $\sigma_i$ are the unknown location and scale parameters, respectively. This paper deals with the posterior distribution of the difference between the two location parameters, namely $\mu_2 - \mu_1$, which may be represented in the form $\mu _2 - \mu _1 \mathop = \limits^\mathcal{D} x_{2(1)} - x_{1(1)} + F_1 \sin \theta - F_2 \cos \theta $, where $\mathop = \limits^\mathcal{D} $ stands for equality in distribution, $F_i$ stands for the central F-variable with $[2, 2(r_i - 1)]$ degrees of freedom, and $\tan \theta = \frac{n_2 s_{x1}}{n_1 s_{x2}}$, $s_{xi} = (r_i - 1)^{ - 1} \left\{ \sum\limits_{j = 1}^{r_i - 1} (n_i - j)(x_{i(j + 1)} - x_{i(j)}) \right\}$. The paper also derives the distribution of the statistic $V = F_1 \sin \theta - F_2 \cos \theta$, and tables of critical values of the $V$-statistic are provided for the 5% level of significance and selected degrees of freedom.

In this paper, by relaxing the mixing coefficients to ${\alpha(n) = O(n^{-\beta})}$, ${\beta > 3}$, we investigate the Bahadur representation of sample quantiles under an α-mixing sequence and obtain the rate ${O(n^{-\frac{1}{2}}(\log\log n\cdot\log n)^{\frac{1}{2}})}$. Meanwhile, for any ${\delta > 0}$, by strengthening the mixing coefficients to ${\alpha(n) = O(n^{-\beta})}$, ${\beta > \max\{3+\frac{5}{1+\delta},1+\frac{2}{\delta}\}}$, we obtain the rate ${O(n^{-\frac{3}{4}+\frac{\delta}{4(2+\delta)}}(\log\log n\cdot \log n)^{\frac{1}{2}})}$. Specifically, if ${\delta=\frac{\sqrt{41}-5}{4}}$ and ${\beta > \frac{\sqrt{41}+7}{2}}$, then the rate is ${O(n^{-\frac{\sqrt{41}+5}{16}}(\log\log n\cdot \log n)^{\frac{1}{2}})}$.
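
The special case at the end follows from the general expressions by direct arithmetic, which the short check below reproduces numerically. The function names are illustrative; the formulas are exactly the exponent $-\frac{3}{4}+\frac{\delta}{4(2+\delta)}$ and the bound $\max\{3+\frac{5}{1+\delta},\,1+\frac{2}{\delta}\}$ quoted in the abstract.

```python
from math import isclose, sqrt


def rate_exponent(delta):
    """Exponent of n in the rate: -3/4 + delta / (4 * (2 + delta))."""
    return -0.75 + delta / (4.0 * (2.0 + delta))


def beta_lower_bound(delta):
    """Lower bound on beta: max{3 + 5 / (1 + delta), 1 + 2 / delta}."""
    return max(3.0 + 5.0 / (1.0 + delta), 1.0 + 2.0 / delta)


if __name__ == "__main__":
    delta = (sqrt(41) - 5) / 4
    # Compare with the closed forms quoted for this particular delta.
    print(isclose(rate_exponent(delta), -(sqrt(41) + 5) / 16))
    print(isclose(beta_lower_bound(delta), (sqrt(41) + 7) / 2))
```

This value of δ is the one at which the two terms inside the maximum coincide, since $3+\frac{5}{1+\delta} = 1+\frac{2}{\delta}$ reduces to $2\delta^2 + 5\delta - 2 = 0$.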

A semicompeting risks problem involves two types of events: a nonterminal and a terminal event (death). Typically, the nonterminal event is the focus of the study, but the terminal event can preclude the occurrence of the nonterminal event. Semicompeting risks are ubiquitous in studies of aging. Examples of semicompeting risk dyads include: dementia and death, frailty syndrome and death, disability and death, and nursing home placement and death. Semicompeting risk models can be divided into two broad classes: models based only on observable quantities (class \(\mathcal {O}\)) and those based on potential (latent) failure times (class \(\mathcal {L}\)). The classical illness-death model belongs to class \(\mathcal {O}\). This model is a special case of the multistate models, which have been an active area of methodology development. During the past decade and a half, there has also been a flurry of methodological activity on semicompeting risks based on latent failure times (\(\mathcal {L}\) models). These advances notwithstanding, the semicompeting risks methodology has not penetrated biomedical research, in general, and gerontological research, in particular. Some possible reasons for this lack of uptake are: the methods are relatively new and sophisticated, conceptual problems associated with potential failure time models are difficult to overcome, expository articles aimed at educating practitioners are scarce, and readily usable software is not available. The main goals of this review article are: (i) to describe the major types of semicompeting risks problems arising in aging research, (ii) to provide a brief survey of the semicompeting risks methods, (iii) to suggest appropriate methods for addressing the problems in aging research, (iv) to highlight areas where more work is needed, and (v) to suggest ways to facilitate the uptake of the semicompeting risks methodology by the broader biomedical research community.

Finite mixture models can adequately model population heterogeneity when this heterogeneity arises from a finite number of relatively homogeneous clusters. An example of such a situation is market segmentation. Order selection in mixture models, i.e. selecting the correct number of components, however, is a problem which has not been satisfactorily resolved. Existing simulation results in the literature do not completely agree with each other. Moreover, it appears that the performance of different selection methods is affected by the type of model and the parameter values. Furthermore, most existing results are based on simulations where the true generating model is identical to one of the models in the candidate set. In order to partly fill this gap we carried out a (relatively) large simulation study for finite mixture models of normal linear regressions. We included several types of model (mis)specification to study the robustness of 18 order selection methods. Furthermore, we compared the performance of these selection methods based on unpenalized and penalized estimates of the model parameters. The results indicate that order selection based on penalized estimates greatly improves the success rates of all order selection methods. The most successful methods were \(MDL2\), \(MRC\), \(MRC_k\), \(ICL\)-\(BIC\), \(ICL\), \(CAIC\), \(BIC\) and \(CLC\), but no single method was consistently good or best for all types of model (mis)specification.

We investigate methods for the design of sample surveys, and address the traditional resistance of survey samplers to the use of model-based methods by incorporating model robustness at the design stage. The designs are intended to be sufficiently flexible and robust that resulting estimates, based on the designer’s best guess at an appropriate model, remain reasonably accurate in a neighbourhood of this central model. Thus, consider a finite population of N units in which a survey variable Y is related to a q-dimensional auxiliary variable x. We assume that the values of x are known for all N population units, and that we will select a sample of n of the N population units and then observe the n corresponding values of Y. The objective is to predict the population total $T=\sum_{i=1}^{N}Y_{i}$. The design problem which we consider is to specify a selection rule, using only the values of the auxiliary variable, to select the n units for the sample so that the predictor has optimal robustness properties. We suppose that T will be predicted by methods based on a linear relationship between Y (possibly transformed) and given functions of x. We maximise the mean squared error of the prediction of T over realistic neighbourhoods of the fitted linear relationship, and of the assumed variance and correlation structures. This maximised mean squared error is then minimised over the class of possible samples, yielding an optimally robust (‘minimax’) design. To carry out the minimisation step we introduce a genetic algorithm and discuss its tuning for maximal efficiency.

In this paper, we consider the problem of hypothesis testing about the drift parameter \(\theta \) in the process \(\text {d}Y^{\delta }_{t} = \theta \dot{f}(t)Y^{\delta }_{t}\text {d}t + b(t)\text {d}L^{\delta }_{t}\) driven by a symmetric \(\delta \)-stable Lévy process \(L^{\delta }_{t}\), with \(\dot{f}(t)\) being the derivative of a known increasing function f(t) and b(t) being known as well. We consider testing the hypotheses \(H_{0}: \theta \le 0\) and \(K_{0}: \theta =0\) against the alternatives \(H_{1}: \theta >0\) and \(K_{1}: \theta \ne 0\), respectively. For these hypotheses, we propose inverse methods, motivated by the sequential approach, based on the first hitting time of the observed process (or its absolute value) to a pre-specified boundary or two boundaries until some given time. The applicability of these methods is illustrated. For the case \(Y^{\delta }_{0}=0\), we are able to calculate the values of the boundaries and the finite observation times more directly. We are able to show the consistency of the proposed tests for \(Y^{\delta }_{0}\ge 0\) with \(\delta \in (1,2]\) and for \(Y^{\delta }_{0}=0\) with \(\delta \in (0,2]\) under quite mild conditions.

