Similar documents (20 results)
1.
We consider in this paper the semiparametric mixture of two unknown distributions that are equal up to a location parameter. The model is semiparametric in the sense that the mixed distribution is not assumed to belong to a parametric family. To ensure identifiability of the model, the mixed distribution is assumed to be symmetric about zero, so that the model is defined by the mixing proportion, two location parameters and the probability density function of the mixed distribution. We propose a new class of M-estimators of these parameters based on a Fourier approach and prove that they are consistent under mild regularity conditions. Their finite-sample properties are illustrated by a Monte Carlo study, and a benchmark real dataset is also analysed with our method.

2.
This paper deals with the study of dependencies between two given events modelled by point processes. In particular, we focus on the context of DNA to detect favoured or avoided distances between two given motifs along a genome, suggesting possible interactions at a molecular level. For this, we naturally introduce a so-called reproduction function h that quantifies the favoured positions of the motifs and that is taken as the intensity of a Poisson process. Our first interest is the estimation of this function h, assumed to be well localized. The proposed estimator, based on random thresholds, achieves an oracle inequality. Then, minimax properties of the estimator on Besov balls are established. Some simulations are provided, illustrating the good practical behaviour of our procedure. Finally, our method is applied to the analysis of the dependence between promoter sites and genes along the genome of the Escherichia coli bacterium.
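As a rough illustration of the idea of looking for favoured distances between two motifs, the following Python sketch simply counts, for synthetic motif positions, how many occurrences of the second motif fall at each distance downstream of the first; a bump in the resulting profile suggests a favoured distance. The positions, bin width and maximum lag are made-up assumptions, and this is only a crude histogram, not the thresholded estimator of h studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic genome positions of two motifs (assumption: positions in base pairs).
pos_a = np.sort(rng.integers(0, 1_000_000, size=2_000))
pos_b = np.sort(rng.integers(0, 1_000_000, size=2_000))
# Plant a favoured distance of ~150 bp after some occurrences of motif A.
pos_b = np.sort(np.concatenate([pos_b, pos_a[:300] + 150 + rng.integers(-5, 6, 300)]))

max_lag = 1_000          # only look at distances up to 1 kb downstream
bin_width = 25

# Count distances (b - a) in (0, max_lag] for every pair of occurrences.
counts = np.zeros(max_lag // bin_width)
for a in pos_a:
    lo = np.searchsorted(pos_b, a + 1)
    hi = np.searchsorted(pos_b, a + max_lag, side="right")
    d = pos_b[lo:hi] - a
    counts += np.bincount((d - 1) // bin_width, minlength=counts.size)[:counts.size]

# Normalize by the number of A occurrences: a crude estimate of the mean number of
# B occurrences at each distance; bumps in this profile suggest favoured distances.
rate = counts / len(pos_a)
print(np.round(rate, 3))
```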

3.
We discuss a class of difference-based estimators for the autocovariance in nonparametric regression when the signal is discontinuous and the errors form a stationary m-dependent process. These estimators circumvent the particularly challenging task of pre-estimating such an unknown regression function. We provide finite-sample expressions for their mean squared errors for piecewise constant signals and Gaussian errors. Based on this, we derive bias-optimized estimates that do not depend on the unknown autocovariance structure. Notably, for positively correlated errors, the part of the variance of our estimators that depends on the signal is minimal as well. Further, we provide sufficient conditions for consistency; this result is extended to piecewise Hölder regression with non-Gaussian errors. We combine our bias-optimized autocovariance estimates with a projection-based approach and derive covariance matrix estimates, a method that is of independent interest. An R package, several simulations and an application to biophysical measurements complement this paper.
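A minimal sketch of the basic difference-based idea, assuming the signal is (nearly) piecewise constant and the errors are stationary and m-dependent: away from jumps, E[(y_{i+h} - y_i)^2] = 2(gamma(0) - gamma(h)), and gamma(h) = 0 for lags beyond m, which pins down gamma(0) first. Trimming the largest squared differences is a crude way to discard the few that straddle a jump; the bias-optimized weights and the covariance-matrix construction of the paper are not reproduced here.

```python
import numpy as np

def diff_autocov(y, m, big_lag=None):
    """Difference-based estimates of gamma(0..m) for a noisy, (nearly) piecewise
    constant signal with stationary m-dependent errors.

    Uses E[(y[i+h] - y[i])^2] = 2*(gamma(0) - gamma(h)) away from jumps, and
    gamma(h) = 0 for h > m to recover gamma(0) from a lag beyond the dependence range.
    A tiny fraction of the largest squared differences is trimmed, since those are
    the ones most likely to straddle a jump of the signal.
    """
    y = np.asarray(y, float)
    big_lag = big_lag or (m + 1)

    def mean_sq_diff(h, trim=0.002):
        d2 = (y[h:] - y[:-h]) ** 2
        keep = d2 <= np.quantile(d2, 1 - trim)
        return d2[keep].mean()

    gamma0 = 0.5 * mean_sq_diff(big_lag)
    gammas = [gamma0] + [gamma0 - 0.5 * mean_sq_diff(h) for h in range(1, m + 1)]
    return np.array(gammas)

# Demo: piecewise constant signal + MA(1) errors (m = 1, gamma(0)=1.25, gamma(1)=0.5).
rng = np.random.default_rng(1)
n = 5_000
signal = np.repeat([0.0, 2.0, -1.0, 3.0], n // 4)
eps = rng.normal(size=n + 1)
errors = eps[1:] + 0.5 * eps[:-1]
print(diff_autocov(signal + errors, m=1))   # roughly [1.25, 0.5]
```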

4.
Mixture models are commonly used in biomedical research to account for possible heterogeneity in a population. In this paper, we consider tests for homogeneity between two groups in exponential tilt mixture models. A novel pairwise pseudolikelihood approach is proposed to eliminate the unknown nuisance function. We show that, under the null hypothesis, the corresponding pseudolikelihood ratio test has an asymptotic distribution given by the supremum of two squared Gaussian processes. To retain the simplicity of conventional likelihood ratio tests, we propose two alternative tests, both shown to have a simple asymptotic distribution under the null. Simulation studies show that the proposed class of pseudolikelihood ratio tests performs well, controlling the type I error and achieving competitive power compared with existing tests. The proposed tests are illustrated by an example of partial differential expression detection using microarray data from prostate cancer patients.

5.
Estimation of the time-average variance constant (TAVC), the asymptotic variance of the sample mean of a dependent process, is of fundamental importance in many areas of statistics. For frequentists, it is crucial for constructing confidence intervals for the mean and serves as a normalizing constant in various test statistics. For Bayesians, it is widely used for evaluating the effective sample size and for convergence diagnosis in Markov chain Monte Carlo methods. In this paper, by considering high-order corrections to the asymptotic biases, we develop a new class of TAVC estimators that enjoys optimal convergence rates under different degrees of serial dependence of the stochastic process. The high-order correction procedure is also applicable to estimation of the so-called smoothness parameter, which is essential in determining the optimal bandwidth. Comparisons with existing TAVC estimators are comprehensively investigated. In particular, the proposed optimal high-order corrected estimator has the best performance in terms of mean squared error.
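For orientation, the sketch below implements the classical overlapping-batch-means (OBM) estimator of the TAVC, one of the standard baselines such high-order corrected estimators are compared against; it is not the paper's method. The batch size n^(1/3) is a common default, assumed here for the demo, and the AR(1) example has known TAVC 1/(1-phi)^2 = 4.

```python
import numpy as np

def obm_tavc(x, batch_size=None):
    """Overlapping-batch-means estimate of the time-average variance constant
    (TAVC), i.e. the asymptotic variance of sqrt(n) * (sample mean)."""
    x = np.asarray(x, float)
    n = x.size
    b = batch_size or int(np.floor(n ** (1 / 3)))   # a common default batch size
    # Means of all n - b + 1 overlapping batches via a cumulative sum.
    cs = np.concatenate(([0.0], np.cumsum(x)))
    batch_means = (cs[b:] - cs[:-b]) / b
    centred = batch_means - x.mean()
    return n * b * np.sum(centred ** 2) / ((n - b) * (n - b + 1))

# Demo: AR(1) chain with phi = 0.5, unit innovation variance -> TAVC = 1/(1-phi)^2 = 4.
rng = np.random.default_rng(2)
phi, n = 0.5, 200_000
eps = rng.normal(size=n)
x = np.empty(n)
x[0] = eps[0]
for t in range(1, n):
    x[t] = phi * x[t - 1] + eps[t]
print(obm_tavc(x))   # close to 4
```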

6.
The starting point in uncertainty quantification is a stochastic model, which is fitted to a technical system in a suitable way, and prediction of uncertainty is carried out within this stochastic model. In any application such a model will not be perfect, so any uncertainty quantification based on it has to take the inadequacy of the model into account. In this paper, we rigorously show how the observed data of the technical system can be used to build a conservative non-asymptotic confidence interval on quantiles related to experiments with the technical system. The construction of this confidence interval is based on concentration inequalities and order statistics. An asymptotic bound on the length of this confidence interval is presented. Here, we assume that engineers use more and more of their knowledge to build models whose error decreases at a suitably bounded rate. The results are illustrated by applying the newly proposed approach to real and simulated data.
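A simple related construction, sketched below, is the classical distribution-free confidence interval for a quantile built from order statistics and the Binomial(n, p) distribution; it is conservative and non-asymptotic, but it is only a textbook baseline, not the paper's concentration-inequality construction. The lognormal sample and the level p = 0.9 are illustrative assumptions.

```python
import numpy as np
from scipy.stats import binom

def quantile_ci(sample, p, alpha=0.05):
    """Conservative, distribution-free (1 - alpha) confidence interval for the
    p-quantile, built from order statistics: with B ~ Binomial(n, p),
    P( X_(l) <= q_p <= X_(u) ) >= binom.cdf(u - 1) - binom.cdf(l - 1) >= 1 - alpha."""
    x = np.sort(np.asarray(sample, float))
    n = x.size
    l = int(binom.ppf(alpha / 2, n, p))          # largest l with P(B < l) <= alpha/2
    u = int(binom.ppf(1 - alpha / 2, n, p)) + 1  # smallest u with P(B >= u) <= alpha/2
    if l < 1 or u > n:
        raise ValueError("sample too small for a two-sided interval at this level")
    coverage = binom.cdf(u - 1, n, p) - binom.cdf(l - 1, n, p)
    return x[l - 1], x[u - 1], coverage

rng = np.random.default_rng(3)
data = rng.lognormal(mean=0.0, sigma=1.0, size=500)
lo, hi, cov = quantile_ci(data, p=0.9)
print(lo, hi, cov)   # brackets the true 0.9-quantile exp(1.2816) ~ 3.60 with prob >= 0.95
```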

7.
Let X be lognormal(μ, σ²) with density f(x); let θ > 0 and define L(θ) = E[e^{−θX}]. We study properties of the exponentially tilted density (Esscher transform) f_θ(x) = e^{−θx} f(x)/L(θ), in particular its moments, its asymptotic form as θ → ∞, and asymptotics for the saddlepoint θ(x) determined by the mean-matching equation E_θ[X] = x. The asymptotic formulas involve the Lambert W function. The established relations are used to provide two different numerical methods for evaluating the left-tail probability of the sum of lognormals S_n = X_1 + ⋯ + X_n: a saddlepoint approximation and an exponential-tilting importance sampling estimator. For the latter, we demonstrate logarithmic efficiency. Numerical examples for the cdf F_n(x) and the pdf f_n(x) of S_n are given in a range of values of σ², n and x motivated by portfolio value-at-risk calculations.
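The sketch below illustrates the exponential-tilting importance-sampling idea in a minimal form, under assumed parameters (μ = 0, σ = 1, n = 10, x = 4): L(θ) is computed by numerical quadrature, f_θ is sampled by rejection (propose from f, accept with probability e^{−θx}, which is at most 1), the tilt is chosen so that the tilted mean equals x/n, and the likelihood ratio L(θ)^n e^{θS_n} reweights the indicator of the rare event. These are generic illustrative choices, not the authors' implementation.

```python
import numpy as np
from scipy import integrate, optimize, stats

mu, sigma = 0.0, 1.0
X = stats.lognorm(s=sigma, scale=np.exp(mu))      # lognormal(mu, sigma^2)

def L(theta):
    """Laplace transform L(theta) = E[exp(-theta*X)] by numerical quadrature."""
    val, _ = integrate.quad(lambda x: np.exp(-theta * x) * X.pdf(x), 0, np.inf)
    return val

def tilted_mean(theta):
    """Mean of the exponentially tilted density f_theta."""
    num, _ = integrate.quad(lambda x: x * np.exp(-theta * x) * X.pdf(x), 0, np.inf)
    return num / L(theta)

def sample_tilted(theta, size, rng):
    """Rejection sampling from f_theta(x) = exp(-theta*x) f(x) / L(theta):
    propose from f, accept with probability exp(-theta*x) (<= 1 for x, theta > 0)."""
    out = []
    while sum(len(b) for b in out) < size:
        prop = X.rvs(size=4 * size, random_state=rng)
        keep = rng.uniform(size=prop.size) < np.exp(-theta * prop)
        out.append(prop[keep])
    return np.concatenate(out)[:size]

def left_tail_is(x, n, n_rep=2000, seed=7):
    """Importance-sampling estimate of P(X_1 + ... + X_n <= x), X_i iid lognormal."""
    rng = np.random.default_rng(seed)
    # Saddlepoint-type tilt: choose theta so that the tilted mean equals x / n.
    theta = optimize.brentq(lambda t: tilted_mean(t) - x / n, 1e-6, 50.0)
    xs = sample_tilted(theta, n_rep * n, rng).reshape(n_rep, n)
    s = xs.sum(axis=1)
    weights = L(theta) ** n * np.exp(theta * s) * (s <= x)
    return weights.mean(), weights.std(ddof=1) / np.sqrt(n_rep), theta

est, se, theta = left_tail_is(x=4.0, n=10)
print(f"P(S_10 <= 4) ~ {est:.3e} (se {se:.1e}), tilt theta = {theta:.2f}")
```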

8.
This paper is concerned with studying the dependence structure between two random variables Y1 and Y2 in the presence of a covariate X that affects both marginal distributions but not the dependence structure. This is reflected in the property that the conditional copula of Y1 and Y2 given X does not depend on the value of X. This independence often appears as a simplifying assumption in pair-copula constructions. We introduce a general estimator for the copula in this specific setting and establish its consistency. Moreover, we consider some special cases, such as parametric or nonparametric location-scale models for the effect of the covariate X on the marginals of Y1 and Y2, and show that in these cases weak convergence of the estimator at √n-rate holds. The theoretical results are illustrated by simulations and a real data example.
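The simulated sketch below shows the basic mechanics in the simplest special case, with made-up linear location models for both margins: remove the covariate effect by ordinary least squares, turn the residuals into rank-based pseudo-observations, and summarize the copula by a Gaussian-copula (normal-scores) correlation. Ignoring the covariate inflates the apparent dependence, while working with residuals recovers the conditional copula's parameter; this is only an illustration, not the paper's general estimator.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n = 2_000

# Covariate shifts both margins, but the conditional copula of (Y1, Y2) given X is a
# fixed Gaussian copula with correlation 0.6 (illustrative data-generating choices).
x = rng.uniform(0, 1, n)
z = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=n)
y1 = 1.0 + 2.0 * x + 0.5 * z[:, 0]
y2 = -0.5 + 3.0 * x + 0.8 * z[:, 1]

def pseudo_obs(v):
    """Rank-based pseudo-observations in (0, 1)."""
    return stats.rankdata(v) / (len(v) + 1)

def residuals(y, x):
    """Remove the covariate effect with a simple linear location model (OLS)."""
    beta = np.polyfit(x, y, deg=1)
    return y - np.polyval(beta, x)

# Copula-scale data after removing the covariate effect on each margin.
u1, u2 = pseudo_obs(residuals(y1, x)), pseudo_obs(residuals(y2, x))

# Gaussian-copula correlation from normal scores of the pseudo-observations.
rho_hat = np.corrcoef(stats.norm.ppf(u1), stats.norm.ppf(u2))[0, 1]
# Ignoring the covariate inflates the apparent dependence.
rho_naive = np.corrcoef(stats.norm.ppf(pseudo_obs(y1)), stats.norm.ppf(pseudo_obs(y2)))[0, 1]
print(rho_hat, rho_naive)   # ~0.6 versus a noticeably larger value
```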

9.
In this paper, we consider the problem of adaptive density or survival function estimation in an additive model defined by Z = X + Y with X independent of Y, when both random variables are non-negative. This model is relevant, for instance, in reliability, where we are interested in the failure time of a material that cannot be isolated from the system it belongs to. Our goal is to recover the distribution of X (density or survival function) from n observations of Z, assuming that the distribution of Y is known. This is the classical statistical problem of deconvolution, which has usually been tackled with Fourier-type approaches. In the present case, however, the random variables have the particularity of being supported on the non-negative half-line. Exploiting this, we propose a new angle of attack by building a projection estimator on an appropriate Laguerre basis. We present upper bounds on the mean integrated squared risk of our density and survival function estimators. We then describe a non-parametric data-driven strategy for selecting a relevant projection space. The procedures are illustrated with simulated data and compared with a more classical deconvolution approach based on Fourier methods. Our procedure achieves faster convergence rates than Fourier methods for estimating these functions.
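The sketch below only shows the Laguerre projection machinery on which such estimators are built: the Laguerre functions form an orthonormal basis of L^2([0, ∞)), so a density of non-negative data can be estimated by empirical projection coefficients. For simplicity it estimates the density of the observed Z itself; the deconvolution step that corrects for the known distribution of Y, as well as the data-driven model selection, are omitted, and the sample and truncation level are made-up choices.

```python
import numpy as np
from scipy.special import eval_laguerre

def laguerre_fn(k, x):
    """k-th Laguerre function sqrt(2) L_k(2x) exp(-x): orthonormal on [0, inf)."""
    return np.sqrt(2.0) * eval_laguerre(k, 2.0 * x) * np.exp(-x)

def projection_density(sample, n_coef=15):
    """Projection density estimator on the Laguerre basis for non-negative data:
    the k-th coefficient is estimated by the empirical mean of phi_k(Z_i)."""
    sample = np.asarray(sample, float)
    coefs = np.array([laguerre_fn(k, sample).mean() for k in range(n_coef)])
    return lambda x: sum(c * laguerre_fn(k, x) for k, c in enumerate(coefs))

# Demo: Z = X + Y with X ~ Exp(1) and Y ~ Exp(2); here we estimate the density of Z.
rng = np.random.default_rng(4)
z = rng.exponential(1.0, 20_000) + rng.exponential(0.5, 20_000)
f_hat = projection_density(z)
grid = np.linspace(0.0, 6.0, 7)
true_fz = 2.0 * (np.exp(-grid) - np.exp(-2.0 * grid))   # density of Exp(1) + Exp(2)
print(np.round(f_hat(grid), 3))    # the two rows should roughly agree
print(np.round(true_fz, 3))
```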

10.
Let X_n = (x_ij) be a k × n data matrix with complex-valued, independent and standardized entries satisfying a Lindeberg-type moment condition. We consider simultaneously R sample covariance matrices B_nr, r = 1, …, R, built from X_n, where the Q_r's are non-random real matrices with common dimensions p × k (k ≥ p). Assuming that both the dimension p and the sample size n grow to infinity, the limiting distributions of the eigenvalues of the matrices {B_nr} are identified, and, as the main result of the paper, we establish a joint central limit theorem (CLT) for linear spectral statistics of the R matrices {B_nr}. Next, this new CLT is applied to the problem of testing a high-dimensional white noise in time series modelling. In experiments, the derived test has a controlled size and is significantly faster than the classical permutation test, although it does have lower power. This application highlights the need for such a joint CLT in the presence of several dependent sample covariance matrices. In contrast, all existing works on CLTs for linear spectral statistics of large sample covariance matrices deal with a single sample covariance matrix (R = 1).

11.
In this paper, we consider the problem of estimating the Laplace transform of volatility within a fixed time interval [0, T] using high-frequency sampling, where we assume that the discretized observations of the latent process are contaminated by microstructure noise. We use the pre-averaging approach to deal with the effect of the microstructure noise. Under the high-frequency scenario, we obtain a consistent estimator whose convergence rate is n^{1/4}, which is known to be the optimal rate for estimating integrated volatility functionals in the presence of microstructure noise. The related central limit theorem is established. Simulation studies confirm the finite-sample performance of the proposed estimator.
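The sketch below shows the generic pre-averaging device on which such estimators rely, applied to plain integrated volatility rather than the Laplace transform targeted in the paper: raw returns are locally averaged with the weight g(s) = min(s, 1 − s) over windows of length about θ√n, which damps the noise, and a bias term estimated from the realized variance of raw returns is subtracted. The constants ψ1 = 1 and ψ2 = 1/12 correspond to that weight, and the tuning θ = 1, the constant-volatility path and the noise level are illustrative assumptions.

```python
import numpy as np

def preaveraged_iv(y, theta=1.0):
    """Pre-averaging estimator of integrated volatility from noisy log-prices y_0..y_n.
    Uses the weight g(s) = min(s, 1 - s) (psi1 = 1, psi2 = 1/12) and subtracts the
    microstructure-noise bias estimated from the realized variance of raw returns."""
    dy = np.diff(y)
    n = dy.size
    kn = int(np.ceil(theta * np.sqrt(n)))
    psi1, psi2 = 1.0, 1.0 / 12.0
    g = np.minimum(np.arange(1, kn) / kn, 1 - np.arange(1, kn) / kn)
    # Pre-averaged returns: weighted sums of kn - 1 consecutive raw returns.
    ybar = np.convolve(dy, g[::-1], mode="valid")   # length n - kn + 2
    noise_var = 0.5 * np.mean(dy ** 2)              # ~ variance of the noise
    return (np.sum(ybar ** 2) - (n / kn) * psi1 * noise_var) / (kn * psi2)

# Demo: Brownian price with sigma = 0.2 on [0, 1] (true IV = 0.04) plus iid noise.
rng = np.random.default_rng(6)
n, sigma, omega = 23_400, 0.2, 0.001
x = np.concatenate(([0.0], np.cumsum(sigma * np.sqrt(1 / n) * rng.normal(size=n))))
y = x + omega * rng.normal(size=n + 1)
print(preaveraged_iv(y))          # roughly 0.04
print(np.sum(np.diff(y) ** 2))    # naive realized variance, inflated by the noise
```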

12.
We propose a new method for risk-analytic benchmark dose (BMD) estimation in a dose-response setting where the responses are measured on a continuous scale. For each dose level d, the observation X(d) is assumed to follow a normal distribution, X(d) ~ N(μ(d), σ²). No specific parametric form is imposed on the mean μ(d), however. Instead, nonparametric maximum likelihood estimates of μ(d) and σ are obtained under a monotonicity constraint on μ(d). For purposes of quantitative risk assessment, a 'hybrid' form of risk function is defined for any dose d as R(d) = P[X(d) < c], where c > 0 is a constant independent of d. The BMD is then determined by inverting the additional risk function R_A(d) = R(d) − R(0) at some specified value of the benchmark response. Asymptotic theory for the point estimators is derived, and a finite-sample study is conducted using both real and simulated data. When a large number of doses are available, we propose an adaptive grouping method for estimating the BMD, which is shown to have optimal mean integrated squared error under appropriate designs.
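A minimal end-to-end sketch of this pipeline, under made-up doses, cutoff c, benchmark response and a non-increasing mean: fit μ(d) by weighted isotonic regression (pool-adjacent-violators), estimate σ from the residuals, plug both into the hybrid risk R(d) = Φ((c − μ(d))/σ), and invert the additional risk on a grid by linear interpolation. The adaptive grouping method and the asymptotic theory of the paper are not reproduced.

```python
import numpy as np
from scipy.stats import norm

def pava_decreasing(y, w):
    """Pool-adjacent-violators: weighted isotonic (non-increasing) fit to y."""
    y, w = list(-np.asarray(y, float)), list(np.asarray(w, float))  # sign flip -> increasing fit
    vals, wts, counts = [], [], []
    for yi, wi in zip(y, w):
        vals.append(yi); wts.append(wi); counts.append(1)
        while len(vals) > 1 and vals[-2] > vals[-1]:
            wv = wts[-2] + wts[-1]
            vals[-2] = (wts[-2] * vals[-2] + wts[-1] * vals[-1]) / wv
            wts[-2] = wv; counts[-2] += counts[-1]
            vals.pop(); wts.pop(); counts.pop()
    return -np.repeat(vals, counts)

# Dose groups (assumed design); responses X(d) ~ N(mu(d), sigma^2) with mu non-increasing.
rng = np.random.default_rng(8)
doses = np.array([0.0, 0.5, 1.0, 2.0, 4.0])
mu_true = np.array([10.0, 9.8, 9.2, 8.0, 6.5])
data = [rng.normal(m, 1.0, size=30) for m in mu_true]

means = np.array([d.mean() for d in data])
sizes = np.array([len(d) for d in data])
mu_hat = pava_decreasing(means, sizes)                       # constrained estimate of mu(d)
sigma_hat = np.sqrt(np.mean(np.concatenate([d - m for d, m in zip(data, mu_hat)]) ** 2))

c = 7.5                                                      # adverse-response cutoff (assumed)
risk = norm.cdf((c - mu_hat) / sigma_hat)                    # hybrid risk R(d) = P[X(d) < c]
add_risk = risk - risk[0]                                    # additional risk R_A(d)

bmr = 0.10                                                   # benchmark response (assumed)
grid = np.linspace(doses[0], doses[-1], 1001)
add_risk_grid = np.interp(grid, doses, add_risk)             # piecewise-linear inversion
bmd = grid[np.argmax(add_risk_grid >= bmr)]
print(mu_hat.round(2), sigma_hat.round(2), bmd.round(2))
```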

13.
We consider model selection for linear mixed-effects models with clustered structure, where conditional Kullback-Leibler (CKL) loss is applied to measure the efficiency of the selection. We estimate the CKL loss by substituting the empirical best linear unbiased predictors (EBLUPs) for the random effects, with model parameters estimated by maximum likelihood. Although the BLUP approach is commonly used in predicting random effects and future observations, selecting random effects to achieve asymptotic loss efficiency with respect to the CKL loss is challenging and has not been well studied. In this paper, we propose addressing this difficulty using a conditional generalized information criterion (CGIC) with two tuning parameters. We further consider a challenging but practically relevant situation where the number m of clusters does not go to infinity with the sample size, so that the random-effects variances are not consistently estimable. We show that, via a novel decomposition of the CKL risk, the CGIC achieves consistency and asymptotic loss efficiency whether m is fixed or increases to infinity with the sample size. We also conduct numerical experiments to illustrate the theoretical findings.

14.
We study adaptive importance sampling (AIS) as an online learning problem and argue for the importance of the trade-off between exploration and exploitation in this adaptation. Borrowing ideas from the online learning literature, we propose Daisee, a partition-based AIS algorithm. We further introduce a notion of regret for AIS and show that Daisee has 𝒪(√T (log T)^{3/4}) cumulative pseudo-regret, where T is the number of iterations. We then extend Daisee to adaptively learn a hierarchical partitioning of the sample space for more efficient sampling and confirm the performance of both algorithms empirically.

15.
In prior work, this group demonstrated the feasibility of valid adaptive sequential designs for crossover bioequivalence studies. In this paper, we extend that work to optimize adaptive sequential designs over a range of geometric mean test/reference ratios (GMRs) of 70-143% within each of two ranges of intra-subject coefficient of variation (10-30% and 30-55%). These designs also introduce a futility decision for stopping the study after the first stage if there is a sufficiently low likelihood of meeting the bioequivalence criteria were the second stage completed, as well as an upper limit on total study size. The optimized designs exhibited substantially improved performance characteristics over our previous adaptive sequential designs. Even though the optimized designs avoided undue inflation of type I error and maintained power at 80%, their average sample sizes were similar to or less than those of conventional single-stage designs.

16.
Continuous determinantal point processes (DPPs) are a class of repulsive point processes on ℝ^d with many statistical applications. Although an explicit expression of their density is known, it is too complicated to be used directly for maximum likelihood estimation. In the stationary case, an approximation using Fourier series has been suggested, but it is limited to rectangular observation windows and no theoretical results support it. In this contribution, we investigate a different way to approximate the likelihood by looking at its asymptotic behaviour when the observation window grows toward ℝ^d. This new approximation is not limited to rectangular windows, is faster to compute than the previous one, does not require any tuning parameter, and some theoretical justifications are provided. It moreover yields an explicit formula for estimating the asymptotic variance of the associated estimator. The performance is assessed in a simulation study on standard parametric models on ℝ^d and compares favourably with common alternative estimation methods for continuous DPPs.

17.
Let M be an isotonic real-valued function on a compact subset of Euclidean space, and let M̂ be an unconstrained estimator of M. A feasible monotonizing technique is to take the largest (smallest) monotone function that lies below (above) the estimator, or any convex combination of these two envelope estimators. When the process r_n(M̂ − M) is asymptotically equicontinuous for some sequence r_n → ∞, we show that these projection-type estimators are r_n-equivalent in probability to the original unrestricted estimator. Our first motivating application involves a monotone estimator of the conditional distribution function that has the distributional properties of the local linear regression estimator. Applications also include the estimation of econometric (probability-weighted moment, quantile) and biometric (mean remaining lifetime) functions.
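On a finite grid, the two envelope estimators described here are just running extrema, as the sketch below shows for a non-decreasing target: the largest monotone function below the unconstrained estimate is a running minimum taken from the right, the smallest monotone function above it is a running maximum from the left, and any convex combination of the two is monotone. The noisy estimate in the demo is a made-up stand-in for an unconstrained estimator.

```python
import numpy as np

def monotone_envelopes(f_hat):
    """Given unconstrained estimates f_hat on an ordered grid, return the largest
    non-decreasing function lying below them and the smallest one lying above."""
    f_hat = np.asarray(f_hat, float)
    lower = np.minimum.accumulate(f_hat[::-1])[::-1]   # largest monotone minorant
    upper = np.maximum.accumulate(f_hat)               # smallest monotone majorant
    return lower, upper

# Demo: a noisy, non-monotone estimate of an increasing target.
rng = np.random.default_rng(9)
grid = np.linspace(0, 1, 200)
f_hat = grid ** 2 + 0.05 * rng.normal(size=grid.size)   # unconstrained estimator (toy)
lower, upper = monotone_envelopes(f_hat)
projected = 0.5 * (lower + upper)                       # a convex combination of the two
assert np.all(np.diff(projected) >= 0)
assert np.all(lower <= f_hat) and np.all(f_hat <= upper)
print(np.max(np.abs(projected - f_hat)))                # maximum change caused by monotonization
```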

18.
The theory of Bayesian robustness modeling uses heavy-tailed distributions to resolve conflicts of information by automatically rejecting the outlying information in favor of the other sources of information. In particular, the Student-t process is a natural alternative to the Gaussian process when the data might carry atypical information. Several works attest to the robustness of the Student-t process; however, these studies are mostly guided by intuition and focus on computational aspects rather than on the mathematical properties of the distributions involved. This work uses the theory of regular variation to address the robustness of the Student-t process in the context of nonlinear regression, that is, the behavior of the posterior distribution in the presence of outliers in the inputs, in the outputs, or in both sources of information. In all these cases, under certain conditions, the posterior distribution is shown to tend to a quantity that does not depend on the atypical information, and for every case the limiting posterior distribution as the outliers tend to infinity is provided. The impact of outliers on the predictive posterior distribution is also addressed. The theory is illustrated with a few simulated examples.

19.
Let f̂_n be the nonparametric maximum likelihood estimator of a decreasing density. Grenander characterized it as the left-continuous slope of the least concave majorant of the empirical distribution function. For a sample from the uniform distribution, the asymptotic distribution of the L2-distance of the Grenander estimator from the uniform density was derived in an article by Groeneboom and Pyke, using a representation of the Grenander estimator in terms of conditioned Poisson and gamma random variables. This representation was also used in an article by Groeneboom and Lopuhaä to prove a central limit result of Sparre Andersen on the number of jumps of the Grenander estimator. Here we extend this approach to a proof of the main result on the L2-distance of the Grenander estimator from the uniform density, and we also prove a similar asymptotic normality result for the entropy functional. Cauchy's formula and saddle point methods are the main tools in our development.
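As a reminder of what the Grenander estimator actually is, the sketch below computes it directly from its characterization: take the vertices of the empirical distribution function, build their least concave majorant with a single upper-convex-hull pass, and read off the slopes as the decreasing density estimate. The Exp(1) sample and the evaluation points are illustrative; none of the distributional results discussed in the abstract are reproduced here.

```python
import numpy as np

def grenander(sample):
    """Grenander estimator of a decreasing density on [0, inf): the left-continuous
    slope of the least concave majorant (LCM) of the empirical distribution function.
    Returns the LCM knots and the constant value of the estimator on each knot interval."""
    x = np.sort(np.asarray(sample, float))
    n = x.size
    pts = np.column_stack((np.concatenate(([0.0], x)),
                           np.arange(n + 1) / n))          # ECDF vertices, starting at (0, 0)
    hull = [pts[0]]
    for p in pts[1:]:
        # Pop while the previous hull segment's slope is no larger than the new one
        # (which would make the majorant locally convex instead of concave).
        while len(hull) > 1:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            if (y2 - y1) * (p[0] - x2) <= (p[1] - y2) * (x2 - x1):
                hull.pop()
            else:
                break
        hull.append(p)
    hull = np.array(hull)
    slopes = np.diff(hull[:, 1]) / np.diff(hull[:, 0])     # value of f_hat on each interval
    return hull[:, 0], slopes

# Demo: Exp(1) sample, whose density exp(-x) is decreasing.
rng = np.random.default_rng(10)
knots, fhat = grenander(rng.exponential(size=2_000))
assert np.all(np.diff(fhat) <= 0)                          # the estimate is non-increasing
xq = np.array([0.5, 1.0, 2.0])
print(fhat[np.searchsorted(knots, xq) - 1].round(3))       # estimate at a few points
print(np.exp(-xq).round(3))                                # true Exp(1) density there
```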

20.
Ordinal classification is an important area in statistical machine learning, where labels exhibit a natural order. One of the major goals in ordinal classification is to correctly predict the relative order of instances. We develop a novel concordance-based approach to ordinal classification, where a concordance function is introduced and a penalized smoothed method for optimization is designed. Variable selection using the L1 penalty is incorporated for sparsity considerations. Within the set of classification rules that maximize the concordance function, we find optimal thresholds to predict labels by minimizing a loss function. After building the classifier, we derive nonparametric estimation of class conditional probabilities. The asymptotic properties of the estimators as well as the variable selection consistency are established. Extensive simulations and real data applications show the robustness and advantage of the proposed method in terms of classification accuracy, compared with other existing methods.
