Similar Articles
20 similar articles found.
1.
We used capture-recapture methods to estimate bird species richness from mist-net and point-count data from a study area in Campeche, Mexico. We estimated species richness separately for each survey technique for two habitats, forest and pasture, in six sampling periods. We then estimated richness based on species' detections by either technique, and estimated the proportion of species detected by each technique that are not part of the population sampled by the other technique. No consistent differences existed between richness estimates from count data and from capture data in the two habitats. In some sampling periods, over 50% of the richness estimate from one survey technique may be species that are not sampled by the other technique, suggesting that one technique may not be adequate to estimate total species richness and that comparing estimates from areas sampled by different techniques may not be valid.
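The abstract does not state which richness estimator was used; as an illustrative sketch only, the classic Chao1 capture-recapture estimator (a standard choice in this literature, not necessarily the paper's) corrects observed richness upward using the counts of singleton and doubleton species:

```python
from collections import Counter

def chao1(counts):
    """Chao1 lower-bound estimate of total species richness.

    counts: list of per-species capture counts (zeros ignored).
    """
    counts = [c for c in counts if c > 0]
    s_obs = len(counts)
    freq = Counter(counts)
    f1 = freq.get(1, 0)   # singletons: species captured exactly once
    f2 = freq.get(2, 0)   # doubletons: species captured exactly twice
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    # bias-corrected form when no doubletons are observed
    return s_obs + f1 * (f1 - 1) / 2

# Ten observed species; 4 singletons and 2 doubletons.
counts = [1, 1, 1, 1, 2, 2, 3, 4, 5, 8]
print(chao1(counts))  # 10 + 16/4 = 14.0
```

Applying such an estimator separately to the mist-net and point-count data is what allows the per-technique richness estimates described above to be compared.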

2.
Feature selection arises in many areas of modern science. For example, in genomic research, we want to find the genes that can be used to separate tissues of different classes (e.g. cancer and normal). One approach is to fit regression/classification models with certain penalization. In the past decade, hyper-LASSO penalties (priors) have received increasing attention in the literature. However, fully Bayesian methods that use Markov chain Monte Carlo (MCMC) for regression/classification with hyper-LASSO priors remain underdeveloped. In this paper, we introduce an MCMC method for learning multinomial logistic regression with hyper-LASSO priors. Our MCMC algorithm uses Hamiltonian Monte Carlo in a restricted Gibbs sampling framework. We have used simulation studies and real data to demonstrate the superior performance of hyper-LASSO priors compared to LASSO, and to investigate the issues of choosing the heaviness and scale of hyper-LASSO priors.

3.
Based on a capture-recapture sample of size n + k, n ≥ 1, k ≥ 0, from a population with an unknown number of distinct species (or classes), the problem of estimating the total probability of the species unobserved in the first n selections is considered. As the estimand depends on both the unknown parameters and the data, the standard theory of estimation is inadequate for this problem. A suitable definition of sufficiency is introduced and used to prove a Rao-Blackwell-type result and to discuss uniformly minimum mean squared error unbiased estimation. An alternative proof for an inadmissibility result is presented. The new proof gives more insight and a method for deriving improved estimators. The theoretical developments may be useful in other problems concerning inferences about random parametric functions.
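For context, the best-known estimator of this estimand (the unseen probability mass) is Turing's rule, reported by Good: the fraction of the sample made up of species seen exactly once. A minimal sketch, given here for orientation rather than taken from the paper:

```python
def unseen_mass_turing(counts):
    """Turing's estimator of the total probability of the species
    not yet observed: (# species seen exactly once) / n."""
    n = sum(counts)
    singletons = sum(1 for c in counts if c == 1)
    return singletons / n

# 12 captures spread over 5 species; 2 species were seen only once.
print(unseen_mass_turing([5, 3, 2, 1, 1]))  # 2/12 ≈ 0.1667
```

The paper's contribution concerns the decision-theoretic treatment (sufficiency, Rao-Blackwellization, admissibility) of estimators for this random parametric function, not this particular point estimate.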

4.
ABSTRACT

We develop a new score-driven model for the joint dynamics of fat-tailed realized covariance matrix observations and daily returns. The score dynamics for the unobserved true covariance matrix are robust to outliers and incidental large observations in both types of data by assuming a matrix-F distribution for the realized covariance measures and a multivariate Student's t distribution for the daily returns. The filter for the unknown covariance matrix has a computationally efficient matrix formulation, which proves beneficial for estimation and simulation purposes. We formulate parameter restrictions for stationarity and positive definiteness. Our simulation study shows that the new model is able to deal with high-dimensional settings (50 or more dimensions) and captures unobserved volatility dynamics even if the model is misspecified. We provide an empirical application to daily equity returns and realized covariance matrices up to 30 dimensions. The model statistically and economically outperforms competing multivariate volatility models out-of-sample. Supplementary materials for this article are available online.

5.
Summary.  In the empirical literature on assortative matching using linked employer–employee data, unobserved worker quality appears to be negatively correlated with unobserved firm quality. We show that this can be caused by standard estimation error. We develop formulae showing that the estimated correlation is biased downwards when there is true positive assortative matching and the conditioning covariates are uncorrelated with the firm and worker fixed effects. We show that this bias is larger the fewer movers there are in the data, a phenomenon known as 'limited mobility bias'. This result applies to any two-way (or higher) error components model that is estimated by fixed effects methods. We apply these bias corrections to a large German linked employer–employee data set. We find that, although the biases can be considerable, they are not sufficiently large to remove the negative correlation entirely.

6.
Functional logistic regression is becoming more popular, as there are many situations where we are interested in the relation between functional covariates (as input) and a binary response (as output). Several approaches have been advocated, and this paper goes into detail about three of them: dimension reduction via functional principal component analysis, penalized functional regression, and wavelet expansions in combination with least absolute shrinkage and selection operator (LASSO) penalization. We discuss the performance of the three methods on simulated data and also apply the methods to data regarding lameness detection for horses. Emphasis is on classification performance, but we also discuss estimation of the unknown parameter function.

7.
We propose the marginalized lasso, a new nonconvex penalization for variable selection in regression problems. The marginalized lasso penalty is motivated by integrating out the penalty parameter in the original lasso penalty with a gamma prior distribution. This study provides a thresholding rule and a lasso-based iterative algorithm for parameter estimation in the marginalized lasso. We also provide a coordinate descent algorithm to efficiently optimize the marginalized lasso penalized regression. Numerical comparison studies demonstrate its competitiveness over existing sparsity-inducing penalizations and suggest guidelines for tuning parameter selection.
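The integration step can be made concrete. Assuming a Laplace prior on β with rate λ and a Gamma(a, b) mixing prior on λ (this a, b parameterization is illustrative, not necessarily the paper's), the marginal prior density works out in closed form to (a·bᵃ/2)(b + |β|)^−(a+1), so the induced penalty is logarithmic, (a + 1)·log(b + |β|) up to a constant. The sketch below checks the closed form against direct numerical integration:

```python
import math

def marginal_closed(beta, a, b):
    # closed form of ∫ Laplace(beta; rate=lam) * Gamma(lam; a, b) d lam
    return (a * b**a / 2) * (b + abs(beta)) ** -(a + 1)

def marginal_numeric(beta, a, b, upper=50.0, steps=100_000):
    # brute-force Riemann sum over lam in (0, upper)
    h = upper / steps
    total = 0.0
    for i in range(1, steps):
        lam = i * h
        laplace = 0.5 * lam * math.exp(-lam * abs(beta))
        gamma_pdf = b**a * lam**(a - 1) * math.exp(-b * lam) / math.gamma(a)
        total += laplace * gamma_pdf
    return total * h

a, b = 2.0, 1.0
print(marginal_closed(0.5, a, b))   # ≈ 0.2963
print(marginal_numeric(0.5, a, b))  # ≈ 0.2963 (agrees)
```

Taking the negative log of the closed form is what yields the nonconvex log-type penalty that the thresholding rule and coordinate descent algorithm then optimize.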

8.
We investigate the effect of unobserved heterogeneity in the context of the linear transformation model for censored survival data in the clinical trials setting. The unobserved heterogeneity is represented by a frailty term, with unknown distribution, in the linear transformation model. The bias of the estimate under the assumption of no unobserved heterogeneity when it truly is present is obtained. We also derive the asymptotic relative efficiency of the estimate of treatment effect under the incorrect assumption of no unobserved heterogeneity. Additionally, we investigate the loss of power for clinical trials that are designed assuming the model without frailty when, in fact, the model with frailty is true. Numerical studies under a proportional odds model show that the loss of efficiency and the loss of power can be substantial when the heterogeneity, as embodied by a frailty, is ignored. An erratum to this article is available online.

9.
Group folded concave penalization problems have been shown to possess the oracle property theoretically. However, it remains unknown whether the optimization algorithm for solving the resulting nonconvex problem can find such an oracle solution among multiple local solutions. In this paper, we extend the well-known local linear approximation (LLA) algorithm to solve the group folded concave penalization problem for linear models. We prove that, with the group LASSO estimator as the initial value, the two-step LLA solution converges to the oracle estimator with overwhelming probability, thus closing the theoretical gap. The results hold in a high-dimensional setting that allows the number of groups to grow exponentially, and the number of truly relevant groups and the maximum group size to grow polynomially. Numerical studies are also conducted to show the merits of the LLA procedure.
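A one-dimensional sketch of the two-step LLA idea, assuming a SCAD penalty as the folded concave penalty (the paper treats the general group case): each LLA step replaces the concave penalty by its tangent line at the current iterate, so the update is a soft-thresholding with weight equal to the penalty's derivative there; the lasso solution serves as the initializer.

```python
def scad_deriv(t, lam, a=3.7):
    """Derivative of the SCAD penalty (a = 3.7 is the conventional choice)."""
    if t <= lam:
        return lam
    return max(a * lam - t, 0.0) / (a - 1)

def soft_threshold(z, w):
    if abs(z) <= w:
        return 0.0
    return z - w if z > 0 else z + w

def two_step_lla(z, lam):
    """Two-step LLA for min_b 0.5*(z - b)^2 + SCAD(|b|), one dimension."""
    b = soft_threshold(z, lam)                 # step 0: lasso initializer
    for _ in range(2):                         # two LLA refinement steps
        b = soft_threshold(z, scad_deriv(abs(b), lam))
    return b

# A large signal is left unshrunk (oracle-like); a small one is zeroed out.
print(two_step_lla(3.0, 0.5))  # 3.0
print(two_step_lla(0.3, 0.5))  # 0.0
```

This is exactly the behavior behind the oracle claim: after two steps the penalty weight on a strong signal vanishes (no shrinkage bias), while weak signals stay at zero.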

10.
Summary.  We develop Markov chain Monte Carlo methodology for Bayesian inference for non-Gaussian Ornstein–Uhlenbeck stochastic volatility processes. The approach introduced involves expressing the unobserved stochastic volatility process in terms of a suitable marked Poisson process. We introduce two specific classes of Metropolis–Hastings algorithms which correspond to different ways of jointly parameterizing the marked point process and the model parameters. The performance of the methods is investigated for different types of simulated data. The approach is extended to consider the case where the volatility process is expressed as a superposition of Ornstein–Uhlenbeck processes. We apply our methodology to the US dollar–Deutschmark exchange rate.

11.
Abstract.  We propose a new method for fitting proportional hazards models with error-prone covariates. Regression coefficients are estimated by solving an estimating equation that is the average of the partial likelihood scores based on imputed true covariates. For the purpose of imputation, a linear spline model is assumed on the baseline hazard. We discuss consistency and asymptotic normality of the resulting estimators, and propose a stochastic approximation scheme to obtain the estimates. The algorithm is easy to implement, and reduces to the ordinary Cox partial likelihood approach when the measurement error has a degenerate distribution. Simulations indicate high efficiency and robustness. We consider the special case where error-prone replicates are available on the unobserved true covariates. As expected, increasing the number of replicates for the unobserved covariates increases efficiency and reduces bias. We illustrate the practical utility of the proposed method with an Eastern Cooperative Oncology Group clinical trial where a genetic marker, c-myc expression level, is subject to measurement error.

12.
We consider robust Bayesian prediction of a function of unobserved data based on observed data under an asymmetric loss function. Under a general linear-exponential posterior risk function, the posterior regret gamma-minimax (PRGM), conditional gamma-minimax (CGM), and most stable (MS) predictors are obtained when the prior distribution belongs to a general class of prior distributions. We use this general form to find the PRGM, CGM, and MS predictors of a general linear combination of the finite population values under LINEX loss function on the basis of two classes of priors in a normal model. Also, under the general ε-contamination class of prior distributions, the PRGM predictor of a general linear combination of the finite population values is obtained. Finally, we provide a real-life example to predict a finite population mean and compare the estimated risk and risk bias of the obtained predictors under the LINEX loss function by a simulation study.
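As background for the LINEX loss used here: under L(d, θ) = e^{a(d−θ)} − a(d−θ) − 1, the Bayes predictor is d* = −(1/a)·log E[e^{−aθ} | data], which for a normal posterior N(μ, σ²) reduces to μ − aσ²/2. A small Monte Carlo check of this standard fact (the μ, σ, a values are arbitrary illustrations, unrelated to the paper's example):

```python
import math, random

random.seed(1)
mu, sigma, a = 2.0, 1.5, 0.8
draws = [random.gauss(mu, sigma) for _ in range(200_000)]

# Bayes rule under LINEX: d* = -(1/a) * log E[exp(-a*theta)]
m = sum(math.exp(-a * t) for t in draws) / len(draws)
d_mc = -math.log(m) / a

d_closed = mu - a * sigma ** 2 / 2   # normal-posterior closed form
print(round(d_closed, 2))            # 1.1
print(round(d_mc, 2))                # close to 1.1
```

The asymmetry is visible in the result: with a > 0 the predictor sits below the posterior mean, penalizing over-prediction more heavily than under-prediction.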

13.
Multivariate mixture regression models can be used to investigate the relationships between two or more response variables and a set of predictor variables by taking into consideration unobserved population heterogeneity. It is common to take multivariate normal distributions as mixing components, but this mixing model is sensitive to heavy-tailed errors and outliers. Although normal mixture models can approximate any distribution in principle, the number of components needed to account for heavy-tailed distributions can be very large. Mixture regression models based on the multivariate t distributions can be considered as a robust alternative approach. Missing data are inevitable in many situations and parameter estimates could be biased if the missing values are not handled properly. In this paper, we propose a multivariate t mixture regression model with missing information to model heterogeneity in regression function in the presence of outliers and missing values. Along with the robust parameter estimation, our proposed method can be used for (i) visualization of the partial correlation between response variables across latent classes and heterogeneous regressions, and (ii) outlier detection and robust clustering even under the presence of missing values. We also propose a multivariate t mixture regression model using MM-estimation with missing information that is robust to high-leverage outliers. The proposed methodologies are illustrated through simulation studies and real data analysis.

14.
We consider optimal sample designs for observing classes of objects. Suppose we will take a simple random sample of equal-sized sectors from a study population and observe the classes existing on these sectors. The classes might be many different things, for example, herbaceous plant species (in sampling quadrats), microinvertebrate species (in sampling cores), and side effects from a drug (in conducting medical trials). Under a nonparametric mixture model and data from a previous related study, we first estimate the optimal number of sample sectors of a given size. Then for negative binomial dispersions of individuals with a common aggregation parameter k, we consider the optimal size as well as number of sample sectors. A simple test exists to check our common k assumption and our optimal size method requires far less data than would be required by a grid method or other method which utilizes data from sample sectors of several different sizes.

15.
A method is proposed for estimating regression parameters from data containing covariate measurement errors by using Stein estimates of the unobserved true covariates. The method produces consistent estimates for the slope parameter in the classical linear errors-in-variables model and applies to a broad range of nonlinear regression problems, provided the measurement error is Gaussian with known variance. Simulations are used to examine the performance of the estimates in a nonlinear regression problem and to compare them with the usual naive ones obtained by ignoring error and with other estimates proposed recently in the literature.
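To see why such a correction matters, consider the classical linear errors-in-variables model. The naive slope is attenuated by the reliability ratio; shrinking the observed covariate toward its mean by that ratio (a regression-calibration-style correction in the same spirit as the Stein-estimate idea above, not the paper's exact estimator) recovers the true slope. A simulation sketch, assuming known measurement-error variance as in the paper:

```python
import random

random.seed(0)
n = 50_000
sigma_u = 1.0                      # known measurement-error s.d.
beta0, beta1 = 1.0, 2.0

x = [random.gauss(0.0, 1.0) for _ in range(n)]            # true covariate
w = [xi + random.gauss(0.0, sigma_u) for xi in x]         # error-prone version
y = [beta0 + beta1 * xi + random.gauss(0.0, 0.5) for xi in x]

def slope(u, v):
    mu = sum(u) / len(u); mv = sum(v) / len(v)
    sxy = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
    sxx = sum((ui - mu) ** 2 for ui in u)
    return sxy / sxx

naive = slope(w, y)                # attenuated toward zero

# shrink W toward its mean by the estimated reliability ratio
mw = sum(w) / n
var_w = sum((wi - mw) ** 2 for wi in w) / (n - 1)
lam = (var_w - sigma_u ** 2) / var_w
x_hat = [mw + lam * (wi - mw) for wi in w]

corrected = slope(x_hat, y)
print(round(naive, 2), round(corrected, 2))  # roughly 1.0 and 2.0
```

With Var(x) = Var(u) = 1 the reliability ratio is 1/2, so the naive slope lands near β₁/2 = 1.0 while the corrected slope recovers β₁ = 2.0.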

16.
Summary.  When a treatment has a positive average causal effect (ACE) on an intermediate variable or surrogate end point which in turn has a positive ACE on a true end point, the treatment may have a negative ACE on the true end point due to the presence of unobserved confounders, which is called the surrogate paradox. A criterion for surrogate end points based on ACEs has recently been proposed to avoid the surrogate paradox. For a continuous or ordinal discrete end point, the distributional causal effect (DCE) may be a more appropriate measure for a causal effect than the ACE. We discuss criteria for surrogate end points based on DCEs. We show that commonly used models, such as generalized linear models and Cox proportional hazards models, can make the sign of the DCE of the treatment on the true end point determinable by the sign of the DCE of the treatment on the surrogate even if the models include unobserved confounders. Furthermore, for a general distribution without any assumption of parametric models, we give a sufficient condition for a distributionally consistent surrogate and prove that it is almost necessary.

17.
Practical questions motivate the search for predictors either of an as yet unobserved random vector, or of a random function of a parameter. An extension of the classical UMVUE theory is presented to cover such situations. It includes a Rao-Blackwell-type theorem, a Cramér-Rao-type inequality, and necessary and sufficient conditions for a predictor to minimize the mean squared error uniformly in the parameter. Applications are considered to the problem of selected means, the species problem, and the examination of some u-v estimates of Robbins (1988).

18.
Measurement error models constitute a wide class of models that include linear and nonlinear regression models. They are very useful to model many real-life phenomena, particularly in the medical and biological areas. The great advantage of these models is that, in some sense, they can be represented as mixed effects models, allowing us to implement well-known techniques, such as the EM algorithm for parameter estimation. In this paper, we consider a class of multivariate measurement error models where the observed response and/or covariate are not fully observed, i.e., the observations are subject to certain threshold values below or above which the measurements are not quantifiable. Consequently, these observations are considered censored. We assume a Student-t distribution for the unobserved true values of the mismeasured covariate and the error term of the model, providing a robust alternative for parameter estimation. Our approach relies on a likelihood-based inference using an EM-type algorithm. The proposed method is illustrated through some simulation studies and the analysis of an AIDS clinical trial dataset.

19.
Missing observations due to non‐response are commonly encountered in data collected from sample surveys. The focus of this article is on item non‐response which is often handled by filling in (or imputing) missing values using the observed responses (donors). Random imputation (single or fractional) is used within homogeneous imputation classes that are formed on the basis of categorical auxiliary variables observed on all the sampled units. A uniform response rate within classes is assumed, but that rate is allowed to vary across classes. We construct confidence intervals (CIs) for a population parameter that is defined as the solution to a smooth estimating equation with data collected using stratified simple random sampling. The imputation classes are assumed to be formed across strata. Fractional imputation with a fixed number of random draws is used to obtain an imputed estimating function. An empirical likelihood inference method under the fractional imputation is proposed and its asymptotic properties are derived. Two asymptotically correct bootstrap methods are developed for constructing the desired CIs. In a simulation study, the proposed bootstrap methods are shown to outperform traditional bootstrap methods and some non‐bootstrap competitors under various simulation settings. The Canadian Journal of Statistics 47: 281–301; 2019 © 2019 Statistical Society of Canada
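The "fractional imputation with a fixed number of random draws" step can be sketched as follows: within an imputation class, each nonrespondent receives M randomly drawn donor values, each carrying weight 1/M, so the imputed estimating function averages over the draws. This is a minimal hot-deck illustration for a class mean (the donor values and M are made up; the paper's setting involves stratified sampling and an estimating equation):

```python
import random

random.seed(42)
M = 3   # fixed number of random draws per missing value

# one imputation class: observed respondents (donors) and missing slots
donors = [4.0, 6.0, 5.0, 7.0]
n_missing = 2

# fractional imputation: M donor draws per nonrespondent, weight 1/M each
imputed = [[random.choice(donors) for _ in range(M)] for _ in range(n_missing)]

# imputed estimator of the class mean: respondents count with weight 1,
# each fractional draw counts with weight 1/M
total = sum(donors) + sum(sum(d) / M for d in imputed)
mean_hat = total / (len(donors) + n_missing)
print(round(mean_hat, 3))
```

Compared with single random imputation, spreading each missing value over M weighted draws reduces the extra variability the imputation itself injects into the estimator.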

20.
We consider the problem of binary-image restoration. The image being restored is not random, and we make no assumption about the nature of its contents. The estimate of the colour at each site is a fixed (the same for all sites) function of the data available in a neighbourhood of that site. Under this restriction, the estimate minimizing the overall mean squared error of prediction is the conditional expectation of the true colour given the observations in the neighbourhood of a site. The computation of this conditional expectation leads to the formal definition of the local characteristics of an image, namely, the frequency with which each pattern appears in the true unobserved image. When the “true” distribution of the patterns is unknown, it can be estimated from the records. The conditional expectation described above can then be evaluated using the estimated distribution of the patterns, and this procedure leads to a very natural estimate of the colour at each site. We propose two unbiased and consistent estimates for the distribution of patterns when the noise is a Gaussian white noise. Since the size of realistic images is very large, the estimated pattern distribution is usually close to the true one. This suggests that the estimated conditional expectation can be expected to be nearly optimal. An interesting feature of the proposed restoration methods is that they do not require prior knowledge of the local or global properties of the true underlying image. Several examples based on synthetic images show that the new methods perform fairly well for a variety of images with different degrees of colour continuity or textures.  
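The "local characteristics" idea can be sketched concretely: tabulate, for each neighbourhood pattern, the empirical frequency with which the centre site is foreground, and use that as the estimated conditional expectation of the centre colour. For simplicity this toy version reads the pattern frequencies off a clean image, whereas the paper estimates them from the noisy records:

```python
from collections import defaultdict

# tiny binary "image" (list of rows); 1 = foreground
img = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

# local characteristics: for each 4-neighbour pattern, the empirical
# frequency with which the centre pixel is 1
counts = defaultdict(lambda: [0, 0])   # pattern -> [times seen, centre ones]
h, w = len(img), len(img[0])
for i in range(1, h - 1):
    for j in range(1, w - 1):
        pat = (img[i-1][j], img[i+1][j], img[i][j-1], img[i][j+1])
        counts[pat][0] += 1
        counts[pat][1] += img[i][j]

def centre_given_pattern(pat):
    """Estimated conditional expectation of the centre colour."""
    n, ones = counts[pat]
    return ones / n if n else 0.5      # fall back to 0.5 for unseen patterns

print(centre_given_pattern((1, 1, 1, 1)))  # 1.0: fully surrounded -> foreground
```

Restoration then assigns each site the colour whose estimated conditional expectation it is closest to, given the observed neighbourhood.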
