Similar articles (20 results)
1.
In recent years, a number of statistical models have been proposed for high-level image analysis tasks such as object recognition. However, in general, these models remain hard to use in practice, partly as a result of their complexity, partly through lack of software. In this paper we concentrate on a particular deformable template model which has proved potentially useful for locating and labelling cells in microscope slides (Rue and Hurn, 1999). This model requires the specification of a number of rather non-intuitive parameters which control the shape variability of the deformed templates. Our goal is to arrange the estimation of these parameters in such a way that the microscope user's expertise is exploited to provide the necessary training data graphically, by identifying a number of cells displayed on a computer screen, but that no additional statistical input is required. We use maximum likelihood estimation incorporating the error structure in the generation of our training data.

2.
Whereas there are many references on univariate boundary kernels, the construction of boundary kernels for multivariate density and curve estimation has not been investigated in detail. The use of multivariate boundary kernels ensures global consistency of multivariate kernel estimates as measured by the integrated mean-squared error or sup-norm deviation for functions with compact support. We develop a class of boundary kernels which work for any support, regardless of the complexity of its boundary. Our construction yields a boundary kernel for each point in the boundary region where the function is to be estimated. These boundary kernels provide a natural continuation of non-negative kernels used in the interior onto the boundary. They are obtained as solutions of the same kernel-generating variational problem which also produces the kernel function used in the interior as its solution. We discuss the numerical implementation of the proposed boundary kernels and their relationship to locally weighted least squares. Along the way we establish a continuous least squares principle and a continuous analogue of the Gauss–Markov theorem.

3.
A new flexible cure rate survival model is developed where the initial number of competing causes of the event of interest (say lesions or altered cells) follows a compound negative binomial (NB) distribution. This model provides a realistic interpretation of the biological mechanism of the event of interest, as it models a destructive process of the initial competing risk factors and records only the damaged portion of the original number of risk factors. It also accounts for the underlying mechanisms that lead to cure through various latent activation schemes. Our method of estimation exploits maximum likelihood (ML) tools. The methodology is illustrated on a real data set on malignant melanoma, and the finite sample behavior of the parameter estimates is explored through simulation studies.

4.
We propose a hierarchical Bayesian model for analyzing gene expression data to identify pathways differentiating between two biological states (e.g., cancer vs. non-cancer and mutant vs. normal). Finding significant pathways can improve our understanding of biological processes. When the biological process of interest is related to a specific disease, eliciting a better understanding of the underlying pathways can lead to designing a more effective treatment. We apply our method to data obtained by interrogating the mutational status of p53 in 50 cancer cell lines (33 mutated and 17 normal). We identify several significant pathways with strong biological connections. We show that our approach provides a natural framework for incorporating prior biological information, and it has the best overall performance in terms of correctly identifying significant pathways compared to several alternative methods.

5.
We propose a flexible prior model for the parameters of binary Markov random fields (MRF), defined on rectangular lattices and with maximal cliques defined from a template maximal clique. The prior model allows higher-order interactions to be included. We also define a reversible jump Markov chain Monte Carlo algorithm to sample from the associated posterior distribution. The number of possible parameters for a higher-order MRF becomes high, even for small template maximal cliques. We define a flexible parametric form where the parameters have interpretation as potentials for clique configurations, and limit the effective number of parameters by assigning a priori discrete probabilities for events where groups of parameter values are equal. To cope with the computationally intractable normalising constant of MRFs, we adopt a previously defined approximation of binary MRFs. We demonstrate the flexibility of our prior formulation with simulated and real data examples.

6.
For nearly any challenging scientific problem, evaluation of the likelihood is problematic if not impossible. Approximate Bayesian computation (ABC) allows us to employ the whole Bayesian formalism for problems where we can simulate from a model but cannot evaluate the likelihood directly. When summary statistics of real and simulated data are compared, rather than the data directly, information is lost unless the summary statistics are sufficient. Sufficient statistics are, however, rarely available, and without them ABC inferences must be interpreted with caution. Previously, other authors have attempted to combine different statistics in order to construct (approximately) sufficient statistics using search and information heuristics. Here we employ an information-theoretical framework that can be used to construct appropriate (approximately sufficient) statistics by combining different statistics until the loss of information is minimized. We start from a potentially large number of different statistics and choose the smallest set that captures (nearly) the same information as the complete set. We then demonstrate that such sets of statistics can be constructed for both parameter estimation and model selection problems, and we apply our approach to a range of illustrative and real-world model selection problems.
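
A rough illustration of the idea, not the authors' implementation: rejection ABC with a chosen set of summary statistics, plus a greedy forward pass that keeps a candidate statistic only if it changes the approximate posterior appreciably. The toy model, the candidate statistics, the unscaled Euclidean distance and the change criterion (shift in posterior mean) are all simplifying assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    def simulate(theta, n=100):
        # toy model: i.i.d. normal data with unknown mean theta (assumption for illustration)
        return rng.normal(theta, 1.0, size=n)

    # candidate summary statistics (illustrative choices)
    candidates = {
        "mean":   lambda x: x.mean(),
        "median": lambda x: np.median(x),
        "sd":     lambda x: x.std(),
        "max":    lambda x: x.max(),
    }

    def abc_posterior(observed, stat_names, n_sim=2000, quantile=0.01):
        """Rejection ABC: keep prior draws whose chosen summaries are closest to the observed ones."""
        s_obs = np.array([candidates[k](observed) for k in stat_names])
        thetas = rng.uniform(-5, 5, size=n_sim)          # prior draws
        sims = np.array([[candidates[k](simulate(t)) for k in stat_names] for t in thetas])
        dist = np.linalg.norm(sims - s_obs, axis=1)      # statistics are not rescaled, for simplicity
        keep = dist <= np.quantile(dist, quantile)
        return thetas[keep]

    observed = simulate(1.5)
    selected, previous = [], None
    for name in candidates:                              # greedy forward selection
        trial = selected + [name]
        post = abc_posterior(observed, trial)
        if previous is None or abs(post.mean() - previous.mean()) > 0.05:
            selected, previous = trial, post             # the statistic still adds information
    print("selected statistics:", selected)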

7.
The emerging field of cancer radiomics endeavors to characterize intrinsic patterns of tumor phenotypes and surrogate markers of response by transforming medical images into objects that yield quantifiable summary statistics to which regression and machine learning algorithms may be applied for statistical interrogation. Recent literature has identified clinicopathological associations based on textural features derived from gray-level co-occurrence matrices (GLCM), which facilitate evaluations of gray-level spatial dependence within a delineated region of interest. GLCM-derived features, however, tend to contribute highly redundant information. Moreover, when reporting selected feature sets, investigators often fail to adjust for multiplicities and commonly fail to convey the predictive power of their findings. This article presents a Bayesian probabilistic modeling framework for the GLCM as a multivariate object and describes its application within a cancer detection context based on computed tomography. The methodology, which circumvents processing steps and avoids evaluations of reductive and highly correlated feature sets, uses latent Gaussian Markov random field structure to characterize spatial dependencies among GLCM cells and facilitates classification via predictive probability. Correctly predicting the underlying pathology of 81% of the adrenal lesions in our case study, the proposed method outperformed current practices, which achieved a maximum accuracy of only 59%. Simulations and theory are presented to further elucidate this comparison as well as ascertain the utility of applying multivariate Gaussian spatial processes to GLCM objects.
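
For readers unfamiliar with the object being modelled, the sketch below builds a gray-level co-occurrence matrix for a single pixel offset with plain NumPy; the quantization level and offset are arbitrary choices, and the Bayesian spatial model described above is not reproduced here.

    import numpy as np

    def glcm(image, levels=8, offset=(0, 1)):
        """Gray-level co-occurrence matrix for one (row, col) offset with non-negative components."""
        # quantize the image to `levels` gray levels
        q = np.floor(levels * (image - image.min()) / (image.max() - image.min() + 1e-12)).astype(int)
        q = np.clip(q, 0, levels - 1)
        dr, dc = offset
        ref = q[:q.shape[0] - dr, :q.shape[1] - dc]      # reference pixels
        nbr = q[dr:, dc:]                                # neighbouring pixels at the given offset
        mat = np.zeros((levels, levels))
        np.add.at(mat, (ref.ravel(), nbr.ravel()), 1)    # count co-occurring gray-level pairs
        return mat / mat.sum()                           # normalize to joint probabilities

    img = np.random.default_rng(1).random((64, 64))      # placeholder image; in practice, a region of interest
    P = glcm(img)
    contrast = sum(P[i, j] * (i - j) ** 2 for i in range(8) for j in range(8))  # a classical GLCM texture feature
    print(round(contrast, 4))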

8.
In this article, we consider the destructive length-biased Poisson cure rate model, proposed by Rodrigues et al., which presents a realistic and interesting interpretation of the biological mechanism for the recurrence of tumor in a competing causes scenario. Assuming the lifetime to follow the Weibull distribution and the censoring mechanism to be non-informative, the necessary steps of the EM algorithm for the determination of the MLEs of the model parameters are developed here based on right-censored data. The standard errors of the MLEs are obtained by inverting the observed information matrix. A simulation study is then carried out to examine the method of inference developed here. Finally, the proposed methodology is illustrated with a real melanoma dataset.
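
The cure rate model itself is involved, but the final inferential step, inverting the observed information matrix to obtain standard errors, is generic. The sketch below illustrates it for a plain right-censored Weibull likelihood (not the destructive length-biased Poisson cure model), with a finite-difference Hessian standing in for the analytic information matrix used in the article.

    import numpy as np
    from scipy.optimize import minimize

    def neg_loglik(params, t, delta):
        """Right-censored Weibull negative log-likelihood; params = (log shape, log scale)."""
        k, lam = np.exp(params)
        log_h = np.log(k) - np.log(lam) + (k - 1) * (np.log(t) - np.log(lam))  # log hazard
        H = (t / lam) ** k                                                     # cumulative hazard
        return -(np.sum(delta * log_h) - np.sum(H))

    def observed_information(fun, est, args, eps=1e-4):
        """Numerical Hessian of the negative log-likelihood at the MLE."""
        p = len(est)
        hess = np.zeros((p, p))
        for i in range(p):
            for j in range(p):
                e_i, e_j = np.eye(p)[i] * eps, np.eye(p)[j] * eps
                hess[i, j] = (fun(est + e_i + e_j, *args) - fun(est + e_i - e_j, *args)
                              - fun(est - e_i + e_j, *args) + fun(est - e_i - e_j, *args)) / (4 * eps ** 2)
        return hess

    rng = np.random.default_rng(0)
    t_true = rng.weibull(1.5, 200) * 2.0                     # simulated lifetimes (illustrative)
    c = rng.exponential(3.0, 200)                            # non-informative censoring times
    t, delta = np.minimum(t_true, c), (t_true <= c).astype(float)

    fit = minimize(neg_loglik, x0=np.zeros(2), args=(t, delta), method="BFGS")
    se_log = np.sqrt(np.diag(np.linalg.inv(observed_information(neg_loglik, fit.x, (t, delta)))))
    print("MLE (shape, scale):", np.exp(fit.x), " SE on log scale:", se_log)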

9.
In biomedical studies, it is of substantial interest to develop risk prediction scores using high-dimensional data such as gene expression data for clinical endpoints that are subject to censoring. In the presence of well-established clinical risk factors, investigators often prefer a procedure that also adjusts for these clinical variables. While accelerated failure time (AFT) models are a useful tool for the analysis of censored outcome data, they assume that covariate effects on the logarithm of time-to-event are linear, which is often unrealistic in practice. We propose to build risk prediction scores through regularized rank estimation in partly linear AFT models, where high-dimensional data such as gene expression data are modeled linearly and important clinical variables are modeled nonlinearly using penalized regression splines. We show through simulation studies that our model has better operating characteristics compared to several existing models. In particular, we show that there is a non-negligible effect on prediction as well as feature selection when nonlinear clinical effects are misspecified as linear. This work is motivated by a recent prostate cancer study, where investigators collected gene expression data along with established prognostic clinical variables and the primary endpoint is time to prostate cancer recurrence. We analyzed the prostate cancer data and evaluated the prediction performance of several models based on the extended c statistic for censored data, showing that (1) the relationship between the clinical variable, prostate-specific antigen (PSA), and prostate cancer recurrence is likely nonlinear, i.e., the time to recurrence decreases as PSA increases and starts to level off when PSA becomes greater than 11; (2) correct specification of this nonlinear effect improves performance in prediction and feature selection; and (3) addition of gene expression data does not seem to further improve the performance of the resultant risk prediction scores.
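
The rank-based AFT machinery is beyond a short sketch, but the ingredient that drives the comparison above, modelling a clinical covariate nonlinearly with a penalized regression spline rather than a straight line, can be shown compactly. The truncated-power basis, the knot placement, the simulated outcome and the fixed penalty are simplifying assumptions; the article combines this component with regularized rank estimation for the high-dimensional linear part.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 300
    psa = rng.uniform(1, 20, n)                                    # clinical covariate (e.g., PSA)
    y = np.where(psa < 11, -0.15 * psa, -0.15 * 11) + rng.normal(0, 0.3, n)  # effect that levels off (simulated)

    # penalized regression spline: truncated-power basis with a ridge penalty on the knot terms
    knots = np.quantile(psa, np.linspace(0.1, 0.9, 8))
    B = np.column_stack([np.ones(n), psa] + [np.maximum(psa - k, 0.0) for k in knots])
    penalty = np.diag([0.0, 0.0] + [1.0] * len(knots))             # do not penalize intercept and linear term
    lam = 1.0                                                      # fixed smoothing parameter (assumption)
    coef = np.linalg.solve(B.T @ B + lam * penalty, B.T @ y)

    grid = np.linspace(1, 20, 100)
    Bg = np.column_stack([np.ones(100), grid] + [np.maximum(grid - k, 0.0) for k in knots])
    fitted = Bg @ coef                                             # estimated nonlinear effect of the covariate
    print(np.round(fitted[::20], 3))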

10.
Summary: Job creation and destruction should be considered key success or failure criteria of economic policy. Job creation and destruction are both effects of economic policy, the degree of out- and in-sourcing, and the ability to create new ideas that can be transformed into jobs. Job creation and destruction are results of businesses attempting to maximize their economic outcome. One of the costs of this process is that employees have to move from destroyed jobs to created jobs. The development of this process probably depends on labor protection laws, habits, the educational system, and the UI system as a whole. A flexible labor market ensures that scarce labor resources are used where they are most in demand. Thus, labor turnover is an essential factor in a well-functioning economy. This paper uses employer-employee data from the Danish registers of persons and workplaces to show where jobs have been destroyed and where they have been created over the last couple of business cycles. Jobs are in general destroyed and created simultaneously within each industry, but at the same time a major restructuring has taken place, so that jobs have been lost in Textile and Clothing, Manufacturing and the other “old industries”, while jobs have been created in Trade and Service industries. Out-sourcing has been one of the causes. This restructuring has put tremendous pressure on workers and their ability to find employment in expanding sectors. The paper shows how this has been accomplished, and in particular what has happened to the employees involved: have they become unemployed, found employment in the welfare sector, or moved elsewhere? * A first draft of this paper was presented at the Deutsche Statistische Woche, Frankfurt, September 2004. Thanks to two referees for instructive comments. Financial support from The Danish Social Science Research Council through CCP is acknowledged.

11.
We consider a class of finite state, two-dimensional Markov chains which can produce a rich variety of patterns and whose simulation is very fast. A parameterization is chosen to make the process nearly spatially homogeneous. We use a form of pseudo-likelihood estimation which allows estimates to be determined quickly. Parameters associated with boundary cells are estimated separately. We derive the asymptotic distribution of the maximum pseudo-likelihood estimates and show that the usual form of the variance matrix has to be modified to take account of local dependence. Standard error calculations based on the modified asymptotic variance are supported by a simulation study. The procedure is applied to an eight-state permeability pattern from a section of hydrocarbon reservoir rock.
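
The article treats an eight-state unilateral Markov chain, but the estimation idea can be illustrated on the simplest binary case: for an Ising-type field, the full conditional of each cell is logistic in the sum of its neighbours, so maximum pseudo-likelihood estimation reduces to a logistic-type fit of each cell on its neighbour sum. Everything below (binary field, four-neighbour structure, interior cells only, a placeholder random field) is a simplifying assumption for illustration.

    import numpy as np
    from scipy.optimize import minimize

    def neighbour_sum(x):
        """Sum of the four nearest neighbours, computed for interior cells only."""
        return x[:-2, 1:-1] + x[2:, 1:-1] + x[1:-1, :-2] + x[1:-1, 2:]

    def neg_pseudo_loglik(params, x):
        """Negative log pseudo-likelihood for a binary (0/1) Ising-type field."""
        alpha, beta = params
        s = neighbour_sum(x)
        eta = alpha + beta * s                           # logit of P(cell = 1 | neighbours)
        y = x[1:-1, 1:-1]
        return -np.sum(y * eta - np.log1p(np.exp(eta)))

    rng = np.random.default_rng(0)
    x = (rng.random((100, 100)) < 0.5).astype(float)     # placeholder field; in practice, the observed pattern
    fit = minimize(neg_pseudo_loglik, x0=np.zeros(2), args=(x,), method="BFGS")
    print("pseudo-likelihood estimates (alpha, beta):", np.round(fit.x, 3))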

12.
In this article, we assess hierarchical Bayesian modeling for the analysis of multiple exposures and highly correlated effects in a multilevel setting. We apply our method to an artificial data set and show the gains in the final estimates of the key parameters. As a motivating example for simulating the data, we consider a real prospective cohort study designed to investigate the association of dietary exposures with the occurrence of colorectal cancer in a multilevel framework, where, e.g., individuals have been enrolled from different countries or cities. We rely on additional information that is suitable to mediate the final effects of the exposures and can be arranged in a level-2 regression to model similarities among the parameters of interest (e.g., data on the nutrient composition of each dietary item).

13.
Loddon Mallee Integrated Cancer Service plays a key role in planning the delivery of cancer services in the Loddon Mallee Region of Victoria, Australia. Forecasting the incidence of cancer is an important part of planning for these services. This article is written from an industry perspective. We describe the context of our work, review the literature on forecasting the incidence of cancer, discuss contemporary approaches, describe our experience with forecasting models, and list issues associated with applying these models. An extensive bibliography illustrates the world-wide interest in this forecasting problem. We hope that it is useful to researchers.

14.
Central limit theorems play an important role in the study of statistical inference for stochastic processes. However, when the non-parametric local polynomial threshold estimator, especially in the local linear case, is employed to estimate the diffusion coefficients of diffusion processes, the adaptive and predictable structure of the estimator conditional on the σ-field generated by the diffusion process is destroyed, so the classical central limit theorem for martingale difference sequences cannot be applied. In the high-frequency data setting, we prove central limit theorems for local polynomial threshold estimators of the volatility function in diffusion processes with jumps using Jacod's stable convergence theorem. We believe that our proof procedure for local polynomial threshold estimators provides a new method in this field, especially in the local linear case.
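
A minimal sketch of the local-constant (degree-zero) version of the threshold idea on simulated high-frequency data: squared increments larger than a threshold are treated as jump contributions and discarded, and the remaining ones are averaged in a local window to estimate spot volatility. The threshold rule, bandwidth and simulated dynamics are ad hoc choices, not those analysed in the article.

    import numpy as np

    rng = np.random.default_rng(0)
    n, T = 10_000, 1.0
    dt = T / n
    t = np.linspace(0, T, n + 1)

    sigma = 0.2 + 0.1 * np.sin(2 * np.pi * t[:-1])             # true spot volatility (illustrative)
    jumps = rng.binomial(1, 5 * dt, n) * rng.normal(0, 0.5, n) # rare jumps
    dx = sigma * np.sqrt(dt) * rng.normal(size=n) + jumps      # increments of the jump-diffusion

    scale = np.median(np.abs(dx)) / 0.6745                     # robust size of a typical diffusion increment
    threshold = 4 * scale                                      # ad hoc truncation level
    keep = np.abs(dx) <= threshold                             # increments attributed to the diffusion part

    def spot_vol(i, half_window=200):
        """Local-constant threshold estimate of sigma^2 at grid point i."""
        lo, hi = max(0, i - half_window), min(n, i + half_window)
        w = keep[lo:hi]
        return np.sum(dx[lo:hi][w] ** 2) / (w.sum() * dt)

    est = np.sqrt([spot_vol(i) for i in range(0, n, 500)])
    print(np.round(est, 3))                                    # compare with the true sigma on the same grid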

15.
Summary.  The fundamental equations that model turbulent flow do not provide much insight into the size and shape of observed turbulent structures. We investigate the efficient and accurate representation of structures in two-dimensional turbulence by applying statistical models directly to the simulated vorticity field. Rather than extract the coherent portion of the image from the background variation, as in the classical signal-plus-noise model, we present a model for individual vortices using the non-decimated discrete wavelet transform. A template image, which is supplied by the user, provides the features to be extracted from the vorticity field. By transforming the vortex template into the wavelet domain, specific characteristics that are present in the template, such as size and symmetry, are broken down into components that are associated with spatial frequencies. Multivariate multiple linear regression is used to fit the vortex template to the vorticity field in the wavelet domain. Since all levels of the template decomposition may be used to model each level in the field decomposition, the resulting model need not be identical to the template. Application to a vortex census algorithm that records quantities of interest (such as size, peak amplitude and circulation) as the vorticity field evolves is given. The multiresolution census algorithm extracts coherent structures of all shapes and sizes in simulated vorticity fields and can reproduce known physical scaling laws when processing a set of vorticity fields that evolve over time.

16.
Abstract

Because budgetary pressures on academic libraries are nearly universal, it is imperative to evaluate journal packages regularly. This article presents an overview of the data and methods that the NC State University Libraries traditionally uses to evaluate journal packages and presents additional methods to expand our evaluation of publishing and editorial activity. We describe methods for downloading and analyzing Web of Science citation data to identify the most common publishers for NC State affiliated authors as well as the journals in which NC State authors publish most frequently. This article also demonstrates a custom Python web scraping application to harvest NC State affiliated editor data from publishers’ websites. Finally, this article discusses how these data elements are combined to provide a more comprehensive evaluative strategy for our journal investments.
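
The pattern such a scraper relies on looks roughly like the sketch below, which is not the Libraries' actual application. The URL, the CSS class name and the affiliation test are placeholders: every publisher's editorial-board page is structured differently, so in practice each site gets its own parsing rules.

    import requests
    from bs4 import BeautifulSoup

    # placeholder URL and markup; each publisher's editorial-board page needs its own rules
    BOARD_URL = "https://www.example-publisher.com/journal/xyz/editorial-board"

    def harvest_editors(url, affiliation_keyword="NC State"):
        """Return editor entries whose listed affiliation mentions the institution of interest."""
        html = requests.get(url, timeout=30).text
        soup = BeautifulSoup(html, "html.parser")
        editors = []
        for entry in soup.find_all("div", class_="editor"):    # hypothetical class name
            text = entry.get_text(" ", strip=True)
            if affiliation_keyword.lower() in text.lower():
                editors.append(text)
        return editors

    if __name__ == "__main__":
        for editor in harvest_editors(BOARD_URL):
            print(editor)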

17.
In this article we introduce a general approach to dynamic path analysis, an extension of classical path analysis to the situation where variables may be time-dependent and where the outcome of main interest is a stochastic process. In particular we focus on the survival and event history analysis setting where the main outcome is a counting process. Our approach is especially fruitful for analyzing event history data with internal time-dependent covariates, where an ordinary regression analysis may fail. The approach enables us to describe how the effect of a fixed covariate works partly directly and partly indirectly through internal time-dependent covariates. For the sequence of event times, we define a sequence of path analysis models. At each event time, ordinary linear regression is used to estimate the relations between the covariates, while the additive hazard model is used for the regression of the counting process on the covariates. The methodology is illustrated using data from a randomized trial on survival for patients with liver cirrhosis.
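
A compact sketch of the additive hazard component used at each event time: Aalen's least-squares estimator regresses the jump of the counting process on the covariates of the subjects still at risk, and the increments accumulate into cumulative regression functions. The path-analysis layer (regressing internal time-dependent covariates on earlier covariates at each event time) follows the same pattern with ordinary linear regression and is omitted here; the data below are simulated for illustration only.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 300
    X = np.column_stack([np.ones(n), rng.binomial(1, 0.5, n), rng.normal(size=n)])  # intercept + two covariates
    hazard = 0.1 + 0.05 * X[:, 1] + 0.02 * X[:, 2]
    event_time = rng.exponential(1 / np.clip(hazard, 1e-6, None))
    cens_time = rng.uniform(0, 10, n)
    time, status = np.minimum(event_time, cens_time), event_time <= cens_time

    # Aalen's estimator: at each ordered event time, least squares of dN on the at-risk design matrix
    order = np.argsort(time)
    B = np.zeros(X.shape[1])                      # cumulative regression functions B(t)
    cum_B = []
    for i in order:
        if not status[i]:
            continue
        at_risk = time >= time[i]
        Xr = X[at_risk]
        dN = (np.arange(n)[at_risk] == i).astype(float)   # the failing subject contributes the jump
        dB, *_ = np.linalg.lstsq(Xr, dN, rcond=None)
        B = B + dB
        cum_B.append((time[i], B.copy()))

    print("cumulative regression coefficients at the last event time:", np.round(cum_B[-1][1], 3))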

18.
The big data era demands new statistical analysis paradigms, since traditional methods often break down when datasets are too large to fit on a single desktop computer. Divide and Recombine (D&R) is becoming a popular approach for big data analysis, where results are combined over subanalyses performed in separate data subsets. In this article, we consider situations where unit record data cannot be made available by data custodians due to privacy concerns, and explore the concept of statistical sufficiency and summary statistics for model fitting. The resulting approach represents a type of D&R strategy, which we refer to as summary statistics D&R, as opposed to the standard approach, which we refer to as horizontal D&R. We demonstrate the concept via an extended Gamma–Poisson model, where summary statistics are extracted from different databases and incorporated directly into the fitting algorithm without having to combine unit record data. By exploiting the natural hierarchy of data, our approach has major benefits in terms of privacy protection. Incorporating the proposed modelling framework into data extraction tools such as TableBuilder by the Australian Bureau of Statistics allows for potential analysis at a finer geographical level, which we illustrate with a multilevel analysis of the Australian unemployment data. Supplementary materials for this article are available online.
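
A toy version of the summary statistics D&R idea for the plain conjugate Gamma–Poisson case: each data custodian releases only the total count and total exposure from its subset, and the analyst combines these sufficient statistics into the posterior without ever pooling unit records. The extended model in the article adds covariates and hierarchy; the prior values and subset sizes here are arbitrary.

    import numpy as np

    rng = np.random.default_rng(0)

    # each custodian holds unit records privately and releases only (sum of counts, total exposure)
    def custodian_summary(n_units, true_rate):
        exposure = rng.uniform(0.5, 2.0, n_units)
        counts = rng.poisson(true_rate * exposure)
        return counts.sum(), exposure.sum()              # sufficient statistics for a Poisson rate

    summaries = [custodian_summary(n, true_rate=1.8) for n in (500, 1200, 300)]

    # conjugate Gamma(a, b) prior on the rate; the posterior depends on the data only through the summaries
    a, b = 1.0, 1.0
    total_counts = sum(s[0] for s in summaries)
    total_exposure = sum(s[1] for s in summaries)
    a_post, b_post = a + total_counts, b + total_exposure

    print(f"posterior mean rate: {a_post / b_post:.3f}")
    print("95% credible interval:", np.quantile(rng.gamma(a_post, 1 / b_post, 100_000), [0.025, 0.975]))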

19.
For right-censored survival data, the information on whether the observed time is a survival time or a censoring time is frequently lost, as is the case for competing risks data. In this article, we consider statistical inference for right-censored survival data with censoring indicators missing at random under the proportional mean residual life model. Simple and augmented inverse probability weighted estimating equation approaches are developed, in which the nonmissingness probability and some unknown conditional expectations are estimated by kernel smoothing techniques. The asymptotic properties of all the proposed estimators are established, and extensive simulation studies demonstrate that the proposed methods perform well with moderate sample sizes. Finally, the proposed method is applied to a data set from a stage II breast cancer trial.
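
A bare-bones sketch of the simple inverse-probability-weighting step only: the probability that the censoring indicator is observed is estimated by Nadaraya-Watson kernel smoothing on the observed time, and complete cases are reweighted by its inverse. The mean residual life model, the augmented terms and the asymptotics of the article are not reproduced; the simulated missingness mechanism and the rule-of-thumb bandwidth are assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 500
    time = rng.exponential(1.0, n)                       # observed times (survival or censoring)
    delta = rng.binomial(1, 0.7, n)                      # censoring indicator (1 = event)
    observed = rng.binomial(1, 1 / (1 + np.exp(-(1 - time))), n).astype(bool)  # indicator seen (MAR given time)

    def nw_probability(t0, bandwidth):
        """Nadaraya-Watson estimate of P(indicator observed | time = t0)."""
        w = np.exp(-0.5 * ((time - t0) / bandwidth) ** 2)
        return np.sum(w * observed) / np.sum(w)

    h = 1.06 * time.std() * n ** (-1 / 5)                # rule-of-thumb bandwidth
    pi_hat = np.array([nw_probability(t, h) for t in time])

    # simple IPW estimate of the event probability P(delta = 1); weights of this form enter the
    # estimating equations for the mean residual life model in the article
    ipw_weights = observed / pi_hat
    event_rate = np.sum(ipw_weights * delta) / np.sum(ipw_weights)
    print(f"IPW-estimated event probability: {event_rate:.3f}")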

20.
In the analysis of competing risks data, the cumulative incidence function is a useful summary of the overall crude risk for a failure type of interest. Mixture regression modeling has served as a natural approach to performing covariate analysis based on this quantity. However, existing mixture regression methods for competing risks data either impose parametric assumptions on the conditional risks or require stringent censoring assumptions. In this article, we propose a new semiparametric regression approach for competing risks data under the usual conditional independent censoring mechanism. We establish the consistency and asymptotic normality of the resulting estimators. A simple resampling method is proposed to approximate the distribution of the estimated parameters and that of the predicted cumulative incidence functions. Simulation studies and an analysis of a breast cancer dataset demonstrate that our method performs well with realistic sample sizes and is appropriate for practical use.
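
For readers unfamiliar with the quantity being modelled, the following sketch computes the standard nonparametric (Aalen-Johansen-type) cumulative incidence function for one failure type under independent censoring and no tied times; the semiparametric mixture regression developed in the article goes well beyond this, and the simulated data are purely illustrative.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1000
    t1 = rng.exponential(2.0, n)                 # latent time to cause 1
    t2 = rng.exponential(4.0, n)                 # latent time to cause 2
    c = rng.uniform(0, 5, n)                     # censoring time
    time = np.minimum.reduce([t1, t2, c])
    cause = np.select([c <= np.minimum(t1, t2), t1 <= t2], [0, 1], default=2)  # 0 = censored

    def cumulative_incidence(time, cause, k):
        """Nonparametric CIF for failure type k: sum over event times of S(t-) * dN_k / n_at_risk."""
        order = np.argsort(time)
        time, cause = time[order], cause[order]
        n_risk = np.arange(len(time), 0, -1)     # number at risk just before each ordered time
        surv_before = np.concatenate(([1.0], np.cumprod(1 - (cause[:-1] != 0) / n_risk[:-1])))
        increments = surv_before * (cause == k) / n_risk
        return time, np.cumsum(increments)

    grid, cif1 = cumulative_incidence(time, cause, k=1)
    print("CIF of cause 1 at t = 2:", round(float(cif1[np.searchsorted(grid, 2.0) - 1]), 3))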
