Similar articles (20 results)
1.
This paper presents a study on symmetry of repeated bi-phased data signals, in particular on quantifying the deviation between the two parts of the signal. Three symmetry scores are defined using functional data techniques such as smoothing and registration. One score is related to the L2-distance between the two parts of the signal, whereas the other two are constructed to specifically measure differences in amplitude and phase. Moreover, symmetry scores based on functional principal component analysis (PCA) are examined. The scores are applied to acceleration signals from a study on equine gait. They turn out to be highly associated with lameness, and their applicability for lameness quantification and detection is investigated. Four classification approaches give similar results, with the scores describing amplitude and phase variation outperforming the PCA scores in the classification of lameness.
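A minimal sketch of how an L2-type symmetry score could be computed for a bi-phased signal, assuming the two halves are simply compared point by point on a common grid (the paper additionally smooths and registers the curves); the function name and toy signal are illustrative, not taken from the study.

```python
import numpy as np

def l2_symmetry_score(signal):
    """Rough L2-type symmetry score: split a bi-phased signal into its two
    halves and return the L2 distance between them on a common [0, 1] grid
    (a value of 0 would indicate perfect symmetry)."""
    signal = np.asarray(signal, dtype=float)
    n = len(signal) // 2
    first, second = signal[:n], signal[len(signal) - n:]
    grid = np.linspace(0.0, 1.0, n)
    # The paper first smooths and registers the two halves; here they are
    # compared pointwise as a crude stand-in.
    diff = first - second
    return np.sqrt(np.trapz(diff ** 2, grid))

# toy example: a nearly symmetric two-phase, acceleration-like signal
t = np.linspace(0, 2, 400)
y = np.sin(np.pi * (t % 1)) + 0.05 * np.random.randn(400)
print(l2_symmetry_score(y))
```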

2.
When classification rules are constructed using sample estimates, it is known that the probability of misclassification is not minimized. This article introduces a biased minimum χ² rule to classify items from a multivariate normal population. Using the principle of variance reduction, the probability of misclassification is reduced when the biased procedure is employed. Results of sampling experiments over a broad range of conditions are provided to demonstrate this improvement.
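For orientation, a sketch of the ordinary plug-in minimum chi-square (Mahalanobis distance) rule for two multivariate normal populations with a pooled covariance estimate; the paper's biased, variance-reduced adjustment of these plug-in estimates is not reproduced, and the function name and toy data are illustrative.

```python
import numpy as np

def fit_min_chi2_classifier(X1, X2):
    """Plug-in minimum chi-square (Mahalanobis) rule for two populations.
    This is the unadjusted sample-based rule; the biased, variance-reduced
    version in the paper modifies these plug-in estimates."""
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    n1, n2 = len(X1), len(X2)
    S = ((n1 - 1) * np.cov(X1, rowvar=False) +
         (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
    S_inv = np.linalg.inv(S)

    def classify(x):
        d1 = (x - mu1) @ S_inv @ (x - mu1)   # squared distance to population 1
        d2 = (x - mu2) @ S_inv @ (x - mu2)   # squared distance to population 2
        return 1 if d1 <= d2 else 2

    return classify

rng = np.random.default_rng(0)
X1 = rng.multivariate_normal([0, 0], np.eye(2), 50)
X2 = rng.multivariate_normal([2, 1], np.eye(2), 50)
rule = fit_min_chi2_classifier(X1, X2)
print(rule(np.array([1.8, 0.9])))
```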

3.
Data from a weather modification experiment are examined and a number of statistical analyses reported. The validity of earlier inferences is studied, as are the utilities of various statistical methods. The experiment is described. The original analysis of North American Weather Consultants, who conducted the experiment, is reviewed. Data summarization is reported. A major approach to analysis is the use of cloud-physics covariates in regression analyses. Finally, a multivariate analysis is discussed. It appears that the covariates may have been affected by treatment (cloud seeding) and that their use is therefore invalid: adjustment not only reduces error variances but also removes treatment effect. Some recommendations for improved design of similar future experiments are given in a concluding section, including preliminary trial use of blocking by storms.

4.
Here we consider a multinomial probit regression model where the number of variables substantially exceeds the sample size and only a subset of the available variables is associated with the response. Selecting a small number of relevant variables for classification has therefore received a great deal of attention. Generally, when the number of variables is substantial, sparsity-enforcing priors for the regression coefficients are called for on grounds of predictive generalization and computational ease. In this paper, we propose a sparse Bayesian variable selection method in the multinomial probit regression model for multi-class classification. The performance of the proposed method is demonstrated on one simulated data set and three well-known gene expression profiling data sets: breast cancer, leukemia, and small round blue-cell tumors. The results show that, compared with other methods, our method is able to select the relevant variables and obtains competitive classification accuracy with a small subset of relevant genes.
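As a rough, non-Bayesian stand-in for the idea, the sketch below selects a sparse set of variables for multi-class data with far more variables than samples using an L1-penalised multinomial logistic regression from scikit-learn; it is not the authors' probit sampler, and the simulated data and penalty constant are assumptions made only for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n, p, k = 60, 500, 3                         # n << p, three classes
X = rng.standard_normal((n, p))
beta = np.zeros((p, k))
beta[:5] = rng.normal(0, 3, size=(5, k))     # only 5 truly relevant variables
y = (X @ beta + rng.gumbel(size=(n, k))).argmax(axis=1)

# L1 penalty plays the role of the sparsity-enforcing prior (stand-in only)
clf = LogisticRegression(penalty="l1", solver="saga", C=0.1,
                         max_iter=5000).fit(X, y)
selected = np.flatnonzero(np.any(clf.coef_ != 0, axis=0))
print("selected variables:", selected)
```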

5.
We model the Alzheimer's disease-related phenotype response variables observed at irregular time points in longitudinal genome-wide association studies as sparse functional data and propose nonparametric test procedures to detect functional genotype effects while controlling for the confounding effects of environmental covariates. Our new functional analysis of covariance tests are based on a seemingly unrelated kernel smoother, which takes into account the within-subject temporal correlations, and thus enjoy improved power over existing functional tests. We show that the proposed test, combined with a uniformly consistent nonparametric covariance function estimator, enjoys the Wilks phenomenon and is minimax most powerful. Data used in the preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative database, where an application of the proposed test led to the discovery of new genes that may be related to Alzheimer's disease.
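A minimal Nadaraya-Watson kernel smoother, shown only as the elementary building block behind kernel-based functional tests; the authors' seemingly unrelated smoother additionally exploits the within-subject covariance, which this sketch omits, and all names, the bandwidth, and the toy data are illustrative assumptions.

```python
import numpy as np

def nadaraya_watson(t_obs, y_obs, t_grid, bandwidth=0.1):
    """Basic Nadaraya-Watson smoother with a Gaussian kernel: a weighted
    average of the observations, evaluated on a dense grid."""
    diffs = (t_grid[:, None] - t_obs[None, :]) / bandwidth
    w = np.exp(-0.5 * diffs ** 2)              # kernel weights
    return (w @ y_obs) / w.sum(axis=1)

# sparse, irregularly spaced longitudinal observations for one subject
t_obs = np.sort(np.random.uniform(0, 1, 15))
y_obs = np.sin(2 * np.pi * t_obs) + 0.2 * np.random.randn(15)
t_grid = np.linspace(0, 1, 100)
y_hat = nadaraya_watson(t_obs, y_obs, t_grid)  # smoothed trajectory
```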

6.
This paper focuses on smoothed functional canonical correlation analysis (SFCCA) to investigate the relationships and changes in large, seasonal, long-term data sets. The aim of this study is to provide a guideline for applying SFCCA to functional data and to give some insight into fine tuning the methodology for long-term periodical data. The guidelines are applied to temperature and humidity data for the 11 years between 2000 and 2010, and the results are interpreted. Seasonal changes and periodical shifts are studied visually through yearly comparisons. The effects of the number of basis functions and the selection of the smoothing parameter on the general variability structure and on the correlations between the curves are examined. It is concluded that the number of time points (knots), the number of basis functions, and the time span of evaluation (monthly, daily, etc.) should all be chosen in harmony. Changing the smoothing parameter is found to have no significant effect on the structure of the curves or on the correlations, whereas the number of basis functions has the greatest effect on both the individual and the correlation weight functions.
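A hedged sketch of the smoothed functional CCA workflow: each yearly curve is expanded in a common basis (a Fourier basis here, which suits periodic seasonal data) and ordinary CCA is applied to the basis coefficients. The basis choice, its size, and the simulated curves are assumptions, not the paper's settings; the number of basis functions is the tuning choice the abstract highlights.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

def fourier_basis(t, n_basis=7):
    """Fourier basis evaluated at times t; n_basis controls smoothness."""
    cols = [np.ones_like(t)]
    for j in range(1, (n_basis - 1) // 2 + 1):
        cols.append(np.sin(2 * np.pi * j * t))
        cols.append(np.cos(2 * np.pi * j * t))
    return np.column_stack(cols)[:, :n_basis]

t = np.linspace(0, 1, 365, endpoint=False)          # one year, daily grid
B = fourier_basis(t)                                # 365 x n_basis design
temp = np.random.randn(11, 365)                     # 11 yearly temperature curves (toy)
humid = np.random.randn(11, 365)                    # 11 yearly humidity curves (toy)
coef_T, *_ = np.linalg.lstsq(B, temp.T, rcond=None)     # basis coefficients per year
coef_H, *_ = np.linalg.lstsq(B, humid.T, rcond=None)
cca = CCA(n_components=2).fit(coef_T.T, coef_H.T)       # CCA on coefficient vectors
scores_T, scores_H = cca.transform(coef_T.T, coef_H.T)
```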

7.
We propose a hybrid two-group classification method that integrates linear discriminant analysis, a polynomial expansion of the basis (or variable space), and a genetic algorithm with multiple crossover operations to select variables from the expanded basis. Using new product launch data from the biochemical industry, we found that the proposed algorithm offers mean percentage decreases in the misclassification error rate of 50%, 56%, 59%, 77%, and 78% in comparison to a support vector machine, artificial neural network, quadratic discriminant analysis, linear discriminant analysis, and logistic regression, respectively. These improvements correspond to annual cost savings of $4.40–$25.73 million.
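A sketch of two of the three ingredients, assuming scikit-learn: a polynomial expansion of the variable space followed by linear discriminant analysis on the expanded basis. The genetic-algorithm search over expanded terms is omitted, and the toy data are illustrative rather than the product-launch data.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(2)
X = rng.standard_normal((200, 4))                              # toy launch attributes
y = (X[:, 0] * X[:, 1] + X[:, 2] ** 2 > 0.5).astype(int)       # nonlinear class boundary

# degree-2 expansion adds squares and interactions, giving LDA a richer basis
model = make_pipeline(PolynomialFeatures(degree=2, include_bias=False),
                      LinearDiscriminantAnalysis())
model.fit(X, y)
print("training accuracy:", model.score(X, y))
```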

8.
The purpose of this work is, on the one hand, to study how to forecast road traffic on highway networks and, on the other hand, to describe future traffic events. Here, road traffic is measured by vehicle velocities. The authors propose two methodologies: the first is based on an empirical classification method, and the second on a probability mixture model. They use an SAEM-type algorithm (a stochastic approximation of the EM algorithm) to select the densities of the mixture model. They then test the validity of their methodologies by forecasting short-term travel times.
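As a simplified stand-in for the mixture-model approach, the sketch below fits a two-component Gaussian mixture to simulated vehicle velocities with ordinary EM; the paper uses an SAEM-type algorithm and also selects the component densities, and all numbers here are illustrative.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)
velocities = np.concatenate([rng.normal(110, 8, 700),    # fluid traffic regime
                             rng.normal(60, 12, 300)])   # congested regime

gm = GaussianMixture(n_components=2, random_state=0)
gm.fit(velocities.reshape(-1, 1))                        # EM in place of SAEM
labels = gm.predict(velocities.reshape(-1, 1))           # traffic regime per record
print(gm.means_.ravel(), gm.weights_)
```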

9.
Several mathematical programming approaches to the classification problem in discriminant analysis have recently been introduced. This paper empirically compares these newly introduced classification techniques with Fisher's linear discriminant analysis (FLDA), quadratic discriminant analysis (QDA), logit analysis, and several rank-based procedures for a variety of symmetric and skewed distributions. The percentage of correctly classified observations by each procedure in a holdout sample indicates that, while the linear programming approaches compete well with the classical procedures under some experimental conditions, their overall performance lags behind that of the classical procedures.

10.
Markov regression models are useful tools for estimating the impact of risk factors on rates of transition between multiple disease states. Alzheimer's disease (AD) is an example of a multi-state disease process in which great interest lies in identifying risk factors for transition. In this context, non-homogeneous models are required because transition rates change as subjects age. In this report we propose a non-homogeneous Markov regression model that allows for reversible and recurrent disease states, transitions among multiple states between observations, and unequally spaced observation times. We conducted simulation studies to demonstrate the performance of the covariate-effect estimators from this model and to compare it with alternative models, both when the underlying non-homogeneous process was correctly specified and under model misspecification. In the simulation studies, we found that covariate effects were biased if non-homogeneity of the disease process was not accounted for, but estimates from non-homogeneous models were robust to misspecification of the form of the non-homogeneity. We used our model to estimate risk factors for transition to mild cognitive impairment (MCI) and AD in a longitudinal study of subjects included in the National Alzheimer's Coordinating Center's Uniform Data Set. Using our model, we found that subjects with MCI affecting multiple cognitive domains were significantly less likely to revert to normal cognition.
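A minimal sketch of the non-homogeneous Markov idea, assuming illustrative states (normal, MCI, AD) and made-up rates: the transition intensity matrix depends on age, and transition probabilities over an interval are approximated by piecing together short, locally homogeneous segments. For simplicity AD is treated as absorbing here, whereas the paper's model also allows reversible and recurrent states.

```python
import numpy as np
from scipy.linalg import expm

def intensity_matrix(age, beta=0.04, base=(0.08, 0.05, 0.10)):
    """Age-dependent intensity matrix Q(age); all rates are illustrative."""
    q01 = base[0] * np.exp(beta * (age - 75))   # normal -> MCI
    q10 = base[1]                               # MCI -> normal (reversion)
    q12 = base[2] * np.exp(beta * (age - 75))   # MCI -> AD
    return np.array([[-q01, q01, 0.0],
                     [q10, -(q10 + q12), q12],
                     [0.0, 0.0, 0.0]])          # AD absorbing in this sketch

def transition_probabilities(age0, age1, n_steps=20):
    """Approximate P(age0, age1) as a product of short homogeneous pieces."""
    ages = np.linspace(age0, age1, n_steps + 1)
    P = np.eye(3)
    for a0, a1 in zip(ages[:-1], ages[1:]):
        P = P @ expm(intensity_matrix((a0 + a1) / 2) * (a1 - a0))
    return P

print(transition_probabilities(70, 75).round(3))
```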

11.
Few publications consider the estimation of relative risk for vector-borne infectious diseases. Most of these articles involve exploratory analysis that includes the study of covariates and their effects on disease distribution and the study of geographic information systems to integrate patient-related information. The aim of this paper is to introduce an alternative method of relative risk estimation based on discrete time–space stochastic SIR-SI models (susceptible–infective–recovered for human populations; susceptible–infective for vector populations) for the transmission of vector-borne infectious diseases, particularly dengue disease. First, we describe deterministic compartmental SIR-SI models that are suitable for dengue disease transmission. We then adapt these to develop corresponding discrete time–space stochastic SIR-SI models. Finally, we develop an alternative method of estimating the relative risk for dengue disease mapping based on these models and apply them to analyse dengue data from Malaysia. This new approach offers a better model for estimating the relative risk for dengue disease mapping compared with the other common approaches, because it takes into account the transmission process of the disease while allowing for covariates and spatial correlation between risks in adjacent regions.
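A sketch of the deterministic SIR-SI compartmental model that the paper starts from before building its discrete time-space stochastic version; all rates and population sizes below are illustrative assumptions, not values fitted to the Malaysian dengue data.

```python
import numpy as np
from scipy.integrate import solve_ivp

def sir_si(t, y, beta_h, beta_v, gamma, mu_v):
    """Humans follow S-I-R, vectors follow S-I with constant turnover."""
    Sh, Ih, Rh, Sv, Iv = y
    Nh, Nv = Sh + Ih + Rh, Sv + Iv
    dSh = -beta_h * Sh * Iv / Nv                     # human infection from vectors
    dIh = beta_h * Sh * Iv / Nv - gamma * Ih         # human recovery at rate gamma
    dRh = gamma * Ih
    dSv = mu_v * Nv - beta_v * Sv * Ih / Nh - mu_v * Sv   # vector births and deaths
    dIv = beta_v * Sv * Ih / Nh - mu_v * Iv
    return [dSh, dIh, dRh, dSv, dIv]

y0 = [9990, 10, 0, 49900, 100]    # initial humans (S, I, R) and vectors (S, I)
sol = solve_ivp(sir_si, (0, 120), y0, args=(0.30, 0.25, 1 / 7, 1 / 14),
                dense_output=True)
print(sol.y[1, -1])               # infective humans at day 120
```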

12.
Statistical Analysis of Functional Data: Ideas, Methods, and Applications
Yan Mingyi, Statistical Research (《统计研究》), 2007, 24(2): 87-94
Abstract: In practice, the sample observations collected in a growing number of research fields exhibit functional characteristics. Such functional data combine features of time series and cross-sectional data, and some observations are themselves curves or other functional images. Although the panel data methods developed in econometrics over the past two decades have considerable practical value, panel data are only a special type of functional data, and their analysis depends too heavily on linear model structures and restrictive assumptions. Building on the general characteristics of functional data, this paper introduces a new approach for analysing such data and applies it, for the first time, to economic functional data, thereby extending the range of applications of functional data analysis. The results show that functional data analysis offers clear advantages over econometric and other statistical methods, in particular the ability to reveal features of the data that other methods cannot.
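To make the first step of functional data analysis concrete, here is a small sketch that turns each unit's discretely observed series into a smooth functional observation with a smoothing spline and then computes a pointwise functional mean; the smoothing level and toy curves are assumptions made only for illustration.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

t = np.linspace(0, 1, 50)                                      # observation times
curves = [np.sin(2 * np.pi * t) + 0.1 * np.random.randn(50)
          for _ in range(5)]                                    # 5 observed units (toy)
smooth = [UnivariateSpline(t, y, s=0.5) for y in curves]        # one function per unit

grid = np.linspace(0, 1, 200)
values = np.array([f(grid) for f in smooth])                    # evaluate on a fine grid
mean_curve = values.mean(axis=0)                                # pointwise functional mean
```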

13.
The independent exploratory factor analysis method is introduced for recovering independent latent sources from their observed mixtures. The new model is viewed as a method of factor rotation in exploratory factor analysis (EFA). First, estimates for all EFA model parameters are obtained simultaneously. Then, an orthogonal rotation matrix is sought that minimizes the dependence between the common factors. The rotation of the scores is compensated by a rotation of the initial loading matrix. The proposed approach is applied to study winter monthly sea-level pressure anomalies over the Northern Hemisphere. The North Atlantic Oscillation, the North Pacific Oscillation, and the Scandinavian pattern are identified among the rotated spatial patterns with a physically interpretable structure.

14.
15.
16.
Asymmetry is a feature of shape which is of particular interest in a variety of applications. With landmark data, the essential information on asymmetry is contained in the degree to which there is a mismatch between a landmark configuration and its relabelled and matched reflection. This idea is explored in the context of a study of facial shape in infants, where particular interest lies in identifying changes over time and in assessing residual deformity in children who have had corrective surgery for a cleft lip or cleft lip and palate. Interest lies not in whether the mean shape is asymmetric but in comparing the degrees of asymmetry in different populations. A decomposition of the asymmetry score into components that are attributable to particular features of the face is proposed. A further decomposition allows different sources of asymmetry due to position, orientation or intrinsic asymmetry to be identified for each feature. The methods are also extended to data representing anatomical curves across the face.
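A hedged sketch of the reflection-and-matching idea behind the asymmetry score: reflect the landmark configuration, swap the labels of left/right landmark pairs, Procrustes-match the relabelled reflection to the original, and take the residual mismatch as the score. The five-landmark toy face and the pairing are illustrative, not the facial landmark set used in the study.

```python
import numpy as np
from scipy.spatial import procrustes

def asymmetry_score(landmarks, pair_map):
    """landmarks: (k, 2) array; pair_map: permutation swapping each left
    landmark with its right counterpart (midline points map to themselves)."""
    reflected = landmarks * np.array([-1.0, 1.0])   # reflect about the y-axis
    relabelled = reflected[pair_map]                # restore anatomical labels
    _, _, disparity = procrustes(landmarks, relabelled)
    return disparity                                # 0 for perfect symmetry

# toy 5-landmark face: two eye corners, two mouth corners, nose tip
face = np.array([[-1.0, 1.0], [1.0, 1.05],
                 [-0.6, -1.0], [0.62, -1.0],
                 [0.02, 0.0]])
pair_map = [1, 0, 3, 2, 4]
print(asymmetry_score(face, pair_map))
```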

17.
We propose a mixture of latent variables model for the model-based clustering, classification, and discriminant analysis of data comprising variables with mixed type. This approach is a generalization of latent variable analysis, and model fitting is carried out within the expectation-maximization framework. Our approach is outlined and a simulation study conducted to illustrate the effect of sample size and noise on the standard errors and the recovery probabilities for the number of groups. Our modelling methodology is then applied to two real data sets and their clustering and classification performance is discussed. We conclude with discussion and suggestions for future work.

18.
19.
Factor analysis is a powerful tool for identifying the common characteristics among a set of variables measured on a continuous scale. In the context of factor analysis for non-continuous data, most applications are restricted to item response data. We extend the factor model to accommodate ranked data. The Monte Carlo expectation–maximization algorithm is used for parameter estimation, in which the E-step is implemented via the Gibbs sampler. An analysis based on both complete and incomplete ranked data (e.g. rank the top q out of k items) is considered. Estimation of the factor scores is also discussed. The proposed method is applied to analyse a set of incomplete ranked data obtained from a survey carried out in Guangzhou, a major city in mainland China, to investigate the factors affecting people's attitude towards choosing jobs.

20.
In modern statistical practice, it is increasingly common to observe a set of curves or images, often measured with noise, and to use these as the basis of analysis (functional data analysis). We consider a functional data model consisting of measurement error and functional random effects motivated by data from a study of human vision. By transforming the data into the wavelet domain we are able to exploit the expected sparse representation of the underlying function and the mechanism generating the random effects. We propose simple fitting procedures and illustrate the methods on the vision data.
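A minimal sketch of the wavelet-domain idea, assuming the PyWavelets package: transform a noisy curve, exploit the expected sparse representation by soft-thresholding the detail coefficients, and reconstruct a denoised functional observation. The wavelet, decomposition level, and universal-threshold rule are illustrative choices, not the authors' fitting procedure.

```python
import numpy as np
import pywt

t = np.linspace(0, 1, 256)
curve = np.sin(4 * np.pi * t) + (t > 0.5) + 0.2 * np.random.randn(256)  # noisy curve

coeffs = pywt.wavedec(curve, 'db4', level=5)               # wavelet decomposition
sigma = np.median(np.abs(coeffs[-1])) / 0.6745             # noise level from finest details
thresh = sigma * np.sqrt(2 * np.log(len(curve)))           # universal threshold
denoised_coeffs = [coeffs[0]] + [pywt.threshold(c, thresh, mode='soft')
                                 for c in coeffs[1:]]      # shrink detail coefficients
denoised = pywt.waverec(denoised_coeffs, 'db4')            # reconstructed smooth curve
```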
