Similar literature
 20 similar articles found (search time: 976 ms)
1.
In this article, we address the problem of mining and analyzing multivariate functional data, that is, data where each observation is a set of possibly correlated functions. Complex data of this kind are increasingly common in many research fields, particularly in the biomedical context. In this work, we propose and apply a new concept of depth measure for multivariate functional data. With this new depth measure it is possible to generalize robust statistics, such as the median, to the multivariate functional framework, which in turn allows the application of outlier detection, boxplot construction, and nonparametric tests in this more general framework. We present an application to electrocardiographic (ECG) signals.
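As an illustration of how a depth measure orders curves, here is a minimal sketch. It uses the modified band depth (pairs of curves) for each component and combines the components with a weighted average. Both choices, and the names `modified_band_depth`, `multivariate_depth`, and `weights`, are assumptions made for this example and are not necessarily the definition proposed in the article.

```python
from itertools import combinations

import numpy as np


def modified_band_depth(curves):
    """Modified band depth (bands from pairs of curves) for each curve.

    curves: array of shape (n_curves, n_points), all on a common grid.
    """
    curves = np.asarray(curves, dtype=float)
    n = curves.shape[0]
    depth = np.zeros(n)
    for i, j in combinations(range(n), 2):
        lower = np.minimum(curves[i], curves[j])
        upper = np.maximum(curves[i], curves[j])
        # Fraction of grid points where each curve lies inside the band.
        inside = (curves >= lower) & (curves <= upper)
        depth += inside.mean(axis=1)
    return depth / (n * (n - 1) / 2)


def multivariate_depth(data, weights=None):
    """Weighted average of component-wise depths (an assumed combination rule).

    data: array of shape (n_observations, n_components, n_points).
    """
    data = np.asarray(data, dtype=float)
    n_components = data.shape[1]
    if weights is None:
        weights = np.full(n_components, 1.0 / n_components)
    per_component = np.stack(
        [modified_band_depth(data[:, k, :]) for k in range(n_components)],
        axis=1,
    )
    return per_component @ np.asarray(weights, dtype=float)
```

Under this sketch, the observation with the largest depth plays the role of a functional median, and observations with very small depth are candidates for outliers.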

2.
Nonparametric regression methods have been widely studied in functional regression analysis in the context of functional covariates and univariate response, but this is not the case for functional covariates with multivariate response. In this paper, we present two new solutions for the latter problem: the first is to directly extend the nonparametric method for univariate response to multivariate response, and in the second, the correlation among different responses is incorporated into the model. The asymptotic properties of the estimators are studied, and the effectiveness of the proposed methods is demonstrated through several simulation studies and a real data example.

3.
Cross-validation has been widely used in the context of statistical linear models and multivariate data analysis. Recently, technological advances have made it possible to collect new types of data in the form of curves. Statistical procedures for analysing these data, which are of infinite dimension, have been provided by functional data analysis. In functional linear regression, using statistical smoothing, estimation of slope and intercept parameters is generally based on functional principal components analysis (FPCA), which allows for finite-dimensional analysis of the problem. The estimators of the slope and intercept parameters in this context, proposed by Hall and Hosseini-Nasab [On properties of functional principal components analysis, J. R. Stat. Soc. Ser. B: Stat. Methodol. 68 (2006), pp. 109–126], are based on FPCA, and depend on a smoothing parameter that can be chosen by cross-validation. The cross-validation criterion given there is time-consuming and hard to compute. In this work, we approximate this cross-validation criterion by another criterion so that, in some sense, we can turn to a multivariate data analysis tool. Then, we evaluate its performance numerically. We also treat a real dataset consisting of two variables, temperature and the amount of precipitation, and estimate the regression coefficients for the former variable in a model predicting the latter.

4.

Sufficient dimension reduction (SDR) provides a framework for reducing the predictor space dimension in statistical regression problems. We consider SDR in the context of dimension reduction for deterministic functions of several variables such as those arising in computer experiments. In this context, SDR can reveal low-dimensional ridge structure in functions. Two algorithms for SDR—sliced inverse regression (SIR) and sliced average variance estimation (SAVE)—approximate matrices of integrals using a sliced mapping of the response. We interpret this sliced approach as a Riemann sum approximation of the particular integrals arising in each algorithm. We employ the well-known tools from numerical analysis—namely, multivariate numerical integration and orthogonal polynomials—to produce new algorithms that improve upon the Riemann sum-based numerical integration in SIR and SAVE. We call the new algorithms Lanczos–Stieltjes inverse regression (LSIR) and Lanczos–Stieltjes average variance estimation (LSAVE) due to their connection with Stieltjes’ method—and Lanczos’ related discretization—for generating a sequence of polynomials that are orthogonal with respect to a given measure. We show that this approach approximates the desired integrals, and we study the behavior of LSIR and LSAVE with two numerical examples. The quadrature-based LSIR and LSAVE eliminate the first-order algebraic convergence rate bottleneck resulting from the Riemann sum approximation, thus enabling high-order numerical approximations of the integrals when appropriate. Moreover, LSIR and LSAVE perform as well as the best-case SIR and SAVE implementations (e.g., adaptive partitioning of the response space) when low-order numerical integration methods (e.g., simple Monte Carlo) are used.
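For orientation, the classical sliced approach that the abstract takes as its starting point can be sketched as follows. This is a minimal version of plain SIR with equal-count slices, not the LSIR or LSAVE algorithms; the function name `sir_directions` and its parameters are assumptions made for this example.

```python
import numpy as np


def sir_directions(X, y, n_slices=10, n_directions=1):
    """Classical sliced inverse regression with equal-count slices.

    Returns a (p, n_directions) array of unit-norm direction estimates.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Standardize the predictors: Z = (X - mean) Sigma^{-1/2}.
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mean) @ inv_sqrt

    # Slice the sorted response and average Z within each slice.
    order = np.argsort(y)
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        slice_mean = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(slice_mean, slice_mean)

    # Leading eigenvectors of M, mapped back to the original scale.
    _, vecs = np.linalg.eigh(M)
    top = vecs[:, ::-1][:, :n_directions]
    B = inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)
```

The weighted sum over slices is the step the abstract interprets as a Riemann sum approximation of an integral over the response.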


5.
In the manufacturing process, a sequence of measurements of a quality characteristic is increasingly taken across some continuum, producing a curve that represents the quality of the item. This curve provides the so-called profile or functional data. Whether the profile is linear or nonlinear, the common control chart approaches are based on the multivariate control chart, monitoring the estimated parameters of a pre-defined linear or nonlinear model. In practice, the model is usually difficult to know, and it is also difficult to identify the abnormal pattern from the outlying parameter. The functional data control chart we propose can provide a better solution to these problems. In Monte Carlo simulations, we show that the functional data control chart is sensitive to changes in the underlying process status. Applied to the vertical density profile data, the new method performs well.

6.
In first-level analyses of functional magnetic resonance imaging data, adjustments for temporal correlation such as a Satterthwaite approximation or a prewhitening method are usually implemented in the univariate model to keep the nominal test level. In doing so, the temporal correlation structure of the data is estimated, assuming an autoregressive process of order one. We show that this is applicable in multivariate approaches too, more precisely in the so-called stabilized multivariate test statistics. Furthermore, we propose a block-wise permutation method including a random shift that renders an approximation of the temporal correlation structure unnecessary but also approximately keeps the nominal test level in spite of the dependence of sample elements. Although the intentions are different, a comparison of the multivariate methods with the multiple ones shows that the global approach may achieve advantages if applied to suitable regions of interest. This is illustrated using an example from fMRI studies.

7.
In this article, we first propose the classical multivariate generalized Birnbaum–Saunders kernel estimator for probability density function estimation in the context of multivariate nonnegative data. Then, we apply two multiplicative bias correction (MBC) techniques to the multivariate kernel density estimator. Some properties (bias, variance, and mean integrated squared error) of the corresponding estimators are also investigated. Finally, the performance of the classical and MBC estimators based on the family of generalized Birnbaum–Saunders kernels is illustrated by a simulation study.

8.
The partial least squares (PLS) approach first constructs new explanatory variables, known as factors (or components), which are linear combinations of available predictor variables. A small subset of these factors is then chosen and retained for prediction. We study the performance of PLS in estimating single-index models, especially when the predictor variables exhibit high collinearity. We show that PLS estimates are consistent up to a constant of proportionality. We present three simulation studies that compare the performance of PLS in estimating single-index models with that of sliced inverse regression (SIR). In the first two studies, we find that PLS performs better than SIR when collinearity exists. In the third study, we learn that PLS performs well even when there are multiple dependent variables, the link function is non-linear and the shape of the functional form is not known.
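The construction of PLS factors as linear combinations of the predictors can be sketched for a single response as follows. This is a minimal NIPALS-style version; the function name `pls1_coefficients` is an assumption made for this example, and the code is not taken from the article.

```python
import numpy as np


def pls1_coefficients(X, y, n_components):
    """Regression coefficients from PLS with one response (NIPALS-style)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xk = X - X.mean(axis=0)  # centered predictors, deflated step by step
    yc = y - y.mean()

    weights, loadings, coefs = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yc                  # direction of maximal covariance with y
        w = w / np.linalg.norm(w)
        t = Xk @ w                     # the factor: a linear combination of predictors
        tt = t @ t
        p = Xk.T @ t / tt              # loading of X on the factor
        c = (yc @ t) / tt              # regression of y on the factor
        Xk = Xk - np.outer(t, p)       # remove the part of X explained by the factor
        weights.append(w)
        loadings.append(p)
        coefs.append(c)

    W = np.array(weights).T
    P = np.array(loadings).T
    q = np.array(coefs)
    # Express the fit in terms of the original (centered) predictors.
    return W @ np.linalg.solve(P.T @ W, q)
```

With a single component the coefficient vector is proportional to the sample covariance between the predictors and the response, which is the quantity behind the proportionality statement for single-index models. With as many components as predictors (and a full-rank design) the result coincides with ordinary least squares.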

9.
The functional boxplot is an attractive technique to visualize data that come from functions. We propose an alternative to the functional boxplot based on depth measures. Our proposal generalizes the usual construction of the boxplot in one dimension related to the down-upward orderings of the data by considering two intuitive pre-orders in the functional context. These orderings are based on the epigraphs and hypographs of the data that allow a new definition of functional quartiles which is more robust to shape outliers. Simulated and real examples show that this proposal provides a convenient visualization technique with great potential for analyzing functional data and illustrate its usefulness in detecting outliers that other procedures do not detect.

10.

We propose a semiparametric framework based on sliced inverse regression (SIR) to address the issue of variable selection in functional regression. SIR is an effective method for dimension reduction which computes a linear projection of the predictors in a low-dimensional space, without loss of information on the regression. In order to deal with the high dimensionality of the predictors, we consider penalized versions of SIR: ridge and sparse. We extend the approaches of variable selection developed for multidimensional SIR to select intervals that form a partition of the definition domain of the functional predictors. Selecting entire intervals rather than separated evaluation points improves the interpretability of the estimated coefficients in the functional framework. A fully automated iterative procedure is proposed to find the critical (interpretable) intervals. The approach is proved efficient on simulated and real data. The method is implemented in the R package SISIR available on CRAN at https://cran.r-project.org/package=SISIR.


11.
A parametric modelling for interval data is proposed, assuming a multivariate Normal or Skew-Normal distribution for the midpoints and log-ranges of the interval variables. The intrinsic nature of the interval variables leads to special structures of the variance–covariance matrix, which is represented by five different possible configurations. Maximum likelihood estimation for both models under all considered configurations is studied. The proposed modelling is then considered in the context of analysis of variance and multivariate analysis of variance testing. To assess the behaviour of the proposed methodology, a simulation study is performed. The results show that, for medium or large sample sizes, tests have good power and their true significance level approaches nominal levels when the constraints assumed for the model are respected; however, for small samples, sizes close to nominal levels cannot be guaranteed. Applications to Chinese meteorological data in three different regions and to credit card usage variables for different card designations illustrate the proposed methodology.

12.
Sliced inverse regression (SIR) was developed to find effective linear dimension-reduction directions for exploring the intrinsic structure of the high-dimensional data. In this study, we present isometric SIR for nonlinear dimension reduction, which is a hybrid of the SIR method using the geodesic distance approximation. First, the proposed method computes the isometric distance between data points; the resulting distance matrix is then sliced according to K-means clustering results, and the classical SIR algorithm is applied. We show that the isometric SIR (ISOSIR) can reveal the geometric structure of a nonlinear manifold dataset (e.g., the Swiss roll). We report and discuss this novel method in comparison to several existing dimension-reduction techniques for data visualization and classification problems. The results show that ISOSIR is a promising nonlinear feature extractor for classification applications.

13.
Sliced Inverse Regression (SIR) is a promising technique for the purpose of dimension reduction. Several properties of this method have been examined already, but little attention has been paid to robustness aspects. In this article, we focus on the sensitivity of SIR to outliers and show in what sense and how severely SIR can be influenced by outliers in the data.

14.
Stochastic compartmental (e.g., SIR) models have proven useful for studying the epidemics of childhood diseases while taking into account the variability of the epidemic dynamics. Here, we present a method for estimating balanced simultaneous confidence sets for the mean sample path of a stochastic SIR model, thus providing a simple representation of both the typical behavior and the variability of the epidemic. The confidence sets are estimated by a bootstrap procedure, using asymptotic properties of density dependent jump Markov processes. The method is applied to chickenpox epidemics in France and the coverage probability of the confidence sets is estimated in that context.

15.
In some fields, we are forced to work with missing data in multivariate time series. Unfortunately, the data analysis in this context cannot be carried out in the same way as in the case of complete data. To deal with this problem, a Bayesian analysis of multivariate threshold autoregressive models with exogenous inputs and missing data is carried out. In this paper, Markov chain Monte Carlo methods are used to obtain samples from the involved posterior distributions, including threshold values and missing data. In order to identify autoregressive orders, we adapt the Bayesian variable selection method to this class of multivariate processes. The number of regimes is estimated using marginal likelihood or product parameter-space strategies.

16.
The univariate fatigue life distribution proposed by Birnbaum and Saunders [A new family of life distributions. J Appl Probab. 1969;6:319–327] has been used quite effectively to model times to failure for materials subject to fatigue and for modelling lifetime data and reliability problems. In this article, we introduce a Birnbaum–Saunders (BS) distribution in the multivariate setting. The new multivariate model arises in the context of conditionally specified distributions. The proposed multivariate model is an absolutely continuous distribution whose marginals are univariate BS distributions. General properties of the multivariate BS distribution are derived and the estimation of the unknown parameters by maximum likelihood is discussed. Further, the Fisher information matrix is determined. Applications of the proposed multivariate distribution to real data are provided for illustrative purposes.

17.
We describe a mixed-effect hurdle model for zero-inflated longitudinal count data, where a baseline variable is included in the model specification. Association between the count data process and the endogenous baseline variable is modeled through a latent structure, assumed to be dependent across equations. We show how model parameters can be estimated in a finite mixture context, allowing for overdispersion, multivariate association and endogeneity of the baseline variable. The model behavior is investigated through a large-scale simulation experiment. An empirical example on health care utilization data is provided.

18.
Functional regression functions, with explanatory variables taking values in some abstract function space, have been studied extensively. In this article, we aim to investigate the multivariate functional regression function, and propose a nonparametric estimator for the multivariate case. By applying some properties of U-statistics, some asymptotic distributions of this estimator are obtained under different cases.

19.
Functional data are being observed frequently in many scientific fields, and therefore most of the standard statistical methods are being adapted for functional data. The multivariate analysis of variance problem for functional data is considered. It appears to be of practical interest, as is the one-way analysis of variance for such data. For the MANOVA problem for multivariate functional data, we propose permutation tests based on a basis function representation and tests based on random projections. Their performance is examined in comprehensive simulation studies, which provide an idea of the size control and power of the tests and identify differences between them. The simulation experiments are based on artificial data and real labeled multivariate time series data found in the literature. The results suggest that the studied testing procedures can detect small differences between vectors of curves even with small sample sizes. Illustrative real data examples of the use of the proposed testing procedures in practice are also presented.

20.
A note on the Cook's distance   Cited by: 1 (self-citations: 0, citations by others: 1)
A modification of the classical Cook's distance is proposed, providing us with a generalized Mahalanobis distance in the context of multivariate elliptical linear regression models. We establish the exact distribution of a pivotal-type statistic based on this generalized Mahalanobis distance, which provides critical points for the identification of outlier data points. Based on the equivalence between the modified Cook's distance and what is called the mean-shift multivariate outlier elliptical model, twelve new modifications are proposed for the Cook's distance. We also describe the explicit relationship between the Cook's distance and the likelihood displacement with the modified Cook's distance. We illustrate the procedure with some examples, in the context of multiple and multivariate linear regression.
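For reference, the classical Cook's distance that the article modifies can be computed for ordinary multiple linear regression as in the sketch below. It covers only the classical univariate-response case, not the modified or elliptical versions proposed in the article; the function name `cooks_distance` is an assumption made for this example.

```python
import numpy as np


def cooks_distance(X, y):
    """Classical Cook's distance for each observation in a linear model.

    X: design matrix of shape (n, p), including an intercept column if wanted.
    y: response vector of length n.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta

    # Leverages: diagonal of the hat matrix X (X'X)^{-1} X'.
    leverage = np.einsum("ij,jk,ik->i", X, xtx_inv, X)
    s2 = resid @ resid / (n - p)  # residual variance estimate

    return resid**2 * leverage / (p * s2 * (1.0 - leverage) ** 2)
```

The closed form used here equals the scaled change in the fitted coefficients when one observation is left out, so large values flag observations that strongly influence the fit.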
