Similar literature
Found 20 similar articles; search took 15 ms
1.
Artificial neural networks have been successfully applied to a variety of machine learning tasks, including image recognition, semantic segmentation, and machine translation. However, few studies have fully investigated ensembles of artificial neural networks. In this work, we investigated multiple widely used ensemble methods, including unweighted averaging, majority voting, the Bayes Optimal Classifier, and the (discrete) Super Learner, for image recognition tasks, with deep neural networks as candidate algorithms. We designed several experiments, with the candidate algorithms being the same network structure with different model checkpoints within a single training process, networks with the same structure but trained multiple times stochastically, and networks with different structures. In addition, we further studied the overconfidence phenomenon of the neural networks, as well as its impact on the ensemble methods. Across all of our experiments, the Super Learner achieved the best performance among all the ensemble methods in this study.
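As a rough illustration of two of the ensemble methods named in this abstract, the sketch below contrasts unweighted averaging with majority voting. It is not the authors' code: the function names and the three probability vectors are hypothetical, chosen so that one overconfident model sways the average while losing the vote.

```python
from collections import Counter


def argmax(values):
    """Index of the largest value (the lowest index wins ties)."""
    return max(range(len(values)), key=lambda k: values[k])


def unweighted_average(prob_sets):
    """Average the class-probability vectors of all models, then take the argmax."""
    n_models = len(prob_sets)
    n_classes = len(prob_sets[0])
    mean = [sum(p[k] for p in prob_sets) / n_models for k in range(n_classes)]
    return argmax(mean), mean


def majority_vote(prob_sets):
    """Each model votes for its own argmax class; the most frequent class wins."""
    votes = Counter(argmax(p) for p in prob_sets)
    top = max(votes.values())
    return min(k for k, c in votes.items() if c == top)  # lowest index on ties


# Hypothetical predictions: one overconfident model and two mildly confident ones.
probs = [
    [0.95, 0.05],
    [0.40, 0.60],
    [0.45, 0.55],
]
avg_class, avg_probs = unweighted_average(probs)
vote_class = majority_vote(probs)
print(avg_class, vote_class)
```

With these made-up inputs the two rules disagree, which is one simple way overconfidence can affect an ensemble. The Bayes Optimal Classifier and the Super Learner are not shown here.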

2.
This paper is concerned with the application of artificial neural networks (ANNs) to a practical, difficult and high-dimensional classification problem, discrimination between selected underwater sounds. The application provides for a particular comparison of the relative performance of time-delay as opposed to fully connected network architectures, in the analysis of temporal data. More originally, suggestions are given for adapting the conventional backpropagation algorithm to give greater robustness to misclassification errors in the training examples, a particular problem with underwater sound data and one which may arise in other realistic applications of ANNs. An informal comparison is made between the generalisation performance of various architectures in classifying real dolphin sounds when networks are trained using the conventional least squares minimisation norm, L2, that of least absolute deviation, L1, and that of the Huber criterion, which involves a mixture of both L1 and L2. The results suggest that L1 and Huber may provide performance gains. In order to evaluate these robust adjustments more formally under controlled conditions, an experiment is then conducted using simulated dolphin sounds with known levels of random noise and misclassification error. Here, the results are more ambiguous and significant interactions are indicated which raise issues for future research.
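The three training criteria compared in this abstract can be written per residual as in the minimal sketch below. The threshold `delta` and the 0.5 scaling of the squared loss are assumptions made here for illustration; the abstract does not state the paper's exact parameterisation.

```python
def squared_loss(r):
    """L2 criterion: grows quadratically, so large residuals dominate."""
    return 0.5 * r * r


def absolute_loss(r):
    """L1 criterion: grows linearly in the size of the residual."""
    return abs(r)


def huber_loss(r, delta=1.0):
    """Huber criterion: quadratic for |r| <= delta, linear beyond it."""
    a = abs(r)
    if a <= delta:
        return 0.5 * r * r
    return delta * (a - 0.5 * delta)


def huber_grad(r, delta=1.0):
    """Derivative of huber_loss with respect to r: the residual clipped to [-delta, delta]."""
    return max(-delta, min(delta, r))


for r in (0.5, 3.0):
    print(r, squared_loss(r), absolute_loss(r), huber_loss(r), huber_grad(r))
```

Because the Huber gradient is bounded, a mislabelled training example contributes a capped error signal to backpropagation, whereas the L2 gradient grows with the residual.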

3.
Acceptance sampling, a branch of statistical quality control, deals with confidence in a product's quality. In some cases, determining the sample size required for a given accuracy means accounting for the error in the distribution under consideration, which depends on the sample size and the corresponding population size. The sample size that minimizes this error is then used to derive the most favorable OC curve. Neural networks are trained on the resulting errors and their matching tolerance levels for sample sizes drawn from populations of different sizes. The trained network can then be used to automate acceptance or rejection of a candidate sample size for a better OC curve based on the minimized error, reducing the time the work requires. The approach is illustrated in this paper with geostatistical data, using the SAS program.

4.
In this paper, almost sure exponential stability and pth moment exponential stability of stochastic cellular neural networks with mixed delays are investigated. Employing the methods of stochastic analysis, Lyapunov's method, and useful inequality techniques, a sufficient condition ensuring the almost sure exponential stability and pth moment exponential stability is obtained. Two examples are given to illustrate this sufficient condition.

5.
Neural networks are a popular machine learning tool, particularly in applications such as protein structure prediction; however, overfitting can pose an obstacle to their effective use. Due to the large number of parameters in a typical neural network, one may obtain a network fit that perfectly predicts the learning data, yet fails to generalize to other data sets. One way of reducing the size of the parameter space is to alter the network topology so that some edges are removed; however, it is often not immediately apparent which edges should be eliminated. We propose a data-adaptive method of selecting an optimal network architecture using a deletion/substitution/addition algorithm. Results of this approach to classification are presented on simulated data and the breast cancer data of Wolberg and Mangasarian [1990. Multisurface method of pattern separation for medical diagnosis applied to breast cytology. Proc. Nat. Acad. Sci. 87, 9193–9196].

6.
This statement of retraction refers to the iFirst version of the paper that has since been removed from this site. A PDF version of the retracted article can be viewed in the Supplementary Content section of this article.

7.
Bayesian neural networks for nonlinear time series forecasting (total citations: 3; self-citations: 0; citations by others: 3)
In this article, we apply Bayesian neural networks (BNNs) to time series analysis, and propose a Monte Carlo algorithm for BNN training. In addition, we go a step further in BNN model selection by putting a prior on network connections instead of hidden units as done by other authors. This allows us to treat the selection of hidden units and the selection of input variables uniformly. The BNN model is compared to a number of competitors, such as the Box-Jenkins model, bilinear model, threshold autoregressive model, and traditional neural network model, on a number of popular and challenging data sets. Numerical results show that the BNN model has achieved a consistent improvement over the competitors in forecasting future values. Insights on how to improve the generalization ability of BNNs are revealed in many aspects of our implementation, such as the selection of input variables, the specification of prior distributions, and the treatment of outliers.

8.
A relevant problem in many applied contexts is to test whether some given observations follow one of two possible probability distributions. The vast literature produced over the years on this topic does not identify a tool which can be easily adapted to any situation but only finds solutions to specific comparisons. Recently, an easy-to-implement procedure for discrimination between two distributions based on feed-forward neural networks has been proposed, giving interesting results. In this work this procedure is further investigated in terms of power, neural network architecture and expected statistical properties of the test statistic for small, moderate and large sample sizes, in a wide range of symmetric and skewed alternatives.

9.
An echo state network (ESN) is viewed as a temporal expansion which naturally gives rise to regressors of various relevance to a teacher output. We illustrate that often only a certain number of the generated echo-regressors effectively explain the teacher output, and we propose to determine the importance of the echo-regressors by a joint calculation of the individual variance contributions and Bayesian relevance using the locally regularized orthogonal forward regression (LROFR). This information can be advantageously used in a variety of ways for an analysis of an ESN structure. We present a locally regularized linear readout built using LROFR. The readout may have a smaller dimensionality than the ESN model itself, and improves the robustness and accuracy of an ESN. Its main advantage is its ability to determine what type of additional readout is suitable for the task at hand. A comparison with PCA is also provided. We also propose a radial basis function (RBF) readout built using LROFR, since the flexibility of the linear readout has limitations and might be insufficient for complex tasks. Its excellent generalization abilities make it a viable alternative to feed-forward neural networks or relevance vector machines. For cases where more temporal capacity is required, we propose the well-studied delay&sum readout.

10.
Multi-layer perceptrons (MLPs), a common type of artificial neural networks (ANNs), are widely used in computer science and engineering for object recognition, discrimination and classification, and have more recently found use in process monitoring and control. Training such networks is not a straightforward optimisation problem, and we examine features of these networks which contribute to the optimisation difficulty. Although the original perceptron, developed in the late 1950s (Rosenblatt 1958, Widrow and Hoff 1960), had a binary output from each node, this was not compatible with back-propagation and similar training methods for the MLP. Hence the output of each node (and the final network output) was made a differentiable function of the network inputs. We reformulate the MLP model with the original perceptron in mind so that each node in the hidden layers can be considered as a latent (that is, unobserved) Bernoulli random variable. This maintains the property of binary output from the nodes, and with an imposed logistic regression of the hidden layer nodes on the inputs, the expected output of our model is identical to the MLP output with a logistic sigmoid activation function (for the case of one hidden layer). We examine the usual MLP objective function (the sum of squares) and show its multi-modal form and the corresponding optimisation difficulty. We also construct the likelihood for the reformulated latent variable model and maximise it by standard finite mixture ML methods using an EM algorithm, which provides stable ML estimates from random starting positions without the need for regularisation or cross-validation. Over-fitting of the number of nodes does not affect this stability. This algorithm is closely related to the EM algorithm of Jordan and Jacobs (1994) for the Mixture of Experts model. We conclude with some general comments on the relation between the MLP and latent variable models.
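The claim in this abstract that the latent Bernoulli model has the same expected output as a one-hidden-layer sigmoid MLP can be checked numerically with a small sketch. The sketch assumes a linear output layer, under which the result follows from linearity of expectation. All weights and the input below are hypothetical, and the code is a Monte Carlo check only, not the authors' EM algorithm.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden_probs(x, W, c):
    """Logistic regression of each hidden node on the inputs."""
    return [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + cj) for w, cj in zip(W, c)]


def mlp_output(x, W, c, v, b):
    """One-hidden-layer MLP: sigmoid hidden units, linear output (assumed)."""
    return b + sum(vj * pj for vj, pj in zip(v, hidden_probs(x, W, c)))


def latent_output(x, W, c, v, b, rng):
    """Same weights, but each hidden node is a binary Bernoulli draw."""
    z = [1 if rng.random() < pj else 0 for pj in hidden_probs(x, W, c)]
    return b + sum(vj * zj for vj, zj in zip(v, z))


# Hypothetical weights and input, for illustration only.
W = [[1.0, -0.5], [0.3, 0.8]]   # hidden-layer weights
c = [0.1, -0.2]                 # hidden-layer biases
v = [1.0, -2.0]                 # output weights
b = 0.5                         # output bias
x = [0.4, 1.2]

rng = random.Random(0)          # fixed seed for repeatability
n = 20000
mc_mean = sum(latent_output(x, W, c, v, b, rng) for _ in range(n)) / n
exact = mlp_output(x, W, c, v, b)
print(round(exact, 3), round(mc_mean, 3))
```

The simulated average of the binary-node model should agree with the deterministic MLP output up to Monte Carlo error.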

11.
This paper compares the performance of regression analysis and a clustering-based neural network approach when the data deviate from the homoscedasticity assumption of regression. Heteroskedasticity is a problem that arises in linear regression due to unequal error variances. One of the methods for dealing with heteroskedasticity in classical regression theory is weighted least-squares regression (WLS). To deal with the problem of heteroskedasticity, a backpropagation neural network is applied. In this context, an algorithm is proposed which is based on robust estimates of the location and dispersion matrix and helps in preserving the error assumption of linear regression. Analysis is carried out with appropriate designs using simulated data and the results are presented.
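For reference, the classical WLS baseline mentioned in this abstract can be sketched for a single predictor as below. The data and the noise standard deviations are hypothetical, and the weights are assumed to be inverse variances. The clustering-based neural network approach proposed in the paper is not reproduced here.

```python
def wls_line(x, y, w):
    """Fit y = a + b*x by minimising sum_i w_i * (y_i - a - b*x_i)**2."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of x
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of y
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b


# Hypothetical heteroskedastic data: the noise level grows with x.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.1, 4.9, 7.4, 8.2]
sd = [0.1, 0.2, 0.4, 0.8]               # assumed known error standard deviations
w = [1.0 / s ** 2 for s in sd]          # inverse-variance weights

a_wls, b_wls = wls_line(x, y, w)
a_ols, b_ols = wls_line(x, y, [1.0] * len(x))   # equal weights give ordinary least squares
print(a_wls, b_wls, a_ols, b_ols)
```

Down-weighting the noisier observations lets the precise ones determine the fit. This is the correction WLS applies when error variances are unequal.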

12.
13.
There is a tendency for the true variability of feasible GLS estimators to be understated by asymptotic standard errors. For estimation of SUR models, this tendency becomes more severe in large equation systems when estimation of the error covariance matrix, C, becomes problematic. We explore a number of potential solutions involving the use of improved estimators for the disturbance covariance matrix and bootstrapping. In particular, Ullah and Racine (1992) have recently introduced a new class of estimators for SUR models that use nonparametric kernel density estimation techniques. The proposed estimators have the same structure as the feasible GLS estimator of Zellner (1962) differing only in the choice of estimator for C. Ullah and Racine (1992) prove that their nonparametric density estimator of C can be expressed as Zellner's original estimator plus a positive definite matrix that depends on the smoothing parameter chosen for the density estimation. It is this structure of the estimator that most interests us as it has the potential to be especially useful in large equation systems.

Atkinson and Wilson (1992) investigated the bias in the conventional and bootstrap estimators of coefficient standard errors in SUR models. They demonstrated that under certain conditions the former were superior, but they cautioned that neither estimator uniformly dominated and hence bootstrapping provides little improvement in the estimation of standard errors for the regression coefficients. Rilstone and Veall (1996) argue that an important qualification needs to be made to this somewhat negative conclusion. They demonstrated that bootstrapping can result in improvements in inferences if the procedures are applied to the t-ratios rather than to the standard errors. These issues are explored for the case of large equation systems and when bootstrapping is combined with improved covariance estimation.

14.
In the field of chaotic time series analysis, there is a lack of a distributional theory for the main quantities used to characterize the underlying data generating process (DGP). In this paper a method for resampling time series generated by a chaotic dynamical system is proposed. The basic idea is to develop an algorithm for building trajectories which lie on the same attractor as the true DGP, that is, with the same dynamical and geometrical properties as the original data. We performed numerical experiments on short noise-free and high-noise series, confirming that we are able to correctly reproduce the distribution of the largest finite-time Lyapunov exponent and of the correlation dimension.

15.
This paper deals with an important problem with large and complex Bayesian networks. Exact inference in these networks is simply not feasible owing to the huge storage requirements of exact methods. Markov chain Monte Carlo methods, however, are able to deal with these large networks but to do this they require an initial legal configuration to set off the sampler. So far nondeterministic methods such as forward sampling have often been used for this, even though the forward sampler may take an eternity to come up with a legal configuration. In this paper a novel algorithm will be presented that allows a legal configuration in a general Bayesian network to be found in polynomial time in almost all cases. The algorithm will not be proved deterministic but empirical results will demonstrate that this holds in most cases. Also, the algorithm will be justified by its simplicity and ease of implementation.

16.
The purpose of this study is to highlight dangerous motorways via estimating the intensity of accidents and study its pattern across the UK motorway network. Two methods have been developed to achieve this aim. First, the motorway-specific intensity is estimated by using a homogeneous Poisson process. The heterogeneity across motorways is incorporated using two-level hierarchical models. The data structure is multilevel since each motorway consists of junctions that are joined by grouped segments. In the second method, the segment-specific intensity is estimated. The homogeneous Poisson process is used to model accident data within grouped segments but heterogeneity across grouped segments is incorporated using three-level hierarchical models. A Bayesian method via Markov Chain Monte Carlo is used to estimate the unknown parameters in the models and the sensitivity to the choice of priors is assessed. The performance of the proposed models is evaluated by a simulation study and an application to traffic accidents in 2016 on the UK motorway network. The deviance information criterion (DIC) and the widely applicable information criterion (WAIC) are employed to choose between models.

17.
Dealing with incomplete data is a pervasive problem in statistical surveys. Bayesian networks have been recently used in missing data imputation. In this research, we propose a new methodology for the multivariate imputation of missing data using discrete Bayesian networks and conditional Gaussian Bayesian networks. Results from imputing missing values in a coronary artery disease data set and a milk composition data set, as well as a simulation study from the cancer-neapolitan network, are presented to demonstrate and compare the performance of three Bayesian network-based imputation methods with those of multivariate imputation by chained equations (MICE) and the classical hot-deck imputation method. To assess the effect of the structure learning algorithm on the performance of the Bayesian network-based methods, two methods, the Peter-Clark algorithm and greedy search-and-score, have been applied. The Bayesian network-based methods are: first, the method introduced by Di Zio et al. [Bayesian networks for imputation, J. R. Stat. Soc. Ser. A 167 (2004), 309–322], in which each missing item of a variable is imputed using the information given in the parents of that variable; second, the method of Di Zio et al. [Multivariate techniques for imputation based on Bayesian networks, Neural Netw. World 15 (2005), 303–310], which uses the information in the Markov blanket set of the variable to be imputed; and finally, our new proposed method, which applies the whole available knowledge of all variables of interest, comprising the Markov blanket and thus the parent set, to impute a missing item. Results indicate the high quality of our new proposed method, especially in the presence of high missingness percentages and more connected networks. The new method has also been shown to be more efficient than the MICE method for small sample sizes with high missing rates.

18.
In this study, we combined a Poisson regression model with neural networks (neural network Poisson regression) to relax the traditional Poisson regression assumption of linearity of the Poisson mean as a function of covariates, while including it as a special case. In four simulated examples, we found that the neural network Poisson regression improved the performance of simple Poisson regression if the Poisson mean was nonlinearly related to covariates. We also illustrated the performance of the model in predicting five-year changes in cognitive scores, in association with age and education level; we found that the proposed approach had superior accuracy to conventional linear Poisson regression. As neural networks are often difficult to interpret, combining them with conventional and more readily interpretable approaches under the generalized linear model can benefit applications in biomedicine.
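One way to picture a neural network Poisson regression that contains the linear model as a special case is sketched below. The architecture is an assumption made for illustration: a log link, with a linear term plus tanh hidden units added to the predictor. The abstract does not specify the paper's exact form, and the data and parameter values here are hypothetical.

```python
import math


def predictor(x, beta0, beta, hidden=()):
    """eta(x) = beta0 + beta.x + sum_j v_j * tanh(w_j.x + c_j).

    `hidden` is a sequence of (v_j, w_j, c_j) triples. With no hidden units,
    or with every v_j equal to 0, this is ordinary log-linear Poisson regression.
    """
    eta = beta0 + sum(bk * xk for bk, xk in zip(beta, x))
    for v, w, c in hidden:
        eta += v * math.tanh(sum(wk * xk for wk, xk in zip(w, x)) + c)
    return eta


def poisson_nll(X, y, beta0, beta, hidden=()):
    """Negative Poisson log-likelihood with mean mu_i = exp(eta(x_i))."""
    total = 0.0
    for xi, yi in zip(X, y):
        eta = predictor(xi, beta0, beta, hidden)
        mu = math.exp(eta)                      # log link keeps the mean positive
        total += mu - yi * eta + math.lgamma(yi + 1)
    return total


# Hypothetical counts with a single covariate.
X = [[0.0], [1.0], [2.0]]
y = [1, 2, 6]
nll_linear = poisson_nll(X, y, 0.0, [0.8])
nll_network = poisson_nll(X, y, 0.0, [0.8], hidden=[(0.5, [1.0], -1.0)])
print(nll_linear, nll_network)
```

In practice the parameters would be fitted by minimising this negative log-likelihood with a gradient-based optimiser. The sketch only evaluates it at fixed values.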

19.
Frequentist and Bayesian methods differ in many aspects but share some basic optimal properties. In real-life prediction problems, situations exist in which a model based on one of the above paradigms is preferable depending on some subjective criteria. Nonparametric classification and regression techniques, such as decision trees and neural networks, have both frequentist (classification and regression trees (CARTs) and artificial neural networks) and Bayesian (Bayesian CART and Bayesian neural networks) approaches to learning from data. In this paper, we present two hybrid models combining the Bayesian and frequentist versions of CART and neural networks, which we call the Bayesian neural tree (BNT) models. BNT models can simultaneously perform feature selection and prediction, are highly flexible, and generalise well in settings with limited training observations. We study the statistical consistency of the proposed approaches and derive the optimal value of a vital model parameter. The excellent performance of the newly proposed BNT models is shown using simulation studies. We also provide some illustrative examples using a wide variety of standard regression datasets from a publicly available machine learning repository to show the superiority of the proposed models in comparison to popularly used Bayesian CART and Bayesian neural network models.

20.
We consider an autoregressive process with a nonlinear regression function that is modelled by a feedforward neural network. First, we derive a uniform central limit theorem which is useful in the context of change-point analysis. Then, we propose a test for a change in the autoregression function which – by the uniform central limit theorem – has asymptotic power one for a large class of alternatives including local alternatives not restricted to the correctly specified model.
