Similar literature
 15 similar documents found (search time: 15 ms)
1.
A functional linear regression model linking observations of a functional response variable with measurements of an explanatory functional variable is considered. This model is used to analyse a real data set describing electricity consumption in Sardinia. The interest lies in predicting either the oncoming weekend's or the oncoming weekdays' consumption, provided the actual weekdays' consumption is known. A B-spline estimator of the functional parameter is used. Selected computational issues are addressed as well.

2.
Overfitting occurs when one tries to train a large model on a small amount of data. Regularizing a neural network using prior knowledge remains a topic of research, as it is not yet settled how much prior information can be given to the neural network. In this paper, a novel algorithm is introduced which uses regularization to train a neural network without increasing the dataset. Trivial prior information in the form of a class label is supplied to the model during training. Laplace noise is introduced to the intermediate layer for better generalization. The results show significant improvement in accuracy on the standard datasets for a simple Convolutional Neural Network (CNN). While the proposed method outperforms previous regularization techniques like dropout and batch normalization, it can also be applied together with them for further improvement in performance. On the variants of MNIST, the proposed algorithm achieved an average 48% increase in test accuracy.
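The noise-injection step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names, the noise scale, and the choice to apply noise only during training are all assumptions made for illustration, and the class-label prior is not modelled here.

```python
import random

def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponentials with mean `scale`
    follows a Laplace distribution with location 0 and that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_activations(activations, scale, rng, training=True):
    """Return a copy of the hidden activations with Laplace noise added.

    Assumption: noise is applied only while training; at evaluation
    time the activations pass through unchanged.
    """
    if not training or scale <= 0:
        return list(activations)
    return [a + laplace_noise(scale, rng) for a in activations]

rng = random.Random(0)              # seeded for reproducibility
hidden = [0.2, -1.3, 0.7]           # hypothetical intermediate-layer outputs
train_out = noisy_activations(hidden, scale=0.1, rng=rng)
eval_out = noisy_activations(hidden, scale=0.1, rng=rng, training=False)
```

In a real network the same perturbation would be applied to a tensor of activations inside the forward pass; the list-based version above only shows the idea.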

3.
Artificial neural networks have been successfully applied to a variety of machine learning tasks, including image recognition, semantic segmentation, and machine translation. However, few studies have fully investigated ensembles of artificial neural networks. In this work, we investigated multiple widely used ensemble methods, including unweighted averaging, majority voting, the Bayes Optimal Classifier, and the (discrete) Super Learner, for image recognition tasks, with deep neural networks as candidate algorithms. We designed several experiments in which the candidate algorithms were the same network structure at different model checkpoints within a single training process, networks with the same structure but trained multiple times stochastically, and networks with different structures. In addition, we further studied the overconfidence phenomenon of neural networks, as well as its impact on the ensemble methods. Across all of our experiments, the Super Learner achieved the best performance among all the ensemble methods in this study.
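Two of the ensemble methods named in this abstract, unweighted averaging and majority voting, can disagree when one candidate network is very confident. The sketch below uses made-up probability vectors to show this; it does not reproduce the paper's experiments, and the Bayes Optimal Classifier and Super Learner are not covered.

```python
from collections import Counter

def unweighted_average(prob_lists):
    """Average the class-probability vectors of all models, then take the argmax."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    label = max(range(n_classes), key=lambda c: avg[c])
    return label, avg

def majority_vote(prob_lists):
    """Each model votes for its own argmax class; the most common vote wins."""
    votes = [max(range(len(p)), key=lambda c: p[c]) for p in prob_lists]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical outputs of three models for one image (two classes).
probs = [
    [0.90, 0.10],  # very confident in class 0
    [0.45, 0.55],  # weakly prefers class 1
    [0.40, 0.60],  # weakly prefers class 1
]
avg_label, avg_probs = unweighted_average(probs)  # averaging picks class 0
vote_label = majority_vote(probs)                 # voting picks class 1
```

Averaging lets a single confident model outweigh two hesitant ones, which is one way overconfidence can affect an ensemble's decision.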

4.
Bayesian neural networks for nonlinear time series forecasting   Cited by: 3 (self-citations: 0, citations by others: 3)
In this article, we apply Bayesian neural networks (BNNs) to time series analysis, and propose a Monte Carlo algorithm for BNN training. In addition, we go a step further in BNN model selection by putting a prior on network connections instead of hidden units as done by other authors. This allows us to treat the selection of hidden units and the selection of input variables uniformly. The BNN model is compared to a number of competitors, such as the Box-Jenkins model, bilinear model, threshold autoregressive model, and traditional neural network model, on a number of popular and challenging data sets. Numerical results show that the BNN model has achieved a consistent improvement over the competitors in forecasting future values. Insights on how to improve the generalization ability of BNNs are revealed in many aspects of our implementation, such as the selection of input variables, the specification of prior distributions, and the treatment of outliers.

5.
Frequentist and Bayesian methods differ in many aspects but share some basic optimal properties. In real-life prediction problems, situations exist in which a model based on one of the above paradigms is preferable depending on some subjective criteria. Nonparametric classification and regression techniques, such as decision trees and neural networks, have both frequentist approaches (classification and regression trees (CARTs) and artificial neural networks) and Bayesian counterparts (Bayesian CART and Bayesian neural networks) for learning from data. In this paper, we present two hybrid models combining the Bayesian and frequentist versions of CART and neural networks, which we call the Bayesian neural tree (BNT) models. BNT models can simultaneously perform feature selection and prediction, are highly flexible, and generalise well in settings with limited training observations. We study the statistical consistency of the proposed approaches and derive the optimal value of a vital model parameter. The excellent performance of the newly proposed BNT models is shown using simulation studies. We also provide some illustrative examples using a wide variety of standard regression datasets from a publicly available machine learning repository to show the superiority of the proposed models in comparison to the popularly used Bayesian CART and Bayesian neural network models.

6.
A relevant problem in many applied contexts is to test whether some given observations follow one of two possible probability distributions. The vast literature produced over the years on this topic does not identify a tool which can be easily adapted to any situation but only finds solutions to specific comparisons. Recently, an easy-to-implement procedure for discrimination between two distributions based on feed-forward neural networks has been proposed, giving interesting results. In this work this procedure is further investigated in terms of power, neural network architecture and expected statistical properties of the test statistic for small, moderate and large sample sizes, in a wide range of symmetric and skewed alternatives.

7.
Consider the usual linear regression model y = x′β + ε, relating a response y to a vector of predictors x. Suppose that n observations on y together with the corresponding values of x are available, and it is desired to construct simultaneous prediction intervals for k future values of y at values of x which cannot be ascertained beforehand. In most applications the regression model contains an intercept. This paper presents two sets of prediction intervals appropriate to this case. The proposed intervals are compared with those of Carlstein (1986), and the improvements are illustrated in the case of simple linear regression.

8.
For the problem of individual prediction in linear regression models, that is, estimation of a linear combination of regression coefficients, the mean square error behavior of a general class of adaptive predictors is examined.

9.
This paper compares the performance of regression analysis and a clustering-based neural network approach when the data deviate from the homoscedasticity assumption of regression. Heteroskedasticity is a problem that arises in linear regression due to unequal error variances. One of the methods for dealing with heteroskedasticity in classical regression theory is weighted least squares (WLS) regression. To deal with the problem of heteroskedasticity, a backpropagation neural network is applied. In this context, an algorithm is proposed which is based on robust estimates of the location and dispersion matrix and helps in preserving the error assumption of linear regression. The analysis is carried out with appropriate designs using simulated data, and the results are presented.
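The classical WLS remedy mentioned in this abstract can be sketched for a single predictor. This shows only the textbook technique, not the clustering-based neural network algorithm the paper proposes; the weights are assumed to be known inverse error variances, and the data are invented.

```python
def weighted_least_squares(x, y, w):
    """Fit y = intercept + slope * x by minimising sum(w_i * residual_i**2).

    Closed form for one predictor, using weighted means of x and y.
    """
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data whose error spread grows with x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [5.1, 7.8, 11.6, 13.1, 18.9]
# Assumption: error variance is proportional to x**2, so weight = 1 / x**2.
w = [1.0 / xi ** 2 for xi in x]
intercept, slope = weighted_least_squares(x, y, w)
```

Observations with larger assumed variance receive smaller weights and therefore pull the fitted line less.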

10.
Multi-layer perceptrons (MLPs), a common type of artificial neural networks (ANNs), are widely used in computer science and engineering for object recognition, discrimination and classification, and have more recently found use in process monitoring and control. Training such networks is not a straightforward optimisation problem, and we examine features of these networks which contribute to the optimisation difficulty. Although the original perceptron, developed in the late 1950s (Rosenblatt 1958, Widrow and Hoff 1960), had a binary output from each node, this was not compatible with back-propagation and similar training methods for the MLP. Hence the output of each node (and the final network output) was made a differentiable function of the network inputs. We reformulate the MLP model with the original perceptron in mind so that each node in the hidden layers can be considered as a latent (that is, unobserved) Bernoulli random variable. This maintains the property of binary output from the nodes, and with an imposed logistic regression of the hidden layer nodes on the inputs, the expected output of our model is identical to the MLP output with a logistic sigmoid activation function (for the case of one hidden layer). We examine the usual MLP objective function (the sum of squares) and show its multi-modal form and the corresponding optimisation difficulty. We also construct the likelihood for the reformulated latent variable model and maximise it by standard finite mixture ML methods using an EM algorithm, which provides stable ML estimates from random starting positions without the need for regularisation or cross-validation. Over-fitting of the number of nodes does not affect this stability. This algorithm is closely related to the EM algorithm of Jordan and Jacobs (1994) for the Mixture of Experts model. We conclude with some general comments on the relation between the MLP and latent variable models.
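The claim that the expected output of the latent Bernoulli model matches the sigmoid MLP output (one hidden layer) can be checked numerically. The sketch below assumes a linear output unit and uses arbitrary, hypothetical weights; it illustrates the identity only and does not implement the paper's EM algorithm.

```python
import math
import random

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def hidden_probs(x, hidden_w, hidden_b):
    """Logistic regression of each hidden node on the inputs."""
    return [sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
            for w, b in zip(hidden_w, hidden_b)]

def mlp_output(x, hidden_w, hidden_b, out_w, out_b):
    """Standard MLP: sigmoid hidden activations, linear output (assumed)."""
    h = hidden_probs(x, hidden_w, hidden_b)
    return out_b + sum(v * hj for v, hj in zip(out_w, h))

def latent_output(x, hidden_w, hidden_b, out_w, out_b, rng):
    """Latent-variable model: each hidden node is a Bernoulli draw (0 or 1)."""
    p = hidden_probs(x, hidden_w, hidden_b)
    z = [1 if rng.random() < pj else 0 for pj in p]
    return out_b + sum(v * zj for v, zj in zip(out_w, z))

# Hypothetical network: 2 inputs, 2 hidden nodes, 1 output.
hidden_w = [[0.5, -1.0], [1.5, 0.3]]
hidden_b = [0.1, -0.4]
out_w = [1.5, -2.0]
out_b = 0.25
x = [0.8, -0.2]

rng = random.Random(42)
n_draws = 20000
mc_mean = sum(latent_output(x, hidden_w, hidden_b, out_w, out_b, rng)
              for _ in range(n_draws)) / n_draws
deterministic = mlp_output(x, hidden_w, hidden_b, out_w, out_b)
# mc_mean approximates `deterministic`, since E[z_j] equals the sigmoid value.
```

Because the output is linear in the hidden nodes, taking the expectation replaces each binary node by its success probability, which is exactly the sigmoid activation.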

11.
Although the effect of missing data on regression estimates has received considerable attention, their effect on predictive performance has been neglected. We studied the performance of three missing data strategies (omission of records with missing values, replacement with a mean, and imputation based on regression) on the predictive performance of logistic regression (LR), classification tree (CT) and neural network (NN) models in the presence of data missing completely at random (MCAR). Models were constructed using datasets of size 500 simulated from a joint distribution of binary and continuous predictors including nonlinearities, collinearity and interactions between variables. Though omission produced models that fit better on the data from which the models were developed, imputation was superior on average to omission for all models when evaluating the receiver operating characteristic (ROC) curve area, mean squared error (MSE), pooled variance across outcome categories and calibration χ² on an independently generated test set. However, in about one-third of simulations, omission performed better. Performance was also more variable with omission, including quite a few instances of extremely poor performance. Replacement and imputation generally produced similar results except with neural networks, for which replacement, the strategy typically used in neural network algorithms, was inferior to imputation. Missing data affected simpler models much less than they did more complex models such as generalized additive models that focus on local structure. For moderate-sized datasets, logistic regressions that use simple nonlinear structures such as quadratic terms and piecewise linear splines appear to be at least as robust to randomly missing values as neural networks and classification trees.
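Two of the three missing-data strategies compared in this abstract, omission and mean replacement, are simple enough to sketch directly. The record layout (lists with `None` marking a missing value) and the function names are assumptions for illustration; regression-based imputation is not shown.

```python
def omit_incomplete(rows):
    """Omission: drop every record that has at least one missing value."""
    return [list(r) for r in rows if None not in r]

def replace_with_mean(rows):
    """Replacement: fill each missing value with the observed column mean."""
    n_cols = len(rows[0])
    means = []
    for c in range(n_cols):
        observed = [r[c] for r in rows if r[c] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[c] if v is None else v for c, v in enumerate(r)]
            for r in rows]

# Hypothetical records with two continuous predictors.
rows = [
    [1.0, 2.0],
    [None, 4.0],
    [3.0, None],
]
complete_only = omit_incomplete(rows)   # keeps only the first record
mean_filled = replace_with_mean(rows)   # keeps all three records
```

Omission shrinks the training set, while replacement keeps every record at the cost of pulling filled values toward the column centre.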

12.
The use of GARCH-type models and computational-intelligence-based techniques for forecasting financial time series has proved extremely successful in recent times. In this article, we apply the finite mixture of ARMA-GARCH model instead of AR or ARMA models to compare with the standard BP and SVM in forecasting financial time series (daily stock market index returns and exchange rate returns). We do not apply the pure GARCH model, as the finite mixture of ARMA-GARCH model outperforms the pure GARCH model. These models are evaluated on five performance metrics or criteria. Our experiment shows that the SVM model outperforms both the finite mixture of ARMA-GARCH and BP models in deviation performance criteria. In direction performance criteria, the finite mixture of ARMA-GARCH model performs better. The memory property of these forecasting techniques is also examined using the behavior of forecasted values vis-à-vis the original values. Only the SVM model shows the long memory property in forecasting financial returns.

13.
We consider an autoregressive process with a nonlinear regression function that is modelled by a feedforward neural network. First, we derive a uniform central limit theorem which is useful in the context of change-point analysis. Then, we propose a test for a change in the autoregression function which, by the uniform central limit theorem, has asymptotic power one for a large class of alternatives including local alternatives not restricted to the correctly specified model.

14.
15.
Networks of ambient monitoring stations are used to monitor environmental pollution fields such as those for acid rain and air pollution. Such stations provide regular measurements of pollutant concentrations. The networks are established for a variety of purposes at various times, so several stations measuring different subsets of pollutant concentrations can often be found in compact geographical regions. The problem of statistically combining these disparate information sources into a single 'network' then arises. Capitalizing on the efficiencies so achieved can then lead to the secondary problem of extending this network. The subject of this paper is a set of 31 air pollution monitoring stations in southern Ontario. Each of these regularly measures a particular subset of ionic sulphate, sulphite, nitrite and ozone. However, this subset varies from station to station. For example, only two stations measure all four. Some measure just one. We describe a Bayesian framework for integrating the measurements of these stations to yield a spatial predictive distribution for unmonitored sites and unmeasured concentrations at existing stations. Furthermore, we show how this network can be extended by using an entropy maximization criterion. The methods assume that the multivariate response field being measured has a joint Gaussian distribution conditional on its mean and covariance function. A conjugate prior is used for these parameters, some of its hyperparameters being fitted empirically.
