Found 20 similar documents; search took 15 ms
1.
Coppi et al. [7] applied Yang and Wu's [20] idea to propose a possibilistic k-means (PkM) clustering algorithm for LR-type fuzzy numbers. The memberships in the objective function of PkM no longer need to satisfy the fuzzy k-means constraint that the memberships of a data point across classes sum to one. However, the clustering performance of PkM depends on the initialization and on the weighting exponent. In this paper, we propose a robust clustering method based on a self-updating procedure. The proposed algorithm not only solves the initialization problem but also obtains a good clustering result. Several numerical examples demonstrate the effectiveness and accuracy of the proposed clustering method, especially its robustness to initial values and noise. Finally, three real fuzzy data sets are used to illustrate the superiority of the proposed algorithm.
2.
Qunfang Xu 《Statistics》2017,51(6):1280-1303
In this paper, semiparametric modelling for longitudinal data with an unstructured error process is considered. We propose a partially linear additive regression model for longitudinal data in which within-subject variances and covariances of the error process are described by unknown univariate and bivariate functions, respectively. We provide an estimating approach in which polynomial splines are used to approximate the additive nonparametric components and the within-subject variance and covariance functions are estimated nonparametrically. Both the asymptotic normality of the resulting parametric component estimators and optimal convergence rate of the resulting nonparametric component estimators are established. In addition, we develop a variable selection procedure to identify significant parametric and nonparametric components simultaneously. We show that the proposed SCAD penalty-based estimators of non-zero components have an oracle property. Some simulation studies are conducted to examine the finite-sample performance of the proposed estimation and variable selection procedures. A real data set is also analysed to demonstrate the usefulness of the proposed method.
3.
Ufuk Yolcu Erol Egrioglu Vedide R. Uslu 《Journal of Statistical Computation and Simulation》2013,83(4):599-612
Artificial intelligence procedures such as artificial neural networks (ANNs), genetic algorithms and particle swarm optimization, and other procedures such as fuzzy clustering, have been successfully used in the various stages of different fuzzy time-series forecasting approaches. Fuzzy clustering, genetic algorithms and particle swarm optimization are generally used in the fuzzification stage, which simplifies the application of this stage and makes the fuzzy time-series approach more systematic. ANNs have also been applied successfully in the fuzzy relationship determination stage. In this study, we propose a new hybrid fuzzy time-series approach in which the fuzzy c-means clustering procedure is employed in the fuzzification stage and feed-forward neural networks are used in the fuzzy relationship determination stage. This study also includes an empirical analysis pertaining to the forecasting of Index 100 for the stocks and bonds exchange market of Istanbul.
4.
Langford IH Leyland AH Rasbash J Goldstein H 《Journal of the Royal Statistical Society. Series C, Applied statistics》1999,48(2):253-268
Multilevel modelling is used on problems arising from the analysis of spatially distributed health data. We use three applications to demonstrate the use of multilevel modelling in this area. The first concerns small area all-cause mortality rates from Glasgow where spatial autocorrelation between residuals is examined. The second analysis is of prostate cancer cases in Scottish counties where we use a range of models to examine whether the incidence is higher in more rural areas. The third develops a multiple-cause model in which deaths from cancer and cardiovascular disease in Glasgow are examined simultaneously in a spatial model. We discuss some of the issues surrounding the use of complex spatial models and the potential for future developments.
5.
The estimation of the variance function plays an extremely important role in statistical inference for regression models. In this paper we propose a variance modelling method that constructs the variance structure by combining the exponential polynomial modelling method and the kernel smoothing technique. A simple estimation method for the parameters in heteroscedastic linear regression models is developed for the case where the covariance matrix is unknown and diagonal and the variance function is a positive function of the mean. The consistency and asymptotic normality of the resulting estimators are established under some mild assumptions. In particular, a simple version of the bootstrap test is adapted to test misspecification of the variance function. Some Monte Carlo simulation studies are carried out to examine the finite sample performance of the proposed methods. Finally, the methodologies are illustrated with the ozone concentration dataset.
6.
Gavin Shaddick Haojie Yan Ruth Salway Danielle Vienneau Daphne Kounali David Briggs 《Journal of applied statistics》2013,40(4):777-794
The potential effects of air pollution are a major concern both in terms of the environment and in relation to human health. In order to support environmental policy, there is a need for accurate measurements of the concentrations of pollutants at high geographical resolution over large regions. However, within such regions, there are likely to be areas where the monitoring information will be sparse and so methods are required to accurately predict concentrations. Set within a Bayesian framework, models are developed which exploit the relationships between pollution and geographical covariate information, such as land use, climate and transport variables together with spatial structure. Candidate models are compared based on their ability to predict a set of validation sites. The chosen model is used to perform large-scale prediction of nitrogen dioxide at a 1×1 km resolution for the entire EU. The models allow probabilistic statements to be made with regard to the levels of air pollution that might be experienced in each area. When combined with population data, such information can be invaluable in informing policy by indicating areas for which improvements may be given priority.
7.
It is difficult to model the stock market because of its uncertainty. Many methods have been introduced to tackle this difficulty, among which fuzzy time series have shown their advantages in dealing with fuzzy and uncertain data. In recent years, many researchers have applied fuzzy time series to analyze and forecast stock prices, and how to improve forecasting accuracy has attracted much attention. In this paper, the data are first preprocessed and a new way to divide the universe of discourse is given, after which the data are fuzzified by applying the triangular membership function, and then a three-layer back propagation (BP) neural network is established. Finally, the generalized inverse fuzzy number formula is applied to defuzzify the relation obtained with the prediction results. The proposed method is applied to predict the stock price of State Bank of India (SBI) and the Dow-Jones Industrial Average (DJIA). The experimental results show that the proposed method can greatly improve the accuracy of forecasting. Furthermore, the proposed method is not sensitive to its parameters.
8.
The present investigation was undertaken to study the gillnet catch efficiency of sardines in the coastal waters of Sri Lanka using commercial catch and effort data. Commercial catch and effort data of the small mesh gillnet fishery were collected in five fisheries districts during the period May 1999–August 2002. Gillnet catch efficiency of sardines was investigated by developing predictive models of catch rates using data on commercial fisheries and environmental variables. Three statistical techniques [multiple linear regression, generalized additive model and regression tree model (RTM)] were employed to predict the catch rates of trenched sardine Amblygaster sirm (the key target species of the small mesh gillnet fishery) and other sardines (Sardinella longiceps, S. gibbosa, S. albella and S. sindensis). The data collection programme was conducted for another six months and the models were tested on the new data. RTMs were found to be the strongest in terms of reliability and accuracy of the predictions. The two operational characteristics used here for model formulation (i.e. depth of fishing and number of gillnet pieces used per fishing operation) were more useful as predictor variables than the environmental variables. The study revealed that the catch rates of A. sirm tend to increase rapidly with increased sea depth up to around 32 m.
9.
In this paper, a new hybrid model of vector autoregressive moving average (VARMA) models and Bayesian networks is proposed to improve the forecasting performance of multivariate time series. In the proposed model, the VARMA model, which is a popular linear model in time series forecasting, is specified to capture the linear characteristics. Then the errors of the VARMA model are clustered into trends by the K-means algorithm, with the Krzanowski–Lai cluster validity index determining the number of trends, and a Bayesian network is built to learn the relationship between the data and the trend of its corresponding VARMA error. Finally, the estimated values of the VARMA model are compensated by the probabilities of their corresponding VARMA errors belonging to each trend, which are obtained from the Bayesian network. Experimental results from a simulation study and two real-world multivariate data sets indicate that, compared with VARMA models, the proposed model can effectively improve prediction performance.
10.
《Journal of Statistical Computation and Simulation》2012,82(6):837-853
An important problem in statistics is the study of longitudinal data taking into account the effect of other explanatory variables such as treatments and time. In this paper, a new Bayesian approach for analysing longitudinal data is proposed. This innovative approach takes into account the possibility of having nonlinear regression structures on the mean and linear regression structures on the variance–covariance matrix of normal observations, and it is based on the modelling strategy suggested by Pourahmadi [M. Pourahmadi, Joint mean-covariance models with applications to longitudinal data: Unconstrained parameterizations, Biometrika, 87 (1999), pp. 667–690.]. We initially extend the classical methodology to accommodate the fitting of nonlinear mean models; then we propose our Bayesian approach based on a generalization of the Metropolis–Hastings algorithm of Cepeda [E.C. Cepeda, Variability modeling in generalized linear models, Unpublished Ph.D. Thesis, Mathematics Institute, Universidade Federal do Rio de Janeiro, 2001]. Finally, we illustrate the proposed methodology by analysing one example, the cattle data set, which is used to study cattle growth.
11.
Sophie Bercu 《Journal of applied statistics》2013,40(6):1333-1348
A dynamic coupled modelling is investigated to take temperature into account in the individual energy consumption forecasting. The objective is both to avoid the inherent complexity of exhaustive SARIMAX models and to take advantage of the usual linear relation between energy consumption and temperature for thermosensitive customers. We first recall some issues related to individual load curves forecasting. Then, we propose and study the properties of a dynamic coupled modelling taking temperature into account as an exogenous contribution and its application to the intraday prediction of energy consumption. Finally, these theoretical results are illustrated on a real individual load curve. The authors discuss the relevance of such an approach and anticipate that it could form a substantial alternative to the commonly used methods for energy consumption forecasting of individual customers.
12.
Arthur Renshaw Steven Haberman 《Journal of the Royal Statistical Society. Series C, Applied statistics》2003,52(1):119-137
Summary. The paper presents a reinterpretation of the model underpinning the Lee–Carter methodology for forecasting mortality (and other vital) rates. A parallel methodology based on generalized linear modelling is introduced. The use of residual plots is proposed for both methods to aid the assessment of the goodness of fit. The two methods are compared in terms of structure and assumptions. They are then compared through an analysis of the gender- and age-specific mortality rates for England and Wales over the period 1950–1998 and through a consideration of the forecasts generated by the two methods. The paper also compares different approaches to the forecasting of life expectancy and considers the effectiveness of the Coale–Guo method for extrapolating mortality rates to the oldest ages.
13.
To gain regulatory approval, a new medicine must demonstrate that its benefits outweigh any potential risks, i.e., that the benefit‐risk balance is favourable towards the new medicine. For transparency and clarity of the decision, a structured and consistent approach to benefit‐risk assessment that quantifies uncertainties and accounts for underlying dependencies is desirable. This paper proposes two approaches to benefit‐risk evaluation, both based on the idea of joint modelling of mixed outcomes that are potentially dependent at the subject level. Using Bayesian inference, the two approaches offer interpretability and efficiency to enhance qualitative frameworks. Simulation studies show that accounting for correlation leads to a more accurate assessment of the strength of evidence to support benefit‐risk profiles of interest. Several graphical approaches are proposed that can be used to communicate the benefit‐risk balance to project teams. Finally, the two approaches are illustrated in a case study using real clinical trial data.
14.
We focus our attention on the classification of fuzzy time trajectories with triangular membership function, described by a given set of individuals. To this purpose, we adopt a fully informational approach, explicitly recognizing the informational nature shared by the ingredients of the classification procedure: the observed data (Empirical Information) and the classification model (Theoretical Information). In particular, by supposing that the informational paradigm has a fuzzy nature, we suggest three fuzzy clustering models allowing the classification of the triangular fuzzy time trajectories, based on the analysis of the cross sectional and/or longitudinal characteristics of their components (centers and spreads). Two applicative examples are illustrated.
15.
Pulak Ghosh Paramjit Gill Saman Muthukumarana Tim Swartz 《Australian & New Zealand Journal of Statistics》2010,52(3):289-302
This paper considers the use of Dirichlet process prior distributions in the statistical analysis of network data. Dirichlet process prior distributions have the advantages of avoiding the parametric specifications for distributions, which are rarely known, and of facilitating a clustering effect, which is often applicable to network nodes. The approach is highlighted for two network models and is conveniently implemented using WinBUGS software.
16.
Y. Lee & J. A. Nelder 《Journal of the Royal Statistical Society. Series C, Applied statistics》2000,49(3):413-419
The human sex ratio data, collected in Saxony in the 19th century by Geissler, are reanalysed by joint modelling of the mean and dispersion. Extended quasi-likelihood and the unnormalized double-exponential family are shown to lead to identical inference. The use of the unnormalized form is discussed. The relationship between multinomial and Poisson models is studied for overdispersed data.
17.
There are numerous difficulties involved in the drilling operations of an oil well, one of the most important being well control. Well control systems are applied when there is an irruption of liquids or an unwanted intrusion of the reservoir's fluid (oil, gas or brine) into the well during drilling, which occurs when the pressure of the well fluid column is less than the formation pressure and the permeability of the reservoir is high enough to let the fluid pass through. For this purpose, a variety of methods, including the Driller, wait and weight, and concurrent methods, have been used to control the well at different drilling sites. In this study, we investigate the optimum method for well control using a fuzzy method based on many parameters, including technical factors (mud weight, drilling rate, blockage of pipes, sensitivity to drilling network changes, etc.), security factors (existence of effervescent mud, drilling circuit control, etc.) and the cost of selection, which is one of the most important decisions made under critical conditions such as irruption. Until now, these methods were selected based on the experience of field personnel at drilling sites. Because the technical criteria and standards were influenced by experience, a soft computing system (the fuzzy method) was used. Thus, both these criteria and standards would be of greater importance and indicate whether the optimum numerical method is the same one that is expressed by human experience. At the end of the evaluation, the fuzzy method selected the concurrent method as the best for well control, while field personnel experience suggests the Driller method.
18.
Upali W. Jayasinghe Herbert W. Marsh Nigel Bond 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2003,166(3):279-300
Summary. The peer review of grant proposals is very important to academics from all disciplines. Although there is limited research on the reliability of assessments for grant proposals, previously reported single-rater reliabilities have been disappointingly low (between 0.17 and 0.37). We found that the single-rater reliability of the overall assessor rating for Australian Research Council grants was 0.21 for social science and humanities (2870 ratings, 1928 assessors and 687 proposals) and 0.19 for science (7153 ratings, 4295 assessors and 1644 proposals). We used a multilevel, cross-classification approach (level 1, assessor and proposal cross-classification; level 2, field of study), taking into account that 34% of the assessors evaluated more than one proposal. Researcher-nominated assessors (those chosen by the authors of the research proposal) gave higher ratings than panel-nominated assessors chosen by the Australian Research Council, and proposals from more prestigious universities received higher ratings. In the social sciences and humanities, the status of Australian universities had significantly more effect on Australian assessors than on overseas assessors. In science, ratings were higher when assessors rated fewer proposals and apparently had a more limited frame of reference for making such ratings and when researchers were professors rather than non-professors. Particularly, the methodology of this large scale study is applicable to other forms of peer review (publications, job interviews, awarding of prizes and election to prestigious societies) where peer review is employed as a selection process.
19.
K. Fernández-Aguirre M. I. Landaluce-Calvo A. Martín-Arroyuelos J. I. Modroño-Herrán 《Journal of applied statistics》2011,38(11):2661-2679
For a relatively young public higher education institution that competes locally with a private, long-established and reputed one, it is of great importance to become a reference university, to be better known and identified with in the society it belongs to, and ultimately to reach a good position within the European Higher Education Area. These considerations have led the university governors to set the objective of adequately managing the university's institutional brand, focusing on its logo and on image promotion, which led to the establishment of a university shop, considered a highly adequate instrument for such promotion. In this context, an on-line survey is launched among three different kinds of members of the institution, resulting in a large data sample. Different kinds of variables are analysed through appropriate exploratory multivariate techniques (symmetrical methods) and regression-related techniques (non-symmetrical methods). The case for such a combination is made as a conclusion. The application of statistical techniques of data and text mining provides us with empirical insights about the institution members’ perceptions and helps us to extract some facts valuable for establishing policies that would improve the corporate identity and the success of the corporate shop.
20.
Minimum Message Length (MML) is an invariant Bayesian point estimation technique which is also statistically consistent and efficient. We provide a brief overview of MML inductive inference (Wallace C.S. and Boulton D.M. 1968. Computer Journal, 11: 185–194; Wallace C.S. and Freeman P.R. 1987. J. Royal Statistical Society (Series B), 49: 240–252; Wallace C.S. and Dowe D.L. (1999). Computer Journal), and how it has both an information-theoretic and a Bayesian interpretation. We then outline how MML is used for statistical parameter estimation, and how the MML mixture modelling program, Snob (Wallace C.S. and Boulton D.M. 1968. Computer Journal, 11: 185–194; Wallace C.S. 1986. In: Proceedings of the Nineteenth Australian Computer Science Conference (ACSC-9), Vol. 8, Monash University, Australia, pp. 357–366; Wallace C.S. and Dowe D.L. 1994b. In: Zhang C. et al. (Eds.), Proc. 7th Australian Joint Conf. on Artif. Intelligence. World Scientific, Singapore, pp. 37–44. See http://www.csse.monash.edu.au/-dld/Snob.html) uses the message lengths from various parameter estimates to enable it to combine parameter estimation with selection of the number of components and estimation of the relative abundances of the components. The message length is (to within a constant) the logarithm of the posterior probability (not a posterior density) of the theory. So, the MML theory can also be regarded as the theory with the highest posterior probability. Snob currently assumes that variables are uncorrelated within each component, and permits multi-variate data from Gaussian, discrete multi-category (or multi-state or multinomial), Poisson and von Mises circular distributions, as well as missing data. Additionally, Snob can do fully-parameterised mixture modelling, estimating the latent class assignments in addition to estimating the number of components, the relative abundances of the parameters and the component parameters. 
We also report on extensions of Snob for data which has sequential or spatial correlations between observations, or correlations between attributes.