Similar Literature
20 similar documents retrieved (search time: 31 ms)
1.
Network autocorrelation models (NAMs) are widely used to study a response variable of interest among subjects embedded within a network. Although the NAM is highly useful for studying such networked observational units, several simulation studies have raised concerns about point estimation. Specifically, these studies have consistently demonstrated a negative bias in maximum likelihood estimators (MLEs) of the network effect parameter. However, in order to gain a practical understanding of point estimation in the NAM, these findings need to be expanded in three important ways. First, these simulation studies are based on relatively simple network generative models rather than observed networks, leaving open the question of how realistic network topologies may affect point estimation in practice. Second, although strong work has been done on developing two-stage least squares estimators as well as Bayesian estimators, only the MLE has received extensive attention in the literature, leaving practitioners uncertain about best practices. Third, the performance of these estimators needs to be compared in terms of both bias and variance, as well as the coverage rate of each estimator's corresponding confidence or credible interval. In this paper we describe a simulation study that aims to overcome these shortcomings in the following way. We first fitted real social networks with the exponential random graph model and used the Bayesian posterior predictive distribution to generate networks with realistic topologies. We then compared the performance of the three estimators mentioned above.
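For reference, the network autocorrelation model discussed throughout these abstracts is usually written as follows (standard notation, not quoted from the article above):

```latex
% Standard network autocorrelation model (NAM): W is the (often row-normalized)
% connectivity matrix, rho the network effect, beta the individual-level effects.
y = \rho W y + X\beta + \varepsilon,
\qquad \varepsilon \sim \mathcal{N}(0, \sigma^{2} I),
\qquad\text{so that}\qquad
y = (I - \rho W)^{-1}(X\beta + \varepsilon).
```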

2.
Researchers interested in the effects of social network ties on behavior are increasingly turning to the network autocorrelation model, which allows for the simultaneous estimation of individual-level and network-level effects. Earlier research, however, had pointed to the possibility that the maximum likelihood procedure used to estimate the network autocorrelation model yields negatively biased parameter estimates. In this paper we use simulations to examine whether, and under which conditions, a negative bias exists. We show that the estimate of the network autocorrelation parameter ρ is negatively biased under nearly all conditions, and that this bias becomes more severe at higher levels of both ρ and network density. We conclude by discussing the implications of these findings for researchers planning to use the network autocorrelation model.
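As a rough illustration of the kind of simulation described here (a sketch under assumed settings, not the authors' code), the following Python snippet draws data from a network autocorrelation model with a randomly generated row-normalized W and computes the maximum likelihood estimate of ρ from the profile log-likelihood; averaging the estimates over replications typically shows the downward shift relative to the true ρ. All function names, network sizes, and parameter values are illustrative assumptions.

```python
# Minimal simulation sketch (not the authors' code): generate data from a NAM
# and compute the MLE of rho by maximizing the profile log-likelihood.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(42)

def make_w(n, density):
    """Random symmetric adjacency matrix, row-normalized."""
    a = (rng.random((n, n)) < density).astype(float)
    a = np.triu(a, 1)
    a = a + a.T
    rowsums = a.sum(axis=1, keepdims=True)
    rowsums[rowsums == 0] = 1.0          # isolates keep a zero row
    return a / rowsums

def simulate_y(w, x, beta, rho, sigma=1.0):
    """Draw y from y = rho*W*y + X*beta + eps."""
    n = w.shape[0]
    eps = rng.normal(scale=sigma, size=n)
    return np.linalg.solve(np.eye(n) - rho * w, x @ beta + eps)

def profile_negloglik(rho, w, x, y):
    """Negative profile log-likelihood in rho (beta, sigma^2 concentrated out)."""
    n = w.shape[0]
    a = np.eye(n) - rho * w
    z = a @ y
    beta_hat, *_ = np.linalg.lstsq(x, z, rcond=None)
    resid = z - x @ beta_hat
    sigma2_hat = resid @ resid / n
    _, logdet = np.linalg.slogdet(a)
    return -(logdet - 0.5 * n * np.log(sigma2_hat))

n, rho_true = 50, 0.3
w = make_w(n, density=0.2)
x = np.column_stack([np.ones(n), rng.normal(size=n)])
estimates = []
for _ in range(500):                      # Monte Carlo replications
    y = simulate_y(w, x, beta=np.array([1.0, 2.0]), rho=rho_true)
    res = minimize_scalar(profile_negloglik, bounds=(-0.99, 0.99),
                          args=(w, x, y), method="bounded")
    estimates.append(res.x)

print("mean MLE of rho:", np.mean(estimates))  # typically below rho_true
```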

3.
The network autocorrelation model has become an increasingly popular tool for conducting social network analysis. More and more researchers, however, have documented evidence of a systematic negative bias in the estimation of the network effect (ρ). In this paper, we take a different approach to the problem by investigating the conditions under which, despite the underestimation bias, a network effect can still be detected by the network autocorrelation model. Using simulations, we find that moderately sized network effects (e.g., ρ = .3) are still often detectable in modest-sized networks (i.e., 40 or more nodes). Analyses reveal that statistical power is primarily a nonlinear function of network effect size (ρ) and network size (N), although both of these factors can interact with network density and network structure to impair power under certain rare conditions. We conclude by discussing the implications of these findings and offering guidelines for users of the autocorrelation model.

4.
Misspecification in network autocorrelation models poses a challenge for parameter estimation, one that is amplified by missing data. Model misspecification has been a focus of recent work in the statistics literature, and new robust procedures have been developed, in particular cutting feedback. This paper shows how cutting feedback helps in a misspecified network autocorrelation model. Where model misspecification is mild and the traits are fully observed, Bayesian imputation is routine. In settings with high missingness, Bayesian inference can fail, but a closely related cut model is robust. We illustrate this on a data set of graduate students using a Facebook-like messaging app.

5.
The network autocorrelation model has been a workhorse for modeling network influences on individual behavior. Standard network approaches to mapping social influence, however, are limited to specifying an influence weight matrix (W) based on a single-mode network. Additionally, it has been demonstrated that the estimate of the autocorrelation parameter ρ of the network effect tends to be negatively biased as the density of the W matrix increases. The current study introduces a two-mode version of the network autocorrelation model. We then conduct simulations to examine the conditions under which bias might exist. We show that the estimate of the affiliation autocorrelation parameter (ρ) tends to be negatively biased as density increases, as in the one-mode case. However, including the diagonal of W (the number of events each actor participated in) as one of the variables in the regression model helps to attenuate this bias. We discuss the implications of these results.
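As a hedged illustration (one common construction, not necessarily the exact specification used in the article), an n × m actor-by-event affiliation matrix A induces a one-mode weight matrix by projection:

```latex
% One-mode projection of a binary n x m affiliation matrix A (actors x events).
W = A A^{\top},
\qquad
W_{ij} = \sum_{k=1}^{m} A_{ik} A_{jk} \quad (i \neq j:\ \text{events shared by } i \text{ and } j),
\qquad
W_{ii} = \sum_{k=1}^{m} A_{ik} \quad (\text{events attended by } i).
```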

6.
We propose using latent class analysis as an alternative to log-linear analysis for the multiple imputation of incomplete categorical data. Like log-linear models, latent class models can describe complex association structures between the variables used in the imputation model. Unlike log-linear models, however, latent class models can be used to build large imputation models containing more than a few categorical variables. To obtain imputations reflecting uncertainty about the unknown model parameters, we use a nonparametric bootstrap procedure as an alternative to the more common fully Bayesian approach. The proposed multiple imputation method, which is implemented in the Latent GOLD software for latent class analysis, is illustrated with two examples. In a simulated data example, we compare the new method to well-established methods such as maximum likelihood estimation with incomplete data and multiple imputation using a saturated log-linear model. This example shows that the proposed method yields unbiased parameter estimates and standard errors. The second example concerns an application to a typical social science data set containing 79 variables, all of which are included in the imputation model. The proposed method is especially useful for such large data sets, because standard methods for dealing with missing data in categorical variables break down when the number of variables is this large.

7.
The co-authorship among members of a research group can commonly be represented by a (co-authorship) graph in which nodes represent the researchers that make up the group and edges represent the connections between two agents (i.e., the co-authorship between these agents). The current study measures the reliability of such networks by considering unreliable nodes (researchers) and perfectly reliable edges (co-authorship between two researchers). A Bayesian approach to the reliability of a network represented by the co-authorship among members of a real research group is proposed, yielding Bayesian estimates and credible intervals for the individual components (nodes, i.e., researchers) and for the network as a whole. Weakly informative and non-informative prior distributions are assumed for those components, and the posterior summaries are obtained by Markov chain Monte Carlo methods. The results show the relevance of an inferential approach to the reliability of a scientific co-authorship network. They also indicate that the contribution of each researcher is highly relevant for the maintenance of a research group. In addition, the Bayesian methodology proved feasible and easy to implement computationally.

8.
Social Networks, 2002, 24(1): 1-20
Egocentered networks are common in social science research. Here, the unit of analysis is a respondent (ego) together with his or her personal network (alters). Usually, several variables are used to describe the relationship between egos and alters. In this paper, the aim is to estimate the reliability and validity of the averages of these measures by the multitrait–multimethod (MTMM) approach. This approach usually requires at least three repeated measurements (methods) of the same variable (trait) for model identification, which places a considerable burden on the respondent and increases the cost of data collection. Here we use a split-ballot MTMM experimental design, proposed by Saris (1999), in which separate groups of respondents get different combinations of just two methods. The design can also be regarded as having a planned missing data structure. Maximum likelihood estimation is used, in the manner suggested by Allison (1987), to fit the confirmatory factor analysis model for MTMM designs specified in Saris and Andrews (1991). The procedure is applied to social support data collected in the city of Ljubljana (Slovenia) in the year 2000.
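For orientation, the true-score MTMM model of Saris and Andrews (1991) decomposes each observed measure roughly as follows (generic notation; a simplified statement rather than the exact model fitted in the paper):

```latex
% True-score MTMM decomposition for trait i measured with method j:
% r = reliability coefficient, v = validity coefficient, m = method effect,
% F = trait factor, M = method factor, e = random measurement error.
Y_{ij} = r_{ij} T_{ij} + e_{ij},
\qquad
T_{ij} = v_{ij} F_i + m_{ij} M_j .
```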

9.
Network autocorrelation models have been widely used for decades to model the joint distribution of the attributes of a network's actors. This class of models can estimate both the effect of individual characteristics and the network effect, or social influence, on some actor attribute of interest. Collecting data on the entire network, however, is very often infeasible or impossible if the network boundary is unknown or difficult to define. Obtaining egocentric network data overcomes these obstacles, but as yet there has been no clear way to model this type of data while still appropriately capturing the network effect on the actor attributes in a way that is compatible with a joint distribution on the full network data. This paper adapts the class of network autocorrelation models to handle egocentric data. The proposed methods thus incorporate the complex dependence structure of the data induced by the network, rather than simply using ad hoc measures of the egos' networks to model the mean structure, and can estimate the network effect on the actor attribute of interest. The vast quantity of unknown information about the network can be succinctly represented in a way that depends only on the number of alters in the egocentric network data and not on the total number of actors in the network. Estimation is done within a Bayesian framework. A simulation study is performed to evaluate the estimation performance, and an egocentric data set is analyzed in which the aim is to determine whether there is a network effect on environmental mastery, an important aspect of psychological well-being.

10.
Exponential random graph models are an important tool in the statistical analysis of network data. However, Bayesian parameter estimation for these models is extremely challenging, since evaluation of the posterior distribution typically involves the calculation of an intractable normalizing constant. This barrier motivates the consideration of tractable approximations to the likelihood function, and the pseudolikelihood function offers one approach to constructing such an approximation. Naive implementation of what we term a pseudo-posterior, obtained by replacing the likelihood function in the posterior distribution by the pseudolikelihood, is likely to give misleading inferences. We provide practical guidelines for correcting a sample from such a pseudo-posterior distribution so that it is approximately distributed according to the target posterior distribution, and we discuss the computational and statistical efficiency that results from this approach. We illustrate our methodology through the analysis of real-world graphs. Comparisons against the approximate exchange algorithm of Caimo and Friel (2011) are provided, followed by concluding remarks.
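For context, the exponential random graph likelihood and the pseudolikelihood approximation referred to above take the following standard forms (generic notation, not quoted from the paper):

```latex
% ERGM likelihood with intractable normalizing constant z(theta); the
% pseudolikelihood replaces it with a product of dyadic full conditionals.
\pi(x \mid \theta) = \frac{\exp\{\theta^{\top} s(x)\}}{z(\theta)},
\qquad
\pi_{\text{pseudo}}(x \mid \theta) = \prod_{i<j} \pi\!\left(x_{ij} \mid x_{-ij}, \theta\right),
```

where s(x) is the vector of network statistics and x_{-ij} denotes all dyads other than (i, j).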

11.
Exponential random graph models have been widely adopted as a general probabilistic framework for complex networks and recently extended to embrace broader statistical settings such as dynamic networks, valued networks, and two-mode networks. Our aim is to take a further step in the generalization of this class of models by considering sample spaces that involve both families of networks and nodal properties satisfying combinatorial constraints. We propose a class of probabilistic models for the joint distribution of nodal properties (demographic and behavioral characteristics) and network structures (friendship and professional partnership). The result is a general and flexible modeling framework for accounting for homophily in social structures. We present a Bayesian estimation method based on the full characterization of the sample spaces by systems of linear constraints, which provides an exact simulation scheme for sampling from the likelihood based on linear programming techniques. After a detailed analysis of the proposed statistical methodology, we illustrate our approach with an empirical analysis of co-authorship of journal articles in the field of neuroscience between 2009 and 2013.

12.
A class of statistical models is proposed that aims to recover latent settings structures in social networks. Settings may be regarded as clusters of vertices. The measurement model is based on two assumptions: (1) the observed network is generated by hierarchically nested latent transitive structures, expressed by ultrametrics, and (2) the expected tie strength decreases with ultrametric distance. The approach can be described as model-based clustering with an ultrametric space as the underlying metric to capture the dependence in the observations. Both Bayesian and maximum likelihood methods are applied for statistical inference, and both are implemented using Markov chain Monte Carlo methods.

13.
The authors have developed and tested scale-up methods, based on a simple social network theory, to estimate the size of hard-to-count subpopulations. The authors asked a nationally representative sample of respondents how many people they knew in a list of 32 subpopulations, including 29 subpopulations of known size and 3 of unknown size. Using these responses, the authors produced an effectively unbiased maximum likelihood estimate of the number of people each respondent knows. These estimates were then used to back-estimate the size of the three populations of unknown size. Maximum likelihood values and 95% confidence intervals are found for seroprevalence, 800,000 ± 43,000; for the homeless, 526,000 ± 35,000; and for women raped in the last 12 months, 194,000 ± 21,000. The estimate for seroprevalence agrees strikingly with medical estimates, the homeless estimate is well within the range of published estimates, and the estimate for rape victims lies in the middle of the published range.
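For readers unfamiliar with the scale-up approach, the basic estimators behind it have the following form (standard notation; a simplification of the maximum likelihood procedure described above):

```latex
% Basic network scale-up estimators: y_ij = number of people respondent i
% reports knowing in subpopulation j, N_j = known subpopulation sizes,
% N = total population size, c_i = respondent i's personal network size.
\hat{c}_i = N \, \frac{\sum_{j} y_{ij}}{\sum_{j} N_j},
\qquad
\hat{N}_u = N \, \frac{\sum_{i} y_{iu}}{\sum_{i} \hat{c}_i}.
```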

14.
In response to Abbott and Volberg's (in press) rejoinder to my epidemiologic note on verification bias and the estimation of prevalence rates (Gambino, in press), I provide formulas for computing confidence intervals for the results of second-stage verification. In addition, I provide the appropriate equation for determining confidence intervals when prevalence is near zero or one. Finally, I present formulas for determining the most efficient sample sizes needed to minimize second-stage variance estimates. These allow an investigator working under a fixed budget to determine the relative value of sampling negative screens to test for false negatives. I close with an observation on the interpretability of evidence.

15.
Exponential random graph models are a class of widely used exponential family models for social networks. The topological structure of an observed network is modelled by the relative prevalence of a set of local sub-graph configurations termed network statistics. One of the key tasks in applying these models is deciding which network statistics to include, which can be thought of as a statistical model selection problem. This is a very challenging problem: the posterior distribution for each model is often termed "doubly intractable", since computation of the likelihood is rarely available, and the evidence of the posterior is, as usual, also intractable. The contribution of this paper is the development of a fully Bayesian model selection method based on a reversible jump Markov chain Monte Carlo extension of the algorithm of Caimo and Friel (2011), which estimates the posterior probability of each competing model.

16.
Berry B. Evaluation Review, 2007, 31(2): 166-199
Risks of life on the street caused by inclement weather, harassment, and assault threaten the unsheltered homeless population. We address some challenges of enumerating the street homeless population by testing a novel capture-recapture (CR) estimation approach that models individuals' intermittent daytime visibility. We tested walking and vehicle-based variants of CR in downtown Toronto in March. Estimates that allow for individual variability in sighting probabilities are most consistent with our knowledge of the homeless population and achieve the most favorable confidence intervals, estimated detection probabilities, and coefficients of variation. Estimation bias from interobserver discrepancies, duplicate counting, and violation of the closed-population assumption was minimized through uniform identification criteria, training, and sampling design. Bias caused by the social grouping of the homeless was small. Despite the limitations of visual identification, CR approaches, as part of a multiple-method program, can aid community responses to immediate needs on the street, especially during the harsh winter months.
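For background only (the study itself uses models allowing individual sighting heterogeneity), the simplest two-sample capture-recapture estimate of population size is the Lincoln-Petersen estimator:

```latex
% Two-sample Lincoln-Petersen estimator: n_1 and n_2 are the numbers of
% individuals sighted on the two occasions and m_2 the number sighted on both.
\hat{N} = \frac{n_1 \, n_2}{m_2}.
```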

17.
Bayesian modeling is becoming increasingly popular as a method for data analysis in the social sciences and can move couple, marriage, and family therapy (C/MFT) research forward. Bayesian modeling helps researchers better understand the uncertainty of findings and incorporate previous research into analyses. Other benefits of Bayesian modeling are the straightforward interpretation of findings, high-quality inferences even with small samples (in combination with an informative prior), and the ability to work with complex data structures (observations nested in relationships and time points), which are common in C/MFT research. This article introduces the benefits of Bayesian modeling and provides an example of an Actor–Partner Interdependence Model using R. Information on how to conduct the same analyses using Stata and Mplus is provided in the Supplemental Information.

18.
This article reviews the new specifications for exponential random graph models proposed by Snijders et al. [Snijders, T.A.B., Pattison, P., Robins, G.L., Handcock, M., 2006. New specifications for exponential random graph models. Sociological Methodology] and demonstrates their improvement over homogeneous Markov random graph models in fitting empirical network data. Not only do the new specifications show improvements in goodness of fit for various data sets, but they also help to avoid the problem of near-degeneracy that often afflicts the fitting of Markov random graph models in practice, particularly for network data exhibiting high levels of transitivity. The inclusion of a new higher-order transitivity statistic allows estimation of parameters of exponential graph models in many (but not all) cases where it is impossible to estimate parameters of homogeneous Markov graph models. The new specifications were used to model a large number of classical small-scale network data sets and showed dramatically better performance than Markov graph models. We also review three current programs for obtaining maximum likelihood estimates of model parameters, and we compare these Monte Carlo maximum likelihood estimates with the less accurate pseudo-likelihood estimates. Finally, we discuss whether homogeneous Markov random graph models may be superseded by the new specifications, and how additional elaborations may further improve model performance.

19.
20.
Discrete-time or grouped duration data, with one or multiple types of terminating events, are often observed in the social sciences and economics. In this paper we suggest and discuss dynamic models for flexible Bayesian nonparametric analysis of such data. These models allow the simultaneous incorporation and estimation of baseline hazards and time-varying covariate effects without imposing particular parametric forms. Methods for exploring the possibility of time-varying effects, such as the impact of nationality or unemployment insurance benefits on the probability of reemployment, have recently gained increasing interest. Our modeling and estimation approach is fully Bayesian and makes use of Markov chain Monte Carlo (MCMC) simulation techniques. A detailed analysis of unemployment duration data, with full-time job, part-time job, and other causes as terminating events, illustrates our methods and shows how they can be used to obtain refined results and interpretations.
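For concreteness, a discrete-time competing-risks hazard of the kind described can be written (generic notation; not necessarily the exact specification of the paper) as:

```latex
% Discrete-time hazard of terminating event type r in interval t, with a
% nonparametric baseline g_{0r}(t) and time-varying covariate effects g_r(t),
% linked e.g. via a multinomial logit across event types.
\lambda_r(t \mid x_{it}) = h_r\!\bigl( g_{0r}(t) + x_{it}^{\top} g_r(t) \bigr).
```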

