
1.

A simple two-step procedure using the Fellegi–Sunter model for frequency-based record linkage

Huiping Xu Xiaochun Li Shaun Grannis 《Journal of applied statistics》2022,49(11):2789

The widely used Fellegi–Sunter model for probabilistic record linkage does not leverage information contained in field values and consequently leads to identical classification of match status regardless of whether records agree on rare or common values. Since agreement on rare values is less likely to occur by chance than agreement on common values, records agreeing on rare values are more likely to be matches. Existing frequency-based methods typically rely on knowledge of error probabilities associated with field values and frequencies of agreed field values among matches, often derived using prior studies or training data. When such information is unavailable, applications of these methods are challenging. In this paper, we propose a simple two-step procedure for frequency-based matching using the Fellegi–Sunter framework to overcome these challenges. Matching weights are adjusted based on frequency distributions of the agreed field values among matches and non-matches, estimated by the Fellegi–Sunter model without relying on prior studies or training data. Through a real-world application and simulation, our method is found to produce comparable or better performance than the unadjusted method. Furthermore, frequency-based matching provides greater improvement in matching accuracy when using poorly discriminating fields, with diminished benefit as the discriminating power of matching fields increases.
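As a rough illustration of the idea in this abstract, the sketch below computes a standard Fellegi–Sunter agreement weight, log2(m/u), and then adds a value-specific term based on how often the agreed value occurs among agreeing matches versus agreeing non-matches. This is a minimal sketch of the general decomposition, not the authors' two-step procedure; the function names and all probabilities are hypothetical values chosen for illustration.

```python
import math


def agreement_weight(m: float, u: float) -> float:
    """Standard Fellegi-Sunter agreement weight for one field.

    m = P(field agrees | records are a match)
    u = P(field agrees | records are a non-match)
    """
    return math.log2(m / u)


def frequency_adjusted_weight(m: float, u: float,
                              p_value_match: float,
                              p_value_nonmatch: float) -> float:
    """Agreement weight for agreeing on one specific value.

    p_value_match    = P(agreed value is v | field agrees, match)
    p_value_nonmatch = P(agreed value is v | field agrees, non-match)
    The adjustment is the log ratio of these two (assumed known here).
    """
    return agreement_weight(m, u) + math.log2(p_value_match / p_value_nonmatch)


# Hypothetical inputs: a surname field with m = 0.95 and u = 0.05.
base = agreement_weight(0.95, 0.05)
# A common surname is over-represented among chance agreements.
common = frequency_adjusted_weight(0.95, 0.05, 0.02, 0.10)
# A rare surname is seldom agreed on by chance.
rare = frequency_adjusted_weight(0.95, 0.05, 0.0005, 0.0001)

print(f"unadjusted: {base:.2f}, common value: {common:.2f}, rare value: {rare:.2f}")
```

Under these assumed inputs, agreement on the rare value earns a larger weight than the unadjusted one, and agreement on the common value a smaller one, which is the qualitative behaviour the abstract describes.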

2.

Linear Increments with Non‐monotone Missing Data and Measurement Error


Shaun R. Seaman Daniel Farewell Ian R. White 《Scandinavian Journal of Statistics》2016,43(4):996-1018

Linear increments (LI) are used to analyse repeated outcome data with missing values. Previously, two LI methods have been proposed, one allowing non‐monotone missingness but not independent measurement error and one allowing independent measurement error but only monotone missingness. In both, it was suggested that the expected increment could depend on current outcome. We show that LI can allow non‐monotone missingness and either independent measurement error of unknown variance or dependence of expected increment on current outcome but not both. A popular alternative to LI is a multivariate normal model ignoring the missingness pattern. This gives consistent estimation when data are normally distributed and missing at random (MAR). We clarify the relation between MAR and the assumptions of LI and show that for continuous outcomes multivariate normal estimators are also consistent under (non‐MAR and non‐normal) assumptions not much stronger than those of LI. Moreover, when missingness is non‐monotone, they are typically more efficient.

3.

Asymmetric Forecast Densities for U.S. Macroeconomic Variables from a Gaussian Copula Model of Cross-Sectional and Serial Dependence

Michael S. Smith Shaun P. Vahey 《Journal of Business & Economic Statistics》2016,34(3):416-434

Most existing reduced-form macroeconomic multivariate time series models employ elliptical disturbances, so that the forecast densities produced are symmetric. In this article, we use a copula model with asymmetric margins to produce forecast densities with the scope for severe departures from symmetry. Empirical and skew t distributions are employed for the margins, and a high-dimensional Gaussian copula is used to jointly capture cross-sectional and (multivariate) serial dependence. The copula parameter matrix is given by the correlation matrix of a latent stationary and Markov vector autoregression (VAR). We show that the likelihood can be evaluated efficiently using the unique partial correlations, and estimate the copula using Bayesian methods. We examine the forecasting performance of the model for four U.S. macroeconomic variables between 1975:Q1 and 2011:Q2 using quarterly real-time data. We find that the point and density forecasts from the copula model are competitive with those from a Bayesian VAR. During the recent recession the forecast densities exhibit substantial asymmetry, avoiding some of the pitfalls of the symmetric forecast densities from the Bayesian VAR. We show that the asymmetries in the predictive distributions of GDP growth and inflation are similar to those found in the probabilistic forecasts from the Survey of Professional Forecasters. Last, we find that unlike the linear VAR model, our fitted Gaussian copula models exhibit nonlinear dependencies between some macroeconomic variables. This article has online supplementary material.
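To make the construction in this abstract concrete, the sketch below draws pairs from a two-dimensional Gaussian copula and pushes them through an asymmetric margin. It is only a toy illustration under stated assumptions: the exponential margin stands in for the empirical and skew t margins used in the article, the correlation of 0.7 is arbitrary, and the function names are hypothetical. It does not reproduce the latent VAR structure or the Bayesian estimation.

```python
import math
import random
from statistics import NormalDist


def sample_gaussian_copula_pair(rho, rng):
    """Draw (u1, u2), uniforms linked by a Gaussian copula with correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    # Build a second standard normal with correlation rho to the first.
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    std_normal = NormalDist()
    return std_normal.cdf(z1), std_normal.cdf(z2)


def exponential_quantile(u, rate=1.0):
    """Inverse CDF of an exponential margin (a right-skewed stand-in margin)."""
    u = min(u, 1.0 - 1e-12)  # guard against log(0) in the extreme tail
    return -math.log(1.0 - u) / rate


rng = random.Random(0)  # fixed seed so the run is reproducible
draws = []
for _ in range(2000):
    u1, u2 = sample_gaussian_copula_pair(0.7, rng)
    draws.append((exponential_quantile(u1), exponential_quantile(u2)))

xs = [x for x, _ in draws]
mean_x = sum(xs) / len(xs)
median_x = sorted(xs)[len(xs) // 2]
print(f"mean {mean_x:.2f} vs median {median_x:.2f}")
```

The dependence comes entirely from the Gaussian copula, while the skewness comes entirely from the margin, which is why the simulated values can be strongly asymmetric even though the latent variables are jointly normal.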

4.

Digital Inequality and Place: The Effects of Technological Diffusion on Internet Proficiency and Usage across Rural, Suburban, and Urban Counties

Michael J. Stern Alison E. Adams Shaun Elsasser 《Sociological inquiry》2009,79(4):391-417

Recently researchers have made efforts to reconceptualize digital inequality into discrete levels. These levels reflect access to and diffusion of technologies, proficiency in Internet usage, and propensity to take advantage of the opportunities afforded by information and communication technologies for assistance in daily life. We assess the utility of this approach for studying digital inequality across rural, suburban, and urban counties. Based on data from a 2005 nationally representative random sample telephone survey of 2,185 adults, the results provide mixed support for using this approach to studying digital inequality. In particular, we find that rural residents use Internet technologies less for assistance in helping with economics and other daily activities when compared with individuals from suburban and urban areas; however, our results suggest that this relationship is the product of the slow diffusion of advanced technologies to rural areas. The implications of these findings for understanding this under‐theorized form of inequality are discussed, and we make contributions to this literature through empirically addressing issues of digital capital.

5.

Ignorability conditions for frequentist non parametric analysis of conditional distributions with incomplete data

Shaun Bender Daniel F. Heitjan 《Communications in Statistics - Theory and Methods》2017,46(11):5252-5264

Rubin (1976) derived general conditions under which inferences that ignore missing data are valid. These conditions are sufficient but not generally necessary, and therefore may be relaxed in some special cases. We consider here the case of frequentist estimation of a conditional cdf subject to missing outcomes. We partition a set of data into outcome, conditioning, and latent variables, all of which potentially affect the probability of a missing response. We describe sufficient conditions under which a complete-case estimate of the conditional cdf of the outcome given the conditioning variable is unbiased. We use simulations on a renal transplant data set (Dienemann et al.) to illustrate the implications of these results.

6.

Evaluation of the Food and Agriculture Sector Criticality Assessment Tool (FASCAT) and the Collected Data

Andrew G. Huff James S. Hodges Shaun P. Kennedy Amy Kircher 《Risk analysis》2015,35(8):1448-1467

To protect and secure food resources for the United States, it is crucial to have a method to compare food systems’ criticality. In 2007, the U.S. government funded development of the Food and Agriculture Sector Criticality Assessment Tool (FASCAT) to determine which food and agriculture systems were most critical to the nation. FASCAT was developed in a collaborative process involving government officials and food industry subject matter experts (SMEs). After development, data were collected using FASCAT to quantify threats, vulnerabilities, consequences, and the impacts on the United States from failure of evaluated food and agriculture systems. To examine FASCAT's utility, linear regression models were used to determine: (1) which groups of questions posed in FASCAT were better predictors of cumulative criticality scores; and (2) whether the items included in FASCAT's criticality method or the smaller subset of FASCAT items included in DHS's risk analysis method predicted similar criticality scores. Akaike's information criterion was used to determine which regression models best described criticality, and a mixed linear model was used to shrink estimates of criticality for individual food and agriculture systems. The results indicated that: (1) some of the questions used in FASCAT strongly predicted food or agriculture system criticality; (2) the FASCAT criticality formula was a stronger predictor of criticality than the DHS risk formula; (3) the cumulative criticality formula predicted criticality more strongly than the weighted criticality formula; and (4) the mixed linear regression model did not change the rank‐order of food and agriculture system criticality to a large degree.

7.

The service system challenges of work with juvenile justice involved young people in the Hunter Region, Australia

Tamara Blakemore Kylie Agllias Amanda Howard Shaun McCarthy 《The Australian journal of social issues》2019,54(3):341-356

Current policies suggest that collaborative approaches are core to working effectively with juvenile justice involved young people. However, there is little research examining the workings of multi‐agency and collaborative endeavours in this field, or the experiences of the human service workers facilitating these connections. This paper reports on qualitative research that resulted from the Juvenile Justice and Education Equity in the Hunter Region project. Thirty‐eight human service workers were interviewed about their perceptions of the workings, strengths and challenges of the service system that supports young people who come into contact with the Children's Court in the Lower and Upper Hunter regions of New South Wales. Data analysis revealed three key themes related to (1) service gaps, cycles and maelstrom; (2) pursuing authentic service engagement; and (3) insider–outsider dynamics in service provision. Findings are discussed in relation to emerging practice and research agendas.

8.

Identifying Citizens: ID Cards as Surveillance – By D. Lyon

Shaun Best 《The British journal of sociology》2011,62(4):749-750


9.

The social context of Welsh-medium bilingual education in anglicised areas

Wynford Bellin Shaun Farrell Gary Higgs Sean White 《Journal of Sociolinguistics》1999,3(2):173-193

Principal component analysis of indicators from the 1991 Census was used to characterise the social context of school age Welsh speakers in South East Wales. The area had been largely anglicised by the Census of 1971, but the growth of Welsh-medium education was responsible for net gains in numbers of younger Welsh/English bilinguals. Doubts as to whether young people will remain active bilinguals after leaving school have been raised. The inter-relationships between figures for Welsh speaking in the Census and other social indicators were examined. Being categorised as a young Welsh speaker was found to cut across an economic advantage/disadvantage dimension, and so was a matter of life style rather than a by-product of parental choices unrelated to language resurgence. Probing life styles by means of interviews where Welsh-medium and English-medium schools could be matched on the economic advantage/disadvantage dimension showed that deciding for Welsh-medium education was embedded in authentic local life styles. Although networks of people at more local spatial scales were involved in Welsh-medium education, they were linked to wider scale networks establishing domains for Welsh language use in the public sector and local government.

10.

Focus on Kids: Evaluation of a Research-Based Divorce Education Program

David G. Schramm Shaun Calix 《Journal of divorce & remarriage》2013,54(7):529-549

Using data from a sample of 2,274 divorced or separated parents who participated in the Focus on Kids (FOK) divorce education program, we examine program effectiveness by demographic characteristics. We followed up with 149 participants 4 to 10 months later, using a posttest survey to examine long-term effectiveness. Overall, the vast majority of parents indicated that the FOK program was helpful and worthwhile. However, younger participants, females, and those with lower education levels and lower incomes found the program to be most helpful. At follow-up, parents were less likely to be engaging in coparenting conflict. Implications for divorce education programs are discussed.