BAD DATA, BETTER METHODS: THREE ESSAYS ON LEVERAGING MACHINE LEARNING FOR IMPROVED DATA QUALITY IN EXTREMISM RESEARCH
Quantitative academic research on terrorism, radicalization, and extremism has grown significantly over the last twenty-five years, largely because of the rise of open-source databases. Despite this increase in data, underlying data-quality issues in terrorism, radicalization, and extremism research remain unaddressed. The data often suffer from latent clustering, systematic measurement error, and nonrandom missingness, all of which result in inaccurate findings if not carefully addressed. Traditional statistical methods for addressing these issues often require that the data meet strong assumptions, which is rarely the case when dealing with terrorism, radicalization, and extremism data. Moreover, the costs of mishandling biased data go beyond inaccurate point estimates and standard errors; when policy decisions are based on flawed data, the potential for real-world repercussions is significant. Thus, it is crucial to understand and address the bias present within these data sources. This dissertation seeks to answer the following question: how can the application of machine learning methods improve data quality in radicalization and extremism studies? I argue that machine learning methods offer a potential solution to the data problems facing the fields of terrorism, radicalization, and extremism studies. In Chapter 1, I propose leveraging advances in Natural Language Processing (NLP), specifically embedding-based topic modeling, to address a methodological challenge facing radicalization and extremism scholars: with exponentially increasing data, how can scholars feasibly understand how the literature has changed topically over time? In Chapter 2, I propose a novel application of a semi-supervised graph-based learning algorithm to predict missing group-level data in the Global Terrorism Database (GTD). In Chapter 3, I answer the following questions: why do radical movements change tactics, and how can we detect if they do?
I argue that radical social movements change tactics as a response to repression by adopting new strategies for recruitment and direct action. To test this theory, I apply a Bayesian hidden semi-Markov model to a novel dataset of direct actions perpetrated by the U.S. radical eco movement from 1995 to 2022. These three chapters demonstrate how machine learning methods can be used to improve the quality of terrorism, radicalization, and extremism data in three separate but related spheres.
Publisher: ProQuest
Language: English
Committee chair: Joseph Young
Committee member(s): Zois Boukouvalas; Thomas Zeitzoff
Degree discipline: Justice, Law & Criminology
Degree grantor: American University. School of Public Affairs
Degree level: Doctoral