American University

BAD DATA, BETTER METHODS: THREE ESSAYS ON LEVERAGING MACHINE LEARNING FOR IMPROVED DATA QUALITY IN EXTREMISM RESEARCH

thesis
posted on 2025-05-16, 14:01 authored by Bethany Leap

The quantitative academic research around terrorism, radicalization, and extremism has increased significantly over the last twenty-five years, largely due to the rise in open-source databases. Despite this increase in data, the underlying issues around data quality in terrorism, radicalization, and extremism research remain unaddressed. The data often suffer from latent clustering, systematic measurement error, and nonrandom missingness, all of which result in inaccurate findings if not carefully addressed. Traditional statistical methods for addressing these issues often require that the data meet strong assumptions, which is rarely the case when dealing with terrorism, radicalization, and extremism data. Moreover, the penalties of mishandling biased data go beyond inaccurate point estimates and standard errors; when policy decisions are based on flawed data, the potential for real-world repercussions is significant. Thus, it is crucial to understand and address the bias present within these data sources. This dissertation seeks to answer the following question: how can the application of machine learning methods improve data quality in radicalization and extremism studies? I argue that machine learning methods emerge as a potential solution to the data problems facing the fields of terrorism, radicalization, and extremism studies. In Chapter 1, I propose leveraging advances in Natural Language Processing (NLP), specifically embedding-based topic modeling, to address a methodological challenge facing radicalization and extremism scholars: with exponentially increasing data, how can scholars feasibly understand how the literature has changed topically over time? In Chapter 2, I propose a novel application of a semi-supervised graph-based learning algorithm to predict missing group-level data in the Global Terrorism Database (GTD). In Chapter 3, I answer the following questions: why do radical movements change tactics, and how can we detect if they do? 
I argue that radical social movements change tactics as a response to repression by adopting new strategies for recruitment and direct action. To test this theory, I apply a Bayesian hidden semi-Markov model to a novel dataset of direct actions perpetrated by the U.S. radical eco movement from 1995 to 2022. These three chapters demonstrate how machine learning methods can be used to improve the quality of terrorism, radicalization, and extremism data in three separate but related spheres.

History

Publisher

ProQuest

Language

English

Committee chair

Joseph Young

Committee member(s)

Zois Boukouvalas; Thomas Zeitzoff

Degree discipline

Justice, Law & Criminology

Degree grantor

American University. School of Public Affairs

Degree level

  • Doctoral

Degree name

Justice, Law & Criminology

Local identifier

Leap_american_0008E_12296

Media type

application/pdf

Pagination

148 pages

Call number

Thesis 11623

MMS ID

99187042192804102

Submission ID

12296
