American University

DATA FUSION ON THE SPACE OF SPARSE POSITIVE DEFINITE MATRICES: AN APPLICATION ON MISINFORMATION DETECTION

thesis
posted on 2023-10-09, 16:54 authored by Morgan Tompkins
Accurately identifying misinformation on online platforms is a pressing problem in machine learning. Combining multiple types of data, such as images and text, with joint decomposition methods makes it possible to exploit dependence across sources and improve classification accuracy. Independent vector analysis (IVA) is an emerging joint decomposition method that recovers latent sources from datasets with multiple modalities. In this thesis, we use IVA-SPICE, an extension of IVA that exploits the sparsity commonly found in real-world data, to reduce the effects of confounding relationships and improve classification in data fusion applications. We develop a new framework that uses IVA-SPICE and sparse covariance matrices as features for training classification algorithms aimed at detecting misinformation. To assess it, we compare the classification performance of IVA-SPICE across several models that differ in both the covariance matrix estimation method and the classification method. First, we review essential concepts of data fusion and present the frameworks for IVA and IVA-SPICE. Next, we define the underlying geometry of the sparse symmetric positive definite (SPD) covariance structures used to train our models. Finally, we evaluate the new framework on its ability to correctly and efficiently classify Twitter data as misinformation, using the concepts introduced in the first two parts of the thesis.

Publisher

ProQuest

Language

English

Committee chair

Zois Boukouvalas

Committee member(s)

Jun Lu; Roberto Corizzo

Degree discipline

Statistics

Degree grantor

American University. College of Arts and Sciences

Degree level

Masters

Degree name

M.S. in Statistics

Local identifier

Tompkins_american_0008N_12091.pdf

Media type

application/pdf

Pagination

63 pages

Submission ID

12091
