DATA FUSION ON THE SPACE OF SPARSE POSITIVE DEFINITE MATRICES: AN APPLICATION ON MISINFORMATION DETECTION

Tompkins, Morgan

doi:10.57912/24264514.v1

Tompkins_american_0008N_12091.pdf (431.93 kB)

thesis

posted on 2023-10-09, 16:54 authored by Morgan Tompkins

Accurately identifying misinformation on online platforms is a pressing problem in machine learning. Combining multiple types of data, such as images and text, with joint decomposition methods makes it possible to exploit dependence across sources and improve classification accuracy. Independent vector analysis (IVA) is an emerging joint decomposition method that recovers latent sources from datasets with multiple modalities. In this thesis, we use IVA-SPICE, an extension of IVA that exploits the sparsity commonly found in real-world data, to reduce the effects of confounding relationships and improve classification in data fusion applications. We develop a new framework that uses IVA-SPICE and sparse covariance matrices as features for training classification algorithms geared towards detecting misinformation. To do this, we compare the classification performance of IVA-SPICE across several models that vary both the covariance matrix estimation method and the classification method. First, we review essential concepts of data fusion techniques and provide frameworks for IVA and IVA-SPICE. Next, we define the underlying geometry of the sparse symmetric positive definite (SPD) covariance structures we use to train our models. Finally, using the concepts introduced in the first two sections of the thesis, we evaluate how accurately and efficiently the new framework classifies Twitter data as misinformation.

History

Publisher

ProQuest

Language

English

Committee chair

Zois Boukouvalas

Committee member(s)

Jun Lu; Roberto Corizzo

Degree discipline

Statistics

Degree grantor

American University. College of Arts and Sciences

Degree level

Masters

Degree name

M.S. in Statistics

Local identifier

Tompkins_american_0008N_12091.pdf

Media type

application/pdf

Pagination

63 pages

Submission ID

12091

Keywords

Misinformation; Machine learning; Multisensor data fusion; Statistical matching

Licence

In Copyright
