American University

Regional Sequence Expansion or Collapse in Heterozygous Genome Assemblies

posted on 2023-08-04, 09:11 authored by Kathryn Celestia Asalone

High levels of heterozygosity are common in sequencing datasets obtained from non-model organisms, yet they present a unique genome assembly challenge and can adversely impact downstream analyses. Here it is shown that re-assembling a heterozygous dataset with varied parameters and different assembly algorithms can generate assemblies whose protein annotations are statistically enriched for specific gene ontology categories. While total assembly length was not significantly affected by the assembly methodologies tested, the generated assemblies varied widely in fragmentation level and in the degree of sequence collapse or expansion, which underlie the enrichment or depletion of specific protein functional groups. It is shown that these statistically significant deviations in gene ontology groups can occur in seemingly high-quality assemblies and result from difficult-to-detect local sequence expansion or contraction. Given the unpredictable interplay between assembly algorithm, parameters, and the heterozygosity of the biological sequence data, the need for better measures of assembly quality than the N50 value, including methods for assessing local expansion and collapse, is highlighted.
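
For readers unfamiliar with the N50 value mentioned above, the following is a minimal sketch of the standard definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The function name `n50` and the contig lengths are hypothetical illustrations and are not taken from the thesis.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating length
    # until at least half of the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Hypothetical contig lengths (arbitrary units) for two assemblies
# with the same total length but different contig composition.
assembly_a = [50, 30, 20]
assembly_b = [50, 25, 25]
print(n50(assembly_a), n50(assembly_b))
```

Both hypothetical assemblies have the same total length and the same N50 even though their contigs differ, which illustrates why a single summary statistic of this kind cannot by itself reveal local expansion or collapse.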







Degree Awarded: M.S. Psychology. American University

