Michelle D Noyes, William T Harvey, David Porubsky, Arvis Sulovari, Ruiyang Li, Nicholas R Rose, Peter A Audano, Katherine M Munson, Alexandra P Lewis, Kendra Hoekzema, Tuomo Mantere, Tina A Graves-Lindsay, Ashley D Sanders, Sara Goodwin, Melissa Kramer, Younes Mokrab, Michael C Zody, Alexander Hoischen, Jan O Korbel, W Richard McCombie, Evan E Eichler.
Abstract
Studies of de novo mutation (DNM) have typically excluded some of the most repetitive and complex regions of the genome because these regions cannot be unambiguously mapped with short-read sequencing data. To better understand the genome-wide pattern of DNM, we generated long-read sequence data from an autism parent-child quad with an affected female where no pathogenic variant had been discovered in short-read Illumina sequence data. We deeply sequenced all four individuals by using three sequencing platforms (Illumina, Oxford Nanopore, and Pacific Biosciences) and three complementary technologies (Strand-seq, optical mapping, and 10X Genomics). Using long-read sequencing, we initially discovered and validated 171 DNMs across two children, a 20% increase in the number of de novo single-nucleotide variants (SNVs) and indels when compared to short-read callsets. The number of DNMs further increased by 5% when considering a more complete human reference (T2T-CHM13) because of the recovery of events in regions absent from GRCh38 (e.g., three DNMs in heterochromatic satellites). In total, we validated 195 de novo germline mutations and 23 potential post-zygotic mosaic mutations across both children; the overall true substitution rate based on this integrated callset is at least 1.41 × 10⁻⁸ substitutions per nucleotide per generation. We also identified six de novo insertions and deletions in tandem repeats, two of which represent structural variants. We demonstrate that long-read sequencing and assembly, especially when combined with a more complete reference genome, increases the number of DNMs by >25% compared to previous studies, providing a more complete catalog of DNM compared to short-read data alone.
Keywords: autism; de novo mutation; genome sequencing; long-read sequencing
Year: 2022 PMID: 35290762 PMCID: PMC9069071 DOI: 10.1016/j.ajhg.2022.02.014
Source DB: PubMed Journal: Am J Hum Genet ISSN: 0002-9297 Impact factor: 11.043