Derek M Bickhart, Benjamin D Rosen, Sergey Koren, Brian L Sayre, Alex R Hastie, Saki Chan, Joyce Lee, Ernest T Lam, Ivan Liachko, Shawn T Sullivan, Joshua N Burton, Heather J Huson, John C Nystrom, Christy M Kelley, Jana L Hutchison, Yang Zhou, Jiajie Sun, Alessandra Crisà, F Abel Ponce de León, John C Schwartz, John A Hammond, Geoffrey C Waldbieser, Steven G Schroeder, George E Liu, Maitreya J Dunham, Jay Shendure, Tad S Sonstegard, Adam M Phillippy, Curtis P Van Tassell, Timothy P L Smith.
Abstract
The decrease in sequencing cost and increased sophistication of assembly algorithms for short-read platforms have resulted in a sharp increase in the number of species with genome assemblies. However, these assemblies are highly fragmented, with many gaps, ambiguities, and errors, impeding downstream applications. We demonstrate the current state of the art for de novo assembly using the domestic goat (Capra hircus) based on long reads for contig formation, short reads for consensus validation, and scaffolding by optical and chromatin interaction mapping. These combined technologies produced what is, to our knowledge, the most continuous de novo mammalian assembly to date, with chromosome-length scaffolds and only 649 gaps. Our assembly represents a ∼400-fold improvement in continuity due to properly assembled gaps, compared to the previously published C. hircus assembly, and better resolves repetitive structures longer than 1 kb, representing the largest repeat family and immune gene complex yet produced for an individual of a ruminant species.
Year: 2017 PMID: 28263316 PMCID: PMC5909822 DOI: 10.1038/ng.3802
Source DB: PubMed Journal: Nat Genet ISSN: 1061-4036 Impact factor: 38.330