TGS-GapCloser: A fast and accurate gap closer for large genomes with low coverage of error-prone long reads.

Mengyang Xu1,2,3, Lidong Guo1,4, Shengqiang Gu1,4, Ou Wang3,5, Rui Zhang1, Brock A Peters3,6, Guangyi Fan1,3, Xin Liu1,2,3,7, Xun Xu3,7, Li Deng1,2,3, Yongwei Zhang3,6.   

Abstract

BACKGROUND: Analyses that use genome assemblies are critically affected by the contiguity, completeness, and accuracy of those assemblies. In recent years, single-molecule sequencing techniques that generate long reads have become available and have enabled substantial improvements in contig length and genome completeness, especially for large genomes (>100 Mb), although bioinformatic tools for these applications are still limited.
FINDINGS: We developed TGS-GapCloser, a software tool that closes sequence gaps in genome assemblies using low-depth (∼10×) long single-molecule reads. The algorithm extracts reads that bridge the gap region between 2 contigs within a scaffold, error-corrects only these candidate reads, and assigns the best sequence to each gap. As a demonstration, we used TGS-GapCloser to improve the scaftig NG50 value of 3 human genome assemblies by 24-fold on average with only ∼10× coverage of Oxford Nanopore or Pacific Biosciences reads, covering up to 94.8% of gaps with sequence data at a positive predictive value of 97.7%. Despite the high raw error rate of single-molecule reads, the improved assemblies achieve 99.998% (Q46) single-base accuracy, and the final inserted sequences have 99.97% (Q35) accuracy. This enables high-quality downstream analyses, including up to a 31-fold increase in the scaftig NGA50 and up to 13.1% more complete BUSCO genes. Additionally, we show that even in ultra-large genome assemblies, such as that of ginkgo (∼12 Gb), TGS-GapCloser can cover 71.6% of gaps with sequence data.
CONCLUSIONS: TGS-GapCloser can close gaps in large genome assemblies using raw long reads quickly and cost-effectively. The final assemblies generated by TGS-GapCloser have improved contiguity and completeness while maintaining high accuracy. The software is available at https://github.com/BGI-Qingdao/TGS-GapCloser.
© The Author(s) 2020. Published by Oxford University Press.
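
The scaftig NG50 and Q-score figures quoted in the abstract can be illustrated with a minimal Python sketch. It is not taken from TGS-GapCloser: the function names (`split_scaftigs`, `ng50`, `phred`) and the toy scaffolds are hypothetical, and the sketch assumes that gaps are encoded as runs of `N`, that a scaftig is a gap-free piece of a scaffold, and that the quoted Q values are Phred-scaled accuracies rounded down.

```python
import math
import re


def split_scaftigs(scaffold, min_gap=1):
    """Split a scaffold into scaftigs at runs of N (assumed gap encoding)."""
    gap = re.compile(r"[Nn]{%d,}" % min_gap)
    return [piece for piece in gap.split(scaffold) if piece]


def ng50(lengths, genome_size):
    """Largest length L such that pieces of length >= L cover at least
    half of genome_size; 0 if the pieces never reach half."""
    covered = 0
    for length in sorted(lengths, reverse=True):
        covered += length
        if 2 * covered >= genome_size:
            return length
    return 0


def phred(accuracy):
    """Phred-scaled quality for a per-base accuracy between 0 and 1."""
    return -10 * math.log10(1 - accuracy)


# Toy scaffolds (hypothetical data); N runs mark unresolved gaps.
scaffolds = ["ACGTACGTNNNNACGT", "ACGTACNNACGTACGTAC"]
scaftig_lengths = [len(s) for sc in scaffolds for s in split_scaftigs(sc)]
print(scaftig_lengths)                         # lengths of the gap-free pieces
print(ng50(scaftig_lengths, genome_size=40))   # scaftig NG50 for a 40 bp genome
print(math.floor(phred(0.99998)), math.floor(phred(0.9997)))
```

Filling a gap merges the two scaftigs on either side of it into one longer piece, which is why gap closure raises the scaftig NG50 while leaving the scaffold NG50 essentially unchanged.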

Keywords:  MHC; gap closure; genome assembly; ginkgo; third-generation sequencing

Year:  2020        PMID: 32893860      PMCID: PMC7476103          DOI: 10.1093/gigascience/giaa094

Source DB:  PubMed          Journal:  Gigascience        ISSN: 2047-217X            Impact factor:   6.524


