
Genome sequence assembly algorithms and misassembly identification methods.

Yue Meng, Yu Lei, Jianlong Gao, Yuxuan Liu, Enze Ma, Yunhong Ding, Yixin Bian, Hongquan Zu, Yucui Dong, Xiao Zhu.

Abstract

Sequence assembly algorithms have evolved rapidly alongside the growth of genome sequencing technology over the past two decades. Assembly mainly reconstructs the target genome by iteratively extending overlap relationships between sequences. Assembly algorithms are typically classified into several categories, such as the Greedy strategy, the Overlap-Layout-Consensus (OLC) strategy, and the de Bruijn graph (DBG) strategy. In particular, with the rapid development of third-generation sequencing (TGS) technology, several assembly algorithms have been proposed to generate high-quality chromosome-level assemblies. However, because of genome complexity, the limited length of short reads, and the high error rate of long reads, the contigs produced by assembly may contain misassemblies that adversely affect downstream data analysis. Therefore, several read-based and reference-based methods for misassembly identification have been developed to improve assembly quality. This work reviews the development of DNA sequencing technologies and summarizes sequencing data simulation methods, sequencing error correction methods, the mainstream sequence assembly algorithms, and misassembly identification methods. The large amount of computation involved makes the sequence assembly problem challenging, so more efficient and accurate assembly algorithms and alternative algorithms still need to be developed.
© 2022. The Author(s), under exclusive licence to Springer Nature B.V.
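
To make the "iterative extension of overlaps" idea in the abstract concrete, the following is a minimal sketch of a Greedy-strategy assembler. It is an illustration only, not the method of any tool covered by the review: the function names `overlap` and `greedy_assemble`, the `min_len` threshold, and the toy reads are all assumptions, and the sketch ignores sequencing errors, reverse complements, and repeats.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when no overlap of at least `min_len` bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of sequences with the longest overlap.

    Hypothetical toy implementation: exact matches only, one strand only.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]


if __name__ == "__main__":
    print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
```

Because each step commits to the locally best overlap, a sketch like this can join reads incorrectly wherever the genome contains repeats, which is one way the misassemblies mentioned in the abstract can arise.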

Keywords: Genome assembly algorithms; Genome sequencing technology; Misassembly identification methods; Third-generation sequencing

Year: 2022; PMID: 36151399; DOI: 10.1007/s11033-022-07919-8

Source DB: PubMed; Journal: Mol Biol Rep; ISSN: 0301-4851; Impact factor: 2.742


