
Stepwise large genome assembly approach: a case of Siberian larch (Larix sibirica Ledeb).

Dmitry A Kuzmin, Sergey I Feranchuk, Vadim V Sharov, Alexander N Cybin, Stepan V Makolov, Yuliya A Putintseva, Natalya V Oreshkova, Konstantin V Krutovsky.

Abstract

BACKGROUND: De novo assembly of large genomes, such as those of conifers (~12-30 Gbp), which also consist of ~80% repetitive DNA, is a very complex and computationally intensive endeavor. One of the main problems in assembling such genomes lies in the computing limitations of nucleotide sequence assembly programs (DNA assemblers). Modern assemblers are usually designed for genomes no longer than the human genome (3.24 Gbp). Most assemblers cannot handle the amount of input sequence data required to provide the coverage needed for a high-quality assembly.
RESULTS: This paper presents an original stepwise method of de novo assembly by parts (sets), which bypasses the limitations of modern assemblers associated with the huge amount of data being processed. Numerical assembly experiments conducted using the model plant Arabidopsis thaliana, Prunus persica (peach), and the four most popular assemblers (ABySS, SOAPdenovo, SPAdes, and CLC Assembly Cell) showed the validity and effectiveness of the proposed stepwise assembly method.
CONCLUSION: Using the new stepwise de novo assembly method presented in the paper, the genome of Siberian larch, Larix sibirica Ledeb. (12.34 Gbp), was completely assembled de novo with the CLC Assembly Cell assembler. It is the first genome assembly for a larch species, joining only five other conifer genomes sequenced and assembled: Picea abies, Picea glauca, Pinus taeda, Pinus lambertiana, and Pseudotsuga menziesii var. menziesii.

Keywords:  Larix sibirica; Siberian larch; de novo genome assembly

Year:  2019        PMID: 30717661      PMCID: PMC6362582          DOI: 10.1186/s12859-018-2570-y

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


