Methods for the detection and assembly of novel sequence in high-throughput sequencing data.

Manuel Holtgrewe, Leon Kuchenbecker, Knut Reinert.

Abstract

MOTIVATION: Large insertions of novel sequence are an important type of structural variant. Previous studies used traditional de novo assemblers to assemble non-mapping high-throughput sequencing (HTS) or capillary reads and then tried to anchor the assemblies in the reference using paired-read information.
RESULTS: We present approaches for detecting insertion breakpoints and for the targeted assembly of large insertions from paired HTS data: BASIL and ANISE. On near-identity repeats, which are hard for assemblers, ANISE employs a repeat resolution step. This results in far better reconstructions than those obtained by the compared methods. On simulated data, we found our insert assembler to be competitive with the de novo assemblers ABYSS and SGA while yielding already anchored inserted sequence, as opposed to the unanchored contigs produced by ABYSS/SGA. On real-world data, we detected novel sequence in a human individual and thoroughly validated the assembled sequence. ANISE was found to be superior to the competing tool MindTheGap on both simulated and real-world data.
AVAILABILITY AND IMPLEMENTATION: ANISE and BASIL are available for download at http://www.seqan.de/projects/herbarium under a permissive open source license.
© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

Year:  2015        PMID: 25649620     DOI: 10.1093/bioinformatics/btv051

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  7 in total

1.  Efficient detection and assembly of non-reference DNA sequences with synthetic long reads.

Authors:  Dmitry Meleshko; Rui Yang; Patrick Marks; Stephen Williams; Iman Hajirasouliha
Journal:  Nucleic Acids Res       Date:  2022-10-14       Impact factor: 19.160

2.  Comprehensive evaluation of structural variation detection algorithms for whole genome sequencing.

Authors:  Shunichi Kosugi; Yukihide Momozawa; Xiaoxi Liu; Chikashi Terao; Michiaki Kubo; Yoichiro Kamatani
Journal:  Genome Biol       Date:  2019-06-03       Impact factor: 13.583

3.  Insertion variants missing in the human reference genome are widespread among human populations.

Authors:  Young-Gun Lee; Jin-Young Lee; Junhyong Kim; Young-Joon Kim
Journal:  BMC Biol       Date:  2020-11-13       Impact factor: 7.431

4.  Combining callers improves the detection of copy number variants from whole-genome sequencing.

Authors:  Manuel Holtgrewe; Marten Jäger; Marie Coutelier; Ricarda Flöttman; Martin A Mensah; Malte Spielmann; Peter Krawitz; Denise Horn; Dieter Beule; Stefan Mundlos
Journal:  Eur J Hum Genet       Date:  2021-11-08       Impact factor: 4.246

5.  ITD assembler: an algorithm for internal tandem duplication discovery from short-read sequencing data.

Authors:  Navin Rustagi; Oliver A Hampton; Jie Li; Liu Xi; Richard A Gibbs; Sharon E Plon; Marek Kimmel; David A Wheeler
Journal:  BMC Bioinformatics       Date:  2016-04-27       Impact factor: 3.169

6.  Discovery and genotyping of novel sequence insertions in many sequenced individuals.

Authors:  Pinar Kavak; Yen-Yi Lin; Ibrahim Numanagic; Hossein Asghari; Tunga Güngör; Can Alkan; Faraz Hach
Journal:  Bioinformatics       Date:  2017-07-15       Impact factor: 6.937

7.  Population-scale detection of non-reference sequence variants using colored de Bruijn Graphs.

Authors:  Thomas Krannich; W Timothy J White; Sebastian Niehus; Guillaume Holley; Bjarni V Halldórsson; Birte Kehr
Journal:  Bioinformatics       Date:  2021-11-02       Impact factor: 6.937

