Efficient detection and assembly of non-reference DNA sequences with synthetic long reads.

Dmitry Meleshko, Rui Yang, Patrick Marks, Stephen Williams, Iman Hajirasouliha.

Abstract

Recent pan-genome studies have revealed an abundance of DNA sequences in human genomes that are not present in the reference genome. The lion's share of these non-reference sequences (NRSs) cannot be reliably assembled or placed on the reference genome. Improvements in long-read and synthetic long-read (also known as linked-read) technologies have great potential for the characterization of NRSs. While synthetic long reads require less input DNA than long-read datasets, they are algorithmically more challenging to use. Except for computationally expensive whole-genome assembly methods, there is no synthetic long-read method for NRS detection. We propose a novel integrated alignment-based and local assembly-based algorithm, Novel-X, that uses the barcode information encoded in synthetic long reads to improve the detection of such events without a whole-genome de novo assembly. Our evaluations demonstrate that Novel-X finds many non-reference sequences that cannot be found by state-of-the-art short-read methods. We applied Novel-X to a diverse set of 68 samples from the Polaris HiSeq 4000 PGx cohort. Novel-X discovered 16 691 NRS insertions longer than 300 bp (total length 18.2 Mb). Many of them are population specific or may have a functional impact.
© The Author(s) 2022. Published by Oxford University Press on behalf of Nucleic Acids Research.
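
The abstract says that barcode information lets reads be gathered for local assembly without assembling the whole genome. The Python sketch below illustrates that general idea only; it is not the Novel-X implementation, and the `Read` record, the function names `barcodes_near` and `recruit_unmapped`, the 10 kb window, and the toy reads are all assumptions made for illustration. It collects the barcodes of reads mapped near a candidate locus, then recruits unmapped reads that carry the same barcodes as input for a local assembler.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Read(NamedTuple):
    """Hypothetical minimal read record (not a real file format)."""
    name: str
    barcode: str
    chrom: Optional[str]  # None when the read is unmapped
    pos: Optional[int]    # None when the read is unmapped


def barcodes_near(reads, chrom, pos, window=10_000):
    """Return barcodes of mapped reads within `window` bp of a locus."""
    return {
        r.barcode
        for r in reads
        if r.chrom == chrom and r.pos is not None and abs(r.pos - pos) <= window
    }


def recruit_unmapped(reads, barcodes):
    """Group unmapped reads by barcode, keeping only the given barcodes."""
    groups = defaultdict(list)
    for r in reads:
        if r.chrom is None and r.barcode in barcodes:
            groups[r.barcode].append(r.name)
    return dict(groups)


# Toy data: barcode BX01 has a read mapped near the locus, BX02 does not.
reads = [
    Read("r1", "BX01", "chr1", 1_000_500),
    Read("r2", "BX01", None, None),
    Read("r3", "BX02", "chr1", 5_000_000),
    Read("r4", "BX02", None, None),
    Read("r5", "BX01", None, None),
]

near = barcodes_near(reads, "chr1", 1_000_000)
candidates = recruit_unmapped(reads, near)
print(candidates)  # {'BX01': ['r2', 'r5']}
```

In this toy setup only the unmapped reads sharing a barcode with a read near the locus are passed on, which is why the read set handed to a local assembler can stay small.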

Year: 2022; PMID: 35924489; PMCID: PMC9561269; DOI: 10.1093/nar/gkac653

Source DB: PubMed; Journal: Nucleic Acids Res; ISSN: 0305-1048; Impact factor: 19.160

