
PopIns: population-scale detection of novel sequence insertions.

Birte Kehr, Páll Melsted, Bjarni V Halldórsson.

Abstract

MOTIVATION: The detection of genomic structural variation (SV) has advanced tremendously in recent years due to progress in high-throughput sequencing technologies. Novel sequence insertions, insertions without similarity to a human reference genome, have received less attention than other types of SVs due to the computational challenges in their detection from short read sequencing data, which inherently involves de novo assembly. De novo assembly is not only computationally challenging, but also requires high-quality data. Although the reads from a single individual may not always meet this requirement, using reads from multiple individuals can increase power to detect novel insertions.
RESULTS: We have developed the program PopIns, which can discover and characterize non-reference insertions of 100 bp or longer on a population scale. In this article, we describe the approach we implemented in PopIns. It takes as input a reads-to-reference alignment, assembles unaligned reads using a standard assembly tool, merges the contigs of different individuals into high-confidence sequences, anchors the merged sequences into the reference genome, and finally genotypes all individuals for the discovered insertions. Our tests on simulated data indicate that the merging step greatly improves the quality and reliability of predicted insertions and that PopIns shows significantly better recall and precision than the recent tool MindTheGap. Preliminary results on a dataset of 305 Icelanders demonstrate the practicality of the new approach.
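As an illustration of the first step described above (collecting the reads that did not align to the reference so they can be passed to an assembler), the following is a minimal sketch and not PopIns's actual implementation. It assumes the alignment is available as SAM-format text lines and relies only on the SAM FLAG bit 0x4 ("segment unmapped"). The function name `unaligned_reads` and the example records are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def unaligned_reads(sam_lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (read_name, sequence) for every SAM record flagged as unmapped.

    Hypothetical helper for illustration only; PopIns itself works on
    alignment files and may select reads using additional criteria.
    """
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # a SAM record has 11 mandatory fields
            continue
        name, flag, seq = fields[0], int(fields[1]), fields[9]
        if flag & UNMAPPED_FLAG:
            yield name, seq


# Made-up records: read1 is aligned, read2 and read3 are unmapped.
example_sam = [
    "@SQ\tSN:chr1\tLN:1000",
    "read1\t0\tchr1\t100\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGACCAA\tIIIIIIII",
    "read3\t77\t*\t0\t0\t*\t*\t0\t0\tGGGCATTA\tIIIIIIII",
]
selected = list(unaligned_reads(example_sam))
```

In a setting like the one the abstract describes, the selected reads of each individual would then go to a standard assembly tool. The later steps (merging, anchoring, genotyping) are not sketched here.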
AVAILABILITY AND IMPLEMENTATION: The source code of PopIns is available from http://github.com/bkehr/popins
CONTACT: birte.kehr@decode.is
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

Year:  2015        PMID: 25926346     DOI: 10.1093/bioinformatics/btv273

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  14 in total

1.  Diversity in non-repetitive human sequences not found in the reference genome.

Authors:  Birte Kehr; Anna Helgadottir; Pall Melsted; Hakon Jonsson; Hannes Helgason; Adalbjörg Jonasdottir; Aslaug Jonasdottir; Asgeir Sigurdsson; Arnaldur Gylfason; Gisli H Halldorsson; Snaedis Kristmundsdottir; Gudmundur Thorgeirsson; Isleifur Olafsson; Hilma Holm; Unnur Thorsteinsdottir; Patrick Sulem; Agnar Helgason; Daniel F Gudbjartsson; Bjarni V Halldorsson; Kari Stefansson
Journal:  Nat Genet       Date:  2017-02-27       Impact factor: 38.330

2.  Insertion of an SVA-E retrotransposon into the CASP8 gene is associated with protection against prostate cancer.

Authors:  Simon N Stacey; Birte Kehr; Julius Gudmundsson; Florian Zink; Aslaug Jonasdottir; Sigurjon A Gudjonsson; Asgeir Sigurdsson; Bjarni V Halldorsson; Bjarni A Agnarsson; Kristrun R Benediktsdottir; Katja K H Aben; Sita H Vermeulen; Ruben G Cremers; Angeles Panadero; Brian T Helfand; Phillip R Cooper; Jenny L Donovan; Freddie C Hamdy; Viorel Jinga; Ichiro Okamoto; Jon G Jonasson; Laufey Tryggvadottir; Hrefna Johannsdottir; Anna M Kristinsdottir; Gisli Masson; Olafur T Magnusson; Paul D Iordache; Agnar Helgason; Hannes Helgason; Patrick Sulem; Daniel F Gudbjartsson; Augustine Kong; Eirikur Jonsson; Rosa B Barkardottir; Gudmundur V Einarsson; Thorunn Rafnar; Unnur Thorsteinsdottir; Ioan N Mates; David E Neal; William J Catalona; José I Mayordomo; Lambertus A Kiemeney; Gudmar Thorleifsson; Kari Stefansson
Journal:  Hum Mol Genet       Date:  2016-01-05       Impact factor: 6.150

3.  Analysis of five deep-sequenced trio-genomes of the Peninsular Malaysia Orang Asli and North Borneo populations.

Authors:  Lian Deng; Haiyi Lou; Xiaoxi Zhang; Bhooma Thiruvahindrapuram; Dongsheng Lu; Christian R Marshall; Chang Liu; Bo Xie; Wanxing Xu; Lai-Ping Wong; Chee-Wei Yew; Aghakhanian Farhang; Rick Twee-Hee Ong; Mohammad Zahirul Hoque; Abdul Rahman Thuhairah; Bhak Jong; Maude E Phipps; Stephen W Scherer; Yik-Ying Teo; Subbiah Vijay Kumar; Boon-Peng Hoh; Shuhua Xu
Journal:  BMC Genomics       Date:  2019-11-12       Impact factor: 3.969

4.  Genome Informatics 2016.

Authors:  Davide Chicco; Michael M Hoffman
Journal:  Genome Biol       Date:  2017-01-16       Impact factor: 13.583

5.  Complex rearrangements and oncogene amplifications revealed by long-read DNA and RNA sequencing of a breast cancer cell line.

Authors:  Maria Nattestad; Sara Goodwin; Karen Ng; Timour Baslan; Fritz J Sedlazeck; Philipp Rescheneder; Tyler Garvin; Han Fang; James Gurtowski; Elizabeth Hutton; Elizabeth Tseng; Chen-Shan Chin; Timothy Beck; Yogi Sundaravadanam; Melissa Kramer; Eric Antoniou; John D McPherson; James Hicks; W Richard McCombie; Michael C Schatz
Journal:  Genome Res       Date:  2018-06-28       Impact factor: 9.043

6.  Where did you come from, where did you go: Refining metagenomic analysis tools for horizontal gene transfer characterisation.

Authors:  Enrico Seiler; Kathrin Trappe; Bernhard Y Renard
Journal:  PLoS Comput Biol       Date:  2019-07-23       Impact factor: 4.475

7.  Insertion variants missing in the human reference genome are widespread among human populations.

Authors:  Young-Gun Lee; Jin-Young Lee; Junhyong Kim; Young-Joon Kim
Journal:  BMC Biol       Date:  2020-11-13       Impact factor: 7.431

8.  PopAlu: population-scale detection of Alu polymorphisms.

Authors:  Yu Qian; Birte Kehr; Bjarni V Halldórsson
Journal:  PeerJ       Date:  2015-09-22       Impact factor: 2.984

9.  Discovery and genotyping of novel sequence insertions in many sequenced individuals.

Authors:  Pinar Kavak; Yen-Yi Lin; Ibrahim Numanagic; Hossein Asghari; Tunga Güngör; Can Alkan; Faraz Hach
Journal:  Bioinformatics       Date:  2017-07-15       Impact factor: 6.937

10.  Towards a better understanding of the low recall of insertion variants with short-read based variant callers.

Authors:  Wesley J Delage; Julien Thevenon; Claire Lemaitre
Journal:  BMC Genomics       Date:  2020-11-04       Impact factor: 3.969

