PopIns: population-scale detection of novel sequence insertions.

Birte Kehr, Páll Melsted, Bjarni V Halldórsson.

Abstract

MOTIVATION: The detection of genomic structural variation (SV) has advanced tremendously in recent years due to progress in high-throughput sequencing technologies. Novel sequence insertions, insertions without similarity to a human reference genome, have received less attention than other types of SVs due to the computational challenges in their detection from short read sequencing data, which inherently involves de novo assembly. De novo assembly is not only computationally challenging, but also requires high-quality data. Although the reads from a single individual may not always meet this requirement, using reads from multiple individuals can increase power to detect novel insertions.
RESULTS: We have developed the program PopIns, which can discover and characterize non-reference insertions of 100 bp or longer on a population scale. In this article, we describe the approach we implemented in PopIns. It takes as input a reads-to-reference alignment, assembles unaligned reads using a standard assembly tool, merges the contigs of different individuals into high-confidence sequences, anchors the merged sequences into the reference genome, and finally genotypes all individuals for the discovered insertions. Our tests on simulated data indicate that the merging step greatly improves the quality and reliability of predicted insertions and that PopIns shows significantly better recall and precision than the recent tool MindTheGap. Preliminary results on a dataset of 305 Icelanders demonstrate the practicality of the new approach.
AVAILABILITY AND IMPLEMENTATION: The source code of PopIns is available from http://github.com/bkehr/popins
CONTACT: birte.kehr@decode.is
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
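
The first and third stages described under RESULTS (collecting reads that did not align to the reference, and merging contigs across individuals) can be illustrated with a small sketch. This is not the PopIns implementation: the function names, the `min_support` parameter, and the exact-match merge rule are assumptions made for illustration, and the 100 bp cutoff is borrowed from the abstract's insertion size limit and applied to contigs only as a stand-in. The sketch relies only on the SAM text format, where the second column is the FLAG and bit 0x4 marks an unmapped read.

```python
from collections import defaultdict

UNMAPPED = 0x4  # SAM FLAG bit: this read is unmapped
MIN_LEN = 100   # assumption: reuse the abstract's 100 bp cutoff for contigs


def unaligned_reads(sam_lines):
    """Return (name, sequence) pairs for reads flagged as unmapped."""
    reads = []
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if int(fields[1]) & UNMAPPED:
            reads.append((fields[0], fields[9]))  # QNAME, SEQ
    return reads


def merge_contigs(contigs_by_individual, min_support=2):
    """Toy merge (hypothetical rule): keep contigs of at least MIN_LEN bp
    that occur identically in at least min_support individuals."""
    support = defaultdict(set)
    for individual, contigs in contigs_by_individual.items():
        for seq in contigs:
            if len(seq) >= MIN_LEN:
                support[seq].add(individual)
    return {
        seq: sorted(ids)
        for seq, ids in support.items()
        if len(ids) >= min_support
    }
```

A real merging step would have to tolerate sequencing errors and partial overlaps between contigs rather than require identical sequences; the exact-match rule here only shows how support from several individuals can be used to filter candidate sequences.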

Year: 2015 PMID: 25926346 DOI: 10.1093/bioinformatics/btv273

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

Cited by (14 in total)

1. Diversity in non-repetitive human sequences not found in the reference genome.

Authors: Birte Kehr; Anna Helgadottir; Pall Melsted; Hakon Jonsson; Hannes Helgason; Adalbjörg Jonasdottir; Aslaug Jonasdottir; Asgeir Sigurdsson; Arnaldur Gylfason; Gisli H Halldorsson; Snaedis Kristmundsdottir; Gudmundur Thorgeirsson; Isleifur Olafsson; Hilma Holm; Unnur Thorsteinsdottir; Patrick Sulem; Agnar Helgason; Daniel F Gudbjartsson; Bjarni V Halldorsson; Kari Stefansson
Journal: Nat Genet Date: 2017-02-27 Impact factor: 38.330

2. Insertion of an SVA-E retrotransposon into the CASP8 gene is associated with protection against prostate cancer.

Authors: Simon N Stacey; Birte Kehr; Julius Gudmundsson; Florian Zink; Aslaug Jonasdottir; Sigurjon A Gudjonsson; Asgeir Sigurdsson; Bjarni V Halldorsson; Bjarni A Agnarsson; Kristrun R Benediktsdottir; Katja K H Aben; Sita H Vermeulen; Ruben G Cremers; Angeles Panadero; Brian T Helfand; Phillip R Cooper; Jenny L Donovan; Freddie C Hamdy; Viorel Jinga; Ichiro Okamoto; Jon G Jonasson; Laufey Tryggvadottir; Hrefna Johannsdottir; Anna M Kristinsdottir; Gisli Masson; Olafur T Magnusson; Paul D Iordache; Agnar Helgason; Hannes Helgason; Patrick Sulem; Daniel F Gudbjartsson; Augustine Kong; Eirikur Jonsson; Rosa B Barkardottir; Gudmundur V Einarsson; Thorunn Rafnar; Unnur Thorsteinsdottir; Ioan N Mates; David E Neal; William J Catalona; José I Mayordomo; Lambertus A Kiemeney; Gudmar Thorleifsson; Kari Stefansson
Journal: Hum Mol Genet Date: 2016-01-05 Impact factor: 6.150

3. Analysis of five deep-sequenced trio-genomes of the Peninsular Malaysia Orang Asli and North Borneo populations.

Authors: Lian Deng; Haiyi Lou; Xiaoxi Zhang; Bhooma Thiruvahindrapuram; Dongsheng Lu; Christian R Marshall; Chang Liu; Bo Xie; Wanxing Xu; Lai-Ping Wong; Chee-Wei Yew; Aghakhanian Farhang; Rick Twee-Hee Ong; Mohammad Zahirul Hoque; Abdul Rahman Thuhairah; Bhak Jong; Maude E Phipps; Stephen W Scherer; Yik-Ying Teo; Subbiah Vijay Kumar; Boon-Peng Hoh; Shuhua Xu
Journal: BMC Genomics Date: 2019-11-12 Impact factor: 3.969

4. Genome Informatics 2016.

Authors: Davide Chicco; Michael M Hoffman
Journal: Genome Biol Date: 2017-01-16 Impact factor: 13.583

5. Complex rearrangements and oncogene amplifications revealed by long-read DNA and RNA sequencing of a breast cancer cell line.

Authors: Maria Nattestad; Sara Goodwin; Karen Ng; Timour Baslan; Fritz J Sedlazeck; Philipp Rescheneder; Tyler Garvin; Han Fang; James Gurtowski; Elizabeth Hutton; Elizabeth Tseng; Chen-Shan Chin; Timothy Beck; Yogi Sundaravadanam; Melissa Kramer; Eric Antoniou; John D McPherson; James Hicks; W Richard McCombie; Michael C Schatz
Journal: Genome Res Date: 2018-06-28 Impact factor: 9.043

6. Where did you come from, where did you go: Refining metagenomic analysis tools for horizontal gene transfer characterisation.

Authors: Enrico Seiler; Kathrin Trappe; Bernhard Y Renard
Journal: PLoS Comput Biol Date: 2019-07-23 Impact factor: 4.475

7. Insertion variants missing in the human reference genome are widespread among human populations.

8. PopAlu: population-scale detection of Alu polymorphisms.

9. Discovery and genotyping of novel sequence insertions in many sequenced individuals.

10. Towards a better understanding of the low recall of insertion variants with short-read based variant callers.