Large scale comparison of non-human sequences in human sequencing data.

Hongseok Tae, Enusha Karunasena, Jasmin H Bavarva, Lauren J McIver, Harold R Garner.

Abstract

Several studies have demonstrated that unmapped reads in next-generation sequencing data could be used to identify infectious agents or structural variants, but there has been no intensive effort to analyze and classify all non-human sequences found in individual large data sets. To identify commonality in non-human sequences by infectious agents and putative contamination events, we analyzed non-human sequences in 150 genomic sequencing data files from the 1000 Genomes Project and observed that 0.13% of reads on average showed similarities to non-human genomes. We compared results among different sample groups divided based on ethnicities, sequencing centers and enrichment methods (whole-genome sequencing vs. exome sequencing) and found that sequencing centers had specific signatures of contaminating genomes as 'time stamps'. We also observed many unmapped reads that falsely indicated contamination because of the high similarity of human sequences to sequences in non-human genome assemblies such as mouse and Nicotiana. Published by Elsevier Inc.
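
The kind of summary statistic described in the abstract (an average percentage of reads resembling non-human genomes, compared across sample groups) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the record fields (`center`, `method`, `total_reads`, `nonhuman_reads`) and all counts are hypothetical and are not data from the study.

```python
from collections import defaultdict
from statistics import mean


def nonhuman_percent(total_reads, nonhuman_reads):
    """Percentage of reads in one sample that resemble non-human genomes."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * nonhuman_reads / total_reads


def mean_percent_by_group(samples, key):
    """Average the per-sample percentage within each group (e.g. sequencing center)."""
    grouped = defaultdict(list)
    for sample in samples:
        pct = nonhuman_percent(sample["total_reads"], sample["nonhuman_reads"])
        grouped[sample[key]].append(pct)
    return {group: mean(values) for group, values in grouped.items()}


# Made-up counts for illustration only; NOT data from the study.
samples = [
    {"center": "A", "method": "WGS", "total_reads": 1_000_000, "nonhuman_reads": 1_000},
    {"center": "A", "method": "exome", "total_reads": 2_000_000, "nonhuman_reads": 4_000},
    {"center": "B", "method": "WGS", "total_reads": 1_000_000, "nonhuman_reads": 3_000},
]

by_center = mean_percent_by_group(samples, "center")
by_method = mean_percent_by_group(samples, "method")
```

Averaging per-sample percentages, rather than pooling all reads, keeps a single very deep sample from dominating the group mean; whether the study used this or a pooled estimate is not stated in the abstract.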

Keywords:  Non-human sequences; Sequencing contamination; Unmapped reads

Year:  2014        PMID: 25173571      PMCID: PMC4262678          DOI: 10.1016/j.ygeno.2014.08.009

Source DB:  PubMed          Journal:  Genomics        ISSN: 0888-7543            Impact factor:   5.736


  16 in total

1.  Whole-Genome Sequencing Reveals Age-Specific Changes in the Human Blood Microbiota.

Authors:  Eun-Ju Lee; Joohon Sung; Hyung-Lae Kim; Han-Na Kim
Journal:  J Pers Med       Date:  2022-06-07

2.  What human sperm RNA-Seq tells us about the microbiome.

Authors:  Grace M Swanson; Sergey Moskovtsev; Clifford Librach; J Richard Pilsner; Robert Goodrich; Stephen A Krawetz
Journal:  J Assist Reprod Genet       Date:  2020-01-04       Impact factor: 3.412

3.  Sequencing facility and DNA source associated patterns of virus-mappable reads in whole-genome sequencing data.

Authors:  Xun Chen; Dawei Li
Journal:  Genomics       Date:  2020-12-07       Impact factor: 5.736

4.  What's in your next-generation sequence data? An exploration of unmapped DNA and RNA sequence reads from the bovine reference individual.

Authors:  Lynsey K Whitacre; Polyana C Tizioto; JaeWoo Kim; Tad S Sonstegard; Steven G Schroeder; Leeson J Alexander; Juan F Medrano; Robert D Schnabel; Jeremy F Taylor; Jared E Decker
Journal:  BMC Genomics       Date:  2015-12-29       Impact factor: 3.969

5.  Genomic leftovers: identifying novel microsatellites, over-represented motifs and functional elements in the human genome.

Authors:  Natalie C Fonville; Karthik Raja Velmurugan; Hongseok Tae; Zalman Vaksman; Lauren J McIver; Harold R Garner
Journal:  Sci Rep       Date:  2016-06-09       Impact factor: 4.379

6.  Lessons for livestock genomics from genome and transcriptome sequencing in cattle and other mammals.

Authors:  Jeremy F Taylor; Lynsey K Whitacre; Jesse L Hoff; Polyana C Tizioto; JaeWoo Kim; Jared E Decker; Robert D Schnabel
Journal:  Genet Sel Evol       Date:  2016-08-17       Impact factor: 4.297

7.  Using reference-free compressed data structures to analyze sequencing reads from thousands of human genomes.

Authors:  Dirk D Dolle; Zhicheng Liu; Matthew Cotten; Jared T Simpson; Zamin Iqbal; Richard Durbin; Shane A McCarthy; Thomas M Keane
Journal:  Genome Res       Date:  2016-12-16       Impact factor: 9.043

8.  A virome-wide clonal integration analysis platform for discovering cancer viral etiology.

Authors:  Xun Chen; Jason Kost; Arvis Sulovari; Nathalie Wong; Winnie S Liang; Jian Cao; Dawei Li
Journal:  Genome Res       Date:  2019-03-14       Impact factor: 9.043

9.  De Novo Transcriptome Meta-Assembly of the Mixotrophic Freshwater Microalga Euglena gracilis.

Authors:  Javier Cordoba; Emilie Perez; Mick Van Vlierberghe; Amandine R Bertrand; Valérian Lupo; Pierre Cardol; Denis Baurain
Journal:  Genes (Basel)       Date:  2021-05-29       Impact factor: 4.096

10.  Small RNA-Based Antiviral Defense in the Phytopathogenic Fungus Colletotrichum higginsianum.

Authors:  Sonia Campo; Kerrigan B Gilbert; James C Carrington
Journal:  PLoS Pathog       Date:  2016-06-02       Impact factor: 6.823
