SeqTrim: a high-throughput pipeline for pre-processing any type of sequence read.

Juan Falgueras, Antonio J Lara, Noé Fernández-Pozo, Francisco R Cantón, Guillermo Pérez-Trabado, M Gonzalo Claros.

Abstract

BACKGROUND: High-throughput automated sequencing has enabled exponential growth of sequencing data. This growth demands higher sequence quality and reliability in order to avoid contaminating databases with artefactual sequences. The arrival of pyrosequencing exacerbates this problem and necessitates customisable pre-processing algorithms.
RESULTS: SeqTrim has been implemented both as a Web application and as a standalone command-line application. Already-published and newly-designed algorithms have been included to identify sequence inserts, to remove low-quality, vector, adaptor, low-complexity and contaminant sequences, and to detect chimeric reads. The availability of several input and output formats allows its inclusion in sequence processing workflows. Due to its specific algorithms, SeqTrim outperforms other pre-processors implemented as Web services or standalone applications. It performs equally well with sequences from EST libraries, SSH libraries, genomic DNA libraries and pyrosequencing reads, and does not lead to over-trimming.
CONCLUSIONS: SeqTrim is an efficient pipeline designed for pre-processing of any type of sequence read, including next-generation sequencing. It is easily configurable and provides a friendly interface that allows users to know what happened with sequences at every pre-processing stage, and to verify pre-processing of an individual sequence if desired. The recommended pipeline reveals more information about each sequence than previously described pre-processors and can discard more sequencing or experimental artefacts.
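
ILLUSTRATION: The abstract describes a staged pre-processing pipeline that removes adaptor, low-quality and low-complexity sequence and reports what happened to a read at each stage. The Python sketch below illustrates that general idea only; it is not SeqTrim's code or its algorithms. The function names, the thresholds (`min_q`, `window`, `max_fraction`), the exact-match adaptor search and the single-base complexity test are all assumptions chosen for brevity.

```python
from statistics import mean


def clip_adaptor(seq, quals, adaptor):
    """Drop everything from the first exact adaptor match onward (assumed 3' adaptor)."""
    pos = seq.find(adaptor)
    if pos == -1:
        return seq, quals
    return seq[:pos], quals[:pos]


def quality_trim(seq, quals, min_q=20, window=4):
    """Trim each end until a window of `window` bases has mean quality >= min_q."""
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    n = len(seq)
    start = 0
    while start + window <= n and mean(quals[start:start + window]) < min_q:
        start += 1
    end = n
    while end - window >= start and mean(quals[end - window:end]) < min_q:
        end -= 1
    if end - start < window:
        # No window of acceptable quality survives: reject the whole read.
        return "", []
    return seq[start:end], quals[start:end]


def is_low_complexity(seq, max_fraction=0.8):
    """Flag reads dominated by a single base (a deliberately crude, hypothetical test)."""
    if not seq:
        return False
    return max(seq.count(base) for base in set(seq)) / len(seq) > max_fraction


def preprocess(seq, quals, adaptor):
    """Run the stages in order and log the surviving length after each one."""
    log = []
    seq, quals = clip_adaptor(seq, quals, adaptor)
    log.append(("adaptor", len(seq)))
    seq, quals = quality_trim(seq, quals)
    log.append(("quality", len(seq)))
    rejected = not seq or is_low_complexity(seq)
    log.append(("low_complexity", 0 if rejected else len(seq)))
    return ("" if rejected else seq), log


if __name__ == "__main__":
    adaptor = "AGATCGGAAG"           # hypothetical adaptor sequence
    read = "ACGTTGCATGCA" + adaptor  # 12-base insert followed by the adaptor
    quals = [2, 2] + [35] * 20       # two poor-quality bases at the 5' end
    print(preprocess(read, quals, adaptor))
```

The per-stage log mirrors the abstract's point that users can see what happened to a sequence at every stage. A real pre-processor would use alignment-based vector and contaminant detection rather than exact string matching.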

Year:  2010        PMID: 20089148      PMCID: PMC2832897          DOI: 10.1186/1471-2105-11-38

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


  16 in total

1.  Establishing a method of vector contamination identification in database sequences.

Authors:  G A Seluja; A Farmer; M McLeod; C Harger; P A Schad
Journal:  Bioinformatics       Date:  1999-02       Impact factor: 6.937

2.  Repbase update: a database and an electronic journal of repetitive elements.

Authors:  J Jurka
Journal:  Trends Genet       Date:  2000-09       Impact factor: 11.639

3.  An optimized protocol for analysis of EST sequences.

Authors:  F Liang; I Holt; G Pertea; S Karamycheva; S L Salzberg; J Quackenbush
Journal:  Nucleic Acids Res       Date:  2000-09-15       Impact factor: 16.971

4.  ESTAnnotator: A tool for high throughput EST annotation.

Authors:  Agnes Hotz-Wagenblatt; Thomas Hankeln; Peter Ernst; Karl-Heinz Glatting; Erwin R Schmidt; Sándor Suhai
Journal:  Nucleic Acids Res       Date:  2003-07-01       Impact factor: 16.971

5.  LUCY2: an interactive DNA sequence quality trimming and vector removal tool.

Authors:  Song Li; Hui-Hsien Chou
Journal:  Bioinformatics       Date:  2004-05-06       Impact factor: 6.937

6.  Figaro: a novel statistical method for vector sequence removal.

Authors:  James Robert White; Michael Roberts; James A Yorke; Mihai Pop
Journal:  Bioinformatics       Date:  2008-01-17       Impact factor: 6.937

7.  EST2uni: an open, parallel tool for automated EST analysis and database creation, with a data mining web interface and microarray expression data integration.

Authors:  Javier Forment; Francisco Gilabert; Antonio Robles; Vicente Conejero; Fernando Nuez; Jose M Blanca
Journal:  BMC Bioinformatics       Date:  2008-01-07       Impact factor: 3.169

8.  An optimized procedure greatly improves EST vector contamination removal.

Authors:  Yi-An Chen; Chang-Chun Lin; Chin-Di Wang; Huan-Bin Wu; Pei-Ing Hwang
Journal:  BMC Genomics       Date:  2007-11-13       Impact factor: 3.969

9.  ESTpass: a web-based server for processing and annotating expressed sequence tag (EST) sequences.

Authors:  Byungwook Lee; Taehui Hong; Sang Jin Byun; Taeha Woo; Yoon Jeong Choi
Journal:  Nucleic Acids Res       Date:  2007-05-25       Impact factor: 16.971

10.  ESTExplorer: an expressed sequence tag (EST) assembly and annotation platform.

Authors:  Shivashankar H Nagaraj; Nandan Deshpande; Robin B Gasser; Shoba Ranganathan
Journal:  Nucleic Acids Res       Date:  2007-06-01       Impact factor: 16.971

  81 in total

1.  A Large Tn7-like Transposon Confers Hyper-Resistance to Copper in Pseudomonas syringae pv. syringae.

Authors:  Francesca Aprile; Zaira Heredia-Ponce; Francisco M Cazorla; Antonio de Vicente; José A Gutiérrez-Barranquero
Journal:  Appl Environ Microbiol       Date:  2020-12-23       Impact factor: 4.792

2.  Transcriptional response of bathypelagic marine bacterioplankton to the Deepwater Horizon oil spill.

Authors:  Adam R Rivers; Shalabh Sharma; Susannah G Tringe; Jeffrey Martin; Samantha B Joye; Mary Ann Moran
Journal:  ISME J       Date:  2013-08-01       Impact factor: 10.302

Review 3.  Next-generation transcriptome assembly.

Authors:  Jeffrey A Martin; Zhong Wang
Journal:  Nat Rev Genet       Date:  2011-09-07       Impact factor: 53.242

4.  Escaping the cut by restriction enzymes through single-strand self-annealing of host-edited 12-bp and longer synthetic palindromes.

Authors:  Fernando Castro-Chavez
Journal:  DNA Cell Biol       Date:  2011-09-06       Impact factor: 3.311

5.  Metatranscriptomics of N2-fixing cyanobacteria in the Amazon River plume.

Authors:  Jason A Hilton; Brandon M Satinsky; Mary Doherty; Brian Zielinski; Jonathan P Zehr
Journal:  ISME J       Date:  2014-12-16       Impact factor: 10.302

6.  Phenotypic plasticity in heterotrophic marine microbial communities in continuous cultures.

Authors:  Sara Beier; Adam R Rivers; Mary Ann Moran; Ingrid Obernosterer
Journal:  ISME J       Date:  2014-11-14       Impact factor: 10.302

7.  Metagenomic assessment of the potential microbial nitrogen pathways in the rhizosphere of a mediterranean forest after a wildfire.

Authors:  José F Cobo-Díaz; Antonio J Fernández-González; Pablo J Villadas; Ana B Robles; Nicolás Toro; Manuel Fernández-López
Journal:  Microb Ecol       Date:  2015-03-03       Impact factor: 4.552

Review 8.  From next-generation resequencing reads to a high-quality variant data set.

Authors:  S P Pfeifer
Journal:  Heredity (Edinb)       Date:  2016-10-19       Impact factor: 3.821

9.  Understanding pseudo-albinism in sole (Solea senegalensis): a transcriptomics and metagenomics approach.

Authors:  Patricia I S Pinto; Cláudia C Guerreiro; Rita A Costa; Juan F Martinez-Blanch; Carlos Carballo; Francisco M Codoñer; Manuel Manchado; Deborah M Power
Journal:  Sci Rep       Date:  2019-09-20       Impact factor: 4.379

10.  TagCleaner: Identification and removal of tag sequences from genomic and metagenomic datasets.

Authors:  Robert Schmieder; Yan Wei Lim; Forest Rohwer; Robert Edwards
Journal:  BMC Bioinformatics       Date:  2010-06-23       Impact factor: 3.169
