
An automated, high-throughput sequence read classification pipeline for preliminary genome characterization.

Philippe Chouvarine, Surya Saha, Daniel G Peterson.

Abstract

In the absence of a complete genome sequence, considerable insight into genome structure can be gained from survey sequencing of genomic DNA. To facilitate high-throughput characterization of genome structure based on shotgun sequence reads, we have developed an automated sequence read classification pipeline (SRCP). The SRCP uses a battery of novel and standard sequence analysis algorithms along with a sophisticated decision tree to place reads into "best fit" functional/descriptive categories. Once "primed" with genomic sequence data, the SRCP also permits estimation of the gene/repeat enrichment afforded by reduced-representation sequencing techniques. To our knowledge, the SRCP is the only tool designed to describe a genome or a genome component based on sample sequence reads. In an initial test using sequence data from Sorghum bicolor, the SRCP produced results similar in quality to those generated by manual classification. Although the SRCP is not a replacement for manual sequence characterization, it can provide a rapid, high-quality overview of genome sequence content and facilitate subsequent annotation. The SRCP can presumably be adapted for analysis of any eukaryotic genome.

Year:  2007        PMID: 17868636     DOI: 10.1016/j.ab.2007.08.008

Source DB:  PubMed          Journal:  Anal Biochem        ISSN: 0003-2697            Impact factor:   3.365


  3 in total

1.  Adventures in the enormous: a 1.8 million clone BAC library for the 21.7 Gb genome of loblolly pine.

Authors:  Zenaida V Magbanua; Seval Ozkan; Benjamin D Bartlett; Philippe Chouvarine; Christopher A Saski; Aaron Liston; Richard C Cronn; C Dana Nelson; Daniel G Peterson
Journal:  PLoS One       Date:  2011-01-21       Impact factor: 3.240

2.  Characterization of the genome of bald cypress.

Authors:  Wenxuan Liu; Supaphan Thummasuwan; Sunish K Sehgal; Philippe Chouvarine; Daniel G Peterson
Journal:  BMC Genomics       Date:  2011-11-11       Impact factor: 3.969

3.  Empirical comparison of ab initio repeat finding programs.

Authors:  Surya Saha; Susan Bridges; Zenaida V Magbanua; Daniel G Peterson
Journal:  Nucleic Acids Res       Date:  2008-02-20       Impact factor: 16.971

