
Evaluation of EST-data using the genome assembly.

Christian G Murray, Thomas P Larsson, Tobias Hill, Rikard Björklind, Robert Fredriksson, Helgi B Schiöth.

Abstract

Using expressed sequence tag (EST) data for genome-wide studies requires a thorough understanding of the problems involved in handling these sequences. We investigated how EST clustering performs when the genome is used as a guide, compared with pairwise sequence alignment methods. We show that clustering with the genome as a template outperforms the sequence similarity methods used to create other EST clusters, such as the UniGene set, with respect to the extent to which ESTs originating from the same transcriptional unit are separated into disjoint clusters. Using our approach, approximately 80% of the RefSeq genes were represented by a single EST cluster and 20% by two or more EST clusters. In contrast, approximately 25% of all RefSeq genes were found to be represented by a single cluster with the UniGene clustering method. The approach minimizes the risk of overestimates caused by the number of disjoint clusters originating from the same transcript. We have also investigated the quality of EST data by aligning ESTs to the genome. The results show how many ESTs are not adequately trimmed of vector sequences and low-quality regions. Moreover, we identified important problems related to ESTs aligned to the genome using BLAT, such as inferring splice junctions, and explained this aspect by simulations with synthetic data. EST clusters created with the method are available upon request from the authors.
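The idea of clustering with the genome as a template can be illustrated with a minimal sketch. It is not the authors' implementation, which the abstract does not describe. It assumes that each EST has already been aligned to the genome (for example with BLAT) and reduced to a single genomic span, and it assumes a simple rule: ESTs whose spans overlap on the same chromosome and strand belong to one cluster. The `Alignment` record, the `cluster_by_genome` function, and the example coordinates are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical record: one EST reduced to its genomic span."""
    est_id: str
    chrom: str
    strand: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def cluster_by_genome(alignments):
    """Group ESTs whose genomic spans overlap on the same chromosome and strand.

    Assumed rule for illustration only: any overlap of spans joins two ESTs.
    """
    # Bucket alignments by (chromosome, strand) so only co-located ESTs are compared.
    by_locus = defaultdict(list)
    for aln in alignments:
        by_locus[(aln.chrom, aln.strand)].append(aln)

    clusters = []
    for key in sorted(by_locus):
        current, current_end = [], None
        # Sweep left to right, extending the open cluster while spans overlap.
        for aln in sorted(by_locus[key], key=lambda a: (a.start, a.end)):
            if current and aln.start < current_end:
                current.append(aln.est_id)
                current_end = max(current_end, aln.end)
            else:
                if current:
                    clusters.append(current)
                current, current_end = [aln.est_id], aln.end
        if current:
            clusters.append(current)
    return clusters


# Made-up coordinates, not data from the study.
example = [
    Alignment("est_a", "chr1", "+", 100, 400),
    Alignment("est_b", "chr1", "+", 350, 900),
    Alignment("est_c", "chr1", "+", 5000, 5300),
    Alignment("est_d", "chr1", "-", 120, 380),
]
clusters = cluster_by_genome(example)
print(clusters)
```

In this sketch, membership depends only on genomic position, so two ESTs can be grouped even when they share no sequence directly, provided their genomic spans overlap. A pairwise similarity method has no such common coordinate system to fall back on.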

Year:  2005        PMID: 15883052     DOI: 10.1016/j.bbrc.2005.04.070

Source DB:  PubMed          Journal:  Biochem Biophys Res Commun        ISSN: 0006-291X            Impact factor:   3.575


  2 in total

1.  Clinical uses of microarrays in cancer research (review).

Authors:  Carl Virtanen; James Woodgett
Journal:  Methods Mol Med       Date:  2008

2.  The G protein-coupled receptor subset of the chicken genome.

Authors:  Malin C Lagerström; Anders R Hellström; David E Gloriam; Thomas P Larsson; Helgi B Schiöth; Robert Fredriksson
Journal:  PLoS Comput Biol       Date:  2006-06-02       Impact factor: 4.475
