
Comparative genomics reveals long, evolutionarily conserved, low-complexity islands in yeast proteins.

Philip A Romov, Fubin Li, Peter N Lipke, Susan L Epstein, Wei-Gang Qiu.

Abstract

Eukaryotic proteomes abound in low-complexity sequences, including tandem repeats and regions with significantly biased amino acid compositions. We assessed the functional importance of compositionally biased sequences in the yeast proteome using an evolutionary analysis of 2838 orthologous open reading frame (ORF) families from three Saccharomyces species (S. cerevisiae, S. bayanus, and S. paradoxus). Sequence conservation was measured by amino acid sequence variability and by the ratio of nonsynonymous-to-synonymous nucleotide substitutions (K(a)/K(s)) between pairs of orthologous ORFs. A total of 1033 ORF families contained one or more long (at least 45 residues), low-complexity islands as defined by a measure based on the Shannon information index. Low-complexity islands were generally less conserved than ORFs as a whole; on average they were 50% more variable in amino acid sequence and had 50% higher K(a)/K(s) ratios. Fast-evolving low-complexity sequences outnumbered conserved low-complexity sequences by a ratio of 10 to 1. Sequence differences between orthologous ORFs were well fit by a selectively neutral Poisson model of sequence divergence. We therefore used the Poisson model to identify conserved low-complexity sequences. ORFs containing the 33 most conserved low-complexity sequences were enriched for those encoding nucleic acid-binding proteins, cytoskeleton components, and intracellular transporters. While a few conserved low-complexity islands were known functional domains (e.g., DNA/RNA-binding domains), most were uncharacterized. We discuss how comparative genomics of closely related species can be employed further to distinguish functionally important, shorter, low-complexity sequences from the vast majority of such sequences likely maintained by neutral processes.
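
The abstract does not spell out the Shannon-based measure, so the following is only a minimal sketch of the general idea, not the authors' method. It computes the Shannon entropy of the amino acid composition in a sliding window and merges overlapping low-entropy windows into islands. The function names, the default entropy cutoff `max_entropy`, and the use of the 45-residue minimum island length as the window size are assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(segment: str) -> float:
    """Shannon entropy (bits per residue) of a segment's residue composition."""
    n = len(segment)
    counts = Counter(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_islands(seq: str, window: int = 45, max_entropy: float = 2.5):
    """Return half-open (start, end) spans covered by low-entropy windows.

    window      -- window length; 45 mirrors the minimum island length in the
                   abstract, but using it as the window size is an assumption
    max_entropy -- hypothetical cutoff in bits, not a value from the paper
    """
    islands = []
    for start in range(len(seq) - window + 1):
        end = start + window
        if shannon_entropy(seq[start:end]) <= max_entropy:
            if islands and start <= islands[-1][1]:
                # Window overlaps or touches the previous island: extend it.
                islands[-1] = (islands[-1][0], end)
            else:
                islands.append((start, end))
    return islands
```

A sequence of 20 equally frequent residues has the maximum entropy of log2(20), about 4.32 bits, while a homopolymer run has entropy 0, so lower cutoffs select more strongly biased regions.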


Year: 2006   PMID: 16927006   DOI: 10.1007/s00239-005-0291-0

Source DB: PubMed   Journal: J Mol Evol   ISSN: 0022-2844   Impact factor: 2.395


