Comparative genomics reveals long, evolutionarily conserved, low-complexity islands in yeast proteins.

Philip A Romov, Fubin Li, Peter N Lipke, Susan L Epstein, Wei-Gang Qiu.

Abstract

Eukaryotic proteomes abound in low-complexity sequences, including tandem repeats and regions with significantly biased amino acid compositions. We assessed the functional importance of compositionally biased sequences in the yeast proteome using an evolutionary analysis of 2838 orthologous open reading frame (ORF) families from three Saccharomyces species (S. cerevisiae, S. bayanus, and S. paradoxus). Sequence conservation was measured by the amino acid sequence variability and by the ratio of nonsynonymous-to-synonymous nucleotide substitutions (K(a)/K(s)) between pairs of orthologous ORFs. A total of 1033 ORF families contained one or more long (at least 45 residues), low-complexity islands as defined by a measure based on the Shannon information index. Low-complexity islands were generally less conserved than ORFs as a whole; on average they were 50% more variable in amino acid sequences and 50% higher in K(a)/K(s) ratios. Fast-evolving low-complexity sequences outnumbered conserved low-complexity sequences by a ratio of 10 to 1. Sequence differences between orthologous ORFs fit well to a selectively neutral Poisson model of sequence divergence. We therefore used the Poisson model to identify conserved low-complexity sequences. ORFs containing the 33 most conserved low-complexity sequences were overrepresented by those encoding nucleic acid binding proteins, cytoskeleton components, and intracellular transporters. While a few conserved low-complexity islands were known functional domains (e.g., DNA/RNA-binding domains), most were uncharacterized. We discuss how comparative genomics of closely related species can be employed further to distinguish functionally important, shorter, low-complexity sequences from the vast majority of such sequences likely maintained by neutral processes.
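The abstract defines low-complexity islands as stretches of at least 45 residues identified by a measure based on the Shannon information index, but it does not give the exact formula or cutoff. The following is a minimal sketch of one common way to do this, assuming a sliding window scored by the Shannon entropy of its amino acid composition. The function names, the 45-residue window used as the scan width, and the 3.0-bit threshold are illustrative assumptions, not the authors' published parameters.

```python
import math
from collections import Counter


def shannon_entropy(segment: str) -> float:
    """Shannon entropy (bits per residue) of the residue composition of `segment`."""
    counts = Counter(segment)
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_islands(seq: str, window: int = 45, max_entropy: float = 3.0):
    """Scan `seq` with a fixed-width window and merge overlapping low-entropy windows.

    Returns a list of half-open (start, end) intervals. Both `window` and
    `max_entropy` are hypothetical defaults chosen for illustration only.
    """
    islands = []
    for start in range(len(seq) - window + 1):
        if shannon_entropy(seq[start:start + window]) <= max_entropy:
            end = start + window
            if islands and start <= islands[-1][1]:
                # Overlapping or adjacent window: extend the current island.
                islands[-1] = (islands[-1][0], max(islands[-1][1], end))
            else:
                islands.append((start, end))
    return islands
```

A homopolymer run scores zero bits and is always reported, while a window drawing evenly on all 20 amino acids approaches the maximum of log2(20), about 4.32 bits, and is skipped. Because every reported island is built from whole windows, each one is at least `window` residues long, which mirrors the length criterion stated in the abstract.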

Substances:
Fungal Proteins
Proteome

Year: 2006 PMID: 16927006 DOI: 10.1007/s00239-005-0291-0

Source DB: PubMed Journal: J Mol Evol ISSN: 0022-2844 Impact factor: 2.395
