Yuan Zhou, Jing Liu, Lei Han, Zhi-Gang Li, Ziding Zhang.
Abstract
BACKGROUND: The presence of tandem amino acid repeats (AARs) is one of the signatures of eukaryotic proteins. AARs were thought to be frequently involved in bio-molecular interactions. Comprehensive studies that primarily focused on metazoan AARs have suggested that AARs are evolving rapidly and are highly variable among species. However, there is still controversy over causal factors of this inter-species variation. In this work, we attempted to investigate this topic mainly by comparing AARs in orthologous proteins from ten angiosperm genomes.
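As a rough illustration of what detecting an AAR can look like, the sketch below scans a protein sequence for uninterrupted runs of a single residue. This record does not state the authors' detection criteria, so the function name `find_single_residue_repeats`, the minimum run length of 5, and the restriction to exact runs with no mismatches are all assumptions made for this example only.

```python
import re
from typing import List, Tuple


def find_single_residue_repeats(seq: str, min_len: int = 5) -> List[Tuple[str, int, int]]:
    """Return (residue, 0-based start, length) for each run of one residue.

    Assumption: a repeat is an uninterrupted run of the same amino acid
    of at least `min_len` residues. The threshold is hypothetical and is
    not taken from the study.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    # One letter followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_len - 1))
    return [
        (m.group(1), m.start(), len(m.group(0)))
        for m in pattern.finditer(seq.upper())
    ]


# Made-up sequence with a run of 7 Q and a run of 6 S.
example = "MST" + "Q" * 7 + "APL" + "S" * 6 + "K"
runs = find_single_residue_repeats(example)
for residue, start, length in runs:
    print(residue, start, length)
```

Raising `min_len` makes the scan stricter. Tolerating mismatches inside a run would need a different approach, such as a sliding window with a purity cutoff.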
Year: 2011 PMID: 22195734 PMCID: PMC3283746 DOI: 10.1186/1471-2164-12-632
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Summary of the ten angiosperm genomes included in this study
| Organism | Abbreviation | Genome size (Mbp) | Number of proteins | Reference |
|---|---|---|---|---|
| | Arabidopsis | 120 | 27, 1692 | 25 |
| | papaya | 372 | 27,181 | [ |
| | soybean | 1,100 | 46,260 | [ |
| | apple | 742 | 62,997 | [ |
| | cottonwood | 550 | 40,664 | [ |
| | grape | 500 | 26,092 | [ |
| | false brome | 272 | 25,525 | [ |
| | rice | 382 | 56,795 | [ |
| | sorghum | 735 | 27,561 | [ |
| | maize | 2,500 | 32,606 | [ |
The genome size is the total length of chromosomal sequences. A gene with multiple protein products was counted as one protein in this table.
Figure 1. Correlation between AAR and coding GC content in plants. (A) Among angiosperm orthologs (the abbreviations used are detailed in Table 1), the average values are shown; (B) among Arabidopsis, moss and green algae orthologs, the average values are shown; (C) for all angiosperm orthologs from Arabidopsis; (D) for all angiosperm orthologs from rice; (E) in different regions of orthologous angiosperm proteins.
Figure 2. Residue composition in AARs and across the entire set of orthologs. The fractions of residues from AARs and from the entire set of orthologs are shown as green columns and yellow columns, respectively. Only positive error bars are shown.
Putative functions for conserved long AARs in Arabidopsis
| Protein | Type | Description (gene/AAR) | Clue |
|---|---|---|---|
| AT3G24650.1 | S | | Biochemical [ |
| AT1G43850.1 | Q | | Phenotypic [ |
| AT4G32551.1 | Q | | Speculation only [ |
| AT5G67470.1 | P | | Speculation only [ |
| AT1G25540.1 | Q | | Speculation only [ |
Fraction of fully disordered AARs
| Abbreviation | VSL2B Fraction | IUPred Fraction |
|---|---|---|
| Arabidopsis | 82.5% | 48.0% |
| papaya | 77.8% | 41.6% |
| soybean | 78.6% | 46.3% |
| apple | 67.9% | 45.0% |
| cottonwood | 79.6% | 45.5% |
| grape | 73.5% | 39.7% |
| false brome | 80.2% | 44.6% |
| rice | 81.5% | 45.6% |
| sorghum | 80.2% | 44.9% |
| maize | 78.4% | 42.4% |
All fractions of fully disordered AARs were significantly higher than those in random samples (p < 0.001).
Figure 3. Regulation of RCPs at the transcript level. (A) Comparison of mRNA half-lives of RCPs (green boxes) and non-RCPs (yellow boxes) in Arabidopsis. Large outliers (> 50 h) are not shown. (B) Comparison of the tissue specificity index of RCPs with a single gene model (green boxes) and non-RCPs with a single gene model (yellow boxes) in Arabidopsis and rice orthologs.