Zhengyuan Wang, Dante Zarlenga, John Martin, Sahar Abubucker, Makedonka Mitreva.
Abstract
BACKGROUND: Proteins convey the majority of biochemical and cellular activities in organisms. Over the course of evolution, proteins undergo normal sequence mutations as well as large-scale mutations involving domain duplication and/or domain shuffling. These events result in the generation of new proteins and protein families. Processes that affect proteome evolution drive species diversity and adaptation. Herein, change over the course of metazoan evolution, as defined by birth/death and duplication/deletion events within protein families and domains, was examined using the proteomes of nine metazoan and two outgroup species.
Year: 2012 PMID: 22862991 PMCID: PMC3483195 DOI: 10.1186/1471-2148-12-138
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Classification of protein families and domains
| | | | | | |
|---|---|---|---|---|---|
| Family | 17,752 | 810 | 3,620 | 9,145 | 4,177 |
| Domain | 5,106 | 1,172 | 274 | 633 | 3,027 |
Birth and death evolutionary events
| | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| Hsa | 0.05 | 177 | 129 | 39 | 42 | 251 | 706 | 950 | 4208 |
| Mmu | 0.06 | 106 | 96 | 27 | 48 | 245 | 668 | 709 | 4165 |
| ((Tsp (Cbr Cel)) ((Bmo (Aae Dme)) (Gga (Hsa Mmu)))) | 0.16 | 1274 | 59 | 645 | 76 | 1789 | 369 | 7541 | 1760 |
| (Tsp (Cbr Cel)) | 0.13 | 147 | 548 | 21 | 303 | 102 | 2816 | 284 | 11139 |
| ((Bmo (Aae Dme)) (Gga (Hsa Mmu))) | 0.09 | 805 | 36 | 175 | 27 | 474 | 1576 | 2538 | 5220 |
| (Cbr Cel) | 0.55 | 4564 | 379 | 91 | 159 | 658 | 1060 | 2118 | 3508 |
| Tsp | 0.62 | 1087 | 932 | 15 | 614 | 436 | 1506 | 1000 | 5778 |
| Cbr | 0.09 | 520 | 81 | 24 | 102 | 145 | 299 | 781 | 1264 |
| Cel | 0.09 | 295 | 51 | 31 | 91 | 174 | 319 | 1100 | 1044 |
| (Bmo (Aae Dme)) | 0.18 | 725 | 446 | 37 | 261 | 695 | 1593 | 2104 | 7310 |
| (Gga (Hsa Mmu)) | 0.33 | 2113 | 255 | 348 | 89 | 1567 | 2126 | 8494 | 7548 |
| (Aae Dme) | 0.14 | 452 | 205 | 13 | 78 | 137 | 758 | 396 | 2583 |
| Bmo | 0.38 | 273 | 574 | 34 | 394 | 270 | 1675 | 615 | 6864 |
| Aae | 0.28 | 346 | 363 | 28 | 328 | 652 | 1040 | 1693 | 4114 |
| Dme | 0.33 | 249 | 357 | 52 | 116 | 368 | 961 | 999 | 3472 |
| (Hsa Mmu) | 0.09 | 1144 | 50 | 123 | 32 | 304 | 329 | 1132 | 1409 |
| Gga | 0.12 | 60 | 487 | 24 | 342 | 87 | 1346 | 262 | 7607 |
a Death/Birth events were normalized to branch lengths; Dupl/Del events were normalized to both the branch lengths and the total number of universal families/domains.
b Species codes: Tsp: T. spiralis; Cbr: C. briggsae; Cel: C. elegans; Bmo: B. mori; Aae: A. aegypti; Dme: D. melanogaster; Gga: G. gallus; Mmu: M. musculus; Hsa: H. sapiens. c Dupl, duplication; d Del, deletion.
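The normalization in footnote a can be read as a simple rescaling. The following is a minimal sketch, assuming that "normalized to" means division; the source gives no explicit formula, and the function and parameter names are hypothetical.

```python
def normalize_birth_death(events: float, branch_length: float) -> float:
    """Birth or death events per unit branch length (assumed: plain division)."""
    if branch_length <= 0:
        raise ValueError("branch_length must be positive")
    return events / branch_length


def normalize_dupl_del(events: float, branch_length: float, n_universal: int) -> float:
    """Duplication or deletion events per unit branch length and per
    universal family/domain (assumed: division by the product of both)."""
    if branch_length <= 0 or n_universal <= 0:
        raise ValueError("branch_length and n_universal must be positive")
    return events / (branch_length * n_universal)
```

For example, with made-up inputs, `normalize_dupl_del(100, 0.5, 200)` rescales 100 events on a branch of length 0.5 against 200 universal families.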
Pearson's correlation coefficients (bold text) and their significance (regular text) for different evolutionary events
| | ||||||||
|---|---|---|---|---|---|---|---|---|
| | ||||||||
| | | 0.016 | 0.012 | 0.017 | 0.050 | 0.553 | 0.024 | 0.355 |
| | | |||||||
| | | | 0.108 | 4.5E-06 | 0.200 | 0.005 | 0.102 | 0.001 |
| | | | ||||||
| | | | | 0.135 | 3.70E-07 | 0.969 | 8.3E-08 | 0.862 |
| | | | | |||||
| | | | | | 0.139 | 0.072 | 0.096 | 0.043 |
| | | | | | ||||
| | | | | | | 0.838 | 4.5027E-10 | 0.716 |
| | | | | | | |||
| | | | | | | | 0.821 | 4.5E-08 |
| | | | | | | | ||
| | | | | | | | | 0.872 |
a Significance of Pearson's correlation coefficient was tested using the t-distribution.
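The test in footnote a is usually carried out by converting the coefficient r into a t statistic with n - 2 degrees of freedom. The sketch below shows that standard conversion in plain Python; the function names are hypothetical, and the source does not state the number of paired observations n or whether the test was one- or two-sided.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)), compared against a t-distribution
    with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)  # perfect correlation
    return r * math.sqrt((n - 2) / (1.0 - r * r))
```

A two-sided p-value is then twice the upper-tail probability of |t| under that distribution; with SciPy installed this could be obtained as `2 * scipy.stats.t.sf(abs(t), n - 2)`.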
Figure 1. Protein family and protein domain change indices. At each lineage, the index for protein family change is followed by that for domain change (separated by a slash ‘/’). The index for protein family change was calculated using the log ratio of protein family birth and death events reconstructed from 17,752 homologous multimember families (151,044 proteins), thus representing how changes in protein families at any given lineage favor family gain or family loss. The index defining the change in protein domain complexity was calculated using the log ratio of protein domain birth and death events reconstructed from 123,084 proteins (5,106 domains). Analogous to protein family change, this represents how domain changes at any given lineage favor domain gain or loss.
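The change index described in this caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the ratio is taken as births over deaths, the natural logarithm is used because the caption does not give a base, and the function name is hypothetical.

```python
import math


def change_index(births: float, deaths: float) -> float:
    """Log ratio of birth to death events at one lineage.

    Positive values mean births outnumber deaths (net gain);
    negative values mean deaths outnumber births (net loss).
    Natural log is an assumption; the source does not state the base.
    """
    if births <= 0 or deaths <= 0:
        raise ValueError("event counts must be positive for a log ratio")
    return math.log(births / deaths)
```

Equal birth and death counts give an index of zero, so the sign alone indicates whether a lineage favors gain or loss.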
Figure 2. Domain shuffling indices associated with the lineages over metazoan evolution. The indices are the log ratio of protein family birth and protein domain birth events inferred in the corresponding lineage.
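Written out, the definition in this caption takes the form below. The symbols are introduced here for illustration only: $B_{\mathrm{fam}}(\ell)$ and $B_{\mathrm{dom}}(\ell)$ denote the protein family and protein domain birth events inferred at lineage $\ell$, the ratio is assumed to be families over domains, and the logarithm base is left unspecified, as in the caption.

$$
S(\ell) = \log \frac{B_{\mathrm{fam}}(\ell)}{B_{\mathrm{dom}}(\ell)} = \log B_{\mathrm{fam}}(\ell) - \log B_{\mathrm{dom}}(\ell)
$$

Under this reading, $S(\ell) > 0$ at a lineage where more families than domains were born, and $S(\ell) < 0$ where domain births outnumber family births.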
Figure 3. Adaptation indices associated with metazoan lineages. The indices are the summation of the logarithm of protein family birth events and death events inferred at the corresponding lineages, normalized by the branch length of the lineage.
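One possible reading of this caption is sketched below. It assumes the index is log(births) + log(deaths) divided by the branch length, with the natural logarithm; the caption could also be read as the logarithm of the summed events, and the source does not state the base. The function name is hypothetical.

```python
import math


def adaptation_index(births: float, deaths: float, branch_length: float) -> float:
    """(log(births) + log(deaths)) / branch_length for one lineage.

    Assumptions: natural log, and 'summation of the logarithm' read as
    the sum of two separate logarithms.
    """
    if births <= 0 or deaths <= 0:
        raise ValueError("event counts must be positive")
    if branch_length <= 0:
        raise ValueError("branch_length must be positive")
    return (math.log(births) + math.log(deaths)) / branch_length
```

Unlike a birth/death log ratio, this quantity grows with both kinds of event, so it measures overall family turnover per unit branch length rather than the direction of change.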
Figure 4. Distribution of protein domains among the protein families at the last common ancestor (LCA) of each of the three metazoan groups and the universal families.
Figure 5. A putative format for generating the vertebrate-specific protein structure of PHD finger protein 3 (Cluster3894). The domain structure of PHD finger protein 3 was formed through domain shuffling between universal families, transcription elongation factor A (Cluster1010) and histone acetyltransferase (Cluster330), followed by the addition of a new functional domain.
Enriched biological process GO terms in protein families born at the LCA of the three major metazoan groups and the LCA of metazoans
| Families born at the LCA of metazoans | | ||
| | GO:0006355 | regulation of transcription, DNA-dependent | 1.29E-106 |
| | GO:0006836 | neurotransmitter transport | 7.59E-11 |
| | GO:0007275 | multicellular organismal development | 1.25E-09 |
| | GO:0006637 | acyl-CoA metabolic process | 4.33E-08 |
| | GO:0007186 | G-protein coupled receptor protein signaling pathway | 1.35E-07 |
| | GO:0007040 | lysosome organization and biogenesis | 4.78E-06 |
| | GO:0007223 | Wnt receptor signaling pathway, calcium modulating pathway | 4.78E-06 |
| | GO:0030704 | vitelline membrane formation | 6.47E-06 |
| | GO:0006508 | proteolysis | 1.48E-05 |
| | GO:0006094 | gluconeogenesis | 1.90E-05 |
| | GO:0006665 | sphingolipid metabolic process | 3.24E-05 |
| | GO:0007179 | transforming growth factor beta receptor signaling pathway | 3.24E-05 |
| | GO:0006869 | lipid transport | 5.22E-05 |
| | GO:0045449 | regulation of transcription | 9.33E-05 |
| | GO:0006835 | dicarboxylic acid transport | 3.50E-04 |
| | GO:0007600 | sensory perception | 3.50E-04 |
| | GO:0045087 | innate immune response | 8.15E-04 |
| | GO:0007026 | negative regulation of microtubule depolymerization | 9.14E-04 |
| Families born at the LCA of nematodes | | ||
| | GO:0051085 | chaperone cofactor-dependent protein folding | 9.14E-04 |
| | |||
| | GO:0007186 | G-protein coupled receptor protein signaling pathway | 3.54E-116 |
| | GO:0016998 | cell wall catabolic process | 7.49E-06 |
| | GO:0005992 | trehalose biosynthetic process | 2.37E-05 |
| | GO:0006812 | cation transport | 2.98E-04 |
| Families born at the LCA of arthropods | | ||
| | |||
| | GO:0006030 | chitin metabolic process | 1.08E-22 |
| | GO:0006814 | sodium ion transport | 2.32E-18 |
| | GO:0006950 | response to stress | 2.02E-11 |
| | GO:0007608 | sensory perception of smell | 4.74E-11 |
| Families born at the LCA of vertebrates | | ||
| | |||
| | GO:0007186 | G-protein coupled receptor protein signaling pathway | 7.41E-155 |
| | GO:0006955 | immune response | 1.29E-39 |
| | GO:0001558 | regulation of cell growth | 2.57E-10 |
| | GO:0007154 | cell communication | 4.19E-10 |
| | GO:0050909 | sensory perception of taste | 4.88E-10 |
| | GO:0045087 | innate immune response | 1.04E-09 |
| | GO:0015671 | oxygen transport | 7.68E-09 |
| | GO:0048468 | cell development | 1.03E-08 |
| | GO:0006691 | leukotriene metabolic process | 7.08E-08 |
| | GO:0042981 | regulation of apoptosis | 1.99E-07 |
| | GO:0019882 | antigen processing and presentation | 3.04E-07 |
| | GO:0009395 | phospholipid catabolic process | 3.60E-06 |
| | GO:0006915 | apoptosis | 5.43E-06 |
| | GO:0016049 | cell growth | 2.51E-05 |
| | GO:0006486 | protein amino acid glycosylation | 3.09E-05 |
| | GO:0006952 | defense response | 5.80E-05 |
| | GO:0006071 | glycerol metabolic process | 8.96E-05 |
| | GO:0009607 | response to biotic stimulus | 2.40E-04 |
| | GO:0030178 | negative regulation of Wnt receptor signaling pathway | 2.54E-04 |
| | GO:0043065 | positive regulation of apoptosis | 2.54E-04 |
a GO, Gene Ontology; b LCA, Last Common Ancestor.
Enriched biological process GO terms in protein families that died at the LCAs of three major metazoan groups
| Families died at the LCA of arthropods | | ||
| | GO:0006637 | acyl-CoA metabolic process | 1.75E-18 |
| | GO:0030704 | vitelline membrane formation | 9.07E-18 |
| | GO:0006869 | lipid transport | 2.08E-09 |
| | GO:0006814 | sodium ion transport | 1.81E-08 |
| | GO:0045454 | cell redox homeostasis | 2.84E-08 |
| | GO:0006555 | methionine metabolic process | 4.00E-04 |
| | GO:0006952 | defense response | 5.87E-04 |
| Families died at the LCA of nematodes | | ||
| | GO:0006308 | DNA catabolic process | 8.53E-08 |
| | GO:0006281 | DNA repair | 7.29E-05 |
| | GO:0035023 | regulation of Rho protein signal transduction | 2.10E-04 |
| | GO:0006779 | porphyrin biosynthetic process | 2.65E-04 |
| | GO:0006493 | protein amino acid O-linked glycosylation | 2.96E-04 |
| | GO:0017000 | antibiotic biosynthetic process | 8.77E-04 |
| Families died at the LCA of vertebrates | | ||
| | GO:0007186 | G-protein coupled receptor protein signaling pathway | 3.00E-12 |
| | GO:0016539 | intein-mediated protein splicing | 1.00E-07 |
| | GO:0007154 | cell communication | 3.28E-05 |
| | GO:0006030 | chitin metabolic process | 9.21E-05 |
| | GO:0006097 | glyoxylate cycle | 1.83E-04 |
| | GO:0007275 | multicellular organismal development | 1.98E-04 |
a GO, Gene Ontology; b LCA, Last Common Ancestor.
Figure 6. Protein families (A) and protein domains (B) exhibiting duplication and/or deletion at the last common ancestor (LCA) of metazoans and at the LCA of vertebrates (bold). The numbers of protein families and domains without any duplications or deletions are shown in the upper left corner.