Luke Hakes, John W Pinney, Simon C Lovell, Stephen G Oliver, David L Robertson.
Abstract
BACKGROUND: Genes in populations are in constant flux, being gained through duplication and occasionally retained or, more frequently, lost from the genome. In this study we compare pairs of identifiable gene duplicates generated by small-scale (predominantly single-gene) duplications with those created by a large-scale gene duplication event (whole-genome duplication) in the yeast Saccharomyces cerevisiae.
Year: 2007 PMID: 17916239 PMCID: PMC2246283 DOI: 10.1186/gb-2007-8-10-r209
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Comparison of the shared interaction ratio for duplicate gene products and random protein pairs. Whole-genome duplicates (WGDs) are illustrated in blue and small-scale duplicates (SSDs) are illustrated in red. Mean shared interaction ratio r is plotted against gene sequence divergence measured by non-synonymous substitution rate (Ka). The dashed lines indicate the average shared interaction ratio for WGDs (blue), SSDs (red), and pairs of proteins selected at random from the genome (black). Error bars show the standard error of the mean of r for each bin.
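The caption does not define the shared interaction ratio r. As a minimal sketch only, the snippet below assumes a Jaccard-style definition (shared interaction partners divided by all partners of either protein). The function name and the toy partner sets are hypothetical and are not taken from the paper.

```python
def shared_interaction_ratio(partners_a, partners_b):
    """Assumed definition: |A ∩ B| / |A ∪ B| over interaction partners."""
    a, b = set(partners_a), set(partners_b)
    union = a | b
    if not union:
        # Neither protein has any recorded interaction partner.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical toy data: two duplicates sharing two of four partners.
r = shared_interaction_ratio({"P1", "P2", "P3"}, {"P2", "P3", "P4"})
print(r)  # 0.5
```

Under this assumed definition, r ranges from 0 (no partners in common) to 1 (identical partner sets); the paper's own definition may differ.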
Figure 2. Relationship between semantic distance and the proportion of pairs within each duplicate set. Whole-genome duplicates (WGDs) are illustrated in blue, small-scale duplicates (SSDs) in red, and random gene pairings in gray. A higher semantic distance indicates greater functional divergence.
Table 1. Over-represented and under-represented functional annotations within the different duplicate sets
| GO ID | Description | Total | Observed | P value | Corrected P value |
| Over-represented in set of WGDs | |||||
| 0004672 | Protein kinase activity | 127 | 52 | 2.7 × e-12 | <0.001 |
| 0003735 | Structural constituent of ribosome | 217 | 72 | 3.4 × e-11 | <0.001 |
| 0016773 | Phosphotransferase activity, alcohol group as acceptor | 171 | 61 | 3.9 × e-11 | <0.001 |
| 0016301 | Kinase activity | 197 | 67 | 4.9 × e-11 | <0.001 |
| 0004674 | Protein serine/threonine kinase activity | 69 | 32 | 1.1 × e-09 | <0.001 |
| 0016772 | Transferase activity, transferring phosphorus-containing groups | 294 | 78 | 4.3 × e-07 | <0.001 |
| 0016538 | Cyclin-dependent protein kinase regulator activity | 23 | 14 | 8.8 × e-07 | <0.001 |
| 0005198 | Structural molecule activity | 338 | 83 | 5.6 × e-06 | 0.001 |
| 0030234 | Enzyme regulator activity | 180 | 50 | 1.4 × e-05 | 0.002 |
| 0019887 | Protein kinase regulator activity | 44 | 18 | 4.3 × e-05 | 0.004 |
| 0016740 | Transferase activity | 641 | 135 | 4.6 × e-05 | 0.004 |
| 0005083 | Small GTPase regulator activity | 47 | 18 | 1.2 × e-04 | 0.018 |
| 0019207 | Kinase regulator activity | 47 | 18 | 1.2 × e-04 | 0.018 |
| 0035251 | UDP-glucosyltransferase activity | 13 | 8 | 2.0 × e-04 | 0.027 |
| 0003704 | Specific RNA polymerase II transcription factor activity | 45 | 17 | 2.2 × e-04 | 0.029 |
| 0016791 | Phosphoric monoester hydrolase activity | 88 | 27 | 2.4 × e-04 | 0.029 |
| 0030508 | Thiol-disulfide exchange intermediate activity | 8 | 6 | 2.9 × e-04 | 0.042 |
| Under-represented in set of WGDs | |||||
| 0008757 | S-adenosylmethionine-dependent methyltransferase activity | 62 | 0 | 2.7 × e-05 | <0.001 |
| 0016741 | Transferase activity, transferring one-carbon groups | 84 | 2 | 8.7 × e-05 | 0.003 |
| 0015078 | Hydrogen ion transporter activity | 54 | 0 | 1.1 × e-04 | 0.006 |
| 0008168 | Methyltransferase activity | 82 | 2 | 1.2 × e-04 | 0.006 |
| 0031202 | RNA splicing factor activity, transesterification mechanism | 51 | 0 | 1.8 × e-04 | 0.008 |
| 0016251 | General RNA polymerase II transcription factor activity | 62 | 1 | 3.4 × e-04 | 0.014 |
| Over-represented in set of SSDs | |||||
| 0051119 | Sugar transporter activity | 25 | 22 | 2.7 × e-20 | <0.001 |
| 0015144 | Carbohydrate transporter activity | 30 | 23 | 1.5 × e-18 | <0.001 |
| 0015145 | Monosaccharide transporter activity | 21 | 18 | 2.3 × e-16 | <0.001 |
| 0015149 | Hexose transporter activity | 21 | 18 | 2.3 × e-16 | <0.001 |
| 0015578 | Mannose transporter activity | 15 | 15 | 3.0 × e-16 | <0.001 |
| 0005353 | Fructose transporter activity | 15 | 15 | 3.0 × e-16 | <0.001 |
| 0017111 | Nucleoside-triphosphatase activity | 243 | 65 | 7.3 × e-16 | <0.001 |
| 0005355 | Glucose transporter activity | 18 | 16 | 3.5 × e-15 | <0.001 |
| 0016818 | Hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides | 264 | 67 | 4.4 × e-15 | <0.001 |
| 0016462 | Pyrophosphatase activity | 264 | 67 | 4.4 × e-15 | <0.001 |
| 0016817 | Hydrolase activity, acting on acid anhydrides | 264 | 67 | 4.4 × e-15 | <0.001 |
| 0005215 | Transporter activity | 410 | 84 | 6.7 × e-13 | <0.001 |
| 0003824 | Catalytic activity | 1885 | 252 | 7.3 × e-13 | <0.001 |
| 0016887 | ATPase activity | 185 | 46 | 2.5 × e-10 | <0.001 |
| 0016787 | Hydrolase activity | 707 | 109 | 2.1 × e-08 | <0.001 |
| 0016614 | Oxidoreductase activity, acting on CH-OH group of donors | 75 | 24 | 3.2 × e-08 | <0.001 |
| 0016616 | Oxidoreductase activity, acting on the CH-OH group of donors, NAD or NADP as acceptor | 67 | 22 | 7.2 × e-08 | <0.001 |
| 0004386 | Helicase activity | 83 | 24 | 2.8 × e-07 | <0.001 |
| 0042626 | ATPase activity, coupled to transmembrane movement of substances | 58 | 19 | 5.9 × e-07 | <0.001 |
| 0043492 | ATPase activity, coupled to movement of substances | 58 | 19 | 5.9 × e-07 | <0.001 |
| 0016820 | Hydrolase activity, acting on acid anhydrides, catalyzing transmembrane movement of substances | 58 | 19 | 5.9 × e-07 | <0.001 |
| 0016491 | Oxidoreductase activity | 262 | 49 | 1.2 × e-06 | <0.001 |
| 0015075 | Ion transporter activity | 145 | 32 | 2.6 × e-06 | <0.001 |
| 0008324 | Cation transporter activity | 124 | 28 | 6.9 × e-06 | <0.001 |
| 0042623 | ATPase activity, coupled | 125 | 28 | 8.2 × e-06 | 0.001 |
| 0018456 | Aryl-alcohol dehydrogenase activity | 8 | 6 | 1.5 × e-05 | 0.002 |
| 0015294 | Solute:cation symporter activity | 8 | 6 | 1.5 × e-05 | 0.002 |
| 0003924 | GTPase activity | 54 | 16 | 2.0 × e-05 | 0.002 |
| 0005354 | Galactose transporter activity | 6 | 5 | 3.9 × e-05 | 0.009 |
| 0015293 | Symporter activity | 9 | 6 | 4.3 × e-05 | 0.012 |
| 0005537 | Mannose binding | 4 | 4 | 7.6 × e-05 | 0.017 |
| 0015238 | Drug transporter activity | 15 | 7 | 2.0 × e-04 | 0.035 |
| 0003678 | DNA helicase activity | 35 | 11 | 2.2 × e-04 | 0.039 |
| Under-represented in set of SSDs | |||||
| 0003676 | Nucleic acid binding | 494 | 12 | 1.8 × e-10 | 0 |
| 0005488 | Binding | 1034 | 58 | 1.1 × e-06 | 0 |
| 0003723 | RNA binding | 231 | 4 | 1.6 × e-06 | 0 |
| 0003677 | DNA binding | 220 | 6 | 8.0 × e-05 | 0.002 |
| 0030528 | Transcription regulator activity | 326 | 14 | 3.3 × e-04 | 0.006 |
| 0016779 | Nucleotidyltransferase activity | 80 | 0 | 3.7 × e-04 | 0.009 |
GO, Gene Ontology; SSD, small-scale duplicate; WGD, whole-genome duplicate.
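The record does not state which statistical test produced the P values above. A common choice for this kind of enrichment analysis is the hypergeometric test, sketched below as an illustration rather than as the authors' method. The "Total" and "observed" columns map to K and k; the genome size N and duplicate-set size n are not given here, so the values used for them are placeholders and the result will not reproduce the published figures. No multiple-testing correction is applied.

```python
from math import comb


def upper_tail(k, N, K, n):
    """P(X >= k): chance of seeing at least k annotated genes (over-representation)."""
    hi = min(K, n)
    num = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, hi + 1))
    return num / comb(N, n)


def lower_tail(k, N, K, n):
    """P(X <= k): chance of seeing at most k annotated genes (under-representation)."""
    lo = max(0, n - (N - K))
    hi = min(k, K, n)
    num = sum(comb(K, i) * comb(N - K, n - i) for i in range(lo, hi + 1))
    return num / comb(N, n)


# K and k come from the "Protein kinase activity" row; N and n are PLACEHOLDERS.
N_PLACEHOLDER = 6000  # hypothetical number of annotated genes in the genome
n_PLACEHOLDER = 900   # hypothetical number of genes in the WGD set
p_over = upper_tail(52, N_PLACEHOLDER, 127, n_PLACEHOLDER)
print(f"illustrative over-representation P = {p_over:.3g}")
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow in the individual terms; only the final ratio is converted to a float.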
Figure 3. Visualization of the two sets of duplicates on a semantic distance network. (a) The yeast proteome is distributed spatially according to semantic distance, with six high-level functional classes highlighted in different colors that are either over-represented or under-represented in the whole-genome duplicate (WGD) or small-scale duplicate (SSD) sets (see Table 1). (b) WGDs are shown in blue and SSDs in red; the same six functional classes are highlighted. The products of the two types of duplicate gene have a tendency to occupy separate areas of semantic space, indicating involvement in different functions.
Table 2. The relationship between dispensability and functional category for both WGDs and SSDs
| GO ID | Description | % all ORFs | % SSDs | % WGDs |
| Over-represented in set of essential genes | ||||
| 0003824 | Catalytic activity | 32.5 | 46.5+ | 35.8 |
| 0005488 | Binding | 17.8 | 10.7- | 17.9 |
| 0016740 | Transferase activity | 11.1 | 9.4 | 15.0+ |
| 0003676 | Nucleic acid binding | 8.5 | 2.2- | 10.1 |
| 0005515 | Protein binding | 7.5 | 5.5 | 5.9 |
| 0005198 | Structural molecule activity | 5.8 | 8.1 | 9.2+ |
| 0030528 | Transcription regulator activity | 5.6 | 2.6- | 7.0 |
| 0016772 | Transferase activity, transferring phosphorus-containing groups | 5.1 | 4.6 | 8.7+ |
| 0016462 | Pyrophosphatase activity | 4.6 | 12.4+ | 3.1 |
| 0016817 | Hydrolase activity, acting on acid anhydrides | 4.6 | 12.4+ | 3.1 |
| 0016818 | Hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides | 4.6 | 12.4+ | 3.1 |
| 0017111 | Nucleoside-triphosphatase activity | 4.2 | 12.0+ | 2.7 |
| 0003723 | RNA binding | 4.0 | 0.7- | 3.1 |
| 0016887 | ATPase activity | 3.2 | 8.5+ | 1.8 |
| 0016874 | Ligase activity | 2.2 | 1.7 | 2.0 |
| 0003702 | RNA polymerase II transcription factor activity | 2.1 | 0.7 | 2.7 |
| 0004386 | Helicase activity | 1.4 | 4.4+ | 0.4 |
| 0016779 | Nucleotidyltransferase activity | 1.4 | 0.0- | 1.0 |
| 0016251 | General RNA polymerase II transcription factor activity | 1.1 | 0.4 | 0.1- |
| Under-represented in set of essential genes | ||||
| 0005215 | Transporter activity | 7.1 | 15.5+ | 6.2 |
| 0016491 | Oxidoreductase activity | 4.5 | 9.0+ | 5.3 |
| 0015075 | Ion transporter activity | 2.5 | 5.9+ | 1.6 |
| 0008324 | Cation transporter activity | 2.1 | 5.2+ | 1.1 |
Gene Ontology (GO) categories significantly over-represented and under-represented (corrected P < 0.05) are sorted by abundance (1% cut-off). Significant over-representation and under-representation in the duplicate sets are denoted by superscript '+' and '-', respectively. ORF, open reading frame; SSD, small-scale duplicate; WGD, whole-genome duplicate.
Table 3. Dispensability of SSD and WGD proteins found within protein complexes and those not found within complexes
| | WGD | SSD |
| Complexes | ||
| Essential | 16 (10%) | 15 (21%) |
| Not essential | 138 (90%) | 55 (79%) |
| Total | 154 | 70 |
| Non-complexes | ||
| Essential | 32 (5%) | 28 (7%) |
| Not essential | 642 (95%) | 398 (93%) |
| Total | 674 | 426 |
SSD, small-scale duplicate; WGD, whole-genome duplicate.
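The percentages in the table can be rechecked from the raw counts, and the essential-gene proportions of SSDs and WGDs can be compared with an odds ratio. The sketch below does both and adds a one-sided Fisher exact test as one possible significance check. The choice of test and all function names are assumptions, not something the record attributes to the authors.

```python
from math import comb

# Counts copied from the table: (essential, not essential).
counts = {
    "complex":     {"WGD": (16, 138), "SSD": (15, 55)},
    "non-complex": {"WGD": (32, 642), "SSD": (28, 398)},
}


def pct_essential(essential, not_essential):
    """Percentage of essential genes, rounded to a whole number as in the table."""
    return round(100 * essential / (essential + not_essential))


def odds_ratio(ssd, wgd):
    """Odds of being essential for SSDs relative to WGDs."""
    return (ssd[0] / ssd[1]) / (wgd[0] / wgd[1])


def fisher_one_sided(a, b, c, d):
    """P(X >= a) for the 2x2 table [[a, b], [c, d]] with fixed margins."""
    N, K, n = a + b + c + d, a + c, a + b
    num = sum(comb(K, i) * comb(N - K, n - i) for i in range(a, min(K, n) + 1))
    return num / comb(N, n)


for group, sets in counts.items():
    ssd, wgd = sets["SSD"], sets["WGD"]
    p = fisher_one_sided(ssd[0], ssd[1], wgd[0], wgd[1])
    print(group, pct_essential(*wgd), pct_essential(*ssd),
          round(odds_ratio(ssd, wgd), 2), f"{p:.3g}")
```

The recomputed percentages agree with the rounded values in the table (10% and 21% for complexes, 5% and 7% for non-complexes).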
Figure 4. Relationship between semantic distance, duplicate set, and complex membership. The proportion of duplicate pairs at each level of functional divergence, as measured by semantic distance, is shown for pairs of complex-forming whole-genome duplicate (WGD; dark blue), complex-forming small-scale duplicate (SSD; red), non-complex-forming WGD (light blue), and non-complex-forming SSD (pink) proteins. The degree of functional divergence differs significantly between pairs in the two categories (complex and non-complex). No significant difference in semantic distance is observed between SSD pairs and WGD pairs found in complexes, nor between SSD pairs and WGD pairs not found in complexes.