Marcela Dávila López, Juan José Martínez Guerra, Tore Samuelsson.
Abstract
The order of genes in eukaryotes is not entirely random. Studies of gene order conservation are important for understanding genome evolution and for revealing why certain neighboring genes are more difficult to separate during evolution. Here, genome-wide gene order information was compiled for 64 species, representing a wide variety of eukaryotic phyla. This information is presented in a browser where gene order may be displayed and compared between species. Factors related to non-random gene order in eukaryotes were examined by considering pairs of neighboring genes. The evolutionary conservation of gene pairs was studied with respect to relative transcriptional direction, intergenic distance and functional relationship as inferred by gene ontology. The results show that among conserved gene pairs, divergently and co-directionally transcribed genes are much more common than convergently transcribed ones. Furthermore, highly conserved pairs, in particular those of fungi, are characterized by a short intergenic distance. Finally, gene pairs of metazoa and fungi that are evolutionarily conserved and divergently transcribed are much more likely to be related by function than poorly conserved gene pairs. One example is the ribosomal protein gene pair L13/S16, which is unusual as it occurs both in fungi and alveolates. A specific functional relationship between these two proteins is also suggested by the fact that they are part of the same operon in both eubacteria and archaea. In conclusion, factors associated with non-random gene order in eukaryotes include relative gene orientation, intergenic distance and functional relationships. It seems likely that certain pairs of genes are conserved because the genes involved have a transcriptional and/or functional relationship. The results also indicate that studies of gene order conservation aid in identifying genes that are related in terms of transcriptional control.
Year: 2010 PMID: 20498846 PMCID: PMC2871058 DOI: 10.1371/journal.pone.0010654
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Phylogenetic tree of species used in this study.
Tree was constructed by parsimony analysis of concatenated α-tubulin, β-tubulin, actin and the elongation factor 1-alpha (EF-1α) amino acid sequences, as further described under “Materials and Methods”.
Figure 2. Eukaryotic Gene Order Browser (eGOB).
Genomic context of organisms that share a divergently transcribed pair of the heat shock proteins Hsp10 (red) and Hsp60 (yellow) as seen through the Eukaryotic Gene Order Browser (http://egob.biomedicine.gu.se). Arrows indicate the relative directions of genes. Homologous sequences, i.e. protein sequences that belong to the same cluster as defined in this case by OrthoMCL, are in the same colour.
Figure 3. Evolutionary conservation and relative gene orientation.
For a range of evolutionary distances within an interval of 0.01 units the number of gene pairs corresponding to a certain relative gene orientation was calculated and plotted. Gene orientations considered were divergent (← →), convergent (→ ←) and co-directional (→→). Cumulative counts of gene pairs are shown. Randomized counts were obtained by shuffling, for every species, the identities of OrthoMCL clusters or Pfam groups. Based on these randomizations, the probability of finding a pair of genes with the same relative orientation in at least two different species by chance alone is approximately 0.002–0.01. A. Genes clustered using OrthoMCL. B. Genes grouped on the basis of Pfam architectures.
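The randomization described in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each species is a single ordered list of `(cluster_id, strand)` tuples, and the function names (`neighbour_pairs`, `count_shared_pairs`, `shuffle_clusters`, `randomized_counts`) are hypothetical. It shuffles cluster identities within each species while leaving strands in place, then counts neighboring pairs with the same relative orientation that occur in at least two species.

```python
import random
from collections import Counter

# Relative orientation of two neighbouring genes, read left to right.
# Any pair of equal strands is treated as co-directional.
ORIENTATION = {("-", "+"): "divergent", ("+", "-"): "convergent"}


def neighbour_pairs(genome):
    """Return the set of (cluster pair, orientation) keys for adjacent genes.

    `genome` is a list of (cluster_id, strand) tuples in chromosomal order
    (assumption: one linear sequence per species).
    """
    keys = set()
    for (c1, s1), (c2, s2) in zip(genome, genome[1:]):
        orientation = ORIENTATION.get((s1, s2), "co-directional")
        keys.add((tuple(sorted((c1, c2))), orientation))
    return keys


def count_shared_pairs(genomes):
    """Count pair keys that occur in at least two species."""
    seen = Counter()
    for genome in genomes.values():
        seen.update(neighbour_pairs(genome))  # a set, so one vote per species
    return sum(1 for n in seen.values() if n >= 2)


def shuffle_clusters(genome, rng):
    """Permute cluster identities within one species, keeping strands in place."""
    clusters = [c for c, _ in genome]
    rng.shuffle(clusters)
    return [(c, s) for c, (_, s) in zip(clusters, genome)]


def randomized_counts(genomes, n_rounds=100, seed=0):
    """Shared-pair counts for `n_rounds` independent shuffles of every species."""
    rng = random.Random(seed)
    return [
        count_shared_pairs({sp: shuffle_clusters(g, rng) for sp, g in genomes.items()})
        for _ in range(n_rounds)
    ]
```

Comparing `count_shared_pairs` on the real gene order with the distribution returned by `randomized_counts` gives a rough sense of how many shared pairs would be expected by chance under this simplified model.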
Figure 4. Relationship between intergenic distance and evolutionary conservation within the phylogenetic groups Metazoa and Fungi.
For all gene pairs present in more than one species a measure of evolutionary conservation was calculated based on the species involved, as described under Materials and Methods. Lowess regression lines are shown. For the calculation of evolutionary conservation, only species within the respective group (i.e. Metazoa or Fungi) were considered. For reference, the mean values of intergenic distances for the divergently (← →), convergently (→ ←) and co-directionally (→→) transcribed gene pairs are 34912, 34165 and 22923 for Metazoa and 1343, 688 and 1230 for Fungi.
Figure 5. Size distribution of intergenic regions in vertebrates.
Distribution of intergenic distances among divergently (← →), convergently (→ ←) and co-directionally (→→) transcribed gene pairs for selected organisms. An enrichment of bidirectional gene pairs is observed in vertebrates (Gallus gallus) but not in fishes (D. rerio and O. latipes) and non-vertebrate animals (Figure S2).
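A size distribution of this kind can be tabulated from gene coordinates roughly as below. This is a sketch under stated assumptions: genes are `(start, end, strand)` tuples with 1-based inclusive coordinates on a single sequence, the bin size of 500 is arbitrary, and the helper names `classify` and `distance_histogram` are hypothetical rather than taken from the study.

```python
from collections import Counter, defaultdict


def classify(left_strand, right_strand):
    """Relative orientation of two adjacent genes, read left to right."""
    if left_strand == right_strand:
        return "co-directional"
    return "divergent" if left_strand == "-" else "convergent"


def distance_histogram(genes, bin_size=500):
    """Histogram of intergenic distances per orientation class.

    Returns {orientation: Counter({bin_start: count})}. Overlapping
    neighbours have a negative distance and fall into the bin that
    starts at -bin_size.
    """
    ordered = sorted(genes)  # sort by start coordinate
    hist = defaultdict(Counter)
    for (_, end1, s1), (start2, _, s2) in zip(ordered, ordered[1:]):
        distance = start2 - end1 - 1  # bases strictly between the two genes
        bin_start = (distance // bin_size) * bin_size
        hist[classify(s1, s2)][bin_start] += 1
    return hist
```

An excess of short distances in the `"divergent"` class relative to the other two classes would correspond to the enrichment of bidirectional pairs mentioned in the caption.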
Figure 6. Functional relationship of adjacent genes.
Gene pairs of Metazoa (panel A) and Fungi (panel B) are analyzed with respect to evolutionary conservation, relative gene orientation and functional similarity. For a range of evolutionary distances within an interval of 0.01 units the fraction of gene pairs where both genes have a GO similarity score larger than 0.4 [51] was calculated and plotted. For this plot genes were originally clustered with OrthoMCL.
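The binned fraction described in this caption could be computed along these lines. It is a minimal sketch, assuming each gene pair is reduced to an `(evolutionary_conservation, gs2_score)` tuple; the function name `fraction_functionally_related` and the integer bin indexing are illustrative choices, not part of the original analysis.

```python
import math
from collections import defaultdict


def fraction_functionally_related(pairs, bin_width=0.01, gs2_cutoff=0.4):
    """Map conservation bin index -> fraction of pairs with GS2 > cutoff.

    `pairs` is an iterable of (evolutionary_conservation, gs2_score) tuples.
    Bin k covers conservation values in [k * bin_width, (k + 1) * bin_width).
    """
    totals = defaultdict(int)
    related = defaultdict(int)
    for conservation, gs2 in pairs:
        # The small epsilon guards against floating-point results that
        # land just below a bin edge.
        k = math.floor(conservation / bin_width + 1e-9)
        totals[k] += 1
        if gs2 > gs2_cutoff:  # strictly larger than the cutoff
            related[k] += 1
    return {k: related[k] / totals[k] for k in sorted(totals)}
```

Running this separately for divergent, convergent and co-directional pairs would give one curve per orientation class.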
Evolutionarily conserved gene pairs.
| No. Org | No. Gene pairs | Evol. Cons. | GS2 score | GO | Gene 1 description | Gene 2 description | Func. Rel. | Phylum |
| 9 | 9 | 1.13 | 0.80 | * | 40S ribosomal protein S16 | 60S ribosomal protein L13 | ✓ | Fc Fm Fs Ft Pr |
| 24 | 68 | 0.95 | 0.63 | * | Histone H2B | Histone H2A | ✓ | Fb Fc Fp Fs Ft Fz M |
| 8 | 24 | 0.75 | 0.84 | * | ATP-binding cassette sub-family A | ATP-binding cassette sub-family A | ✓ | Fc H M V Pr |
| 19 | 21 | 0.64 | 0.00 | | 60S ribosomal protein L21 | 40S ribosomal protein S9 | ✓ | Fb Fc Fp Fz |
| 12 | 14 | 0.60 | 0.75 | * | Probable pyridoxine biosynthesis protein SNZ1 | Probable glutamine amidotransferase SNO1 | ✓ | Fp Fs M Pr |
| 17 | 17 | 0.56 | 0.38 | * | DNA replication licensing factor MCM2 | Protein mlo2 | | Fp Fs Ft |
| 16 | 16 | 0.56 | 0.00 | | Putative Xaa-Pro aminopeptidase - Uncharacterized peptidase C22G7.01c | Importin beta-like protein kap111 - Pleiotropic drug resistance regulatory protein 6 - Tryptophanyl-tRNA synthetase, mitochondrial | | Fp Fs Ft |
| 17 | 17 | 0.55 | 0.25 | * | Eukaryotic initiation factor 4A-III | Pre-mRNA-splicing factor PRP9 | | Fb Fc Fp |
| 9 | 9 | 0.55 | 1.00 | * | Chitin synthase | Chitin synthase | ✓ | Fp H |
| 4 | 19 | 0.54 | 0.62 | * | Histone H2B | Histone H2A | ✓ | Fs M V |
| 16 | 21 | 0.49 | 0.80 | * | Histone H4 | Histone H3 | ✓ | Fp Fs Ft |
| 12 | 12 | 0.48 | 0.10 | * | Inositol hexakisphosphate and diphosphoinositol-pentakisphosphate kinase | U3 small nucleolar RNA-associated protein 21 - Uncharacterized WD repeat-containing protein C1672.07 | | Fp Fs Ft |
| 11 | 11 | 0.48 | 0.30 | * | Uncharacterized protein C11G11.07 - mRNA transport regulator MTR10 | Probable small nuclear ribonucleoprotein E | | Fb Fp Fs Ft |
| 15 | 15 | 0.47 | 0.27 | * | Pre-mRNA-splicing factor SYF1 | Vacuolar proton pump subunit D | | Fp Fs |
| 14 | 14 | 0.46 | 0.00 | * | 60S ribosomal protein L11 | Small nuclear ribonucleoprotein-associated protein B | | Fb Fp Ft |
| 14 | 14 | 0.46 | 0.22 | * | Ribosome biogenesis protein RLP24 | Mitochondrial import inner membrane translocase subunit TIM14 | | Fp Fs |
| 17 | 17 | 0.46 | 0.13 | * | ATP-dependent rRNA helicase RRP3 | Brix domain-containing protein C1B9.03c - Ribosome biogenesis protein SSF1 | | Fp Fs |
| 13 | 13 | 0.45 | 0.00 | | Uncharacterized protein | Uncharacterized protein | | Fp Fs |
| 14 | 14 | 0.45 | 0.43 | * | U3 small nucleolar RNA-associated protein 17 | DNA-directed RNA polymerases I, II, and III subunit RPABC5 | | Fp Fs Ft |
| 2 | 3 | 0.45 | 0.12 | * | Protein kinase gsk3 | Guanosine-diphosphatase | | Fm Fz |
| 12 | 16 | 0.44 | 0.80 | * | Iron transport multicopper oxidase FET precursor | Iron transporter FTH1 - Plasma membrane iron permease | ✓ | Fb Fc Fp Ft Fz |
| 12 | 17 | 0.43 | 0.09 | * | Alpha-glucosidase | Alpha-glucosides permease MPH2/3 | | Fb Fp Fs |
| 7 | 7 | 0.43 | 0.62 | * | Homogentisate 1,2-dioxygenase | Fumarylacetoacetase | | Fp H |
| 15 | 15 | 0.42 | 0.60 | * | Palmitoyltransferase ERF2 | Uncharacterized protein C3H7.08c | | Fp Ft |
| 15 | 15 | 0.42 | 0.47 | * | Protein CASP | Vacuolar protein sorting-associated protein 3 | | Fp Ft |
| 15 | 15 | 0.42 | 0.45 | * | Eukaryotic translation initiation factor 5A-1,2 | Vacuolar protein sorting-associated protein 52 | | Fp Ft |
| 15 | 15 | 0.42 | 0.11 | * | Vacuolar protein-sorting-associated protein 24 | Protein wos2 | | Fp Ft |
| 15 | 15 | 0.42 | 0.00 | | Uncharacterized WD repeat-containing protein | RNA processing protein efg1 | | Fp Ft |
| 15 | 15 | 0.42 | 0.00 | | Regulator of ribosome biosynthesis | 37S ribosomal protein S23, mitochondrial | | Fp Ft |
| 4 | 4 | 0.41 | 0.19 | * | Pre-mRNA-splicing factor CWC24 (Complexed with CEF1 protein 24) | Co-chaperone protein HscB, mitochondrial precursor | | Fs V |
| 14 | 14 | 0.41 | 0.09 | * | Actin-related protein 2/3 complex subunit 4 | Ubiquitin carboxyl-terminal hydrolase 6 | | Fp Ft |
| 12 | 12 | 0.41 | 0.30 | * | ATP-dependent RNA helicase DBP5 | Uncharacterized protein C12C2.05c | | Fp Ft |
| 14 | 14 | 0.40 | 0.91 | * | DNA repair protein RAD16 | DNA repair protein RAD7 | ✓ | Fp Ft |
| 13 | 13 | 0.40 | 0.35 | * | Calcineurin subunit B | Enhancer of polycomb-like protein 1 | | Fp Ft |
| 15 | 15 | 0.40 | 0.24 | * | DNA-directed RNA polymerase III subunit RPC3 | Cytochrome b-c1 complex subunit 2, mitochondrial precursor | | Fp Fs |
| 14 | 14 | 0.40 | 0.69 | * | DNA-directed RNA polymerases I, II, and III subunit RPABC2 | Transcription factor IIIA | ✓ | Fp Fs |
| 11 | 11 | 0.40 | 0.00 | * | Cullin-3 | Uncharacterized protein C24H6.02c | | Fp Ft |
| 13 | 13 | 0.40 | 0.51 | * | Serine/threonine-protein kinase chk1 | Ubiquitin-conjugating enzyme E2-20 kDa | | Fp Ft |
| 13 | 13 | 0.40 | 0.24 | * | Eukaryotic peptide chain release factor GTP-binding subunit | Ran-specific GTPase-activating protein 30 | | Fp Ft |
| 10 | 10 | 0.40 | 0.00 | | Protein pdh1 precursor - Uncharacterized membrane protein YOL107W | Uncharacterized WD repeat-containing protein C1235.09 | | Fp Ft |
| 12 | 12 | 0.39 | 0.53 | * | Elongation of fatty acids protein 2 | Cytochrome c oxidase polypeptide VI, mitochondrial precursor | | Fp Ft |
| 13 | 13 | 0.39 | 0.00 | | Probable 60S ribosomal protein L28e | UPF0357 protein C1687.07 precursor | | Fp Ft |
| 10 | 10 | 0.39 | 0.27 | * | Geranylgeranyl transferase type-2 subunit alpha | Meiosis-specific APC/C activator protein AMA1 | | Fp Ft |
| 14 | 14 | 0.39 | 0.71 | * | 60 kDa heat shock protein, mitochondrial precursor | 10 kDa heat shock protein, mitochondrial precursor | ✓ | Fb Fc M |
| 8 | 12 | 0.39 | 0.00 | | Beta-1,3-glucan-binding protein precursor | | | Fb Fp Fs |
| 6 | 6 | 0.39 | 0.00 | | 40S ribosomal protein S15 | 60S acidic ribosomal protein P2-beta | ✓ | Fb Fs Ft |
| 13 | 16 | 0.38 | 1.00 | * | 3-oxoacyl-(acyl-carrier-protein) synthase | S-acyl fatty acid synthase thioesterase | ✓ | Fb Fp |
| 2 | 2 | 0.38 | 0.00 | | Vesicle associated membrane protein | DNA excision repair protein ERCC-1 | | V Pr |
| 12 | 12 | 0.38 | 0.42 | * | Histone deacetylase | Chromatin modification-related protein YNG2 | | Fp Ft |
| 11 | 11 | 0.38 | 0.23 | * | Biotin ligase | Mitochondrial genome maintenance protein MGM101, mitochondrial precursor | | Fp Ft |
| 3 | 4 | 0.38 | 0.65 | * | ATP synthase subunit beta, mitochondrial precursor | ATP synthase subunit delta, mitochondrial precursor | ✓ | Fc Fz H |
Gene pairs are ordered according to evolutionary conservation. The first column shows the number of species in which a particular gene pair is present. The second column shows the total count of the gene pair in all species where it occurs. A star (*) indicates that both genes in a pair have a GO annotation. Functional relationships were inferred by mining of literature. Fp, Pezizomycotina; Fs, Saccharomycotina; Ft, Taphrinomycotina; Fb, Basidiomycota; Fc, Chytridiomycota; Fm, Microsporidia; M, Mammals; V, Viridiplantae; Pr, Protozoa; H, Heterokonta.
Conserved pairs of divergently transcribed genes from human.
| Gene 1 | Gene 2 | Evolutionary conservation | Intergenic distance (bp) | GS2 score | Gene 1 description | Gene 2 description | Functional relationship | References |
| HIST1H2AJ | HIST1H2BM | 0.54 | 304 | 0.62 | Histone H2A type 1-J | Histone H2B type 1-M | ✓ | |
| HSPD1 | HSPE1 | 0.39 | 49 | 0.71 | 60 kDa heat shock protein, mitochondrial precursor | 10 kDa heat shock protein, mitochondrial | ✓ | |
| IMMP1L | ELP4 | 0.28 | 128 | 0.34 | Mitochondrial inner membrane protease subunit 1 | Elongator complex protein 4 | | |
| PPAT | PAICS | 0.43 | 70 | 0.38 | Amidophosphoribosyltransferase precursor | Multifunctional protein ADE2 | ✓ | |
| GBA2 | RGP1 | 0.16 | 115 | 0 | Non-lysosomal glucosylceramidase | Retrograde Golgi transport protein RGP1 homolog | | |
| COL4A1 | COL4A2 | 0.15 | 118 | 0.76 | Collagen alpha-1(IV) chain precursor (Arresten) | Collagen alpha-2(IV) chain precursor | ✓ | |
| DUOX2 | DUOXA2 | 0.14 | <0 | 0.17 | Dual oxidase 2 | Dual oxidase maturation factor 2 | ✓ | |
| DUOXA1 | DUOX1 | 0.14 | 135 | 0.17 | Dual oxidase maturation factor 1 | Dual oxidase 1 | ✓ | |
| RTN4IP1 | QRSL1 | 0.13 | 80 | 0.75 | Reticulon-4-interacting protein 1, mitochondrial precursor | Glutamyl-tRNA(Gln) amidotransferase subunit A homolog | | |
| LRBA | MAB21L2 | 0.13 | <0 | 0.14 | Lipopolysaccharide-responsive and beige-like anchor protein | Protein mab-21-like 2 | | |
The ten most conserved human bidirectional gene pairs; only pairs with an intergenic distance of less than 1000 base pairs are included. Functional relationships were inferred by mining of literature. For a more comprehensive list of gene pairs see Table S4.