Abstract
Orthologous relationships between genes are routinely inferred from bidirectional best hits (BBH) in pairwise genome comparisons. However, to our knowledge, it has never been quantitatively demonstrated that orthologs form BBH. To test this "BBH-orthology conjecture," we take advantage of the operon organization of bacterial and archaeal genomes and assume that, when two genes in compared genomes are flanked by two BBH and show statistically significant sequence similarity to one another, these genes are bona fide orthologs. Under this assumption, we tested whether middle genes in "syntenic orthologous gene triplets" form BBH. We found that this was the case in more than 95% of the syntenic gene triplets in all genome comparisons. A detailed examination of the exceptions to this pattern, including maximum likelihood phylogenetic tree analysis, showed that some of these deviations involved artifacts of genome annotation, whereas very small fractions represented random assignment of the best hit to one of several closely related in-paralogs, paralogous displacement in situ, or, even less frequently, genuine violations of the BBH-orthology conjecture caused by acceleration of evolution in one of the orthologs. We conclude that, at least in prokaryotes, genes for which independent evidence of orthology is available typically form BBH and, conversely, BBH can serve as a strong indication of gene orthology.
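The test described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the status labels, and the input layout (a dictionary of significant similarity scores per query gene, plus an ordered gene list per genome) are assumptions made for illustration. The sketch also treats genomes as linear and ignores strand, and ties between equally scoring hits are broken arbitrarily.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical input layout: query gene -> {subject gene: similarity score},
# containing only statistically significant hits.
Scores = Dict[str, Dict[str, float]]


def best_hit(scores: Scores, gene: str) -> Optional[str]:
    """Return the highest-scoring subject for `gene`, or None if it has no hits."""
    hits = scores.get(gene)
    if not hits:
        return None
    return max(hits, key=hits.get)


def bidirectional_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> Dict[str, str]:
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    bbh = {}
    for gene_a in a_vs_b:
        gene_b = best_hit(a_vs_b, gene_a)
        if gene_b is not None and best_hit(b_vs_a, gene_b) == gene_a:
            bbh[gene_a] = gene_b
    return bbh


def classify_middle_genes(
    order_a: List[str], order_b: List[str], a_vs_b: Scores, b_vs_a: Scores
) -> Dict[str, Tuple[str, str]]:
    """For each triplet (i, x, j) of adjacent genes in genome A whose flanks
    form BBH with genes that flank a single gene x' in genome B, report how
    the middle gene x relates to its in situ counterpart x'."""
    bbh = bidirectional_best_hits(a_vs_b, b_vs_a)
    pos_b = {gene: k for k, gene in enumerate(order_b)}
    result = {}
    for i, x, j in zip(order_a, order_a[1:], order_a[2:]):
        if bbh.get(i) not in pos_b or bbh.get(j) not in pos_b:
            continue  # both flanking genes must form BBH
        pi, pj = pos_b[bbh[i]], pos_b[bbh[j]]
        if abs(pi - pj) != 2:
            continue  # flanks must enclose exactly one gene in genome B
        x_prime = order_b[(pi + pj) // 2]
        if x_prime not in a_vs_b.get(x, {}):
            status = "nonhomologous"
        elif bbh.get(x) == x_prime:
            status = "BBH"
        elif best_hit(a_vs_b, x) == x_prime:
            status = "best hit"  # best hit in one direction only
        else:
            status = "other hit"
        result[x] = (x_prime, status)
    return result
```

Given gene orders `["a1", "a2", "a3"]` and `["b1", "b2", "b3"]` in which each `aN` and `bN` are mutual best hits, `classify_middle_genes` reports `a2` as forming a BBH with `b2`. If `a2` instead scores higher against a paralog located elsewhere in genome B, the in situ counterpart is reported as an "other hit", which is the kind of case the tree analysis in the paper examines.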
Year: 2012 PMID: 23160176 PMCID: PMC3542571 DOI: 10.1093/gbe/evs100
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Figure: Schematic of the genome comparison for testing the BBH–orthology conjecture.
Figure: Dependency of the relationship between the middle genes in syntenic gene triplets on the distance between the compared genomes.
The Status of Middle Genes (x) in Syntenic Gene Triplets (ixj) Depending on the Distance between Compared Genomes
| Taxa | BBH d (a) | Best hit (b) | Other significant hit (c) | Nonhomologous (d) | BBH/G (e) | T/BBH (f) | fBBH (g) |
|---|---|---|---|---|---|---|---|
| Enterobacteria | 0.3042 | 0.9810 | 0.0010 | 0.0180 | 0.5555 | 0.4487 | 0.9978 |
| Gamma-proteobacteria | 0.8509 | 0.9451 | 0.0060 | 0.0490 | 0.4058 | 0.1717 | 0.9885 |
| | 1.1818 | 0.9082 | 0.0150 | 0.0768 | 0.3346 | 0.0936 | 0.9886 |
| Bacteria | 1.3591 | 0.8956 | 0.0169 | 0.0875 | 0.2890 | 0.0649 | 0.9816 |
| Archaea | 1.7915 | 0.5918 | 0.0542 | 0.3540 | 0.1973 | 0.0206 | 0.9075 |
| | 0.4881 | 0.8987 | 0.0084 | 0.0929 | 0.5804 | 0.2327 | 0.9849 |
| Methanomicrobia | 1.2498 | 0.9434 | 0.0000 | 0.0566 | 0.2910 | 0.0754 | 0.9815 |
| Euryarchaeota | 1.3500 | 0.9441 | 0.0103 | 0.0456 | 0.2815 | 0.0646 | 0.9863 |
| Archaea | 1.5352 | 0.8133 | 0.0135 | 0.1732 | 0.2518 | 0.0452 | 0.9202 |
| Bacteria | 1.7606 | 0.5900 | 0.0449 | 0.3631 | 0.1915 | 0.0217 | 0.8815 |
(a) Mean distance between BBH for the master genome and other genomes in the respective group.
(b) Fraction of best hits among the counterparts of the middle genes in syntenic triplets.
(c) Fraction of other significant hits among the counterparts.
(d) Fraction of nonhomologous genes among the counterparts.
(e) Fraction of genes in BBH.
(f) Fraction of BBH in syntenic triplets.
(g) Fraction of BBH among best hits.
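The three counterpart fractions defined in footnotes b, c, and d partition the middle-gene counterparts, so they should sum to about 1 in each row (for the Enterobacteria row, 0.9810 + 0.0010 + 0.0180 = 1.0000). A minimal sketch of how such fractions could be computed from per-triplet labels follows; the label strings and the function name are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Dict, Iterable

# Hypothetical status labels for the counterpart of each middle gene.
STATUSES = ("best hit", "other hit", "nonhomologous")


def status_fractions(statuses: Iterable[str]) -> Dict[str, float]:
    """Fraction of syntenic triplets whose middle-gene counterpart has each status."""
    counts = Counter(statuses)
    total = sum(counts[s] for s in STATUSES)
    if total == 0:
        raise ValueError("no classified triplets")
    return {s: counts[s] / total for s in STATUSES}
```

Because every classified triplet falls into exactly one of the three categories, the returned values always sum to 1 up to floating-point rounding.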
Test of the BBH–Orthology Conjecture for Selected Pairs of Genomes
| Gene | Tree Analysis and Status of the BBH–Orthology Conjecture |
|---|---|
| | Large family of paralogs (MerR-like HTH-containing transcription regulators) |
| | Fragment or different architecture |
| | Fragment or different architecture |
| | Violated |
| | Large family of paralogs (two-component regulatory system) |
| | Large family of paralogs (periplasmic binding proteins) |
| | Fragment or different architecture |
| | Fragment or different architecture |
| | Fragment or different architecture |
| | Violated |
| | Supported |
| | Supported |
| | Supported |
| | Supported |
| | Supported |
| | Large family of paralogs (permeases) |
| | Fragment or different architecture |
| | Supported |
| | Large family of paralogs (uncharacterized proteins) |
| | Large family of paralogs (uncharacterized proteins) |
| | Fragment or different architecture |
| | Fragment or different architecture |
| | Supported |
| | Fragment or different architecture |
| | Violated |
| | Supported |
Figure: Examples of deviations from the predictions of the BBH–orthology conjecture. (A) Haloarcula marismortui–Pyrococcus furiosus, phosphopyruvate hydratase (COG0148). BBH–orthology conjecture violated due to an acceleration of evolution of one of the in situ homologs. (B) Escherichia coli K12–Ralstonia eutropha, NADH:ubiquinone oxidoreductase NuoE (COG1905). BBH–orthology conjecture violated, probably due to an acceleration of evolution of one of the in situ homologs. (C) Escherichia coli K12–Cronobacter turicensis, outer membrane porin OmpC (COG3203). BBH–orthology conjecture violated; complex evolutionary relationships between multiple paralogs. (D) Escherichia coli K12–Bacillus subtilis, ribosomal protein S14 (COG0199). Compatible with the BBH–orthology conjecture. The in situ homolog is markedly more distant from the query gene in sequence and domain architecture as well as in its position in the tree. (E) Escherichia coli K12–R. eutropha, thiol-disulfide isomerase and thioredoxins (COG0526). Compatible with the BBH–orthology conjecture; the best hit and the in situ homolog are closely related in-paralogs in R. eutropha. (F) Escherichia coli K12–R. eutropha, nitrate reductase alpha subunit (COG5013). Compatible with the BBH–orthology conjecture; the best hit and the in situ homolog are closely related in-paralogs in R. eutropha.