Eric D Salomaki, Christopher E Lane.
Abstract
The enslavement of an alpha-proteobacterial endosymbiont by the last common eukaryotic ancestor resulted in large-scale transfer of endosymbiont genes to the host nucleus as the endosymbiont transitioned into the mitochondrion. Mitochondrial genomes have experienced widespread gene loss and genome reduction within eukaryotes, and DNA sequencing has revealed that most of these gene losses occurred early in eukaryotic lineage diversification. On a broad scale, more recent modifications to organelle genomes appear to be conserved and phylogenetically informative. The first red algal mitochondrial genome was sequenced more than 20 years ago, and an additional 29 Florideophyceae mitochondrial genomes have been added over the past decade. A total of 32 genes have been reported as missing or as non-functional pseudogenes in these Florideophyceae mitochondria. These losses have been attributed to endosymbiotic gene transfer or the evolution of a parasitic life strategy. Here we sequenced the mitochondrial genomes of the red algal parasite Choreocolax polysiphoniae and its host Vertebrata lanosa and found them to be complete and conserved in structure with other Florideophyceae mitochondria. This result led us to resequence the previously published parasite Gracilariophila oryzoides and its host Gracilariopsis andersonii, as well as reevaluate reported gene losses from published Florideophyceae mitochondria. Multiple independent losses of rpl20 and a single loss of rps11 can be verified. However, by reannotating published data and resequencing specimens when possible, we were able to identify the majority of genes that have been reported as lost or as pseudogenes in Florideophyceae mitochondria.
Keywords: Rhodophyta; mitochondria; gene loss; parasite; atp8; rpl20
Year: 2017 PMID: 28175279 PMCID: PMC5381584 DOI: 10.1093/gbe/evw267
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Table of all currently available Florideophyceae mitochondrion genomes that were examined in this study with GenBank Accession, genome length, and A/T%
| Species | GenBank Accession | Length | AT Content (%) | Reported Missing Genes | Notes |
|---|---|---|---|---|---|
| | KF649303 | 32,878 | 66.6 | | |
| | KJ398158 | 26,097 | 73.3 | | |
| | KJ398159 | 26,200 | 71.5 | | |
| | KU145004 and KU145005 | 24,508 | 71.2 | | noted as partial in article ( |
| | | 24,494 | 71.2 | | |
| | NC_001677 | 25,836 | 72.1 | | Complete |
| | KX687877 | 25,357 | 79.4 | | |
| | KU053956 | 25,391 | 74.3 | | |
| | KU641510 | 26,504 | 69.9 | | Although the manuscript is published, not yet available on GenBank. Reported use of GTG ( |
| | KX247283 | 26,052 | 77.4 | | |
| | KF290995 | 24,922 | 70.5 | | |
| | KC875854 | 24,901 | 69.5 | | |
| | KP728466 | 26,898 | 72.4 | | Complete |
| | KF852534 | 25,272 | 71.6 | | Complete |
| | KJ526627 | 25,973 | 71.9 | | Complete |
| | NC_014771 and KX687879 | 25,161 | 71.9 | | |
| | NC_014772 and KX687878 | 27,036 | 72 | | |
| | NC_023251 | 26,534 | 72.4 | | |
| | JQ071938 | 25,883 | 72.5 | | Complete |
| | NC_023094 | 27,943 | 69.8 | | Complete, but uses multiple start codons, seemingly where unnecessary (see |
| | KM999231 | 28,906 | 68.6 | | Complete. Contains hypothetical protein CDS in |
| | KF649304 | 33,066 | 67.8 | | |
| | KF833365 | 25,242 | 69.9 | | |
| | KX525587 | 25,906 | 65.0 | | Complete |
| | KF649305 | 29,735 | 67.8 | | Complete |
| | HQ586061 | 25,894 | 76.1 | | |
| | KJ398160 | 26,431 | 76.4 | | |
| | KC875852 | 26,351 | 70.5 | | |
| | KJ398161 | 26,351 | 74.3 | | Complete |
| | KJ398162 | 25,906 | 73.3 | | |
| | KJ398163 | 26,438 | 74.1 | | |
| | KJ398164 | 26,767 | 71.5 | | |
| | KF186230 | 26,202 | 71.6 | | |
| | KX687880 | 25,119 | 71.7 | | |
Note.—Genes previously reported as missing are listed along with notes regarding their status as a result of this study.
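The A/T% column can be reproduced from a genome sequence with a few lines of Python. The following is a minimal sketch, not the authors' method: the function name `at_content` and the toy sequence are illustrative assumptions, and ambiguous bases (such as N) are assumed to be excluded from the denominator.

```python
def at_content(seq: str) -> float:
    """Return the A/T percentage of a DNA sequence.

    Assumption: only unambiguous bases (A, C, G, T) are counted,
    so characters such as N or gaps do not affect the result.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    total = at + seq.count("C") + seq.count("G")
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * at / total


# Hypothetical toy sequence, not taken from any genome in the table.
toy = "ATGTTTAAACCG"
print(f"A/T content: {at_content(toy):.1f}%")
```

Applied to a full record downloaded from GenBank, the same function would be called on the concatenated sequence string.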
Use of alternative start codons, by gene, based on published literature and on proximity to the closest in-frame ATG initiation codon. Alternative initiation codons that are supported by the lack of a nearby ATG initiation codon and by a conserved gene start location in alignments are indicated in bold.
| Gene | Species | Location with Alternative Initiation Codon | Location with ATG Initiation Codon | Difference in Gene Length in nucleotides (Amino Acids) |
|---|---|---|---|---|
| 7,298 (ATT) | 7,304 | 6 (2) | ||
| 23,475 (ATT) | 23,466 | 9 (3) | ||
| 21,462 (ATT) | 21,465 | 3 (1) | ||
| — | — | — | ||
| — | — | — | ||
| 3,889 (ATT) | 3,904 | 15 (5) | ||
| 7,845 (TTG) | 7,818 | 27 (9) | ||
| 6,556 (ATT) | 6,535 | 21 (7) | ||
| — | — | — | — | |
| — | — | — | — | |
| — | — | — | — | |
| — | — | — | — | |
| 22,165 | 22,150 | 15 (5) | ||
| — | — | — | — | |
| — | — | — | — | |
| — | — | — | — | |
| — | — | |||
| 2,693 (ATC) | 2,732 | 39 (13) | ||
| 2,574 (ATA) | — | — | ||
| 10,426 (ATT) | 10,423 | 3 (1) | ||
| 14,946 (ATT) | 14,938 | 9 (3) | ||
| — | — | |||
| — | — | — | ||
| 13,852 (TTG) | 13,849 | 3 (1) | ||
| 17,061 (ATT) | 17,022 | 39 (13) | ||
| 11,319 (CTT) | 11,310 | 9 (3) | ||
| 20,718 (TTA) | 20,733 | 15 (5) | ||
| 20,662 (ATA) | — | — | ||
| 29,469 (ATT) | 29,487 | 18 (6) | ||
| — | — | |||
| 23,492 (TTA) | 23,546 | 54 (18) | ||
| 22,915 (ATC) | 22,942 | 27 (9) |
a. Indicates examples where other non-ATG initiation codons from translation table 4 (Protozoa Mitochondrion) are also possible locations for the gene to start, although no ATG codon is found within 30 nucleotides (10 amino acid residues) upstream or downstream from the start of the currently annotated gene.
b. The H. rubra sdhC gene annotation is longer than other copies of sdhC and the beginning of the gene overlaps with a tRNA. Starting annotation at ATG makes the gene much more similar in length to other Florideophyceae copies of sdhC.
c. Gene not previously annotated in GenBank.
d. The Ch. crispus TatC (ymf16) gene is currently annotated with a GTT initiation codon, which is not found for any other Florideophyceae mitochondrion gene, nor is it a start codon in translation table 4 (Protozoa Mitochondrion). Four other ORFs in the same reading frame that use either ATA or TTA as a start codon for the TatC gene are found from 12 to 39 nucleotides downstream of the GTT codon.
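The comparison summarized in this table (nearest in-frame ATG versus the annotated alternative start, and the resulting difference in gene length) can be sketched in Python. This is an illustrative sketch only: the function names, the 0-based forward-strand coordinates, and the toy sequence are assumptions, and the 30-nucleotide window is taken from the footnote above.

```python
from typing import Optional, Tuple


def nearest_inframe_atg(seq: str, start: int, window: int = 30) -> Optional[int]:
    """Return the 0-based index of the closest in-frame ATG within
    `window` nucleotides up- or downstream of `start`, or None.

    Assumptions: forward strand only; the annotated start itself is skipped.
    On a tie, the upstream candidate is kept.
    """
    seq = seq.upper()
    best = None
    steps = window // 3
    for k in range(-steps, steps + 1):
        pos = start + 3 * k
        if k == 0 or pos < 0 or pos + 3 > len(seq):
            continue
        if seq[pos:pos + 3] == "ATG":
            if best is None or abs(pos - start) < abs(best - start):
                best = pos
    return best


def length_difference(alt_start: int, atg_start: int) -> Tuple[int, int]:
    """Return the difference in gene length as (nucleotides, amino acids)."""
    nt = abs(alt_start - atg_start)
    if nt % 3 != 0:
        raise ValueError("start positions are not in the same reading frame")
    return nt, nt // 3


# Hypothetical toy sequence: ATT start at index 0, in-frame ATG at index 9.
toy = "ATT" + "GCA" * 2 + "ATG" + "GCA" * 5
atg = nearest_inframe_atg(toy, 0)
print(atg, length_difference(0, atg))
```

With the first row of the table as input, `length_difference(7298, 7304)` gives 6 nucleotides and 2 amino acids, matching the reported value.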
A/T%, non-synonymous to synonymous mutation (dN/dS) ratio and individual dN and dS values for genes encoded on the Florideophyceae mitochondrion genomes
| Gene | AT Content (%) | dN/dS | dN | dS | Species with pseudogene only |
|---|---|---|---|---|---|
| | 79.8 | 0.51645 | 0.22882 | 0.44307 | |
| | 72.5 | 0.12864 | 0.07486 | 0.58195 | |
| | 78.8 | 0.51248 | 0.20110 | 0.39241 | |
| | 66.1 | 0.01550 | 0.00212 | 0.13673 | |
| | 70.3 | 0.11691 | 0.05443 | 0.46555 | |
| | 67.2 | 0.06936 | 0.01711 | 0.24675 | |
| | 69.5 | 0.14959 | 0.04549 | 0.30414 | |
| | 68.3 | 0.12819 | 0.07186 | 0.56058 | |
| | 69.0 | 0.09454 | 0.06667 | 0.70520 | |
| | 74.6 | 0.31481 | 0.15877 | 0.50434 | |
| | 74.5 | 0.15718 | 0.07060 | 0.44919 | |
| | 72.7 | 0.18548 | 0.07790 | 0.41999 | |
| | 75.3 | 0.14619 | 0.05654 | 0.38677 | |
| | 71.9 | 0.21716 | 0.08547 | 0.39356 | |
| | 75.1 | 0.26439 | 0.17651 | 0.66762 | |
| | 75.9 | 0.36193 | 0.13664 | 0.37752 | |
| | 79.1 | 0.62160 | 0.46012 | 0.74023 | |
| | 76.9 | 0.51571 | 0.28532 | 0.55326 | |
| | 77.3 | 0.45177 | 0.19375 | 0.42886 | |
| | 67.5 | 0.20794 | 0.11986 | 0.57642 | |
| | 71.7 | 0.20603 | 0.10707 | 0.51969 | |
| | 78.6 | 0.47084 | 0.17487 | 0.37140 | |
| | 78.2 | 0.46556 | 0.26263 | 0.56412 | |
| | 80.0 | 0.50046 | 0.34550 | 0.69037 | |
Note.—Species with a pseudogene, rather than a functional copy of the gene, are listed in the far right column and were left out of calculations of A/T% and dN/dS ratio.
a. Introns removed and CDSs only were used for A/T% and dN/dS analysis.
b. Ce. japonicum nad3 left out of dN/dS analysis.
c. D. binghamiae rps3 left out of dN/dS analysis.
d. No evidence for remnant pseudogene, appears to be a complete loss.
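The ratio column of this table appears to be the non-synonymous rate divided by the synonymous rate for each gene, as the caption states. A minimal sketch of that arithmetic follows. The function name `dn_ds` is an assumption, and the sketch does not estimate dN or dS from an alignment; it only takes the two tabulated values as input.

```python
def dn_ds(dn: float, ds: float) -> float:
    """Return the dN/dS ratio from precomputed dN and dS values."""
    if ds <= 0:
        raise ValueError("dS must be positive to form a ratio")
    return dn / ds


# dN and dS values taken from two rows of the table above.
for dn, ds in [(0.07486, 0.58195), (0.00212, 0.13673)]:
    print(f"dN={dn:.5f} dS={ds:.5f} dN/dS={dn_ds(dn, ds):.5f}")
```

Ratios well below 1, as in these rows, are conventionally read as evidence of purifying selection on the gene.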
Current status of Florideophyceae mitochondrial genes previously reported as missing in Hancock et al. (2010) and Yang et al. (2015) or otherwise unannotated
| Gene | Current Status | ||
|---|---|---|---|
| Gene Present | Location in Published Sequence or GenBank Accession Number | Pseudogene | |
| KX687876 | |||
| 20,431 > 20,024 | |||
| 24,587 > 24,213 | |||
| 20,389 > 19,985 | |||
| 20,528 > 20,127 | |||
| 25,592 > 25,894 | |||
| 26,129 > 26,474 (43) | |||
| 30,851 > 31,093 | |||
| 24,127 > 24,351 | |||
| 23,912 > 24,148 | |||
| 24,440 > 24,709 | |||
| 24,248 > 24,487 | |||
| 12,331 > 12,044 | |||
| 10,977 > 10,678 | |||
| 21,403 > 21,789 | |||
| 11,593 > 11,234 | |||
| 11,358 > 10,966 | |||
| 15,759 > 15,514 | |||
| 16,636 > 16,403 | |||
| 15,284 > 15,036 | |||
| 15,277 > 15,029 | |||
| 6,935 > 7,489 | |||
| 7,560 > 8,102 | |||
| 7,238 > 7,780 | |||
| 13,225 > 13,785 | |||
| 7,261 > 7,803 | |||
| 7,234 > 7,776 | |||
a. Indicates that the presence of a functional gene depends on a non-ATG start codon; alternatively, these could be pseudogenes. RNA sequence data would be required to confirm gene function.
b. The atp4 gene was annotated as a hypothetical protein CDS in GenBank but considered as atp4 in Yang et al. (2015), figure 2.
c. The atp4 gene was annotated as a hypothetical protein CDS in GenBank but considered as ymf39 in Yang et al. (2015), figure 2.
d. Location in newly sequenced Gr. oryzoides mitochondrion (GenBank KX687879).
e. Location in newly sequenced G. andersonii mitochondrion (GenBank KX687878).
f. No evidence for remnant pseudogene.
g. The Ce. japonicum TatC gene seems likely to be the result of a homopolymer sequence error, though the possibility of a pseudogene remains. Due to high levels of variation in the length and sequence of Florideophyceae TatC genes, we continue to recognize the Ce. japonicum TatC gene as a pseudogene until firm evidence contradicts this.
Fig. 2.—Alignment of the original Ce. japonicum nad3 gene with the modified Ce. japonicum nad3 ("T" deleted from base 36; red box) and copies of the nad3 gene from Ch. crispus, Gracilaria vermiculophylla, G. andersonii, Sp. durum, C. polysiphoniae, and V. lanosa. Manual deletion of one "T" from the string of 26 "T"s and 3 "C"s between 32 and 60 bp from the start codon restores conservation of the length and sequence of the Ce. japonicum nad3 gene. Genes are shown with the amino acid translation below.
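The effect described in this caption, where removing one base from a homopolymer run restores the reading frame, can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the toy sequence is not the Ce. japonicum nad3 gene, the helper names are hypothetical, and translation table 4 is used because the tables above refer to it (it differs from the standard code in reading TGA as tryptophan).

```python
BASES = "TCAG"
# NCBI translation table 4: same as the standard code except TGA = W.
TABLE4 = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON = {
    a + b + c: TABLE4[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq: str) -> str:
    """Translate complete codons from position 0; a trailing partial codon is ignored."""
    seq = seq.upper()
    return "".join(CODON[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def delete_base(seq: str, pos: int, expected: str = "T") -> str:
    """Delete the base at 0-based `pos`, checking that it is the expected base."""
    if seq[pos].upper() != expected:
        raise ValueError(f"base at {pos} is {seq[pos]!r}, not {expected!r}")
    return seq[:pos] + seq[pos + 1:]


# Hypothetical toy gene with a run of six T's, and a copy with one extra T
# standing in for a homopolymer sequencing error.
intact = "ATGGCT" + "T" * 6 + "CCAGGATAA"
shifted = "ATGGCT" + "T" * 7 + "CCAGGATAA"

print(translate(intact))                    # reference translation
print(translate(shifted))                   # frameshifted downstream of the run
print(translate(delete_base(shifted, 6)))   # frame restored after deleting one T
```

Because every T in a homopolymer run is equivalent, deleting any one of them yields the same corrected sequence, which is why the exact position within the run does not matter for the restored translation.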
Fig. 1.—Translated alignment of sdhD genes from florideophycean mitochondria showing that A. taxiformis, Ce. japonicum, and Ce. sungminbooi (top three sequences) share critical conserved residues with all other Florideophyceae sdhD genes.