Liliana Milani, Fabrizio Ghiselli, Davide Guerra, Sophie Breton, Marco Passamonti.
Abstract
Despite numerous comparative mitochondrial genomics studies revealing that animal mitochondrial genomes are highly conserved in terms of gene content, supplementary genes are sometimes found, often arising from gene duplication. Mitochondrial ORFans (ORFs having no detectable homology and unknown function) were found in bivalve molluscs with Doubly Uniparental Inheritance (DUI) of mitochondria. In DUI animals, two mitochondrial lineages are present: one transmitted through females (F-type) and the other through males (M-type), each showing a specific and conserved ORF. The analysis of 34 mitochondrial major Unassigned Regions of Musculista senhousia F- and M-mtDNA allowed us to verify the presence of novel mitochondrial ORFs in this species and to compare them with ORFs from other species with ascertained DUI, with other bivalves, and with animals showing new mitochondrial elements. Overall, 17 ORFans from nine species were analyzed for structure and function. Many clues suggest that the analyzed ORFans arose from endogenization of viral genes. The co-option of such novel genes by viral hosts may have determined some evolutionary aspects of host life cycle, possibly involving mitochondria. The structural similarity of DUI ORFans within evolutionary lineages may also indicate that they originated from independent events. If these novel ORFs are in some way linked to DUI establishment, a multiple origin of DUI has to be considered. These putative proteins may have a role in the maintenance of sperm mitochondria during embryo development, possibly masking them from the degradation processes that normally affect sperm mitochondria in species with strictly maternal inheritance.
Keywords: Doubly Uniparental Inheritance of mitochondria; endogenous virus; mitochondrial ORFans; mitochondrial inheritance
Year: 2013 PMID: 23824218 PMCID: PMC3730352 DOI: 10.1093/gbe/evt101
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Sequences Used in the Analyses
| Species | mt Genome | Accession Number | ORF |
|---|---|---|---|
| Mollusca, Bivalvia | | | |
| | F | GU001953 | Mse-FORF, Mse-ORF-B |
| | | KC243365–75 | Mse-FORF |
| | | KC243354–64 | Mse-ORF-B |
| | M | GU001952 | Mse-ORF-B |
| | | KC243376–87 | Mse-ORF-B |
| | F | AY515227 | Mca-FORF |
| | M | AF188284 | Mca-MORF1, Mca-MORF2 |
| | F | AY350784 | Med-FORF |
| | M | AY823623 | Med-MORF |
| | F | AY497292 | Mga-FORF |
| | M | HM027630 | Mga-MORF |
| | F | GU936625 | Mtr-FORF |
| | M | AF188282 | Mtr-MORF |
| | F | AB065375 | Rph-FORF |
| | | KC243324–31 | Rph-FORF |
| | M | AB065374 | Rph-MORF |
| | | KC243347–53 | Rph-MORF |
| | F | FJ809753 | Vel-FORF |
| | M | FJ809752 | Vel-MORF |
| | | GU269271 | Peu-ORF |
| Cnidaria | | | |
| | | JN700949 | Pno-ORF314 |
| | | YP_005353032.1 | Amo-PolB |
| | | AFU34533.1 | Ico-mtMutS |
Note.—Mitochondrial genome type is specified only for ascertained DUI species. ORF column is the name given to the amino acid sequence.
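Several accession entries in the table are written as compact ranges (for example, KC243365–75). The following is a minimal sketch of how such a range could be expanded into individual accession numbers before querying a sequence database. The function name `expand_accession_range` is hypothetical, and the sketch assumes that the digits after the dash replace the trailing digits of the first accession.

```python
from __future__ import annotations

import re


def expand_accession_range(text: str) -> list[str]:
    """Expand a compact range such as 'KC243365–75' into full accessions.

    Assumption: the part after the dash abbreviates the END of the range
    to its trailing digits (so '–75' here means 'KC243375').
    """
    text = text.strip()
    # Accept either an en dash or a plain hyphen as the range separator.
    match = re.fullmatch(r"([A-Z]+)(\d+)[–-](\d+)", text)
    if match is None:
        return [text]  # not a range: return the single accession unchanged
    prefix, start, end_suffix = match.groups()
    if len(end_suffix) > len(start):
        raise ValueError(f"range end is longer than range start in {text!r}")
    # Splice the abbreviated end onto the leading digits of the start.
    end = start[: len(start) - len(end_suffix)] + end_suffix
    first, last = int(start), int(end)
    if last < first:
        raise ValueError(f"range end precedes range start in {text!r}")
    width = len(start)  # keep any zero padding of the numeric part
    return [f"{prefix}{number:0{width}d}" for number in range(first, last + 1)]
```

Under these assumptions, `expand_accession_range("KC243365–75")` yields the 11 accessions from KC243365 to KC243375, and a single accession such as `"GU001953"` is returned unchanged.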
Figure. Largest Unassigned Regions (LURs). Schematic structure of female (F) and male (M) LURs of Musculista senhousia. Triangles indicate tRNAs.
p-Distance (p-D) and Standard Error Values of Novel Mitochondrial ORFs in DUI Bivalves
| Species | ORF | Nucleotide p-D | Nucleotide SE | Translation p-D | Translation SE | N |
|---|---|---|---|---|---|---|
| | fORF | 0.019 | 0.004 | 0.035 | 0.010 | 11 |
| | Male ORF-B | 0.004 | 0.002 | 0.008 | 0.004 | 12 |
| | Female ORF-B | 0.024 | 0.005 | 0.056 | 0.012 | 8 |
| | Overall ORF-B | 0.030 | 0.006 | 0.063 | 0.014 | 20 |
| | fORF | 0.005 | 0.003 | 0.014 | 0.008 | 4 |
| | mORF1 | 0.015 | 0.009 | 0.031 | 0.021 | 4 |
| | mORF2 | 0.011 | 0.007 | 0.033 | 0.022 | 4 |
| | fORF | 0.013 | 0.002 | 0.026 | 0.006 | 134 |
| | mORF | 0.017 | 0.004 | 0.039 | 0.012 | 25 |
| | fORF | 0.024 | 0.004 | 0.048 | 0.009 | 16 |
| | mORF | 0.029 | 0.008 | 0.062 | 0.021 | 47 |
| | mORF (edulis-like) | 0.023 | 0.007 | 0.042 | 0.017 | 14 |
| | fORF | 0.007 | 0.002 | 0.014 | 0.005 | 8 |
| | mORF | 0.025 | 0.007 | 0.046 | 0.016 | 9 |
| | fORF | 0.009 | 0.003 | 0.011 | 0.006 | 8 |
| | mORF | 0.004 | 0.002 | 0.000 | 0.000 | 7 |
| | fORF | 0.000 | 0.000 | 0.000 | 0.000 | 3 |
Note.—The number of ORF sequences used for each species depends on the number of available and suitable sequences in GenBank. p-distances of Myt. edulis, Myt. galloprovincialis, and Myt. trossulus mORFs were calculated only on the last part of the ORF immediately following the poly-A sequence (see text for details). N = number of sequences used.
a: Only complete female ORF-B sequences were considered.
b: Male ORF-B and complete female ORF-B sequences were considered.
c: mORF sequences matching Myt. edulis mORF.
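The p-distance reported in the table is the proportion of aligned sites at which two sequences differ. The sketch below illustrates that definition and its average over all sequence pairs. It is not the authors' pipeline: the function names are hypothetical, skipping gapped sites is an assumption, and the standard errors in the table are not reproduced because this record does not state how they were estimated.

```python
from __future__ import annotations

from itertools import combinations


def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: sites where either sequence has a gap are skipped.
    sites = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if gap not in (a, b)
    ]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in sites) / len(sites)


def mean_p_distance(alignment: list[str]) -> float:
    """Average p-distance over all pairs of sequences in an alignment."""
    distances = [p_distance(a, b) for a, b in combinations(alignment, 2)]
    if not distances:
        raise ValueError("at least two sequences are required")
    return sum(distances) / len(distances)
```

For two toy sequences such as `"ACGT"` and `"ACGA"`, one of four sites differs, so `p_distance` returns 0.25. The same functions apply to amino acid alignments, which corresponds to the "Translation" columns.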
Figure. (A) PSI-Coffee alignment of FORFs of family Mytilidae (accession nos.: GU001953, AY515227, AY350784, AY497292, GU936625); (B) PSI-Coffee alignment of MORFs of Mytilus species (accession nos.: AY823623, HM027630, AF188282).
Figure. (A) PSI-Coffee alignment of Mytilus edulis FORF and MORF (accession nos. of sequences containing the ORF are reported in the figure); (B) PSI-Coffee alignment of Ruditapes philippinarum FORF (accession nos. of entire FLURs: KC243324–31) and MORF (accession nos. of entire MUR21 sequences: KC243347–53).
Figure. Functional domains in FORFs and MORFs (position in the amino acid sequence as identified by HHpred). Sequences with similarities are boxed in the same color and with the same type of line; red: similarities among FORFs; blue: similarities among MORFs; orange: K = poly-K region; S = poly-S region (see also PSI-Coffee alignments, figs. 2 and 3 and supplementary tables S2–S18, Supplementary Material online). Numbers indicate sequence length.
Figure. Percentage of amino acid difference of novel proteins and of all mtDNA-encoded protein genes. Amino acid divergence (% amino acid difference) was calculated with MEGA5 for each mt protein-coding gene among: (A) F mt genomes [for (i) Mytilus spp.; (ii) Mytilidae, i.e., Mytilus spp. and Musculista senhousia; (iii) Mytilidae + the venerid Ruditapes philippinarum; and (iv) Mytilidae + the venerid R. philippinarum + the unionoid Venustaconcha ellipsiformis], (B) M mt genomes [for (i) Mytilus spp.; (ii) Mytilidae, i.e., Mytilus spp. and M. senhousia; (iii) Mytilidae + the venerid R. philippinarum; and (iv) Mytilidae + the venerid R. philippinarum + the unionoid V. ellipsiformis], and (C) between F and M mt genomes [for (i) Mytilus spp.; (ii) R. philippinarum; and (iii) V. ellipsiformis]. For the Mytilus edulis species complex (i.e., Myt. edulis, Myt. galloprovincialis, and Myt. trossulus), pairwise sequence difference was first calculated for each gene, and the results were then exported to Microsoft Excel to calculate means and SDs. For both R. philippinarum and V. ellipsiformis, only one whole F mtDNA and one whole M mtDNA are present in the database, so no error can be calculated. Comparisons are omitted where a good alignment could not be obtained. Note: F mtDNA = female mitochondrial genome; M mtDNA = male mitochondrial genome. Mytilus spp. = Myt. edulis species complex. Accession nos. of mitochondrial genomes (F-type and M-type mtDNA, respectively): Myt. edulis NC_006161 and AY823623; Myt. galloprovincialis NC_006886 and AY363687; Myt. trossulus DQ198231 and DQ198225; M. senhousia GU001953 and GU001954; R. philippinarum AB065375.1 and AB065374.1; V. ellipsiformis FJ809753 and FJ809752.
Signal Peptide and Transmembrane-Helix Prediction in the Novel Putative Proteins
| Signal peptide (FORF), software | Mse | Mca | Med | Mga | Mtr | Rph | Vel |
|---|---|---|---|---|---|---|---|
| Phobius | 1–20 | — | 1–20* | 1–20* | 1–18* | 1–18 | — |
| InterProScan | 1–20 | 1–31 | — | — | 1–18 | 1–18 | 1–44 |
| PrediSi | 1–28 | — | 1–20* | 1–20* | 1–18* | 1–18 | 1–44* |
| SignalP 4.0 | 1–20 | 1–20* | 1–20* | 1–20* | 1–18* | — | 1–44* |
Note.—Signal peptide: Only signal peptides that are statistically supported (Phobius posterior label probability > 0.5; PrediSi score > 0.5; SignalP score > D-cutoff 0.5; significance test not provided by InterProScan) or found by at least two programs are shown; *Significance < 0.5; (n) = Mca-MORF2 results. Transmembrane helices: Only transmembrane helices considered significant (TMpred score > 500; Phobius posterior label probability > 0.5; significance test not provided by the other programs) or found by at least two programs are shown; **TMpred score < 500; values in bold indicate helices not overlapping with the predicted signal peptide; (n) = Mca-MORF2 results.
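The reporting rule in the note (a prediction is shown if it is statistically supported or if at least two programs find it) can be expressed as a small filter. The sketch below is illustrative only. The names `Prediction` and `is_reported` are hypothetical, the program names and scores in the usage note are invented placeholders, and the sketch assumes that any two predictions count as agreement regardless of their exact coordinates, which the note does not specify.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Prediction:
    program: str
    region: Optional[Tuple[int, int]]  # (start, end), or None if nothing was predicted
    score: Optional[float] = None      # None if the program gives no significance value


def is_reported(
    predictions: list[Prediction],
    cutoff: float = 0.5,
    min_programs: int = 2,
) -> bool:
    """Apply the note's rule: significant, or found by at least two programs."""
    hits = [p for p in predictions if p.region is not None]
    # A hit is significant only if it has a score above the cutoff.
    significant = any(p.score is not None and p.score > cutoff for p in hits)
    # Assumption: agreement is counted by number of hits, not by matching coordinates.
    return significant or len(hits) >= min_programs
```

For example, two placeholder predictions `Prediction("program_a", (1, 20), 0.4)` and `Prediction("program_b", (1, 31))` would be reported because two programs agree, whereas the first one alone would not, since its score is below the cutoff.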
Function Analysis of Novel Mitochondrial ORFs
| Mse-FORF | Mca-FORF |
|---|---|
| Atome2 (highest probability): Chemokine (13), highest score 75.17; Human tissue factor, score 70.69; Eotaxin (2), score 67.89 and 62.51; Erythrocyte binding antigen 175, score 54.15<br>I-Tasser (confirmation): Cell division protein kinase 9/Protein Tat, RhoGAP protein, Glypican-1, Small-inducible cytokine A13, Erythrocyte binding antigen 175<br>HHpred (confirmation): SARS receptor-binding domain-like, probability 54.78, aa 31–61; Small inducible cytokine A1 precursor, probability 27.01, aa 5–91<br>I-Tasser (highest probability): Exportin-5, Cullin-5, Nucleoporin NUP170, BRO1 protein, TM-score > 0.5; GTP-binding nuclear protein Ran, TM-score > 0.5<br>I-Tasser (highest probability): Telomeric repeat-binding factor (2), TM-score > 0.5; ATP-dependent RNA helicase (2), TM-score > 0.5<br>HHpred (highest probability): More than 40 hits, highest probability 75.84, aa 1–21<br>HHpred (highest probability): Glucocorticoid receptor-like (10), highest probability 64.44, aa 68–85 | Atome2 (highest probability): Unique short US2 glycoprotein, score 82.16; Killer cell immunoglobulin-like receptor 2DL1 (2), highest score 75.39; Pertussis toxin subunit 5, score 55.44; Putative ABC type-2 transporter, score 54.92<br>I-Tasser (confirmation): Receptor-type adenylate cyclase (2), TM-score > 0.5<br>HHpred (confirmation): RAB6-interacting protein 2 (2), highest probability 62.09, aa 39–99; Integral membrane protein, probability 48.70, aa 10–38; TonB Periplasmic protein TonB, probability 45.44, aa 44–75; NIPSNAP, probability 35.39, aa 12–34; Membrane protein, probability 31.83, aa 54–66; Membrane or secreted protein, probability 30.95, aa 3–51; Transport protein Sec24A (2), probability 30.88, aa 3–51; Membrane protein containing DUF1112, probability 28.02, aa 14–66<br>Atome2 (highest probability): Fibronectin, score 70.61; PfEMP1 variant 2 of strain MC, score 58.80; Human tissue factor (2), highest score 52.67<br>I-Tasser (highest probability): Antiviral helicase SKI2, Proliferating cell nuclear antigen PcnA, Infectivity protein G3P, Cyclophilin-like domain<br>HHpred (confirmation): SKI2/RNA helicase, probability 50.91, aa 101–123; Peptidyl-prolyl isomerase G/cyclophilin G, probability 38.59, aa 17–69<br>HHpred (highest probability): Keratin (9 hits), highest probability 76.37, aa 25–95<br>HHpred (highest probability): Sterol regulatory element binding protein (2), highest probability 71.01, aa 17–66; CG17964-PH, isoform H, probability 28.03, aa 54–122 |
Note.—Hits with the highest probability are reported for each of the three programs, together with any confirmation of the same biological process from the other two programs. Norm. Z-score > 1 = good alignment; TM-score > 0.5 = similar fold with query (Zhang 2008; Xu and Zhang 2010); (n) = number of the same hit (protein), when more than one. See also supplementary tables S2–S16, Supplementary Material online.
Hits to Viral Proteins Found in Novel Mitochondrial ORFs
| DUI sp. | Hits | Position |
|---|---|---|
| FORF | | |
| Mse | Protein Tat [Atome2; score 54.94] ( | n.a. |
| | Protein Tat [I-Tasser; norm. | n.a. |
| | Protein Tat [HHpred; probability 25.94] | 62–73 |
| | SARS receptor-binding domain-like [HHpred; 54.78] | 31–61 |
| | Hepatitis E virus ORF-2 ( | 61–69 |
| | Fijivirus P9-2 protein ( | 8–50 |
| Mca | Unique short US2 glycoprotein | n.a. |
| | Pre-neck appendage protein (Bacteriophage) (5 hits) [Atome2; score 57.87–51.81] | n.a. |
| | Antiviral helicase SKI2 [I-Tasser; norm. | n.a. |
| | Infectivity protein G3P ( | n.a. |
| | Cyclophilin-like domain ( | n.a. |
| | Phage small terminase subunit ( | 8–45 |
| Med | Retrovirus capsid dimerization domain-like (2) [HHpred; probability 35.34, 29.28] | 14–43 |
| Mga | Retrovirus capsid dimerization domain-like (2) [HHpred; probability 35.47, 30.09] | 14–43 |
| Mtr | Unique short US2 glycoprotein ( | n.a. |
| | Positive stranded ssRNA viruses [HHpred; probability 28.66] | 16–54 |
| Rph | Polymerase PB2 ( | n.a. |
| Vel | VP1, the protein that forms the mRNA-capping machine ( | n.a. |
| | Fibritin ( | n.a. |
| MORF | | |
| McaORF1 | Early 35 kDa protein ( | n.a. |
| | Phosphatidylinositol 3-kinase regulatory subunit alpha ( | n.a. |
| | V-bcl-2 ( | n.a. |
| McaORF2 | Circulin A ( | n.a. |
| | First immunoglobulin (Ig) domain of nectin-3 ( | 12–21 |
| | Coxsackie virus and adenovirus receptor (Glycoprotein A33; CTX-related type I transmembrane protein) [HHpred; probability 51.10] | 5–21 |
| | Coxsackie virus and adenovirus receptor (Car), domain 1 [ | 12–21 |
| | Hepatitis A virus cellular receptor 1 [ | 12–25 |
| Med | Replicase polyprotein 1ab ( | n.a. |
| | Macro domain of Non-structural protein 3 ( | n.a. |
| Mga | HIV-1 envelope protein chimera ( | n.a. |
| | Proliferation-associated protein 2G4 ( | n.a. |
| | Viral protein [I-Tasser; norm. | n.a. |
| Mtr | — | — |
| Rph | Unique short US2 glycoprotein ( | n.a. |
| | Viral protein | n.a. |
| | CRISPR-associated DEAD/DEAH-box helicase Csf4 ( | 144–165 |
| | d.172.1 gp120 core (56502) SCOP seed sequence: d1g9mg_ ( | 125–157 |
| Vel | — | — |
| MseORFB | Unique short US2 glycoprotein ( | n.a. |
| | Gag-Pol polyprotein ( | n.a. |
| | Glycosyltransferase (Mannosyltransferase) ( | n.a. |
| | VAC_I5L (dsDNA viruses, no RNA stage; Poxviridae) ( | 6–24 |
| Peu | Terminase small subunit ( | n.a. |
| | CAG38821 ( | n.a. |
| | Terminase small subunit ( | n.a. |
| | DNA polymerase processivity factor ( | n.a. |
| | CRISPR-associated DxTHG motif protein ( | 4–17 |
| Pno-ORF314 | Capsid protein P27 ( | n.a. |
| | Retrovirus capsid protein, N-terminal core domain ( | 21–50 |
| | RSV capsid protein {Rous sarcoma virus [TaxId: 11886]} [HHpred; probability 80.17] | 21–59 |
| | JSRV capsid, capsid protein P27; zinc-finger, metal-binding {Jaagsiekte sheep retrovirus} ( | 21–59 |
| | Capsid protein P27; retrovirus, N-terminal core domain {Mason-pfizer monkey virus} ( | 21–59 |
| | GAG polyprotein capsid protein P27; retrovirus, immature GAG{Rous sarcoma virus} ( | 21–50 |
| | Capsid protein P27; viral protein, retrovirus, GAG; 7.00 A {Mason-pfizer monkey virus} [HHpred; probability 44.98] | 22–59 |
| | Capsid protein; two independent domains helical bundles, virus/viral protein {Rous sarcoma virus} [HHpred; probability 43.53] | 21–47 |
| | Tat binding protein 1 (TBP-1)-interacting protein (TBPIP) ( | 3–50 |
Note.—Norm. Z-score > 1 = good alignment; TM-score > 0.5 = similar fold with query (Zhang 2008; Xu and Zhang 2010); (n) = number of the same hit (protein); position: amino acid position in the query sequence; n.a. = not applicable.