Christoffer Rozenfeld1, Jose Blanca2, Victor Gallego1, Víctor García-Carpintero2, Juan Germán Herranz-Jusdado1, Luz Pérez1, Juan F Asturiano1, Joaquín Cañizares2, David S Peñaranda1.
Abstract
Paralogue pairs are more frequently observed in eels (Anguilla sp.) than in other teleosts. The paralogues often show low phylogenetic distances; however, they have been assigned to the third round of whole genome duplication (WGD), shared by all teleosts (3R), due to their conserved synteny. The apparent contradiction of low phylogenetic distance and 3R conserved synteny led us to study the duplicated gene complement of the freshwater eels. With this aim, we assembled de novo transcriptomes of two highly relevant freshwater eel species: the European eel (Anguilla anguilla) and the Japanese eel (Anguilla japonica). The duplicated gene complement was analysed in these transcriptomes, and in the genomes and transcriptomes of other Actinopterygii species. The study included an assessment of neutral genetic divergence (4dTv), synteny, and the phylogenetic origins and relationships of the duplicated gene complements. The analyses indicated a high accumulation of duplications (1217 paralogue pairs) among freshwater eel genes, which may have originated in a WGD event after the Elopomorpha lineage diverged from the remaining teleosts, and thus not at the 3R. However, very similar results were observed in the basal Osteoglossomorpha and Clupeocephala branches, indicating that the specific genomic regions of these paralogues may still have been under tetrasomic inheritance at the split of the teleost lineages. Therefore, two potential hypotheses may explain the results: i) the freshwater eel lineage experienced a WGD in addition to 3R, and ii) some duplicated genomic regions experienced lineage-specific rediploidization after 3R in the ancestor of freshwater eels. The supporting and opposing evidence for both hypotheses is discussed.
Year: 2019 PMID: 31188893 PMCID: PMC6561569 DOI: 10.1371/journal.pone.0218085
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Methodology.
Pipeline of the bioinformatics methodology. Folders describe the software used, light grey boxes describe the action taken, light brown bubbles describe the rationale for selected actions, and light blue boxes describe the specific goal of each section. Finally, green boxes represent external data input.
Fig 2. Synteny illustration.
Visualization of the assigned synteny types: “some synteny”, paralogues of genes found close to one duplicate are also found close to the other duplicate; “no synteny”, fewer than two paralogues of other genes are found close to both paralogue duplicates; “close”, duplicated genes are close in the genome; “no information”, the duplicated genes are located in small scaffolds with too few gene families close by; “conflicting syntenies”, different synteny classifications found in the genomes of the different species affected by the duplication. Sand-coloured boxes represent genes which have not been assigned to a gene family, pink boxes represent the gene from which synteny is being assessed; all other coloured boxes represent other genes which have been assigned to a gene family.
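The classification rules in this caption can be sketched as a small decision function. This is only an illustration under stated assumptions: the caption gives the "fewer than two" rule for “no synteny”, but the distance that counts as “close” and the minimum number of neighbouring gene families are not stated, so `CLOSE_DISTANCE` and `MIN_NEIGHBOUR_FAMILIES` below are hypothetical values, as are the function names and the order in which the rules are checked.

```python
# Hypothetical thresholds: the caption does not state the values actually used.
CLOSE_DISTANCE = 5           # assumed max. number of genes separating "close" duplicates
MIN_NEIGHBOUR_FAMILIES = 2   # assumed minimum of neighbouring families needed to judge synteny
MIN_SHARED_FAMILIES = 2      # from the caption: fewer than two shared -> "no synteny"


def classify_pair(families_near_a, families_near_b, gene_distance=None):
    """Classify one paralogue pair within a single genome.

    families_near_a, families_near_b: sets of gene-family IDs found close to
    each duplicate. gene_distance: number of genes between the duplicates if
    both lie on the same scaffold, otherwise None.
    """
    if gene_distance is not None and gene_distance <= CLOSE_DISTANCE:
        return "close"
    if min(len(families_near_a), len(families_near_b)) < MIN_NEIGHBOUR_FAMILIES:
        return "no information"
    if len(families_near_a & families_near_b) >= MIN_SHARED_FAMILIES:
        return "some synteny"
    return "no synteny"


def combine_species(labels):
    """Merge the per-species labels of one duplication event.

    Assumption: any disagreement between species yields "conflicting syntenies".
    """
    distinct = set(labels)
    if not distinct:
        return "no information"
    return distinct.pop() if len(distinct) == 1 else "conflicting syntenies"
```

For example, two duplicates whose neighbourhoods share the families `f2` and `f3` would be labelled “some synteny” in that genome, and the event would become “conflicting syntenies” if another species gave a different label.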
Included transcriptomes.
| Species | No. of reads | Q30 | Transcripts | Mean GC content (%) |
|---|---|---|---|---|
| European eel | 181,322,106 | 0.994 | 77,247 | 51.17 |
| Northern pike | 553,710,218 | 0.989 | 68,489 | 48.05 |
| Elephantnose fish | 498,451,616 | 0.993 | 74,642 | 49.75 |
| Silver arowana | 490,649,254 | 0.992 | 78,610 | 49.18 |
| Japanese eel | 458,032,126 | 0.986 | 64,857 | 48.13 |
Metrics of included raw read datasets from European eel (Anguilla anguilla), Japanese eel (Anguilla japonica), northern pike (Esox lucius), elephantnose fish (Gnathonemus petersii), and silver arowana (Osteoglossum bicirrhosum).
Gene quantities for each species included.
| Species | Transcripts | Genes | Representative transcripts with predicted protein | Gene family transcripts | % of genes assigned to a gene family |
|---|---|---|---|---|---|
| European eel | 77,247 | 54,879 | 27,696 | 25,862 | 93.38 |
| Japanese eel | 64,857 | 46,585 | 23,780 | 23,098 | 97.13 |
| Zebrafish | 58,274 | 32,189 | 25,790 | 22,703 | 88.03 |
| Northern pike | 68,489 | 49,154 | 23,843 | 21,696 | 90.99 |
| Elephantnose fish | 74,642 | 50,455 | 24,857 | 22,036 | 88.65 |
| Spotted gar | 22,483 | 18,341 | 18,341 | 17,872 | 97.44 |
| Silver arowana | 78,610 | 55,667 | 24,938 | 21,604 | 86.63 |
| Asian arowana | 43,354 | 23,799 | 22,740 | 20,637 | 90.75 |
| Atlantic salmon | 109,584 | 55,104 | 48,593 | 42,625 | 87.72 |
| Fugu | 47,841 | 18,523 | 18,523 | 17,698 | 95.55 |
| Platyfish | 20,454 | 20,379 | 20,379 | 19,807 | 97.19 |
Quantities of included genes per included species: European eel (Anguilla anguilla), Japanese eel (Anguilla japonica), zebrafish (Danio rerio), northern pike (Esox lucius), elephantnose fish (Gnathonemus petersii), spotted gar (Lepisosteus oculatus), Asian arowana (Scleropages formosus), silver arowana (Osteoglossum bicirrhosum), Atlantic salmon (Salmo salar), fugu (Takifugu rubripes), and platyfish (Xiphophorus maculatus). “Transcripts” represents unigenes, “Genes” represents the number of transcript clusters, “Representative transcripts with predicted protein” represents the number of genes with a successful protein annotation, “Gene family transcripts” represents the representative transcripts with predicted protein with a successful gene family annotation, and “% of genes assigned to a gene family” represents the percentage of representative transcripts with predicted protein with successful gene family annotation.
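As a worked check of the last column's definition, the percentage can be recomputed from the two preceding columns. The function name is hypothetical; the input numbers are taken from the table rows for European eel, Japanese eel and zebrafish.

```python
def percent_assigned(with_predicted_protein, with_gene_family):
    """Share of representative transcripts with predicted protein that
    also received a gene family annotation, in percent (two decimals)."""
    return round(100 * with_gene_family / with_predicted_protein, 2)


# Values copied from the table above.
european_eel = percent_assigned(27_696, 25_862)
japanese_eel = percent_assigned(23_780, 23_098)
zebrafish = percent_assigned(25_790, 22_703)
```

These three results agree with the tabulated 93.38, 97.13 and 88.03.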
Fig 3. BUSCO analysis.
BUSCO (Benchmarking set of Universal Single-Copy Orthologues) result for included genomes and transcriptomes. The sequence of a BUSCO gene can be found complete or fragmented in each genome and it can be found once (single copy), more than once (duplicated) or not found (missing). Included genomes are: European eel (Anguilla anguilla), Japanese eel (Anguilla japonica), Asian arowana (Scleropages formosus), zebrafish (Danio rerio), northern pike (Esox lucius), spotted gar (Lepisosteus oculatus), fugu (Takifugu rubripes), platyfish (Xiphophorus maculatus) and Atlantic salmon (Salmo salar). Included transcriptomes: European eel, Japanese eel, northern pike, elephantnose fish (Gnathonemus petersii) and silver arowana (Osteoglossum bicirrhosum).
Fig 4. 4dTv and synteny distributions of duplications per branch of the PHYLDOG species tree.
Quantity, 4dTv and synteny distributions of duplications assigned to each branch of the PHYLDOG species tree. Each panel represents the branch with the corresponding number in the cladogram in the bottom right-hand corner. Species included in this study are: European eel (Anguilla anguilla), Japanese eel (Anguilla japonica), zebrafish (Danio rerio), northern pike (Esox lucius), spotted gar (Lepisosteus oculatus), fugu (Takifugu rubripes), platyfish (Xiphophorus maculatus), Atlantic salmon (Salmo salar), elephantnose fish (Gnathonemus petersii), Asian arowana (Scleropages formosus) and silver arowana (Osteoglossum bicirrhosum). The synteny types are the following: “close”, duplicated genes are close in the genome; “some synteny”, paralogues of genes found close to one duplicate are also found close to the other duplicate; “no synteny”, fewer than two paralogues of other genes are found close to both paralogue duplicates; “no information”, the duplicated genes are located in small scaffolds with too few genes close by; “conflicting syntenies”, different synteny classifications found in the genomes of the different species affected by the duplication.
Fig 5. 4dTv distribution between European eel, elephantnose fish, and arowana homologs.
4dTv distribution of European eel (Anguilla anguilla) and Japanese eel homologs, European eel and elephantnose fish (Gnathonemus petersii) homologs, European eel and silver arowana (Osteoglossum bicirrhosum) homologs, and European eel and Asian arowana (Scleropages formosus) homologs.
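For readers unfamiliar with the measure, 4dTv is the proportion of transversions at fourfold degenerate third codon positions. The sketch below shows one way to compute it for a pair of aligned, in-frame coding sequences under the standard genetic code. It is not the authors' pipeline: the function names are hypothetical, and the multiple-substitution correction, -0.5 ln(1 - 2p), is a commonly used formula that the text above does not confirm was applied here.

```python
from math import log

# First two bases of the eight fourfold degenerate codon families
# (standard genetic code): Leu, Val, Ser, Pro, Thr, Ala, Arg, Gly.
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}
PURINES = {"A", "G"}


def four_dtv(seq_a, seq_b):
    """Return (sites, transversions, raw_rate) for two aligned coding sequences.

    A codon pair counts as a site only if both codons share the same
    fourfold degenerate prefix and both third bases are A, C, G or T.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transversions = 0
    for i in range(0, len(seq_a) - 2, 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a[:2] != codon_b[:2] or codon_a[:2] not in FOURFOLD_PREFIXES:
            continue
        base_a, base_b = codon_a[2], codon_b[2]
        if base_a not in "ACGT" or base_b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        # A transversion swaps a purine for a pyrimidine or vice versa.
        if (base_a in PURINES) != (base_b in PURINES):
            transversions += 1
    raw_rate = transversions / sites if sites else None
    return sites, transversions, raw_rate


def corrected_4dtv(raw_rate):
    """Assumed correction for multiple substitutions; undefined for p >= 0.5."""
    if raw_rate is None or raw_rate >= 0.5:
        return None
    return -0.5 * log(1 - 2 * raw_rate)


sites, transversions, raw = four_dtv("GCTGGACTACCCATG", "GCAGGACTGCCCATG")
```

In this toy pair, four codons are fourfold degenerate sites and only the first (GCT versus GCA) differs by a transversion, so the raw rate is 0.25; the CTA versus CTG difference is a transition and is not counted.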
Fig 6. Density distribution of all 4dTv distances between teleost paralogues.
Histograms of all 4dTv distances between paralogues of the included teleosts, presented with yellow and blue bars. Furthermore, a probability density estimate curve is plotted on top of the histograms in red. Density values (y-axis) do not correspond to the density estimate. The included species are: European eel (Anguilla anguilla), Japanese eel (Anguilla japonica), zebrafish (Danio rerio), northern pike (Esox lucius), spotted gar (Lepisosteus oculatus), fugu (Takifugu rubripes), platyfish (Xiphophorus maculatus), Atlantic salmon (Salmo salar), elephantnose fish (Gnathonemus petersii), Asian arowana (Scleropages formosus) and silver arowana (Osteoglossum bicirrhosum).
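A histogram with an overlaid density estimate of this kind can be sketched as follows. The caption does not say which estimator, bandwidth or bin width was used, so the Gaussian kernel, the fixed `bandwidth` argument and the 20 bins over [0, 1] are assumptions made purely for illustration. Unlike in the figure, the bars here are normalised so that they share a scale with the curve.

```python
from math import exp, pi, sqrt


def histogram_density(values, n_bins=20, lo=0.0, hi=1.0):
    """Bar heights normalised so that the bars integrate to 1 over [lo, hi]."""
    counts = [0] * n_bins
    kept = 0
    for v in values:
        if lo <= v <= hi:
            idx = min(int((v - lo) / (hi - lo) * n_bins), n_bins - 1)
            counts[idx] += 1
            kept += 1
    width = (hi - lo) / n_bins
    return [c / (kept * width) if kept else 0.0 for c in counts]


def gaussian_kde(values, bandwidth):
    """Return a function x -> Gaussian kernel density estimate at x."""
    norm = 1.0 / (len(values) * bandwidth * sqrt(2 * pi))

    def density(x):
        return norm * sum(exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)

    return density


# Toy 4dTv distances, not data from the study.
toy_distances = [0.02, 0.03, 0.12, 0.51]
bars = histogram_density(toy_distances)
curve = gaussian_kde(toy_distances, bandwidth=0.05)
```

Peaks in such a curve are what is typically read as bursts of duplication at a similar age, which is why the choice of bandwidth matters when comparing species.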
Enriched GO-terms from the shared freshwater eel branch of Fig 4.
| Aspect | GO ID | Term | Annotated | Significant | Expected | FDR |
|---|---|---|---|---|---|---|
| Biological Process | GO:0007264 | small GTPase mediated signal transductio … | 256 | 43 | 22.97 | 0.000064 |
| Biological Process | GO:0045176 | apical protein localization | 3 | 3 | 0.27 | 0.00072 |
| Biological Process | GO:0008045 | motor neuron axon guidance | 10 | 5 | 0.9 | 0.00099 |
| Biological Process | GO:0048514 | blood vessel morphogenesis | 121 | 20 | 10.86 | 0.00257 |
| Biological Process | GO:0000132 | establishment of mitotic spindle orienta … | 8 | 4 | 0.72 | 0.00335 |
| Biological Process | GO:0048596 | embryonic camera-type eye morphogenesis | 11 | 4 | 0.99 | 0.00625 |
| Biological Process | GO:0015991 | ATP hydrolysis coupled proton transport | 20 | 6 | 1.79 | 0.00661 |
| Biological Process | GO:0008333 | endosome to lysosome transport | 2 | 2 | 0.18 | 0.00804 |
| Biological Process | GO:0015031 | protein transport | 284 | 39 | 25.48 | 0.00900 |
| Biological Process | GO:0006886 | intracellular protein transport | 156 | 19 | 14 | 0.00907 |
| Biological Process | GO:0007160 | cell-matrix adhesion | 16 | 5 | 1.44 | 0.01084 |
| Biological Process | GO:0001756 | somitogenesis | 50 | 10 | 4.49 | 0.01200 |
| Biological Process | GO:0060042 | retina morphogenesis in camera-type eye | 37 | 9 | 3.32 | 0.01887 |
| Biological Process | GO:0072358 | cardiovascular system development | 280 | 44 | 25.12 | 0.01905 |
| Biological Process | GO:0040023 | establishment of nucleus localization | 4 | 3 | 0.36 | 0.02262 |
| Biological Process | GO:0009826 | unidimensional cell growth | 3 | 2 | 0.27 | 0.02268 |
| Biological Process | GO:0030326 | embryonic limb morphogenesis | 3 | 2 | 0.27 | 0.02268 |
| Biological Process | GO:0008202 | steroid metabolic process | 34 | 3 | 3.05 | 0.02280 |
| Biological Process | GO:0007179 | transforming growth factor beta receptor … | 13 | 4 | 1.17 | 0.02382 |
| Biological Process | GO:0071840 | cellular component organization or bioge … | 846 | 81 | 75.91 | 0.02822 |
| Biological Process | GO:0048884 | neuromast development | 15 | 4 | 1.35 | 0.02854 |
| Biological Process | GO:0001569 | patterning of blood vessels | 8 | 3 | 0.72 | 0.02858 |
| Biological Process | GO:0016998 | cell wall macromolecule catabolic proces … | 8 | 3 | 0.72 | 0.02858 |
| Biological Process | GO:0046835 | carbohydrate phosphorylation | 8 | 3 | 0.72 | 0.02858 |
| Biological Process | GO:0043473 | pigmentation | 57 | 7 | 5.11 | 0.02863 |
| Biological Process | GO:0060059 | embryonic retina morphogenesis in camera … | 14 | 4 | 1.26 | 0.03104 |
| Biological Process | GO:0001702 | gastrulation with mouth forming second | 23 | 6 | 2.06 | 0.04241 |
| Biological Process | GO:0060034 | notochord cell differentiation | 6 | 3 | 0.54 | 0.04259 |
| Biological Process | GO:0061035 | regulation of cartilage development | 7 | 3 | 0.63 | 0.04260 |
| Biological Process | GO:0009103 | lipopolysaccharide biosynthetic process | 4 | 2 | 0.36 | 0.04268 |
| Biological Process | GO:0043114 | regulation of vascular permeability | 4 | 2 | 0.36 | 0.04268 |
| Biological Process | GO:0015721 | bile acid and bile salt transport | 4 | 2 | 0.36 | 0.04268 |
| Biological Process | GO:0006511 | ubiquitin-dependent protein catabolic pr … | 91 | 14 | 8.17 | 0.04728 |
| Biological Process | GO:0030900 | forebrain development | 53 | 10 | 4.76 | 0.04832 |
| Biological Process | GO:0042074 | cell migration involved in gastrulation | 37 | 9 | 3.32 | 0.04885 |
| Cellular Component | GO:0031105 | septin complex | 6 | 4 | 0.54 | 0.00084 |
| Cellular Component | GO:0030018 | Z disc | 10 | 5 | 0.9 | 0.00099 |
| Cellular Component | GO:0031461 | cullin-RING ubiquitin ligase complex | 28 | 8 | 2.52 | 0.00555 |
| Cellular Component | GO:0008290 | F-actin capping protein complex | 2 | 2 | 0.18 | 0.00807 |
| Cellular Component | GO:0005915 | zonula adherens | 2 | 2 | 0.18 | 0.00807 |
| Cellular Component | GO:0005737 | cytoplasm | 1750 | 175 | 157.31 | 0.01734 |
| Cellular Component | GO:0005768 | endosome | 47 | 8 | 4.22 | 0.01771 |
| Cellular Component | GO:0033180 | proton-transporting V-type ATPase V1 do … | 10 | 4 | 0.9 | 0.01913 |
| Cellular Component | GO:0030424 | axon | 7 | 3 | 0.63 | 0.01920 |
| Cellular Component | GO:0000159 | protein phosphatase type 2A complex | 3 | 2 | 0.27 | 0.02275 |
| Cellular Component | GO:0005667 | transcription factor complex | 79 | 12 | 7.1 | 0.02336 |
| Cellular Component | GO:0005912 | adherens junction | 7 | 4 | 0.63 | 0.04255 |
| Cellular Component | GO:0031519 | PcG protein complex | 9 | 4 | 0.81 | 0.04258 |
| Cellular Component | GO:0005890 | sodium:potassium-exchanging ATPase compl … | 4 | 2 | 0.36 | 0.04281 |
| Cellular Component | GO:0043198 | dendritic shaft | 4 | 2 | 0.36 | 0.04281 |
| Cellular Component | GO:0005885 | Arp2/3 protein complex | 4 | 2 | 0.36 | 0.04281 |
| Cellular Component | GO:0005765 | lysosomal membrane | 16 | 4 | 1.44 | 0.04914 |
| Molecular Function | GO:0005525 | GTP binding | 251 | 43 | 22.24 | 1.6e-05 |
| Molecular Function | GO:0043168 | anion binding | 1289 | 142 | 114.24 | 0.0018 |
| Molecular Function | GO:0060089 | molecular transducer activity | 778 | 56 | 68.95 | 0.0078 |
| Molecular Function | GO:0004331 | fructose-2,6-bisphosphate 2-phosphatase … | 2 | 2 | 0.18 | 0.0078 |
| Molecular Function | GO:0045296 | cadherin binding | 2 | 2 | 0.18 | 0.0078 |
| Molecular Function | GO:0046933 | proton-transporting ATP synthase activit … | 10 | 4 | 0.89 | 0.0083 |
| Molecular Function | GO:0004702 | receptor signaling protein serine/threon … | 19 | 7 | 1.68 | 0.0112 |
| Molecular Function | GO:0008242 | omega peptidase activity | 6 | 3 | 0.53 | 0.0113 |
| Molecular Function | GO:0031683 | G-protein beta/gamma-subunit complex bin … | 6 | 3 | 0.53 | 0.0113 |
| Molecular Function | GO:0008013 | beta-catenin binding | 7 | 3 | 0.62 | 0.0185 |
| Molecular Function | GO:0016820 | hydrolase activity, acting on acid anhyd … | 38 | 8 | 3.37 | 0.0219 |
| Molecular Function | GO:0004749 | ribose phosphate diphosphokinase activit … | 3 | 2 | 0.27 | 0.0221 |
| Molecular Function | GO:0008601 | protein phosphatase type 2A regulator ac … | 3 | 2 | 0.27 | 0.0221 |
| Molecular Function | GO:0003796 | lysozyme activity | 3 | 2 | 0.27 | 0.0221 |
| Molecular Function | GO:0008146 | sulfotransferase activity | 43 | 10 | 3.81 | 0.0249 |
| Molecular Function | GO:0051287 | NAD binding | 22 | 5 | 1.95 | 0.0298 |
| Molecular Function | GO:0003714 | transcription corepressor activity | 9 | 3 | 0.8 | 0.0388 |
| Molecular Function | GO:0008514 | organic anion transmembrane transporter … | 22 | 5 | 1.95 | 0.0399 |
| Molecular Function | GO:0004872 | receptor activity | 695 | 41 | 61.59 | 0.0495 |
Enriched GO-terms from the duplicated genes shared by freshwater eels. “Aspect” indicates the specific GO-term aspect of each enriched GO-term. “GO ID” indicates the identification number of each enriched GO-term. “Term” indicates the verbal description of each enriched GO-term. “Annotated” indicates the number of GO-terms which are associated with each enriched GO-term. “Significant” indicates the number of GO-terms associated with each enriched GO-term found among the duplicated genes. “Expected” indicates the number of GO-terms expected to be found linked to each enriched GO-term. “FDR” indicates the False Discovery Rate adjusted P-value from the Fisher exact test of enrichment.
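The relationship between the “Annotated”, “Significant”, “Expected” and “FDR” columns can be illustrated with a minimal enrichment sketch. Several points are assumptions, not statements about the authors' method: the test is written as a one-sided (over-representation) Fisher exact test, the FDR adjustment is taken to be Benjamini-Hochberg, and the background and study sizes in the example are toy numbers, since the caption does not report them. All function names are hypothetical.

```python
from math import comb


def expected_count(annotated, study_size, background_size):
    """Count expected in the study set if the term were drawn at random."""
    return study_size * annotated / background_size


def fisher_enrichment_p(significant, annotated, study_size, background_size):
    """One-sided Fisher exact p-value, P(X >= significant), where X follows
    the hypergeometric distribution of the null hypothesis."""
    upper = min(annotated, study_size)
    tail = sum(
        comb(annotated, k) * comb(background_size - annotated, study_size - k)
        for k in range(significant, upper + 1)
    )
    return tail / comb(background_size, study_size)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy numbers: 20 background items, 5 annotated with a term,
# 6 items in the study set, 3 of which carry the term.
toy_expected = expected_count(annotated=5, study_size=6, background_size=20)
toy_p = fisher_enrichment_p(significant=3, annotated=5, study_size=6, background_size=20)
toy_fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Note that a few rows in the table have “Significant” below “Expected” (for example receptor activity), so the test actually used may not have been purely one-sided.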
Enriched KEGG-terms from the shared freshwater eel branch of Fig 4.
| KEGG ID | Term | Significant | Annotated | Expected | FDR |
|---|---|---|---|---|---|
| 04728 | Dopaminergic synapse | 38 | 283 | 12 | 0.000001 |
| 03015 | mRNA surveillance pathway | 21 | 129 | 5 | 0.000082 |
| 04660 | T cell receptor signaling pathway | 27 | 204 | 8 | 0.000082 |
| 04071 | Sphingolipid signaling pathway | 29 | 238 | 10 | 0.000112 |
| 05142 | Chagas disease (American trypanosomiasis) | 23 | 180 | 7 | 0.000518 |
| 04659 | Th17 cell differentiation | 23 | 184 | 7 | 0.000596 |
| 05162 | Measles | 21 | 168 | 7 | 0.001190 |
| 04390 | Hippo signaling pathway | 29 | 282 | 11 | 0.001190 |
| 04658 | Th1 and Th2 cell differentiation | 19 | 148 | 6 | 0.001601 |
| 04261 | Adrenergic signaling in cardiomyocytes | 28 | 291 | 12 | 0.004378 |
| 05100 | Bacterial invasion of epithelial cells | 20 | 183 | 7 | 0.005845 |
| 05032 | Morphine addiction | 18 | 155 | 6 | 0.005845 |
| 00625 | Chloroalkane and chloroalkene degradation | 5 | 10 | 0 | 0.005845 |
| 04640 | Hematopoietic cell lineage | 12 | 79 | 3 | 0.006536 |
| 04910 | Insulin signaling pathway | 26 | 276 | 11 | 0.006551 |
| 04630 | Jak-STAT signaling pathway | 18 | 171 | 7 | 0.012920 |
| 04016 | MAPK signaling pathway - plant | 6 | 22 | 1 | 0.014553 |
| 04022 | cGMP-PKG signaling pathway | 29 | 350 | 14 | 0.015915 |
| 04917 | Prolactin signaling pathway | 15 | 133 | 5 | 0.015915 |
| 05130 | Pathogenic Escherichia coli infection | 13 | 108 | 4 | 0.019059 |
| 05418 | Fluid shear stress and atherosclerosis | 22 | 245 | 10 | 0.022738 |
| 00020 | Citrate cycle (TCA cycle) | 8 | 47 | 2 | 0.022808 |
| 04080 | Neuroactive ligand-receptor interaction | 31 | 395 | 16 | 0.023393 |
| 04151 | PI3K-Akt signaling pathway | 37 | 499 | 20 | 0.023393 |
| 04514 | Cell adhesion molecules (CAMs) | 21 | 231 | 9 | 0.023393 |
| 04391 | Hippo signaling pathway - fly | 16 | 158 | 6 | 0.034750 |
| 05340 | Primary immunodeficiency | 7 | 41 | 2 | 0.036069 |
| 04350 | TGF-beta signaling pathway | 16 | 164 | 7 | 0.038388 |
| 05133 | Pertussis | 12 | 111 | 5 | 0.045104 |
| 05152 | Tuberculosis | 21 | 247 | 10 | 0.047421 |
| 04664 | Fc epsilon RI signaling pathway | 12 | 113 | 5 | 0.048089 |
| 00510 | N-Glycan biosynthesis | 9 | 71 | 3 | 0.048943 |
| 04144 | Endocytosis | 37 | 533 | 22 | 0.048943 |
| 00350 | Tyrosine metabolism | 6 | 34 | 1 | 0.048943 |
| 04510 | Focal adhesion | 28 | 379 | 15 | 0.048943 |
Enriched KEGG-terms from the duplicated genes shared by freshwater eels. “KEGG ID” indicates the identification number of each enriched KEGG pathway. “Term” indicates the verbal description of each enriched KEGG pathway. “Annotated” indicates the number of KEGG pathways which are associated with each enriched KEGG pathway. “Significant” indicates the number of KEGG pathways associated with each enriched KEGG pathway found among the duplicated genes. “Expected” indicates the number of KEGG pathways expected to be found associated with each enriched KEGG pathway. “FDR” indicates the False Discovery Rate adjusted P-value from the Fisher exact test of enrichment.