Jeremy Pasquier1, Cédric Cabau2, Thaovi Nguyen1, Elodie Jouanno1, Dany Severac3, Ingo Braasch4,5, Laurent Journot3, Pierre Pontarotti6, Christophe Klopp7, John H Postlethwait4, Yann Guiguen1, Julien Bobe8.
Abstract
With more than 30,000 species, ray-finned fish represent approximately half of vertebrates. The evolution of ray-finned fish was impacted by several whole genome duplication (WGD) events including a teleost-specific WGD event (TGD) that occurred at the root of the teleost lineage about 350 million years ago (Mya) and more recent WGD events in salmonids, carps, suckers and others. In plants and animals, WGD events are associated with adaptive radiations and evolutionary innovations. WGD-spurred innovation may be especially relevant in the case of teleost fish, which colonized a wide diversity of habitats on Earth, including many extreme environments. Fish biodiversity, the use of fish models for human medicine and ecological studies, and the importance of fish in human nutrition fuel a strong need for the characterization of gene expression repertoires and corresponding evolutionary histories of ray-finned fish genes. To this end, we performed transcriptome analyses and developed the PhyloFish database to provide (i) de novo assembled gene repertoires in 23 different ray-finned fish species including two holosteans (i.e. a group that diverged from teleosts before TGD) and 21 teleosts (including six salmonids), and (ii) gene expression levels in ten different tissues and organs (and embryos for many) in the same species. This resource was generated using a common deep RNA sequencing protocol to obtain the most exhaustive gene repertoire possible in each species and to allow between-species comparisons to study the evolution of gene expression in different lineages. The PhyloFish database described here can be accessed and searched using RNAbrowse, a simple and efficient solution to give access to RNA-seq de novo assembled transcripts.
Keywords: Assembly; Gar; Gene duplication; Gene expression; Holostean; Mcam; Salmonids; Stra8; Teleosts
Year: 2016 PMID: 27189481 PMCID: PMC4870732 DOI: 10.1186/s12864-016-2709-z
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Phylogenetic tree of the PhyloFish species. Cladogram showing phylogenetic relationships among ray-finned fish analyzed in the present study. Tree topology was adapted from [2]. For each phylogenetic group, the number of species in the PhyloFish set is indicated in brackets. The teleost-specific (TGD) and salmonid-specific (SaGD) whole genome duplication events are indicated in red.
Species present in the PhyloFish database
| Name | Species | Phylogenetic group | Nb of contigs | WGD |
|---|---|---|---|---|
| Bowfin | | Amiiformes | 35064 | VGD2 |
| Spotted gar | | Lepisosteiformes | 41396 | VGD2 |
| European eel | | Anguilliformes | 60263 | TGD |
| Butterfly fish | | Osteoglossiformes | 44577 | TGD |
| Arowana | | Osteoglossiformes | 55739 | TGD |
| Elephantnose fish | | Osteoglossiformes | 53423 | TGD |
| Allis shad | | Clupeiformes | 53363 | TGD |
| Zebrafish | | Cypriniformes | 48158 | TGD |
| Panga | | Siluriformes | 43013 | TGD |
| Black ghost knifefish | | Gymnotiformes | 45356 | TGD |
| Mexican tetra (cave) | | Characiformes | 47729 | TGD |
| Mexican tetra (surface) | | Characiformes | 46670 | TGD |
| Northern pike | | Esociformes | 48567 | TGD |
| Eastern mudminnow | | Esociformes | 46381 | TGD |
| Grayling | | Salmoniformes | 67157 | SaGD |
| European whitefish | | Salmoniformes | 74701 | SaGD |
| American whitefish | | Salmoniformes | 66996 | SaGD |
| Brown trout | | Salmoniformes | 75338 | SaGD |
| Rainbow trout | | Salmoniformes | 78415 | SaGD |
| Brook trout | | Salmoniformes | 69441 | SaGD |
| Sweetfish | | Osmeriformes | 47484 | TGD |
| Atlantic cod | | Gadiformes | 50564 | TGD |
| Medaka | | Beloniformes | 42186 | TGD |
| European perch | | Perciformes | 49204 | TGD |
For each species, the common name (according to fishbase.org), the species name, the phylogenetic group, the number of de novo assembled contigs generated, and the position relative to successive whole genome duplication (WGD) events are shown. VGD2: vertebrate second round of WGD; TGD: teleost-specific WGD; SaGD: salmonid-specific WGD.
Fig. 2 Stra8 proposed gene evolution in teleosts following the TGD WGD. Maximum-likelihood phylogeny of Stra8 (a) was performed using the PhyML software [38] implemented in the Phylogeny.fr web platform [39] using default “a la carte” parameters and a bootstrapping procedure (N = 100 bootstraps). The resulting tree was exported and edited in Evolview [40]. Input sequences were retrieved using a tblastn search of the PhyloFish database using as bait the Southern catfish Stra8 protein (AGM53488.1). PhyloFish Stra8 coding sequences (in bold on the tree) were submitted to GenBank with the following accession numbers: Lepisosteus oculatus (KU161162), Osteoglossum bicirrhosum (KU161164), Anguilla anguilla (KU161163), Alosa alosa (KU161165), Astyanax mexicanus (KU161166), Apteronotus albifrons (KU161167), Oncorhynchus mykiss (KU161168), Salvelinus fontinalis (KU161169), Coregonus lavaretus (KU161172), Coregonus clupeaformis (KU161171), Salmo trutta (KU161170), Thymallus thymallus (KU161173), Umbra pygmaea (KU161174) and Plecoglossus altivelis (KU161175). This dataset was complemented with two additional public teleost Stra8 sequences from Esox lucius (XP_012986862) and Astyanax mexicanus (XP_007229918.1) and a Stra8 sequence deduced from the Anguilla japonica genome (scaffold 6093). The tree was rooted with tetrapod sequences using the Homo sapiens STRA8 (AAP47163.1) and Alligator mississippiensis Stra8 (XP_006261218.1). (b) Schematic representation of the deduced evolution of stra8 based on PhyloFish sequences. This analysis suggests that stra8 was completely lost in Acanthomorpha, but also specifically and independently lost in the Cypriniformes lineage. The tree is based on [2].
Fig. 3 Tissue expression profiles of stra8 reveal expression predominantly in testes in most PhyloFish species. Relative expression of stra8 was calculated as the percentage of the maximum RPKM (number of reads per kilobase per million reads) value per species. ND: no data (tissue not sequenced in that species).
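The normalization described in the Fig. 3 caption can be illustrated with a minimal sketch. The function names and the example values below are hypothetical and are not taken from the PhyloFish pipeline. The RPKM helper uses the standard definition (reads per kilobase of transcript per million mapped reads), which is assumed here to match the one used in the study.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Standard RPKM: reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def relative_expression(rpkm_by_tissue):
    """Express each tissue's RPKM as a percentage of the maximum RPKM
    observed for that gene in one species.

    Tissues that were not sequenced are passed as None and stay None (ND).
    """
    measured = [v for v in rpkm_by_tissue.values() if v is not None]
    peak = max(measured, default=0.0)
    profile = {}
    for tissue, value in rpkm_by_tissue.items():
        if value is None:
            profile[tissue] = None      # ND: tissue not sequenced
        elif peak == 0:
            profile[tissue] = 0.0       # gene not detected in any tissue
        else:
            profile[tissue] = 100.0 * value / peak
    return profile


# Hypothetical RPKM values for one gene in one species (illustration only).
example = {"testis": 40.0, "ovary": 10.0, "brain": 0.0, "embryo": None}
profile = relative_expression(example)
```

Because each species is scaled to its own maximum, profiles become comparable in shape across species even when absolute RPKM values differ.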
Fig. 4 Phylogeny of Mcam in teleosts following the TGD and SaGD WGDs. Maximum-likelihood phylogeny of Mcam was performed using the PhyML software [38] implemented in the Phylogeny.fr web platform [39] using default “a la carte” parameters and a bootstrapping procedure (N = 100 bootstraps). The resulting tree was exported and edited in Evolview [40]. Input sequences were retrieved using a tblastn search of the PhyloFish database using as bait the zebrafish Mcam protein (XP_005157627.1), in the Mcama branch of the tree. PhyloFish Mcam coding sequences are shown in bold on the tree. The tree was rooted with tetrapod sequences using the Homo sapiens MCAM (AAH56418) and Alligator sinensis Mcam (XP_014373905). A few additional published teleost Mcam sequences were added to the analysis (normal font). The TGD and SaGD are shown with red stars.