Hua Ying1, Ira Cooke2, Susanne Sprungala2, Weiwen Wang3, David C Hayward3, Yurong Tang3,4, Gavin Huttley3,4, Eldon E Ball3,5, Sylvain Forêt3,5, David J Miller6,7.
Abstract
BACKGROUND: Despite the biological and economic significance of scleractinian reef-building corals, the lack of large molecular datasets for a representative range of species limits understanding of many aspects of their biology. Within the Scleractinia, based on molecular evidence, it is generally recognised that there are two major clades, Complexa and Robusta, but the genomic bases of significant differences between them remain unclear.
Keywords: Complex coral; Gene family expansion; Histidine biosynthesis; Hox cluster; Nucleotide substitution model; ParaHox; Robust coral; Scleractinia
Year: 2018 PMID: 30384840 PMCID: PMC6214176 DOI: 10.1186/s13059-018-1552-8
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Fig. 1. Phylogenetic positions and morphology of cnidarians used in the present study. a Molecular phylogeny of corals and sea anemones inferred using a maximum likelihood method based on the general nucleotide substitution model. The scale bar indicates 0.1 substitutions per site. b Goniastrea aspera (R): b1 A small isolated colony, which appears brown due to the zooxanthellae which it contains. Often this species occurs as larger colonies covering several meters in shallow water habitats that may sometimes be quite turbid. b2 Closeup of polyps showing cream-coloured lobes of the oral disc. b3 Closeup of the skeleton showing the complex skeletal structure underlying each polyp. c Fungia fungites (R): c1 The colony consists of a single polyp which is usually withdrawn during the day and expanded at night. c2 Closeup of the mouth area, covered by sometimes multicoloured living tissue. c3 The skeleton consists of closely spaced septa which support the living tissue. d Galaxea fascicularis (C): d1 A large encrusting colony. d2 Closeup of the polyps, which can be of diverse colours. Galaxea is unusual amongst corals in that the polyps are often extended during the day. d3 The Galaxea skeleton differs considerably from those of many other massive corals in that only thin layers of coenosteum link the individual polyps. e Portion of a colony of Acropora millepora (C). f Colony of Acropora digitifera (C). g Colony of Porites lutea (C). h The sea anemone, Aiptasia pallida. i The starlet sea anemone, Nematostella vectensis. Photo credits are given in the Acknowledgements
Genome assembly and annotation statistics for the three sequenced coral genomes
| Statistic | | | | |
|---|---|---|---|---|
| Assembled genome size (Mb)^a | | 334 | 606 | 764 |
| SNP rate | | 1.27 | 1.19 | 0.89 |
| Scaffolds | Number | 11,269 | 7424 | 5396 |
| | N50 (kb) | 87 | 323 | 518 |
| | Largest (kb) | 874 | 1804 | 2896 |
| GC% | | 39.43 | 38.41 | 39.29 |
| Number of genes^b | | 22,418 | 38,209 | 35,901 |
| Repeats^c | Total repeat (%) | 31.71 | 37.36 | 44.77 |
| | Interspersed repeat (%) | 30.41 | 35.59 | 42.91 |

^a See Additional file 3: Table S3 for more detail
^b See Additional file 3: Table S6 for more detail
^c See Additional file 3: Table S5 for more detail
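The scaffold N50 reported in the table is a standard contiguity statistic: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The following is a minimal sketch of that definition, not the authors' pipeline; the function name `n50` and the toy scaffold lengths are assumptions made for illustration.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length
    of the scaffold at which the running total first reaches half of the
    total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one scaffold length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy example (hypothetical lengths in kb, not data from the paper).
toy_scaffolds_kb = [10, 8, 5, 3, 2]
print(n50(toy_scaffolds_kb))
```

A higher N50 at a similar assembly size indicates that more of the genome sits in long scaffolds, which is why the statistic is reported alongside scaffold number and largest scaffold.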
Fig. 2. The significant preservation of gene collinearity between complex and robust corals. Synteny relationships are shown in the form of circle plots. In each plot, three species were chosen: two from the same lineage and another from a different lineage. The broken lines at the periphery of each circle represent scaffolds, with the length of each segment indicating the relative length of that scaffold. The top five most synteny block-rich scaffolds were selected from each species and were linked with the corresponding syntenic regions in the other species. Grey lines link syntenic blocks that were identified only in one pair of species, whilst red lines highlight syntenic blocks that were shared (minimum two overlapping syntenic orthologs) by two pairs of species. The number of syntenic blocks between robust (e.g. Fungia) and complex (e.g. Galaxea) corals (a) greatly exceeds that between the sea anemones Aiptasia and Nematostella (b)
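The "shared" criterion in the caption (at least two overlapping syntenic orthologs between blocks found in two species pairs) can be illustrated with a small sketch. This is one possible reading of the rule, not the authors' implementation: each block is assumed to be represented as a set of ortholog identifiers from the species common to both pairs, and the names `shared_blocks`, `min_overlap` and all identifiers in the toy data are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def shared_blocks(
    blocks_pair1: Dict[str, Set[str]],
    blocks_pair2: Dict[str, Set[str]],
    min_overlap: int = 2,
) -> List[Tuple[str, str]]:
    """Return pairs of block names whose ortholog sets overlap enough.

    A block from the first species pair and a block from the second are
    reported as shared when they have at least `min_overlap` orthologs
    in common.
    """
    shared = []
    for name1, genes1 in blocks_pair1.items():
        for name2, genes2 in blocks_pair2.items():
            if len(genes1 & genes2) >= min_overlap:
                shared.append((name1, name2))
    return shared


# Toy data (hypothetical block and ortholog names).
pair_ab = {"ab_block1": {"og1", "og2", "og3"}, "ab_block2": {"og7", "og8"}}
pair_ac = {"ac_block1": {"og2", "og3", "og4"}, "ac_block2": {"og8", "og9"}}

print(shared_blocks(pair_ab, pair_ac))
```

In this toy case only `ab_block1` and `ac_block1` would be drawn in red, since the second pair of blocks shares a single ortholog.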
Fig. 3. Organisation of HOX-related genes in corals and sea anemones. Relative positions and orientations of cluster H1 genes are largely conserved across taxa with the exception of Aiptasia. Arrows indicate the direction of transcription and colours distinguish between orthologous groups of genes identified by phylogenetic analysis. Genes that have been duplicated within a family are represented by multiple smaller arrows. The nomenclature used here for the H1 genes is based on Chourrout et al. [35] and Baumgarten et al. [20]; note that Mnx1 is also known as hlxB9. Unconnected arrows represent different scaffolds (two from Nematostella and two from Aiptasia); dotted lines indicate genes that were assembled on a different scaffold from other H1 genes, but can be putatively placed in the cluster by comparing gene arrangement with other species; arrows without outline indicate a manually corrected gene model; vertical wavy lines represent a long genomic distance with 10–20 genes between. Gene order and orientation in ParaHox cluster H2 are highly conserved in corals, whilst local gene rearrangement is observed in both sea anemones. Black box and circle represent the POMP and CD027 genes, respectively
Fig. 4. Venn diagram of shared and unique PFAM-A domains amongst sea anemones and complex and robust corals. Set membership counts shown without parentheses consider a domain to be present in a lineage if it is found in any of the species in that lineage. Numbers in parentheses indicate set memberships calculated with the requirement that the domain is present in all species in that group. In the situation where corals of the complex or robust lineages form a group with anemones, the domain is counted when it is present in all coral species and either of the sea anemones
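The two counting rules described in the caption (present in any species versus present in all species of a lineage, with a relaxed rule when a coral lineage is grouped with the anemones) can be expressed as simple set tests. The sketch below is illustrative only: the function names are assumptions, and the domain labels `dom_A`, `dom_B` and `dom_C` are placeholders rather than real PFAM assignments.

```python
from typing import Dict, Set

SpeciesDomains = Dict[str, Set[str]]  # species name -> set of domain names


def lineage_has_domain(domain: str, lineage: SpeciesDomains, require_all: bool) -> bool:
    """Lenient rule (any species) or strict rule (all species) for one lineage."""
    hits = [domain in domains for domains in lineage.values()]
    return all(hits) if require_all else any(hits)


def coral_anemone_group_has_domain(
    domain: str, corals: SpeciesDomains, anemones: SpeciesDomains
) -> bool:
    """Strict rule for a coral lineage grouped with anemones.

    The domain must be present in all coral species and in at least one
    of the sea anemones.
    """
    in_all_corals = all(domain in domains for domains in corals.values())
    in_any_anemone = any(domain in domains for domains in anemones.values())
    return in_all_corals and in_any_anemone


# Toy data: placeholder domain sets, not the published annotations.
complex_corals = {
    "Acropora millepora": {"dom_A", "dom_B"},
    "Galaxea fascicularis": {"dom_A"},
}
anemones = {
    "Aiptasia": {"dom_A"},
    "Nematostella": {"dom_C"},
}

print(lineage_has_domain("dom_B", complex_corals, require_all=False))
print(lineage_has_domain("dom_B", complex_corals, require_all=True))
print(coral_anemone_group_has_domain("dom_A", complex_corals, anemones))
```

Under the lenient rule `dom_B` counts as a complex-coral domain, but under the strict rule it does not, which is why the parenthesised counts in the figure can never exceed the unparenthesised ones for a single lineage.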
Lineage restricted PFAM-A domains
| | Domain | Description | | Domain | Description |
|---|---|---|---|---|---|
| Present in ANEMONE not in coral | DUF3445 | Protein of unknown function (DUF3445) | Present in COMPLEX not in robust corals | BCL_N* | BCL7, N-terminal conserved region |
| | DUF853 | Bacterial protein of unknown function (DUF853) | | SR-25* | Nuclear RNA-splicing-associated protein |
| | IF3_N | Translation initiation factor IF-3, N-terminal domain | | Toxin_R_bind_N* | Clostridium neurotoxin, N-terminal receptor binding |
| | CHZ | Histone chaperone domain CHZ | | DUF4557 | Domain of unknown function (DUF4557) |
| | DUF455 | Protein of unknown function (DUF455) | | Nucleoplasmin* | Nucleoplasmin |
| | MotA_ExbB | MotA/TolQ/ExbB proton channel family | | CBM_11* | Carbohydrate binding domain (family 11) |
| | COX7a | Cytochrome c oxidase subunit VIIa | | Microtub_assoc* | Microtubule associated |
| | PNMA | PNMA | | Drc1-Sld2* | DNA replication and checkpoint protein |
| | SLBB | SLBB domain | | Prefoldin_3* | Prefoldin subunit |
| | 5-nucleotidase | 5′-nucleotidase | | CutA1* | CutA1 divalent ion tolerance protein |
| | MacB_PCD | MacB-like periplasmic core domain | | DUF2414* | Protein of unknown function (DUF2414) |
| | Glyco_trans_4_2 | Glycosyl transferase 4-like | | Dynein_IC2* | Cytoplasmic dynein 1 intermediate chain 2 |
| | HtrL_YibB | Bacterial protein of unknown function (HtrL_YibB) | | MIF4G_like* | MIF4G like |
| | Glyco_transf_17 | Glycosyltransferase family 17 | | TAN* | Telomere-length maintenance and DNA damage repair |
| | DUF1762 | Protein of unknown function (DUF1762) | | DUF2201 | VWA-like domain (DUF2201) |
| | FAM216B | FAM216B protein family | | MGC-24* | Multi-glycosylated core protein 24 (MGC-24) |
| | Resistin | Resistin | | Hint_2 | Hint domain |
| | DUF3598 | Domain of unknown function (DUF3598) | | Mem_trans* | Membrane transport protein |
| | PDH | Prephenate dehydrogenase | | DUF4606* | Domain of unknown function (DUF4606) |
| | YbjQ_1 | Putative heavy-metal-binding | | | |
| | CBM_20 | Starch binding domain | | | |
| | IF2_N | Translation initiation factor IF-2, N-terminal region | | | |
| | IlvN | Acetohydroxy acid isomeroreductase, catalytic domain | | | |
| Present in CORAL not in anemone | OLF | Olfactomedin-like domain | Present in ROBUST not in complex corals | RAG1 | Recombination-activation protein 1 (RAG1) |
| | LRRNT | Leucine rich repeat N-terminal domain | | DUF2341 | Domain of unknown function (DUF2341) |
| | Myb_DNA-bind_3 | Myb/SANT-like DNA-binding domain | | Toxin_60 | Putative toxin 60 |
| | HTH_19 | Helix-turn-helix domain | | | |
| | Parvo_coat_N | Parvovirus coat protein VP1 | | Glt_symporter* | Sodium/glutamate symporter |
| | ScpA_ScpB | ScpA/B protein | | NTPase_I-T* | Protein of unknown function DUF84 |
| | MFS_3 | Transmembrane secretion effector | | XG_Ftase* | Xyloglucan fucosyltransferase |
| | BBE | Berberine and berberine like | | Phospholip_A2_2* | Phospholipase A2 |
| | LBR_tudor | Lamin-B receptor of TUDOR domain | | Polysacc_lyase | Polysaccharide lyase |
| | HATPase_c_4 | ATP-dependent DNA helicase recG C-terminal | | | |
| | PSDC | Phosphatidylserine decarboxylase | | Sigma70_r4_2* | Sigma-70, region 4 |
| | STAG | STAG domain | | HTH_Tnp_Tc3_2* | Transposase |
| | GH3 | GH3 auxin-responsive promoter | | | |
| | Glutaminase | Glutaminase | | EURL | EURL protein |
| | Smoothelin | Smoothelin cytoskeleton protein | | | |
| | DUF1982 | Domain of unknown function (DUF1982) | | Hexapep_2* | Hexapeptide repeat of succinyl-transferase |
| | MRP-L28 | Mitochondrial ribosomal protein L28 | | DUF4094 | Domain of unknown function (DUF4094) |
| | Tocopherol_cycl | Tocopherol cyclase | | | |
| | Sec39 | Secretory pathway protein Sec39 | | DUF1864 | Domain of unknown function (DUF1864) |
| | DUF3496 | Domain of unknown function (DUF3496) | | Phage_T7_tail | Phage T7 tail fibre protein |
| | DUF4613 | Domain of unknown function (DUF4613) | | | |
| | HK | Hydroxyethylthiazole kinase family | | Lipase_GDSL_3 | GDSL-like Lipase/Acylhydrolase family |
| | Macro_2 | Macro-like domain | | | |
| | DUF72 | Protein of unknown function DUF72 | | | |
| | SRP9-21 | Signal recognition particle 9 kDa protein (SRP9) | | | |
| | PMT_2 | Dolichyl-phosphate-mannose-protein mannosyltransferase | | | |
| | STAT_int | STAT protein, protein interaction domain | | | |
| | CNPase | 2′,3′-cyclic nucleotide 3′-phosphodiesterase (CNP or CNPase) | | | |
| | UPF0066 | Uncharacterised protein family UPF0066 | | | |
| | Tnp_zf-ribbon_2 | DDE_Tnp_1-like zinc-ribbon | | | |
| | Eno-Rase_NADH_b | NAD(P)H binding domain of trans-2-enoyl-CoA reductase | | | |
| | CR6_interact | Growth arrest and DNA-damage-inducible proteins-interacting protein 1 | | | |
| | BRCA-2_OB3 | BRCA2, oligonucleotide/oligosaccharide-binding, domain 3 | | | |
*Domains present in Nematostella or Aiptasia. See Additional file 2: Table S14 for more detail
Domains shown in italics represent domains involved in the histidine biosynthesis pathway
Fig. 5. Histidine biosynthetic pathway in robust corals. The biochemical pathway by which histidine is synthesised from phosphoribosyl pyrophosphate is the same in plants, fungi, and bacteria, but some steps are brought about by unrelated proteins in different organisms. In robust corals, a fungal-like complement of enzymes is involved, the proteins responsible being (a) ATP phosphoribosyltransferase, (b) histidine biosynthesis trifunctional protein, (c) 5′ProFAR isomerase, (d) IGP synthase, (e) imidazoleglycerol-phosphate dehydratase, (f) histidinol-phosphate aminotransferase, and (g) histidinol-phosphate phosphatase. Abbreviations used: PRPP, phosphoribosyl pyrophosphate; ATP, adenosine triphosphate; PPi, pyrophosphate; PR-ATP, phosphoribosyl-ATP; PR-AMP, phosphoribosyl-AMP; 5′ProFAR, 1-(5-phosphoribosyl)-5-[(5-phosphoribosylamino)methylideneamino]imidazole-4-carboxamide; PRFAR, 5-[(5-phospho-1-deoxyribulos-1-ylamino)methylideneamino]-1-(5-phosphoribosyl)imidazole-4-carboxamide; IGP, imidazole-glycerol phosphate; AICAR, 1-(5′-phosphoribosyl)-5-amino-4-imidazolecarboxamide; IAP, imidazole-acetol phosphate; Hol-P, L-histidinol phosphate; Pi, phosphate; NAD+, oxidised nicotinamide adenine dinucleotide; NADH, reduced nicotinamide adenine dinucleotide
Genes involved in histidine biosynthesis pathway in cnidarians
| Species | KEGG K Identifier | Gene ID | Activity (Fig. 5) | Step catalysed (Fig. 5) | SwissProt Accession ID | Match species | Gene description |
|---|---|---|---|---|---|---|---|
| | K00765 | ffun1.m4.9038 | A | 1 | Q75AK8 | | ATP phosphoribosyltransferase |
| | K14152 | ffun1.m4.19036 | B | 2, 3, 9 and 10 | P45353 | | Histidine biosynthesis trifunctional protein |
| | K01814 | ffun1.m4.11161 | C | 4 | Q6C2U0 | | 1-(5-phosphoribosyl)-5-[(5-phosphoribosylamino)methylideneamino] imidazole-4-carboxamide isomerase |
| | K01663 | ffun1.m4.971 | D | 5 | Q9SZ30 | | Imidazole glycerol phosphate synthase hisHF |
| | K01693 | ffun1.m4.26536 | E | 6 | P28624 | | Imidazoleglycerol-phosphate dehydratase |
| | K00817 | ffun1.m4.6504 | F | 7 | A5FFY0 | | Histidinol-phosphate aminotransferase |
| | K04486 | ffun1.m4.13871 | G | 8 | O14059 | | Probable histidinol-phosphatase |
| | K00765 | gasp1.m3.565 | A | 1 | Q99145 | | ATP phosphoribosyltransferase |
| | K14152 | gasp1.m3.3160 | B | 2, 3, 9 and 10 | P45353 | | Histidine biosynthesis trifunctional protein |
| | K01814 | gasp1.m3.11564 | C | 4 | Q10184 | | 1-(5-phosphoribosyl)-5-[(5-phosphoribosylamino)methylideneamino] imidazole-4-carboxamide isomerase |
| | K01663 | gasp1.m3.19737 | D | 5 | Q9SZ30 | | Imidazole glycerol phosphate synthase hisHF |
| | K01693 | gasp1.m3.16481 | E | 6 | Q12578 | | Imidazoleglycerol-phosphate dehydratase |
| | K00817 | gasp1.m3.19230 | F | 7 | Q11VM5 | | Histidinol-phosphate aminotransferase |
| | K04486 | gasp1.m3.11323 | G | 8 | O14059 | | Probable histidinol-phosphatase |
| | K04486 | 1.2.15090 | G | 8 | O14059 | | Probable histidinol-phosphatase |
| | K04486 | gfas1.m1.2962 | | | | | |
| | K04486 | plut2.m8.12019 | | | | | |
| | K01663 | NEMVEDRAFT_ | D | 5 | Q9SZ30 | | Imidazole glycerol phosphate synthase hisHF |
See Additional file 2: Table S18 for more detail
Fig. 6. Phylogenetic analysis resolves the robust coral ATP phosphoribosyltransferases from their fungal and Symbiodinium homologues. ATP phosphoribosyltransferase proteins catalyse the first step in the histidine biosynthetic pathway, but the robust coral sequences are clearly resolved from those of representatives of other kingdoms of life in phylogenetic analyses. IQ-TREE was applied to generate the unrooted tree shown and automatic model selection chose LG + G4 as the best model. Numbers on nodes represent UFboot values based on 1000 iterations. Branch lengths indicate the expected number of amino acid substitutions per site. It is clear that robust coral proteins form a tight clade closest to fungal proteins, whereas proteins derived from Symbiodinium strains that are endosymbiotic with robust corals were most similar to those from other Symbiodinium isolates. In the case of Symbiodinium strains, the host species is indicated in parentheses, the exception being S. kawagutii, which, although isolated in association with the coral Montipora verrucosa (C), is now thought to be non-symbiotic [127]
Fig. 7. Loss of histidine biosynthesis trifunctional protein (HIS2, K14152) in complex corals and sea anemones. Syntenic regions surrounding the robust coral K14152 locus are conserved amongst robust corals, complex corals, and sea anemones, but K14152 has been lost in the latter two groups. Gene names identified from a BLAST search against the SwissProt database are shown at the top of the figure. Gene identifiers from each species are displayed beneath the coloured boxes. Relative positions of syntenic orthologs are aligned, and blank spaces represent missing syntenic orthologs. Red crosses indicate that BLAST searches (E-value threshold 0.1) of those syntenic regions, conducted with both the robust coral K14152 nucleotide and protein sequences, did not detect any homology