Li Guo1, Sijie Liang1, Zhongyi Zhang1, Hang Liu1, Songwen Wang2, Kehou Pan3, Jian Xu4, Xue Ren5, Surui Pei5, Guanpin Yang1,6,7.
Abstract
The species of the genus Nannochloropsis are unique in maintaining a nucleus-plastid continuum throughout their cell cycle, in their non-motility, and in their asexual reproduction. These characteristics should be reflected in their gene assemblages (genomes). Here we show that N. oceanica has a genome of 29.3 Mb consisting of 32 pseudochromosomes and containing 7,330 protein-coding genes, and that the host nucleus may have been overthrown by an ancient red alga symbiont nucleus during speciation through secondary endosymbiosis. In addition, N. oceanica has lost its flagella and its ability to undergo meiosis and sexual reproduction, and adopted a genome reduction strategy during speciation. We propose that N. oceanica emerged through the active fusion of a host protist and a photosynthesizing ancient red alga, with the symbiont nucleus becoming dominant over the host nucleus while the chloroplast was wrapped by two layers of endoplasmic reticulum. Our findings provide evidence for an alternative speciation pathway of eukaryotes.
Keywords: Evolution; Genomics
Year: 2019 PMID: 31286066 PMCID: PMC6610115 DOI: 10.1038/s42003-019-0500-9
Source DB: PubMed Journal: Commun Biol ISSN: 2399-3642
The characteristics of the N. oceanica genome
| Characteristic | Value |
|---|---|
| Assembled genome size (bp) | 29,303,273 |
| Read coverage depth | 112× |
| No. of contigs | 129 |
| Contig N50 (bp) | 664,749 |
| Max. contig length (bp) | 1,540,838 |
| No. of contigs clustered | 129 |
| No. of contigs ordered and oriented | 128ᵃ |
| No. of contigs in trunks | 86ᵇ |
| No. of pseudochromosomes | 32 |
| No. of nuclear chromosomes | 30 |
| No. of chloroplast chromosomes | 1 |
| No. of mitochondrial chromosomes | 1 |
| Total length of pseudochromosomes (bp) | 29,303,273 |
| Max. length of pseudochromosomes (bp) | 1,670,642 |
| No. of protein-coding genes | 7,330 |
| Average length of protein-coding genes (bp)ᶜ | 2,084 |
| Repeat sequences (% of length) | 19.54 |
| Non-protein-coding genes (% of length) | 0.0674 |
| Average no. of exons per gene | 2.87 |
| Average length of exons (bp) | 483.7 |
| Average length of introns (bp) | 370.9 |

ᵃ One contig is a singleton, which is not related to any other contig
ᵇ A trunk contains more than three contigs
ᶜ A portion of these genes are annotated as hypothetical
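
The figures in the table above can be cross-checked with a little arithmetic. The sketch below is only an illustration and is not part of the original analysis. It assumes that a gene with k exons has k − 1 introns, that genes do not overlap, and that the whole assembly (organelles included) is the denominator. All variable names are ours.

```python
# Values copied from the genome characteristics table.
genome_bp = 29_303_273
n_genes = 7_330
mean_gene_bp = 2_084
exons_per_gene = 2.87
mean_exon_bp = 483.7
mean_intron_bp = 370.9

# Assumption: k exons imply k - 1 introns, so the mean gene length can be
# approximated from the mean exon and intron statistics. This is a rough
# check only, because a mean of products is not a product of means.
approx_gene_bp = (exons_per_gene * mean_exon_bp
                  + (exons_per_gene - 1) * mean_intron_bp)

# Gene density over the whole assembly (assumption: organelles included).
genes_per_mb = n_genes / (genome_bp / 1e6)

# Fraction of the assembly covered by protein-coding genes
# (assumption: genes do not overlap).
genic_fraction = n_genes * mean_gene_bp / genome_bp

print(f"gene length from exon/intron means: {approx_gene_bp:.0f} bp")  # 2082 bp
print(f"gene density: {genes_per_mb:.0f} genes per Mb")                # 250
print(f"genic fraction of the assembly: {genic_fraction:.1%}")         # 52.1%
```

Under these assumptions the exon and intron averages reproduce the reported mean gene length (2,084 bp) to within about 2 bp, and protein-coding genes cover roughly half of the assembly.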
Fig. 1 Protein-encoding gene models on each pseudochromosome of N. oceanica. Mitochondrial and chloroplast pseudochromosomes are presented linearly, but they can be circularized, as their two ends contain repeated sequence assemblies. a The pseudochromosomes, with gene models marked by vertical bars. The green bars represent gene models on the sense strand, and the yellow ones those on the anti-sense strand of DNA. b The heatmap showing the interaction between the 30 nuclear pseudochromosomes. c The Venn diagram of functional genes annotated against the NT, NR, BLASTX, and BLASTP databases. d The similarity between nuclear pseudochromosomes. Low homozygosity is found among these pseudochromosomes, consistent with genome reduction, one of the characteristics of N. oceanica
Fig. 2 Phylogeny of Nannochloropsis species and their relatives in different taxa deduced from the nuclear genome (a), the chloroplast genome (b), the 18S ribosomal RNA gene (18S rDNA, c), and the ribulose bisphosphate carboxylase large chain gene (rbcL, d). a The phylogenetic tree of species inferred with OrthoFinder using the whole-genome protein sequences of each species. b The phylogenetic tree of species inferred with OrthoFinder using the chloroplast genome sequences of each species. c The phylogenetic tree for 18S rDNA. The tree shows the consensus topology inferred by Bayesian analysis using alignments of 18S genes from NCBI. The scale bar indicates the nucleotide substitutions per site. This consensus topology was derived from 512 trees, lnL = 22033.73. d The phylogenetic tree for the rbcL protein. The tree shows the consensus topology inferred by Bayesian analysis using alignments of rbcL proteins from NCBI. The scale bar represents 0.1 amino acid substitutions per site. In total, 439 aligned amino acid sites were analyzed. This consensus topology was derived from 726 trees, α = 0.47 (0.41 < α < 0.56), pI = 0.0019 (0.0000007 < pI < 0.0059), and lnL = 10364.8
Fig. 3 Homologous protein-coding genes of N. oceanica found among a few representative microalgal species. a The percentages of homologous genes of P. tricornutum, T. pseudonana, A. anophagefferens, N. oceanica, and N. gaditana found in the nuclear genomes of E. huxleyi, G. theta, C. merolae, and G. sulphuraria. b The homologous protein-coding genes of N. oceanica found in A. anophagefferens but not in P. tricornutum, in P. tricornutum but not in A. anophagefferens, and in both of them. c The homologous genes of N. oceanica found in both A. anophagefferens and P. tricornutum (2478 in total), shown by number and percentage, can be further partitioned into three patterns (I through III) and multi-copy genes. Pattern I contains the genes phylogenetically near to those of A. anophagefferens, while patterns II and III contain the genes phylogenetically far from those of A. anophagefferens
The identification of host ERAD (endoplasmic reticulum-associated protein degradation) and symbiont SELMA (symbiont-derived ERAD-like machinery) components in N. oceanica (NO), N. gaditana (NG), P. tricornutum (PT), C. merolae (CM), and G. sulphuraria (GS)
| Function | Component | NOᵃ H | NOᵃ S? | NGᵃ H | NGᵃ S? | PTᵃ H | PTᵃ S | CMᵃ | GSᵃ |
|---|---|---|---|---|---|---|---|---|---|
| ER translocation | Sec61 | X | − | X | − | X | − | X | X |
| Derlin protein | Dfm1, hDer1-1 (sDer1-1) | X | Xˢᵖ | X | X | X | X | X | X |
| | Der1, hDer1-2 (sDer1-2) | X | − | X | − | X | X | X | X |
| Ubiquitinylation | Hrd1/Der3 (ptE3p) | X | − | X | − | X | X | X | X |
| | Hrd3 | − | − | − | − | X | − | X | X |
| | Uba1 (sUba1) | X | X | X | X | X | X | X | X |
| | Doa10 | X | − | X | − | X | − | X | X |
| | Ubc6 (sUbc6) | Xˢᶠ | − | X | − | X | X | X | X |
| | Polyubiquitin (sUbi) | X | − | X | − | X | X | − | − |
| Cdc48 complex | Cdc48 (sCdc48-1) | X | − | X | − | X | X | X | X |
| | (sCdc48-2) | − | − | − | − | − | X | − | − |
| | Ufd1 (sUfd1) | X | − | X | − | X | X | X | X |
| | Npl4 (sNpl4) | X | − | X | − | X | X | X | X |
| | (sUBX) | − | − | − | − | − | X | X | X |
| | (sPUB) | − | − | − | − | − | X | − | − |
| Processing | Png1 (sPng1) | X | X | X | X | X | X | X | X |
| | Dsk2 (sUbq) | X | X | − | X | X | X | X | X |
| | Rad23 | X | − | X | − | X | − | X | X |
| | Ufd2 | X | − | X | − | X | − | X | X |
| | (ptDUP) | − | − | − | − | − | X | − | − |
| Unknown function | (PPP1) | X | − | X | − | − | X | X | X |
| Chaperones | Hsp70 (sHsp70) | X | Xˢᵖ | X | X | X | X | X | X |
| | Hsp40/Ydj1/MAS5 | X | − | X | − | X | − | X | X |
| | (sDPC) | − | − | − | − | − | X | − | − |

The genes of N. oceanica were searched against the genome obtained in this study, while those of N. gaditana were searched against the genome published earlier[19] (e-value < 1e−5 and annotated as the genes searched). Names in parentheses are symbiont-specific if prefixed with s, or putatively symbiont-specific if not. X, found; −, not found; ?, not sure, as the signal peptide at the N-terminus was not found except for the two N. oceanica entries marked with superscript sp. sf, annotated under a different name but with the same function
ᵃ For these two species, host-specific (H) and symbiont-specific (S) components are judged according to the similarity between the protein query and its hit. Data cited from Stork et al.[42]
The identification of translocon components in the outer and inner membranes (OM and IM) of the plastid of N. oceanica (NO), N. gaditana (NG), P. tricornutum (PT), C. merolae (CM), and G. sulphuraria (GS)
| Location | Component | NO | NG | PT | CM | GS |
|---|---|---|---|---|---|---|
| Translocon in IM | Tic20 | − | − | X | X | X |
| | Tic22 | X | X | X | X | X |
| | Tic55 | X | X | X | − | − |
| | Tic110 | X | − | X | X | X |
| Translocon in OM | Toc75 | − | − | − | X | X |
| | Omp85 | X | X | X | X | X |
| | Toc64 | X | X | X | X | X |
| | Toc34 | X | X | X | X | X |

The genes of N. oceanica were searched against the genome obtained in this study, while those of the remaining species were searched against the genomes published earlier. X, found; −, not found
The numbers of genes in eight categories of flagellum-associated genes in two Nannochloropsis species, compared with the average numbers in protists determined earlier[35]
| Gene category | Gene group | Motile protists | Motile small protists | Nonmotile protists | N. oceanica | N. gaditana |
|---|---|---|---|---|---|---|
| Structural gene | Tubulin | 4 | 4 | 4 | 2 | 2 |
| | Radial spoke | 15 | 15 | 8 | 2 | 1 |
| | Central pair | 11 | 10 | 8 | 6 | 4 |
| | ODA | 19 | 19 | 11 | 6 | 4 |
| | IDA | 21 | 21 | 16 | 4 | 2 |
| Functional gene | IFT-A complex | 6 | 6 | 1 | 2 | 0 |
| | IFT-B complex | 21 | 20 | 6 | 7 | 4 |
| | BBSome | 8 | 8 | 2 | 0 | 0 |
| Total | | 104 | 104 | 55 | 29 | 17 |

Nonmotile species include stramenopiles, haptophytes, diatoms, and green algae; motile species are all dinoflagellates; motile small species include cryptophytes, euglenophytes, stramenopiles, and haptophytes. The genes of N. oceanica were searched against the genome obtained in this study, while those of N. gaditana were searched against the genome published earlier[19]. Searches used an e-value < 1e−5 as the threshold for expected hits; hits were filtered to retain full-length candidates, and a candidate was accepted as the expected protein if it showed the highest similarity (usually with an e-value far below the search threshold) and was also annotated as the expected protein. ODA, outer dynein arm; IDA, inner dynein arm
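
The acceptance rule in the note above (e-value threshold, full-length filter, best hit, matching annotation) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline. The `Hit` record, the 0.9 coverage cut-off standing in for "full length", the use of bit score as the similarity measure, the reading of the threshold as 1e-5, and the toy hits are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    query: str        # expected protein used as the query
    subject: str      # gene model hit in the searched genome
    evalue: float
    query_cov: float  # fraction of the query covered by the alignment
    bitscore: float   # assumed proxy for "highest similarity"
    annotation: str   # functional annotation of the subject gene

E_VALUE_MAX = 1e-5    # assumed reading of the threshold in the note
MIN_QUERY_COV = 0.9   # hypothetical stand-in for "full length"

def accept_hit(hits, expected_name):
    """Return the accepted hit for one query, or None if none qualifies."""
    # Step 1 and 2: keep significant, near full-length candidates.
    candidates = [h for h in hits
                  if h.evalue < E_VALUE_MAX and h.query_cov >= MIN_QUERY_COV]
    if not candidates:
        return None
    # Step 3: take the most similar candidate.
    best = max(candidates, key=lambda h: h.bitscore)
    # Step 4: accept only if it is also annotated as the expected protein.
    if expected_name.lower() in best.annotation.lower():
        return best
    return None

# Toy hits, invented purely for illustration.
toy_hits = [
    Hit("IFT88", "gene_0001", 1e-80, 0.97, 310.0, "intraflagellar transport protein IFT88"),
    Hit("IFT88", "gene_0002", 1e-7, 0.95, 60.0, "TPR repeat protein"),
    Hit("IFT88", "gene_0003", 1e-3, 0.99, 35.0, "IFT88-like"),
]

best = accept_hit(toy_hits, "IFT88")
print(best.subject if best else "not found")  # gene_0001
```

A gene category count such as those in the table would then be the number of queries in that category for which `accept_hit` returns a hit.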
Identification of known meiosis-specific genes used as a meiosis detection toolkit
| Species | Meiosis-specific genesᵃ | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| | X | X | X | X | X | X | X | X |
| | X | X | X | X | X | X | X | X |
| | X | X | X | X | X | X | X | X |
| | X | X | X | X | X | X | X | X |
| | X | X | X | X | X | X | X | X |
| | X | X | − | − | X | − | X | X |
| | X | X | − | X | − | X | X | X |

ᵃ Of nine meiosis-specific genes[51, 52], Rec8 was not identified in the two species of the genus Nannochloropsis. A homolog of Rec8 was found in the genome of N. oceanica; however, it is phylogenetically similar to Rad21, a meiosis-associated gene. We consider a meiosis-specific gene to be truly absent, and not replaced by an unidentified gene, if it exists in a wide taxonomic range of species, which here includes, for example, a yeast, two green algae, a brown alga, and a haptophyte. These genes were identified by searching the genomes, either downloaded from public databanks or sequenced in this study (N. oceanica), with BLASTp or tBLASTn, and by phylogenetic analysis
Fig. 4 The hypothetical route of N. oceanica speciation through cellular fusion and nuclear haploidization. The symbiont nucleus may have overthrown the host nucleus during speciation of Nannochloropsis. The cell fusion mode of speciation requires no evolution of new structures or functions, which should facilitate rapid speciation of Nannochloropsis. Instead of floating freely, the plastid residing in the host cytoplasm may be enveloped by either one or two layers of endoplasmic reticulum, forming a continuum with the nucleus and dividing simultaneously with it