Jon Bråte, Ralf S Neumann, Bastian Fromm, Arthur A B Haraldsen, James E Tarver, Hiroshi Suga, Philip C J Donoghue, Kevin J Peterson, Iñaki Ruiz-Trillo, Paul E Grini, Kamran Shalchian-Tabrizi.
Abstract
The emergence of multicellular animals was associated with an increase in phenotypic complexity and with the acquisition of spatial cell differentiation and embryonic development. Paradoxically, this phenotypic transition was not paralleled by major changes in the underlying developmental toolkit and regulatory networks. In fact, most of these systems are ancient, established already in the unicellular ancestors of animals [1-5]. In contrast, the Microprocessor protein machinery, which is essential for microRNA (miRNA) biogenesis in animals, as well as the miRNA genes whose transcripts the Microprocessor processes, have not been identified outside of the animal kingdom [6]. Hence, the Microprocessor, with the key proteins Pasha and Drosha, is regarded as an animal innovation [7-9]. Here, we challenge this evolutionary scenario by investigating unicellular sister lineages of animals through genomic and transcriptomic analyses. We identify in Ichthyosporea both Drosha and Pasha (DGCR8 in vertebrates), indicating that the Microprocessor complex evolved long before the last common ancestor of animals, consistent with a pre-metazoan origin of most of the animal developmental gene elements. Through small RNA sequencing, we also discovered expressed bona fide miRNA genes in several species of the ichthyosporeans harboring the Microprocessor. A deep, pre-metazoan origin of the Microprocessor and miRNAs is consistent with the view that the origin of multicellular animals was not directly linked to the innovation of these key regulatory components.
Keywords: DGCR8; Drosha; Holozoa; Ichthyosporea; Pasha; Sphaeroforma; evolution; miRNA; microRNA; microprocessor
Year: 2018 PMID: 30318349 PMCID: PMC6206976 DOI: 10.1016/j.cub.2018.08.018
Source DB: PubMed Journal: Curr Biol ISSN: 0960-9822 Impact factor: 10.834
Figure 1. The Evolution of the Animal miRNA Biogenesis Pathway across Holozoa
(A) Schematic drawing of the canonical miRNA pathway in animals. Key proteins are indicated inside rectangles.
(B) Phylogenetic tree of Holozoa with Fungi and Amoebozoa as outgroups. Green branches on the tree indicate the hypothesized origin and evolutionary trajectory of the Microprocessor components (Drosha and Pasha), and black branches indicate the absence of Microprocessor components. Open circles indicate loss of both Microprocessor components. Taxa highlighted in red have been sequenced for small RNAs in this study.
(C) The presence of miRNAs and of genes involved in miRNA biogenesis and function is indicated by filled circles, and absence is indicated by empty circles. For Dicer, filled circles mean that two or more Dicers were discovered, and half-filled circles mean that a single Dicer was identified. Taxa with no circles for miRNAs are those for which small RNAs have not been sequenced.
See also Tables S2 and S3.
Figure 2. Comparison between the Domain Composition of the Human and Ichthyosporean miRNA Biogenesis Machinery
The domain composition of the ichthyosporean Dicer, Drosha, and Pasha sequences discovered in the reciprocal BLAST searches was compared against their human counterparts (Dicer [DICER1; Q9UPY3], Drosha [Q9NRR4], and Pasha [DGCR8; Q8WYQ5]), as annotated in InterPro [23]. The sequences identified in the reciprocal BLAST searches were annotated using InterProScan5 [23] and CD-Search [24] and by comparing sequence alignments and secondary structures (see STAR Methods). All domains were identified by both InterProScan and CD-Search annotation programs except the following: Dicer dimerization domain of A. parasiticum; WW domains of A. parasiticum, A. whisleri, and P. gemmata; dsRBD domains of S. sirkka Drosha and of Pasha in P. gemmata, S. arctica, and C. fragrantissima (C-terminal domain only), which were identified by CD-Search only; and the N-terminal dsRBD domains of Pasha in P. gemmata and C. fragrantissima, which were identified by InterProScan only. The RNase III-A domains of S. sirkka, S. arctica, and S. napiecek Dicer were identified using an alignment and structural modeling approach as described in STAR Methods. In addition, the single RNaseIII domain of S. napiecek Drosha was only identified by InterProScan. Incomplete domains are indicated by a jagged border. All boxes and lines are drawn to scale according to their InterProScan annotation (in cases where InterProScan did not identify a domain, the size was chosen based on the homologous domain from a closely related sequence). Except for C. fragrantissima, all genes are from de novo assembled transcriptome data; hence, the many short contigs and aberrant domains are likely due to incomplete assemblies.
See also Figure S1.
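The reciprocal BLAST searches mentioned in this legend can be illustrated with a minimal sketch of a reciprocal-best-hit filter. This is an assumption about the general technique, not the authors' exact procedure (which is described in STAR Methods); the sequence IDs, scores, and function names are hypothetical.

```python
from typing import Dict, List, Tuple

# A hit is (query_id, subject_id, bit_score). All IDs below are hypothetical.
Hit = Tuple[str, str, float]


def best_hits(hits: List[Hit]) -> Dict[str, str]:
    """Return the highest-scoring subject for each query."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _score) in best.items()}


def reciprocal_best_hits(forward: List[Hit], reverse: List[Hit]) -> List[Tuple[str, str]]:
    """Keep (query, subject) pairs that are each other's best hit."""
    fwd = best_hits(forward)   # e.g. human proteins vs. a transcriptome
    rev = best_hits(reverse)   # e.g. transcriptome contigs vs. a protein database
    return sorted((q, s) for q, s in fwd.items() if rev.get(s) == q)


# Hypothetical example: only the Drosha pair is reciprocal.
forward_hits = [
    ("human_DROSHA", "contig_1", 250.0),
    ("human_DROSHA", "contig_2", 90.0),
    ("human_DGCR8", "contig_3", 120.0),
]
reverse_hits = [
    ("contig_1", "human_DROSHA", 240.0),
    ("contig_3", "human_DICER1", 80.0),
]
pairs = reciprocal_best_hits(forward_hits, reverse_hits)
```

In this sketch a candidate is retained only if the back-search returns the original query as its top hit; domain annotation, as described in the legend, would then be a separate confirmation step.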
Figure 3. Identification of Ichthyosporean Drosha and Pasha Sequences
(A) The modeled protein structure of the Drosha homolog identified in the ichthyosporean Abeoforma whisleri. Indicated in red is the unique Drosha insertion, including the so-called “Bump” helix [25]. Modeled structures of other identified ichthyosporean Drosha genes are shown in Figure S1.
(B) Phylogeny of Dicer and Drosha sequences. Drosha sequences are indicated in the orange box; all other sequences are Dicer. The topology with the highest likelihood in a maximum-likelihood (ML) framework is shown, with ML bootstrap and Bayesian posterior probability (BP) nodal support values drawn onto the branching points (ML/BP). Only support values above 50% ML and/or 0.75 BP are shown. Accession numbers are given in parentheses. For all taxa, accession numbers refer to the UniProt database, except for Trichoplax adhaerens and Amphimedon queenslandica, which are from NCBI RefSeq (the A. queenslandica Dicer D sequence is taken from [26]), and Sycon ciliatum, which is from http://www.compagen.org. Ichthyosporean species are indicated in bold font. A Drosha ortholog was also detected in S. napiecek, but this sequence was incompletely assembled and did not cover the RNase III domains and was, therefore, not included in the analysis.
(C) Ichthyosporean sequences identified as Pasha in the reciprocal BLAST searches (bold font) analyzed together with double-stranded RNA binding motif (DSRM)-containing sequences from the Pfam database (see STAR Methods for details). All Pasha sequences are indicated in a purple box. HYL1 homologs are marked with an asterisk. UniProt accession numbers are given in parentheses (except for Amphimedon queenslandica, for which the NCBI RefSeq accession number is given). Tree topology and support values were created in the same way as for the phylogeny in (B).
See also Figure S1 and Table S3.
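The display rule for nodal support in (B) and (C), where only values above 50% ML bootstrap and/or 0.75 posterior probability are shown in ML/BP order, can be sketched as a small labeling function. The function name and the use of "-" for a value that falls below its threshold are assumptions made for illustration.

```python
from typing import Optional


def support_label(ml_bootstrap: Optional[float],
                  posterior: Optional[float],
                  ml_min: float = 50.0,
                  bp_min: float = 0.75) -> str:
    """Format a node label as 'ML/BP', hiding values at or below the thresholds.

    Returns an empty string when neither value exceeds its threshold.
    The '-' placeholder for a hidden value is an assumption.
    """
    show_ml = ml_bootstrap is not None and ml_bootstrap > ml_min
    show_bp = posterior is not None and posterior > bp_min
    if not (show_ml or show_bp):
        return ""
    ml_text = f"{ml_bootstrap:.0f}" if show_ml else "-"
    bp_text = f"{posterior:.2f}" if show_bp else "-"
    return f"{ml_text}/{bp_text}"
```

For example, a node with 42% bootstrap support and a posterior probability of 0.90 would be labeled with the posterior value only, and a node with neither value above its threshold would carry no label.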
Figure 4. Secondary Structure and Small RNA Coverage of a Novel Ichthyosporean miRNA
(A) The secondary structure of the novel miRNA Sar-Mir-Nov-1 identified in Sphaeroforma arctica with the likely Drosha and Dicer cut sites indicated. Mature and star strands are indicated in red and blue, respectively, and magenta indicates the presence of offset reads resulting from the Drosha cuts.
(B) The mapping of small RNA reads on the genomic location of Sar-Mir-Nov-1. The numbers on the x axis correspond to the numbers in the secondary structure in (A). Note the presence of offset reads (external reads mapping outside the pre-miRNA) that are in accordance with Drosha processing. See Figure S2 for more miRNA structures.
See also Figures S2–S4 and Table S1.
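A read-coverage profile like the one in (B) can be derived from mapped read coordinates. The following is a minimal sketch, assuming reads are given as 1-based inclusive (start, end) positions on the hairpin; the coordinates in the example are invented and do not come from the study's data.

```python
from typing import List, Tuple


def coverage(reads: List[Tuple[int, int]], length: int) -> List[int]:
    """Per-position read depth over positions 1..length.

    Reads are 1-based inclusive (start, end) intervals; parts of a read
    that fall outside 1..length are clipped.
    """
    diff = [0] * (length + 2)          # difference array, index 0 unused
    for start, end in reads:
        start = max(start, 1)
        end = min(end, length)
        if start > end:
            continue                   # read lies entirely outside the region
        diff[start] += 1
        diff[end + 1] -= 1
    depth: List[int] = []
    running = 0
    for pos in range(1, length + 1):
        running += diff[pos]
        depth.append(running)
    return depth


# Invented example: one read at positions 1-3 and two identical reads at 2-5.
example_depth = coverage([(1, 3), (2, 5), (2, 5)], 6)
```

Plotting such a depth vector against the position numbering of the secondary structure would show where mature, star, and offset reads accumulate.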
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Marine Broth | Difco | Cat# 279110 |
| Trizol | Life-Technologies | Cat# 15596026 |
| Illumina Truseq small RNA seq kit | Illumina | NA |
| mirPremier microRNA Isolation Kit | Sigma-Aldrich | SNC50 |
| Terminator 5′-Phosphate-Dependent Exonuclease | Epicenter | NA |
| Tobacco Acid Pyrophosphatase | Epicenter | T19050 |
| Unprocessed small RNA and mRNA reads, and novel gene sequences used in this study. | This paper | ENA: PRJEB21207 |
| | Iñaki Ruiz-Trillo’s lab. Original reference [ | Strain JP610 |
| | Brandon Hassett [ | Strain B5 |
| | Brandon Hassett [ | Strain B4 |
| | ATCC nr. 30864 | N/A |
| | Iñaki Ruiz-Trillo’s lab (available from ATCC nr. PRA-284) | N/A |
| Trimmomatic v0.35 | [ | |
| Trinity v2.0.6 | [ | |
| Transdecoder v3.0.0 | [ | |
| Cufflinks v2.1.1 | [ | |
| Blastp | [ | |
| InterProScan | [ | |
| CD-search | [ | |
| Geneious R9 | [ | |
| Mafft v.7 | [ | |
| Phyre2 web server | [ | |
| PhyloBayes-MPI v1.5 | [ | |
| RAxML v8.0.26 | [ | |
| TopHat v2.0.14 | [ | |
| Blat v3.5 | [ | |
| | NCBI Genome | Adig_1.1. ID: 10529 |
| | NCBI Genome | ASM20922v1. ID: 230 |
| | NCBI Genome | v1.0. ID: 354 |
| | NCBI Genome | v1.0. ID: 2698 |
| | SCIL_WGA_130802 | |
| | NHGRI | |
| | Neurobase | |
| | NCBI SRA | SRX956664 |
| | Data Commons | N/A |
| | NCBI Genome | v1.0. ID: 713 |
| | NCBI SRA | SRX956675 |
| | NCBI Genome | Proterospongia_sp_ATCC50818. ID: 24391 |
| | Figshare | v03 |
| | NCBI SRA | SRX096927, SRX096925 |
| | NCBI SRA | SRX377508 |
| | NCBI SRA | SRX179384, SRX096923, SRX096918 |
| | Figshare | |
| | NCBI SRA | SRX738222 |
| | NCBI SRA | SRX377507 |
| | NCBI Genome, this study | Spha_arctica_JP610_V1. ID: 11004 |
| | NCBI SRA | SRX737879 |
| | NCBI SRA | SRX738098, SRX732498 |
| | NCBI Genome | dicty_2.7. ID: 56 |
| | NCBI Genome | Font_alba_ATCC_38817_V2. ID: 12936 |
| | NCBI SRA | SRX737107 |
| | NCBI Genome | A_macrogynus_V3. ID: 327 |
| | NCBI Genome | Mort_vert_NRRL_6337_V1. ID: 801 |
| | NCBI Genome | Rozella_k41_t100. ID: 12422 |
| | NCBI Genome | S_punctatus_V1. ID: 344 |