Jaeyoung Choi, Ki-Tae Kim, Jongbum Jeon, Jiayao Wu, Hyeunjeong Song, Fred O Asiegbu, Yong-Hwan Lee.
Abstract
BACKGROUND: RNA interference (RNAi) is involved in genome defense as well as diverse cellular, developmental, and physiological processes. Key components of RNAi are Argonaute, Dicer, and RNA-dependent RNA polymerase (RdRP), which have been functionally characterized mainly in model organisms. These components are believed to exist throughout eukaryotes; however, there is no systematic platform for archiving and dissecting these important gene families. In addition, few fungi have been studied to date, limiting our understanding of RNAi in fungi. Here we present funRNA (http://funrna.riceblast.snu.ac.kr/), a fungal kingdom-wide comparative genomics platform for putative genes encoding Argonaute, Dicer, and RdRP.
DESCRIPTION: To identify and archive genes encoding these key components, protein domain profiles were determined from reference sequences obtained from UniProtKB/SwissProt. Fungal, metazoan, and plant genomes, as well as bacterial and archaeal genomes, were then searched with the domain profiles. The search predicted 1,163, 442, and 678 genes encoding Argonaute, Dicer, and RdRP, respectively. Based on the identification results, active site variation of Argonaute, diversification of Dicer, and sequence analysis of RdRP are discussed with a focus on fungi. funRNA provides results from diverse bioinformatics programs and job submission forms for BLAST, BLASTMatrix, and ClustalW. Furthermore, sequence collections created in funRNA are synchronized with several gene family analysis portals and databases, offering further analysis opportunities.
Year: 2014 PMID: 25522231 PMCID: PMC4290597 DOI: 10.1186/1471-2164-15-S9-S14
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Identification pipeline for funRNA. The identification pipeline consists of two steps: i) defining domain profiles from the proteins encoded by the reference sequences; and ii) scanning 1,440 proteomes with the domain profiles for Argonaute, Dicer, and RdRP. In "Domain analysis", colored boxes indicate essential domains: blue, IPR003100 (Argonaute/Dicer protein, PAZ); red, IPR003165 (Stem cell self-renewal protein Piwi); purple, IPR005034 (Dicer double-stranded RNA-binding fold); green, IPR000999 (Ribonuclease III); orange, IPR001159 (Double-stranded RNA-binding); and gray, IPR007855 (RNA-dependent RNA polymerase, eukaryotic type). In addition, sequences collected from funRNA can be subjected to bioinformatics analysis on the funRNA website as well as in CFGP 2.0 by data exchange through the Favorite Browser.
Figure 2. Distribution of the average number of genes across the taxonomic spectrum. The average number of genes in each gene family is shown for each fungal taxon as a cumulative bar chart. The sizes of the blue, red, and green areas in a stack indicate the average number of putative genes encoding Argonaute, Dicer, and RdRP, respectively.
Summary of the average number of genes per genome across the taxonomic spectrum.
| Kingdom | Phylum | Subphylum | Argonaute* | Dicer* | RdRP* |
|---|---|---|---|---|---|
| Chromalveolata | | | 0 | 0 | 0.15 (0-2) |
| Chromista | | | 2.75 (0-6) | 0.25 (0-1) | 0.88 (0-3) |
| Fungi | Ascomycota | Pezizomycotina | 2.66 (1-7) | 1.9 (0-3) | 3.04 (2-5) |
| Fungi | Ascomycota | Saccharomycotina | 0.08 (0-1) | 0 | 0 |
| Fungi | Ascomycota | Taphrinomycotina | 0.83 (0-1) | 0.83 (0-1) | 0.83 (0-1) |
| Fungi | Basidiomycota | Agaricomycotina | 5.68 (0-10) | 2.46 (0-4) | 6.93 (0-14) |
| Fungi | Basidiomycota | Pucciniomycotina | 2.25 (2-3) | 2 (0-5) | 3.25 (1-5) |
| Fungi | Basidiomycota | Ustilaginomycotina | 0 | 0 | 0 |
| Fungi | Basidiomycota | Wallemiomycetes** | 0 | 0 | 0 |
| Fungi | Blastocladiomycota | Blastocladiomycetes** | 8 | 1 | 0 |
| Fungi | Chytridiomycota | N/D | 3.33 (2-6) | 3.00 (3) | 0.33 (0-1) |
| Fungi | Microsporidia | N/D | 0.50 (0-1) | 0.25 (0-1) | 0 |
| Fungi | Zygomycota | Mucoromycotina | 2.33 (2-3) | 1.33 (1-2) | 4.00 (3-5) |
| Metazoa | | | 9.88 (1-49) | 2.24 (0-7) | 1.27 (0-9) |
| Viridiplantae | Chlorophyta | | 0.90 (0-2) | 0.40 (0-2) | 0.30 (0-1) |
| Viridiplantae | Streptophyta | | 13.55 (6-25) | 4.86 (0-11) | 7.27 (3-14) |
| Other | | | 1.76 (0-14) | 0.29 (0-3) | 0.76 (0-4) |
* Range of number of genes is shown in parentheses.
** Class name is shown if no subphylum name is assigned.
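Each summary cell above pairs an average with a range. A minimal Python sketch of how such a cell could be derived from per-genome gene counts follows. The function name `summarize`, the two-decimal formatting, and the example counts are illustrative assumptions and are not taken from funRNA.

```python
from statistics import mean


def summarize(counts):
    """Format per-genome gene counts as 'mean (min-max)'.

    `counts` holds one integer per genome in a taxon (hypothetical input).
    """
    if not counts:
        raise ValueError("at least one genome is required")
    low, high = min(counts), max(counts)
    # Show a single number when every genome has the same count.
    span = f"{low}" if low == high else f"{low}-{high}"
    return f"{mean(counts):.2f} ({span})"


# Made-up counts for four genomes in one taxon.
print(summarize([2, 2, 3, 2]))
```

The published table does not use a uniform number of decimal places, so the formatting here is only one plausible choice.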
List of species selected for sequence analysis of RdRPs and reconciliation analysis of Argonautes.
| Species name | Taxonomy | Argonaute gene | RdRP gene |
|---|---|---|---|
| | Archaea>Euryarchaeota>N/D | 1 | 0 |
| | Archaea>Crenarchaeota>N/D | 1 | 0 |
| | Bacteria>Aquificae>N/D | 1 | 0 |
| | Fungi>Ascomycota>Pezizomycotina | 2 | 2 |
| | Fungi>Ascomycota>Pezizomycotina | 1 | 2 |
| | Fungi>Ascomycota>Pezizomycotina | 2 | 2 |
| | Fungi>Ascomycota>Pezizomycotina | 3 | 3 |
| | Fungi>Ascomycota>Pezizomycotina | 2 | 3 |
| | Fungi>Ascomycota>Pezizomycotina | 2 | 5 |
| | Fungi>Ascomycota>Pezizomycotina | 5 | 5 |
| | Fungi>Ascomycota>Pezizomycotina | 2 | 3 |
| | Fungi>Ascomycota>Pezizomycotina | 3 | 3 |
| | Fungi>Ascomycota>Pezizomycotina | 4 | 2 |
| | Fungi>Ascomycota>Pezizomycotina | 2 | 3 |
| | Fungi>Ascomycota>Pezizomycotina | 2 | 4 |
| | Fungi>Ascomycota>Saccharomycotina | 1 | 0 |
| | Fungi>Ascomycota>Taphrinomycotina | 1 | 1 |
| | Fungi>Basidiomycota>Agaricomycotina | 7 | 7 |
| | Fungi>Basidiomycota>Agaricomycotina | 6 | 6 |
| | Fungi>Basidiomycota>Agaricomycotina | 6 | 8 |
| | Fungi>Basidiomycota>Agaricomycotina | 6 | 6 |
| | Fungi>Basidiomycota>Agaricomycotina | 1 | 1 |
| | Fungi>Basidiomycota>Pucciniomycotina | 2 | 5 |
| | Fungi>Basidiomycota>Pucciniomycotina | 2 | 5 |
| | Fungi>Blastocladiomycota>N/D | 8 | 0 |
| | Fungi>Chytridiomycota>N/D | 2 | 0 |
| | Fungi>Zygomycota>Mucoromycotina | 2 | 4 |
| | Fungi>Zygomycota>Mucoromycotina | 2 | 5 |
| | Chromista>Oomycota>Oomycotina | 5 | 1 |
| | Viridiplantae>Streptophyta>N/D | 14 | 6 |
| | Viridiplantae>Streptophyta>N/D | 25 | 6 |
| | Metazoa>Arthropoda>N/D | 12 | 0 |
| | Metazoa>Nematoda>N/D | 31 | 4 |
| | Metazoa>Chordata>Craniata | 17 | 0 |
Figure 3. Duplication and loss of Argonaute genes and variation of the catalytic motif. Gene duplication and loss events were estimated by reconciliation analysis. Red and blue dots are shown at internal nodes where duplication and loss were predicted, respectively. Black dots indicate nodes where both duplication and loss were predicted. The numbers of species-level duplication and loss events, and the number of putative genes encoding Argonaute, are shown between the tree and the species names. In the rightmost column, amino acid variation of the DDH motif is shown with symbols: i) filled squares indicate that all the genes in the corresponding species had the conserved reference residue; ii) shaded squares indicate that both the conserved residue and variants exist; iii) empty squares indicate variants without the conserved residue; and iv) single-letter amino acid codes indicate a conserved residue that differs from the reference amino acid. For the complete list of partial alignments near each amino acid, see Additional file 2. Pie charts shown at the internal nodes display the distribution of DDH motif variants for each taxon specified. The proportion of genes containing the conserved DDH motif is shown in green; H substituted by D, E, or K is shown in red; H substituted by another amino acid is shown in blue; and other variants are shown in orange.
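The four pie-chart categories in the Figure 3 caption can be expressed as a small classifier. The Python sketch below assumes each gene has already been reduced to a three-letter string holding the residues aligned to the D, D, and H positions. The function names, category labels, and example motifs are hypothetical and do not reproduce the authors' analysis code.

```python
from collections import Counter


def classify_ddh(motif):
    """Assign a three-residue catalytic motif to one of four categories."""
    motif = motif.upper()
    if len(motif) != 3:
        raise ValueError("motif must contain exactly three residues")
    if motif == "DDH":
        return "conserved"
    if motif[:2] == "DD":
        # Only the third position (H in the reference) has changed.
        return "H->D/E/K" if motif[2] in "DEK" else "H->other"
    return "other"


def motif_proportions(motifs):
    """Return the fraction of genes in each category, as drawn in a pie chart."""
    counts = Counter(classify_ddh(m) for m in motifs)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


# Made-up motifs for four genes in one taxon.
print(motif_proportions(["DDH", "DDH", "DDK", "GDH"]))
```

Alignment gaps or ambiguous residues are not treated specially here; a real analysis would need a rule for them.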
Domain profile definitions used in funRNA
| Gene family | InterPro accession number | Domain description | Number of genes* | Number of genomes* |
|---|---|---|---|---|
| Argonaute | IPR003100 | Argonaute/Dicer protein, PAZ | 1,163 | 209 |
| | IPR003165 | Stem cell self-renewal protein Piwi | | |
| Dicer | IPR000999 | Ribonuclease III | 442 | 180 |
| | IPR001159** | Double-stranded RNA-binding | | |
| | IPR005034** | Dicer double-stranded RNA-binding fold | | |
| RdRP | IPR007855 | RNA-dependent RNA polymerase, eukaryotic type | 678 | 157 |
* Numbers of fungal genes/genomes are shown in parentheses.
** Genes containing one of two double-stranded RNA-binding domains were predicted to be Dicer-encoding genes.
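The domain profile table above amounts to a set of membership rules over InterPro accessions. The following Python sketch shows one way such rules could be applied to the domain annotations of a single protein. It assumes that an Argonaute call requires both listed domains and that a Dicer call requires Ribonuclease III plus at least one of the two double-stranded RNA-binding domains. The function name `classify` and the input format are illustrative and are not part of funRNA.

```python
# InterPro accessions taken from the domain profile table.
PAZ = "IPR003100"
PIWI = "IPR003165"
RNASE_III = "IPR000999"
DSRNA_BINDING = {"IPR001159", "IPR005034"}
RDRP = "IPR007855"


def classify(domains):
    """Return the gene families predicted for a protein.

    `domains` is any iterable of InterPro accessions found in the protein
    (hypothetical input, e.g. parsed from a domain scan report).
    """
    found = set(domains)
    families = []
    # Assumption: both essential Argonaute domains must be present.
    if {PAZ, PIWI} <= found:
        families.append("Argonaute")
    # Ribonuclease III plus one of the two dsRNA-binding domains.
    if RNASE_III in found and found & DSRNA_BINDING:
        families.append("Dicer")
    if RDRP in found:
        families.append("RdRP")
    return families


print(classify(["IPR000999", "IPR005034"]))
```

Returning a list rather than a single label keeps the sketch neutral about proteins that satisfy more than one rule.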
Figure 4. Functionalities of the funRNA website. A) The web interface of funRNA displays graphical charts that help visualize the distribution of genes. B) Similarity search tools (BLAST and BLASTMatrix) and a multiple sequence alignment tool (ClustalW) are provided via the Favorite Browser. C) Protein domain analysis can be conducted with the sequences collected in Favorites. D) Users' sequence collections can be further analyzed with the tools available in CFGP 2.0 and other sister databases.