Abstract
Non-coding RNAs (ncRNAs) are key genes in many human diseases, including cancer and viral infection, and they provide critical functions in pathogenic organisms such as fungi, bacteria, viruses, and protists. Until now, the identification and characterization of ncRNAs associated with disease has been slow or inaccurate, requiring many years of testing to understand complicated relationships between RNA and protein genes. High-throughput sequencing now offers the opportunity to characterize miRNAs, siRNAs, small nucleolar RNAs (snoRNAs), and long ncRNAs on a genomic scale, making it faster and easier to clarify how these ncRNAs contribute to the disease state. However, this technology is still relatively new, and ncRNA discovery is not a high-priority application for streamlined bioinformatics. Here we summarize background concepts and practical approaches for ncRNA analysis using high-throughput sequencing, and how they relate to understanding human disease. As a case study, we focus on the parasitic protists Giardia lamblia and Trichomonas vaginalis, whose large evolutionary distance from model eukaryotes has made it difficult to compare their ncRNAs with those of model organisms. A combination of biological, computational, and sequencing approaches has made it easier to classify known ncRNA classes such as snoRNAs, and has also aided the identification of novel classes. It is hoped that a better understanding of ncRNA expression and interaction may aid the development of less harsh treatments for protist-based diseases.
Keywords: Giardia; Trichomonas; high-throughput sequencing; miRNA; ncRNA; siRNA; snoRNA
Year: 2011 PMID: 22303390 PMCID: PMC3268645 DOI: 10.3389/fgene.2011.00096
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Summary of ncRNA discovery in human pathogenic protists.
| Protist | Lineage | Disease | ncRNAs+ | RNAi proteins | Functional RNAi |
|---|---|---|---|---|---|
| Giardia lamblia | Diplomonad | Giardiasis | miRNA, siRNA | Dicer-like, Argonaute, RdRp (Macrae et al., | Not proven natively with miRNA, but long dsRNA shown to control specific gene downregulation (Rivero et al., |
| Trichomonas vaginalis | Parabasalid | Trichomoniasis | miRNA, siRNA | Dicer-like, Argonaute, RdRp (Carlton et al., | Yes (Lin et al., |
| | Apicomplexa | Malaria | – | Absence of proteins with a PAZ or PIWI domain (Baum et al., | No (Baum et al., |
| | Amebozoa | Amoebic dysentery | siRNA | Argonaute, Dicer-like*, RdRp | Yes (reviewed in Zhang et al., |
| | Kinetoplastid | | siRNA, miRNA ( | Argonaute, Dicer-like (not in | Yes in |
| | Kinetoplastid | Leishmaniasis | siRNA | Argonaute, Dicer-like | Some species (reviewed in Lye et al., |
+These protists all contain RNase P RNA, RNase MRP RNA, tRNAs, rRNAs, snRNAs, and snoRNAs; *atypical protein structure.
Figure 1. Genomic approaches to ncRNA identification. Both the traditional laboratory approach and the more recent high-throughput sequencing approach begin with the isolation of total RNA from culture, followed by size selection of the RNA by excising a given band from a polyacrylamide gel. Under the traditional approach, the excised RNA is cloned and then sequenced by Sanger sequencing to obtain candidate miRNA sequences. With high-throughput sequencing, the RNA is sequenced directly without cloning, and bioinformatics is used to select the best candidates. Computational approaches do not begin with biological samples but instead use mathematical models based on known ncRNAs to search an already sequenced genome. Both the traditional and computational approaches require that candidate gene expression be confirmed by additional laboratory work. Key: Molecular biology stages are represented by the flask icon. All other stages use RNA genomic and bioinformatics procedures.
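The computational route in Figure 1 can be illustrated with a deliberately simplified sketch. It builds a position weight matrix (PWM) from a few aligned "known" sequences and slides it along a genome to flag candidate sites. This is an assumption-laden stand-in: dedicated ncRNA search tools typically use richer models that also capture secondary structure, and the sequences, the threshold, and the function names `build_pwm` and `scan` are invented here for illustration only.

```python
import math

ALPHABET = "ACGU"

def build_pwm(aligned, pseudocount=1.0):
    """Build a log-odds PWM from equal-length RNA sequences (uniform background)."""
    pwm = []
    for i in range(len(aligned[0])):
        counts = {base: pseudocount for base in ALPHABET}
        for seq in aligned:
            counts[seq[i]] += 1
        total = sum(counts.values())
        # Log2 odds of each base at this position versus a 0.25 background.
        pwm.append({b: math.log2((counts[b] / total) / 0.25) for b in ALPHABET})
    return pwm

def scan(genome, pwm, threshold):
    """Return (start, score) for every window scoring at or above the threshold."""
    genome = genome.upper().replace("T", "U")  # treat DNA input as RNA
    width = len(pwm)
    hits = []
    for start in range(len(genome) - width + 1):
        window = genome[start:start + width]
        if any(base not in ALPHABET for base in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits

# Toy, made-up training motifs and genome; not data from the paper.
known = ["UGAUGA", "UGAUGA", "UGAUGU", "AGAUGA"]
pwm = build_pwm(known)
hits = scan("CCGTTTGATGACCGGA", pwm, threshold=6.0)
for start, score in hits:
    print(start, round(score, 2))
```

As the caption notes, any hit from a scan like this is only a candidate; expression still has to be confirmed in the laboratory.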
Figure 2. A general approach for using high-throughput sequencing data to search for small and larger ncRNAs. Total RNA can contain ncRNA sequences of different sizes. Small ncRNAs such as miRNAs and siRNAs will produce data containing the ncRNA sequence plus adaptor sequence. Once the adaptor is trimmed off, the sequence is ready for mapping and further genomics. When longer ncRNAs are sequenced, they are fragmented to a predetermined size by the sequencing process. After sequencing they can be assembled by mapping (see text) into their longer sequences. This approach was successfully used for the analysis of ncRNAs from Giardia lamblia (see text for details).
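The small-RNA branch of Figure 2 (read, then adaptor trimming, then a sequence ready for mapping) might look roughly like the sketch below. It is a minimal illustration under stated assumptions rather than the pipeline used in the study: the adaptor sequence, the 18 to 30 nt size window, the exact-match rule, and the helper names `trim_adaptor` and `collapse_reads` are all hypothetical, and production trimmers also tolerate sequencing errors.

```python
from collections import Counter

ADAPTOR = "TCGTATGCCG"  # hypothetical 3' adaptor, for illustration only

def trim_adaptor(read, adaptor=ADAPTOR, min_overlap=5):
    """Strip a 3' adaptor (exact match); return None if no adaptor is found."""
    idx = read.find(adaptor)
    if idx != -1:
        return read[:idx]  # full adaptor present inside the read
    # The adaptor may be cut short at the read end: look for the longest
    # adaptor prefix (at least min_overlap bases) that is a suffix of the read.
    for k in range(min(len(adaptor) - 1, len(read)), min_overlap - 1, -1):
        if read.endswith(adaptor[:k]):
            return read[:-k]
    return None

def collapse_reads(reads, min_len=18, max_len=30):
    """Trim reads, keep inserts in the small-RNA size range, count duplicates."""
    counts = Counter()
    for read in reads:
        insert = trim_adaptor(read)
        if insert is not None and min_len <= len(insert) <= max_len:
            counts[insert] += 1
    return counts

# Made-up reads: full adaptor, partial adaptor, too-short insert, no adaptor.
insert = "ACCTGATTGACCTTAGGCATGA"
reads = [
    insert + ADAPTOR,
    insert + ADAPTOR[:6],
    "GGGCCC" + ADAPTOR,
    "ACGTTGCAACGTTGCAACGTTGCA",
]
counts = collapse_reads(reads)
for seq, n in counts.items():
    print(seq, n)
```

Collapsing identical inserts before mapping keeps a read count per unique sequence, so each sequence needs to be aligned to the genome only once.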
Figure 3. ncRNA processes characterized in Giardia. Giardia has two nuclei, which appear identical and replicate at the same time, and all of the processes within Transcription, RNP Biogenesis, Transcriptional regulation, and Core RNA processing could be expected to be found in both nuclei. RNAi may also be important in viral defense, since some siRNAs found in high-throughput sequencing do map to the stable Giardiavirus (unpublished results). It is not yet known whether Giardia has RNA storage granules, but their presence in Trypanosoma does not preclude them here. A similar diagram can be visualized for Trichomonas, except that it has only one nucleus.