Nagesh Srikakulam1, Ganapathi Sridevi2, Gopal Pandi1.
Abstract
The Reference Transcriptomic Dataset (RTD) is an accurate and comprehensive collection of transcripts originating from a given organism. It holds the key to precise transcript quantification and to downstream analysis of differential expression and regulation. Currently, transcriptome annotations for most crop plants are far from complete. For example, Oryza sativa indica (O. sativa indica) is reported to have 40,759 transcripts in the Ensembl database, with no alternative transcript isoforms or alternative splicing (AS) events. To generate a high-quality RTD, we conducted RNA sequencing of rice leaf samples collected at various time points during Rhizoctonia solani infection. The reads were analyzed with a recently developed computational pipeline to assemble an RTD with increased transcript and AS diversity for O. sativa indica (IndicaRTD). After stringent quality filtering, the newly constructed transcriptome annotation comprised 122,968 non-redundant transcripts from 53,695 genes. Compared with the data deposited in Ensembl, this study identified many novel transcripts that are important for regulating molecular and physiological processes in the plant. The assembled IndicaRTD should allow fast quantification of transcript and gene expression with high precision.
Keywords: RNA sequencing; Rhizoctonia solani; alternative splicing; reference transcriptome data; rice plant
Year: 2022 PMID: 36246658 PMCID: PMC9558114 DOI: 10.3389/fgene.2022.995072
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.772
STAR first- and second-pass mapping splice junction (SJ) statistics. Clean RNA-seq reads were mapped to the Oryza sativa indica genome with the STAR aligner, which reported the candidate SJs. The total number of unique SJs is given, counting junctions crossed by uniquely mapped or multi-mapped reads. The numbers of canonical and of annotated SJs crossed by at least 1 uniquely mapped read are also given.
| STAR alignment splice junctions (SJs) statistics | ||
|---|---|---|
| Feature | First-pass mapping with 2 mismatches | Second-pass mapping with 0 mismatches |
| Unique no. of novel SJs (Uniquely + multi-mapped reads) | 214,063 | 2,031 |
| Unique no. of annotated SJs (Uniquely + multi-mapped reads) | 103,338 | 304,758 |
| Unique no. of novel canonical SJs with at least 1 uniquely mapped read count | 213,128 | 1,862 |
| Unique no. of annotated canonical SJs with at least 1 uniquely mapped read count | 101,766 | 300,875 |
| Total unique no. of canonical SJs with at least 1 uniquely mapped read count | 314,894 | 302,737 |
| Total no. of unique SJs (Uniquely + multi-mapped reads) | 317,401 | 306,789 |
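
Counts of this kind can be tallied from STAR's `SJ.out.tab` output, whose tab-separated columns include the intron motif (column 5, where 0 means non-canonical), the annotation flag (column 6), and the number of uniquely mapped reads crossing the junction (column 7). The following is a minimal sketch under the assumption that the rows of the table correspond to these columns; the function name `summarize_sj` and the toy junctions are illustrative and are not taken from the study.

```python
import csv
import io
from collections import Counter


def summarize_sj(handle):
    """Tally junction classes from a STAR SJ.out.tab-style stream."""
    counts = Counter()
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        motif = int(row[4])          # 0 = non-canonical, 1-6 = canonical motifs
        annotated = row[5] == "1"    # 1 = junction present in the annotation
        unique_reads = int(row[6])   # uniquely mapped reads crossing the junction
        status = "annotated" if annotated else "novel"
        counts[f"{status}_all"] += 1
        if motif != 0 and unique_reads >= 1:
            counts[f"{status}_canonical_unique"] += 1
    counts["total_all"] = counts["annotated_all"] + counts["novel_all"]
    counts["total_canonical_unique"] = (
        counts["annotated_canonical_unique"] + counts["novel_canonical_unique"]
    )
    return counts


# Toy junctions (hypothetical coordinates), in SJ.out.tab column order:
# chrom, start, end, strand, motif, annotated, unique reads, multi reads, overhang
example = (
    "chr01\t100\t200\t1\t1\t1\t12\t0\t40\n"
    "chr01\t300\t450\t1\t1\t0\t3\t1\t25\n"
    "chr02\t500\t900\t2\t0\t0\t2\t0\t18\n"
    "chr02\t950\t1200\t2\t2\t0\t0\t4\t30\n"
)
result = summarize_sj(io.StringIO(example))
for key in sorted(result):
    print(key, result[key])
```

In the table itself, the novel and annotated rows sum to the totals in each column (for example, 213,128 + 101,766 = 314,894 canonical SJs in the first pass), which is the relationship the two `total_*` entries reproduce.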
FIGURE 1 (A) Illustration of the usage of exon and intron coordinates for non-redundant transcripts to evaluate transcript GTF files. (B) An example set of non-redundant transcripts for the gene ID G10043 in the Taco merged annotation file, identified by exon and intron coordinates.
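
The idea of identifying non-redundant transcripts by their exon and intron coordinates can be sketched as follows. This is an illustration only, not the authors' implementation: it assumes 1-based inclusive coordinates as in GTF, treats two transcripts as redundant when they share chromosome, strand, and an identical set of exons, and uses hypothetical transcript IDs and coordinates.

```python
def intron_chain(exons):
    """Derive intron coordinates from 1-based inclusive exon coordinates."""
    ordered = sorted(exons)
    return tuple(
        (left_end + 1, right_start - 1)
        for (_, left_end), (right_start, _) in zip(ordered, ordered[1:])
    )


def non_redundant(transcripts):
    """Keep the first transcript seen for each distinct exon structure.

    `transcripts` maps a transcript ID to (chrom, strand, [(start, end), ...]).
    Returns a dict of kept transcript ID -> intron chain.
    """
    seen = {}
    for tid, (chrom, strand, exons) in transcripts.items():
        key = (chrom, strand, tuple(sorted(exons)))
        if key not in seen:
            seen[key] = tid
    return {tid: intron_chain(transcripts[tid][2]) for tid in seen.values()}


# Hypothetical transcripts: tx2 repeats tx1 with its exons listed in another order.
transcripts = {
    "tx1": ("chr01", "+", [(100, 200), (300, 400)]),
    "tx2": ("chr01", "+", [(300, 400), (100, 200)]),
    "tx3": ("chr01", "+", [(100, 200), (350, 400)]),
}
kept = non_redundant(transcripts)
print(kept)
```

Because the intron chain is derived from the exon coordinates, keying on the exons alone is enough here; a looser definition of redundancy (for example, matching intron chains while allowing different transcript ends) would need a different key.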
FIGURE 2 Density plots of the JCC scores of IndicaRTD (red line) and Ensembl RTD (black line) genes for each sequencing library. The distributions of gene-level JCC scores under the IndicaRTD and Ensembl RTD annotations are compared for the 12_1, 12_2, 24_1, 24_2, 24_3, and 36_1 sequencing libraries. The X- and Y-axes represent the JCC score and gene density, respectively.
Comparison of IndicaRTD and Ensembl RTD. (A) The table shows the percentages of reads aligned with the transcripts and the number of transcripts mapped by the RNA-seq reads by the Salmon mapping tool for both Ensembl and IndicaRTD annotation. (B) The table shows the percentages of reads aligned to the transcripts and the number of transcripts mapped by the RNA-seq reads by the Kallisto mapping tool for both Ensembl and IndicaRTD annotation.
| (A) Salmon alignment | Read mapping rate for the transcripts (%), Ensembl | Read mapping rate for the transcripts (%), Indica RTD (current study) | Number of transcripts with read mapping, Ensembl | Number of transcripts with read mapping, Indica RTD (current study) |
|---|---|---|---|---|
| 12_1 | 74.31 | 90.60 | 27224 | 78192 |
| 12_2 | 73.96 | 90.66 | 27363 | 78661 |
| 24_1 | 69.85 | 87.33 | 25909 | 72772 |
| 24_2 | 77.64 | 90.94 | 27206 | 75207 |
| 24_3 | 75.71 | 90.75 | 27240 | 76214 |
| 36_1 | 79.30 | 91.73 | 26185 | 72401 |
| 36_2 | 72.42 | 90.12 | 26104 | 74237 |
| 48_1 | 76.19 | 90.93 | 26127 | 69162 |
| 48_2 | 69.41 | 89.77 | 26530 | 74537 |
| 60_1 | 77.07 | 90.99 | 26835 | 76159 |
| 60_2 | 75.04 | 90.53 | 26534 | 74710 |
| 60_3 | 75.10 | 90.68 | 26801 | 77213 |
| 72_1 | 73.96 | 90.44 | 26741 | 76542 |
| 72_2 | 75.42 | 91.08 | 25781 | 73774 |
| 72_3 | 73.81 | 90.57 | 26941 | 77593 |
| C1 | 77.64 | 90.94 | 27206 | 75204 |
| C2 | 73.41 | 88.37 | 26996 | 77617 |
| C3 | 72.82 | 86.20 | 26893 | 77717 |
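
The per-library difference in Salmon mapping rate can be summarized with a few lines of Python. The sketch below copies the mapping-rate values from panel (A) and assumes that the first value of each pair belongs to the Ensembl annotation and the second to IndicaRTD, which is inferred from the caption and from the transcript counts rather than stated in the table; the variable names are illustrative.

```python
# Salmon read mapping rates (%) from panel (A):
# library -> (Ensembl, IndicaRTD); the column assignment is an assumption.
salmon_rates = {
    "12_1": (74.31, 90.60), "12_2": (73.96, 90.66),
    "24_1": (69.85, 87.33), "24_2": (77.64, 90.94), "24_3": (75.71, 90.75),
    "36_1": (79.30, 91.73), "36_2": (72.42, 90.12),
    "48_1": (76.19, 90.93), "48_2": (69.41, 89.77),
    "60_1": (77.07, 90.99), "60_2": (75.04, 90.53), "60_3": (75.10, 90.68),
    "72_1": (73.96, 90.44), "72_2": (75.42, 91.08), "72_3": (73.81, 90.57),
    "C1": (77.64, 90.94), "C2": (73.41, 88.37), "C3": (72.82, 86.20),
}

# Gain in mapping rate (percentage points) for each library.
gains = {
    lib: round(indica - ensembl, 2)
    for lib, (ensembl, indica) in salmon_rates.items()
}

n = len(salmon_rates)
mean_ensembl = sum(ensembl for ensembl, _ in salmon_rates.values()) / n
mean_indica = sum(indica for _, indica in salmon_rates.values()) / n

print(f"libraries: {n}")
print(f"mean mapping rate, Ensembl:   {mean_ensembl:.2f}%")
print(f"mean mapping rate, IndicaRTD: {mean_indica:.2f}%")
print(f"smallest gain: {min(gains.values()):.2f} points")
print(f"largest gain:  {max(gains.values()):.2f} points")
```

Under this column assignment, every library maps a larger share of its reads to IndicaRTD than to the Ensembl annotation.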