Jing Liu1,2, Shengcai Chen1,2, Min Liu1, Yimian Chen1,2, Wei Fan1,2, Seunghee Lee3, Han Xiao2, Dave Kudrna3, Zixin Li1, Xu Chen1,2, Yaqi Peng2, Kewei Tian1,2, Bao Zhang2, Rod A Wing3, Jianwei Zhang3,4, Xuelu Wang2.
Abstract
Alternative splicing (AS) is a ubiquitous phenomenon among eukaryotic intron-containing genes and contributes greatly to transcriptome and proteome diversity. Here we performed isoform sequencing (Iso-Seq) of soybean underground tissues inoculated and uninoculated with Rhizobium and obtained 200,681 full-length transcripts covering 26,183 gene loci. We found that 80.78% of the multi-exon loci produced more than one splicing variant. Comprehensive analysis of these variants identified 7874 differentially splicing events with highly diverse splicing patterns during nodule development, especially in defense- and transport-related processes. We further profiled genes with differential isoform usage and found that 2008 multi-isoform loci underwent stage-specific or simultaneous major isoform switches after Rhizobium inoculation, indicating that AS is a vital way to regulate nodule development. Moreover, we took the lead in identifying 1563 high-confidence long non-coding RNAs (lncRNAs) in soybean, 157 of which are differentially expressed during nodule development. Our study therefore uncovers the landscape of AS during the soybean-Rhizobium interaction and provides systematic transcriptomic data to support future studies of multiple novel directions in soybean.
Keywords: Iso-Seq; differentially splicing event; long non-coding RNA; major isoform switch; nodule development; soybean
Year: 2022 PMID: 35806374 PMCID: PMC9266934 DOI: 10.3390/ijms23137371
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 6.208
Summary statistics of Iso-Seq data.
| Library Size (kb) | SMRT Cells | Polymerase Reads | FLNC 1 | High Quality Isoforms |
|---|---|---|---|---|
| 1–2 | 12 | 905,500 | 419,736 | 117,970 |
| 2–4 | 12 | 1,231,202 | 612,254 | 144,968 |
| 3–5 | 12 | 1,238,704 | 542,507 | 105,423 |
| 3.5–6 | 8 | 697,016 | 161,752 | 55,532 |
| 4–10 | 8 | 818,271 | 246,844 | 28,810 |
| Total | 52 | 4,890,693 | 1,983,093 | 452,703 |
1 FLNC: full-length non-chimeric reads.
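As a quick consistency check on the table above, the following minimal sketch recomputes the column totals and the share of polymerase reads that yielded FLNC reads in each library. The row values are copied from the table; the names `LIBRARIES`, `column_totals`, and `flnc_fraction` are hypothetical helpers introduced for illustration, and the FLNC share is a derived ratio, not a statistic reported in the source.

```python
# Rows copied from the table:
# (library size in kb, SMRT cells, polymerase reads, FLNC reads, high-quality isoforms)
LIBRARIES = [
    ("1-2", 12, 905_500, 419_736, 117_970),
    ("2-4", 12, 1_231_202, 612_254, 144_968),
    ("3-5", 12, 1_238_704, 542_507, 105_423),
    ("3.5-6", 8, 697_016, 161_752, 55_532),
    ("4-10", 8, 818_271, 246_844, 28_810),
]


def column_totals(rows):
    """Sum the four numeric columns across all libraries."""
    return tuple(sum(row[i] for row in rows) for i in range(1, 5))


def flnc_fraction(rows):
    """Share of polymerase reads classified as FLNC, per library (derived ratio)."""
    return {row[0]: row[3] / row[2] for row in rows}


totals = column_totals(LIBRARIES)
fractions = flnc_fraction(LIBRARIES)

print("SMRT cells, polymerase reads, FLNC, HQ isoforms:", totals)
for size, frac in fractions.items():
    print(f"{size} kb: {frac:.1%} of polymerase reads are FLNC")
```

The recomputed sums can be compared directly with the "Total" row of the table.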
Figure 1. Summary of the Iso-Seq data. (a) Distribution of transcript lengths for the Glycine max cv. Jidou 17 (JD17) reference (pink) and the Iso-Seq data (green); (b) Pie chart of the percentages of isoform numbers per locus; (c) Number of exons per transcript for the Iso-Seq transcripts; (d,e) Composition of the Iso-Seq isoforms relative to the JD17 reference (d) and detailed categories of novel isoforms compared to the JD17 reference (e); (f) Splicing junctions of the Iso-Seq data versus the JD17 reference.
Figure 2. Genome-wide analysis of alternative splicing dynamics during nodule development. (a) Number of differentially splicing events at each time point. Different colors represent the five common types of AS events. IR: intron retention, A3SS: alternative 3′ splice site, A5SS: alternative 5′ splice site, ES: exon skipping, MXE: mutually exclusive exons. dpi: days post-inoculation; (b) GO enrichment analysis of the differentially spliced genes at the early stage (Early, 1 to 10 dpi) and the late stage (Late, 15 to 30 dpi); (c) Heatmap of the global alternative splicing changes, measured by ∆PSI between the inoculated and uninoculated groups over the time course. Red represents higher PSI under inoculated conditions, and blue represents lower PSI. ∆PSI: the difference in percent spliced in; (d) Number of differentially spliced genes (DSGs) versus differentially expressed genes (DEGs) at 9 time points. The Pearson correlation coefficient (PCC) was calculated, p-value = 9.66 × 10⁻⁶; (e) Spearman correlation coefficient (SCC) of expression between transcripts and their corresponding genes: 16,301 DE but not DS genes (DE_noDS, N = 71,604, r = 0.52), 630 DS but not DE genes (noDE_DS, N = 20,503, r = 0.20), 1888 DS and DE genes (DE_DS, N = 77,884, r = 0.17), and 1000 randomly selected genes (Random, N = 4976, median SCC = 0.39). The Kruskal–Wallis test was used, p < 2.2 × 10⁻¹⁶. N is the number of gene-transcript pairs used for the correlation calculation.
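To make the ∆PSI measure in panel (c) concrete, here is a minimal sketch that assumes the simplest read-count definition of percent spliced in: inclusion reads divided by inclusion plus exclusion reads. The source does not state which tool or normalization was used (splicing tools often also correct for effective isoform length), so the functions `psi` and `delta_psi` and the example counts are illustrative assumptions only.

```python
def psi(inclusion_reads, exclusion_reads):
    """Percent spliced in as a fraction in [0, 1]; None if the event has no reads.

    Assumption: plain read counts, without any length normalization.
    """
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None
    return inclusion_reads / total


def delta_psi(inoculated, uninoculated):
    """Delta PSI = PSI(inoculated) - PSI(uninoculated).

    Each argument is an (inclusion_reads, exclusion_reads) pair.
    A positive value means higher PSI under inoculation.
    """
    psi_inoculated = psi(*inoculated)
    psi_uninoculated = psi(*uninoculated)
    if psi_inoculated is None or psi_uninoculated is None:
        return None
    return psi_inoculated - psi_uninoculated


# Hypothetical counts for one splicing event at one time point.
example = delta_psi(inoculated=(30, 10), uninoculated=(10, 30))
print(example)
```

With this sign convention, positive values correspond to the red cells of the heatmap (higher PSI under inoculated conditions) and negative values to the blue cells.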
Figure 3. Global differential transcript usage analysis at the isoform level. (a) PCA of all expressed genes. Red points show the genes that are not differentially expressed (nDEGs), and grey points show the differentially expressed genes (DEGs); (b) Expression patterns of differentially expressed transcripts (DETs) from nDEGs. The values were calculated as (FPKM_Inoculation + 1)/(FPKM_Uninoculation + 1) and scaled using the Z-score; (c) Intersections of the gene sets that experienced major isoform switches between inoculated and uninoculated conditions during nodule development, including stage-specific (green dots) and continuous changes (yellow lines); (d) Comparison of transcript and CDS lengths between the former and latter major isoforms for loci whose major isoform switched irreversibly and characteristically under inoculation; (e) RT-PCR validation of the expression levels of DETs from (b).
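The value plotted in panel (b) can be sketched as follows. The ratio (FPKM_Inoculation + 1)/(FPKM_Uninoculation + 1) is taken from the caption; the rest is assumed: the caption does not say whether the Z-score is computed per transcript across time points or whether a population or sample standard deviation is used. This sketch scales one transcript across its time points with the population standard deviation, and all function names and FPKM values are hypothetical.

```python
from statistics import mean, pstdev


def inoculation_ratio(fpkm_inoculated, fpkm_uninoculated):
    """Ratio from the caption; the +1 pseudocount avoids division by zero."""
    return (fpkm_inoculated + 1) / (fpkm_uninoculated + 1)


def z_scores(values):
    """Scale values to mean 0 and unit (population) standard deviation.

    Assumption: population SD; a sample SD would give slightly smaller magnitudes.
    """
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0] * len(values)  # a flat profile carries no pattern
    return [(v - mu) / sd for v in values]


# Hypothetical FPKM values for one transcript at three time points.
fpkm_inoculated = [1.0, 3.0, 7.0]
fpkm_uninoculated = [1.0, 1.0, 1.0]

ratios = [inoculation_ratio(i, u) for i, u in zip(fpkm_inoculated, fpkm_uninoculated)]
scaled = z_scores(ratios)
print(ratios)
print(scaled)
```

Scaling each transcript separately puts transcripts with very different absolute expression on a common scale, which is what makes the patterns comparable in a single heatmap.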
Figure 4. Characteristics of long non-coding RNAs in soybean root and nodule tissues. (a) Classification of lncRNAs according to their intersection with JD17-annotated protein-coding transcripts on the genome. There are four categories and eight subcategories; (b–d) lincRNAs and lncNATs were compared to Iso-Seq-detected mRNAs in terms of (b) exon numbers per transcript, (c) expression levels, and (d) length distributions. Grey stands for mRNAs, red for lncNATs (antisense lncRNAs), and blue for lincRNAs (intergenic lncRNAs not within 2 kb of protein-coding transcripts). The Wilcoxon test was used; (e) Expression patterns of the 157 differentially expressed lincRNAs and lncNATs; (f) Spearman correlation coefficient (SCC) of expression patterns between differentially expressed (DE) lncRNAs and their corresponding protein-coding (PC) transcripts (lncNATs with antisense-strand PC transcripts and lincRNAs with adjacent PC transcripts) across root and nodule development. The Wilcoxon test was used.
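The SCC in panel (f) compares the expression profile of a lncRNA with that of its paired protein-coding transcript. As a rough sketch of that calculation, the code below implements Spearman correlation as the Pearson correlation of average ranks. It is not the authors' pipeline; the function names and the expression values are hypothetical, and a statistics library would normally be used instead.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; None if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return None
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length profiles with at least 2 points")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression profiles across four time points.
lncrna_expr = [0.5, 1.2, 3.4, 8.0]
pc_expr = [10.0, 14.0, 20.0, 55.0]
print(spearman(lncrna_expr, pc_expr))
```

Because only ranks enter the calculation, the coefficient reflects whether the two profiles rise and fall together, regardless of the large difference in absolute expression between lncRNAs and mRNAs.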