Jose A Vargas-Asencio, Keith L Perry.
Abstract
Gene regulation involves the orchestrated action of multiple regulators to fine-tune the expression of genes. Hierarchical interactions and co-regulation among regulators are commonly observed in biological systems, leading to complex regulatory networks. Small RNAs (sRNAs) have been shown to be important regulators of gene expression due to their involvement in multiple cellular processes. In plants, microRNAs (miRNAs) and phased small interfering RNAs (phasiRNAs) are two well-characterized types of sRNAs involved in the posttranscriptional regulation of gene expression, although information about their targets and interactions with other gene expression regulators is limited. We describe an extended sRNA-mediated regulatory network in Arabidopsis thaliana that provides a reference frame to understand sRNA biogenesis and activity at the genome-wide level. This regulatory network combines a comprehensive evaluation of phasiRNA production and sRNA targets supported by degradome data. The network includes ~17% of genes in the A. thaliana genome, representing ~50% of annotated gene ontology (GO) functional categories. Approximately 14% of genes with GO annotations corresponding to regulation of gene expression were found to be under sRNA control. The unbiased bioinformatic approach used to produce the network detected 107 PHAS loci (regions of phasiRNA production) and 5,047 active phasiRNAs (~70% of which were non-canonical), and reconstructed 17 regulatory modules resulting from complex regulatory interactions between different sRNA-regulatory pathways. Known regulatory modules like miR173-TAS-PPR/TPR and miR390-TAS3-ARF/F-box were faithfully reconstructed and expanded, illustrating the accuracy and sensitivity of the methods and providing confidence in the validity of the previously unrecognized modules.
The network presented here includes a twofold increase in the number of identified PHAS loci, a large complement (~70%) of non-canonical phasiRNAs, and the most comprehensive evaluation of sRNA cleavage activity in A. thaliana to date. Structural analysis showed similarities to networks of other biological systems and demonstrated connectivity between phasiRNA regulatory modules, with extensive co-regulation of transcripts by miRNAs and phasiRNAs. The described regulatory network provides a reference that will facilitate global analyses of individual plant regulatory programs such as those that control homeostasis, development, and responses to biotic and abiotic environmental changes.
Keywords: Arabidopsis; degradome; miRNA; network; phasiRNA; regulation
Year: 2020 PMID: 32082334 PMCID: PMC7001039 DOI: 10.3389/fpls.2019.01710
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Figure 1. PHAS loci and phasiRNA triggers. (A) Histogram showing the number of PHAS loci detected per sRNA library across all libraries. Libraries are enumerated on the x axis; 902 libraries were evaluated and only those in which PHAS loci were found (n=426) are shown. The y axis shows the number of PHAS loci detected per library. (B) Histogram summarizing recognition events (detections) of bona fide PHAS loci across all libraries. PHAS loci are enumerated on the x axis. The y axis shows the number of libraries in which a given PHAS locus was detected. The dotted line indicates the threshold of three detection events that was used. (C) Distribution of the number of phasiRNA production triggers in PHAS loci. (D) Degradome-supported sRNA triggers. A dot plot representation of the relationship between the number of degradome-supported sRNA triggers and the length in nucleotides of the PHAS loci. (E) Boxplot representation of the degradome scores (deg_score) of identified sRNA triggers per PHAS locus length.
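The detection threshold in panel B can be illustrated with a minimal sketch. The input format, the function name `loci_passing_threshold`, and the reading of the threshold as "detected in at least three libraries" are assumptions made for illustration; they are not taken from the authors' pipeline.

```python
from collections import Counter

def loci_passing_threshold(detections, min_libraries=3):
    """Keep loci detected in at least `min_libraries` distinct libraries.

    `detections` is an iterable of (library_id, locus_id) pairs.
    The ">= 3" reading of the threshold is an assumption.
    """
    # De-duplicate so that a locus is counted once per library.
    unique_pairs = set(detections)
    libraries_per_locus = Counter(locus for _, locus in unique_pairs)
    return {locus: n for locus, n in libraries_per_locus.items()
            if n >= min_libraries}

# Toy data with hypothetical library and locus names.
detections = [
    ("lib1", "locusA"), ("lib2", "locusA"), ("lib3", "locusA"),
    ("lib1", "locusB"), ("lib2", "locusB"),
    ("lib1", "locusA"),  # duplicate record, ignored
]
kept = loci_passing_threshold(detections)
print(kept)  # {'locusA': 3}
```

Counting distinct libraries rather than raw records keeps a single deeply sequenced library from satisfying the threshold on its own.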
Overlap of detected phasiRNA loci with genomic features (n=107).
| Chr | Start | End | Inferred polarity | Feature type | Feature polarity | GeneID | Feature annotation | Overlap |
|---|---|---|---|---|---|---|---|---|
| Trigger identified | | | | | | | | |
| Chr1 | 11454588 | 11454825 | + | . | . | . | | 0 |
| Chr2 | 5497 | 5801 | + | . | . | . | | 0 |
| Chr2 | 6461 | 8276 | + | . | . | . | | 0 |
| Chr3 | 14197115 | 14197304 | + | . | . | . | | 0 |
| Chr3 | 16158765 | 16159371 | - | . | . | . | | 0 |
| Chr4 | 1318879 | 1319142 | - | . | . | . | | 0 |
| Chr5 | 7006522 | 7007118 | + | . | . | . | | 0 |
| Chr5 | 11814198 | 11814509 | + | . | . | . | | 0 |
| Chr1 | 3945841 | 3946359 | + | gene | + | AT1G11700 | Senescence regulator | 518 |
| Chr1 | 4368802 | 4369096 | - | gene | - | AT1G12820 | Auxin signaling F-box 3 | 294 |
| Chr1 | 4577301 | 4577793 | - | gene | - | AT1G13360 | Hypothetical protein | 492 |
| Chr1 | 7088193 | 7088490 | - | gene | - | AT1G20450 | Dehydrin family protein | 297 |
| Chr1 | 10472578 | 10473145 | - | gene | - | AT1G29910 | Chlorophyll A/B binding protein 3 | 567 |
| Chr1 | 17203735 | 17203844 | + | Transposable element | + | AT1G46120 | Copia-like retrotransposon family | 109 |
| Chr1 | 17890967 | 17891581 | - | gene | - | AT1G48410 | ARGONAUTE 1 (AGO1) | 614 |
| Chr1 | 18549377 | 18549692 | - | gene | - | AT1G50055 | Trans-acting siRNA1b primary transcript (TAS1b) | 315 |
| Chr1 | 23177838 | 23178693 | - | gene | - | AT1G62590 | Pentatricopeptide repeat (PPR) superfamily protein | 855 |
| Chr1 | 23205212 | 23206109 | - | gene | - | AT1G62670 | Pentatricopeptide repeat (PPR) superfamily protein | 897 |
| Chr1 | 23299601 | 23300476 | + | gene | + | AT1G62910 | Pentatricopeptide repeat (PPR) superfamily protein | 875 |
| Chr1 | 23302103 | 23302999 | + | gene | + | AT1G62914 | Pentatricopeptide repeat (PPR) superfamily protein | 896 |
| Chr1 | 23307088 | 23307771 | + | gene | + | AT1G62930 | Tetratricopeptide repeat (TPR)-like superfamily protein | 683 |
| Chr1 | 23389496 | 23390241 | - | gene | - | AT1G63080 | Pentatricopeptide repeat (PPR) superfamily protein | 745 |
| Chr1 | 23413410 | 23414052 | + | gene | + | AT1G63130 | Tetratricopeptide repeat (TPR)-like superfamily protein | 642 |
| Chr1 | 23419941 | 23420383 | + | gene | + | AT1G63150 | Tetratricopeptide repeat (TPR)-like superfamily protein | 442 |
| Chr1 | 23451248 | 23451797 | + | gene | + | AT1G63230 | Tetratricopeptide repeat (TPR)-like superfamily protein | 549 |
| Chr1 | 23489425 | 23489630 | - | gene | - | AT1G63320 | Pentatricopeptide repeat (PPR) superfamily protein | 153 |
| Chr1 | 23490163 | 23490976 | + | gene | + | AT1G63330 | Pentatricopeptide repeat (PPR) superfamily protein | 813 |
| Chr1 | 23507868 | 23508766 | + | gene | + | AT1G63400 | Pentatricopeptide repeat (PPR) superfamily protein | 898 |
| Chr1 | 23987412 | 23987854 | - | gene | - | AT1G64583 | Tetratricopeptide repeat (TPR)-like superfamily protein | 442 |
| Chr2 | 8508 | 9465 | + | gene | + | AT2G03875 | Novel transcribed region | 891 |
| Chr2 | 11721669 | 11722113 | - | gene | - | AT2G27400 | Trans-acting siRNA1a primary transcript (TAS1a) | 444 |
| Chr2 | 15090887 | 15091257 | + | gene | + | AT2G35945 | Natural antisense transcript overlaps with AT2G35940 | 370 |
| Chr2 | 16011479 | 16012261 | + | gene | + | AT2G38230 | Pyridoxine biosynthesis 1.1 | 782 |
| Chr2 | 16537499 | 16538006 | - | gene | - | AT2G39675 | Trans-acting siRNA1c primary transcript (TAS1c) | 507 |
| Chr2 | 16539685 | 16540023 | - | gene | - | AT2G39681 | Trans-acting siRNA primary transcript (TAS2) | 338 |
| Chr2 | 18618970 | 18619432 | - | gene | - | AT2G45160 | GRAS family transcription factor | 462 |
| Chr3 | 5862034 | 5862383 | + | gene | + | AT3G17185 | Trans-acting siRNA primary transcript (TAS3) | 349 |
| Chr3 | 6915591 | 6915801 | - | gene | - | AT3G19890 | F-box family protein | 210 |
| Chr3 | 7795243 | 7796095 | + | gene | + | AT3G22121 | Natural antisense transcript overlaps with AT3G22120 | 852 |
| Chr3 | 8529883 | 8530661 | - | gene | - | AT3G23690 | Basic helix-loop-helix (bHLH) DNA-binding superfamily protein | 778 |
| Chr3 | 9417547 | 9417820 | - | gene | - | AT3G25795 | Trans-acting siRNA primary transcript (TAS4) | 270 |
| Chr3 | 9870143 | 9870671 | + | gene | + | AT3G26810 | Auxin signaling F-box 2 | 528 |
| Chr3 | 14200432 | 14202247 | + | gene | + | AT3G06365 | Novel transcribed region | 1815 |
| Chr3 | 22410991 | 22411491 | - | gene | - | AT3G60630 | GRAS family transcription factor | 500 |
| Chr3 | 23273360 | 23273801 | - | gene | - | AT3G62980 | F-box/RNI-like superfamily protein | 441 |
| Chr4 | 57957 | 58359 | - | gene | - | AT4G00150 | GRAS family transcription factor | 402 |
| Chr4 | 1472812 | 1473032 | + | gene | + | AT4G04565 | Long non-coding RNA | 220 |
| Chr4 | 1476283 | 1476590 | + | gene | + | AT4G04595 | Novel transcribed region | 307 |
| Chr4 | 5764837 | 5765363 | + | gene | + | AT4G08990 | DNA (cytosine-5-)-methyltransferase family protein | 526 |
| Chr4 | 8382142 | 8382898 | - | Pseudogene | - | AT4G14610 | pseudogene (CC-NBS-LRR class) | 756 |
| Chr4 | 10276479 | 10276990 | - | gene | - | AT4G18670 | Leucine-rich repeat (LRR) family protein | 511 |
| Chr4 | 17639712 | 17640120 | - | gene | - | AT4G37540 | LOB domain-containing protein 39 | 408 |
| Chr4 | 18097248 | 18097605 | - | gene | - | AT4G38770 | Proline-rich protein 4 | 357 |
| Chr5 | 5461590 | 5461946 | + | gene | + | AT5G16640 | Pentatricopeptide repeat (PPR) superfamily protein | 356 |
| Chr5 | 17566574 | 17567501 | + | gene | + | AT5G43740 | Disease resistance protein (CC-NBS-LRR class) family | 927 |
| Chr5 | 23394264 | 23394495 | + | gene | + | AT5G57735 | tasiR-ARF | 231 |
| Chr5 | 24309516 | 24309726 | - | gene | - | AT5G60450 | Auxin response factor 4 | 210 |
| No trigger identified | | | | | | | | |
| Chr1 | 24721142 | 24721509 | . | . | . | . | | 0 |
| Chr2 | 7349167 | 7349520 | . | . | . | . | | 0 |
| Chr2 | 7839895 | 7839958 | . | . | . | . | | 0 |
| Chr3 | 14199468 | 14199772 | . | . | . | . | | 0 |
| Chr3 | 17445687 | 17445934 | . | . | . | . | | 0 |
| Chr5 | 7683815 | 7684395 | . | . | . | . | | 0 |
| Chr5 | 22322745 | 22322921 | . | . | . | . | | 0 |
| Chr1 | 27833 | 28316 | . | gene | + | AT1G01040 | Dicer-like 1 | 483 |
| Chr1 | 4185045 | 4185423 | . | gene | - | AT1G12300 | Tetratricopeptide repeat (TPR)-like superfamily protein | 378 |
| Chr1 | 4295826 | 4296206 | . | gene | - | AT1G12620 | Pentatricopeptide repeat (PPR) superfamily protein | 380 |
| Chr1 | 4354454 | 4355245 | . | gene | + | AT1G12775 | Pentatricopeptide repeat (PPR) superfamily protein | 791 |
| Chr1 | 6194911 | 6196018 | . | gene | - | AT1G18000 | Major facilitator superfamily protein | 1107 |
| Chr1 | 6200123 | 6201091 | . | gene | + | AT1G18010 | Major facilitator superfamily protein | 968 |
| Chr1 | 15464434 | 15465161 | . | Transposable element | + | AT1TE51040 | ATHILA6A | 727 |
| Chr1 | 15471434 | 15472100 | . | Transposable element | + | AT1TE51040 | ATHILA6A | 666 |
| Chr1 | 15485357 | 15486023 | . | Transposable element | + | AT1TE51040 | ATHILA6A | 666 |
| Chr1 | 21125812 | 21126104 | . | Transposable element | - | AT1TE69815 | VANDAL6 | 292 |
| Chr1 | 23275517 | 23276374 | . | Pseudogene | - | AT1G62860 | pseudogene of pentatricopeptide (PPR) repeat-containing protein | 857 |
| Chr1 | 23386048 | 23386692 | . | gene | - | AT1G63070 | Pentatricopeptide repeat (PPR) superfamily protein | 644 |
| Chr1 | 23587585 | 23587805 | . | gene | - | AT1G63615 | Hypothetical protein | 220 |
| Chr1 | 23587585 | 23587805 | . | gene | + | AT1G63630 | Tetratricopeptide repeat (TPR)-like superfamily protein | 220 |
| Chr1 | 29427956 | 29428166 | . | gene | - | AT1G09793 | Long noncoding RNA | 210 |
| Chr1 | 29427956 | 29428166 | . | gene | + | AT1G09797 | Long noncoding RNA | 210 |
| Chr2 | 855647 | 856343 | . | gene | - | AT2G02950 | Phytochrome kinase substrate 1 | 696 |
| Chr2 | 3251985 | 3252358 | . | gene | - | AT2G07671 | ATP synthase subunit C family protein | 356 |
| Chr2 | 3966746 | 3967025 | . | Transposable element | - | AT2TE16865 | ATHILA2 | 279 |
| Chr2 | 11513043 | 11513358 | . | gene | + | AT2G26975 | Ctr copper transporter family | 315 |
| Chr2 | 13529851 | 13530171 | . | gene | + | AT2G31820 | Ankyrin repeat family protein | 320 |
| Chr3 | 343230 | 343814 | . | gene | - | AT3G02020 | Aspartate kinase 3 | 584 |
| Chr3 | 3584608 | 3585005 | . | gene | - | AT3G11410 | Protein phosphatase 2CA | 397 |
| Chr3 | 4341697 | 4341988 | . | gene | + | AT3G13370 | Formin-like protein | 291 |
| Chr3 | 6524342 | 6524556 | . | gene | - | AT3G18930 | Transmembrane protein | 214 |
| Chr3 | 6524342 | 6524556 | . | gene | + | AT3G18915 | RING/U-box superfamily protein | 214 |
| Chr3 | 11983759 | 11983991 | . | gene | + | AT3G00610 | Novel transcribed region | 232 |
| Chr3 | 15677716 | 15678000 | . | Transposable element | - | AT3TE63405 | ATENSPM2 | 284 |
| Chr3 | 17136708 | 17137721 | . | gene | - | AT3G46550 | Fasciclin-like arabinogalactan family protein | 1013 |
| Chr3 | 18733898 | 18734234 | . | gene | + | AT3G50480 | Homolog of RPW8 4 | 336 |
| Chr4 | 3741606 | 3741942 | . | Transposable element | - | AT4TE16565 | ATHILA2 | 336 |
| Chr4 | 4554898 | 4555171 | . | Transposable element | + | AT4TE19135 | ATENSPM3 | 273 |
| Chr4 | 5567801 | 5567937 | . | Transposable element | - | AT4TE23345 | VANDAL21 | 136 |
| Chr4 | 7890508 | 7890765 | . | gene | - | AT4G13575 | Hypothetical protein | 257 |
| Chr4 | 10180261 | 10180408 | . | gene | - | AT4G06805 | Long noncoding RNA | 147 |
| Chr4 | 10180261 | 10180408 | . | gene | + | AT4G06810 | Long noncoding RNA | 147 |
| Chr5 | 7684660 | 7684943 | . | gene | - | AT5G22960 | Alpha/beta-Hydrolases superfamily protein | 283 |
| Chr5 | 9789495 | 9789663 | . | gene | - | AT5G27660 | Trypsin family protein with PDZ domain-containing protein | 168 |
| Chr5 | 11850409 | 11850683 | . | Transposable element | + | AT5TE42470 | ATHILA6A | 274 |
| Chr5 | 12167704 | 12167920 | . | Transposable element | + | AT5TE43315 | ATHILA | 216 |
| Chr5 | 15555417 | 15556760 | . | gene | + | AT5G38850 | Disease resistance protein (TIR-NBS-LRR class) | 1343 |
| Chr5 | 15556984 | 15557720 | . | gene | + | AT5G38850 | Disease resistance protein (TIR-NBS-LRR class) | 736 |
| Chr5 | 15699210 | 15699441 | . | Transposable element | + | AT5TE56690 | RathE2_cons | 231 |
| Chr5 | 15757646 | 15758194 | . | gene | + | AT5G39370 | Curculin-like (mannose-binding) lectin family protein | 393 |
| Chr5 | 16640239 | 16640870 | . | gene | - | AT5G41610 | Cation/H+ exchanger 18 | 631 |
| Chr5 | 16640239 | 16640870 | . | gene | + | AT5G41612 | Natural antisense transcript overlaps with AT5G41610 | 631 |
| Chr5 | 17560854 | 17561363 | . | gene | + | AT5G43725 | Other RNA | 509 |
| Chr5 | 11778496 | 11778932 | + | Transposable element | + | AT5TE42355 | ATHILA2 | 436 |
| Chr5 | 17560854 | 17561363 | . | gene | + | AT5G43730 | Disease resistance protein (CC-NBS-LRR class) | 509 |
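For most rows in the table above, the Overlap value equals End minus Start (for example, 3946359 - 3945841 = 518 for the locus in AT1G11700), which is what one would expect when the locus lies entirely inside the annotated feature. A few rows report a smaller value (for example, 153 for the 205-nt locus in AT1G63320), which is consistent with a partial overlap. The sketch below shows one way such a value could be computed. The end-minus-start length convention and the feature coordinates are assumptions for illustration, not values taken from the annotation.

```python
def overlap_length(locus_start, locus_end, feature_start, feature_end):
    """Length of the shared interval between a locus and a feature.

    Uses an end-minus-start convention (assumed from the table values);
    returns 0 when the two intervals do not intersect.
    """
    shared_start = max(locus_start, feature_start)
    shared_end = min(locus_end, feature_end)
    return max(0, shared_end - shared_start)

# PHAS locus coordinates from the table (Chr1, AT1G11700 row).
locus = (3945841, 3946359)

# Hypothetical feature coordinates, chosen only to illustrate each case.
containing_feature = (3945000, 3947000)   # locus fully inside the feature
partial_feature = (3946000, 3947000)      # feature starts inside the locus
distant_feature = (4000000, 4001000)      # no intersection

full = overlap_length(*locus, *containing_feature)
partial = overlap_length(*locus, *partial_feature)
none = overlap_length(*locus, *distant_feature)
print(full, partial, none)  # 518 359 0
```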
Figure 2. Network representation of interactions between phased small interfering RNA-producing loci (PHAS) and small RNAs (sRNAs). PHAS loci connected by sRNA triggers were grouped into modules. Gene families previously reported as associated with PHAS loci, and sRNAs derived from these loci, are colored as follows: miRNA (microRNA) = red; TAS (trans-acting small interfering RNAs) = dark blue; PPR/TPR (pentatricopeptide/tetratricopeptide repeat-like superfamily) = dark green; ARF-AFB-F-box (auxin response factor/F-box containing protein) = purple; non-coding RNA = black; regulators = light green; NBS-LRR (nucleotide binding leucine rich repeat protein) = yellow; genes/regions not previously associated with phasiRNA production or unannotated = light blue; others = cyan. Diamonds represent sRNAs; rectangles represent PHAS loci. Non-coding RNAs include novel transcribed regions, natural antisense RNAs, and long non-coding RNAs. Other acronyms are: GRAS TF = GRAS family transcription factor; AGO1 = Argonaute 1 protein; MET = DNA (cytosine-5-)-methyltransferase family protein; LOB-TF = LOB domain-containing transcription factor; bHLH-TF = basic helix-loop-helix (bHLH) DNA-binding superfamily protein transcription factor; CBP = chlorophyll A/B binding protein; NTR = novel transcribed region. The edge thickness between sRNAs and PHAS loci represents the degradome support for each interaction. Details for all of the node names in very small font can be found in , where they are listed individually within each of the 17 regulatory modules; alternatively, they can be read in the (enlarged) online version.
Figure 3. (A) Relative abundance of reads (>50 copies) mapping to PHAS loci that matched annotated phased registers. The category “Described in the annotation” indicates the percentage of all reads mapping to regions where PHAS loci were detected that belong to the registers indicated in the annotation. (B) Relative abundance of unique 21 and 22 nt long sRNAs (>50 copies) based on their type, showing the relative proportion of sRNA types among unique reads. (C) Relative abundance of total 21 and 22 nt long sRNAs (>50 copies) based on their type, showing the sRNA types among all reads.
Figure 4. Summary of information available in degradome libraries. The histogram shows library yield as the number of million filtered reads for each of the 39 libraries. Colors: black refers to data produced in this study (16 libraries) and red refers to NCBI SRA data (23 libraries).
Figure 5. Empirical cumulative distribution function of degradome detection events per sRNA-transcript pair across all degradome libraries (n=39). Gray dashed line indicates the location for the degradome score (deg_score) value of 15; black dashed line indicates the corresponding 99% threshold.
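An empirical cumulative distribution function of this kind can be sketched as follows. The scores are toy values rather than data from the study, and the helper names `ecdf` and `fraction_at_or_below` are hypothetical. How deg_score itself is computed is not described in this section and is not modeled here.

```python
from collections import Counter

def ecdf(values):
    """Return sorted (value, cumulative fraction) pairs, one per distinct value."""
    counts = Counter(values)
    total = len(values)
    running = 0
    points = []
    for value in sorted(counts):
        running += counts[value]
        points.append((value, running / total))
    return points

def fraction_at_or_below(values, cutoff):
    """Fraction of observations that are less than or equal to `cutoff`."""
    return sum(1 for v in values if v <= cutoff) / len(values)

# Toy scores for five hypothetical sRNA-transcript pairs.
toy_scores = [1, 1, 2, 3, 20]
curve = ecdf(toy_scores)
share = fraction_at_or_below(toy_scores, 15)
print(curve)  # [(1, 0.4), (2, 0.6), (3, 0.8), (20, 1.0)]
print(share)  # 0.8
```

Reading the curve at a chosen cutoff gives the share of pairs at or below that score, which is how a score value can be related to a percentile threshold such as the 99% line in the figure.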
Figure 6. Pie chart representation of the relative abundance of sRNA target sites within different regions of their target genes. The number in parentheses indicates the total number of target counts for the respective region. CDS, coding sequence; UTR, untranslated region; ncRNA, non-coding RNA; lncRNA, long non-coding RNA; uORF, upstream open reading frame; snoRNA, small nucleolar RNA; snRNA, small nuclear RNA.
Figure 7. Pie chart representation of the relative abundance of phasiRNAs based on their size and phased register.
Figure 8. Representation of the resulting sRNA-mediated regulatory network. sRNA nodes are colored in light blue, transcripts are colored green, and edges are gray. The type and abundance are mentioned for each class of nodes. The network has been manually organized to reflect the biogenesis of sRNAs. Two sets of miRNAs (and numbers) are diagrammed: the miRNAs (412) that do not induce phasiRNA production, and those that induce phasiRNA production (15).
Figure 9. sRNA-mediated network functional analysis. Network representation of GO Slim categories showing enrichment for genes under sRNA regulation. Size of the nodes is proportional to the total number of genes in the category; color scale indicates the corrected p-values for the enrichment test as described in Maere et al. (2005). Non-colored nodes are not significantly enriched (corrected p-value > 0.05).
Figure 10. Degree distributions of sRNA-mediated network components. (A) Total degree distribution. (B) Degree distribution of the individual sRNA-mediated network components. Degree is represented by K, and p(K) is the number of nodes with degree K divided by the total number of nodes. Regression lines for statistically significant correlations are shown. lRNA, long RNA (=transcripts).
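The definition of p(K) given in this caption can be made concrete with a short sketch. The edge list and node names are hypothetical, and the network is treated as undirected for simplicity, which is an assumption and may differ from how degree was counted in the study.

```python
from collections import Counter

def degree_distribution(edges):
    """Return {K: p(K)}, where p(K) = nodes with degree K / total nodes.

    `edges` is an iterable of (node_a, node_b) pairs in an undirected
    network. Only nodes that appear in at least one edge are counted.
    """
    degree = Counter()
    for node_a, node_b in edges:
        degree[node_a] += 1
        degree[node_b] += 1
    total_nodes = len(degree)
    nodes_per_degree = Counter(degree.values())
    return {k: count / total_nodes
            for k, count in sorted(nodes_per_degree.items())}

# Toy network: two hypothetical sRNAs targeting three hypothetical transcripts.
toy_edges = [
    ("sRNA_a", "transcript_1"),
    ("sRNA_a", "transcript_2"),
    ("sRNA_a", "transcript_3"),
    ("sRNA_b", "transcript_3"),
]
p_of_k = degree_distribution(toy_edges)
print(p_of_k)  # {1: 0.6, 2: 0.2, 3: 0.2}
```

In the toy network, three of the five nodes have degree 1, so p(1) = 0.6, and the values of p(K) sum to 1 across all observed degrees.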